On Tue, Sep 07, 2021 at 09:52:42PM +0200, Martin Pieuchot wrote:
> On 07/09/21(Tue) 21:47, Patrick Wildt wrote:
> > On Tue, Sep 07, 2021 at 02:43:22PM +0200, Patrick Wildt wrote:
> > > On Mon, Sep 06, 2021 at 09:43:29PM +0200, Patrick Wildt wrote:
> > > > On Fri, Jul 30, 2021 at 07:55:29PM +0200, Alexander Bluhm wrote:
> > > > > On Mon, Jul 26, 2021 at 08:12:39AM -0500, Scott Cheloha wrote:
> > > > > > On Fri, Jun 25, 2021 at 06:09:27PM -0500, Scott Cheloha wrote:
> > > > > > 1 month bump.  I really appreciate the tests I've gotten so
> > > > > > far, thank you.
> > > > > 
> > > > > On my Xeon machine it works and all regress tests pass.
> > > > > 
> > > > > But it fails on my old Opteron machine.  It hangs after attaching
> > > > > cpu1.
> > > > 
> > > > This seems to be caused by contention on the mutex in i8254's gettick().
> > > > 
> > > > With Scott's diff, delay_func is i8254_delay() on that old AMD machine.
> > > > Its gettick() implementation uses a mutex to protect I/O access to the
> > > > i8254.
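> > > > 
> > > > Roughly, the pattern looks like this (a sketch from memory of the
> > > > amd64/i386 clock.c, not a verbatim copy):
> > > > 
> > > >         u_int
> > > >         gettick(void)
> > > >         {
> > > >                 u_char lo, hi;
> > > > 
> > > >                 /* Serialize the latch/read sequence on the i8254. */
> > > >                 mtx_enter(&timer_mutex);
> > > >                 /* Select counter 0 and latch its current value. */
> > > >                 outb(IO_TIMER1 + TIMER_MODE, TIMER_SEL0 | TIMER_LATCH);
> > > >                 lo = inb(IO_TIMER1 + TIMER_CNTR0);
> > > >                 hi = inb(IO_TIMER1 + TIMER_CNTR0);
> > > >                 mtx_leave(&timer_mutex);
> > > > 
> > > >                 return ((hi << 8) | lo);
> > > >         }
> > > > 
> > > > So every single delay(10) on every CPU funnels through that one
> > > > mutex.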
> > > > 
> > > > When secondary CPUs come up, they will wait for CPU0 to let them boot up
> > > > further by checking for a flag:
> > > > 
> > > >         /*
> > > >          * We need to wait until we can identify, otherwise dmesg
> > > >          * output will be messy.
> > > >          */
> > > >         while ((ci->ci_flags & CPUF_IDENTIFY) == 0)
> > > >                 delay(10);
> > > > 
> > > > Now that machine has 3 secondary cores that are spinning like that.  At
> > > > the same time CPU0 waits for the core to come up:
> > > > 
> > > >         /* wait for it to identify */
> > > >         for (i = 2000000; (ci->ci_flags & CPUF_IDENTIFY) && i > 0; i--)
> > > >                 delay(10);
> > > > 
> > > > That means we have 3-4 cores spinning just to be able to delay().
> > > > Our mutex implementation isn't fair: whoever manages to claim the
> > > > free mutex wins.  So if CPU2 and CPU3 are spinning all the time
> > > > while CPU1 identifies itself and needs delay(), and CPU0 waits for
> > > > CPU1, the one CPU that actually needs to make progress may never
> > > > get the mutex.
> > > > 
> > > > I changed those delay(10) calls in cpu_hatch() to CPU_BUSY_CYCLE()
> > > > and the machine got further along instead of hanging forever.
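> > > > 
> > > > That is, roughly turning the hatch loop into:
> > > > 
> > > >         while ((ci->ci_flags & CPUF_IDENTIFY) == 0)
> > > >                 CPU_BUSY_CYCLE();   /* spin without touching the i8254 */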
> > > > 
> > > > Then I remembered an idea from years ago: fair kernel mutexes,
> > > > basically mutexes implemented as a ticket lock, like our kernel lock.
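> > > > 
> > > > The core of a ticket lock is small; as a conceptual sketch (member
> > > > names made up, SPL handling and diagnostics left out):
> > > > 
> > > >         void
> > > >         mtx_enter(struct mutex *mtx)
> > > >         {
> > > >                 u_int t;
> > > > 
> > > >                 /* Draw the next ticket; this orders the waiters. */
> > > >                 t = atomic_inc_int_nv(&mtx->mtx_users) - 1;
> > > > 
> > > >                 /* Spin until our number is called: FIFO, so fair. */
> > > >                 while (mtx->mtx_ticket != t)
> > > >                         CPU_BUSY_CYCLE();
> > > > 
> > > >                 membar_enter();
> > > >         }
> > > > 
> > > >         void
> > > >         mtx_leave(struct mutex *mtx)
> > > >         {
> > > >                 membar_exit();
> > > >                 /* Call the next waiter in line. */
> > > >                 mtx->mtx_ticket++;
> > > >         }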
> > > > 
> > > > I did a quick diff, which probably contains a million bugs, but
> > > > with this, bluhm's machine boots up fine.
> > > > 
> > > > I'm not saying this is the solution, but it might be.
> > > > 
> > > > Patrick
> > > 
> > > Cleaned the diff up a little, changes since last time:
> > > 
> > > * Rename the struct members to be the same as mplock.
> > > * Change the code to use ticket/user numbers like mplock (see the
> > >   struct sketch below).  This has one obvious downside: if a mutex
> > >   is not initialized, trying to take it will hang.  On the upside,
> > >   that already let me find some uninitialized mutexes.
> > > * More consistent use of the 'ci' variable.
> > > * Definitely compiles with/without DIAGNOSTIC.
> > > * Made sure mtx_enter() still has the membar.
> > > * No need for READ_ONCE() when members are volatile.
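> > > 
> > > For reference, the struct would look roughly like this (a sketch;
> > > the ticket/users names mirror mplock's mpl_ticket/mpl_users, the
> > > rest matches the existing generic mutex):
> > > 
> > >         struct mutex {
> > >                 volatile u_int   mtx_ticket;    /* ticket being served */
> > >                 volatile u_int   mtx_users;     /* next ticket to hand out */
> > >                 int              mtx_wantipl;   /* IPL to raise on entry */
> > >                 int              mtx_oldipl;    /* IPL to restore on leave */
> > >                 volatile void   *mtx_owner;     /* for DIAGNOSTIC checks */
> > >         };
> > > 
> > > An uninitialized mutex hangs because a drawn ticket may never match
> > > whatever garbage mtx_ticket happens to contain.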
> > > 
> > > Apart from being fair, this diff also changes the behaviour while
> > > spinning for a lock.  Previously mtx_enter() called mtx_enter_try()
> > > in a loop until it got the lock, and mtx_enter_try() does splraise,
> > > try lock, splx on each attempt.  This diff currently spins with the
> > > SPL raised the whole time, so that's a change in behaviour.  I'm sure
> > > I can change the diff to splraise/splx while looping, if we prefer
> > > the old behaviour.
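> > > 
> > > The difference, as a rough sketch of the two alternative spin
> > > bodies:
> > > 
> > >         int s;
> > >         u_int t;
> > > 
> > >         /* Before: mtx_enter() looped on mtx_enter_try(), which does
> > >          * splraise/try/splx itself, so the SPL drops between tries. */
> > >         while (mtx_enter_try(mtx) == 0)
> > >                 continue;
> > > 
> > >         /* This diff: raise the SPL once, then spin at the raised IPL
> > >          * until our ticket comes up. */
> > >         s = splraise(mtx->mtx_wantipl);
> > >         t = atomic_inc_int_nv(&mtx->mtx_users) - 1;
> > >         while (mtx->mtx_ticket != t)
> > >                 CPU_BUSY_CYCLE();
> > >         mtx->mtx_oldipl = s;    /* safe to store now that we own it */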
> > > 
> > > Patrick
> 
> This change makes sense on its own, as contention is shifting away
> from the KERNEL_LOCK() to mutexes.
> 
> Note that hppa has its own mutex implementation, in case somebody
> wants to keep it in sync.
>  
> > make -j17 seems to have used less system time, making the machine
> > slightly faster:
> > 
> > old: make -j17  1160.01s user 3244.58s system 1288% cpu 5:41.96 total
> > new: make -j17  1171.80s user 3059.67s system 1295% cpu 5:26.65 total
> 
> Is it with -current or with the UVM unlocking diff that put more
> pressure on mutexes?

It's with the UVM unlocking diff that put more pressure on mutexes.

> > I'll change the diff to do splraise/splx while looping, to make the
> > behaviour more similar to before, and then re-do my testing.
> 
> That'd be nice.  You could also start a new thread to get attention of
> more people, maybe dlg@, visa@ or kettenis@ have an opinion on this.

You can read my mind. :)
