hi,

On Mon, Jan 20, 2020 at 6:07 AM Andrew Doran <a...@netbsd.org> wrote:

> In my testing lately I've often run into a high context switch rate over RW
> locks.
>
> The RW locks we have now are "half adaptive" so to speak.  Lock waiters can
> spin while the lock is write locked and the owner is running, because the
> owner can be identified in that case, and then it can be determined whether
> or not the owner is running on a CPU: if so, the waiter spins, if not, it
> sleeps.
>
> The same does not happen if a RW lock is read locked and someone wants to
> take a write hold, because in that case there is no identifiable owner, and
> so it can't be determined whether the owner(s) are running or not.  In that
> case the lock waiter always sleeps.  It then has a kind of snowball effect
> because it delays the lock getting back to its unheld state, and every
> thread arriving during the aftermath sleeps too.
>
> If you only take mostly read holds, or only take mostly write holds on an
> individual RW lock this isn't a big deal because it occurs rarely.  But if
> you introduce more of a mix of read/write you start to get the poor
> behaviour.  The correct answer here may be "don't do that", but it's not a
> very practical one.
>
> I'd like to reduce this blocking activity, so here is a patch against
> -current that implements the adaptive behaviour in the reader case too, by
> having each LWP holding RW locks track those holds, and if going to sleep,
> cause any held RW locks to be temporarily put into "sleep mode", with the
> default being "spin mode".  In my testing with mixed vnode locks this works
> well.
>
>         http://www.netbsd.org/~ad/2020/rwlock.diff


i guess in rw_downgrade
newown = (owner & RW_NODEBUG) | RW_SPIN;

should be
newown = (owner & (RW_NODEBUG | RW_SPIN));

as the owner might have failed to record the lock on acquisition.



>
>
> This removes the assembly stubs but the compiler seems to do a pretty good
> job on the C ones I added, and in microbenchmarking it the difference is
> tiny.  I've only included the amd64 MD changes in the diff.
>
> In trying this out on builds it seems that tracking at most 2 holds catches
> ~99.8% of activity, but I have made the limit 4 (it's a soft limit, if
> exceeded all further taken locks are put into "sleep mode").
>
> I also added a rw_owner_running() for the pagedaemon, to match mutexes (not
> used yet).
>
> Thoughts?
>
> cheers,
> Andrew
>

Reply via email to