On Thu, Apr 06, 2017 at 01:46:16AM -0700, Davidlohr Bueso wrote:
> +/*
> + * Range/interval rw-locking
> + * -------------------------
> + *
> + * An interval tree of locked and to-be-locked ranges is kept. When a new range
> + * lock is requested, we add its interval to the tree and store number of
> + * intervals intersecting it to 'blocking_ranges'.

You're again confusing semantics with implementation here.

> For the reader case,
> + * 'blocking_ranges' is only accounted for if the intersecting range is
> + * marked as a writer. To achieve mutual exclusion of arbitrary ranges, we
> + * guarantee that task is blocked until there are no overlapping ranges in the
> + * tree.
> + *
> + * When a range is unlocked, we again walk intervals that overlap with the
> + * unlocked one and decrement their 'blocking_ranges'. Naturally, we wake up
> + * owner of any range lock whose 'blocking_ranges' drops to 0. Wakeup order
> + * therefore relies on the order of the interval tree -- as opposed to a
> + * more traditional fifo mechanism.

Which order is that? (I could of course go read the interval tree code,
but it shouldn't be too much effort to mention it here).

> There is no lock stealing either, which
> + * prevents starvation and guarantees fairness.

So no lock stealing has always been very bad for performance. So are you
sure people will not frob this back in?

> +#ifndef _LINUX_RANGE_RWLOCK_H

Still don't like the name... rwlock_t is a spinlock.
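
For reference, the 'blocking_ranges' accounting described in the quoted
comment can be sketched in userspace C roughly as follows. This is only an
illustration under stated assumptions, not the code from the patch: a flat
array stands in for the interval tree, nothing actually sleeps (the request
function just reports whether the lock would be granted immediately), and
the names range_tree, range_lock_request(), range_lock_release() and
MAX_RANGES are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_RANGES 16	/* hypothetical fixed capacity for this sketch */

struct range_lock {
	unsigned long start, last;	/* closed interval [start, last] */
	bool writer;
	unsigned int blocking_ranges;	/* conflicting ranges queued before us */
};

/* Assumption: a flat array replaces the interval tree of the patch. */
struct range_tree {
	struct range_lock *slot[MAX_RANGES];
	size_t n;
};

/* Two ranges conflict if they overlap and at least one is a writer. */
static bool conflicts(const struct range_lock *a, const struct range_lock *b)
{
	return a->start <= b->last && b->start <= a->last &&
	       (a->writer || b->writer);
}

/*
 * Add @r and count the conflicting ranges already present.
 * Returns true if the lock would be granted without waiting.
 */
static bool range_lock_request(struct range_tree *t, struct range_lock *r)
{
	assert(t->n < MAX_RANGES);
	r->blocking_ranges = 0;
	for (size_t i = 0; i < t->n; i++)
		if (conflicts(t->slot[i], r))
			r->blocking_ranges++;
	t->slot[t->n++] = r;
	return r->blocking_ranges == 0;
}

/*
 * Remove @r and decrement every range that was counting it.
 * Returns how many waiters dropped to zero, i.e. would be woken.
 */
static unsigned int range_lock_release(struct range_tree *t,
				       struct range_lock *r)
{
	unsigned int woken = 0;
	size_t j = 0;

	assert(r->blocking_ranges == 0);	/* only a holder may unlock */
	for (size_t i = 0; i < t->n; i++) {
		struct range_lock *e = t->slot[i];

		if (e == r)
			continue;		/* drop @r from the array */
		if (conflicts(e, r) && --e->blocking_ranges == 0)
			woken++;
		t->slot[j++] = e;
	}
	t->n = j;
	return woken;
}
```

In this sketch two overlapping readers are both granted at once, a writer
overlapping both starts with blocking_ranges == 2, and it becomes runnable
only after both readers have been released. The order in which several
waiters reach zero follows the iteration order of the container, which is
the point the question about wakeup order above is getting at.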