On Sun, Jun 02, 2019 at 10:29:01AM -0300, Martin Pieuchot wrote:
> On 02/06/19(Sun) 04:30, Visa Hankala wrote:
> > On Sat, Jun 01, 2019 at 07:04:23PM -0300, Martin Pieuchot wrote:
> > > On 01/06/19(Sat) 23:22, Mark Kettenis wrote:
> > > > > Date: Sat, 1 Jun 2019 17:32:52 -0300
> > > > > From: Martin Pieuchot <m...@openbsd.org>
> > > > > 
> > > > > Currently it isn't safe to call mtx_enter_try(9) if you're already
> > > > > holding the mutex.  That means it isn't safe to call that function
> > > > > in hardclock(9), like with `windup_mtx'.  That's why the mutex needs
> > > > > to be initialized as IPL_CLOCK.
> > > > > 
> > > > > I'm working on removing the SCHED_LOCK() from inside hardclock(9).
> > > > > That leads me to wonder if I should initialize all mutexes to 
> > > > > IPL_SCHED,
> > > > > possibly blocking clock interrupts, or if we should change the mutex 
> > > > > API
> > > > > to allow mtx_enter_try(9) to deal with recursion.
> > > > > 
> > > > > The diff below removes the recursion check for mtx_enter_try(9).
> > > > > 
> > > > > Comments?  Oks?
> > > > 
> > > > My initial reaction is that if you're trying to lock when you already
> > > > have the lock, there is something wrong with your locking strategy and
> > > > that this is something we don't want.
> > > 
> > > Could you elaborate?  Are you saying that preventing hardclock(9) to run
> > > is the way to move forward to unlock its internals?  Why isn't that
> > > strategy wrong?
> > > 
> > > In the `windup_mtx' case, does it matter if the mutex is taken by
> > > another CPU or by myself?  What's the problem when CPU0 is one holding
> > > the lock?
> > 
> > mutex(9) is not and should not become recursive. Recursive locking
> > works when it is voluntary. If recursion was allowed with interrupts,
> > the CPU could re-enter the critical section at any moment, possibly
> > seeing inconsistent state or breaking assumptions made by the original
> > entry.
> 
> I don't understand how your answer is related to my diff :(
> 
> Why allowing mtx_enter_try(9) to fail and not to panic(9) when the CPU
> executing the code already owns the mutex is wrong?
> 
> Is that what you call recursion?  Not entering the critical section in
> the interrupt instead of blocking the interrupt?
> 
> How is that different from not entering the critical section because
> another CPU is holding the mutex?

To me, it looks very suspicious if a CPU tries to lock a mutex that it
already holds because it indicates that there is a potential recursion.
My stance is that such a locking behaviour is a design error in general
and the current way of checking recursion in mtx_enter_try() is sound.

If a mutex can be taken on multiple interrupt priority levels, the lock
has to be configured to block the highest level, and mutexes used
in hardclock() are no exception.

Reply via email to