On Wed, May 14, 2014 at 02:58:05AM -0400, Carlos O'Donell wrote:
> >>    The handling of -EDEADLOCK is even more impressive. Instead of
> >>    propagating it to the caller something in the guts of glibc calls
> >>    pause().
> >>
> >>      futex(0x601300, FUTEX_LOCK_PI_PRIVATE, 1) = -1 EDEADLK (Resource deadlock avoided)
> >>      pause(
> >>
> > 
> > Gotta love comments like these though - such trust!:
> > 
> >     /* The mutex is locked.  The kernel will now take care of
> >        everything. */
> > 
> > IIRC, glibc takes the approach that if this operation fails, there is no
> > way for it to recover "properly", and so it chooses to:
> > 
> >     /* Delay the thread indefinitely. */
> > 
> > I believe the thinking goes that if we get to here, then the lock is in an
> > inconsistent state (between kernel and userspace). I don't have an answer
> > for why pausing forever would be preferable to returning an error, however...
> 
> What error would we return?

EDEADLK is a valid user return for pthread_mutex_lock() as per:

  http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_mutex_lock.html
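
For reference, one case where POSIX specifies EDEADLK can be reproduced in a
single thread: relocking an error-checking mutex that the caller already owns.
The following is a minimal sketch only. The helper name is hypothetical, and
this is not the PI futex path shown in the strace output above.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: lock an error-checking mutex twice from the same
   thread and return what the second pthread_mutex_lock() call reports. */
int relock_errorcheck_mutex(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&m, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    rc = pthread_mutex_lock(&m);    /* first lock succeeds */
    assert(rc == 0);
    rc = pthread_mutex_lock(&m);    /* relock by the owner is refused */

    pthread_mutex_unlock(&m);       /* still held once, so release it */
    pthread_mutex_destroy(&m);
    return rc;
}
```

With PTHREAD_MUTEX_ERRORCHECK the second call returns EDEADLK instead of
blocking, so the caller gets to decide what to do next.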

> This particular case is a serious error for which we have no good error code
> to return to userspace. It's an implementation defect, a bug, we should
> probably assert instead of pausing.

No, it's perfectly fine to have a lock sequence abort with -EDEADLK.
Userspace should release its locks and re-attempt.

You can implement usable locking schemes using this error, like
wound/wait locking.
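
A rough sketch of the "release and re-attempt" idea, assuming
pthread_mutex_lock() actually hands EDEADLK back to the caller. This is a
plain bounded back-off and not a full wound/wait scheme, which would also
order the contenders by age. Both helper names are hypothetical, and
error-checking mutexes are used only so the example can produce EDEADLK
deterministically.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical helper: initialise a mutex that reports EDEADLK on relock. */
int init_errorcheck_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Hypothetical helper: take locks[0..n-1] in order. On EDEADLK, drop
   everything taken so far, yield, and retry up to max_tries times.
   Returns 0 with all locks held, or an error code with none held. */
int lock_all_or_back_off(pthread_mutex_t **locks, size_t n, int max_tries)
{
    int rc = EINVAL;                /* reported if max_tries <= 0 */

    assert(locks != NULL || n == 0);

    for (int attempt = 0; attempt < max_tries; attempt++) {
        size_t held = 0;

        rc = 0;
        for (; held < n; held++) {
            rc = pthread_mutex_lock(locks[held]);
            if (rc != 0)
                break;              /* locks[held] was not acquired */
        }
        if (rc == 0)
            return 0;               /* whole set acquired */

        while (held > 0)            /* release in reverse order */
            pthread_mutex_unlock(locks[--held]);

        if (rc != EDEADLK)
            return rc;              /* not a deadlock report: give up */
        sched_yield();              /* let the other side make progress */
    }
    return rc;
}
```

The property that matters is that no lock stays held once EDEADLK has been
reported, so whoever is on the other side of the cycle can proceed.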

> We can't cancel the stuck thread because pthread_mutex_lock is not a
> cancellation point.
> 
> In practice the rest of the application can make forward progress with a
> single thread stuck. You can attach the debugger and inspect state, so it's
> useful from that perspective.

That's just totally braindead. Return EDEADLK to userspace already, let
the user deal with it.
