On Wed, May 14, 2014 at 02:58:05AM -0400, Carlos O'Donell wrote:
> >> The handling of -EDEADLOCK is even more impressive. Instead of
> >> propagating it to the caller something in the guts of glibc calls
> >> pause().
> >>
> >> futex(0x601300, FUTEX_LOCK_PI_PRIVATE, 1) = -1 EDEADLK (Resource
> >> deadlock avoided)
> >> pause(
> >>
> >
> > Gotta love comments like these though - such trust!:
> >
> > /* The mutex is locked. The kernel will now take care of
> > everything. */
> >
> > IIRC, glibc takes the approach that if this operation fails, there is no
> > way for it to recovery "properly", and so it chooses to:
> >
> > /* Delay the thread indefinitely. */
> >
> > I believe the thinking goes that if we get to here, then the lock is in an
> > inconsistent state (between kernel and userspace). I don't have an answer
> > for why pausing forever would be preferable to returning an error however...
>
> What error would we return?

EDEADLK is a valid user return for pthread_mutex_lock() as per:

  http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_mutex_lock.html

> This particular case is a serious error for which we have no good error code
> to return to userspace. It's an implementation defect, a bug, we should
> probably assert instead of pausing.

No, it's perfectly fine to have a lock sequence abort with -EDEADLK.
Userspace should release its locks and re-attempt. You can implement
usable locking schemes using this error, like wound/wait locking.

> We can't cancel the stuck thread because pthread_mutex_lock is not a
> cancellation point.
>
> In practice the rest of the application can make forward progress with a
> single thread stuck. You can attach the debugger and inspect state, so it's
> useful from that perspective.

That's just totally braindead. Return EDEADLK to userspace already, let
the user deal with it.
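
To illustrate the "release its locks and re-attempt" handling mentioned
above, here is a minimal userspace sketch. It is not glibc code and not
the PI futex path from the strace: it assumes an error-checking mutex
(PTHREAD_MUTEX_ERRORCHECK), where POSIX specifies EDEADLK when the owner
relocks. The helper names make_errorcheck_mutex() and lock_pair() are
hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: initialise an error-checking mutex. For this type,
 * pthread_mutex_lock() returns EDEADLK if the caller already owns it. */
static int make_errorcheck_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Hypothetical helper: take a, then b. If the second lock fails (for
 * example with EDEADLK), drop the first lock and hand the error back,
 * so the caller holds nothing and can back off and retry. */
static int lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    int rc;

    assert(a != b);
    rc = pthread_mutex_lock(a);
    if (rc != 0)
        return rc;
    rc = pthread_mutex_lock(b);
    if (rc != 0)
        pthread_mutex_unlock(a); /* release what we hold before reporting */
    return rc;
}
```

A caller that sees EDEADLK from lock_pair() owns neither mutex from that
call, so it can release anything else it holds and try the sequence again,
rather than being parked in pause() forever.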