On Thu, 2016-11-03 at 16:27 -0600, Bart Van Assche wrote:
> On 10/28/2016 08:08 PM, James Bottomley wrote:
> > This is a deadlock caused by an inversion issue in kernfs (suicide
> > vs
> > non-suicide removes); so fixing it in SCSI alone really isn't
> > appropriate.  I count at least five other subsystems all using this
> > mechanism, so they'll all be similarly affected.  It looks to be
> > fairly
> > simply fixable inside kernfs, so please fix it that way.
> 
> Hello James,
> 
> How about fixing this deadlock with the below patch?

Without a description file, it's a bit hard to follow, but reading the
patch it's somewhat cumbersome and also incomplete: you change the API
for self removal, so every driver that uses the
device_remove_file_self() scheme (about five of them) has to be
changed.  I was actually thinking of something much simpler:

You know after

if (device_remove_file_self(dev, attr))

returns true that s_active is held and also that KERNFS_SUICIDAL is set
on the node, so the non-self remove paths can simply check for this
flag and return without descending into __kernfs_remove(), which would
mean they never take s_active.  That means we never get the inversion.

This should be about a 10 line patch because there are only two places
this check needs to be done: kernfs_remove() and
 kernfs_remove_by_name_ns().

It has the advantage that the api isn't altered so all the other driver
callers simply pick up the benefit.

There may be a bit of back end cleanup to do on the eventual parent
directory removal because I think it needs to wait for all the suicidal
nodes to signal they've finally suicided.

James

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to