On Tue, 17 Feb 2015, Josh Poimboeuf wrote:

> On Mon, Feb 16, 2015 at 05:06:15PM +0100, Miroslav Benes wrote:
> > On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
> > 
> > > On Fri, Feb 13, 2015 at 05:17:10PM +0100, Miroslav Benes wrote:
> > > > On Fri, 13 Feb 2015, Josh Poimboeuf wrote:
> > > > > Hm, even with Jiri Slaby's suggested fix to add the completion to the
> > > > > unregister path, I still get a lockdep warning.  This looks more 
> > > > > insidious,
> > > > > related to the locking order of a kernfs lock and the klp lock.  I'll 
> > > > > need to
> > > > > look at this some more...
> > > > 
> > > > Yes, I was afraid of this. Lockdep warning is a separate bug. It is 
> > > > caused 
> > > > by taking klp_mutex in enabled_store. During rmmod klp_unregister_patch 
> > > > takes klp_mutex and destroys the sysfs structure. If somebody writes to 
> > > > enabled just after unregister takes the mutex and before the sysfs 
> > > > removal, he would cause the deadlock, because enabled_store takes the 
> > > > "sysfs lock" and then klp_mutex. That is exactly what the lockdep tells 
> > > > us 
> > > > below.
> > > > 
> > > > We can look for inspiration elsewhere. Grep for s_active through git 
> > > > log 
> > > > of the mainline offers several commits which dealt exactly with this. 
> > > > Will 
> > > > browse through that...
> > > 
> > > Thanks Miroslav, please let me know what you find.  It wouldn't surprise
> > > me if this were a very common problem.
> > > 
> > > One option would be to move the enabled_store() work out to a workqueue
> > > or something.
> > 
> > Yes, that is one possibility. It is not the only one.
> > 
> > 1. we could replace mutex_lock in enabled_store with mutex_trylock. If the 
> > lock was not acquired we would return -EBUSY. Or could we 'return 
> > restart_syscall' (maybe after some tiny msleep)?
> 
> Hm, doesn't that still violate the locking order rules?  I thought locks
> always had to be taken in the same order -- always sysfs before klp, or
> klp before sysfs.  Not sure if there would still be any deadlocks
> lurking, but lockdep might still complain.

Yes, but in this case you break the possible deadlock order. From the 
lockdep report...

   CPU0                    CPU1
   ----                    ----
   lock(klp_mutex);
                           lock(s_active#70);
                           lock(klp_mutex);
   lock(s_active#70);

CPU0 called klp_unregister_patch and CPU1 possible enabled_store in a race 
window. Deadlock wouldn't be there because trylock(klp_mutex) on CPU1 
would return 0 and enabled_store thus EBUSY. And in every other scenario 
trylock would prevent deadlock too or klp_unregister_patch would wait on 
klp_mutex (I hope I did not miss anything).

I tried it and lockdep did not complain. 

And you can look at 36c38fb7144aa941dc072ba8f58b2dbe509c0345 or 
5e33bc4165f3edd558d9633002465a95230effc1. They dealt with it the same way 
(but it does not mean anything).

It would need more testing to be sure though.

> > 2. we could reorganize klp_unregister_patch somehow and move sysfs removal 
> > out of mutex protection.
> 
> Yeah, I was thinking about this too.  Pretty sure we'd have to remove
> both the sysfs add and the sysfs removal from mutex protection.  I like
> this option if we can get it to work.

Yes, why not.

Maybe someone else will share his opinion on this...

Miroslav
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to