Hi On Fri, Apr 12, 2013 at 4:47 PM, Linus Torvalds <torva...@linux-foundation.org> wrote: > On Fri, Apr 12, 2013 at 3:32 PM, Anatol Pomozov > <anatol.pomo...@gmail.com> wrote: >> >> Here is timeline for the crash in case if kset_find_obj() searches for >> an object tht nobody holds and other thread is doing kobject_put() >> on the same kobject: >> >> THREAD A (calls kset_find_obj()) THREAD B (calls kobject_put()) >> splin_lock() >> atomic_dec_return(kobj->kref), counter >> gets zero here >> ... starts kobject cleanup .... >> spin_lock() // WAIT thread A in >> kobj_kset_leave() >> iterate over kset->list >> atomic_inc(kobj->kref) (counter becomes 1) >> spin_unlock() >> spin_lock() // taken >> // it does not know that thread A >> increased counter so it >> remove obj from list >> spin_unlock() >> vfree(module) // frees module object >> with containing kobj >> >> // kobj points to freed memory area!! >> koubject_put(kobj) // OOPS!!!! > > This is a much more generic bug in kobjects, and I would hate to add > some random workaround for just one case of this bug like you do. The > more fundamental bug needs to be fixed too. > > I think the more fundamental bugfix is to just fix kobject_get() to > return NULL if the refcount was zero, because in that case the kobject > no longer really exists. > > So instead of having > > kref_get(&kobj->kref); > > it should do > > if (!atomic_inc_not_zero(&kobj->kref.refcount)) > kobj = NULL;
Does it make sense to move it to a separate function in kref.h? /** Useful when kref_get is racing with kref_put and refcounter might be 0 */ int kref_get_not_zero(kref* ref) { return atomic_inc_not_zero(&kref->refcount); } or maybe instead change default behavior of kref_get() to atomic_inc_not_zero and force callers check the return value from kref_get()? > > and I think that should fix your race automatically, no? Proper patch > attached (but TOTALLY UNTESTED - it seems to compile, though). > > The problem is that we lose the warning for when the refcount is zero > and somebody does a kobject_get(), but that is ok *assuming* that > people actually check the return value of kobject_get() rather than > just "know" that if they passed in a non-NULL kobj, they'll get it > right back. > > Greg - please take a look... I'm adding Al to the discussion too, > because Al just *loooves* these kinds of races ;) > > Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/