Re: [PATCH 3/3] block: Protect less code with sysfs_lock in blk_{un,}register_queue()
On Wed, 2018-01-17 at 21:04 +0800, Ming Lei wrote: > On Wed, Jan 17, 2018 at 10:17 AM, Bart Van Assche >wrote: > > Sorry but I think what you wrote is wrong. kobject_del(>kobj) waits > > until all > > ongoing sysfs callback methods, including elv_iosched_store(), have > > finished and > > prevents that any new elv_iosched_store() calls start. That is why I think > > the > > above changes do not reintroduce the race fixed by commit e9a823fb34a8 > > ("block: > > fix warning when I/O elevator is changed as request_queue is being > > removed"). > > But your patch basically reverts commit e9a823fb34a8, and I just saw the > warning > again after applying your patch in my stress test of switching elelvato: > > [ 225.999505] kobject_add_internal failed for iosched (error: -2 parent: > queue) > [ 225.999521] WARNING: CPU: 0 PID: 12707 at lib/kobject.c:244 > kobject_add_internal+0x236/0x24c > [ 225.999566] Call Trace: > [ 225.999570] kobject_add+0x9e/0xc5 > [ 225.999573] elv_register_queue+0x35/0xa2 > [ 225.999575] elevator_switch+0x7a/0x1a4 > [ 225.999577] elv_iosched_store+0xd2/0x103 > [ 225.999579] queue_attr_store+0x66/0x82 > [ 225.999581] kernfs_fop_write+0xf3/0x135 > [ 225.999583] __vfs_write+0x31/0x142 > [ 225.999591] vfs_write+0xcb/0x16e > [ 225.999593] SyS_write+0x5d/0xab > [ 225.999595] entry_SYSCALL_64_fastpath+0x1a/0x7d The above call stack means that sysfs_create_dir_ns() returned -ENOENT because kobj->parent->sd was NULL (where kobj is the elevator kernel object). That's probably becase sysfs_remove_dir() clears the sd pointer before performing the actual removal. From fs/sysfs/dir.c: spin_lock(_symlink_target_lock); kobj->sd = NULL; spin_unlock(_symlink_target_lock); if (kn) { WARN_ON_ONCE(kernfs_type(kn) != KERNFS_DIR); kernfs_remove(kn); } In the past these two actions were performed in the opposite order. See also commit aecdcedaab49 ("sysfs: implement kobj_sysfs_assoc_lock"). Anyway, I will rework this patch. Bart.
Re: [PATCH 3/3] block: Protect less code with sysfs_lock in blk_{un,}register_queue()
On Wed, Jan 17, 2018 at 10:17 AM, Bart Van Asschewrote: > On Wed, 2018-01-17 at 09:23 +0800, Ming Lei wrote: >> On Wed, Jan 17, 2018 at 8:03 AM, Bart Van Assche >> wrote: >> > On Tue, 2018-01-16 at 17:32 -0500, Mike Snitzer wrote: >> > > Therefore it seems to me that all queue_attr_{show,store} are racey vs >> > > blk_unregister_queue() removing the 'queue' kobject. >> > > >> > > And it was just that __elevator_change() was myopicly fixed to address >> > > the race whereas a more generic solution was/is needed. But short of >> > > that more generic fix your change will reintroduce the potential for >> > > hitting the issue that commit e9a823fb34a8b fixed. >> > > >> > > In that light, think it best to leave blk_unregister_queue()'s >> > > mutex_lock() above the QUEUE_FLAG_REGISTERED clearing _and_ update >> > > queue_attr_{show,store} to test for QUEUE_FLAG_REGISTERED while holding >> > > sysfs_lock. >> > > >> > > Then remove the unicorn test_bit for QUEUE_FLAG_REGISTERED from >> > > __elevator_change(). >> > >> > Thanks Mike for the feedback. However, I think a simpler approach exists >> > than >> > what has been described above, namely by unregistering the sysfs attributes >> > earlier. How about the patch below? >> > >> > [PATCH] block: Protect less code with sysfs_lock in >> > blk_{un,}register_queue() >> > --- >> > block/blk-sysfs.c | 39 ++- >> > block/elevator.c | 4 >> > 2 files changed, 26 insertions(+), 17 deletions(-) >> > >> > diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c >> > index 4a6a40ffd78e..ce32366c6db7 100644 >> > --- a/block/blk-sysfs.c >> > +++ b/block/blk-sysfs.c >> > @@ -853,6 +853,10 @@ struct kobj_type blk_queue_ktype = { >> > .release= blk_release_queue, >> > }; >> > >> > +/** >> > + * blk_register_queue - register a block layer queue with sysfs >> > + * @disk: Disk of which the request queue should be registered with sysfs. >> > + */ >> > int blk_register_queue(struct gendisk *disk) >> > { >> > int ret; >> > @@ -909,11 +913,12 @@ int blk_register_queue(struct gendisk *disk) >> > if (q->request_fn || (q->mq_ops && q->elevator)) { >> > ret = elv_register_queue(q); >> > if (ret) { >> > + mutex_unlock(>sysfs_lock); >> > kobject_uevent(>kobj, KOBJ_REMOVE); >> > kobject_del(>kobj); >> > blk_trace_remove_sysfs(dev); >> > kobject_put(>kobj); >> > - goto unlock; >> > + return ret; >> > } >> > } >> > ret = 0; >> > @@ -923,6 +928,13 @@ int blk_register_queue(struct gendisk *disk) >> > } >> > EXPORT_SYMBOL_GPL(blk_register_queue); >> > >> > +/** >> > + * blk_unregister_queue - counterpart of blk_register_queue() >> > + * @disk: Disk of which the request queue should be unregistered from >> > sysfs. >> > + * >> > + * Note: the caller is responsible for guaranteeing that this function is >> > called >> > + * after blk_register_queue() has finished. >> > + */ >> > void blk_unregister_queue(struct gendisk *disk) >> > { >> > struct request_queue *q = disk->queue; >> > @@ -934,28 +946,29 @@ void blk_unregister_queue(struct gendisk *disk) >> > if (!test_bit(QUEUE_FLAG_REGISTERED, >queue_flags)) >> > return; >> > >> > - /* >> > -* Protect against the 'queue' kobj being accessed >> > -* while/after it is removed. >> > -*/ >> > - mutex_lock(>sysfs_lock); >> > - >> > spin_lock_irq(q->queue_lock); >> > queue_flag_clear(QUEUE_FLAG_REGISTERED, q); >> > spin_unlock_irq(q->queue_lock); >> > >> > - wbt_exit(q); >> > - >> > + /* >> > +* Remove the sysfs attributes before unregistering the queue data >> > +* structures that can be modified through sysfs. >> > +*/ >> > + mutex_lock(>sysfs_lock); >> > if (q->mq_ops) >> > blk_mq_unregister_dev(disk_to_dev(disk), q); >> > - >> > - if (q->request_fn || (q->mq_ops && q->elevator)) >> > - elv_unregister_queue(q); >> > + mutex_unlock(>sysfs_lock); >> > >> > kobject_uevent(>kobj, KOBJ_REMOVE); >> > kobject_del(>kobj); >> >> elevator switch still can come just after the above line code is completed, >> so the previous warning addressed in e9a823fb34a8b can be triggered >> again. >> >> > blk_trace_remove_sysfs(disk_to_dev(disk)); >> > - kobject_put(_to_dev(disk)->kobj); >> > >> > + wbt_exit(q); >> > + >> > + mutex_lock(>sysfs_lock); >> > + if (q->request_fn || (q->mq_ops && q->elevator)) >> > + elv_unregister_queue(q); >> > mutex_unlock(>sysfs_lock); >> > + >> > + kobject_put(_to_dev(disk)->kobj); >> > } >> > diff --git a/block/elevator.c
Re: [PATCH 3/3] block: Protect less code with sysfs_lock in blk_{un,}register_queue()
On Wed, 2018-01-17 at 09:23 +0800, Ming Lei wrote: > On Wed, Jan 17, 2018 at 8:03 AM, Bart Van Assche> wrote: > > On Tue, 2018-01-16 at 17:32 -0500, Mike Snitzer wrote: > > > Therefore it seems to me that all queue_attr_{show,store} are racey vs > > > blk_unregister_queue() removing the 'queue' kobject. > > > > > > And it was just that __elevator_change() was myopicly fixed to address > > > the race whereas a more generic solution was/is needed. But short of > > > that more generic fix your change will reintroduce the potential for > > > hitting the issue that commit e9a823fb34a8b fixed. > > > > > > In that light, think it best to leave blk_unregister_queue()'s > > > mutex_lock() above the QUEUE_FLAG_REGISTERED clearing _and_ update > > > queue_attr_{show,store} to test for QUEUE_FLAG_REGISTERED while holding > > > sysfs_lock. > > > > > > Then remove the unicorn test_bit for QUEUE_FLAG_REGISTERED from > > > __elevator_change(). > > > > Thanks Mike for the feedback. However, I think a simpler approach exists > > than > > what has been described above, namely by unregistering the sysfs attributes > > earlier. How about the patch below? > > > > [PATCH] block: Protect less code with sysfs_lock in > > blk_{un,}register_queue() > > --- > > block/blk-sysfs.c | 39 ++- > > block/elevator.c | 4 > > 2 files changed, 26 insertions(+), 17 deletions(-) > > > > diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c > > index 4a6a40ffd78e..ce32366c6db7 100644 > > --- a/block/blk-sysfs.c > > +++ b/block/blk-sysfs.c > > @@ -853,6 +853,10 @@ struct kobj_type blk_queue_ktype = { > > .release= blk_release_queue, > > }; > > > > +/** > > + * blk_register_queue - register a block layer queue with sysfs > > + * @disk: Disk of which the request queue should be registered with sysfs. > > + */ > > int blk_register_queue(struct gendisk *disk) > > { > > int ret; > > @@ -909,11 +913,12 @@ int blk_register_queue(struct gendisk *disk) > > if (q->request_fn || (q->mq_ops && q->elevator)) { > > ret = elv_register_queue(q); > > if (ret) { > > + mutex_unlock(>sysfs_lock); > > kobject_uevent(>kobj, KOBJ_REMOVE); > > kobject_del(>kobj); > > blk_trace_remove_sysfs(dev); > > kobject_put(>kobj); > > - goto unlock; > > + return ret; > > } > > } > > ret = 0; > > @@ -923,6 +928,13 @@ int blk_register_queue(struct gendisk *disk) > > } > > EXPORT_SYMBOL_GPL(blk_register_queue); > > > > +/** > > + * blk_unregister_queue - counterpart of blk_register_queue() > > + * @disk: Disk of which the request queue should be unregistered from > > sysfs. > > + * > > + * Note: the caller is responsible for guaranteeing that this function is > > called > > + * after blk_register_queue() has finished. > > + */ > > void blk_unregister_queue(struct gendisk *disk) > > { > > struct request_queue *q = disk->queue; > > @@ -934,28 +946,29 @@ void blk_unregister_queue(struct gendisk *disk) > > if (!test_bit(QUEUE_FLAG_REGISTERED, >queue_flags)) > > return; > > > > - /* > > -* Protect against the 'queue' kobj being accessed > > -* while/after it is removed. > > -*/ > > - mutex_lock(>sysfs_lock); > > - > > spin_lock_irq(q->queue_lock); > > queue_flag_clear(QUEUE_FLAG_REGISTERED, q); > > spin_unlock_irq(q->queue_lock); > > > > - wbt_exit(q); > > - > > + /* > > +* Remove the sysfs attributes before unregistering the queue data > > +* structures that can be modified through sysfs. > > +*/ > > + mutex_lock(>sysfs_lock); > > if (q->mq_ops) > > blk_mq_unregister_dev(disk_to_dev(disk), q); > > - > > - if (q->request_fn || (q->mq_ops && q->elevator)) > > - elv_unregister_queue(q); > > + mutex_unlock(>sysfs_lock); > > > > kobject_uevent(>kobj, KOBJ_REMOVE); > > kobject_del(>kobj); > > elevator switch still can come just after the above line code is completed, > so the previous warning addressed in e9a823fb34a8b can be triggered > again. > > > blk_trace_remove_sysfs(disk_to_dev(disk)); > > - kobject_put(_to_dev(disk)->kobj); > > > > + wbt_exit(q); > > + > > + mutex_lock(>sysfs_lock); > > + if (q->request_fn || (q->mq_ops && q->elevator)) > > + elv_unregister_queue(q); > > mutex_unlock(>sysfs_lock); > > + > > + kobject_put(_to_dev(disk)->kobj); > > } > > diff --git a/block/elevator.c b/block/elevator.c > > index e87e9b43aba0..4b7957b28a99 100644 > > --- a/block/elevator.c > > +++ b/block/elevator.c > > @@ -1080,10 +1080,6 @@ static int __elevator_change(struct request_queue > > *q,
Re: [PATCH 3/3] block: Protect less code with sysfs_lock in blk_{un,}register_queue()
On Wed, Jan 17, 2018 at 8:03 AM, Bart Van Asschewrote: > On Tue, 2018-01-16 at 17:32 -0500, Mike Snitzer wrote: >> Therefore it seems to me that all queue_attr_{show,store} are racey vs >> blk_unregister_queue() removing the 'queue' kobject. >> >> And it was just that __elevator_change() was myopicly fixed to address >> the race whereas a more generic solution was/is needed. But short of >> that more generic fix your change will reintroduce the potential for >> hitting the issue that commit e9a823fb34a8b fixed. >> >> In that light, think it best to leave blk_unregister_queue()'s >> mutex_lock() above the QUEUE_FLAG_REGISTERED clearing _and_ update >> queue_attr_{show,store} to test for QUEUE_FLAG_REGISTERED while holding >> sysfs_lock. >> >> Then remove the unicorn test_bit for QUEUE_FLAG_REGISTERED from >> __elevator_change(). > > Thanks Mike for the feedback. However, I think a simpler approach exists than > what has been described above, namely by unregistering the sysfs attributes > earlier. How about the patch below? > > [PATCH] block: Protect less code with sysfs_lock in blk_{un,}register_queue() > --- > block/blk-sysfs.c | 39 ++- > block/elevator.c | 4 > 2 files changed, 26 insertions(+), 17 deletions(-) > > diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c > index 4a6a40ffd78e..ce32366c6db7 100644 > --- a/block/blk-sysfs.c > +++ b/block/blk-sysfs.c > @@ -853,6 +853,10 @@ struct kobj_type blk_queue_ktype = { > .release= blk_release_queue, > }; > > +/** > + * blk_register_queue - register a block layer queue with sysfs > + * @disk: Disk of which the request queue should be registered with sysfs. > + */ > int blk_register_queue(struct gendisk *disk) > { > int ret; > @@ -909,11 +913,12 @@ int blk_register_queue(struct gendisk *disk) > if (q->request_fn || (q->mq_ops && q->elevator)) { > ret = elv_register_queue(q); > if (ret) { > + mutex_unlock(>sysfs_lock); > kobject_uevent(>kobj, KOBJ_REMOVE); > kobject_del(>kobj); > blk_trace_remove_sysfs(dev); > kobject_put(>kobj); > - goto unlock; > + return ret; > } > } > ret = 0; > @@ -923,6 +928,13 @@ int blk_register_queue(struct gendisk *disk) > } > EXPORT_SYMBOL_GPL(blk_register_queue); > > +/** > + * blk_unregister_queue - counterpart of blk_register_queue() > + * @disk: Disk of which the request queue should be unregistered from sysfs. > + * > + * Note: the caller is responsible for guaranteeing that this function is > called > + * after blk_register_queue() has finished. > + */ > void blk_unregister_queue(struct gendisk *disk) > { > struct request_queue *q = disk->queue; > @@ -934,28 +946,29 @@ void blk_unregister_queue(struct gendisk *disk) > if (!test_bit(QUEUE_FLAG_REGISTERED, >queue_flags)) > return; > > - /* > -* Protect against the 'queue' kobj being accessed > -* while/after it is removed. > -*/ > - mutex_lock(>sysfs_lock); > - > spin_lock_irq(q->queue_lock); > queue_flag_clear(QUEUE_FLAG_REGISTERED, q); > spin_unlock_irq(q->queue_lock); > > - wbt_exit(q); > - > + /* > +* Remove the sysfs attributes before unregistering the queue data > +* structures that can be modified through sysfs. > +*/ > + mutex_lock(>sysfs_lock); > if (q->mq_ops) > blk_mq_unregister_dev(disk_to_dev(disk), q); > - > - if (q->request_fn || (q->mq_ops && q->elevator)) > - elv_unregister_queue(q); > + mutex_unlock(>sysfs_lock); > > kobject_uevent(>kobj, KOBJ_REMOVE); > kobject_del(>kobj); elevator switch still can come just after the above line code is completed, so the previous warning addressed in e9a823fb34a8b can be triggered again. > blk_trace_remove_sysfs(disk_to_dev(disk)); > - kobject_put(_to_dev(disk)->kobj); > > + wbt_exit(q); > + > + mutex_lock(>sysfs_lock); > + if (q->request_fn || (q->mq_ops && q->elevator)) > + elv_unregister_queue(q); > mutex_unlock(>sysfs_lock); > + > + kobject_put(_to_dev(disk)->kobj); > } > diff --git a/block/elevator.c b/block/elevator.c > index e87e9b43aba0..4b7957b28a99 100644 > --- a/block/elevator.c > +++ b/block/elevator.c > @@ -1080,10 +1080,6 @@ static int __elevator_change(struct request_queue *q, > const char *name) > char elevator_name[ELV_NAME_MAX]; > struct elevator_type *e; > > - /* Make sure queue is not in the middle of being removed */ > - if (!test_bit(QUEUE_FLAG_REGISTERED, >queue_flags)) > - return -ENOENT; > - The above check shouldn't be removed, as I explained above. -- Ming Lei
Re: [PATCH 3/3] block: Protect less code with sysfs_lock in blk_{un,}register_queue()
On Tue, 2018-01-16 at 17:32 -0500, Mike Snitzer wrote: > Therefore it seems to me that all queue_attr_{show,store} are racey vs > blk_unregister_queue() removing the 'queue' kobject. > > And it was just that __elevator_change() was myopicly fixed to address > the race whereas a more generic solution was/is needed. But short of > that more generic fix your change will reintroduce the potential for > hitting the issue that commit e9a823fb34a8b fixed. > > In that light, think it best to leave blk_unregister_queue()'s > mutex_lock() above the QUEUE_FLAG_REGISTERED clearing _and_ update > queue_attr_{show,store} to test for QUEUE_FLAG_REGISTERED while holding > sysfs_lock. > > Then remove the unicorn test_bit for QUEUE_FLAG_REGISTERED from > __elevator_change(). Thanks Mike for the feedback. However, I think a simpler approach exists than what has been described above, namely by unregistering the sysfs attributes earlier. How about the patch below? [PATCH] block: Protect less code with sysfs_lock in blk_{un,}register_queue() --- block/blk-sysfs.c | 39 ++- block/elevator.c | 4 2 files changed, 26 insertions(+), 17 deletions(-) diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 4a6a40ffd78e..ce32366c6db7 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -853,6 +853,10 @@ struct kobj_type blk_queue_ktype = { .release= blk_release_queue, }; +/** + * blk_register_queue - register a block layer queue with sysfs + * @disk: Disk of which the request queue should be registered with sysfs. + */ int blk_register_queue(struct gendisk *disk) { int ret; @@ -909,11 +913,12 @@ int blk_register_queue(struct gendisk *disk) if (q->request_fn || (q->mq_ops && q->elevator)) { ret = elv_register_queue(q); if (ret) { + mutex_unlock(>sysfs_lock); kobject_uevent(>kobj, KOBJ_REMOVE); kobject_del(>kobj); blk_trace_remove_sysfs(dev); kobject_put(>kobj); - goto unlock; + return ret; } } ret = 0; @@ -923,6 +928,13 @@ int blk_register_queue(struct gendisk *disk) } EXPORT_SYMBOL_GPL(blk_register_queue); +/** + * blk_unregister_queue - counterpart of blk_register_queue() + * @disk: Disk of which the request queue should be unregistered from sysfs. + * + * Note: the caller is responsible for guaranteeing that this function is called + * after blk_register_queue() has finished. + */ void blk_unregister_queue(struct gendisk *disk) { struct request_queue *q = disk->queue; @@ -934,28 +946,29 @@ void blk_unregister_queue(struct gendisk *disk) if (!test_bit(QUEUE_FLAG_REGISTERED, >queue_flags)) return; - /* -* Protect against the 'queue' kobj being accessed -* while/after it is removed. -*/ - mutex_lock(>sysfs_lock); - spin_lock_irq(q->queue_lock); queue_flag_clear(QUEUE_FLAG_REGISTERED, q); spin_unlock_irq(q->queue_lock); - wbt_exit(q); - + /* +* Remove the sysfs attributes before unregistering the queue data +* structures that can be modified through sysfs. +*/ + mutex_lock(>sysfs_lock); if (q->mq_ops) blk_mq_unregister_dev(disk_to_dev(disk), q); - - if (q->request_fn || (q->mq_ops && q->elevator)) - elv_unregister_queue(q); + mutex_unlock(>sysfs_lock); kobject_uevent(>kobj, KOBJ_REMOVE); kobject_del(>kobj); blk_trace_remove_sysfs(disk_to_dev(disk)); - kobject_put(_to_dev(disk)->kobj); + wbt_exit(q); + + mutex_lock(>sysfs_lock); + if (q->request_fn || (q->mq_ops && q->elevator)) + elv_unregister_queue(q); mutex_unlock(>sysfs_lock); + + kobject_put(_to_dev(disk)->kobj); } diff --git a/block/elevator.c b/block/elevator.c index e87e9b43aba0..4b7957b28a99 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -1080,10 +1080,6 @@ static int __elevator_change(struct request_queue *q, const char *name) char elevator_name[ELV_NAME_MAX]; struct elevator_type *e; - /* Make sure queue is not in the middle of being removed */ - if (!test_bit(QUEUE_FLAG_REGISTERED, >queue_flags)) - return -ENOENT; - /* * Special case for mq, turn off scheduling */ -- 2.15.1
Re: [PATCH 3/3] block: Protect less code with sysfs_lock in blk_{un,}register_queue()
On Tue, Jan 16 2018 at 1:17pm -0500, Bart Van Asschewrote: > The __blk_mq_register_dev(), blk_mq_unregister_dev(), > elv_register_queue() and elv_unregister_queue() calls need to be > protected with sysfs_lock but other code in these functions not. > Hence protect only this code with sysfs_lock. This patch fixes a > locking inversion issue in blk_unregister_queue() and also in an > error path of blk_register_queue(): it is not allowed to hold > sysfs_lock around the kobject_del(>kobj) call. > > Signed-off-by: Bart Van Assche > --- > block/blk-sysfs.c | 13 - > 1 file changed, 4 insertions(+), 9 deletions(-) > > diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c > index 4a6a40ffd78e..e9ce45ff0ef2 100644 > --- a/block/blk-sysfs.c > +++ b/block/blk-sysfs.c > @@ -909,11 +909,12 @@ int blk_register_queue(struct gendisk *disk) > if (q->request_fn || (q->mq_ops && q->elevator)) { > ret = elv_register_queue(q); > if (ret) { > + mutex_unlock(>sysfs_lock); > kobject_uevent(>kobj, KOBJ_REMOVE); > kobject_del(>kobj); > blk_trace_remove_sysfs(dev); > kobject_put(>kobj); > - goto unlock; > + return ret; > } > } > ret = 0; > @@ -934,28 +935,22 @@ void blk_unregister_queue(struct gendisk *disk) > if (!test_bit(QUEUE_FLAG_REGISTERED, >queue_flags)) > return; > > - /* > - * Protect against the 'queue' kobj being accessed > - * while/after it is removed. > - */ > - mutex_lock(>sysfs_lock); > - > spin_lock_irq(q->queue_lock); > queue_flag_clear(QUEUE_FLAG_REGISTERED, q); > spin_unlock_irq(q->queue_lock); > > wbt_exit(q); > > + mutex_lock(>sysfs_lock); > if (q->mq_ops) > blk_mq_unregister_dev(disk_to_dev(disk), q); > > if (q->request_fn || (q->mq_ops && q->elevator)) > elv_unregister_queue(q); My concern with this change is detailed in the following portion of the header for commit 667257e8b2988c0183ba23e2bcd6900e87961606: 2) Conversely, __elevator_change() is testing for QUEUE_FLAG_REGISTERED in case elv_iosched_store() loses the race with blk_unregister_queue(), it needs a way to know the 'queue' kobj isn't there. I don't think moving mutex_lock(>sysfs_lock); after the clearing of QUEUE_FLAG_REGISTERED is a step in the right direction. Current code shows: blk_cleanup_queue() calls blk_set_queue_dying() while holding the sysfs_lock. queue_attr_{show,store} both test if blk_queue_dying(q) while holding the sysfs_lock. BUT drivers can/do call del_gendisk() _before_ blk_cleanup_queue(). (if your proposed change above were to go in all of the block drivers would first need to be audited for the need to call blk_cleanup_queue() before del_gendisk() -- seems awful). Therefore it seems to me that all queue_attr_{show,store} are racey vs blk_unregister_queue() removing the 'queue' kobject. And it was just that __elevator_change() was myopicly fixed to address the race whereas a more generic solution was/is needed. But short of that more generic fix your change will reintroduce the potential for hitting the issue that commit e9a823fb34a8b fixed. In that light, think it best to leave blk_unregister_queue()'s mutex_lock() above the QUEUE_FLAG_REGISTERED clearing _and_ update queue_attr_{show,store} to test for QUEUE_FLAG_REGISTERED while holding sysfs_lock. Then remove the unicorn test_bit for QUEUE_FLAG_REGISTERED from __elevator_change(). But it could be I'm wrong for some reason.. as you know that happens ;) Mike