On Wed, 10 Oct 2012, Ben Hutchings wrote:

> > >>>>>> I'm still seeing this on linux-next.
> > >>>> I think this is floppy related (see redo_fd_request() in the stack
> > >>>> trace). And there were quite some changes to the area recently. Adding
> > >>>> maintainer to CC.
> > >> Hmm ... I don't immediately see how this is happening.
> > >>
> > >> Sasha, could you please do git bisect on drivers/block/floppy.c between
> > >> f6365201d and your git HEAD for starters (assuming that f6365201d works
> > >> well for you?).
> > >>
> > >
> > > A bisect on floppy.c yielded the following:
> > >
> > > b33d002f4b6bae912463e5a66387c498aa69b6fe is the first bad commit
> > > commit b33d002f4b6bae912463e5a66387c498aa69b6fe
> > > Author: Ben Hutchings <b...@decadent.org.uk>
> > > Date: Mon Aug 27 20:56:53 2012 -0300
> > >
> > > genhd: Make put_disk() safe for disks that have not been registered
> >
> > 2 more things:
> >
> > 1. The guest vm which I'm testing on doesn't emulate anything which even
> > looks like a floppy.
> > 2. I'm seeing the following lines before the BUG:
> >
> > [ 9.836604] floppy0: no floppy controllers found
> > [ 9.837246] work still pending
> > [ 9.837743] floppy0: floppy_shutdown: timeout handler died.
>
> I see two problems:
>
> 1. redo_fd_request() races with tear-down of the disks, but because
> set_next_request() checks disk->queue before doing anything this was
> usually harmless. Now that do_floppy_init() doesn't clear disk->queue,
> the race condition is much easier to hit. This may fix that problem in
> do_floppy_init(), though there appear to be worse bugs in tear-down
> order in floppy_module_exit():
>
> --- a/drivers/block/floppy.c
> +++ b/drivers/block/floppy.c
> @@ -4320,13 +4320,13 @@ out_unreg_region:
>  out_unreg_blkdev:
>  	unregister_blkdev(FLOPPY_MAJOR, "fd");
>  out_put_disk:
> +	destroy_workqueue(floppy_wq);
>  	while (dr--) {
>  		del_timer_sync(&motor_off_timer[dr]);
>  		if (disks[dr]->queue)
>  			blk_cleanup_queue(disks[dr]->queue);
>  		put_disk(disks[dr]);
>  	}
> -	destroy_workqueue(floppy_wq);
>  	return err;
>  }
>
> --- END ---
>
> 2. I made a big mistake in using the existing GENHD_FL_UP flag, as it is
> cleared by del_gendisk(). Incremental patch below, but it should be
> squashed into the previous patch if that branch is still rebase-able.

Sasha, did you manage to test this to see if it fixes the symptom you are
seeing, please?

-- 
Jiri Kosina
SUSE Labs
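
The tear-down ordering discussed in point 1 of the quoted mail can be modelled outside the kernel. What follows is only a single-threaded userspace sketch, not floppy.c code: every toy_* name is hypothetical, and the "workqueue" holds at most one pending item. It assumes only that destroying the workqueue runs whatever work is still pending, and it shows the order the quoted patch moves to: drain the work first, release the disk second.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are kernel types. */
struct toy_disk {
	int *queue;	/* NULL once the queue has been cleaned up */
	int served;	/* requests the work item managed to serve */
};

struct toy_wq {
	void (*pending)(struct toy_disk *);	/* at most one queued item */
	struct toy_disk *arg;
};

/* Work item: checks the queue before touching it, then serves a request. */
static void toy_redo_request(struct toy_disk *d)
{
	if (d->queue) {
		*d->queue += 1;
		d->served++;
	}
}

/* Toy model of destroying a workqueue: run pending work, then stop. */
static void toy_destroy_wq(struct toy_wq *wq)
{
	if (wq->pending) {
		wq->pending(wq->arg);
		wq->pending = NULL;
	}
}

/* Toy model of cleaning up the queue and dropping one disk. */
static void toy_release_disk(struct toy_disk *d)
{
	assert(d != NULL);
	free(d->queue);
	d->queue = NULL;
}

/*
 * Tear down in the proposed order. Returns 0 if the pending work ran
 * against a live queue and nothing is left pending afterwards, -1 otherwise.
 */
int toy_teardown_demo(void)
{
	struct toy_disk disk = { .queue = calloc(1, sizeof(int)), .served = 0 };
	struct toy_wq wq = { .pending = toy_redo_request, .arg = &disk };

	if (!disk.queue)
		return -1;

	toy_destroy_wq(&wq);		/* 1. no work can run after this */
	toy_release_disk(&disk);	/* 2. now safe to free what it used */

	return (disk.served == 1 && wq.pending == NULL &&
		disk.queue == NULL) ? 0 : -1;
}
```

With the two calls swapped, the pending item would run only after the release and would depend entirely on the NULL check in toy_redo_request(), which is the situation point 1 describes as "usually harmless" until the pointer stops being cleared.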