On Sat, Apr 18, 2015 at 4:30 PM, Jens Axboe <ax...@kernel.dk> wrote:
> On 04/17/2015 10:23 PM, Ming Lei wrote:
>>
>> Hi Dongsu,
>>
>> On Fri, Apr 17, 2015 at 5:41 AM, Dongsu Park
>> <dongsu.p...@profitbricks.com> wrote:
>>>
>>> Hi,
>>>
>>> there's a critical bug regarding CPU hotplug, blk-mq, and scsi-mq.
>>> Every time a CPU is offlined, some arbitrary range of kernel memory
>>> seems to get corrupted. Then, after a while, the kernel panics at
>>> random places when block IOs are issued. (for example, see the call
>>> traces below)
>>
>> Thanks for the report.
>>
>>> This bug is easily reproducible with a Qemu VM running virtio-scsi,
>>> when its guest kernel is 3.19-rc1 or higher and scsi-mq is loaded
>>> with blk-mq enabled. And yes, the 4.0 release is still affected, as
>>> well as Jens' for-4.1/core. How to reproduce:
>>>
>>>   # echo 0 > /sys/devices/system/cpu/cpu1/online
>>>   (and issue some block IOs, that's it.)
>>>
>>> Bisecting between 3.18 and 3.19-rc1, it looks like this bug had been
>>> hidden until commit ccbedf117f01 ("virtio_scsi: support multi hw
>>> queue of blk-mq"), which started to allow virtio-scsi to map
>>> virtqueues to hardware queues of blk-mq. Reverting that commit makes
>>> the bug go away; however, I don't think reverting it is the correct
>>> solution.
>>
>> I agree, and that patch only enables multiple hw queues.
>>
>>> More precisely, every time a CPU hotplug event gets triggered,
>>> the call graph looks like the following:
>>>
>>>   blk_mq_queue_reinit_notify()
>>>     -> blk_mq_queue_reinit()
>>>       -> blk_mq_map_swqueue()
>>>         -> blk_mq_free_rq_map()
>>>           -> scsi_exit_request()
>>>
>>> From that point on, as soon as any address in the request gets
>>> modified, an arbitrary range of memory gets corrupted. My first
>>> guess was that the exit routine might try to deallocate tags->rqs[]
>>> entries that hold invalid addresses. But that does not seem to be
>>> the case, and cmd->sense_buffer also looks valid. It's not obvious
>>> to me exactly what could go wrong.
>>>
>>> Does anyone have an idea?
>>
>> As far as I can see, at least two problems exist:
>>   - a race between timeout handling and CPU hotplug
>>   - in the case of shared tags, setting and checking hctx->tags
>>     during CPU online handling
>>
>> So could you please test the attached two patches to see if they fix
>> your issue?
>>
>> I ran them in my VM, and the oops does seem to disappear.
>
> Hard to comment on your patches directly when they are attached. Both
> look good to me. I'd perhaps change the ->tags check in #1 to use
> blk_mq_hw_queue_mapped() instead of checking directly. Might even be
> worth considering changing the normal iterator to skip unmapped
> queues, but that can be left for a later change.
>
> --
> Jens Axboe
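The helper you suggest is trivially cheap, too. For reference, a sketch
from memory of the 4.0-era include/linux/blk-mq.h (the exact body is my
assumption, please check against the tree):

/*
 * Sketch (from memory) of blk_mq_hw_queue_mapped(): a hw queue counts
 * as mapped only if at least one software ctx points at it and its
 * tag set has been allocated.
 */
static inline bool blk_mq_hw_queue_mapped(struct blk_mq_hw_ctx *hctx)
{
	return hctx->nr_ctx && hctx->tags;
}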
Using it makes sense, and blk_mq_hw_queue_mapped() is easy to backport
too. I will send out v1 later with this change.

As for changing the normal iterator to skip unmapped queues: yes, that
should be left for later, because we want these fixes to be easy to
backport to stable.
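For completeness, here is roughly how the check ends up looking in the
timeout path that patch #1 touches. This is only a sketch against my
memory of 4.0-era block/blk-mq.c, not the actual hunk; the callback and
data names (blk_mq_check_expired, blk_mq_timeout_data) and the
abbreviated tail are assumptions:

/*
 * Sketch only: skip unmapped hw queues in the timeout scan, so a
 * queue whose CPUs were all offlined (and whose tags may be gone or
 * in flux) is never walked.
 */
static void blk_mq_rq_timer(unsigned long priv)
{
	struct request_queue *q = (struct request_queue *)priv;
	struct blk_mq_timeout_data data = {
		.next		= 0,
		.next_set	= 0,
	};
	struct blk_mq_hw_ctx *hctx;
	int i;

	queue_for_each_hw_ctx(q, hctx, i) {
		/* was: if (!hctx->tags) continue; */
		if (!blk_mq_hw_queue_mapped(hctx))
			continue;

		blk_mq_tag_busy_iter(hctx, blk_mq_check_expired, &data);
	}

	if (data.next_set)
		mod_timer(&q->timeout, round_jiffies_up(data.next));
}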