Re: [PATCHSET v6] blk-mq scheduling framework
On 01/16/2017 08:16 AM, Jens Axboe wrote:
> On 01/16/2017 08:12 AM, Jens Axboe wrote:
>> On 01/16/2017 01:11 AM, Hannes Reinecke wrote:
>>> On 01/13/2017 05:02 PM, Jens Axboe wrote:
>>>> On 01/13/2017 09:00 AM, Jens Axboe wrote:
>>>>> On 01/13/2017 08:59 AM, Hannes Reinecke wrote:
>>>>>> On 01/13/2017 04:34 PM, Jens Axboe wrote:
>>>>>>> On 01/13/2017 08:33 AM, Hannes Reinecke wrote:
>>>>>> [ .. ]
>>>>>>>> Ah, indeed.
>>>>>>>> There is an ominous udev rule here, trying to switch to 'deadline'.
>>>>>>>>
>>>>>>>> # cat 60-ssd-scheduler.rules
>>>>>>>> # do not edit this file, it will be overwritten on update
>>>>>>>>
>>>>>>>> ACTION!="add", GOTO="ssd_scheduler_end"
>>>>>>>> SUBSYSTEM!="block", GOTO="ssd_scheduler_end"
>>>>>>>>
>>>>>>>> IMPORT{cmdline}="elevator"
>>>>>>>> ENV{elevator}=="*?", GOTO="ssd_scheduler_end"
>>>>>>>>
>>>>>>>> KERNEL=="sd*[!0-9]", ATTR{queue/rotational}=="0",
>>>>>>>> ATTR{queue/scheduler}="deadline"
>>>>>>>>
>>>>>>>> LABEL="ssd_scheduler_end"
>>>>>>>>
>>>>>>>> Still shouldn't crash the kernel, though ...
>>>>>>>
>>>>>>> Of course not, and it's not a given that it does, it could just be
>>>>>>> triggering after the device load and failing like expected. But just in
>>>>>>> case, can you try and disable that rule and see if it still crashes with
>>>>>>> MQ_DEADLINE set as the default?
>>>>>>>
>>>>>> Yes, it does.
>>>>>> Same stacktrace as before.
>>>>>
>>>>> Alright, that's as expected. I've tried with your rule and making
>>>>> everything modular, but it still boots fine for me. Very odd. Can you
>>>>> send me your .config? And are all the SCSI disks hanging off ahci? Or
>>>>> sdb specifically, is that ahci or something else?
>>>>
>>>> Also, would be great if you could pull:
>>>>
>>>> git://git.kernel.dk/linux-block blk-mq-sched
>>>>
>>>> into current 'master' and see if it still reproduces. I expect that it
>>>> will, but just want to ensure that it's a problem in the current code
>>>> base as well.
>>>>
>>> Actually, it doesn't. Seems to have resolved itself with the latest drop.
>>>
>>> However, now I've got a lockdep splat:
>>>
>>> Jan 16 09:05:02 lammermuir kernel: [ cut here ]
>>> Jan 16 09:05:02 lammermuir kernel: WARNING: CPU: 29 PID: 5860 at
>>> kernel/locking/lockdep.c:3514 lock_release+0x2a7/0x490
>>> Jan 16 09:05:02 lammermuir kernel: DEBUG_LOCKS_WARN_ON(depth <= 0)
>>> Jan 16 09:05:02 lammermuir kernel: Modules linked in: raid0 mpt3sas
>>> raid_class rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache e
>>> Jan 16 09:05:02 lammermuir kernel: fb_sys_fops ahci uhci_hcd ttm
>>> ehci_pci libahci ehci_hcd serio_raw crc32c_intel drm libata usbcore hpsa
>>> Jan 16 09:05:02 lammermuir kernel: CPU: 29 PID: 5860 Comm: fio Not
>>> tainted 4.10.0-rc3+ #540
>>> Jan 16 09:05:02 lammermuir kernel: Hardware name: HP ProLiant ML350p
>>> Gen8, BIOS P72 09/08/2013
>>> Jan 16 09:05:02 lammermuir kernel: Call Trace:
>>> Jan 16 09:05:02 lammermuir kernel:  dump_stack+0x85/0xc9
>>> Jan 16 09:05:02 lammermuir kernel:  __warn+0xd1/0xf0
>>> Jan 16 09:05:02 lammermuir kernel:  ? aio_write+0x118/0x170
>>> Jan 16 09:05:02 lammermuir kernel:  warn_slowpath_fmt+0x4f/0x60
>>> Jan 16 09:05:02 lammermuir kernel:  lock_release+0x2a7/0x490
>>> Jan 16 09:05:02 lammermuir kernel:  ? blkdev_write_iter+0x89/0xd0
>>> Jan 16 09:05:02 lammermuir kernel:  aio_write+0x138/0x170
>>> Jan 16 09:05:02 lammermuir kernel:  do_io_submit+0x4d2/0x8f0
>>> Jan 16 09:05:02 lammermuir kernel:  ? do_io_submit+0x413/0x8f0
>>> Jan 16 09:05:02 lammermuir kernel:  SyS_io_submit+0x10/0x20
>>> Jan 16 09:05:02 lammermuir kernel:  entry_SYSCALL_64_fastpath+0x23/0xc6
>>
>> Odd, not sure that's me. What did you pull my branch into? And what is the
>> sha of the stuff you pulled in?
>
> Forgot to ask, please send me the fio job you ran here.
Nevermind, it's a mainline bug that's fixed in -rc4:

commit a12f1ae61c489076a9aeb90bddca7722bf330df3
Author: Shaohua Li
Date:   Tue Dec 13 12:09:56 2016 -0800

    aio: fix lock dep warning

--
Jens Axboe
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHSET v6] blk-mq scheduling framework
On 01/16/2017 08:12 AM, Jens Axboe wrote:
> On 01/16/2017 01:11 AM, Hannes Reinecke wrote:
>> On 01/13/2017 05:02 PM, Jens Axboe wrote:
>>> On 01/13/2017 09:00 AM, Jens Axboe wrote:
>>>> On 01/13/2017 08:59 AM, Hannes Reinecke wrote:
>>>>> On 01/13/2017 04:34 PM, Jens Axboe wrote:
>>>>>> On 01/13/2017 08:33 AM, Hannes Reinecke wrote:
>>>>> [ .. ]
>>>>>>> Ah, indeed.
>>>>>>> There is an ominous udev rule here, trying to switch to 'deadline'.
>>>>>>>
>>>>>>> # cat 60-ssd-scheduler.rules
>>>>>>> # do not edit this file, it will be overwritten on update
>>>>>>>
>>>>>>> ACTION!="add", GOTO="ssd_scheduler_end"
>>>>>>> SUBSYSTEM!="block", GOTO="ssd_scheduler_end"
>>>>>>>
>>>>>>> IMPORT{cmdline}="elevator"
>>>>>>> ENV{elevator}=="*?", GOTO="ssd_scheduler_end"
>>>>>>>
>>>>>>> KERNEL=="sd*[!0-9]", ATTR{queue/rotational}=="0",
>>>>>>> ATTR{queue/scheduler}="deadline"
>>>>>>>
>>>>>>> LABEL="ssd_scheduler_end"
>>>>>>>
>>>>>>> Still shouldn't crash the kernel, though ...
>>>>>>
>>>>>> Of course not, and it's not a given that it does, it could just be
>>>>>> triggering after the device load and failing like expected. But just in
>>>>>> case, can you try and disable that rule and see if it still crashes with
>>>>>> MQ_DEADLINE set as the default?
>>>>>>
>>>>> Yes, it does.
>>>>> Same stacktrace as before.
>>>>
>>>> Alright, that's as expected. I've tried with your rule and making
>>>> everything modular, but it still boots fine for me. Very odd. Can you
>>>> send me your .config? And are all the SCSI disks hanging off ahci? Or
>>>> sdb specifically, is that ahci or something else?
>>>
>>> Also, would be great if you could pull:
>>>
>>> git://git.kernel.dk/linux-block blk-mq-sched
>>>
>>> into current 'master' and see if it still reproduces. I expect that it
>>> will, but just want to ensure that it's a problem in the current code
>>> base as well.
>>>
>> Actually, it doesn't. Seems to have resolved itself with the latest drop.
>>
>> However, now I've got a lockdep splat:
>>
>> Jan 16 09:05:02 lammermuir kernel: [ cut here ]
>> Jan 16 09:05:02 lammermuir kernel: WARNING: CPU: 29 PID: 5860 at
>> kernel/locking/lockdep.c:3514 lock_release+0x2a7/0x490
>> Jan 16 09:05:02 lammermuir kernel: DEBUG_LOCKS_WARN_ON(depth <= 0)
>> Jan 16 09:05:02 lammermuir kernel: Modules linked in: raid0 mpt3sas
>> raid_class rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache e
>> Jan 16 09:05:02 lammermuir kernel: fb_sys_fops ahci uhci_hcd ttm
>> ehci_pci libahci ehci_hcd serio_raw crc32c_intel drm libata usbcore hpsa
>> Jan 16 09:05:02 lammermuir kernel: CPU: 29 PID: 5860 Comm: fio Not
>> tainted 4.10.0-rc3+ #540
>> Jan 16 09:05:02 lammermuir kernel: Hardware name: HP ProLiant ML350p
>> Gen8, BIOS P72 09/08/2013
>> Jan 16 09:05:02 lammermuir kernel: Call Trace:
>> Jan 16 09:05:02 lammermuir kernel:  dump_stack+0x85/0xc9
>> Jan 16 09:05:02 lammermuir kernel:  __warn+0xd1/0xf0
>> Jan 16 09:05:02 lammermuir kernel:  ? aio_write+0x118/0x170
>> Jan 16 09:05:02 lammermuir kernel:  warn_slowpath_fmt+0x4f/0x60
>> Jan 16 09:05:02 lammermuir kernel:  lock_release+0x2a7/0x490
>> Jan 16 09:05:02 lammermuir kernel:  ? blkdev_write_iter+0x89/0xd0
>> Jan 16 09:05:02 lammermuir kernel:  aio_write+0x138/0x170
>> Jan 16 09:05:02 lammermuir kernel:  do_io_submit+0x4d2/0x8f0
>> Jan 16 09:05:02 lammermuir kernel:  ? do_io_submit+0x413/0x8f0
>> Jan 16 09:05:02 lammermuir kernel:  SyS_io_submit+0x10/0x20
>> Jan 16 09:05:02 lammermuir kernel:  entry_SYSCALL_64_fastpath+0x23/0xc6
>
> Odd, not sure that's me. What did you pull my branch into? And what is the
> sha of the stuff you pulled in?

Forgot to ask, please send me the fio job you ran here.

--
Jens Axboe
Re: [PATCHSET v6] blk-mq scheduling framework
On 01/16/2017 01:11 AM, Hannes Reinecke wrote:
> On 01/13/2017 05:02 PM, Jens Axboe wrote:
>> On 01/13/2017 09:00 AM, Jens Axboe wrote:
>>> On 01/13/2017 08:59 AM, Hannes Reinecke wrote:
>>>> On 01/13/2017 04:34 PM, Jens Axboe wrote:
>>>>> On 01/13/2017 08:33 AM, Hannes Reinecke wrote:
>>>> [ .. ]
>>>>>> Ah, indeed.
>>>>>> There is an ominous udev rule here, trying to switch to 'deadline'.
>>>>>>
>>>>>> # cat 60-ssd-scheduler.rules
>>>>>> # do not edit this file, it will be overwritten on update
>>>>>>
>>>>>> ACTION!="add", GOTO="ssd_scheduler_end"
>>>>>> SUBSYSTEM!="block", GOTO="ssd_scheduler_end"
>>>>>>
>>>>>> IMPORT{cmdline}="elevator"
>>>>>> ENV{elevator}=="*?", GOTO="ssd_scheduler_end"
>>>>>>
>>>>>> KERNEL=="sd*[!0-9]", ATTR{queue/rotational}=="0",
>>>>>> ATTR{queue/scheduler}="deadline"
>>>>>>
>>>>>> LABEL="ssd_scheduler_end"
>>>>>>
>>>>>> Still shouldn't crash the kernel, though ...
>>>>>
>>>>> Of course not, and it's not a given that it does, it could just be
>>>>> triggering after the device load and failing like expected. But just in
>>>>> case, can you try and disable that rule and see if it still crashes with
>>>>> MQ_DEADLINE set as the default?
>>>>>
>>>> Yes, it does.
>>>> Same stacktrace as before.
>>>
>>> Alright, that's as expected. I've tried with your rule and making
>>> everything modular, but it still boots fine for me. Very odd. Can you
>>> send me your .config? And are all the SCSI disks hanging off ahci? Or
>>> sdb specifically, is that ahci or something else?
>>
>> Also, would be great if you could pull:
>>
>> git://git.kernel.dk/linux-block blk-mq-sched
>>
>> into current 'master' and see if it still reproduces. I expect that it
>> will, but just want to ensure that it's a problem in the current code
>> base as well.
>>
> Actually, it doesn't. Seems to have resolved itself with the latest drop.
>
> However, now I've got a lockdep splat:
>
> Jan 16 09:05:02 lammermuir kernel: [ cut here ]
> Jan 16 09:05:02 lammermuir kernel: WARNING: CPU: 29 PID: 5860 at
> kernel/locking/lockdep.c:3514 lock_release+0x2a7/0x490
> Jan 16 09:05:02 lammermuir kernel: DEBUG_LOCKS_WARN_ON(depth <= 0)
> Jan 16 09:05:02 lammermuir kernel: Modules linked in: raid0 mpt3sas
> raid_class rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache e
> Jan 16 09:05:02 lammermuir kernel: fb_sys_fops ahci uhci_hcd ttm
> ehci_pci libahci ehci_hcd serio_raw crc32c_intel drm libata usbcore hpsa
> Jan 16 09:05:02 lammermuir kernel: CPU: 29 PID: 5860 Comm: fio Not
> tainted 4.10.0-rc3+ #540
> Jan 16 09:05:02 lammermuir kernel: Hardware name: HP ProLiant ML350p
> Gen8, BIOS P72 09/08/2013
> Jan 16 09:05:02 lammermuir kernel: Call Trace:
> Jan 16 09:05:02 lammermuir kernel:  dump_stack+0x85/0xc9
> Jan 16 09:05:02 lammermuir kernel:  __warn+0xd1/0xf0
> Jan 16 09:05:02 lammermuir kernel:  ? aio_write+0x118/0x170
> Jan 16 09:05:02 lammermuir kernel:  warn_slowpath_fmt+0x4f/0x60
> Jan 16 09:05:02 lammermuir kernel:  lock_release+0x2a7/0x490
> Jan 16 09:05:02 lammermuir kernel:  ? blkdev_write_iter+0x89/0xd0
> Jan 16 09:05:02 lammermuir kernel:  aio_write+0x138/0x170
> Jan 16 09:05:02 lammermuir kernel:  do_io_submit+0x4d2/0x8f0
> Jan 16 09:05:02 lammermuir kernel:  ? do_io_submit+0x413/0x8f0
> Jan 16 09:05:02 lammermuir kernel:  SyS_io_submit+0x10/0x20
> Jan 16 09:05:02 lammermuir kernel:  entry_SYSCALL_64_fastpath+0x23/0xc6

Odd, not sure that's me. What did you pull my branch into? And what is the
sha of the stuff you pulled in?

--
Jens Axboe
Re: [PATCHSET v6] blk-mq scheduling framework
On 01/13/2017 05:02 PM, Jens Axboe wrote:
> On 01/13/2017 09:00 AM, Jens Axboe wrote:
>> On 01/13/2017 08:59 AM, Hannes Reinecke wrote:
>>> On 01/13/2017 04:34 PM, Jens Axboe wrote:
>>>> On 01/13/2017 08:33 AM, Hannes Reinecke wrote:
>>> [ .. ]
>>>>> Ah, indeed.
>>>>> There is an ominous udev rule here, trying to switch to 'deadline'.
>>>>>
>>>>> # cat 60-ssd-scheduler.rules
>>>>> # do not edit this file, it will be overwritten on update
>>>>>
>>>>> ACTION!="add", GOTO="ssd_scheduler_end"
>>>>> SUBSYSTEM!="block", GOTO="ssd_scheduler_end"
>>>>>
>>>>> IMPORT{cmdline}="elevator"
>>>>> ENV{elevator}=="*?", GOTO="ssd_scheduler_end"
>>>>>
>>>>> KERNEL=="sd*[!0-9]", ATTR{queue/rotational}=="0",
>>>>> ATTR{queue/scheduler}="deadline"
>>>>>
>>>>> LABEL="ssd_scheduler_end"
>>>>>
>>>>> Still shouldn't crash the kernel, though ...
>>>>
>>>> Of course not, and it's not a given that it does, it could just be
>>>> triggering after the device load and failing like expected. But just in
>>>> case, can you try and disable that rule and see if it still crashes with
>>>> MQ_DEADLINE set as the default?
>>>>
>>> Yes, it does.
>>> Same stacktrace as before.
>>
>> Alright, that's as expected. I've tried with your rule and making
>> everything modular, but it still boots fine for me. Very odd. Can you
>> send me your .config? And are all the SCSI disks hanging off ahci? Or
>> sdb specifically, is that ahci or something else?
>
> Also, would be great if you could pull:
>
> git://git.kernel.dk/linux-block blk-mq-sched
>
> into current 'master' and see if it still reproduces. I expect that it
> will, but just want to ensure that it's a problem in the current code
> base as well.
>
Actually, it doesn't. Seems to have resolved itself with the latest drop.

However, now I've got a lockdep splat:

Jan 16 09:05:02 lammermuir kernel: [ cut here ]
Jan 16 09:05:02 lammermuir kernel: WARNING: CPU: 29 PID: 5860 at
kernel/locking/lockdep.c:3514 lock_release+0x2a7/0x490
Jan 16 09:05:02 lammermuir kernel: DEBUG_LOCKS_WARN_ON(depth <= 0)
Jan 16 09:05:02 lammermuir kernel: Modules linked in: raid0 mpt3sas
raid_class rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache e
Jan 16 09:05:02 lammermuir kernel: fb_sys_fops ahci uhci_hcd ttm
ehci_pci libahci ehci_hcd serio_raw crc32c_intel drm libata usbcore hpsa
Jan 16 09:05:02 lammermuir kernel: CPU: 29 PID: 5860 Comm: fio Not
tainted 4.10.0-rc3+ #540
Jan 16 09:05:02 lammermuir kernel: Hardware name: HP ProLiant ML350p
Gen8, BIOS P72 09/08/2013
Jan 16 09:05:02 lammermuir kernel: Call Trace:
Jan 16 09:05:02 lammermuir kernel:  dump_stack+0x85/0xc9
Jan 16 09:05:02 lammermuir kernel:  __warn+0xd1/0xf0
Jan 16 09:05:02 lammermuir kernel:  ? aio_write+0x118/0x170
Jan 16 09:05:02 lammermuir kernel:  warn_slowpath_fmt+0x4f/0x60
Jan 16 09:05:02 lammermuir kernel:  lock_release+0x2a7/0x490
Jan 16 09:05:02 lammermuir kernel:  ? blkdev_write_iter+0x89/0xd0
Jan 16 09:05:02 lammermuir kernel:  aio_write+0x138/0x170
Jan 16 09:05:02 lammermuir kernel:  do_io_submit+0x4d2/0x8f0
Jan 16 09:05:02 lammermuir kernel:  ? do_io_submit+0x413/0x8f0
Jan 16 09:05:02 lammermuir kernel:  SyS_io_submit+0x10/0x20
Jan 16 09:05:02 lammermuir kernel:  entry_SYSCALL_64_fastpath+0x23/0xc6

Cheers,

Hannes
--
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de                   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
Re: [PATCHSET v6] blk-mq scheduling framework
On 01/15/2017 03:12 AM, Paolo Valente wrote:
>
>> Il giorno 11 gen 2017, alle ore 22:39, Jens Axboe ha scritto:
>>
>> Another year, another posting of this patchset. The previous posting
>> was here:
>>
>> https://www.spinics.net/lists/kernel/msg2406106.html
>>
>> (yes, I've skipped v5, it was fixes on top of v4, not the rework).
>>
>> I've reworked bits of this to get rid of the shadow requests, thanks
>> to Bart for the inspiration. The missing piece, for me, was the fact
>> that we have the tags->rqs[] indirection array already. I've done this
>> somewhat differently, though, by having the internal scheduler tag
>> map be allocated/torn down when an IO scheduler is attached or
>> detached. This also means that when we run without a scheduler, we
>> don't have to do double tag allocations, it'll work like before.
>>
>> The patchset applies on top of 4.10-rc3, or can be pulled here:
>>
>> git://git.kernel.dk/linux-block blk-mq-sched.6
>>
>
> Hi Jens,
> I have checked this new version to find solutions to the apparent
> errors, mistakes or just unclear parts (to me) that I have pointed out
> before Christmas last year. But I have found no changes related to
> these problems.
>
> As I have already written, I'm willing to try to fix those errors
> myself, if they really are errors, but I would first need at least
> some minimal initial feedback and guidance. If needed, tell me how I
> can help you get in sync again with these issues (sending my reports
> again, sending a digest of them, ...).

Sorry Paolo, but focus has been on getting the framework in both a
mergeable and stable state, which it is now. I'll tend to BFQ specific
issues next week, so we can get those resolved as well. Do you have a
place where you have posted your in-progress conversion?

--
Jens Axboe
Re: [PATCHSET v6] blk-mq scheduling framework
> Il giorno 11 gen 2017, alle ore 22:39, Jens Axboe ha scritto:
>
> Another year, another posting of this patchset. The previous posting
> was here:
>
> https://www.spinics.net/lists/kernel/msg2406106.html
>
> (yes, I've skipped v5, it was fixes on top of v4, not the rework).
>
> I've reworked bits of this to get rid of the shadow requests, thanks
> to Bart for the inspiration. The missing piece, for me, was the fact
> that we have the tags->rqs[] indirection array already. I've done this
> somewhat differently, though, by having the internal scheduler tag
> map be allocated/torn down when an IO scheduler is attached or
> detached. This also means that when we run without a scheduler, we
> don't have to do double tag allocations, it'll work like before.
>
> The patchset applies on top of 4.10-rc3, or can be pulled here:
>
> git://git.kernel.dk/linux-block blk-mq-sched.6
>

Hi Jens,
I have checked this new version to find solutions to the apparent
errors, mistakes or just unclear parts (to me) that I have pointed out
before Christmas last year. But I have found no changes related to
these problems.

As I have already written, I'm willing to try to fix those errors
myself, if they really are errors, but I would first need at least
some minimal initial feedback and guidance. If needed, tell me how I
can help you get in sync again with these issues (sending my reports
again, sending a digest of them, ...).

Thanks,
Paolo

> block/Kconfig.iosched    |   50
> block/Makefile           |    3
> block/blk-core.c         |   19 -
> block/blk-exec.c         |    3
> block/blk-flush.c        |   15 -
> block/blk-ioc.c          |   12
> block/blk-merge.c        |    4
> block/blk-mq-sched.c     |  354 +
> block/blk-mq-sched.h     |  157
> block/blk-mq-sysfs.c     |   13 +
> block/blk-mq-tag.c       |   58 ++--
> block/blk-mq-tag.h       |    4
> block/blk-mq.c           |  413 +++---
> block/blk-mq.h           |   40 +++
> block/blk-tag.c          |    1
> block/blk.h              |   26 +-
> block/cfq-iosched.c      |    2
> block/deadline-iosched.c |    2
> block/elevator.c         |  247 +++-
> block/mq-deadline.c      |  569 +++
> block/noop-iosched.c     |    2
> drivers/nvme/host/pci.c  |    1
> include/linux/blk-mq.h   |    9
> include/linux/blkdev.h   |    6
> include/linux/elevator.h |   36 ++
> 25 files changed, 1732 insertions(+), 314 deletions(-)
>
> --
> Jens Axboe
Re: [PATCHSET v6] blk-mq scheduling framework
On 01/13/2017 09:02 AM, Jens Axboe wrote:
> Also, would be great if you could pull:
>
> git://git.kernel.dk/linux-block blk-mq-sched
>
> into current 'master' and see if it still reproduces. I expect that it
> will, but just want to ensure that it's a problem in the current code
> base as well.

Hannes, can you try the current branch? I believe your problem should be
fixed now, would be great if you could verify.

--
Jens Axboe
Re: [PATCHSET v6] blk-mq scheduling framework
On 01/13/2017 09:00 AM, Jens Axboe wrote:
> On 01/13/2017 08:59 AM, Hannes Reinecke wrote:
>> On 01/13/2017 04:34 PM, Jens Axboe wrote:
>>> On 01/13/2017 08:33 AM, Hannes Reinecke wrote:
>> [ .. ]
>>>> Ah, indeed.
>>>> There is an ominous udev rule here, trying to switch to 'deadline'.
>>>>
>>>> # cat 60-ssd-scheduler.rules
>>>> # do not edit this file, it will be overwritten on update
>>>>
>>>> ACTION!="add", GOTO="ssd_scheduler_end"
>>>> SUBSYSTEM!="block", GOTO="ssd_scheduler_end"
>>>>
>>>> IMPORT{cmdline}="elevator"
>>>> ENV{elevator}=="*?", GOTO="ssd_scheduler_end"
>>>>
>>>> KERNEL=="sd*[!0-9]", ATTR{queue/rotational}=="0",
>>>> ATTR{queue/scheduler}="deadline"
>>>>
>>>> LABEL="ssd_scheduler_end"
>>>>
>>>> Still shouldn't crash the kernel, though ...
>>>
>>> Of course not, and it's not a given that it does, it could just be
>>> triggering after the device load and failing like expected. But just in
>>> case, can you try and disable that rule and see if it still crashes with
>>> MQ_DEADLINE set as the default?
>>>
>> Yes, it does.
>> Same stacktrace as before.
>
> Alright, that's as expected. I've tried with your rule and making
> everything modular, but it still boots fine for me. Very odd. Can you
> send me your .config? And are all the SCSI disks hanging off ahci? Or
> sdb specifically, is that ahci or something else?

Also, would be great if you could pull:

git://git.kernel.dk/linux-block blk-mq-sched

into current 'master' and see if it still reproduces. I expect that it
will, but just want to ensure that it's a problem in the current code
base as well.

--
Jens Axboe
Re: [PATCHSET v6] blk-mq scheduling framework
On 01/13/2017 08:59 AM, Hannes Reinecke wrote:
> On 01/13/2017 04:34 PM, Jens Axboe wrote:
>> On 01/13/2017 08:33 AM, Hannes Reinecke wrote:
> [ .. ]
>>> Ah, indeed.
>>> There is an ominous udev rule here, trying to switch to 'deadline'.
>>>
>>> # cat 60-ssd-scheduler.rules
>>> # do not edit this file, it will be overwritten on update
>>>
>>> ACTION!="add", GOTO="ssd_scheduler_end"
>>> SUBSYSTEM!="block", GOTO="ssd_scheduler_end"
>>>
>>> IMPORT{cmdline}="elevator"
>>> ENV{elevator}=="*?", GOTO="ssd_scheduler_end"
>>>
>>> KERNEL=="sd*[!0-9]", ATTR{queue/rotational}=="0",
>>> ATTR{queue/scheduler}="deadline"
>>>
>>> LABEL="ssd_scheduler_end"
>>>
>>> Still shouldn't crash the kernel, though ...
>>
>> Of course not, and it's not a given that it does, it could just be
>> triggering after the device load and failing like expected. But just in
>> case, can you try and disable that rule and see if it still crashes with
>> MQ_DEADLINE set as the default?
>>
> Yes, it does.
> Same stacktrace as before.

Alright, that's as expected. I've tried with your rule and making
everything modular, but it still boots fine for me. Very odd. Can you
send me your .config? And are all the SCSI disks hanging off ahci? Or
sdb specifically, is that ahci or something else?

--
Jens Axboe
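[Editorial aside: the `KERNEL=="sd*[!0-9]"` match in the rule quoted above is
what restricts it to whole disks rather than partitions. udev uses
fnmatch-style patterns, which a POSIX shell `case` statement mirrors closely,
so the behavior can be sketched as follows; the `matches` helper is purely
illustrative and not part of the rule itself.]

```shell
# Sketch of which kernel names a pattern like udev's KERNEL=="sd*[!0-9]"
# accepts: whole disks (sdb, sdaa) match, partitions (sdb1) do not,
# because the last character must be a non-digit.
matches() {
    case "$1" in
        sd*[!0-9]) echo "$1: match" ;;
        *)         echo "$1: no match" ;;
    esac
}

matches sdb    # whole disk -> sdb: match
matches sdaa   # whole disk -> sdaa: match
matches sdb1   # partition  -> sdb1: no match
```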
Re: [PATCHSET v6] blk-mq scheduling framework
On 01/13/2017 04:34 PM, Jens Axboe wrote:
> On 01/13/2017 08:33 AM, Hannes Reinecke wrote:
[ .. ]
>> Ah, indeed.
>> There is an ominous udev rule here, trying to switch to 'deadline'.
>>
>> # cat 60-ssd-scheduler.rules
>> # do not edit this file, it will be overwritten on update
>>
>> ACTION!="add", GOTO="ssd_scheduler_end"
>> SUBSYSTEM!="block", GOTO="ssd_scheduler_end"
>>
>> IMPORT{cmdline}="elevator"
>> ENV{elevator}=="*?", GOTO="ssd_scheduler_end"
>>
>> KERNEL=="sd*[!0-9]", ATTR{queue/rotational}=="0",
>> ATTR{queue/scheduler}="deadline"
>>
>> LABEL="ssd_scheduler_end"
>>
>> Still shouldn't crash the kernel, though ...
>
> Of course not, and it's not a given that it does, it could just be
> triggering after the device load and failing like expected. But just in
> case, can you try and disable that rule and see if it still crashes with
> MQ_DEADLINE set as the default?
>
Yes, it does.
Same stacktrace as before.

Cheers,

Hannes
--
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de                   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
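[Editorial aside: the quoted rule writes the legacy elevator name "deadline",
which the sysfs store rejects on blk-mq queues, where this series registers
the scheduler as "mq-deadline". A hypothetical mq-aware variant might look
like the sketch below; the file path and labels follow the quoted rule, and
the "mq-deadline" name assumes the MQ_DEADLINE scheduler from this series.]

```
# /etc/udev/rules.d/60-ssd-scheduler.rules (hypothetical mq-aware sketch)
ACTION!="add", GOTO="ssd_scheduler_end"
SUBSYSTEM!="block", GOTO="ssd_scheduler_end"

IMPORT{cmdline}="elevator"
ENV{elevator}=="*?", GOTO="ssd_scheduler_end"

KERNEL=="sd*[!0-9]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="mq-deadline"

LABEL="ssd_scheduler_end"
```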
Re: [PATCHSET v6] blk-mq scheduling framework
On 01/13/2017 08:33 AM, Hannes Reinecke wrote:
> On 01/13/2017 04:23 PM, Jens Axboe wrote:
>> On 01/13/2017 04:04 AM, Hannes Reinecke wrote:
>>> On 01/13/2017 09:15 AM, Hannes Reinecke wrote:
>>>> On 01/11/2017 10:39 PM, Jens Axboe wrote:
>>>>> Another year, another posting of this patchset. The previous posting
>>>>> was here:
>>>>>
>>>>> https://www.spinics.net/lists/kernel/msg2406106.html
>>>>>
>>>>> (yes, I've skipped v5, it was fixes on top of v4, not the rework).
>>>>>
>>>>> I've reworked bits of this to get rid of the shadow requests, thanks
>>>>> to Bart for the inspiration. The missing piece, for me, was the fact
>>>>> that we have the tags->rqs[] indirection array already. I've done this
>>>>> somewhat differently, though, by having the internal scheduler tag
>>>>> map be allocated/torn down when an IO scheduler is attached or
>>>>> detached. This also means that when we run without a scheduler, we
>>>>> don't have to do double tag allocations, it'll work like before.
>>>>>
>>>>> The patchset applies on top of 4.10-rc3, or can be pulled here:
>>>>>
>>>>> git://git.kernel.dk/linux-block blk-mq-sched.6
>>>>>
>>>> Well ... something's wrong here on my machine:
>>>>
>>>> [ 39.886886] [ cut here ]
>>>> [ 39.886895] WARNING: CPU: 9 PID: 62 at block/blk-mq.c:342
>>>> __blk_mq_finish_request+0x124/0x140
>>>> [ 39.886895] Modules linked in: sd_mod ahci uhci_hcd ehci_pci
>>>> mpt3sas(+) libahci ehci_hcd serio_raw crc32c_intel raid_class drm libata
>>>> usbcore hpsa usb_common scsi_transport_sas sg dm_multipath dm_mod
>>>> scsi_dh_rdac scsi_dh_emc scsi_dh_alua autofs4
>>>> [ 39.886910] CPU: 9 PID: 62 Comm: kworker/u130:0 Not tainted
>>>> 4.10.0-rc3+ #528
>>>> [ 39.886911] Hardware name: HP ProLiant ML350p Gen8, BIOS P72 09/08/2013
>>>> [ 39.886917] Workqueue: events_unbound async_run_entry_fn
>>>> [ 39.886918] Call Trace:
>>>> [ 39.886923]  dump_stack+0x85/0xc9
>>>> [ 39.886927]  __warn+0xd1/0xf0
>>>> [ 39.886928]  warn_slowpath_null+0x1d/0x20
>>>> [ 39.886930]  __blk_mq_finish_request+0x124/0x140
>>>> [ 39.886932]  blk_mq_finish_request+0x55/0x60
>>>> [ 39.886934]  blk_mq_sched_put_request+0x78/0x80
>>>> [ 39.886936]  blk_mq_free_request+0xe/0x10
>>>> [ 39.886938]  blk_put_request+0x25/0x60
>>>> [ 39.886944]  __scsi_execute.isra.24+0x104/0x160
>>>> [ 39.886946]  scsi_execute_req_flags+0x94/0x100
>>>> [ 39.886948]  scsi_report_opcode+0xab/0x100
>>>>
>>>> checking ...
>>>>
>>> Ah.
>>> Seems like the elevator switch races with device setup:
>>>
>>> [ 1188.490326] [ cut here ]
>>> [ 1188.490334] WARNING: CPU: 9 PID: 30155 at block/blk-mq.c:342
>>> __blk_mq_finish_request+0x172/0x180
>>> [ 1188.490335] Modules linked in: mpt3sas(+) raid_class rpcsec_gss_krb5
>>> auth_rpcgss nfsv4 nfs lockd grace fscache ebtable_filter ebtables
>>> ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet
>>> br_netfilter bridge stp llc iscsi_ibft iscsi_boot_sysfs sb_edac
>>> edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm
>>> irqbypass crct10dif_pclmul crc32_pclmul tg3 ixgbe ghash_clmulni_intel
>>> pcbc ptp aesni_intel pps_core aes_x86_64 ipmi_ssif hpilo hpwdt mdio
>>> libphy pcc_cpufreq crypto_simd glue_helper iTCO_wdt iTCO_vendor_support
>>> acpi_cpufreq tpm_tis ipmi_si ipmi_devintf cryptd lpc_ich pcspkr ioatdma
>>> tpm_tis_core thermal wmi shpchp dca ipmi_msghandler tpm fjes button
>>> sunrpc btrfs xor sr_mod raid6_pq cdrom ehci_pci mgag200 i2c_algo_bit
>>> drm_kms_helper syscopyarea sysfillrect uhci_hcd
>>> [ 1188.490399] sysimgblt fb_sys_fops sd_mod ahci ehci_hcd ttm libahci
>>> crc32c_intel serio_raw drm libata usbcore usb_common hpsa
>>> scsi_transport_sas sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc
>>> scsi_dh_alua autofs4
>>> [ 1188.490411] CPU: 9 PID: 30155 Comm: kworker/u130:6 Not tainted
>>> 4.10.0-rc3+ #535
>>> [ 1188.490411] Hardware name: HP ProLiant ML350p Gen8, BIOS P72 09/08/2013
>>> [ 1188.490425] Workqueue: events_unbound async_run_entry_fn
>>> [ 1188.490427] Call Trace:
>>> [ 1188.490433]  dump_stack+0x85/0xc9
>>> [ 1188.490436]  __warn+0xd1/0xf0
>>> [ 1188.490438]  warn_slowpath_null+0x1d/0x20
>>> [ 1188.490440]  __blk_mq_finish_request+0x172/0x180
>>> [ 1188.490442]  blk_mq_finish_request+0x55/0x60
>>> [ 1188.490443]  blk_mq_sched_put_request+0x78/0x80
>>> [ 1188.490445]  blk_mq_free_request+0xe/0x10
>>> [ 1188.490448]  blk_put_request+0x25/0x60
>>> [ 1188.490453]  __scsi_execute.isra.24+0x104/0x160
>>> [ 1188.490455]  scsi_execute_req_flags+0x94/0x100
>>> [ 1188.490457]  scsi_report_opcode+0xab/0x100
>>> [ 1188.490461]  sd_revalidate_disk+0xaef/0x1450 [sd_mod]
>>> [ 1188.490464]  sd_probe_async+0xd1/0x1d0 [sd_mod]
>>> [ 1188.490466]  async_run_entry_fn+0x37/0x150
>>> [ 1188.490470]  process_one_work+0x1d0/0x660
>>> [ 1188.490472]  ? process_one_work+0x151/0x660
>>> [ 1188.490474]  worker_thread+0x12b/0x4a0
>>> [ 1188.490475]  kthread+0x10c/0x140
>>> [ 1188.490477]  ? proc
Re: [PATCHSET v6] blk-mq scheduling framework
On 01/13/2017 04:23 PM, Jens Axboe wrote:
> On 01/13/2017 04:04 AM, Hannes Reinecke wrote:
>> On 01/13/2017 09:15 AM, Hannes Reinecke wrote:
>>> On 01/11/2017 10:39 PM, Jens Axboe wrote:
>>>> Another year, another posting of this patchset. The previous posting
>>>> was here:
>>>>
>>>> https://www.spinics.net/lists/kernel/msg2406106.html
>>>>
>>>> (yes, I've skipped v5, it was fixes on top of v4, not the rework).
>>>>
>>>> I've reworked bits of this to get rid of the shadow requests, thanks
>>>> to Bart for the inspiration. The missing piece, for me, was the fact
>>>> that we have the tags->rqs[] indirection array already. I've done this
>>>> somewhat differently, though, by having the internal scheduler tag
>>>> map be allocated/torn down when an IO scheduler is attached or
>>>> detached. This also means that when we run without a scheduler, we
>>>> don't have to do double tag allocations, it'll work like before.
>>>>
>>>> The patchset applies on top of 4.10-rc3, or can be pulled here:
>>>>
>>>> git://git.kernel.dk/linux-block blk-mq-sched.6
>>>>
>>> Well ... something's wrong here on my machine:
>>>
>>> [ 39.886886] [ cut here ]
>>> [ 39.886895] WARNING: CPU: 9 PID: 62 at block/blk-mq.c:342
>>> __blk_mq_finish_request+0x124/0x140
>>> [ 39.886895] Modules linked in: sd_mod ahci uhci_hcd ehci_pci
>>> mpt3sas(+) libahci ehci_hcd serio_raw crc32c_intel raid_class drm libata
>>> usbcore hpsa usb_common scsi_transport_sas sg dm_multipath dm_mod
>>> scsi_dh_rdac scsi_dh_emc scsi_dh_alua autofs4
>>> [ 39.886910] CPU: 9 PID: 62 Comm: kworker/u130:0 Not tainted
>>> 4.10.0-rc3+ #528
>>> [ 39.886911] Hardware name: HP ProLiant ML350p Gen8, BIOS P72 09/08/2013
>>> [ 39.886917] Workqueue: events_unbound async_run_entry_fn
>>> [ 39.886918] Call Trace:
>>> [ 39.886923]  dump_stack+0x85/0xc9
>>> [ 39.886927]  __warn+0xd1/0xf0
>>> [ 39.886928]  warn_slowpath_null+0x1d/0x20
>>> [ 39.886930]  __blk_mq_finish_request+0x124/0x140
>>> [ 39.886932]  blk_mq_finish_request+0x55/0x60
>>> [ 39.886934]  blk_mq_sched_put_request+0x78/0x80
>>> [ 39.886936]  blk_mq_free_request+0xe/0x10
>>> [ 39.886938]  blk_put_request+0x25/0x60
>>> [ 39.886944]  __scsi_execute.isra.24+0x104/0x160
>>> [ 39.886946]  scsi_execute_req_flags+0x94/0x100
>>> [ 39.886948]  scsi_report_opcode+0xab/0x100
>>>
>>> checking ...
>>>
>> Ah.
>> Seems like the elevator switch races with device setup:
>>
>> [ 1188.490326] [ cut here ]
>> [ 1188.490334] WARNING: CPU: 9 PID: 30155 at block/blk-mq.c:342
>> __blk_mq_finish_request+0x172/0x180
>> [ 1188.490335] Modules linked in: mpt3sas(+) raid_class rpcsec_gss_krb5
>> auth_rpcgss nfsv4 nfs lockd grace fscache ebtable_filter ebtables
>> ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet
>> br_netfilter bridge stp llc iscsi_ibft iscsi_boot_sysfs sb_edac
>> edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm
>> irqbypass crct10dif_pclmul crc32_pclmul tg3 ixgbe ghash_clmulni_intel
>> pcbc ptp aesni_intel pps_core aes_x86_64 ipmi_ssif hpilo hpwdt mdio
>> libphy pcc_cpufreq crypto_simd glue_helper iTCO_wdt iTCO_vendor_support
>> acpi_cpufreq tpm_tis ipmi_si ipmi_devintf cryptd lpc_ich pcspkr ioatdma
>> tpm_tis_core thermal wmi shpchp dca ipmi_msghandler tpm fjes button
>> sunrpc btrfs xor sr_mod raid6_pq cdrom ehci_pci mgag200 i2c_algo_bit
>> drm_kms_helper syscopyarea sysfillrect uhci_hcd
>> [ 1188.490399] sysimgblt fb_sys_fops sd_mod ahci ehci_hcd ttm libahci
>> crc32c_intel serio_raw drm libata usbcore usb_common hpsa
>> scsi_transport_sas sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc
>> scsi_dh_alua autofs4
>> [ 1188.490411] CPU: 9 PID: 30155 Comm: kworker/u130:6 Not tainted
>> 4.10.0-rc3+ #535
>> [ 1188.490411] Hardware name: HP ProLiant ML350p Gen8, BIOS P72 09/08/2013
>> [ 1188.490425] Workqueue: events_unbound async_run_entry_fn
>> [ 1188.490427] Call Trace:
>> [ 1188.490433]  dump_stack+0x85/0xc9
>> [ 1188.490436]  __warn+0xd1/0xf0
>> [ 1188.490438]  warn_slowpath_null+0x1d/0x20
>> [ 1188.490440]  __blk_mq_finish_request+0x172/0x180
>> [ 1188.490442]  blk_mq_finish_request+0x55/0x60
>> [ 1188.490443]  blk_mq_sched_put_request+0x78/0x80
>> [ 1188.490445]  blk_mq_free_request+0xe/0x10
>> [ 1188.490448]  blk_put_request+0x25/0x60
>> [ 1188.490453]  __scsi_execute.isra.24+0x104/0x160
>> [ 1188.490455]  scsi_execute_req_flags+0x94/0x100
>> [ 1188.490457]  scsi_report_opcode+0xab/0x100
>> [ 1188.490461]  sd_revalidate_disk+0xaef/0x1450 [sd_mod]
>> [ 1188.490464]  sd_probe_async+0xd1/0x1d0 [sd_mod]
>> [ 1188.490466]  async_run_entry_fn+0x37/0x150
>> [ 1188.490470]  process_one_work+0x1d0/0x660
>> [ 1188.490472]  ? process_one_work+0x151/0x660
>> [ 1188.490474]  worker_thread+0x12b/0x4a0
>> [ 1188.490475]  kthread+0x10c/0x140
>> [ 1188.490477]  ? process_one_work+0x660/0x660
>> [ 1188.490478]  ? kthread_create_on_node+0x40/0x40
>> [ 1188.490483]  ret_from_fork+0x2a/0x40
>> [ 1188.490484] ---[ e
Re: [PATCHSET v6] blk-mq scheduling framework
On 01/13/2017 04:04 AM, Hannes Reinecke wrote: > On 01/13/2017 09:15 AM, Hannes Reinecke wrote: >> On 01/11/2017 10:39 PM, Jens Axboe wrote: >>> Another year, another posting of this patchset. The previous posting >>> was here: >>> >>> https://www.spinics.net/lists/kernel/msg2406106.html >>> >>> (yes, I've skipped v5, it was fixes on top of v4, not the rework). >>> >>> I've reworked bits of this to get rid of the shadow requests, thanks >>> to Bart for the inspiration. The missing piece, for me, was the fact >>> that we have the tags->rqs[] indirection array already. I've done this >>> somewhat differently, though, by having the internal scheduler tag >>> map be allocated/torn down when an IO scheduler is attached or >>> detached. This also means that when we run without a scheduler, we >>> don't have to do double tag allocations, it'll work like before. >>> >>> The patchset applies on top of 4.10-rc3, or can be pulled here: >>> >>> git://git.kernel.dk/linux-block blk-mq-sched.6 >>> >> Well ... 
something's wrong here on my machine: >> >> [ 39.886886] [ cut here ] >> [ 39.886895] WARNING: CPU: 9 PID: 62 at block/blk-mq.c:342 >> __blk_mq_finish_request+0x124/0x140 >> [ 39.886895] Modules linked in: sd_mod ahci uhci_hcd ehci_pci >> mpt3sas(+) libahci ehci_hcd serio_raw crc32c_intel raid_class drm libata >> usbcore hpsa usb_common scsi_transport_sas sg dm_multipath dm_mod >> scsi_dh_rdac scsi_dh_emc scsi_dh_alua autofs4 >> [ 39.886910] CPU: 9 PID: 62 Comm: kworker/u130:0 Not tainted >> 4.10.0-rc3+ #528 >> [ 39.886911] Hardware name: HP ProLiant ML350p Gen8, BIOS P72 09/08/2013 >> [ 39.886917] Workqueue: events_unbound async_run_entry_fn >> [ 39.886918] Call Trace: >> [ 39.886923] dump_stack+0x85/0xc9 >> [ 39.886927] __warn+0xd1/0xf0 >> [ 39.886928] warn_slowpath_null+0x1d/0x20 >> [ 39.886930] __blk_mq_finish_request+0x124/0x140 >> [ 39.886932] blk_mq_finish_request+0x55/0x60 >> [ 39.886934] blk_mq_sched_put_request+0x78/0x80 >> [ 39.886936] blk_mq_free_request+0xe/0x10 >> [ 39.886938] blk_put_request+0x25/0x60 >> [ 39.886944] __scsi_execute.isra.24+0x104/0x160 >> [ 39.886946] scsi_execute_req_flags+0x94/0x100 >> [ 39.886948] scsi_report_opcode+0xab/0x100 >> >> checking ... >> > Ah. 
> Seems like the elevator switch races with device setup: > > 1188.490326] [ cut here ] > [ 1188.490334] WARNING: CPU: 9 PID: 30155 at block/blk-mq.c:342 > __blk_mq_finish_request+0x172/0x180 > [ 1188.490335] Modules linked in: mpt3sas(+) raid_class rpcsec_gss_krb5 > auth_rpcgss nfsv4 nfs lockd grace fscache ebtable_filt > er ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables > af_packet br_netfilter bridge stp llc iscsi_ibft iscs > i_boot_sysfs sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp > coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_p > clmul tg3 ixgbe ghash_clmulni_intel pcbc ptp aesni_intel pps_core > aes_x86_64 ipmi_ssif hpilo hpwdt mdio libphy pcc_cpufreq cry > pto_simd glue_helper iTCO_wdt iTCO_vendor_support acpi_cpufreq tpm_tis > ipmi_si ipmi_devintf cryptd lpc_ich pcspkr ioatdma tpm_ > tis_core thermal wmi shpchp dca ipmi_msghandler tpm fjes button sunrpc > btrfs xor sr_mod raid6_pq cdrom ehci_pci mgag200 i2c_al > go_bit drm_kms_helper syscopyarea sysfillrect uhci_hcd > [ 1188.490399] sysimgblt fb_sys_fops sd_mod ahci ehci_hcd ttm libahci > crc32c_intel serio_raw drm libata usbcore usb_common hp > sa scsi_transport_sas sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc > scsi_dh_alua autofs4 > [ 1188.490411] CPU: 9 PID: 30155 Comm: kworker/u130:6 Not tainted > 4.10.0-rc3+ #535 > [ 1188.490411] Hardware name: HP ProLiant ML350p Gen8, BIOS P72 09/08/2013 > [ 1188.490425] Workqueue: events_unbound async_run_entry_fn > [ 1188.490427] Call Trace: > [ 1188.490433] dump_stack+0x85/0xc9 > [ 1188.490436] __warn+0xd1/0xf0 > [ 1188.490438] warn_slowpath_null+0x1d/0x20 > [ 1188.490440] __blk_mq_finish_request+0x172/0x180 > [ 1188.490442] blk_mq_finish_request+0x55/0x60 > [ 1188.490443] blk_mq_sched_put_request+0x78/0x80 > [ 1188.490445] blk_mq_free_request+0xe/0x10 > [ 1188.490448] blk_put_request+0x25/0x60 > [ 1188.490453] __scsi_execute.isra.24+0x104/0x160 > [ 1188.490455] scsi_execute_req_flags+0x94/0x100 > [ 1188.490457] 
scsi_report_opcode+0xab/0x100 > [ 1188.490461] sd_revalidate_disk+0xaef/0x1450 [sd_mod] > [ 1188.490464] sd_probe_async+0xd1/0x1d0 [sd_mod] > [ 1188.490466] async_run_entry_fn+0x37/0x150 > [ 1188.490470] process_one_work+0x1d0/0x660 > [ 1188.490472] ? process_one_work+0x151/0x660 > [ 1188.490474] worker_thread+0x12b/0x4a0 > [ 1188.490475] kthread+0x10c/0x140 > [ 1188.490477] ? process_one_work+0x660/0x660 > [ 1188.490478] ? kthread_create_on_node+0x40/0x40 > [ 1188.490483] ret_from_fork+0x2a/0x40 > [ 1188.490484] ---[ end trace d5e3a32ac269fc2a ]--- > [ 1188.490485] rq (487/52) rqs (-1/-1) > [ 1188.523518] sd 7:0:0:0: [sdb] Attached SCSI disk > [ 1188.540954]
Re: [PATCHSET v6] blk-mq scheduling framework
On Fri, Jan 13 2017, Hannes Reinecke wrote: > On 01/13/2017 12:04 PM, Hannes Reinecke wrote: > > On 01/13/2017 09:15 AM, Hannes Reinecke wrote: > >> On 01/11/2017 10:39 PM, Jens Axboe wrote: > >>> Another year, another posting of this patchset. The previous posting > >>> was here: > >>> > >>> https://www.spinics.net/lists/kernel/msg2406106.html > >>> > >>> (yes, I've skipped v5, it was fixes on top of v4, not the rework). > >>> > >>> I've reworked bits of this to get rid of the shadow requests, thanks > >>> to Bart for the inspiration. The missing piece, for me, was the fact > >>> that we have the tags->rqs[] indirection array already. I've done this > >>> somewhat differently, though, by having the internal scheduler tag > >>> map be allocated/torn down when an IO scheduler is attached or > >>> detached. This also means that when we run without a scheduler, we > >>> don't have to do double tag allocations, it'll work like before. > >>> > >>> The patchset applies on top of 4.10-rc3, or can be pulled here: > >>> > >>> git://git.kernel.dk/linux-block blk-mq-sched.6 > >>> > >> Well ... something's wrong here on my machine: > >> > [ .. ] > > Turns out that selecting CONFIG_DEFAULT_MQ_DEADLINE is the culprit; > switching to CONFIG_DEFAULT_MQ_NONE and selecting mq-deadline after > booting manually makes the problem go away. > > So there is a race condition during device init and switching the I/O > scheduler. 
> > But the results from using mq-deadline are promising; the performance
> > drop I've seen on older hardware seems to be resolved:
> >
> > mq iosched:
> >   seq read  : io=13383MB, bw=228349KB/s, iops=57087
> >   rand read : io=12876MB, bw=219709KB/s, iops=54927
> >   seq write : io=14532MB, bw=247987KB/s, iops=61996
> >   rand write: io=13779MB, bw=235127KB/s, iops=58781
> > mq default:
> >   seq read  : io=13056MB, bw=222588KB/s, iops=55647
> >   rand read : io=12908MB, bw=220069KB/s, iops=55017
> >   seq write : io=13986MB, bw=238444KB/s, iops=59611
> >   rand write: io=13733MB, bw=234128KB/s, iops=58532
> > sq default:
> >   seq read  : io=10240MB, bw=194787KB/s, iops=48696
> >   rand read : io=10240MB, bw=191374KB/s, iops=47843
> >   seq write : io=10240MB, bw=245333KB/s, iops=61333
> >   rand write: io=10240MB, bw=228239KB/s, iops=57059
> >
> > measured on mpt2sas with SSD devices.

Perfect! Straight on the path of killing off non-scsi-mq, then. I'll fix up the async scan issue. The new mq schedulers don't really behave differently in this regard, so I'm a bit puzzled. Hopefully it reproduces here.

-- 
Jens Axboe
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHSET v6] blk-mq scheduling framework
On Fri, Jan 13 2017, Hannes Reinecke wrote: > On 01/13/2017 09:15 AM, Hannes Reinecke wrote: > > On 01/11/2017 10:39 PM, Jens Axboe wrote: > >> Another year, another posting of this patchset. The previous posting > >> was here: > >> > >> https://www.spinics.net/lists/kernel/msg2406106.html > >> > >> (yes, I've skipped v5, it was fixes on top of v4, not the rework). > >> > >> I've reworked bits of this to get rid of the shadow requests, thanks > >> to Bart for the inspiration. The missing piece, for me, was the fact > >> that we have the tags->rqs[] indirection array already. I've done this > >> somewhat differently, though, by having the internal scheduler tag > >> map be allocated/torn down when an IO scheduler is attached or > >> detached. This also means that when we run without a scheduler, we > >> don't have to do double tag allocations, it'll work like before. > >> > >> The patchset applies on top of 4.10-rc3, or can be pulled here: > >> > >> git://git.kernel.dk/linux-block blk-mq-sched.6 > >> > > Well ... 
something's wrong here on my machine: > > > > [ 39.886886] [ cut here ] > > [ 39.886895] WARNING: CPU: 9 PID: 62 at block/blk-mq.c:342 > > __blk_mq_finish_request+0x124/0x140 > > [ 39.886895] Modules linked in: sd_mod ahci uhci_hcd ehci_pci > > mpt3sas(+) libahci ehci_hcd serio_raw crc32c_intel raid_class drm libata > > usbcore hpsa usb_common scsi_transport_sas sg dm_multipath dm_mod > > scsi_dh_rdac scsi_dh_emc scsi_dh_alua autofs4 > > [ 39.886910] CPU: 9 PID: 62 Comm: kworker/u130:0 Not tainted > > 4.10.0-rc3+ #528 > > [ 39.886911] Hardware name: HP ProLiant ML350p Gen8, BIOS P72 09/08/2013 > > [ 39.886917] Workqueue: events_unbound async_run_entry_fn > > [ 39.886918] Call Trace: > > [ 39.886923] dump_stack+0x85/0xc9 > > [ 39.886927] __warn+0xd1/0xf0 > > [ 39.886928] warn_slowpath_null+0x1d/0x20 > > [ 39.886930] __blk_mq_finish_request+0x124/0x140 > > [ 39.886932] blk_mq_finish_request+0x55/0x60 > > [ 39.886934] blk_mq_sched_put_request+0x78/0x80 > > [ 39.886936] blk_mq_free_request+0xe/0x10 > > [ 39.886938] blk_put_request+0x25/0x60 > > [ 39.886944] __scsi_execute.isra.24+0x104/0x160 > > [ 39.886946] scsi_execute_req_flags+0x94/0x100 > > [ 39.886948] scsi_report_opcode+0xab/0x100 > > > > checking ... > > > Ah. > Seems like the elevator switch races with device setup: Huh, funky, haven't seen that. I'll see if I can reproduce it here. I don't have SCAN_ASYNC turned on, on my test box.

-- 
Jens Axboe
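(For anyone following along: whether SCSI probing runs asynchronously can be inspected at runtime. A small sketch below; the sysfs path is the standard module-parameter location, and the mode comes from CONFIG_SCSI_SCAN_ASYNC or can be overridden with scsi_mod.scan= on the kernel command line.)

```shell
# Report the SCSI scan policy; falls back to "unknown" where scsi_mod
# or sysfs is not available (e.g. inside a container).
scan_mode=$(cat /sys/module/scsi_mod/parameters/scan 2>/dev/null || echo unknown)
echo "SCSI scan mode: $scan_mode"
# To force synchronous scanning at boot, add scsi_mod.scan=sync to the
# kernel command line.
```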
Re: [PATCHSET v6] blk-mq scheduling framework
On 01/13/2017 12:04 PM, Hannes Reinecke wrote: > On 01/13/2017 09:15 AM, Hannes Reinecke wrote: >> On 01/11/2017 10:39 PM, Jens Axboe wrote: >>> Another year, another posting of this patchset. The previous posting >>> was here: >>> >>> https://www.spinics.net/lists/kernel/msg2406106.html >>> >>> (yes, I've skipped v5, it was fixes on top of v4, not the rework). >>> >>> I've reworked bits of this to get rid of the shadow requests, thanks >>> to Bart for the inspiration. The missing piece, for me, was the fact >>> that we have the tags->rqs[] indirection array already. I've done this >>> somewhat differently, though, by having the internal scheduler tag >>> map be allocated/torn down when an IO scheduler is attached or >>> detached. This also means that when we run without a scheduler, we >>> don't have to do double tag allocations, it'll work like before. >>> >>> The patchset applies on top of 4.10-rc3, or can be pulled here: >>> >>> git://git.kernel.dk/linux-block blk-mq-sched.6 >>> >> Well ... something's wrong here on my machine: >> [ .. ] Turns out that selecting CONFIG_DEFAULT_MQ_DEADLINE is the culprit; switching to CONFIG_DEFAULT_MQ_NONE and selecting mq-deadline after booting manually makes the problem go away. So there is a race condition during device init and switching the I/O scheduler. 
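(The manual switch described above — boot with CONFIG_DEFAULT_MQ_NONE, then attach mq-deadline once probing has settled — amounts to a sysfs write. A hedged sketch; the device name is only an example, and a real switch needs root on an actual blk-mq device:)

```shell
# Pick the elevator at runtime instead of via CONFIG_DEFAULT_MQ_*.
# "sda" is just an example device name.
dev=sda
knob=/sys/block/$dev/queue/scheduler
if [ -w "$knob" ]; then
    cat "$knob"                 # e.g. "[none] mq-deadline"
    echo mq-deadline > "$knob"
    cat "$knob"                 # e.g. "none [mq-deadline]"
else
    echo "scheduler knob for $dev not writable (need root and a real device)"
fi
```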
But the results from using mq-deadline are promising; the performance drop I've seen on older hardware seems to be resolved:

mq iosched:
  seq read  : io=13383MB, bw=228349KB/s, iops=57087
  rand read : io=12876MB, bw=219709KB/s, iops=54927
  seq write : io=14532MB, bw=247987KB/s, iops=61996
  rand write: io=13779MB, bw=235127KB/s, iops=58781
mq default:
  seq read  : io=13056MB, bw=222588KB/s, iops=55647
  rand read : io=12908MB, bw=220069KB/s, iops=55017
  seq write : io=13986MB, bw=238444KB/s, iops=59611
  rand write: io=13733MB, bw=234128KB/s, iops=58532
sq default:
  seq read  : io=10240MB, bw=194787KB/s, iops=48696
  rand read : io=10240MB, bw=191374KB/s, iops=47843
  seq write : io=10240MB, bw=245333KB/s, iops=61333
  rand write: io=10240MB, bw=228239KB/s, iops=57059

measured on mpt2sas with SSD devices.

Cheers,
Hannes
--
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de                   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
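(Jens asks for the exact fio job later in the thread; it was not posted here. The sketch below is only a plausible shape for a job covering the four cases above — the device, IO engine, queue depth, and sizes are all guesses, not Hannes's actual settings.)

```shell
# Write an illustrative fio job file covering seq/rand read/write.
# The invocation itself is left commented out: it needs root and a
# dedicated scratch device whose contents will be destroyed.
cat > ssd-test.fio <<'EOF'
[global]
; assumed scratch device -- will be overwritten!
filename=/dev/sdb
direct=1
ioengine=libaio
iodepth=32
bs=4k
size=10g

[seq-read]
rw=read
stonewall

[rand-read]
rw=randread
stonewall

[seq-write]
rw=write
stonewall

[rand-write]
rw=randwrite
stonewall
EOF
# fio ssd-test.fio
```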
Re: [PATCHSET v6] blk-mq scheduling framework
On 01/13/2017 09:15 AM, Hannes Reinecke wrote: > On 01/11/2017 10:39 PM, Jens Axboe wrote: >> Another year, another posting of this patchset. The previous posting >> was here: >> >> https://www.spinics.net/lists/kernel/msg2406106.html >> >> (yes, I've skipped v5, it was fixes on top of v4, not the rework). >> >> I've reworked bits of this to get rid of the shadow requests, thanks >> to Bart for the inspiration. The missing piece, for me, was the fact >> that we have the tags->rqs[] indirection array already. I've done this >> somewhat differently, though, by having the internal scheduler tag >> map be allocated/torn down when an IO scheduler is attached or >> detached. This also means that when we run without a scheduler, we >> don't have to do double tag allocations, it'll work like before. >> >> The patchset applies on top of 4.10-rc3, or can be pulled here: >> >> git://git.kernel.dk/linux-block blk-mq-sched.6 >> > Well ... something's wrong here on my machine: > > [ 39.886886] [ cut here ] > [ 39.886895] WARNING: CPU: 9 PID: 62 at block/blk-mq.c:342 > __blk_mq_finish_request+0x124/0x140 > [ 39.886895] Modules linked in: sd_mod ahci uhci_hcd ehci_pci > mpt3sas(+) libahci ehci_hcd serio_raw crc32c_intel raid_class drm libata > usbcore hpsa usb_common scsi_transport_sas sg dm_multipath dm_mod > scsi_dh_rdac scsi_dh_emc scsi_dh_alua autofs4 > [ 39.886910] CPU: 9 PID: 62 Comm: kworker/u130:0 Not tainted > 4.10.0-rc3+ #528 > [ 39.886911] Hardware name: HP ProLiant ML350p Gen8, BIOS P72 09/08/2013 > [ 39.886917] Workqueue: events_unbound async_run_entry_fn > [ 39.886918] Call Trace: > [ 39.886923] dump_stack+0x85/0xc9 > [ 39.886927] __warn+0xd1/0xf0 > [ 39.886928] warn_slowpath_null+0x1d/0x20 > [ 39.886930] __blk_mq_finish_request+0x124/0x140 > [ 39.886932] blk_mq_finish_request+0x55/0x60 > [ 39.886934] blk_mq_sched_put_request+0x78/0x80 > [ 39.886936] blk_mq_free_request+0xe/0x10 > [ 39.886938] blk_put_request+0x25/0x60 > [ 39.886944] 
__scsi_execute.isra.24+0x104/0x160 > [ 39.886946] scsi_execute_req_flags+0x94/0x100 > [ 39.886948] scsi_report_opcode+0xab/0x100 > > checking ... > Ah. Seems like the elevator switch races with device setup: 1188.490326] [ cut here ] [ 1188.490334] WARNING: CPU: 9 PID: 30155 at block/blk-mq.c:342 __blk_mq_finish_request+0x172/0x180 [ 1188.490335] Modules linked in: mpt3sas(+) raid_class rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache ebtable_filt er ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet br_netfilter bridge stp llc iscsi_ibft iscs i_boot_sysfs sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_p clmul tg3 ixgbe ghash_clmulni_intel pcbc ptp aesni_intel pps_core aes_x86_64 ipmi_ssif hpilo hpwdt mdio libphy pcc_cpufreq cry pto_simd glue_helper iTCO_wdt iTCO_vendor_support acpi_cpufreq tpm_tis ipmi_si ipmi_devintf cryptd lpc_ich pcspkr ioatdma tpm_ tis_core thermal wmi shpchp dca ipmi_msghandler tpm fjes button sunrpc btrfs xor sr_mod raid6_pq cdrom ehci_pci mgag200 i2c_al go_bit drm_kms_helper syscopyarea sysfillrect uhci_hcd [ 1188.490399] sysimgblt fb_sys_fops sd_mod ahci ehci_hcd ttm libahci crc32c_intel serio_raw drm libata usbcore usb_common hp sa scsi_transport_sas sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua autofs4 [ 1188.490411] CPU: 9 PID: 30155 Comm: kworker/u130:6 Not tainted 4.10.0-rc3+ #535 [ 1188.490411] Hardware name: HP ProLiant ML350p Gen8, BIOS P72 09/08/2013 [ 1188.490425] Workqueue: events_unbound async_run_entry_fn [ 1188.490427] Call Trace: [ 1188.490433] dump_stack+0x85/0xc9 [ 1188.490436] __warn+0xd1/0xf0 [ 1188.490438] warn_slowpath_null+0x1d/0x20 [ 1188.490440] __blk_mq_finish_request+0x172/0x180 [ 1188.490442] blk_mq_finish_request+0x55/0x60 [ 1188.490443] blk_mq_sched_put_request+0x78/0x80 [ 1188.490445] blk_mq_free_request+0xe/0x10 [ 1188.490448] blk_put_request+0x25/0x60 [ 1188.490453] 
__scsi_execute.isra.24+0x104/0x160 [ 1188.490455] scsi_execute_req_flags+0x94/0x100 [ 1188.490457] scsi_report_opcode+0xab/0x100 [ 1188.490461] sd_revalidate_disk+0xaef/0x1450 [sd_mod] [ 1188.490464] sd_probe_async+0xd1/0x1d0 [sd_mod] [ 1188.490466] async_run_entry_fn+0x37/0x150 [ 1188.490470] process_one_work+0x1d0/0x660 [ 1188.490472] ? process_one_work+0x151/0x660 [ 1188.490474] worker_thread+0x12b/0x4a0 [ 1188.490475] kthread+0x10c/0x140 [ 1188.490477] ? process_one_work+0x660/0x660 [ 1188.490478] ? kthread_create_on_node+0x40/0x40 [ 1188.490483] ret_from_fork+0x2a/0x40 [ 1188.490484] ---[ end trace d5e3a32ac269fc2a ]--- [ 1188.490485] rq (487/52) rqs (-1/-1) [ 1188.523518] sd 7:0:0:0: [sdb] Attached SCSI disk [ 1188.540954] elevator: switch to deadline failed (The 'rqs' line is a debug output from me: struct request *rqs_rq = hctx->tags->rqs[rq->tag];)
Re: [PATCHSET v6] blk-mq scheduling framework
On 01/11/2017 10:39 PM, Jens Axboe wrote: > Another year, another posting of this patchset. The previous posting > was here: > > https://www.spinics.net/lists/kernel/msg2406106.html > > (yes, I've skipped v5, it was fixes on top of v4, not the rework). > > I've reworked bits of this to get rid of the shadow requests, thanks > to Bart for the inspiration. The missing piece, for me, was the fact > that we have the tags->rqs[] indirection array already. I've done this > somewhat differently, though, by having the internal scheduler tag > map be allocated/torn down when an IO scheduler is attached or > detached. This also means that when we run without a scheduler, we > don't have to do double tag allocations, it'll work like before. > > The patchset applies on top of 4.10-rc3, or can be pulled here: > > git://git.kernel.dk/linux-block blk-mq-sched.6 > Fun continues: [ 28.976708] ata3.00: configured for UDMA/100 [ 28.987625] BUG: unable to handle kernel NULL pointer dereference at 0048 [ 28.987632] IP: deadline_add_request+0x15/0x70 [ 28.987633] PGD 0 [ 28.987634] [ 28.987636] Oops: [#1] SMP [ 28.987638] Modules linked in: ahci libahci libata uhci_hcd(+) mgag200(+) i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm tg3 libphy ehci_pci ehci_hcd usbcore usb_common ixgbe mdio hpsa(+) dca ptp pps_core scsi_transport_sas fjes(+) sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua autofs4 [ 28.987654] CPU: 0 PID: 268 Comm: kworker/u2:2 Not tainted 4.10.0-rc3+ #535 [ 28.987655] Hardware name: HP ProLiant ML350p Gen8, BIOS P72 09/08/2013 [ 28.987660] Workqueue: events_unbound async_run_entry_fn [ 28.987661] task: 880029391600 task.stack: c938c000 [ 28.987663] RIP: 0010:deadline_add_request+0x15/0x70 [ 28.987664] RSP: 0018:c938fb00 EFLAGS: 00010286 [ 28.987665] RAX: 88003260c400 RBX: RCX: [ 28.987666] RDX: c938fb68 RSI: RDI: 8800293b9040 [ 28.987666] RBP: c938fb18 R08: 0087668] R13: 88003260c400 R14: R15: [ 28.987670] FS: () GS:880035c0() 
knlGS: [ 28.987670] CS: 0010 DS: ES: CR0: 80050033 [ 28.987671] CR2: 0048 CR3: 32b64000 CR4: 000406f0 [ 28.987672] Call Trace: [ 28.987677] blk_mq_sched_get_request+0x12e/0x310 [ 28.987678] ? blk_mq_sched_get_request+0x5/0x310 [ 28.987681] blk_mq_alloc_request+0x40/0x90 [ 28.987684] blk_get_request+0x35/0x110 [ 28.987689] __scsi_execute.isra.24+0x3c/0x160 [ 28.987691] scsi_execute_req_flags+0x94/0x100 [ 28.987694] scsi_probe_and_add_lun+0x207/0xd60 [ 28.987699] ? __pm_runtime_resume+0x5c/0x80 [ 28.987701] __scsi_add_device+0x103/0x120 [ 28.987709] ata_scsi_scan_host+0xa3/0x1d0 [libata] [ 28.987716] async_port_probe+0x43/0x60 [libata] [ 28.987718] async_run_entry_fn+0x37/0x150 [ 28.987722] process_one_work+0x1d0/0x660

Cheers,
Hannes
--
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de                   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
Re: [PATCHSET v6] blk-mq scheduling framework
On 01/11/2017 10:39 PM, Jens Axboe wrote: > Another year, another posting of this patchset. The previous posting > was here: > > https://www.spinics.net/lists/kernel/msg2406106.html > > (yes, I've skipped v5, it was fixes on top of v4, not the rework). > > I've reworked bits of this to get rid of the shadow requests, thanks > to Bart for the inspiration. The missing piece, for me, was the fact > that we have the tags->rqs[] indirection array already. I've done this > somewhat differently, though, by having the internal scheduler tag > map be allocated/torn down when an IO scheduler is attached or > detached. This also means that when we run without a scheduler, we > don't have to do double tag allocations, it'll work like before. > > The patchset applies on top of 4.10-rc3, or can be pulled here: > > git://git.kernel.dk/linux-block blk-mq-sched.6 > Well ... something's wrong here on my machine: [ 39.886886] [ cut here ] [ 39.886895] WARNING: CPU: 9 PID: 62 at block/blk-mq.c:342 __blk_mq_finish_request+0x124/0x140 [ 39.886895] Modules linked in: sd_mod ahci uhci_hcd ehci_pci mpt3sas(+) libahci ehci_hcd serio_raw crc32c_intel raid_class drm libata usbcore hpsa usb_common scsi_transport_sas sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua autofs4 [ 39.886910] CPU: 9 PID: 62 Comm: kworker/u130:0 Not tainted 4.10.0-rc3+ #528 [ 39.886911] Hardware name: HP ProLiant ML350p Gen8, BIOS P72 09/08/2013 [ 39.886917] Workqueue: events_unbound async_run_entry_fn [ 39.886918] Call Trace: [ 39.886923] dump_stack+0x85/0xc9 [ 39.886927] __warn+0xd1/0xf0 [ 39.886928] warn_slowpath_null+0x1d/0x20 [ 39.886930] __blk_mq_finish_request+0x124/0x140 [ 39.886932] blk_mq_finish_request+0x55/0x60 [ 39.886934] blk_mq_sched_put_request+0x78/0x80 [ 39.886936] blk_mq_free_request+0xe/0x10 [ 39.886938] blk_put_request+0x25/0x60 [ 39.886944] __scsi_execute.isra.24+0x104/0x160 [ 39.886946] scsi_execute_req_flags+0x94/0x100 [ 39.886948] scsi_report_opcode+0xab/0x100 checking ... 
Cheers,
Hannes
--
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de                   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
Re: [PATCHSET v6] blk-mq scheduling framework
On Wed, 2017-01-11 at 14:39 -0700, Jens Axboe wrote: > I've reworked bits of this to get rid of the shadow requests, thanks > to Bart for the inspiration. The missing piece, for me, was the fact > that we have the tags->rqs[] indirection array already. I've done this > somewhat differently, though, by having the internal scheduler tag > map be allocated/torn down when an IO scheduler is attached or > detached. This also means that when we run without a scheduler, we > don't have to do double tag allocations, it'll work like before. Hello Jens, Thanks for having done the rework! This series looks great to me. I have a few small comments though. I will post these as replies to the individual patches. Bart.
[PATCHSET v6] blk-mq scheduling framework
Another year, another posting of this patchset. The previous posting was here:

https://www.spinics.net/lists/kernel/msg2406106.html

(yes, I've skipped v5, it was fixes on top of v4, not the rework).

I've reworked bits of this to get rid of the shadow requests, thanks to Bart for the inspiration. The missing piece, for me, was the fact that we have the tags->rqs[] indirection array already. I've done this somewhat differently, though, by having the internal scheduler tag map be allocated/torn down when an IO scheduler is attached or detached. This also means that when we run without a scheduler, we don't have to do double tag allocations, it'll work like before.

The patchset applies on top of 4.10-rc3, or can be pulled here:

git://git.kernel.dk/linux-block blk-mq-sched.6

 block/Kconfig.iosched    |   50
 block/Makefile           |    3
 block/blk-core.c         |   19 -
 block/blk-exec.c         |    3
 block/blk-flush.c        |   15 -
 block/blk-ioc.c          |   12
 block/blk-merge.c        |    4
 block/blk-mq-sched.c     |  354 +
 block/blk-mq-sched.h     |  157
 block/blk-mq-sysfs.c     |   13 +
 block/blk-mq-tag.c       |   58 ++--
 block/blk-mq-tag.h       |    4
 block/blk-mq.c           |  413 +++---
 block/blk-mq.h           |   40 +++
 block/blk-tag.c          |    1
 block/blk.h              |   26 +-
 block/cfq-iosched.c      |    2
 block/deadline-iosched.c |    2
 block/elevator.c         |  247 +++-
 block/mq-deadline.c      |  569 +++
 block/noop-iosched.c     |    2
 drivers/nvme/host/pci.c  |    1
 include/linux/blk-mq.h   |    9
 include/linux/blkdev.h   |    6
 include/linux/elevator.h |   36 ++
 25 files changed, 1732 insertions(+), 314 deletions(-)

-- 
Jens Axboe