Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-13 Thread Jens Axboe
On 06/13/2014 01:22 PM, Keith Busch wrote: > One performance oddity we observe is that servicing the interrupt on the > thread sibling of the core that submitted the I/O is the worst performing > cpu you can choose; it's actually better to use a different core on the > same node. At least that's

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-13 Thread Jens Axboe
On 2014-06-13 13:29, Jens Axboe wrote: On 06/13/2014 01:22 PM, Keith Busch wrote: On Fri, 13 Jun 2014, Jens Axboe wrote: OK, same setup as mine. The affinity hint is really screwing us over, no question about it. We just need a: irq_set_affinity_hint(dev->entry[nvmeq->cq_vector].vector,

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-13 Thread Jens Axboe
On 06/13/2014 01:22 PM, Keith Busch wrote: > On Fri, 13 Jun 2014, Jens Axboe wrote: >> OK, same setup as mine. The affinity hint is really screwing us over, no >> question about it. We just need a: >> >> irq_set_affinity_hint(dev->entry[nvmeq->cq_vector].vector, >> hctx->cpumask); >> >> in the

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-13 Thread Keith Busch
On Fri, 13 Jun 2014, Jens Axboe wrote: OK, same setup as mine. The affinity hint is really screwing us over, no question about it. We just need a: irq_set_affinity_hint(dev->entry[nvmeq->cq_vector].vector, hctx->cpumask); in the ->init_hctx() methods to fix that up. That brings us to roughly
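
A minimal sketch of where the suggested call would sit, written as a userspace model so that it compiles outside the kernel. The struct definitions and the irq_set_affinity_hint() stub are stand-ins, not the kernel's definitions, and nvme_init_hctx_sketch() and demo_hinted_irq() are hypothetical names. Only the call expression itself comes from the message above.

```c
#include <stddef.h>

/* Userspace stand-ins for kernel types; NOT the real kernel definitions. */
struct cpumask { unsigned long bits; };
struct msix_entry { unsigned int vector; };
struct nvme_queue { unsigned int cq_vector; };
struct nvme_dev { struct msix_entry *entry; };
struct blk_mq_hw_ctx { struct cpumask *cpumask; void *driver_data; };

/* Stub that only records its arguments. The kernel function of the
 * same name registers an affinity hint for the given IRQ. */
static unsigned int hinted_irq;
static const struct cpumask *hinted_mask;

static int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
{
	hinted_irq = irq;
	hinted_mask = m;
	return 0;
}

/* Hypothetical ->init_hctx() body: bind the hardware context to its
 * NVMe queue, then hint the queue's completion vector toward the CPUs
 * that blk-mq mapped to this hardware context. */
static int nvme_init_hctx_sketch(struct blk_mq_hw_ctx *hctx,
				 struct nvme_dev *dev,
				 struct nvme_queue *nvmeq)
{
	hctx->driver_data = nvmeq;
	return irq_set_affinity_hint(dev->entry[nvmeq->cq_vector].vector,
				     hctx->cpumask);
}

/* Small driver for the sketch: returns the IRQ number that was hinted,
 * or 0 if the recorded mask is not the one owned by the hctx. */
static unsigned int demo_hinted_irq(void)
{
	struct msix_entry entries[2] = { { .vector = 40 }, { .vector = 42 } };
	struct nvme_dev dev = { .entry = entries };
	struct nvme_queue q = { .cq_vector = 1 };
	struct cpumask mask = { .bits = 0x3 };
	struct blk_mq_hw_ctx hctx = { .cpumask = &mask, .driver_data = NULL };

	nvme_init_hctx_sketch(&hctx, &dev, &q);
	return hinted_mask == &mask ? hinted_irq : 0;
}
```

The point of the model is the data flow: the IRQ number is looked up through the queue's cq_vector index, and the mask comes from the hardware context rather than from the driver's own CPU assignment.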

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-13 Thread Jens Axboe
On 06/13/2014 09:16 AM, Keith Busch wrote:
> On Fri, 13 Jun 2014, Jens Axboe wrote:
>> On 06/13/2014 09:05 AM, Keith Busch wrote:
>>> Here are the performance drops observed with blk-mq with the existing
>>> driver as baseline:
>>>
>>> CPU : Drop
>>> ----:-----
>>>   0 : -6%
>>>   8 : -36%
>>>

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-13 Thread Keith Busch
On Fri, 13 Jun 2014, Jens Axboe wrote: On 06/13/2014 09:05 AM, Keith Busch wrote: Here are the performance drops observed with blk-mq with the existing driver as baseline:
  CPU : Drop
  ----:-----
    0 : -6%
    8 : -36%
   16 : -12%
We need the hints back for sure, I'll run some of the same

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-13 Thread Jens Axboe
On 06/13/2014 09:05 AM, Keith Busch wrote: > On Fri, 13 Jun 2014, Jens Axboe wrote: >> On 06/12/2014 06:06 PM, Keith Busch wrote: >>> When cancelling IOs, we have to check if the hwctx has a valid tags >>> for some reason. I have 32 cores in my system and as many queues, but >> >> It's because

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-13 Thread Keith Busch
On Fri, 13 Jun 2014, Jens Axboe wrote: On 06/12/2014 06:06 PM, Keith Busch wrote: When cancelling IOs, we have to check if the hwctx has a valid tags for some reason. I have 32 cores in my system and as many queues, but It's because unused queues are torn down, to save memory. blk-mq is

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-13 Thread Jens Axboe
On 06/12/2014 06:06 PM, Keith Busch wrote: > When cancelling IOs, we have to check if the hwctx has a valid tags > for some reason. I have 32 cores in my system and as many queues, but It's because unused queues are torn down, to save memory. > blk-mq is only using half of those queues and freed

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-13 Thread Jens Axboe
On 06/12/2014 06:06 PM, Keith Busch wrote: When cancelling IOs, we have to check if the hwctx has a valid tags for some reason. I have 32 cores in my system and as many queues, but It's because unused queues are torn down, to save memory. blk-mq is only using half of those queues and freed

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-13 Thread Jens Axboe
On 06/13/2014 09:05 AM, Keith Busch wrote: On Fri, 13 Jun 2014, Jens Axboe wrote: On 06/12/2014 06:06 PM, Keith Busch wrote: When cancelling IOs, we have to check if the hwctx has a valid tags for some reason. I have 32 cores in my system and as many queues, but It's because unused queues

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-13 Thread Jens Axboe
On 06/13/2014 09:16 AM, Keith Busch wrote: On Fri, 13 Jun 2014, Jens Axboe wrote: On 06/13/2014 09:05 AM, Keith Busch wrote: Here are the performance drops observed with blk-mq with the existing driver as baseline:
  CPU : Drop
  ----:-----
    0 : -6%
    8 : -36%
   16 : -12%
We need the

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-13 Thread Keith Busch
On Fri, 13 Jun 2014, Jens Axboe wrote: OK, same setup as mine. The affinity hint is really screwing us over, no question about it. We just need a: irq_set_affinity_hint(dev->entry[nvmeq->cq_vector].vector, hctx->cpumask); in the ->init_hctx() methods to fix that up. That brings us to roughly the

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-13 Thread Jens Axboe
On 06/13/2014 01:22 PM, Keith Busch wrote: On Fri, 13 Jun 2014, Jens Axboe wrote: OK, same setup as mine. The affinity hint is really screwing us over, no question about it. We just need a: irq_set_affinity_hint(dev->entry[nvmeq->cq_vector].vector, hctx->cpumask); in the ->init_hctx() methods

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-13 Thread Jens Axboe
On 2014-06-13 13:29, Jens Axboe wrote: On 06/13/2014 01:22 PM, Keith Busch wrote: On Fri, 13 Jun 2014, Jens Axboe wrote: OK, same setup as mine. The affinity hint is really screwing us over, no question about it. We just need a: irq_set_affinity_hint(dev->entry[nvmeq->cq_vector].vector,

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-13 Thread Jens Axboe
On 06/13/2014 01:22 PM, Keith Busch wrote: One performance oddity we observe is that servicing the interrupt on the thread sibling of the core that submitted the I/O is the worst performing cpu you can choose; it's actually better to use a different core on the same node. At least that's true

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-12 Thread Keith Busch
On Thu, 12 Jun 2014, Keith Busch wrote: On Thu, 12 Jun 2014, Matias Bjørling wrote: On 06/12/2014 12:51 AM, Keith Busch wrote: So far so good: it passed the test that was previously failing. I'll let the remaining xfstests run and see what happens. Great. The flushes was a fluke. I haven't

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-12 Thread Keith Busch
On Thu, 12 Jun 2014, Matias Bjørling wrote: On 06/12/2014 12:51 AM, Keith Busch wrote: So far so good: it passed the test that was previously failing. I'll let the remaining xfstests run and see what happens. Great. The flushes was a fluke. I haven't been able to reproduce. Cool, most of

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-12 Thread Matias Bjørling
On 06/12/2014 12:51 AM, Keith Busch wrote: On Wed, 11 Jun 2014, Matias Bjørling wrote: I've rebased nvmemq_review and added two patches from Jens that add support for requests with single range virtual addresses. Keith, will you take it for a spin and see if it fixes 068 for you? There might

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-11 Thread Keith Busch
On Wed, 11 Jun 2014, Matias Bjørling wrote: I've rebased nvmemq_review and added two patches from Jens that add support for requests with single range virtual addresses. Keith, will you take it for a spin and see if it fixes 068 for you? There might still be a problem with some flushes, I'm

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-11 Thread Matias Bjørling
On Wed, Jun 11, 2014 at 7:09 PM, Matthew Wilcox wrote: > On Wed, Jun 11, 2014 at 10:54:52AM -0600, Jens Axboe wrote: >> OK, so essentially any single request must be a virtually contig piece >> of memory. Is there any size limitations to how big this contig segment >> can be? > > The maximum size

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-11 Thread Matthew Wilcox
On Wed, Jun 11, 2014 at 10:54:52AM -0600, Jens Axboe wrote: > OK, so essentially any single request must be a virtually contig piece > of memory. Is there any size limitations to how big this contig segment > can be? The maximum size of an I/O is 65536 sectors. So on a 512-byte sector device,
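
For scale, a quick calculation of what a 65536-sector limit works out to in bytes. The message above is cut off before it gives a figure, so the numbers here are plain arithmetic rather than a quote. MAX_IO_SECTORS and max_io_bytes() are hypothetical names used only for this illustration.

```c
#include <stdint.h>

/* Sector-count limit quoted in the message above. */
#define MAX_IO_SECTORS 65536ULL

/* Largest single I/O in bytes for a given logical sector size. */
static uint64_t max_io_bytes(uint32_t sector_size)
{
	return MAX_IO_SECTORS * sector_size;
}
```

With these inputs, max_io_bytes(512) is 33554432 bytes (32 MiB) and max_io_bytes(4096) is 268435456 bytes (256 MiB), the latter matching the 4k-formatted device mentioned elsewhere in the thread.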

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-11 Thread Jens Axboe
On 06/10/2014 03:33 PM, Matthew Wilcox wrote: > On Tue, Jun 10, 2014 at 03:21:18PM -0600, Keith Busch wrote: >> Yeah, nvme_setup_prps is probably the least readable code in this driver. >> Maybe some comments are in order here... >> >> There are two rules for an SGL to be mappable to a PRP: >> >>

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-11 Thread Jens Axboe
On 06/10/2014 03:33 PM, Matthew Wilcox wrote: On Tue, Jun 10, 2014 at 03:21:18PM -0600, Keith Busch wrote: Yeah, nvme_setup_prps is probably the least readable code in this driver. Maybe some comments are in order here... There are two rules for an SGL to be mappable to a PRP: 1. Every

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-11 Thread Matthew Wilcox
On Wed, Jun 11, 2014 at 10:54:52AM -0600, Jens Axboe wrote: OK, so essentially any single request must be a virtually contig piece of memory. Is there any size limitations to how big this contig segment can be? The maximum size of an I/O is 65536 sectors. So on a 512-byte sector device,

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-11 Thread Matias Bjørling
On Wed, Jun 11, 2014 at 7:09 PM, Matthew Wilcox wi...@linux.intel.com wrote: On Wed, Jun 11, 2014 at 10:54:52AM -0600, Jens Axboe wrote: OK, so essentially any single request must be a virtually contig piece of memory. Is there any size limitations to how big this contig segment can be? The

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-10 Thread Matthew Wilcox
On Tue, Jun 10, 2014 at 03:21:18PM -0600, Keith Busch wrote: > Yeah, nvme_setup_prps is probably the least readable code in this driver. > Maybe some comments are in order here... > > There are two rules for an SGL to be mappable to a PRP: > > 1. Every element must have zero page offset, except

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-10 Thread Keith Busch
On Tue, 10 Jun 2014, Jens Axboe wrote: On 06/10/2014 03:10 PM, Keith Busch wrote: On Tue, 10 Jun 2014, Jens Axboe wrote: On 06/10/2014 01:29 PM, Keith Busch wrote: I have two devices, one formatted 4k, the other 512. The 4k is used as the TEST_DEV and 512 is used as SCRATCH_DEV. I'm always

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-10 Thread Jens Axboe
On 06/10/2014 03:10 PM, Keith Busch wrote: > On Tue, 10 Jun 2014, Jens Axboe wrote: >> On 06/10/2014 01:29 PM, Keith Busch wrote: >>> I have two devices, one formatted 4k, the other 512. The 4k is used as >>> the TEST_DEV and 512 is used as SCRATCH_DEV. I'm always hitting a BUG >>> when >>>

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-10 Thread Keith Busch
On Tue, 10 Jun 2014, Jens Axboe wrote: On 06/10/2014 01:29 PM, Keith Busch wrote: I have two devices, one formatted 4k, the other 512. The 4k is used as the TEST_DEV and 512 is used as SCRATCH_DEV. I'm always hitting a BUG when unmounting the scratch dev in xfstests generic/068. The bug looks

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-10 Thread Jens Axboe
On 06/10/2014 01:29 PM, Keith Busch wrote: > On Tue, 10 Jun 2014, Jens Axboe wrote: >>> On Jun 10, 2014, at 9:52 AM, Keith Busch wrote: >>> On Tue, 10 Jun 2014, Matias Bjørling wrote: This converts the current NVMe driver to utilize the blk-mq layer. >>> >>> I'd like to run xfstests on

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-10 Thread Keith Busch
On Tue, 10 Jun 2014, Jens Axboe wrote: On Jun 10, 2014, at 9:52 AM, Keith Busch wrote: On Tue, 10 Jun 2014, Matias Bjørling wrote: This converts the current NVMe driver to utilize the blk-mq layer. I'd like to run xfstests on this, but it is failing mkfs.xfs. I honestly don't know much

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-10 Thread Jens Axboe
> On Jun 10, 2014, at 9:52 AM, Keith Busch wrote: > >> On Tue, 10 Jun 2014, Matias Bjørling wrote: >> This converts the current NVMe driver to utilize the blk-mq layer. > > I'd like to run xfstests on this, but it is failing mkfs.xfs. I honestly > don't know much about this area, but I think

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-10 Thread Keith Busch
On Tue, 10 Jun 2014, Matias Bjørling wrote: This converts the current NVMe driver to utilize the blk-mq layer. I'd like to run xfstests on this, but it is failing mkfs.xfs. I honestly don't know much about this area, but I think this may be from the recent chunk sectors patch causing a

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-10 Thread Jens Axboe
On Jun 10, 2014, at 9:52 AM, Keith Busch keith.bu...@intel.com wrote: On Tue, 10 Jun 2014, Matias Bjørling wrote: This converts the current NVMe driver to utilize the blk-mq layer. I'd like to run xfstests on this, but it is failing mkfs.xfs. I honestly don't know much about this area,

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-10 Thread Keith Busch
On Tue, 10 Jun 2014, Jens Axboe wrote: On Jun 10, 2014, at 9:52 AM, Keith Busch keith.bu...@intel.com wrote: On Tue, 10 Jun 2014, Matias Bjørling wrote: This converts the current NVMe driver to utilize the blk-mq layer. I'd like to run xfstests on this, but it is failing mkfs.xfs. I honestly

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-10 Thread Jens Axboe
On 06/10/2014 01:29 PM, Keith Busch wrote: On Tue, 10 Jun 2014, Jens Axboe wrote: On Jun 10, 2014, at 9:52 AM, Keith Busch keith.bu...@intel.com wrote: On Tue, 10 Jun 2014, Matias Bjørling wrote: This converts the current NVMe driver to utilize the blk-mq layer. I'd like to run xfstests on

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-10 Thread Jens Axboe
On 06/10/2014 03:10 PM, Keith Busch wrote: On Tue, 10 Jun 2014, Jens Axboe wrote: On 06/10/2014 01:29 PM, Keith Busch wrote: I have two devices, one formatted 4k, the other 512. The 4k is used as the TEST_DEV and 512 is used as SCRATCH_DEV. I'm always hitting a BUG when unmounting the

Re: [PATCH v7] NVMe: conversion to blk-mq

2014-06-10 Thread Matthew Wilcox
On Tue, Jun 10, 2014 at 03:21:18PM -0600, Keith Busch wrote: Yeah, nvme_setup_prps is probably the least readable code in this driver. Maybe some comments are in order here... There are two rules for an SGL to be mappable to a PRP: 1. Every element must have zero page offset, except the
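
The preview above is cut off partway through the first rule. As a rough illustration only, the sketch below assumes the commonly stated NVMe PRP constraints: every element except the first starts at page offset zero, and every element except the last ends on a page boundary. That reading is an assumption, not a quote from the message, and struct seg, PAGE_SZ, and segs_fit_prp() are hypothetical names, not the driver's.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u	/* assumed page size for the illustration */

/* Hypothetical flattened scatterlist element: bus address and length. */
struct seg {
	uint64_t addr;
	uint32_t len;
};

/* Returns true if the segment list could be described by PRP entries
 * under the two assumed rules. */
static bool segs_fit_prp(const struct seg *s, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		/* Assumed rule 1: only the first element may start mid-page. */
		if (i > 0 && (s[i].addr % PAGE_SZ) != 0)
			return false;
		/* Assumed rule 2: only the last element may end mid-page. */
		if (i + 1 < n && ((s[i].addr + s[i].len) % PAGE_SZ) != 0)
			return false;
	}
	return true;
}
```

A list such as {0x1200, 0xe00}, {0x5000, 0x1000}, {0x9000, 0x200} passes, because only the first element starts mid-page and only the last ends mid-page. A first element of {0x1000, 0x800} followed by anything else fails, because it ends at 0x1800, in the middle of a page.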