[PATCH 0/7] ACPI HMAT memory sysfs representation

2018-11-14 Thread Keith Busch
characteristics about the memory nodes. Documentation describing the new representation is provided. Finally the series adds a kernel user for these new APIs from parsing the ACPI HMAT. Keith Busch (7): node: Link memory nodes to their compute nodes node: Add heterogenous memory performance

[PATCH 6/7] acpi: Create subtable parsing infrastructure

2018-11-14 Thread Keith Busch
parsing the entries array may be more reused for all ACPI system tables. Signed-off-by: Keith Busch --- drivers/acpi/tables.c | 75 --- 1 file changed, 65 insertions(+), 10 deletions(-) diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c index

[PATCH 3/7] doc/vm: New documentation for memory performance

2018-11-14 Thread Keith Busch
and the attributes the kernel makes available to aid applications wishing to query this information. Signed-off-by: Keith Busch --- Documentation/vm/numaperf.rst | 71 +++ 1 file changed, 71 insertions(+) create mode 100644 Documentation/vm/numaperf.rst diff --git

[PATCH 5/7] doc/vm: New documentation for memory cache

2018-11-14 Thread Keith Busch
attributes for application optimization. Signed-off-by: Keith Busch --- Documentation/vm/numacache.rst | 76 ++ 1 file changed, 76 insertions(+) create mode 100644 Documentation/vm/numacache.rst diff --git a/Documentation/vm/numacache.rst b/Documentation/vm

[PATCH 2/7] node: Add heterogenous memory performance

2018-11-14 Thread Keith Busch
/nodeY/initiator_access /sys/devices/system/node/nodeY/initiator_access |-- read_bandwidth |-- read_latency |-- write_bandwidth `-- write_latency The bandwidth is exported as MB/s and latency is reported in nanoseconds. Signed-off-by: Keith Busch --- drivers/base/Kconfig | 8
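A minimal userspace sketch of how an application might read one of these attributes, assuming each file holds a single unsigned decimal value as the description above implies. The helper name read_node_attr() is hypothetical and is not part of the series, and the attribute layout is the one proposed in this patch, which may not match what was eventually merged.

```c
#include <stdio.h>

/*
 * Read one unsigned decimal value from a sysfs-style attribute file.
 * Returns 0 on success and -1 if the file is missing or malformed.
 * read_node_attr() is a hypothetical helper, not part of the series.
 */
static int read_node_attr(const char *path, unsigned long long *value)
{
	FILE *f = fopen(path, "r");
	int matched;

	if (!f)
		return -1;
	matched = fscanf(f, "%llu", value);
	fclose(f);
	return matched == 1 ? 0 : -1;
}
```

A caller could pass a path such as /sys/devices/system/node/node1/initiator_access/read_bandwidth (node1 is only a placeholder) and interpret the result as MB/s, or as nanoseconds for the latency files.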

[PATCH 7/7] acpi/hmat: Parse and report heterogeneous memory

2018-11-14 Thread Keith Busch
with the memory numa nodes so they can be observed through sysfs. Signed-off-by: Keith Busch --- drivers/acpi/Kconfig | 9 ++ drivers/acpi/Makefile | 1 + drivers/acpi/hmat.c | 384 ++ drivers/acpi/tables.c | 10 ++ 4 files changed, 404 insertions

[PATCH 4/7] node: Add memory caching attributes

2018-11-14 Thread Keith Busch
. Signed-off-by: Keith Busch --- drivers/base/node.c | 117 +++ include/linux/node.h | 23 ++ 2 files changed, 140 insertions(+) diff --git a/drivers/base/node.c b/drivers/base/node.c index 232535761998..bb94f1d18115 100644 --- a/drivers/base

[PATCH 1/7] node: Link memory nodes to their compute nodes

2018-11-14 Thread Keith Busch
Y' local to compute node 'X': # ls -l /sys/devices/system/node/nodeX/initiator* /sys/devices/system/node/nodeX/targetY -> ../nodeY # ls -l /sys/devices/system/node/nodeY/target* /sys/devices/system/node/nodeY/initiatorX -> ../nodeX Signed-off-by: Keith Busch ---

Re: [PATCH 1/7] node: Link memory nodes to their compute nodes

2018-11-15 Thread Keith Busch
On Thu, Nov 15, 2018 at 05:57:10AM -0800, Matthew Wilcox wrote: > On Wed, Nov 14, 2018 at 03:49:14PM -0700, Keith Busch wrote: > > Memory-only nodes will often have affinity to a compute node, and > > platforms have ways to express that locality relationship. > > >

Re: [PATCH 0/7] ACPI HMAT memory sysfs representation

2018-11-16 Thread Keith Busch
On Fri, Nov 16, 2018 at 11:57:58AM +0530, Anshuman Khandual wrote: > On 11/15/2018 04:19 AM, Keith Busch wrote: > > This series provides a new sysfs representation for heterogeneous > > system memory. > > > > The previous series that was specific to HMAT that this serie

Re: [PATCH 1/7] node: Link memory nodes to their compute nodes

2018-11-16 Thread Keith Busch
On Thu, Nov 15, 2018 at 12:36:54PM -0800, Matthew Wilcox wrote: > On Thu, Nov 15, 2018 at 07:59:20AM -0700, Keith Busch wrote: > > On Thu, Nov 15, 2018 at 05:57:10AM -0800, Matthew Wilcox wrote: > > > On Wed, Nov 14, 2018 at 03:49:14PM -0700, Keith Busch wrote: > > > >

Re: Enable tracing only for one function and its children?

2018-11-16 Thread Keith Busch
On Fri, Nov 16, 2018 at 04:37:55PM -0600, Timur Tabi wrote: > Is there a way to enable ftrace tracing only for one specific function > and all the functions it calls? Then when the function returns, > disable tracing until the next time? > > When I pass the function name only to

Re: [PATCH 2/7] node: Add heterogenous memory performance

2018-11-19 Thread Keith Busch
On Mon, Nov 19, 2018 at 09:05:07AM +0530, Anshuman Khandual wrote: > On 11/15/2018 04:19 AM, Keith Busch wrote: > > Heterogeneous memory systems provide memory nodes with latency > > and bandwidth performance attributes that are different from other > > nodes. Create an int

Re: [PATCH 1/7] node: Link memory nodes to their compute nodes

2018-11-19 Thread Keith Busch
On Mon, Nov 19, 2018 at 08:45:25AM +0530, Anshuman Khandual wrote: > On 11/17/2018 12:02 AM, Keith Busch wrote: > > On Thu, Nov 15, 2018 at 12:36:54PM -0800, Matthew Wilcox wrote: > >> So ... let's imagine a hypothetical system (I've never seen one built like > >> th

Re: [PATCH v2 2/2] PCI: pciehp: Add HXT quirk for Command Completed errata

2018-11-19 Thread Keith Busch
On Wed, Nov 07, 2018 at 03:25:05PM +0800, Shunyong Yang wrote: > The HXT SD4800 PCI controller does not set the Command Completed > bit unless writes to the Slot Command register change "Control" > bits. > > This patch adds SD4800 to the quirk. > > Cc: Joey Zheng > Signed-off-by: Shunyong Yang

Re: [PATCH 1/3] PCI/AER: Option to leave System Error Interrupts as-is

2018-11-02 Thread Keith Busch
On Fri, Nov 02, 2018 at 10:53:00AM +0100, Borislav Petkov wrote: > On Mon, Oct 29, 2018 at 04:06:51PM -0500, Bjorn Helgaas wrote: > > If I squint hard enough this sort of makes sense, but it also makes me > > confused about how the normal APEI firmware-first model works. > > > > In the

Re: [PATCH 1/3] PCI/AER: Option to leave System Error Interrupts as-is

2018-11-02 Thread Keith Busch
On Fri, Nov 02, 2018 at 05:26:23PM +0100, Borislav Petkov wrote: > On Fri, Nov 02, 2018 at 10:17:30AM -0600, Keith Busch wrote: > > VMD acts a bit like a host-bus adapter. The firmware knows about the > > adapter, but not about anything on the bus that it attaches to. > >

Re: linux-next: build failure after merge of the pci tree

2018-09-26 Thread Keith Busch
On Wed, Sep 26, 2018 at 03:00:51PM +1000, Stephen Rothwell wrote: > Hi Bjorn, > > After merging the pci tree, today's linux-next build (powerpc allnoconfig) > failed like this: > > ld: drivers/pci/pci.o: in function `pci_bus_error_reset': > pci.c:(.text+0x5fba): undefined reference to

Re: linux-next: build failure after merge of the pci tree

2018-09-26 Thread Keith Busch
On Wed, Sep 26, 2018 at 08:25:40AM -0600, Keith Busch wrote: > On Wed, Sep 26, 2018 at 03:00:51PM +1000, Stephen Rothwell wrote: > > Hi Bjorn, > > > > After merging the pci tree, today's linux-next build (powerpc allnoconfig) > > failed like this: > > >

Re: [PATCH v2] PCI/MSI: Don't touch MSI bits when the PCI device is disconnected

2018-11-08 Thread Keith Busch
On Thu, Nov 08, 2018 at 02:01:17PM -0800, Greg Kroah-Hartman wrote: > On Thu, Nov 08, 2018 at 02:09:17PM -0600, Bjorn Helgaas wrote: > > I'm having second thoughts about this. One thing I'm uncomfortable > > with is that sprinkling pci_dev_is_disconnected() around feels ad hoc > > instead of

Re: [PATCH v2] PCI/MSI: Don't touch MSI bits when the PCI device is disconnected

2018-11-08 Thread Keith Busch
On Thu, Nov 08, 2018 at 02:42:55PM -0800, Greg Kroah-Hartman wrote: > On Thu, Nov 08, 2018 at 03:32:58PM -0700, Keith Busch wrote: > > On Thu, Nov 08, 2018 at 02:01:17PM -0800, Greg Kroah-Hartman wrote: > > > On Thu, Nov 08, 2018 at 02:09:17PM -0600, Bjorn Helgaas wrote: > >

Re: [PATCH v2] PCI/MSI: Don't touch MSI bits when the PCI device is disconnected

2018-11-09 Thread Keith Busch
On Fri, Nov 09, 2018 at 03:32:57AM -0800, Greg Kroah-Hartman wrote: > On Fri, Nov 09, 2018 at 08:29:53AM +0100, Lukas Wunner wrote: > > On Thu, Nov 08, 2018 at 02:01:17PM -0800, Greg Kroah-Hartman wrote: > > > On Thu, Nov 08, 2018 at 02:09:17PM -0600, Bjorn Helgaas wrote: > > > > I'm having second

[PATCH 1/6] mm/gup_benchmark: Time put_page

2018-10-10 Thread Keith Busch
-off-by: Keith Busch --- mm/gup_benchmark.c | 8 ++-- tools/testing/selftests/vm/gup_benchmark.c | 6 -- 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/mm/gup_benchmark.c b/mm/gup_benchmark.c index 7405c9d89d65..b344abd6e8e4 100644 --- a/mm

[PATCH 2/6] mm/gup_benchmark: Add additional pinning methods

2018-10-10 Thread Keith Busch
This patch provides new gup benchmark ioctl commands to run different user page pinning methods, get_user_pages_longterm and get_user_pages, in addition to the existing get_user_pages_fast. Cc: Dave Hansen Cc: Dan Williams Acked-by: Kirill A. Shutemov Signed-off-by: Keith Busch --- mm

[PATCHv4] mm/gup: Cache dev_pagemap while pinning pages

2018-10-10 Thread Keith Busch
when dev_pagemap is used by caching the last dev_pagemap while getting user pages. The gup_benchmark kernel self test reports this reduces time to get user pages to as low as 1/3 of the previous time. Cc: Kirill Shutemov Cc: Dave Hansen Cc: Dan Williams Signed-off-by: Keith Busch --- v3 ->

[PATCH 3/6] tools/gup_benchmark: Fix 'write' flag usage

2018-10-10 Thread Keith Busch
If the '-w' parameter was provided, the benchmark would exit due to a missing 'break'. Cc: Dave Hansen Cc: Dan Williams Acked-by: Kirill A. Shutemov Signed-off-by: Keith Busch --- tools/testing/selftests/vm/gup_benchmark.c | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/testing
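The failure mode described above can be sketched as follows. This is an illustration only, not the benchmark's actual code: struct bench_opts and apply_option() are hypothetical names, and the real option parser is not shown in this snippet.

```c
#include <stdbool.h>

/* Hypothetical option state; only the '-w' flag is modelled here. */
struct bench_opts {
	bool write;	/* set by -w */
};

/* Returns 0 if the option was recognised, -1 otherwise. */
static int apply_option(int opt, struct bench_opts *o)
{
	switch (opt) {
	case 'w':
		o->write = true;
		break;	/* without this, control falls into the default case */
	default:
		return -1;	/* unknown option: the caller is expected to exit */
	}
	return 0;
}
```

With the 'break' removed, the 'w' case would fall through to the default case and report an error even though the flag was valid, which matches the symptom the patch describes.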

[PATCH 4/6] tools/gup_benchmark: Allow user specified file

2018-10-10 Thread Keith Busch
Williams Signed-off-by: Keith Busch --- tools/testing/selftests/vm/gup_benchmark.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/vm/gup_benchmark.c b/tools/testing/selftests/vm/gup_benchmark.c index b2082df8beb4..b675a3d60975 100644

[PATCH 6/6] tools/gup_benchmark: Add MAP_HUGETLB option

2018-10-10 Thread Keith Busch
This patch adds a new option, '-H', to the gup benchmark to help compare how hugetlb mapping pages compare with the default. Signed-off-by: Keith Busch --- tools/testing/selftests/vm/gup_benchmark.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests

[PATCH 5/6] tools/gup_benchmark: Add MAP_SHARED option

2018-10-10 Thread Keith Busch
This patch adds a new benchmark option, -S, to request MAP_SHARED. This can be used to compare with MAP_PRIVATE, or for files that require this option, like dax. Signed-off-by: Keith Busch --- tools/testing/selftests/vm/gup_benchmark.c | 10 +++--- 1 file changed, 7 insertions(+), 3

Re: [PATCH 1/6] mm/gup_benchmark: Time put_page

2018-10-10 Thread Keith Busch
On Wed, Oct 10, 2018 at 03:26:55PM -0700, Andrew Morton wrote: > On Wed, 10 Oct 2018 13:56:00 -0600 Keith Busch wrote: > > > We'd like to measure time to unpin user pages, so this adds a second > > benchmark timer on put_page, separate from get_page. > > > >

Re: [PATCH 1/6] mm/gup_benchmark: Time put_page

2018-10-10 Thread Keith Busch
On Wed, Oct 10, 2018 at 03:41:11PM -0700, Andrew Morton wrote: > On Wed, 10 Oct 2018 16:28:43 -0600 Keith Busch wrote: > > > > > struct gup_benchmark { > > > > - __u64 delta_usec; > > > > + __u64 get_delta_usec; > > > > +

Re: [PATCH 4/6] tools/gup_benchmark: Allow user specified file

2018-10-10 Thread Keith Busch
On Wed, Oct 10, 2018 at 03:31:01PM -0700, Andrew Morton wrote: > On Wed, 10 Oct 2018 13:56:03 -0600 Keith Busch wrote: > > + filed = open(file, O_RDWR|O_CREAT); > > + if (filed < 0) > > + perror("open"), exit(filed); > > Ick. Like thi
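The review comment is cut off in this snippet, so the following is not a reconstruction of what the reviewer suggested. Independently of that, POSIX requires a third mode argument whenever O_CREAT is passed to open(), and the quoted call omits it. A minimal sketch of a more conventional form, assuming a hypothetical helper name open_backing_file() and an arbitrarily chosen mode of 0664:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Open (or create) the file to be mapped by the benchmark.
 * Returns the file descriptor, or -1 after printing the error.
 * open_backing_file() and the 0664 mode are illustrative choices.
 */
static int open_backing_file(const char *path)
{
	int fd = open(path, O_RDWR | O_CREAT, 0664);

	if (fd < 0)
		perror("open");
	return fd;
}
```

Returning the error to the caller, rather than chaining perror() and exit() with the comma operator, leaves the exit policy in one place.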

Re: [PATCH v3] PCI/AER: Enable reporting for ports enumerated after AER driver registration

2018-10-11 Thread Keith Busch
On Thu, Oct 11, 2018 at 08:26:18AM -0700, Bjorn Helgaas wrote: > From: Bjorn Helgaas > > Previously we enabled AER error reporting only for Switch Ports that were > enumerated prior to registering the AER service driver. Switch Ports > enumerated after AER driver registration were left with

[PATCHv2] mm/gup: Cache dev_pagemap while pinning pages

2018-10-11 Thread Keith Busch
when dev_pagemap is used by caching the last dev_pagemap while getting user pages. The gup_benchmark kernel self test reports this reduces time to get user pages to as low as 1/3 of the previous time. Cc: Kirill Shutemov Cc: Dave Hansen Cc: Dan Williams Signed-off-by: Keith Busch --- Changes from

Re: [PATCHv2] mm/gup: Cache dev_pagemap while pinning pages

2018-10-12 Thread Keith Busch
On Fri, Oct 12, 2018 at 09:58:18AM -0700, Dan Williams wrote: > On Fri, Oct 12, 2018 at 4:00 AM Kirill A. Shutemov > wrote: > [..] > > > Does this have defined behavior? I would feel better with " = { 0 }" > > > to be explicit. > > > > Well, it's not allowed by the standard, but GCC allows this.

[PATCHv3] mm/gup: Cache dev_pagemap while pinning pages

2018-10-12 Thread Keith Busch
when dev_pagemap is used by caching the last dev_pagemap while getting user pages. The gup_benchmark kernel self test reports this reduces time to get user pages to as low as 1/3 of the previous time. Cc: Dave Hansen Reviewed-by: Dan Williams Acked-by: Kirill A. Shutemov Signed-off-by: Keith Busch

Re: [PATCH 00/12] error handling and pciehp maintenance

2018-11-06 Thread Keith Busch
On Tue, Nov 06, 2018 at 04:34:08PM +0000, Lorenzo Pieralisi wrote: > The question is whether we really need to dynamically patch the kernel > with ftrace to achieve what that patch does. > > Furthermore, it would also be good to report what bugs we are actually > fixing, from what you are writing

Re: [PATCH 00/12] error handling and pciehp maintenance

2018-11-06 Thread Keith Busch
On Tue, Nov 06, 2018 at 05:21:00PM +0000, Lorenzo Pieralisi wrote: > If you have a simple reproducer for the bugs I am happy to help you test > it (I can also apply arm64 DYNAMIC_FTRACE_WITH_REGS patches and test that > new code path if that's the final direction we are taking). The easiest way

[PATCH 1/7] mm/gup_benchmark: Time put_page

2018-09-19 Thread Keith Busch
We'd like to measure time to unpin user pages, so this adds a second benchmark timer on put_page, separate from get_page. This will break ABI on this ioctl, but being an in-kernel benchmark may be acceptable. Cc: Kirill Shutemov Cc: Dave Hansen Cc: Dan Williams Signed-off-by: Keith Busch

[PATCH 2/7] mm/gup_benchmark: Add additional pinning methods

2018-09-19 Thread Keith Busch
This patch provides new gup benchmark ioctl commands to run different user page pinning methods, get_user_pages_longterm and get_user_pages, in addition to the existing get_user_pages_fast. Cc: Kirill Shutemov Cc: Dave Hansen Cc: Dan Williams Signed-off-by: Keith Busch --- mm/gup_benchmark.c

[PATCH 7/7] mm/gup: Cache dev_pagemap while pinning pages

2018-09-19 Thread Keith Busch
This avoids a repeated costly radix tree lookup when dev_pagemap is used. Cc: Kirill Shutemov Cc: Dave Hansen Cc: Dan Williams Signed-off-by: Keith Busch --- include/linux/mm.h | 8 +++- mm/gup.c | 41 - mm/huge_memory.c | 35

[PATCH 5/7] tools/gup_benchmark: Add parameter for hugetlb

2018-09-19 Thread Keith Busch
Cc: Kirill Shutemov Cc: Dave Hansen Cc: Dan Williams Signed-off-by: Keith Busch --- tools/testing/selftests/vm/gup_benchmark.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/vm/gup_benchmark.c b/tools/testing/selftests/vm/gup_benchmark.c index

[PATCH 6/7] mm/gup: Combine parameters into struct

2018-09-19 Thread Keith Busch
This will make it easier to add new parameters that we may wish to thread through these function calls. Cc: Kirill Shutemov Cc: Dave Hansen Cc: Dan Williams Signed-off-by: Keith Busch --- include/linux/huge_mm.h | 12 +-- include/linux/hugetlb.h | 2 +- include/linux/mm.h | 21

[PATCH 4/7] tools/gup_benchmark: Allow user specified file

2018-09-19 Thread Keith Busch
The gup benchmark by default maps anonymous memory. This patch allows a user to specify a file to map, providing a means to test various file backings, like device and filesystem DAX. Cc: Kirill Shutemov Cc: Dave Hansen Cc: Dan Williams Signed-off-by: Keith Busch --- tools/testing/selftests

[PATCH 0/7] mm: faster get user pages

2018-09-19 Thread Keith Busch
After: 375786 usec Not bad; the after is the same time as using baseline anonymous system RAM after this patch set, where before was nearly 3x longer. Keith Busch (7): mm/gup_benchmark: Time put_page mm/gup_benchmark: Add additional pinning methods tools/gup_benchmark: Fix 'write' flag

[PATCH 3/7] tools/gup_benchmark: Fix 'write' flag usage

2018-09-19 Thread Keith Busch
If the '-w' parameter was provided, the benchmark would exit due to a missing 'break'. Cc: Kirill Shutemov Cc: Dave Hansen Cc: Dan Williams Signed-off-by: Keith Busch --- tools/testing/selftests/vm/gup_benchmark.c | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/testing/selftests/vm

Re: [PATCH 0/7] mm: faster get user pages

2018-09-19 Thread Keith Busch
On Wed, Sep 19, 2018 at 02:15:28PM -0700, Dave Hansen wrote: > On 09/19/2018 02:02 PM, Keith Busch wrote: > > Pinning user pages out of nvdimm dax memory is significantly slower > > compared to system ram. Analysis points to software overhead incurred > > from a radix tr

Re: [PATCH 6/7] mm/gup: Combine parameters into struct

2018-09-19 Thread Keith Busch
On Wed, Sep 19, 2018 at 03:02:49PM -0600, Keith Busch wrote: > if (is_hugepd(__hugepd(pmd_val(pmdval)))) { > - page = follow_huge_pd(vma, address, > - __hugepd(pmd_val(pmdval)), flags, > -

Re: [PATCH 00/12] error handling and pciehp maintenance

2018-10-08 Thread Keith Busch
so makes the error > injector dependent on DYNAMIC_FTRACE_WITH_REGS, which not all arches > support. Note that this question is only about the error *injection* > module used for testing. It doesn't affect AER support itself.] > > On Thu, Oct 04, 2018 at 04:11:37PM -0600, Keith Busc

Re: [PATCH v13] NVMe: Convert to blk-mq

2014-09-30 Thread Keith Busch
On Tue, 30 Sep 2014, Matias Bjørling wrote: @@ -1967,27 +1801,30 @@ static struct nvme_ns *nvme_alloc_ns(struct nvme_dev *dev, unsigned nsid, { ... - ns->queue->queue_flags = QUEUE_FLAG_DEFAULT; + queue_flag_set_unlocked(QUEUE_FLAG_DEFAULT, ns->queue); Instead of the above, you

[PATCH] block: Fix dev_t liftime allocations

2014-08-25 Thread Keith Busch
-by: Keith Busch --- This was briefly discussed here: http://lists.infradead.org/pipermail/linux-nvme/2014-August/001120.html This patch goes one step further and fixes the same problem for partitions and disks. block/genhd.c | 18 +- block/partition-generic.c |2 +- 2

[PATCHv2] block: Fix dev_t minor allocation lifetime

2014-08-26 Thread Keith Busch
-by: Keith Busch --- v1->v2: Applied comments from Willy: fixed gfp mask in idr_alloc to not wait, and preload. block/genhd.c | 24 ++-- block/partition-generic.c |2 +- 2 files changed, 15 insertions(+), 11 deletions(-) diff --git a/block/genhd.c b/bl

Re: 4.0.0-rc4 NVMe NULL pointer dereference and hang

2015-03-23 Thread Keith Busch
On Sun, 22 Mar 2015, Steven Noonan wrote: This happens on boot, and then eventually results in an RCU stall. [8.047533] nvme 0000:05:00.0: Device not ready; aborting initialisation Note that the above is expected with this hardware (long story). Although 3.19.x prints the above and then

Re: [PATCH] NVMe: Fix error handling of class_create("nvme")

2015-03-16 Thread Keith Busch
On Fri, 6 Mar 2015, Alexey Khoroshilov wrote: class_create() returns ERR_PTR on failure, so IS_ERR() should be used instead of check for NULL. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov Thanks for the fix. Acked-by: Keith Busch
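To illustrate why a NULL check misses this kind of failure, here is a userspace sketch with simplified stand-ins for the kernel's ERR_PTR(), PTR_ERR() and IS_ERR() helpers. These are modelled on the kernel's error-pointer convention but are not the kernel's own definitions, and fake_class_create() is a hypothetical function that only mimics the failure behaviour described above.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer (simplified stand-in). */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

/* Recover the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

/* True if the pointer lies in the top MAX_ERRNO addresses. */
static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical allocator that fails by returning an encoded errno. */
static void *fake_class_create(int fail)
{
	static int dummy_class;

	return fail ? ERR_PTR(-ENOMEM) : (void *)&dummy_class;
}
```

An error pointer is non-NULL, so a check of the form "if (!ptr)" treats the failure as success; IS_ERR() is the test that catches it.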

Re: [PATCH] NVMe: Avoid interrupt disable during queue init.

2015-05-22 Thread Keith Busch
On Thu, 21 May 2015, Parav Pandit wrote: On Fri, May 22, 2015 at 1:04 AM, Keith Busch wrote: The q_lock is held to protect polling from reading inconsistent data. Ah, yes. I can see the nvme_kthread can poll the CQ while it's getting created through the nvme_resume(). I think this opens up

Re: [PATCH] NVMe: Avoid interrupt disable during queue init.

2015-05-22 Thread Keith Busch
On Fri, 22 May 2015, Parav Pandit wrote: On Fri, May 22, 2015 at 8:18 PM, Keith Busch wrote: The rcu protection on nvme queues was removed with the blk-mq conversion as we rely on that layer for h/w access. o.k. But above is at level where data I/Os are not even active. Its between

Re: [PATCH] NVMe: Avoid interrupt disable during queue init.

2015-05-22 Thread Keith Busch
On Fri, 22 May 2015, Parav Pandit wrote: During normal positive path probe, (a) device is added to dev_list in nvme_dev_start() (b) nvme_kthread got created, which will eventually refer to dev->queues[qid] to check for NULL. (c) dev_start() worker thread has started probing device and creating

Re: [PATCH] NVMe: Avoid interrupt disable during queue init.

2015-05-22 Thread Keith Busch
On Fri, 22 May 2015, Parav Pandit wrote: On Fri, May 22, 2015 at 9:53 PM, Keith Busch wrote: A memory barrier before incrementing the dev->queue_count (and assigning the pointer in the array before that) should address this concern. Sure. mb() will solve the publisher side problem.
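The ordering discussed above (store the queue pointer first, then make the incremented count visible) can be sketched in userspace with C11 atomics. This swaps in release/acquire ordering from stdatomic.h as a stand-in for the kernel barrier under discussion, assumes a single publisher, and uses hypothetical names (struct dev, publish_queue(), get_queue()) that are not the driver's.

```c
#include <stdatomic.h>
#include <stddef.h>

#define MAX_QUEUES 8

struct queue { int id; };

struct dev {
	struct queue *queues[MAX_QUEUES];
	atomic_int queue_count;
};

/* Publisher: store the pointer first, then make the new count visible. */
static int publish_queue(struct dev *d, struct queue *q)
{
	int n = atomic_load_explicit(&d->queue_count, memory_order_relaxed);

	if (n >= MAX_QUEUES)
		return -1;
	d->queues[n] = q;
	atomic_store_explicit(&d->queue_count, n + 1, memory_order_release);
	return n;
}

/* Reader: an acquire load of the count pairs with the release store above. */
static struct queue *get_queue(struct dev *d, int i)
{
	int n = atomic_load_explicit(&d->queue_count, memory_order_acquire);

	return (i >= 0 && i < n) ? d->queues[i] : NULL;
}
```

A reader that observes the new count is then guaranteed to observe the pointer stored before it, which is the publisher-side property the thread agrees on; the reader-side concerns raised in the follow-up are not addressed by this sketch.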

Re: [PATCH] NVMe: Avoid interrupt disable during queue init.

2015-05-22 Thread Keith Busch
On Fri, 22 May 2015, Parav Pandit wrote: I agree to it that nvmeq won't be null after mb(); That alone is not sufficient. What I have proposed in previous email is, Converting, struct nvme_queue *nvmeq = dev->queues[i]; if (!nvmeq) continue; spin_lock_irq(nvmeq->q_lock); to replace with,

Re: Persistent Reservation API

2015-08-04 Thread Keith Busch
On Tue, 4 Aug 2015, Christoph Hellwig wrote: NVMe support currently isn't included as I don't have a multihost NVMe setup to test on, but if I can find a volunteer to test it I'm happy to write the code for it. Looks pretty good so far. I'd be happy to try it out with NVMe subsystems.

[PATCH 2/3] QIB: Removing usage of pcie_set_mps()

2015-07-29 Thread Keith Busch
From: Dave Jiang This is in preparation for un-exporting the pcie_set_mps() function symbol. A driver should not be changing the MPS as that is the responsibility of the PCI subsystem. Signed-off-by: Dave Jiang --- drivers/infiniband/hw/qib/qib_pcie.c | 27 +-- 1 file

[PATCH 0/3] PCI-e Max Payload Size configuration

2015-07-29 Thread Keith Busch
thing" to update the down stream port to match the upstream port if it is capable. Dave Jiang (2): QIB: Removing usage of pcie_set_mps() PCIE: Remove symbol export for pcie_set_mps() Keith Busch (1): pci: Default MPS tuning to match upstream port arch/arm/kernel/bios32.c

[PATCH 3/3] PCIE: Remove symbol export for pcie_set_mps()

2015-07-29 Thread Keith Busch
From: Dave Jiang The setting of PCIe MPS should be left to the PCI subsystem and not the driver. An ill-configured MPS set by the driver could cause the device to not function or destabilize the entire system. Removing the exported symbol. Signed-off-by: Dave Jiang --- drivers/pci/pci.c |1 -

[PATCH] pci: Default MPS tuning, match upstream port

2015-07-29 Thread Keith Busch
, or explicit request to rescan. Signed-off-by: Keith Busch Cc: Dave Jiang Cc: Austin Bolen Cc: Myron Stowe Cc: Jon Mason Cc: Bjorn Helgaas --- arch/arm/kernel/bios32.c | 12 arch/powerpc/kernel/pci-common.c |7 --- arch/tile/kernel/pci_gx.c |4

Re: [Ksummit-discuss] [TECH TOPIC] IRQ affinity

2015-07-15 Thread Keith Busch
On Wed, 15 Jul 2015, Bart Van Assche wrote: * With blk-mq and scsi-mq optimal performance can only be achieved if the relationship between MSI-X vector and NUMA node does not change over time. This is necessary to allow a blk-mq/scsi-mq driver to ensure that interrupts are processed on the

Re: [PATCH] NVMe: Avoid interrupt disable during queue init.

2015-05-21 Thread Keith Busch
On Thu, 21 May 2015, Parav Pandit wrote: Avoid disabling interrupts and holding q_lock for the queue which is just getting initialized. With this change, online_queues is also incremented without lock during queue setup stage. If power management nvme_suspend() kicks in during queue setup time,

Re: [PATCH 5/5 v2] nvme: LightNVM support

2015-04-16 Thread Keith Busch
On Wed, 15 Apr 2015, Matias Bjørling wrote: @@ -2316,7 +2686,9 @@ static int nvme_dev_add(struct nvme_dev *dev) struct nvme_id_ctrl *ctrl; void *mem; dma_addr_t dma_addr; - int shift = NVME_CAP_MPSMIN(readq(&dev->bar->cap)) + 12; + u64 cap = readq(&dev->bar->cap); +

Re: [PATCH 5/5 v2] nvme: LightNVM support

2015-04-16 Thread Keith Busch
On Thu, 16 Apr 2015, Javier González wrote: On 16 Apr 2015, at 16:55, Keith Busch wrote: Otherwise it looks pretty good to me, but I think it would be cleaner if the lightnvm stuff is not mixed in the same file with the standard nvme command set. We might end up splitting nvme-core

RE: [PATCH 5/5 v2] nvme: LightNVM support

2015-04-16 Thread Keith Busch
On Thu, 16 Apr 2015, James R. Bergsten wrote: My two cents worth is that it's (always) better to put ALL the commands into one place so that the entire set can be viewed at once and thus avoid inadvertent overloading of an opcode. Otherwise you don't know what you don't know. Yes, but these

Re: [PATCH] NVMe: fix type warning on 32-bit

2015-05-28 Thread Keith Busch
of different size [-Wint-to-pointer-cast] In order to shut up that warning, this introduces a new temporary variable that uses a double cast to extract the pointer from an __u64 structure member. Thanks for the fix. Acked-by: Keith Busch Signed-off-by: Arnd Bergmann Fixes: a67a95134ff ("NVMe:

Re: [PATCH] block:Change the function, nvme_alloc_queue to use -ENOMEM for when failing memory allocations

2015-05-13 Thread Keith Busch
On Tue, 12 May 2015, Nicholas Krause wrote: This changes the function, nvme_alloc_queue, to use the kernel code, -ENOMEM, for when failing to allocate the memory required for the nvme_queue structure pointer, nvme, in order to correctly return to the caller the correct reason for this function's

Re: [PATCH] block:Remove including of the header file, linux/mm.h for the file,nvme-core.c

2015-05-13 Thread Keith Busch
On Wed, 13 May 2015, Matthew Wilcox wrote: On Wed, May 13, 2015 at 12:21:18PM -0400, Nicholas Krause wrote: This removes the include statement for including the header file, linux/mm.h in the file, nvme-core.c due to this driver file never calling any functions from the header file, linux/mm.h

Re: [PATCH 01/10] block: make generic_make_request handle arbitrarily sized bios

2015-04-28 Thread Keith Busch
On Tue, 28 Apr 2015, Christoph Hellwig wrote: This seems to lack support for QUEUE_FLAG_SG_GAPS to work around the retarded PRP format in the NVMe driver. Mighty strong words, sir! I'm missing the context here, but I'll say PRP is much more efficient for h/w to process over SGL, and the

[RFC PATCH 1/2] x86: PCI bus specific MSI operations

2015-08-27 Thread Keith Busch
This patch adds struct x86_msi_ops to x86's PCI sysdata. This gives a host bridge driver the option to provide alternate MSI Data Register and MSI-X Table Entry programming for devices in PCI domains that do not subscribe to usual "IOAPIC" format. Signed-off-by: Keith Busch CC: Brya

[RFC PATCH 0/2] Driver for new PCI-e device

2015-08-27 Thread Keith Busch
the VMD domain using the root bus configuration interface provided by the PCI subsystem. CC: Bryan Veal CC: Dan Williams CC: x...@kernel.org CC: linux-kernel@vger.kernel.org CC: linux-...@vger.kernel.org Keith Busch (2): x86: PCI bus specific MSI operations x86/pci: Initial commit for new VMD

[RFC PATCH 2/2] x86/pci: Initial commit for new VMD device driver

2015-08-27 Thread Keith Busch
led by BIOS for such endpoints. Contributors to this patch include: Artur Paszkiewicz Bryan Veal Jon Derrick Signed-off-by: Keith Busch CC: Bryan Veal CC: Dan Williams CC: x...@kernel.org CC: linux-kernel@vger.kernel.org CC: linux-...@vger.kernel.org --- arch/x86/Kconfig | 11 ++

Re: [RFC PATCH 1/2] x86: PCI bus specific MSI operations

2015-08-28 Thread Keith Busch
On Fri, 28 Aug 2015, Thomas Gleixner wrote: On Thu, 27 Aug 2015, Keith Busch wrote: This patch adds struct x86_msi_ops to x86's PCI sysdata. This gives a host bridge driver the option to provide alternate MSI Data Register and MSI-X Table Entry programming for devices in PCI domains that do

[PATCH] Regulator: Suppress compiler warnings

2015-08-31 Thread Keith Busch
though, and only uses the variables if they were successfully set, so suppressing the warning with uninitialized_val. Signed-off-by: Keith Busch --- drivers/regulator/helpers.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/regulator/helpers.c b/drivers/regulator

Re: [PATCH] Regulator: Suppress compiler warnings

2015-09-01 Thread Keith Busch
On Tue, 1 Sep 2015, Mark Brown wrote: On Tue, Sep 01, 2015 at 09:52:13AM +0900, Krzysztof Kozlowski wrote: 2015-09-01 1:41 GMT+09:00 Keith Busch : int regulator_is_enabled_regmap(struct regulator_dev *rdev) { - unsigned int val; + unsigned int uninitialized_var(val); int

Re: [PATCH] pci: Default MPS tuning, match upstream port

2015-08-17 Thread Keith Busch
On Mon, 17 Aug 2015, Bjorn Helgaas wrote: On Wed, Jul 29, 2015 at 04:18:53PM -0600, Keith Busch wrote: The new pcie tuning will check the device's MPS against the parent bridge when it is initially added to the pci subsystem, prior to attaching to a driver. If MPS is mismatched, the downstream

Re: Persistent Reservation API V2

2015-08-20 Thread Keith Busch
On Tue, 11 Aug 2015, Christoph Hellwig wrote: This series adds support for a simplified Persistent Reservation API to the block layer. The intent is that both in-kernel and userspace consumers can use the API instead of having to hand craft SCSI or NVMe command through the various pass through

Re: [PATCH 1/1] NVMe: Do not take nsid while a passthrough IO command is being issued via a block device file descriptor

2015-01-26 Thread Keith Busch
On Sun, 25 Jan 2015, Christoph Hellwig wrote: On Fri, Jan 23, 2015 at 03:57:06PM -0800, Yan Liu wrote: When a passthrough IO command is issued with a specific block device file descriptor, it should be applied to the namespace which is associated with that block device file descriptor. This

Re: [PATCH 1/1] NVMe: Do not take nsid while a passthrough IO command is being issued via a block device file descriptor

2015-01-21 Thread Keith Busch
On Wed, 21 Jan 2015, Yan Liu wrote: When a passthrough IO command is issued with a specific block device file descriptor, it should be applied to the namespace which is associated with that block device file descriptor. This patch makes such passthrough command ignore nsid in nvme_passthru_cmd

Re: [PATCH 1/1] NVMe: Do not take nsid while a passthrough IO command is being issued via a block device file descriptor

2015-01-22 Thread Keith Busch
On Wed, 21 Jan 2015, Yan Liu wrote: For IO passthrough command, it uses an IO queue associated with the device. Actually, this patch does not modify that part. This patch is not really focused on io queues; instead, it is more about namespace protection from other namespace's user ios. The

Re: [PATCH 1/1] NVMe: Do not take nsid while a passthrough IO command is being issued via a block device file descriptor

2015-01-22 Thread Keith Busch
On Thu, 22 Jan 2015, Christoph Hellwig wrote: On Thu, Jan 22, 2015 at 12:47:24AM +0000, Keith Busch wrote: The IOCTL's purpose was to let someone submit completely arbitrary commands on IO queues. This technically shouldn't even need a namespace handle, but we don't have a request_queue

Re: [PATCH 1/1] NVMe: Do not take nsid while a passthrough IO command is being issued via a block device file descriptor

2015-01-22 Thread Keith Busch
On Thu, 22 Jan 2015, Christoph Hellwig wrote: On Thu, Jan 22, 2015 at 03:21:28PM +0000, Keith Busch wrote: But if you really need to restrict namespace access, shouldn't that be enforced on the target side with reservations or similar mechanism? Think for example about containers where we

Re: [PATCH 1/1] NVMe: Do not take nsid while a passthrough IO command is being issued via a block device file descriptor

2015-01-23 Thread Keith Busch
On Thu, 22 Jan 2015, Christoph Hellwig wrote: On Thu, Jan 22, 2015 at 04:02:08PM -0800, Yan Liu wrote: When a passthrough IO command is issued with a specific block device file descriptor. It should be applied at the namespace which is associated with that block device file descriptor. This

Re: [PATCH 1/1] NVMe: Do not take nsid while a passthrough IO command is being issued via a block device file descriptor

2015-01-23 Thread Keith Busch
On Fri, 23 Jan 2015, Christoph Hellwig wrote: On Fri, Jan 23, 2015 at 04:22:02PM +, Keith Busch wrote: The namespace id should be enforced on block devices, but is there a problem allowing arbitrary commands through the management char device? I have a need for a pure passthrough

Re: [PATCH 1/1] NVMe : Corrected memory freeing.

2015-06-17 Thread Keith Busch
On Wed, 17 Jun 2015, Dheepthi K wrote: Memory freeing order has been corrected in case of allocation failure. This isn't necessary. The nvme_dev is zero'ed on allocation, and kfree(NULL or (void *)0) is okay to do. Signed-off-by: Dheepthi K --- drivers/block/nvme-core.c |7 --- 1

Re: [PATCH 4/5] lightnvm: NVMe integration

2014-10-08 Thread Keith Busch
On Wed, 8 Oct 2014, Matias Bjørling wrote: NVMe devices are identified by the vendor specific bits: Bit 3 in OACS (device-wide). Currently made per device, as the nvme namespace is missing in the completion path. The NVM-Express 1.2 actually defined this bit for Namespace Management, so I

Re: blk-mq crash with dm-multipath in for-3.20/core

2015-02-09 Thread Keith Busch
On Mon, 9 Feb 2015, Mike Snitzer wrote: On Mon, Feb 09 2015 at 11:38am -0500, Dongsu Park wrote: So that commit 6d6285c45f5a should be either reverted, or moved to linux-dm tree, doesn't it? Cheers, Dongsu [1] https://www.redhat.com/archives/dm-devel/2015-January/msg00171.html [2]

[PATCH] misc: Increase available dyanmic minors

2014-12-08 Thread Keith Busch
starts minors at the last defined misc minor (255) and works up to the max possible. Signed-off-by: Keith Busch --- drivers/char/misc.c | 23 +-- 1 file changed, 9 insertions(+), 14 deletions(-) diff --git a/drivers/char/misc.c b/drivers/char/misc.c index ffa97d2..229dba5 100644

Re: [PATCH] misc: Increase available dyanmic minors

2014-12-09 Thread Keith Busch
On Tue, 9 Dec 2014, Arnd Bergmann wrote: On Monday 08 December 2014 16:01:50 Keith Busch wrote: This increases the number of available miscellaneous character device dynamic minors from 63 to the max minor, 1M. Dynamic minor previously started at 63 and went down to zero. That's not enough

Re: [PATCH] NVMe: add explicit BLK_DEV_INTEGRITY dependency

2015-02-23 Thread Keith Busch
On Mon, 23 Feb 2015, Arnd Bergmann wrote: A patch that was added to 4.0-rc1 in the last minute caused a build break in the NVMe driver unless integrity support is also enabled: drivers/block/nvme-core.c: In function 'nvme_dif_remap': drivers/block/nvme-core.c:523:24: error: dereferencing

Re: [RFC] PCI: Change default MPS behavior

2016-12-07 Thread Keith Busch
On Tue, Dec 06, 2016 at 07:20:27PM -0500, Jon Mason wrote: > Not all systems have a BIOS or firmware to preconfigure the PCIE MPS > prior to Linux booting. Without any firmware to pre-setup the MPS, the > PCIE_BUS_DEFAULT will simply set everything to 0 (128b). This behavior > causes these

Re: [PATCH] nvme: create the correct number of queues

2016-12-07 Thread Keith Busch
On Wed, Dec 07, 2016 at 05:03:26PM -0500, Dan Streetman wrote: > Change nr_io_queues variable name to nr_queues, as it includes not > only the io queues but also the admin queue in its count; and change > the variable name in functions that it is passed into, for clarity. > > Also correct misuses

Re: [PATCH] nvme: use the correct msix vector for each queue

2016-12-07 Thread Keith Busch
On Wed, Dec 07, 2016 at 05:03:48PM -0500, Dan Streetman wrote: > Change each queue's cq_vector to match its qid, instead of qid - 1. > > The first queue is always the admin queue, and the remaining queues are > I/O queues. The interrupt vectors they use are all in the same array, > however, the

Re: [PATCH] nvme: use the correct msix vector for each queue

2016-12-07 Thread Keith Busch
On Wed, Dec 07, 2016 at 05:36:00PM -0500, Dan Streetman wrote: > On Wed, Dec 7, 2016 at 5:44 PM, Keith Busch wrote: > > pci_alloc_irq_vectors doesn't know you intend to make the first > > vector special, so it's going to come up with a CPU affinity from > > blk_mq_pci_ma

Re: [PATCH] nvme: use the correct msix vector for each queue

2016-12-07 Thread Keith Busch
On Wed, Dec 07, 2016 at 05:46:34PM -0500, Dan Streetman wrote: > > Is there a reason you want to share the interrupt between the queues? The admin queue is hardly ever used (if at all) compared to an IO queue's usage, so why waste the resource? I bet you can't measure a performance difference on

Re: [PATCH 3/3] pciehp: Fix race condition handling surprise link-down

2016-12-07 Thread Keith Busch
On Wed, Dec 07, 2016 at 05:40:54PM -0600, Bjorn Helgaas wrote: > On Sat, Nov 19, 2016 at 12:32:47AM -0800, Ashok Raj wrote: > > --- a/drivers/pci/hotplug/pciehp_ctrl.c > > +++ b/drivers/pci/hotplug/pciehp_ctrl.c > > @@ -182,6 +182,7 @@ static void pciehp_power_thread(struct work_struct > > *work)
