Re: [PATCH v5 7/7] libnvdimm: Add blk-mq pmem driver
Hi Dave,

[auto build test ERROR on slave-dma/next]
[also build test ERROR on next-20170824]
[cannot apply to linus/master linux-nvdimm/libnvdimm-for-next v4.13-rc6]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:      https://github.com/0day-ci/linux/commits/Dave-Jiang/libnvdimm-add-DMA-support-for-pmem-blk-mq/20170824-164932
base:     https://git.kernel.org/pub/scm/linux/kernel/git/vkoul/slave-dma.git next
config:   i386-randconfig-c0-08250030 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386

All errors (new ones prefixed by >>):

   ERROR: "dma_request_chan" [sound/soc/snd-soc-core.ko] undefined!
   ERROR: "dma_request_slave_channel" [sound/soc/snd-soc-core.ko] undefined!
   ERROR: "dma_get_slave_caps" [sound/soc/snd-soc-core.ko] undefined!
   ERROR: "dma_release_channel" [sound/soc/snd-soc-core.ko] undefined!
>> ERROR: "__dma_request_channel" [sound/core/snd-pcm-dmaengine.ko] undefined!
>> ERROR: "dma_release_channel" [sound/core/snd-pcm-dmaengine.ko] undefined!
>> ERROR: "dma_find_channel" [drivers/nvdimm/nd_pmem_mq.ko] undefined!
>> ERROR: "dmaengine_unmap_put" [drivers/nvdimm/nd_pmem_mq.ko] undefined!
>> ERROR: "dmaengine_get_unmap_data" [drivers/nvdimm/nd_pmem_mq.ko] undefined!
>> ERROR: "dmaengine_get" [drivers/nvdimm/nd_pmem_mq.ko] undefined!
>> ERROR: "dmaengine_put" [drivers/nvdimm/nd_pmem_mq.ko] undefined!

---
0-DAY kernel test infrastructure            Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all               Intel Corporation

___
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
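Every symbol in the failure list above (dma_request_chan, dmaengine_get_unmap_data, and so on) is exported by the dmaengine core, which is only built when CONFIG_DMA_ENGINE is enabled. One plausible, unconfirmed reading of this randconfig failure is that consumer modules end up referencing those exports without a hard dependency on the dmaengine core being linked compatibly. The conventional Kconfig guard looks like the sketch below; this is illustrative only, not the fix that was actually applied for this report:

```kconfig
# Sketch only: a consumer of dmaengine client APIs declares a hard
# dependency, so it cannot be enabled while the dmaengine core is absent
# (or built in an incompatible way), avoiding undefined-symbol link errors.
config BLK_DEV_PMEM_MQ
	tristate "PMEM: Persistent memory block device multi-queue support"
	depends on DMA_ENGINE
```

Whether this resolves this exact randconfig would need verifying against the attached .config.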
Re: [PATCH v5 7/7] libnvdimm: Add blk-mq pmem driver
On Wed, Aug 23, 2017 at 12:56 PM, Dave Jiang wrote:
>
> On 08/23/2017 11:39 AM, Dan Williams wrote:
>> On Mon, Aug 21, 2017 at 2:11 PM, Dave Jiang wrote:
>>> Adding a DMA supported blk-mq driver for pmem.
>>
>> "Add support for offloading pmem block-device I/O operations to a DMA
>> engine."
>>
>>> This provides signficant CPU
>>
>> *significant
>>
>>> utilization reduction.
>>
>> "at the cost of some increased latency and bandwidth reduction in some
>> cases."
>>
>>> By default the pmem driver will be using blk-mq with
>>
>> "By default the current cpu-copy based pmem driver will load, but this
>> driver can be manually selected with a modprobe configuration."
>>
>>> DMA through the dmaengine API. DMA can be turned off with use_dma=0 kernel
>>> parameter.
>>
>> Do we need the module option? It seems for debug / testing a user can
>> simply unload the ioatdma driver, otherwise we should use dma by
>> default.
>>
>>> Additional kernel parameters are provided:
>>>
>>> queue_depth: The queue depth for blk-mq. Typically in relation to what the
>>>              DMA engine can provide per queue/channel. This needs to take
>>>              into account of num_sg as well for some DMA engines. i.e.
>>>              num_sg * queue_depth < total descriptors available per queue or
>>>              channel.
>>>
>>> q_per_node: Hardware queues per node. Typically the number of channels the
>>>             DMA engine can provide per socket.
>>> num_sg: Number of scatterlist we can handle per I/O request.
>>
>> Why do these need to be configurable?
>
> The concern is with other arch/platforms that have different DMA
> engines. The configurations would be platform dependent.

...but these are answers we should be able to get from dmaengine and
the specific DMA drivers in use. An end user has no chance of guessing
the right values.
Re: [PATCH v5 7/7] libnvdimm: Add blk-mq pmem driver
On 08/23/2017 11:39 AM, Dan Williams wrote:
> On Mon, Aug 21, 2017 at 2:11 PM, Dave Jiang wrote:
>> Adding a DMA supported blk-mq driver for pmem.
>
> "Add support for offloading pmem block-device I/O operations to a DMA engine."
>
>> This provides signficant CPU
>
> *significant
>
>> utilization reduction.
>
> "at the cost of some increased latency and bandwidth reduction in some cases."
>
>> By default the pmem driver will be using blk-mq with
>
> "By default the current cpu-copy based pmem driver will load, but this
> driver can be manually selected with a modprobe configuration."
>
>> DMA through the dmaengine API. DMA can be turned off with use_dma=0 kernel
>> parameter.
>
> Do we need the module option? It seems for debug / testing a user can
> simply unload the ioatdma driver, otherwise we should use dma by
> default.
>
>> Additional kernel parameters are provided:
>>
>> queue_depth: The queue depth for blk-mq. Typically in relation to what the
>>              DMA engine can provide per queue/channel. This needs to take
>>              into account of num_sg as well for some DMA engines. i.e.
>>              num_sg * queue_depth < total descriptors available per queue or
>>              channel.
>>
>> q_per_node: Hardware queues per node. Typically the number of channels the
>>             DMA engine can provide per socket.
>> num_sg: Number of scatterlist we can handle per I/O request.
>
> Why do these need to be configurable?

The concern is with other arch/platforms that have different DMA
engines. The configurations would be platform dependent.

>
>> Numbers below are measured against pmem simulated via DRAM using
>> memmap=NN!SS. DMA engine used is the ioatdma on Intel Skylake Xeon
>> platform. Keep in mind the performance for persistent memory
>> will differ.
>> Fio 2.21 was used.
>>
>> 64k: 1 task queuedepth=1
>> CPU Read:  7631 MB/s  99.7% CPU    DMA Read:  2415 MB/s  54% CPU
>> CPU Write: 3552 MB/s  100% CPU     DMA Write: 2173 MB/s  54% CPU
>>
>> 64k: 16 tasks queuedepth=16
>> CPU Read:  36800 MB/s 1593% CPU    DMA Read:  29100 MB/s 607% CPU
>> CPU Write: 20900 MB/s 1589% CPU    DMA Write: 23400 MB/s 585% CPU
>>
>> 2M: 1 task queuedepth=1
>> CPU Read:  6013 MB/s  99.3% CPU    DMA Read:  7986 MB/s  59.3% CPU
>> CPU Write: 3579 MB/s  100% CPU     DMA Write: 5211 MB/s  58.3% CPU
>>
>> 2M: 16 tasks queuedepth=16
>> CPU Read:  18100 MB/s 1588% CPU    DMA Read:  21300 MB/s 180.9% CPU
>> CPU Write: 14100 MB/s 1594% CPU    DMA Write: 20400 MB/s 446.9% CPU
>>
>> Signed-off-by: Dave Jiang
>> Reviewed-by: Ross Zwisler
>>
>> Signed-off-by: Dave Jiang
>> ---
>>  drivers/nvdimm/Kconfig   |  23 +
>>  drivers/nvdimm/Makefile  |   3
>>  drivers/nvdimm/pmem.h    |   3
>>  drivers/nvdimm/pmem_mq.c | 853 ++
>>  4 files changed, 882 insertions(+)
>>  create mode 100644 drivers/nvdimm/pmem_mq.c
>>
>> diff --git a/drivers/nvdimm/Kconfig b/drivers/nvdimm/Kconfig
>> index 5bdd499..c88c2bb 100644
>> --- a/drivers/nvdimm/Kconfig
>> +++ b/drivers/nvdimm/Kconfig
>> @@ -36,6 +36,29 @@ config BLK_DEV_PMEM
>>
>>           Say Y if you want to use an NVDIMM
>>
>> +config BLK_DEV_PMEM_MQ
>
> Lets call the driver pmem_dma instead of pmem_mq, the "mq" is just a
> side effect of enabling dma support.
>
>> +       tristate "PMEM: Persistent memory block device multi-queue support"
>> +       depends on m
>
> Not sure what the exact kconfig syntax should be, but the policy I'm
> looking for is that this driver should be allowed to be built-in if
> the pmem driver is disabled.
>
>> +       default LIBNVDIMM
>> +       select DAX
>> +       select ND_BTT if BTT
>> +       select ND_PFN if NVDIMM_PFN
>
> I think these should probably move to a common symbol used by both
> pmem and pmem_dma.
>
>> +       select DMA_ENGINE
>
> Shouldn't this be "depends on"?
>
>> +       help
>> +         Memory ranges for PMEM are described by either an NFIT
>> +         (NVDIMM Firmware Interface Table, see CONFIG_NFIT_ACPI), a
>> +         non-standard OEM-specific E820 memory type (type-12, see
>> +         CONFIG_X86_PMEM_LEGACY), or it is manually specified by the
>> +         'memmap=nn[KMG]!ss[KMG]' kernel command line (see
>> +         Documentation/admin-guide/kernel-parameters.rst). This driver
>> +         converts these persistent memory ranges into block devices that are
>> +         capable of DAX (direct-access) file system mappings. See
>> +         Documentation/nvdimm/nvdimm.txt for more details. This driver
>> +         utilizes block layer multi-queue in order to support using DMA
>> +         engines to help offload the data copying.
>
> Rather than duplicating some of the pmem text I think this help text
> should only talk about the incremental differences relative to the
> base pmem driver.
>
>> +
>> +         Say Y if you want to use an NVDIMM
>> +
>>  config ND_BLK
>>         tristate "BLK: Block data window (aperture) device support"
>>
Re: [PATCH v5 7/7] libnvdimm: Add blk-mq pmem driver
On Mon, Aug 21, 2017 at 2:11 PM, Dave Jiang wrote:
> Adding a DMA supported blk-mq driver for pmem.

"Add support for offloading pmem block-device I/O operations to a DMA
engine."

> This provides signficant CPU

*significant

> utilization reduction.

"at the cost of some increased latency and bandwidth reduction in some
cases."

> By default the pmem driver will be using blk-mq with

"By default the current cpu-copy based pmem driver will load, but this
driver can be manually selected with a modprobe configuration."

> DMA through the dmaengine API. DMA can be turned off with use_dma=0 kernel
> parameter.

Do we need the module option? It seems for debug / testing a user can
simply unload the ioatdma driver, otherwise we should use dma by
default.

> Additional kernel parameters are provided:
>
> queue_depth: The queue depth for blk-mq. Typically in relation to what the
>              DMA engine can provide per queue/channel. This needs to take
>              into account of num_sg as well for some DMA engines. i.e.
>              num_sg * queue_depth < total descriptors available per queue or
>              channel.
>
> q_per_node: Hardware queues per node. Typically the number of channels the
>             DMA engine can provide per socket.
> num_sg: Number of scatterlist we can handle per I/O request.

Why do these need to be configurable?

>
> Numbers below are measured against pmem simulated via DRAM using
> memmap=NN!SS. DMA engine used is the ioatdma on Intel Skylake Xeon
> platform. Keep in mind the performance for persistent memory
> will differ.
> Fio 2.21 was used.
>
> 64k: 1 task queuedepth=1
> CPU Read:  7631 MB/s  99.7% CPU    DMA Read:  2415 MB/s  54% CPU
> CPU Write: 3552 MB/s  100% CPU     DMA Write: 2173 MB/s  54% CPU
>
> 64k: 16 tasks queuedepth=16
> CPU Read:  36800 MB/s 1593% CPU    DMA Read:  29100 MB/s 607% CPU
> CPU Write: 20900 MB/s 1589% CPU    DMA Write: 23400 MB/s 585% CPU
>
> 2M: 1 task queuedepth=1
> CPU Read:  6013 MB/s  99.3% CPU    DMA Read:  7986 MB/s  59.3% CPU
> CPU Write: 3579 MB/s  100% CPU     DMA Write: 5211 MB/s  58.3% CPU
>
> 2M: 16 tasks queuedepth=16
> CPU Read:  18100 MB/s 1588% CPU    DMA Read:  21300 MB/s 180.9% CPU
> CPU Write: 14100 MB/s 1594% CPU    DMA Write: 20400 MB/s 446.9% CPU
>
> Signed-off-by: Dave Jiang
> Reviewed-by: Ross Zwisler
>
> Signed-off-by: Dave Jiang
> ---
>  drivers/nvdimm/Kconfig   |  23 +
>  drivers/nvdimm/Makefile  |   3
>  drivers/nvdimm/pmem.h    |   3
>  drivers/nvdimm/pmem_mq.c | 853 ++
>  4 files changed, 882 insertions(+)
>  create mode 100644 drivers/nvdimm/pmem_mq.c
>
> diff --git a/drivers/nvdimm/Kconfig b/drivers/nvdimm/Kconfig
> index 5bdd499..c88c2bb 100644
> --- a/drivers/nvdimm/Kconfig
> +++ b/drivers/nvdimm/Kconfig
> @@ -36,6 +36,29 @@ config BLK_DEV_PMEM
>
>           Say Y if you want to use an NVDIMM
>
> +config BLK_DEV_PMEM_MQ

Lets call the driver pmem_dma instead of pmem_mq, the "mq" is just a
side effect of enabling dma support.

> +       tristate "PMEM: Persistent memory block device multi-queue support"
> +       depends on m

Not sure what the exact kconfig syntax should be, but the policy I'm
looking for is that this driver should be allowed to be built-in if
the pmem driver is disabled.

> +       default LIBNVDIMM
> +       select DAX
> +       select ND_BTT if BTT
> +       select ND_PFN if NVDIMM_PFN

I think these should probably move to a common symbol used by both
pmem and pmem_dma.

> +       select DMA_ENGINE

Shouldn't this be "depends on"?

> +       help
> +         Memory ranges for PMEM are described by either an NFIT
> +         (NVDIMM Firmware Interface Table, see CONFIG_NFIT_ACPI), a
> +         non-standard OEM-specific E820 memory type (type-12, see
> +         CONFIG_X86_PMEM_LEGACY), or it is manually specified by the
> +         'memmap=nn[KMG]!ss[KMG]' kernel command line (see
> +         Documentation/admin-guide/kernel-parameters.rst). This driver
> +         converts these persistent memory ranges into block devices that are
> +         capable of DAX (direct-access) file system mappings. See
> +         Documentation/nvdimm/nvdimm.txt for more details. This driver
> +         utilizes block layer multi-queue in order to support using DMA
> +         engines to help offload the data copying.

Rather than duplicating some of the pmem text I think this help text
should only talk about the incremental differences relative to the
base pmem driver.

> +
> +         Say Y if you want to use an NVDIMM
> +
>  config ND_BLK
>         tristate "BLK: Block data window (aperture) device support"
>         default LIBNVDIMM
> diff --git a/drivers/nvdimm/Makefile b/drivers/nvdimm/Makefile
> index 909554c..8bfd107 100644
> --- a/drivers/nvdimm/Makefile
> +++ b/drivers/nvdimm/Makefile
> @@ -1,11 +1,14 @@
>  obj-$(CONFIG_LIBNVDIMM) += libnvdimm.o
>  obj-$(CONFIG_BLK_DEV_PMEM) += nd_pmem.o
> +obj-$(CONFIG_BLK_DEV_PMEM_MQ) += nd_pmem_mq.o
>
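Dan's three Kconfig suggestions (rename the driver to pmem_dma, allow it to be built-in when the base pmem driver is disabled, and make DMA_ENGINE a dependency rather than a select) could combine into something like the sketch below. This is illustrative only, written against my reading of the review, and is not the syntax that was ultimately merged:

```kconfig
config BLK_DEV_PMEM_DMA
	tristate "PMEM: Persistent memory block device with DMA offload"
	# hard dependency instead of "select DMA_ENGINE", per the review
	depends on DMA_ENGINE
	# only force modular when the cpu-copy pmem driver is also enabled;
	# built-in is allowed when BLK_DEV_PMEM is off
	depends on BLK_DEV_PMEM=n || m
	default LIBNVDIMM
	help
	  Offload pmem block-device I/O to a dmaengine channel instead of
	  cpu copies. See BLK_DEV_PMEM for the base driver description.
```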
[PATCH v5 7/7] libnvdimm: Add blk-mq pmem driver
Adding a DMA supported blk-mq driver for pmem. This provides signficant CPU
utilization reduction. By default the pmem driver will be using blk-mq with
DMA through the dmaengine API. DMA can be turned off with use_dma=0 kernel
parameter. Additional kernel parameters are provided:

queue_depth: The queue depth for blk-mq. Typically in relation to what the
             DMA engine can provide per queue/channel. This needs to take
             into account of num_sg as well for some DMA engines. i.e.
             num_sg * queue_depth < total descriptors available per queue or
             channel.
q_per_node: Hardware queues per node. Typically the number of channels the
            DMA engine can provide per socket.
num_sg: Number of scatterlist we can handle per I/O request.

Numbers below are measured against pmem simulated via DRAM using
memmap=NN!SS. DMA engine used is the ioatdma on Intel Skylake Xeon
platform. Keep in mind the performance for persistent memory
will differ.
Fio 2.21 was used.

64k: 1 task queuedepth=1
CPU Read:  7631 MB/s  99.7% CPU    DMA Read:  2415 MB/s  54% CPU
CPU Write: 3552 MB/s  100% CPU     DMA Write: 2173 MB/s  54% CPU

64k: 16 tasks queuedepth=16
CPU Read:  36800 MB/s 1593% CPU    DMA Read:  29100 MB/s 607% CPU
CPU Write: 20900 MB/s 1589% CPU    DMA Write: 23400 MB/s 585% CPU

2M: 1 task queuedepth=1
CPU Read:  6013 MB/s  99.3% CPU    DMA Read:  7986 MB/s  59.3% CPU
CPU Write: 3579 MB/s  100% CPU     DMA Write: 5211 MB/s  58.3% CPU

2M: 16 tasks queuedepth=16
CPU Read:  18100 MB/s 1588% CPU    DMA Read:  21300 MB/s 180.9% CPU
CPU Write: 14100 MB/s 1594% CPU    DMA Write: 20400 MB/s 446.9% CPU

Signed-off-by: Dave Jiang
Reviewed-by: Ross Zwisler

Signed-off-by: Dave Jiang
---
 drivers/nvdimm/Kconfig   |  23 +
 drivers/nvdimm/Makefile  |   3
 drivers/nvdimm/pmem.h    |   3
 drivers/nvdimm/pmem_mq.c | 853 ++
 4 files changed, 882 insertions(+)
 create mode 100644 drivers/nvdimm/pmem_mq.c

diff --git a/drivers/nvdimm/Kconfig b/drivers/nvdimm/Kconfig
index 5bdd499..c88c2bb 100644
--- a/drivers/nvdimm/Kconfig
+++ b/drivers/nvdimm/Kconfig
@@ -36,6 +36,29 @@ config BLK_DEV_PMEM

          Say Y if you want to use an NVDIMM

+config BLK_DEV_PMEM_MQ
+       tristate "PMEM: Persistent memory block device multi-queue support"
+       depends on m
+       default LIBNVDIMM
+       select DAX
+       select ND_BTT if BTT
+       select ND_PFN if NVDIMM_PFN
+       select DMA_ENGINE
+       help
+         Memory ranges for PMEM are described by either an NFIT
+         (NVDIMM Firmware Interface Table, see CONFIG_NFIT_ACPI), a
+         non-standard OEM-specific E820 memory type (type-12, see
+         CONFIG_X86_PMEM_LEGACY), or it is manually specified by the
+         'memmap=nn[KMG]!ss[KMG]' kernel command line (see
+         Documentation/admin-guide/kernel-parameters.rst). This driver
+         converts these persistent memory ranges into block devices that are
+         capable of DAX (direct-access) file system mappings. See
+         Documentation/nvdimm/nvdimm.txt for more details. This driver
+         utilizes block layer multi-queue in order to support using DMA
+         engines to help offload the data copying.
+
+         Say Y if you want to use an NVDIMM
+
 config ND_BLK
        tristate "BLK: Block data window (aperture) device support"
        default LIBNVDIMM
diff --git a/drivers/nvdimm/Makefile b/drivers/nvdimm/Makefile
index 909554c..8bfd107 100644
--- a/drivers/nvdimm/Makefile
+++ b/drivers/nvdimm/Makefile
@@ -1,11 +1,14 @@
 obj-$(CONFIG_LIBNVDIMM) += libnvdimm.o
 obj-$(CONFIG_BLK_DEV_PMEM) += nd_pmem.o
+obj-$(CONFIG_BLK_DEV_PMEM_MQ) += nd_pmem_mq.o
 obj-$(CONFIG_ND_BTT) += nd_btt.o
 obj-$(CONFIG_ND_BLK) += nd_blk.o
 obj-$(CONFIG_X86_PMEM_LEGACY) += nd_e820.o
 nd_pmem-y := pmem.o
+nd_pmem_mq-y := pmem_mq.o
+
 nd_btt-y := btt.o
 nd_blk-y := blk.o
diff --git a/drivers/nvdimm/pmem.h b/drivers/nvdimm/pmem.h
index 5434321..bbbe982 100644
--- a/drivers/nvdimm/pmem.h
+++ b/drivers/nvdimm/pmem.h
@@ -4,6 +4,7 @@
 #include
 #include
 #include
+#include

 #ifdef CONFIG_ARCH_HAS_PMEM_API
 #define ARCH_MEMREMAP_PMEM MEMREMAP_WB
@@ -35,6 +36,8 @@ struct pmem_device {
        struct badblocks        bb;
        struct dax_device       *dax_dev;
        struct gendisk          *disk;
+       struct blk_mq_tag_set   tag_set;
+       struct request_queue    *q;
 };

 long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
diff --git a/drivers/nvdimm/pmem_mq.c b/drivers/nvdimm/pmem_mq.c
new file mode 100644
index 000..db2fc2a
--- /dev/null
+++ b/drivers/nvdimm/pmem_mq.c
@@ -0,0 +1,853 @@
+/*
+ * Persistent Memory Block Multi-Queue Driver
+ * - This driver is largely adapted from Ross's pmem block driver.
+ * Copyright (c) 2014-2017, Intel Corporation.
+ * Copyright (c) 2015, Christoph Hellwig .
+ * Copyright (c) 2015, Boaz Harrosh .
+ *
+ * This program is free software; you can redistribute it and/