Dave Jiang <dave.ji...@intel.com> writes:

> Adding DMA support for pmem blk reads. This provides significant CPU
> reduction with large memory reads with good performance. DMAs are triggered
> by a test against bio_multiple_segments(), so small I/Os (4k or less?)
> are still performed by the CPU in order to reduce latency. By default
> the pmem driver will be using blk-mq with DMA.
>
> Numbers below are measured against pmem simulated via DRAM using
> memmap=NN!SS.  DMA engine used is the ioatdma on Intel Skylake Xeon
> platform.  Keep in mind the performance for actual persistent memory
> will differ.
> Fio 2.21 was used.
>
> 64k: 1 task queuedepth=1
> CPU Read:  7631 MB/s  99.7% CPU    DMA Read: 2415 MB/s  54% CPU
> CPU Write: 3552 MB/s  100% CPU     DMA Write 2173 MB/s  54% CPU
>
> 64k: 16 tasks queuedepth=16
> CPU Read: 36800 MB/s  1593% CPU    DMA Read:  29100 MB/s  607% CPU
> CPU Write 20900 MB/s  1589% CPU    DMA Write: 23400 MB/s  585% CPU
>
> 2M: 1 task queuedepth=1
> CPU Read:  6013 MB/s  99.3% CPU    DMA Read:  7986 MB/s  59.3% CPU
> CPU Write: 3579 MB/s  100% CPU     DMA Write: 5211 MB/s  58.3% CPU
>
> 2M: 16 tasks queuedepth=16
> CPU Read:  18100 MB/s 1588% CPU    DMA Read:  21300 MB/s 180.9% CPU
> CPU Write: 14100 MB/s 1594% CPU    DMA Write: 20400 MB/s 446.9% CPU
>
> Signed-off-by: Dave Jiang <dave.ji...@intel.com>
> ---

Hi Dave,

Setting CPU utilization aside for a moment, the table above shows a
throughput benefit for 2M transfers but a regression for 64k transfers.
Would it be beneficial to have a heuristic on the transfer size that
decides when to use DMA and when not? You introduced this hunk:

-   rc = pmem_handle_cmd(cmd);
+   if (cmd->chan && bio_multiple_segments(req->bio))
+      rc = pmem_handle_cmd_dma(cmd, op_is_write(req_op(req)));
+   else
+       rc = pmem_handle_cmd(cmd);

which uses DMA for bios with multiple segments and the old CPU path for
single-segment bios. Maybe the single/multi segment logic can be amended
to something like:

    if (cmd->chan && bio_segments(req->bio) > PMEM_DMA_THRESH)
       rc = pmem_handle_cmd_dma(cmd, op_is_write(req_op(req)));
    else
       rc = pmem_handle_cmd(cmd);

Just something worth considering IMHO.
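To make the cutoff idea concrete, here is a userspace sketch of the
decision (PMEM_DMA_THRESH and the 1 MiB value are made up for
illustration; the real code would compare bio_segments() or
blk_rq_payload_bytes() against a tuned threshold):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cutoff: below this the CPU copy wins (cf. the 64k
 * numbers above), at or above it DMA wins (cf. the 2M numbers).
 * The actual value would need per-platform tuning, e.g. via a
 * module parameter. */
#define PMEM_DMA_THRESH (1024UL * 1024UL)

/* Model of the dispatch decision: use DMA only when a channel is
 * available and the payload is large enough to amortize descriptor
 * setup and completion-interrupt overhead. */
bool pmem_use_dma(bool have_chan, size_t payload_bytes)
{
	return have_chan && payload_bytes >= PMEM_DMA_THRESH;
}
```

With the benchmark sizes above, a 64k request would stay on the CPU
path while a 2M request would go to the DMA engine.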

> +     len = blk_rq_payload_bytes(req);
> +     page = virt_to_page(pmem_addr);
> +     off = (u64)pmem_addr & ~PAGE_MASK;

off = offset_in_page(pmem_addr); ?
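For reference, offset_in_page() is essentially that masking expression
wrapped in a macro; a userspace re-creation (PAGE_SIZE hardcoded to
4 KiB here purely for illustration) shows the two forms agree:

```c
#include <stdint.h>

/* Userspace stand-ins for the kernel definitions (4 KiB pages assumed). */
#define PAGE_SIZE 4096UL
#define PAGE_MASK (~(PAGE_SIZE - 1))
#define offset_in_page(p) ((unsigned long)(p) & ~PAGE_MASK)

/* The open-coded form from the patch. */
unsigned long open_coded_off(uint64_t addr)
{
	return addr & ~PAGE_MASK;
}
```

Using the helper just makes the intent obvious and drops the cast.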

-- 
Johannes Thumshirn                                          Storage
jthumsh...@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850