On Fri, Jul 28, 2017 at 10:31:43AM -0700, Matthew Wilcox wrote:
> On Fri, Jul 28, 2017 at 10:56:01AM -0600, Ross Zwisler wrote:
> > Dan Williams and Christoph Hellwig have recently expressed doubt about
> > whether the rw_page() interface made sense for synchronous memory drivers
> > [1][2].  It's unclear whether this interface has any performance benefit
> > for these drivers, but as we continue to fix bugs it is clear that it does
> > have a maintenance burden.  This series removes the rw_page()
> > implementations in brd, pmem and btt to relieve this burden.
> 
> Why don't you measure whether it has performance benefits?  I don't
> understand why zram would see performance benefits and not other drivers.
> If it's going to be removed, then the whole interface should be removed,
> not just have the implementations removed from some drivers.

Okay, I've run a bunch of performance tests with the PMEM and BTT entry
points for rw_page() in a swap workload, and in all cases I see an
improvement over the code with rw_page() removed.  Here are the results
from my random lab box:

  Average latency of swap_writepage()
+------+------------+---------+-------------+
|      | no rw_page | rw_page | Improvement |
+-------------------------------------------+
| PMEM |  5.0 us    |  4.7 us |     6%      |
+-------------------------------------------+
|  BTT |  6.8 us    |  6.1 us |    10%      |
+------+------------+---------+-------------+

  Average latency of swap_readpage()
+------+------------+---------+-------------+
|      | no rw_page | rw_page | Improvement |
+-------------------------------------------+
| PMEM |  3.3 us    |  2.9 us |    12%      |
+-------------------------------------------+
|  BTT |  3.7 us    |  3.4 us |     8%      |
+------+------------+---------+-------------+
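
The Improvement column is consistent with the relative reduction in average
latency versus the "no rw_page" baseline, rounded to a whole percent.  A
minimal sketch of that arithmetic, assuming that definition (the function and
variable names here are made up for illustration and are not part of the
original test setup):

```python
def improvement_pct(baseline_us, rw_page_us):
    """Relative latency reduction versus the no-rw_page baseline, in percent."""
    return (baseline_us - rw_page_us) / baseline_us * 100.0


# Averages copied from the two tables: (no rw_page, rw_page), in microseconds.
results = {
    ("swap_writepage", "PMEM"): (5.0, 4.7),
    ("swap_writepage", "BTT"): (6.8, 6.1),
    ("swap_readpage", "PMEM"): (3.3, 2.9),
    ("swap_readpage", "BTT"): (3.7, 3.4),
}

for (func, dev), (baseline, with_rw_page) in results.items():
    pct = improvement_pct(baseline, with_rw_page)
    print(f"{func:15s} {dev:4s} {pct:5.1f}%")
```

Because the table entries are themselves rounded to 0.1 us, the percentages
recomputed this way can only be expected to match the table to within about a
percentage point.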

The workload was pmbench, a memory benchmark, run on a system where I had
severely restricted the amount of memory in the system with the 'mem' kernel
command line parameter.  The benchmark was set up to test more memory than I
allowed the OS to have so it spilled over into swap.

The PMEM or BTT device was set up as my swap device, and during the test I got
a few hundred thousand samples of each of swap_writepage() and
swap_readpage().  The PMEM/BTT device was just memory reserved with the
'memmap' kernel command line parameter.

Thanks, Matthew, for asking for performance data.  It looks like removing this
code would have been a mistake.
