Re: [dm-devel] Significantly dropped dm-cache performance in 4.13 compared to 4.11

2017-12-15 Thread Stefan Ring
On Mon, Nov 13, 2017 at 8:01 PM, Mike Snitzer  wrote:
>
> But feel free to remove the cache for now.  Should be as simple as:
> lvconvert --uncache VG/CacheLV

I did a --splitcache yesterday and ran a scrub again, which completed
in 3h. That is more than the ~2h of the cached setup on 4.11, but
significantly less than the ~4.5h of the cached setup on 4.13. The
data on this volume has not changed much since last month, so the
scrub times should be very comparable, give or take a few minutes.
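
Presumably the split amounted to the following lvconvert call; the
VG/LV names are assumed from the cache_dump attempt further down in
this thread:

$ lvconvert --splitcache vg_zfs/lv_zfsdisk

Unlike --uncache, --splitcache keeps the cache pool around so it can
be re-attached later.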



Re: [dm-devel] Significantly dropped dm-cache performance in 4.13 compared to 4.11

2017-11-14 Thread Stefan Ring
On Tue, Nov 14, 2017 at 12:00 PM, Joe Thornber  wrote:
> I'm not sure what's going on here.  Would you mind sending me the
> metadata please?  Either a cache_dump of it, or a copy of the metadata
> dev?

Ok, I've copied the device to a file and run cache_dump on it:
https://www.dropbox.com/s/y7fu723oybuxbz0/cmeta.xml.xz?dl=0
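
The dump was produced along these lines; the device-mapper name of the
cache-pool metadata sub-LV is an assumption and will differ between
setups:

$ dd if=/dev/mapper/vg_zfs-cachepool_cmeta of=cmeta.bin bs=1M
$ cache_dump cmeta.bin > cmeta.xml
$ xz cmeta.xml

cache_dump is happy to read the copied image in place of the real
metadata device.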



Re: [dm-devel] Significantly dropped dm-cache performance in 4.13 compared to 4.11

2017-11-14 Thread Stefan Ring
On Tue, Nov 14, 2017 at 12:00 PM, Joe Thornber  wrote:
>
> I'm not sure what's going on here.  Would you mind sending me the
> metadata please?  Either a cache_dump of it, or a copy of the metadata
> dev?

I'd like to create a cache dump, but I'm not very experienced with
this stuff. I do:

$ cache_dump /dev/vg_zfs/lv_zfsdisk |less
syscall 'open' failed: Device or resource busy
Note: you cannot run this tool with these options on live metadata.

Which is what I feared/expected. However, after doing

$ lvchange -an vg_zfs/lv_zfsdisk

the device disappears, and I cannot run cache_dump on it either. So
how do I make it not live while keeping it visible?
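
One way to at least locate the device-mapper node that backs the cache
metadata while the cache LV is active is to inspect the dm tables
directly; the _cmeta suffix below is an assumption about the sub-LV
naming, not something verified on this setup:

$ dmsetup ls --tree
$ dmsetup table | grep ' cache '
$ ls -l /dev/mapper | grep cmeta

Copying or dumping live metadata can of course yield an inconsistent
snapshot, which is why the tool warns against it.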



Re: [dm-devel] Significantly dropped dm-cache performance in 4.13 compared to 4.11

2017-11-14 Thread Joe Thornber
On Mon, Nov 13, 2017 at 02:01:11PM -0500, Mike Snitzer wrote:
> On Mon, Nov 13 2017 at 12:31pm -0500,
> Stefan Ring  wrote:
> 
> > On Thu, Nov 9, 2017 at 4:15 PM, Stefan Ring  wrote:
> > > On Tue, Nov 7, 2017 at 3:41 PM, Joe Thornber  wrote:
> > >> On Fri, Nov 03, 2017 at 07:50:23PM +0100, Stefan Ring wrote:
> > >>> It strikes me as odd that the amount read from the spinning disk is
> > >>> actually more than what comes out of the combined device in the end.
> > >>
> > >> This suggests dm-cache is trying to promote way too much.
> > >> I'll try and reproduce the issue, your setup sounds pretty straightforward.
> > >
> > > I think it's actually the most straight-forward you can get ;).
> > >
> > > I've also tested kernel 4.12 in the meantime, which behaves just like
> > > 4.13. So the difference in behavior seems to have been introduced
> > > somewhere between 4.11 and 4.12.
> > >
> > > I've also done plain dd from the dm-cache disk to /dev/null a few
> > > times, which wrote enormous amounts of data to the SSD. My poor SSD
> > > has received the same amount of writes during the last week that it
> > > has had to endure during the entire previous year.
> > 
> > Do you think it would make a difference if I removed and recreated the cache?
> > 
> > I don't want to fry my SSD any longer. I've just copied several large
> > files into the dm-cached zfs dataset, and while reading them back
> > immediately afterwards, the SSD started writing crazy amounts again.
> > In my understanding, linear reads should rarely end up on the cache
> > device, but that is absolutely not what I'm experiencing.
> 
> Joe tried to reproduce your reported issue today and couldn't.

I'm not sure what's going on here.  Would you mind sending me the
metadata please?  Either a cache_dump of it, or a copy of the metadata
dev?

- Joe



Re: [dm-devel] Significantly dropped dm-cache performance in 4.13 compared to 4.11

2017-11-13 Thread Mike Snitzer
On Mon, Nov 13 2017 at 12:31pm -0500,
Stefan Ring  wrote:

> On Thu, Nov 9, 2017 at 4:15 PM, Stefan Ring  wrote:
> > On Tue, Nov 7, 2017 at 3:41 PM, Joe Thornber  wrote:
> >> On Fri, Nov 03, 2017 at 07:50:23PM +0100, Stefan Ring wrote:
> >>> It strikes me as odd that the amount read from the spinning disk is
> >>> actually more than what comes out of the combined device in the end.
> >>
> >> This suggests dm-cache is trying to promote way too much.
> >> I'll try and reproduce the issue, your setup sounds pretty straightforward.
> >
> > I think it's actually the most straight-forward you can get ;).
> >
> > I've also tested kernel 4.12 in the meantime, which behaves just like
> > 4.13. So the difference in behavior seems to have been introduced
> > somewhere between 4.11 and 4.12.
> >
> > I've also done plain dd from the dm-cache disk to /dev/null a few
> > times, which wrote enormous amounts of data to the SSD. My poor SSD
> > has received the same amount of writes during the last week that it
> > has had to endure during the entire previous year.
> 
> Do you think it would make a difference if I removed and recreated the cache?
> 
> I don't want to fry my SSD any longer. I've just copied several large
> files into the dm-cached zfs dataset, and while reading them back
> immediately afterwards, the SSD started writing crazy amounts again.
> In my understanding, linear reads should rarely end up on the cache
> device, but that is absolutely not what I'm experiencing.

Joe tried to reproduce your reported issue today and couldn't.

I think we need to better understand how you're triggering this
behaviour.  But we no longer have logic in place to detect sequential
IO and have it bypass the cache... that _could_ start to explain
things?  Earlier versions of dm-cache definitely did avoid promoting
sequential IO.
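
If the goal is to keep the cache but rein in the promotion churn, one
knob that may be worth experimenting with is dm-cache's
migration_threshold; whether it helps with this particular workload is
untested here, and the value is only illustrative (VG/CacheLV being a
placeholder):

$ lvchange --cachesettings 'migration_threshold=2048' VG/CacheLV

On stacks driven directly with dmsetup, migration_threshold is instead
passed as a key/value pair on the cache target line.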

But feel free to remove the cache for now.  Should be as simple as:
lvconvert --uncache VG/CacheLV



Re: [dm-devel] Significantly dropped dm-cache performance in 4.13 compared to 4.11

2017-11-13 Thread Stefan Ring
On Thu, Nov 9, 2017 at 4:15 PM, Stefan Ring  wrote:
> On Tue, Nov 7, 2017 at 3:41 PM, Joe Thornber  wrote:
>> On Fri, Nov 03, 2017 at 07:50:23PM +0100, Stefan Ring wrote:
>>> It strikes me as odd that the amount read from the spinning disk is
>>> actually more than what comes out of the combined device in the end.
>>
>> This suggests dm-cache is trying to promote way too much.
>> I'll try and reproduce the issue, your setup sounds pretty straightforward.
>
> I think it's actually the most straight-forward you can get ;).
>
> I've also tested kernel 4.12 in the meantime, which behaves just like
> 4.13. So the difference in behavior seems to have been introduced
> somewhere between 4.11 and 4.12.
>
> I've also done plain dd from the dm-cache disk to /dev/null a few
> times, which wrote enormous amounts of data to the SSD. My poor SSD
> has received the same amount of writes during the last week that it
> has had to endure during the entire previous year.

Do you think it would make a difference if I removed and recreated the cache?

I don't want to fry my SSD any longer. I've just copied several large
files into the dm-cached zfs dataset, and while reading them back
immediately afterwards, the SSD started writing crazy amounts again.
In my understanding, linear reads should rarely end up on the cache
device, but that is absolutely not what I'm experiencing.



Re: [dm-devel] Significantly dropped dm-cache performance in 4.13 compared to 4.11

2017-11-09 Thread Stefan Ring
On Tue, Nov 7, 2017 at 3:41 PM, Joe Thornber  wrote:
> On Fri, Nov 03, 2017 at 07:50:23PM +0100, Stefan Ring wrote:
>> It strikes me as odd that the amount read from the spinning disk is
>> actually more than what comes out of the combined device in the end.
>
> This suggests dm-cache is trying to promote way too much.
> I'll try and reproduce the issue, your setup sounds pretty straightforward.

I think it's actually the most straight-forward you can get ;).

I've also tested kernel 4.12 in the meantime, which behaves just like
4.13. So the difference in behavior seems to have been introduced
somewhere between 4.11 and 4.12.

I've also done plain dd from the dm-cache disk to /dev/null a few
times, which wrote enormous amounts of data to the SSD. My poor SSD
has received the same amount of writes during the last week that it
has had to endure during the entire previous year.
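
The read test was nothing more elaborate than this (the LV path is
assumed from a later message in the thread), with iostat running
alongside to watch the SSD's write column:

$ dd if=/dev/vg_zfs/lv_zfsdisk of=/dev/null bs=1M
$ iostat -dmx 3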



Re: [dm-devel] Significantly dropped dm-cache performance in 4.13 compared to 4.11

2017-11-07 Thread Joe Thornber
On Fri, Nov 03, 2017 at 07:50:23PM +0100, Stefan Ring wrote:
> It strikes me as odd that the amount read from the spinning disk is
> actually more than what comes out of the combined device in the end.

This suggests dm-cache is trying to promote way too much.
I'll try and reproduce the issue, your setup sounds pretty straightforward.

- Joe



[dm-devel] Significantly dropped dm-cache performance in 4.13 compared to 4.11

2017-11-03 Thread Stefan Ring
Having just upgraded from a 4.11 kernel to a 4.13 one, I see a
significantly higher scrub time for a ZFS on Linux (ZoL) pool that
lives on a dm-cache device consisting of an 800 GB partition on a
spinning 1 TB disk and one partition on an SSD (somewhere between 100
and 200 GB). A ZFS scrub reads everything stored in the pool from
start to finish, roughly in the order it was written. The data on the
pool is laid out mostly linearly, and the scrub used to achieve read
rates from the spinning disk in excess of 100 MB/sec. With the old
kernel, that is. These are the scrub times for both kernels:

4.11.5-300.fc26: 1h56m
4.13.9-200.fc26: 4h32m

Nothing changed between those two runs except for the booted kernel.
ZoL is version 0.7.3 in both cases. Originally I suspected ZoL 0.7.x,
which I upgraded from 0.6.5.11 at the same time as the kernel, to be
the culprit. However, I built and installed it for both kernel
versions from the exact same sources, and scrub times are comparable
to what they were before on my home system, which uses ZoL on four
spinning disks without an interposed dm-cache.
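
For context, a cache stack like the one described above is normally
assembled with the standard lvmcache commands, roughly as below; the
names and sizes are illustrative, not copied from the actual setup:

$ lvcreate --type cache-pool -L 150G -n cpool vg_zfs /dev/sdb1
$ lvconvert --type cache --cachepool vg_zfs/cpool vg_zfs/lv_zfsdisk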

Typical output of iostat -dmx 3 with kernel 4.13 while a scrub is
running; otherwise there is no I/O activity on the system:

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
sda       300.67    0.00  462.67    0.00   68.16    0.00   301.69     2.63   5.61    5.61    0.00   2.16  99.90
sdb         0.00  194.67    6.00   83.33    0.38   14.01   329.82     0.20   2.22    0.50    2.34   1.58  14.13
dm-0        0.00    0.00    6.00  221.33    0.38   13.83   128.01     0.54   2.38    0.50    2.43   0.29   6.63
dm-1        0.00    0.00    0.00   53.67    0.00    0.17     6.31     0.12   2.28    0.00    2.28   2.06  11.07
dm-2        0.00    0.00  763.33    0.00   68.16    0.00   182.86     8.05  10.49   10.49    0.00   1.31  99.93
dm-3        0.00    0.00  440.00    0.00   54.70    0.00   254.60     1.98   4.41    4.41    0.00   2.27 100.03

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
sda       468.00    1.00  519.67   20.00   82.39    0.24   313.60     2.93   5.38    5.49    2.50   1.83  98.63
sdb         0.00  356.00   18.67  109.33    1.00   25.80   428.73     0.15   1.20    1.20    1.20   1.04  13.33
dm-0        0.00    0.00   18.67  426.00    1.00   25.75   123.20     0.52   1.16    1.20    1.16   0.19   8.33
dm-1        0.00    0.00    0.00   39.67    0.00    0.13     6.66     0.06   1.52    0.00    1.52   1.43   5.67
dm-2        0.00    0.00  988.00   21.00   82.68    0.24   168.31     9.63   8.97    9.11    2.38   0.98  98.60
dm-3        0.00    0.00  485.00   19.33   57.84    0.24   235.88     2.14   4.29    4.41    1.41   1.98  99.87

dm-3 is the cached device that ZoL reads from; sda/dm-2 is the
spinning disk, and sdb/dm-0 is the cache SSD.

It strikes me as odd that the amount read from the spinning disk is
actually more than what comes out of the combined device in the end.
With the older kernel it is exactly the other way around, which makes
much more sense to me: the amount delivered by the combined device is
the sum of the reads from the two underlying devices.

Typical samples with kernel 4.11:

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
sda        87.67    0.00  618.33    0.00   62.53    0.00   207.12     1.58   2.56    2.56    0.00   1.36  84.37
sdb         0.67    0.00 1057.00    0.00   86.96    0.00   168.49     0.44   0.41    0.41    0.00   0.23  24.37
dm-0        0.00    0.00 1057.67    0.00   86.96    0.00   168.38     0.44   0.42    0.42    0.00   0.23  24.40
dm-1        0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
dm-2        0.00    0.00  706.00    0.00   62.56    0.00   181.48     1.74   2.46    2.46    0.00   1.19  84.33
dm-3        0.00    0.00 1488.33    0.00  149.52    0.00   205.74     1.97   1.32    1.32    0.00   0.67 100.00

Device:   rrqm/s  wrqm/s     r/s     w/s   rMB/s   wMB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
sda       165.33    0.00  747.33    0.00   91.42    0.00   250.52     1.70   2.27    2.27    0.00   1.14  85.37
sdb         0.00    0.00  746.33    0.00   64.54    0.00   177.09     0.36   0.49    0.49    0.00   0.23  17.00
dm-0        0.00    0.00  746.33    0.00   64.54    0.00   177.09     0.37   0.49    0.49    0.00   0.23  17.07
dm-1        0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
dm-2        0.00    0.00  912.67    0.00   91.39    0.00   205.07     2.02   2.21    2.21    0.00   0.94  85.37
dm-3        0.00