Re: Fedora 34 Change: Enable btrfs transparent zstd compression by default (System-Wide Change proposal)

2021-02-16 Thread Chris Murphy
On Tue, Feb 16, 2021 at 4:10 PM Jeremy Linton  wrote:

> On 2/14/21 2:20 PM, Chris Murphy wrote:

> > This isn't sufficiently qualified. It does work to reduce space
> > consumption and write amplification. It's just that there's a tradeoff
> > that you dislike, which is IO reduction. And it's completely
> > reasonable to have a subjective position on this tradeoff. But no
> > matter what there is a consequence to the choice.
>
> IO reduction in some cases (see below), for additional read latency and
> an increase in CPU utilization.
>
> For a desktop workload the former is likely a larger problem. But as we
> all know sluggishness is a hard thing to measure on a desktop. QD1
> pointer chasing on disk though is a good approximation, sometimes boot
> times are too.

What is your counter proposal?


> > A larger file might have a mix of compressed and non-compressed
> > extents, based on this "is it worth it" estimate. This is the
> > difference between the compress and compress-force options, where
> > force drops this estimator and depends on the compression algorithm to
> > do that work. I sometimes call that estimator the "early bailout"
> > check.
>
> Compression estimation is its own ugly ball of wax. But ignoring that
> for the moment, consider what happens if you have a bunch of 2G database
> files with a reasonable compression ratio. Let's assume for a moment the
> database attempts to update records in the middle of the files. What
> happens when the compression ratio gets slightly worse? (It's likely you
> already have nodatacow.)

What percentage of Fedora desktop users do you estimate have a bunch
of 2G database files?

I don't assume datacow or nodatacow for databases, because some
databases and their workloads do OK on COW filesystems and others
don't.

Also, nodatacow disables compression; i.e., files having the file
attribute 'C' (nodatacow) remain uncompressed even with the mount
option compress(-force).
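To verify that interaction, a rough sketch with standard tools (the path is purely illustrative, and this needs root on a btrfs filesystem; chattr/lsattr ship with e2fsprogs, compsize is its own package):

```shell
# Mark a directory nodatacow; the 'C' attribute only takes effect for
# files created after it is set, so set it before the database does.
chattr +C /var/lib/mysql
lsattr -d /var/lib/mysql      # 'C' should now appear in the flags

# Even with compress(-force) in the mount options, new files under a
# nodatacow directory stay uncompressed; compsize can confirm, since
# it should report only "none"-type extents for them.
compsize /var/lib/mysql
```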

> Although this becomes a case of
> seeing if the "compression estimation" logic is smart enough to detect
> that it's causing poor IO patterns (while still having a reasonably good
> compression ratio).

The "early bail" heuristic just tries to estimate whether the effort of
compression is worth it. If it is, the data extent is submitted for
compression; if it's not worth it, it isn't. The max extent size
for this is 128KiB. There's no IO pattern detection. Once the
compression has happened, the write allocator works the same as
without compression.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/btrfs/compression.c?h=v5.11#n1314
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/btrfs/compression.c?h=v5.11#n1609
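The heuristic linked above does sampling and entropy checks in the kernel; purely as an illustration of the "is it worth it" idea (not the actual btrfs algorithm), here is a sketch that compresses one extent-sized sample cheaply and bails if the savings are poor. gzip -1 is used as a stand-in for zstd:1 only because it is universally available.

```shell
# Crude stand-in for the early-bailout check: compress a 128KiB sample
# at a cheap level and call it worthwhile only if it saves space.
worth_compressing() {
    orig=$(wc -c < "$1")
    comp=$(gzip -1 -c "$1" | wc -c)
    # require at least ~12.5% savings before committing to compression
    [ "$comp" -le $((orig * 7 / 8)) ]
}

text=$(mktemp) && yes "hello btrfs" | head -c 131072 > "$text"
rand=$(mktemp) && head -c 131072 /dev/urandom > "$rand"

worth_compressing "$text" && echo "text sample: compress"
worth_compressing "$rand" || echo "random sample: early bailout"
rm -f "$text" "$rand"
```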



> In a past life, I spent a not inconsequential part of a decade
> engineering compressed ram+storage systems (similar to what has been
> getting merged to mainline over the past few years). It's really hard to
> make one that is performant across a wide range of workloads. What you
> get are areas where it can help, but if you average those cases with the
> ones where it hurts, the overwhelming analysis is you shouldn't be
> compressing unless you want the capacity. The worst part is that most
> synthetic file IO benchmarks tend to be on the "it helps" side of the
> equation and the real applications on the other.

This is why I tend to pooh-pooh benchmarks. They're useful for the
narrow purpose they're intended to measure. Synthetic benchmarks are
good at exposing problems, but won't tell you their significance, so
what they expose is the need for better testing. A database benchmark
will do a good job showing performance issues with workloads that act
like the database the benchmark is mimicking. Not all databases
have the same behavior.


> IMHO if fedora wanted to take a hit on the IO perf side, a much better
> place to focus would be flipping encryption on. The perf profile is
> flatter (aes-ni & the arm crypto extensions are common) with fewer evil
> edge cases. Or a more controlled method might to be picking a couple
> fairly atomic directories and enabling compression there (say /usr).

Workstation WG has been tracking these:
https://pagure.io/fedora-workstation/issue/136
https://pagure.io/fedora-workstation/issue/82

A significant impediment to ticking the "Encrypt my data" checkbox by
default in Automatic partitioning is the UI/UX. The current evaluation
centers on using systemd-homed to encrypt user data by default, and
optionally enabling system encryption with the key sealed in the TPM
or protected on something like a YubiKey. There's still some work to
do to get this integrated.

-- 
Chris Murphy
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam on the list, report it: 
https://pagure.io/fedora-infrastructure

Re: Fedora 34 Change: Enable btrfs transparent zstd compression by default (System-Wide Change proposal)

2021-02-16 Thread Jeremy Linton

Hi,

On 2/14/21 2:20 PM, Chris Murphy wrote:

On Sat, Feb 13, 2021 at 9:45 PM Jeremy Linton  wrote:


Hi,

On 2/11/21 11:05 PM, Chris Murphy wrote:

On Thu, Feb 11, 2021 at 9:58 AM Jeremy Linton  wrote:


Hi,

On 1/1/21 8:59 PM, Chris Murphy wrote:



Anyway, compress=zstd:1 is a good default. Everyone benefits, and I'm
not even sure someone with a very fast NVMe drive will notice a slow
down because the compression/decompression is threaded.


I disagree that everyone benefits. Any read latency sensitive workload
will be slower due to the application latency being both the drive
latency plus the decompression latency. And as the kernel benchmarks
indicate very few systems are going to get anywhere near the performance
of even baseline NVMe drives when it comes to throughput.


It's possible some workloads on NVMe might have faster reads or writes
without compression.

https://github.com/facebook/zstd

btrfs compress=zstd:1 translates into zstd -1 right now; there are
some ideas to remap btrfs zstd:1 to one of the newer zstd --fast
options, but it's just an idea. And in any case the default for btrfs
and zstd will remain as 3 and -3 respectively, which is what
'compress=zstd' maps to, making it identical to 'compress=zstd:3'.

I have a laptop with NVMe and haven't come across such a workload so
far, but this is obviously not a scientific sample. I think you'd need
a process that's producing read/write rates that the storage can meet,
but that the compression algorithm limits. Btrfs is threaded, as is
the compression.

What's typical is no change in performance, and sometimes a small
increase in performance. It essentially trades some CPU cycles
in exchange for less IO. That includes less time reading and writing,
but also less latency, meaning the gain on rotational media is
greater.


Worse, if the workload is very parallel, and at max CPU already
the compression overhead will only make that situation worse as well. (I
suspect you could test this just by building some packages that have
good parallelism during the build).


This is compiling the kernel on a 4/8-core CPU (circa 2011) using make
-j8, the kernel running is 5.11-rc7.

no compression

real    55m32.769s
user    369m32.823s
sys     35m59.948s

--

compress=zstd:1

real    53m44.543s
user    368m17.614s
sys     36m2.505s

That's a one time test, and it's a ~3% improvement. *shrug* We don't
really care too much these days about 1-3% differences when doing
encryption, so I think this is probably in that ballpark, even if it
turns out another compile is 3% slower. This is not a significantly
read or write centric workload, it's mostly CPU. So this 3% difference
may not even be related to the compression.


Did you drop caches/etc between runs?


Yes. And also did the test with uncompressed source files when
compiling without the compress mount option. And compressed source
files when compiling with the compress mount option. While it's
possible to mix those around (there's four combinations), I kept them
the same since those are the most common.




Because I git cloned mainline,
copied the fedora kernel config from /boot and on a fairly recent laptop
(12 threads) with a software encrypted NVMe. Dropped caches and did a
time make against a compressed directory and an uncompressed one with
both a semi constrained (4G) setup and 32G ram setup (compressed
swapping disabled, because the machine has an encrypted swap for
hibernation and crashdumps).

compressed:
real    22m40.129s
user    221m9.816s
sys     23m37.038s

uncompressed:
real    21m53.366s
user    221m56.714s
sys     23m39.988s

uncompressed 4G ram:
real    28m48.964s
user    288m47.569s
sys     30m43.957s

compressed 4G:
real    29m54.061s
user    281m7.120s
sys     29m50.613s



While the feature page doesn't claim it always increases performance,
it also doesn't say it can reduce performance. In the CPU intensive
workloads, it stands to reason there's going to be some competition.
The above results strongly suggest that's what's going on, even if I
couldn't reproduce it. But performance gain/loss isn't the only factor
for consideration.



and that is not an IO constrained workload; it's generally CPU
constrained, and since the caches are warm due to the software
encryption, the decompress times should be much faster than on machines
that aren't cache stashing.


I don't know, so I can't confirm or deny any of that.



The machine above can actually peg all 6 cores until it hits thermal
limits simply doing cp's with btrfs/zstd compression, all the while
losing about 800MB/sec of IO bandwidth relative to the raw disk.
Turning an IO bound problem into a CPU bound one isn't ideal.


It's a set of tradeoffs. And there isn't a governor that can assess
when an IO bound bottleneck becomes a CPU bound one.


> Compressed disks only work in the situation where the CPUs can
> compress/decompress faster than the disk, or the workload is managing to
> significantly reduce IO because the working set is streaming rather than
> random.

Re: Fedora 34 Change: Enable btrfs transparent zstd compression by default (System-Wide Change proposal)

2021-02-14 Thread Chris Murphy
On Sat, Feb 13, 2021 at 9:45 PM Jeremy Linton  wrote:
>
> Hi,
>
> On 2/11/21 11:05 PM, Chris Murphy wrote:
> > On Thu, Feb 11, 2021 at 9:58 AM Jeremy Linton  wrote:
> >>
> >> Hi,
> >>
> >> On 1/1/21 8:59 PM, Chris Murphy wrote:
> >
> >>> Anyway, compress=zstd:1 is a good default. Everyone benefits, and I'm
> >>> not even sure someone with a very fast NVMe drive will notice a slow
> >>> down because the compression/decompression is threaded.
> >>
> >> I disagree that everyone benefits. Any read latency sensitive workload
> >> will be slower due to the application latency being both the drive
> >> latency plus the decompression latency. And as the kernel benchmarks
> >> indicate very few systems are going to get anywhere near the performance
> > of even baseline NVMe drives when it comes to throughput.
> >
> > It's possible some workloads on NVMe might have faster reads or writes
> > without compression.
> >
> > https://github.com/facebook/zstd
> >
> > btrfs compress=zstd:1 translates into zstd -1 right now; there are
> > some ideas to remap btrfs zstd:1 to one of the newer zstd --fast
> > options, but it's just an idea. And in any case the default for btrfs
> > and zstd will remain as 3 and -3 respectively, which is what
> > 'compress=zstd' maps to, making it identical to 'compress=zstd:3'.
> >
> > I have a laptop with NVMe and haven't come across such a workload so
> > far, but this is obviously not a scientific sample. I think you'd need
> > a process that's producing read/write rates that the storage can meet,
> > but that the compression algorithm limits. Btrfs is threaded, as is
> > the compression.
> >
> > What's typical is no change in performance, and sometimes a small
> > increase in performance. It essentially trades some CPU cycles
> > in exchange for less IO. That includes less time reading and writing,
> > but also less latency, meaning the gain on rotational media is
> > greater.
> >
> >> Worse, if the workload is very parallel, and at max CPU already
> >> the compression overhead will only make that situation worse as well. (I
> >> suspect you could test this just by building some packages that have
> >> good parallelism during the build).
> >
> > This is compiling the kernel on a 4/8-core CPU (circa 2011) using make
> > -j8, the kernel running is 5.11-rc7.
> >
> > no compression
> >
> > real    55m32.769s
> > user    369m32.823s
> > sys     35m59.948s
> >
> > --
> >
> > compress=zstd:1
> >
> > real    53m44.543s
> > user    368m17.614s
> > sys     36m2.505s
> >
> > That's a one time test, and it's a ~3% improvement. *shrug* We don't
> > really care too much these days about 1-3% differences when doing
> > encryption, so I think this is probably in that ballpark, even if it
> > turns out another compile is 3% slower. This is not a significantly
> > read or write centric workload, it's mostly CPU. So this 3% difference
> > may not even be related to the compression.
>
> Did you drop caches/etc between runs?

Yes. And also did the test with uncompressed source files when
compiling without the compress mount option. And compressed source
files when compiling with the compress mount option. While it's
possible to mix those around (there's four combinations), I kept them
the same since those are the most common.
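Spelled out as commands, that cold-cache A/B methodology looks roughly like this (tree locations and job count are illustrative; dropping caches needs root):

```shell
# Time the same build twice with cold caches: once in a tree on a
# compress=zstd:1 mount, once in a tree on an uncompressed mount.
sync
echo 3 > /proc/sys/vm/drop_caches    # drop page cache, dentries, inodes
time make -C /srv/zstd1/linux -j8    # compressed source tree

sync
echo 3 > /proc/sys/vm/drop_caches
time make -C /srv/nocomp/linux -j8   # uncompressed source tree
```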



>Because I git cloned mainline,
> copied the fedora kernel config from /boot and on a fairly recent laptop
> (12 threads) with a software encrypted NVMe. Dropped caches and did a
> time make against a compressed directory and an uncompressed one with
> both a semi constrained (4G) setup and 32G ram setup (compressed
> swapping disabled, because the machine has an encrypted swap for
> hibernation and crashdumps).
>
> compressed:
> real    22m40.129s
> user    221m9.816s
> sys     23m37.038s
>
> uncompressed:
> real    21m53.366s
> user    221m56.714s
> sys     23m39.988s
>
> uncompressed 4G ram:
> real    28m48.964s
> user    288m47.569s
> sys     30m43.957s
>
> compressed 4G:
> real    29m54.061s
> user    281m7.120s
> sys     29m50.613s
>

While the feature page doesn't claim it always increases performance,
it also doesn't say it can reduce performance. In the CPU intensive
workloads, it stands to reason there's going to be some competition.
The above results strongly suggest that's what's going on, even if I
couldn't reproduce it. But performance gain/loss isn't the only factor
for consideration.


> and that is not an IO constrained workload; it's generally CPU
> constrained, and since the caches are warm due to the software
> encryption, the decompress times should be much faster than on machines
> that aren't cache stashing.

I don't know, so I can't confirm or deny any of that.


> The machine above can actually peg all 6 cores until it hits thermal
> limits simply doing cp's with btrfs/zstd compression, all the while
> losing about 800MB/sec of IO bandwidth relative to the raw disk.
> Turning an IO bound problem into a CPU bound one isn't ideal.

It's a set of tradeoffs. And there isn't a governor that can assess
when an IO bound bottleneck becomes a CPU bound one.

Re: Fedora 34 Change: Enable btrfs transparent zstd compression by default (System-Wide Change proposal)

2021-02-14 Thread Chris Murphy
On Thu, Feb 11, 2021 at 11:48 PM Tom Seewald  wrote:
>
> > A few more things:
> >
> > *  btrfs-progs tools don't yet have a way to report  compression
> > information. While 'df' continues to report correctly about actual
> > blocks used and free, both regular 'du' (coreutils) and 'btrfs
> > filesystem du' will report uncompressed values.
>
> Are there plans for upstream to address this pretty major shortcoming in the 
> next release of btrfs-progs? From what I can see on the btrfs wiki, the user 
> space support for compression is very rudimentary, with no real indication 
> that it is being worked on or seen as a priority.

I know there is an intent to incorporate it, but I don't know the time
frame. It probably belongs in 'btrfs filesystem du' but I'm not
certain. Since F2FS is also doing compression, it might make sense to
enhance df or du (both are coreutils) instead.

There is a tool called compsize that will be included in default
installations and will provide statistical information. Speaking for
myself, it's been something of a short-term novelty whose use tapers
off over time. The statistics satisfy curiosity, but I've found the
curiosity wanes because it doesn't affect any decision making,
unlike df, du, and ls. The behavior of du and ls is unchanged,
whereas the behavior of df is that the rate of free space consumption
is always the same or less compared to uncompressed. There isn't a
mechanism for a program to consider 100G of free space as 200G by
accounting for compression. It's still just 100G, and compression
maybe gets 100G of binaries down to a 50-70G footprint. For text
files the savings can be quite a lot, and for multimedia files you're
not going to see any compression.
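For the curious, gathering those statistics looks something like this (paths illustrative; compsize needs root on a btrfs filesystem):

```shell
# compsize walks the extent tree and reports on-disk vs. uncompressed
# sizes per algorithm; df and du stay unaware of the compressed sizes.
compsize /usr

# df reports actual blocks used and free, while du reports uncompressed
# file sizes, so the two legitimately disagree under compression.
df -h /
du -sh /usr
```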

-- 
Chris Murphy


Re: Fedora 34 Change: Enable btrfs transparent zstd compression by default (System-Wide Change proposal)

2021-02-13 Thread Jeremy Linton

Hi,

On 2/11/21 11:05 PM, Chris Murphy wrote:

On Thu, Feb 11, 2021 at 9:58 AM Jeremy Linton  wrote:


Hi,

On 1/1/21 8:59 PM, Chris Murphy wrote:



Anyway, compress=zstd:1 is a good default. Everyone benefits, and I'm
not even sure someone with a very fast NVMe drive will notice a slow
down because the compression/decompression is threaded.


I disagree that everyone benefits. Any read latency sensitive workload
will be slower due to the application latency being both the drive
latency plus the decompression latency. And as the kernel benchmarks
indicate very few systems are going to get anywhere near the performance
of even baseline NVMe drives when it comes to throughput.


It's possible some workloads on NVMe might have faster reads or writes
without compression.

https://github.com/facebook/zstd

btrfs compress=zstd:1 translates into zstd -1 right now; there are
some ideas to remap btrfs zstd:1 to one of the newer zstd --fast
options, but it's just an idea. And in any case the default for btrfs
and zstd will remain as 3 and -3 respectively, which is what
'compress=zstd' maps to, making it identical to 'compress=zstd:3'.

I have a laptop with NVMe and haven't come across such a workload so
far, but this is obviously not a scientific sample. I think you'd need
a process that's producing read/write rates that the storage can meet,
but that the compression algorithm limits. Btrfs is threaded, as is
the compression.

What's typical is no change in performance, and sometimes a small
increase in performance. It essentially trades some CPU cycles
in exchange for less IO. That includes less time reading and writing,
but also less latency, meaning the gain on rotational media is
greater.


Worse, if the workload is very parallel, and at max CPU already
the compression overhead will only make that situation worse as well. (I
suspect you could test this just by building some packages that have
good parallelism during the build).


This is compiling the kernel on a 4/8-core CPU (circa 2011) using make
-j8, the kernel running is 5.11-rc7.

no compression

real    55m32.769s
user    369m32.823s
sys     35m59.948s

--

compress=zstd:1

real    53m44.543s
user    368m17.614s
sys     36m2.505s

That's a one time test, and it's a ~3% improvement. *shrug* We don't
really care too much these days about 1-3% differences when doing
encryption, so I think this is probably in that ballpark, even if it
turns out another compile is 3% slower. This is not a significantly
read or write centric workload, it's mostly CPU. So this 3% difference
may not even be related to the compression.


Did you drop caches/etc between runs? Because I git cloned mainline, 
copied the fedora kernel config from /boot and on a fairly recent laptop 
(12 threads) with a software encrypted NVMe. Dropped caches and did a 
time make against a compressed directory and an uncompressed one with 
both a semi constrained (4G) setup and 32G ram setup (compressed 
swapping disabled, because the machine has an encrypted swap for 
hibernation and crashdumps).


compressed:
real    22m40.129s
user    221m9.816s
sys     23m37.038s

uncompressed:
real    21m53.366s
user    221m56.714s
sys     23m39.988s

uncompressed 4G ram:
real    28m48.964s
user    288m47.569s
sys     30m43.957s

compressed 4G:
real    29m54.061s
user    281m7.120s
sys     29m50.613s

and that is not an IO constrained workload; it's generally CPU
constrained, and since the caches are warm due to the software
encryption, the decompress times should be much faster than on machines
that aren't cache stashing.


The machine above can actually peg all 6 cores until it hits thermal
limits simply doing cp's with btrfs/zstd compression, all the while
losing about 800MB/sec of IO bandwidth relative to the raw disk.
Turning an IO bound problem into a CPU bound one isn't ideal.


Compressed disks only work in situations where the CPUs can
compress/decompress faster than the disk, or where the workload manages
to significantly reduce IO because the working set is streaming rather
than random. Any workload which has a random read component and is
tending closer to page-sized read/writes is going to get hurt, and god
help you if it's an RMW cycle. Similarly for parallelized compression,
which is only scalable if the IO sizes are large enough that it's worth
the IPI overhead of bringing additional cores online and the resulting
chunks are still large enough to be dealt with individually.








Plus, the write amplification comment isn't even universal as there
continue to be controllers where the flash translation layer is
compressing the data.


At least consumer SSDs tend to just do concurrent write dedup. File
system compression isn't limited to Btrfs; there's also F2FS,
contributed by Samsung, which implements compression these days as
well, although they commit to it at mkfs time, whereas on Btrfs it's
a mount option. Mixing and matching compressed extents is routine on
Btrfs anyway, so there's no concern with users mixing things up. They
can change the compression level and even the algorithm with impunity,
just tacking it onto a remount command. It's not even necessary to
reboot.

Re: Fedora 34 Change: Enable btrfs transparent zstd compression by default (System-Wide Change proposal)

2021-02-11 Thread Tom Seewald
> A few more things:
> 
> *  btrfs-progs tools don't yet have a way to report  compression
> information. While 'df' continues to report correctly about actual
> blocks used and free, both regular 'du' (coreutils) and 'btrfs
> filesystem du' will report uncompressed values.

Are there plans for upstream to address this pretty major shortcoming in the 
next release of btrfs-progs? From what I can see on the btrfs wiki the user 
space support for compression is very rudimentary and with no real indication 
that it is being worked on or seen as a priority.


Re: Fedora 34 Change: Enable btrfs transparent zstd compression by default (System-Wide Change proposal)

2021-02-11 Thread Chris Murphy
On Thu, Feb 11, 2021 at 9:58 AM Jeremy Linton  wrote:
>
> Hi,
>
> On 1/1/21 8:59 PM, Chris Murphy wrote:

> > Anyway, compress=zstd:1 is a good default. Everyone benefits, and I'm
> > not even sure someone with a very fast NVMe drive will notice a slow
> > down because the compression/decompression is threaded.
>
> I disagree that everyone benefits. Any read latency sensitive workload
> will be slower due to the application latency being both the drive
> latency plus the decompression latency. And as the kernel benchmarks
> indicate very few systems are going to get anywhere near the performance
> of even baseline NVMe drives when it comes to throughput.

It's possible some workloads on NVMe might have faster reads or writes
without compression.

https://github.com/facebook/zstd

btrfs compress=zstd:1 translates into zstd -1 right now; there are
some ideas to remap btrfs zstd:1 to one of the newer zstd --fast
options, but it's just an idea. And in any case the default for btrfs
and zstd will remain as 3 and -3 respectively, which is what
'compress=zstd' maps to, making it identical to 'compress=zstd:3'.

I have a laptop with NVMe and haven't come across such a workload so
far, but this is obviously not a scientific sample. I think you'd need
a process that's producing read/write rates that the storage can meet,
but that the compression algorithm limits. Btrfs is threaded, as is
the compression.

What's typical is no change in performance, and sometimes a small
increase in performance. It essentially trades some CPU cycles
in exchange for less IO. That includes less time reading and writing,
but also less latency, meaning the gain on rotational media is
greater.

>Worse, if the workload is very parallel, and at max CPU already
> the compression overhead will only make that situation worse as well. (I
> suspect you could test this just by building some packages that have
> good parallelism during the build).

This is compiling the kernel on a 4/8-core CPU (circa 2011) using make
-j8, the kernel running is 5.11-rc7.

no compression

real    55m32.769s
user    369m32.823s
sys     35m59.948s

--

compress=zstd:1

real    53m44.543s
user    368m17.614s
sys     36m2.505s

That's a one time test, and it's a ~3% improvement. *shrug* We don't
really care too much these days about 1-3% differences when doing
encryption, so I think this is probably in that ballpark, even if it
turns out another compile is 3% slower. This is not a significantly
read or write centric workload, it's mostly CPU. So this 3% difference
may not even be related to the compression.


> Plus, the write amplification comment isn't even universal as there
> continue to be controllers where the flash translation layer is
> compressing the data.

At least consumer SSDs tend to just do concurrent write dedup. File
system compression isn't limited to Btrfs; there's also F2FS,
contributed by Samsung, which implements compression these days as
well, although they commit to it at mkfs time, whereas on Btrfs it's
a mount option. Mixing and matching compressed extents is routine on
Btrfs anyway, so there's no concern with users mixing things up. They
can change the compression level and even the algorithm with impunity,
just tacking it onto a remount command. It's not even necessary to
reboot.


> OTOH, it makes a lot more sense on a lot of these arm/sbc boards
> utilizing MMC because the disks are so slow. Maybe if something like
> this were made the default the machine should run a quick CPU
> compress/decompress vs IO speed test and only enable compression if the
> compress/decompress speed is at least the IO rate.

It's not that simple, because neither the user space writers nor the
kworkers are single threaded. You'd need a particularly fast NVMe
matched with a not-so-fast CPU, with a workload that somehow dumps a
lot of data in a way that the compression acts as a bottleneck.

It could exist. But it's not a problem I've seen per se. But if
you propose a test, I can do A/B testing.
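For what it's worth, the single-threaded version of the governor Jeremy sketched would look something like the following, with gzip -1 standing in for zstd:1 and an assumed 3GB/s NVMe burst rate taken from earlier in the thread; the threading point above is exactly why such a naive check falls short:

```shell
# Naive, single-threaded compression-vs-disk throughput check.
disk_mb_s=3000                        # assumed NVMe burst rate (MB/s)
start=$(date +%s%N)                   # GNU date, nanoseconds
head -c $((64 * 1024 * 1024)) /dev/zero | gzip -1 > /dev/null
end=$(date +%s%N)
mb_s=$(( 64 * 1000000000 / (end - start) ))
echo "compression: ${mb_s} MB/s, assumed disk: ${disk_mb_s} MB/s"
if [ "$mb_s" -ge "$disk_mb_s" ]; then
    echo "enable compression"
else
    echo "leave compression off"
fi
```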


--
Chris Murphy


Re: Fedora 34 Change: Enable btrfs transparent zstd compression by default (System-Wide Change proposal)

2021-02-11 Thread Jeremy Linton

Hi,

On 1/1/21 8:59 PM, Chris Murphy wrote:

On Fri, Jan 1, 2021 at 11:31 AM Artem Tim  wrote:


It's faster. Here is some benchmark with different zstd compression ratios 
https://lkml.org/lkml/2019/1/28/1930. Could be outdated a little bit though.

But for HDD it makes sense to increase it probably. And IIRC Chris wrote about 
such plans.


There are ideas but it's difficult because the kernel doesn't expose
the information we really need to make an automatic determination.
sysfs commonly misreports rotational devices as being non-rotational
and vice versa. Since this is based on the device self-reporting, it's
not great.

I use zstd:1 for SSD/NVMe. And zstd:3 (which is the same as not
specifying a level) for HDD/USB sticks/eMMC/SD Card. For the more
archive style of backup, I use zstd:7. But these can all be mixed and
matched, Btrfs doesn't care. You can even mix and match algorithms.

Anyway, compress=zstd:1 is a good default. Everyone benefits, and I'm
not even sure someone with a very fast NVMe drive will notice a slow
down because the compression/decompression is threaded.


I disagree that everyone benefits. Any read latency sensitive workload
will be slower due to the application latency being both the drive
latency plus the decompression latency. And as the kernel benchmarks
indicate, very few systems are going to get anywhere near the
performance of even baseline NVMe drives when it comes to throughput.
With PCIe Gen4 controllers the burst speeds are even higher (>3GB/sec
read & write). Worse, if the workload is very parallel and at max CPU
already, the compression overhead will only make that situation worse.
(I suspect you could test this just by building some packages that have
good parallelism during the build.)


So you're penalizing a large majority of machines built in the past
couple of years.


Plus, the write amplification comment isn't even universal as there 
continue to be controllers where the flash translation layer is 
compressing the data.


OTOH, it makes a lot more sense on a lot of these ARM/SBC boards
utilizing MMC, because the disks are so slow. Maybe if something like
this were made the default, the machine should run a quick CPU
compress/decompress vs IO speed test and only enable compression if the
compress/decompress speed is at least the IO rate.





I expect if we get the "fast" levels (the negative value levels) new
to zstd in the kernel, that Btrfs will likely remap its level 1 to one
of the negative levels, and keep level 3 set to zstd 3 (the default).
So we might actually see it get even faster at the cost of some
compression ratio. Given this possibility, I think level 1 is the best
choice as a default for Fedora.






Re: Fedora 34 Change: Enable btrfs transparent zstd compression by default (System-Wide Change proposal)

2021-01-01 Thread Chris Murphy
On Fri, Jan 1, 2021 at 7:59 PM Chris Murphy  wrote:
>Given this possibility, I think level 1 is the best
> choice as a default for Fedora.

^ for the fstab mount option way of doing this for the entire file system.

If one day there's 'btrfs property' support for levels, it's easy to
imagine doing something like zstd:5 for /usr and /var/lib/flatpak,
because the limiting factor is not write performance but download
bandwidth. Since there's effectively a wait for the download (slow no
matter what from the CPU's perspective), why not compress more?


--
Chris Murphy


Re: Fedora 34 Change: Enable btrfs transparent zstd compression by default (System-Wide Change proposal)

2021-01-01 Thread Chris Murphy
On Fri, Jan 1, 2021 at 11:31 AM Artem Tim  wrote:
>
> It's faster. Here is some benchmark with different zstd compression ratios 
> https://lkml.org/lkml/2019/1/28/1930. Could be outdated a little bit though.
>
> But for HDD it makes sense to increase it probably. And IIRC Chris wrote 
> about such plans.

There are ideas but it's difficult because the kernel doesn't expose
the information we really need to make an automatic determination.
sysfs commonly misreports rotational devices as being non-rotational
and vice versa. Since this is based on the device self-reporting, it's
not great.

I use zstd:1 for SSD/NVMe. And zstd:3 (which is the same as not
specifying a level) for HDD/USB sticks/eMMC/SD Card. For the more
archive style of backup, I use zstd:7. But these can all be mixed and
matched, Btrfs doesn't care. You can even mix and match algorithms.

Anyway, compress=zstd:1 is a good default. Everyone benefits, and I'm
not even sure someone with a very fast NVMe drive will notice a
slowdown, because the compression/decompression is threaded.

I expect if we get the "fast" levels (the negative value levels) new
to zstd in the kernel, that Btrfs will likely remap its level 1 to one
of the negative levels, and keep level 3 set to zstd 3 (the default).
So we might actually see it get even faster at the cost of some
compression ratio. Given this possibility, I think level 1 is the best
choice as a default for Fedora.



-- 
Chris Murphy


Re: Fedora 34 Change: Enable btrfs transparent zstd compression by default (System-Wide Change proposal)

2021-01-01 Thread Chris Murphy
A few more things:

*  btrfs-progs tools don't yet have a way to report compression
information. While 'df' continues to correctly report actual blocks
used and free, both regular 'du' (coreutils) and 'btrfs
filesystem du' will report uncompressed values.

*  'compsize' will report compression information and has been in the
Fedora repo for a while. But it requires privilege.

*  'filefrag' misreports fragmentation; it always over-reports fragments.
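For comparison, the reporting differences above can be seen side by side. A rough sketch (compsize is only attempted when it's installed, and on a real btrfs filesystem it also needs root):

```shell
#!/bin/sh
# Compare apparent vs. actual usage reporting for a path.
report_sizes() {
  path="$1"
  du -sh "$path"            # coreutils du: apparent (uncompressed) size
  df -h "$path" | tail -1   # df: accurate used/free at the block level
  if command -v compsize >/dev/null 2>&1; then
    compsize "$path"        # actual on-disk size and ratio (needs root)
  else
    echo "compsize not installed"
  fi
}

report_sizes /etc
```

On a compressed btrfs filesystem, du and compsize will visibly disagree; on anything else they'll roughly agree.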

--
Chris Murphy


Re: Fedora 34 Change: Enable btrfs transparent zstd compression by default (System-Wide Change proposal)

2021-01-01 Thread Artem Tim
It's faster. Here is some benchmark with different zstd compression ratios 
https://lkml.org/lkml/2019/1/28/1930. Could be outdated a little bit though.

But for HDD it makes sense to increase it probably. And IIRC Chris wrote about 
such plans. 


Re: Fedora 34 Change: Enable btrfs transparent zstd compression by default (System-Wide Change proposal)

2021-01-01 Thread Jonathan Underwood
On Wed, 30 Dec 2020 at 19:53, Ben Cotton  wrote:
> ** Update anaconda to perform the installation using mount -o
> compress=zstd:1
>

Any reason behind compression level of 1 rather than the default of 3?


Re: Fedora 34 Change: Enable btrfs transparent zstd compression by default (System-Wide Change proposal)

2020-12-31 Thread Chris Murphy
On Wed, Dec 30, 2020 at 12:53 PM Ben Cotton  wrote:
>
> ** Update anaconda to perform the installation using mount -o
> compress=zstd:1
> ** Set the proper option in fstab (alternatively: set the XATTR)

I think the most discoverable is using 'compress=zstd:1' as the mount
option, and anyone who wants to opt out would remove this. Upon
removal, the system will become uncompressed basically by attrition,
as files are replaced.

The mount option method is per file system. Since we have 'subvol'
mount options to mount '/' and '/home', it seems plausible that
compression is a per-subvolume option, but it's not (see below). It's
file system wide.

The per subvolume, per directory, per file method of compression has
some pretty esoteric nuances:

- the chattr +c method uses the default compression algorithm, currently zlib
- the btrfs property method can't be unset
https://github.com/kdave/btrfs-progs/issues/308
- btrfs property compression 'none' is not the same as unsetting it,
and it inherits just like any other xattr; 'none' means the mount
option "compress" does not apply, while "compress-force" will still
compress files set to 'none'
- the btrfs property method isn't recursive
https://github.com/kdave/btrfs-progs/issues/278
- both methods stop at subvolume boundaries; i.e. if you set
compression on a subvolume or directory, it's inherited as you add new
directories or files, but stops at a subvolume boundary
- compression flags survive btrfs send/receive - this is particularly
confusing because it can make it a bit difficult to have a copy
without compression, and it's not immediately obvious that the flag
continues to tag along

This might best be turned into a flowchart :P
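Short of a flowchart, the per-path mechanics can at least be played with safely. A sketch (the mount point is hypothetical, and the btrfs commands only run when it actually exists):

```shell
#!/bin/sh
# Demonstrates the two per-path compression mechanisms; guarded so it
# is a no-op unless pointed at a real btrfs mount.
demo_per_path_compression() {
  target="$1"
  if ! command -v btrfs >/dev/null 2>&1 || [ ! -d "$target" ]; then
    echo "skipped: $target is not available"
    return 0
  fi
  mkdir -p "$target/newdir"
  # chattr +c requests compression with the default algorithm (zlib today):
  chattr +c "$target/newdir"
  # btrfs property selects the algorithm -- but remember it can't be
  # unset, doesn't recurse, and 'none' still gets compressed under
  # compress-force:
  btrfs property set "$target/newdir" compression zstd
  btrfs property get "$target/newdir" compression
}

demo_per_path_compression /mnt/btrfs-test  # hypothetical mount point
```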



> This Change only applies to newly installed systems. Existing systems
> on upgrade will be unaffected, but can be converted manually with
> btrfs filesystem defrag -czstd -r, updating `/etc/fstab`
> and remounting.

Note that defragmenting to compress is optional. You can just add the
mount option to fstab and remount; only new files will be compressed,
but again by attrition, eventually most of the file system will end up
compressed.
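Concretely, the conversion might look like this. A sketch where the fstab edit is written as a function so it can be previewed against a copy first (the sed pattern and device names are illustrative, not a one-size-fits-all recipe):

```shell
#!/bin/sh
# Print an edited copy of an fstab, adding compress=zstd:1 to btrfs
# entries that don't already have a compress option. Does not modify
# the input file.
add_compress_option() {
  sed -E '/btrfs/{/compress/!s/(subvol=[^[:space:],]+)/\1,compress=zstd:1/;}' "$1"
}

# Preview first, then apply by hand when the output looks right:
#   add_compress_option /etc/fstab
#
# After editing the real file: reload the units systemd generates from
# fstab, remount, and optionally rewrite existing files compressed:
#   systemctl daemon-reload
#   mount -o remount /
#   btrfs filesystem defrag -czstd -r /
```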




-- 
Chris Murphy


Re: Fedora 34 Change: Enable btrfs transparent zstd compression by default (System-Wide Change proposal)

2020-12-31 Thread Luya Tshimbalanga


On 2020-12-30 1:48 p.m., Michel Alexandre Salim wrote:
> On Wed, 2020-12-30 at 16:28 -0500, Elliott Sales de Andrade wrote:
>> On Wed, 30 Dec 2020 at 14:53, Ben Cotton  wrote:
>>> https://fedoraproject.org/wiki/Changes/BtrfsTransparentCompression
>>>
>>> == How to test ==
>>>
>>> Existing systems can be converted to use compression manually with
>>> btrfs filesystem defrag -czstd -r, updating
>>> `/etc/fstab`
>>
>> Update `/etc/fstab` how? Please be more explicit.
>>
> Good point, thanks. Adding it now.
Additionally, make sure to run "systemctl daemon-reload" after editing
/etc/fstab, otherwise some services will fail to start on an existing
installed system.


--
Luya Tshimbalanga
Fedora Design Team
Fedora Design Suite maintainer


Re: Fedora 34 Change: Enable btrfs transparent zstd compression by default (System-Wide Change proposal)

2020-12-30 Thread Michel Alexandre Salim
On Wed, 2020-12-30 at 16:28 -0500, Elliott Sales de Andrade wrote:
> On Wed, 30 Dec 2020 at 14:53, Ben Cotton  wrote:
> > 
> > https://fedoraproject.org/wiki/Changes/BtrfsTransparentCompression
> > 
> > == How to test ==
> > 
> > Existing systems can be converted to use compression manually with
> > btrfs filesystem defrag -czstd -r, updating
> > `/etc/fstab`
> 
> Update `/etc/fstab` how? Please be more explicit.
> 
Good point, thanks. Adding it now.
> 

-- 
Michel Alexandre Salim
profile: https://keyoxide.org/mic...@michel-slm.name
chat via email: https://delta.chat/
GPG key: 5DCE 2E7E 9C3B 1CFF D335 C1D7 8B22 9D2F 7CCC 04F2




Re: Fedora 34 Change: Enable btrfs transparent zstd compression by default (System-Wide Change proposal)

2020-12-30 Thread Elliott Sales de Andrade
On Wed, 30 Dec 2020 at 14:53, Ben Cotton  wrote:
>
> https://fedoraproject.org/wiki/Changes/BtrfsTransparentCompression
>
> == Summary ==
>
> On variants using btrfs as the default filesystem, enable transparent
> compression using zstd. Compression saves space and can significantly
> increase the lifespan of flash-based media by reducing write
> amplification. It can also increase read and write performance.
>
> == Owners ==
>
> * Name: [[User:salimma|Michel Salim]], [[User:dcavalca|Davide
> Cavalca]], [[User:josef|Josef Bacik]]
> * Email: mic...@michel-slm.name, dcava...@fb.com, jo...@toxicpanda.com
>
>
> == Detailed description ==
>
> Transparent compression is a btrfs feature that allows a btrfs
> filesystem to apply compression on a per-file basis. Of the three
> supported algorithms, zstd is the one with the best compression speed
> and ratio. Enabling compression saves space, but it also reduces write
> amplification, which is important for SSDs. Depending on the workload
> and the hardware, compression can also result in an increase in read
> and write performance.
>
> See https://pagure.io/fedora-btrfs/project/issue/5 for details. This
> was originally scoped as an optimization for
> https://fedoraproject.org/wiki/Changes/BtrfsByDefault during Fedora
> 33.
>
>
> == Benefit to Fedora ==
>
> Better disk space usage, reduction of write amplification, which in
> turn helps increase lifespan and performance on SSDs and other
> flash-based media. It can also increase read and write performance.
>
> == Scope ==
>
> * Proposal owners:
> ** Update anaconda to perform the installation using mount -o
> compress=zstd:1
> ** Set the proper option in fstab (alternatively: set the XATTR)
> ** Update disk image build tools to enable compression:
> *** lorax
> *** appliance-tools
> *** osbuild
> *** imagefactory
> ** [optional] Add support for
> [https://github.com/kdave/btrfs-progs/issues/328 setting compression
> level when defragmenting]
> ** [optional] Add support for
> [https://github.com/kdave/btrfs-progs/issues/329 setting compression
> level using `btrfs property`]
> * Other developers:
> ** anaconda: review PRs as needed
> * Release engineering: https://pagure.io/releng/issue/9920
> * Policies and guidelines: N/A
> * Trademark approval: N/A
>
> == Upgrade/compatibility impact ==
>
> This Change only applies to newly installed systems. Existing systems
> on upgrade will be unaffected, but can be converted manually with
> btrfs filesystem defrag -czstd -r, updating `/etc/fstab`
> and remounting.
>
> == How to test ==
>
> Existing systems can be converted to use compression manually with
> btrfs filesystem defrag -czstd -r, updating `/etc/fstab`

Update `/etc/fstab` how? Please be more explicit.

> and remounting.
>
> == User experience ==
>
> Compression will result in file sizes (e.g. as reported by du) not
> matching the actual space occupied on disk. The
> [https://src.fedoraproject.org/rpms/compsize compsize] utility can be
> used to examine the compression type, effective compression ratio and
> actual size.
>
> == Dependencies ==
>
> Anaconda will need to be updated to perform the installation using
> mount -o compress=zstd:1
>
> == Contingency plan ==
>
> * Contingency mechanism: will not include PR patches if not merged
> upstream and will not enable
> * Contingency deadline: Final freeze
> * Blocks release? No
> * Blocks product? No
>
> == Documentation ==
>
> https://btrfs.wiki.kernel.org/index.php/Compression
>
> == Release Notes ==
>
> Transparent compression of the filesystem using zstd is now enabled by
> default. Use the compsize utility to find out the actual size on disk
> of a given file.
>
>
> --
> Ben Cotton
> He / Him / His
> Senior Program Manager, Fedora & CentOS Stream
> Red Hat
> TZ=America/Indiana/Indianapolis



-- 
Elliott