On 2017-02-07 10:20, Timofey Titovets wrote:
> I think that you have a problem with extent bookkeeping (if I
> understand how btrfs manages extents).
> To deal with it, try enabling compression, as compression will force
> all extents to be split into fragments of ~128kB.
No, it will compress everything in chunks of 128kB, but it will not fragment
things any more than they already would have been (it may actually _reduce_
fragmentation, because less data is being stored on disk). That
representation is a bug in the FIEMAP ioctl; it doesn't properly understand
the way BTRFS represents things. IIRC there was a patch to fix this, but
I don't remember what happened with it.
That said, in-line compression can help significantly, especially if you
have slow storage devices.
> I mean that:
> You have a 128MB extent and rewrite random 4k sectors; btrfs will not
> split the 128MB extent or free up the old data (I don't know the
> internal algorithm, so I can't predict when this will happen), and
> after some time btrfs will rebuild the extents and split the 128MB
> extent into several smaller ones.
> But when you use compression, the allocator rebuilds extents much
> earlier (I think it's because btrfs also operates on it as 128kB
> extents, even if it's a continuous 128MB chunk of data).
The allocator has absolutely nothing to do with this; it's a function of
the COW operation. Unless you're using nodatacow, that 128MB extent
will get split the moment the data hits the storage device: either on
the next commit (at most 30 seconds with the default commit interval)
or when fdatasync is called, whichever comes sooner. In the case
of compression, it's still one extent (although on disk it will be less
than 128MB) and will be split at _exactly_ the same time under _exactly_
the same circumstances as an uncompressed extent. IOW, it has
absolutely nothing to do with the extent handling either.
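To make the split concrete, here's a toy model in plain Python (not btrfs
code; the function name and tuple layout are made up for illustration): a
COW rewrite of a 4k range in the middle of a 128MB extent leaves the file
referencing three extents, with the old data freed once nothing else
references it.

```python
# A toy model (plain Python, not btrfs code): extents are (offset, length)
# tuples, and a COW rewrite of a range inside one extent replaces it with
# up to three pieces: the untouched head, the new data, the untouched tail.
def cow_rewrite(extents, offset, length):
    """Rewrite [offset, offset+length); assumes the range falls entirely
    within a single existing extent (enough for this illustration)."""
    out = []
    for start, size in extents:
        end = start + size
        if end <= offset or start >= offset + length:
            out.append((start, size))  # extent untouched by the rewrite
            continue
        if start < offset:
            out.append((start, offset - start))  # head kept from old extent
        out.append((offset, length))  # newly written (COW'd) extent
        if end > offset + length:
            out.append((offset + length, end - offset - length))  # tail
    return out

# One 128MB extent, one 4k rewrite in the middle -> three extents.
MB = 1024 * 1024
print(cow_rewrite([(0, 128 * MB)], 64 * MB, 4096))
```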
The difference arises in that compressed data effectively has an on-media
block size of 128k, not 16k (the current default block size) or 4k (the
old default). This means that the smallest fragment possible for a file
with in-line compression enabled is 128k, while for a file without it
it's equal to the filesystem block size. A larger minimum fragment size
means that the maximum number of fragments a given file can have is
smaller (in fact 8 times smaller than without compression at the
current default block size), which means that there will be less
fragmentation.
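The factor of 8 is just the ratio of the two minimum fragment sizes; a
quick back-of-the-envelope check (the file size and helper name here are
hypothetical, purely for illustration):

```python
# Worst-case fragment count: every fragment is as small as the
# filesystem allows, so count = file size / minimum fragment size.
def max_fragments(file_size, min_fragment):
    return file_size // min_fragment

file_size = 1 * 1024**3  # a hypothetical 1GB file

uncompressed = max_fragments(file_size, 16 * 1024)   # 16k block size
compressed = max_fragments(file_size, 128 * 1024)    # 128k compressed chunk

print(uncompressed)                 # 65536
print(compressed)                   # 8192
print(uncompressed // compressed)   # 8
```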
Some rather complex and tedious math indicates that this is not the
_only_ thing improving performance when using in-line compression, but
it's probably the biggest thing doing so for the workload being discussed.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html