On Fri, Aug 29, 2014 at 09:34:54PM +0530, Shriramana Sharma wrote:
> On 8/17/14, Shriramana Sharma <samj...@gmail.com> wrote:
> > Hello. One more Q re generic BTRFS behaviour.
> > https://btrfs.wiki.kernel.org/index.php/Main_Page specifically
> > advertises BTRFS's "Space-efficient packing of small files".
> 
> Hello. I realized that while I got lots of interesting advice on how
> to best layout my FS on multiple devices/FSs, I would like to
> specifically know how exactly the above works (in not-too-technical
> terms) so I'd like to decide for myself if the above feature of BTRFS
> would suit my particular purpose.

   In brief: For small files (typically under about 3.5k), the FS can
put the file's data in the metadata -- specifically, in the file's
extent item in the filesystem tree -- so that the data is directly
available without a second seek to find it.

   The longer version: btrfs has a number of B-trees in its metadata.
These are trees with a high fan-out (from memory, it's something like
30-240 children each, depending on the block size), and with the
actual data being stored at the leaves of the tree. Each leaf of the
tree is a fixed size, set by the options passed to mkfs (typically
4k-32k).

   The data in the trees is stored as a key and a value -- the tree
indexes the keys efficiently, and stores the values (usually some data
structure like an inode or file extent information) in the same leaf
node as the key -- keys at the front of the leaf, data at the back.
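
   A toy sketch of that leaf layout, in Python. This is not the real
on-disk format: the ToyLeaf class, the 4096-byte leaf size and the
25-byte per-key slot cost are assumptions made for illustration. It
only shows keys and values sharing one fixed-size leaf, so that a
value is in hand as soon as its key has been found.

```python
import bisect

ITEM_HEADER = 25  # assumed cost in bytes of one key slot (illustrative)


class ToyLeaf:
    """Fixed-size leaf: key slots fill from the front, values from the back."""

    def __init__(self, size=4096):
        self.size = size
        self.keys = []    # kept sorted; conceptually packed at the front
        self.values = {}  # key -> bytes; conceptually packed at the back

    def free_space(self):
        used_front = len(self.keys) * ITEM_HEADER
        used_back = sum(len(v) for v in self.values.values())
        return self.size - used_front - used_back

    def insert(self, key, value):
        """Store value under key; return False if the leaf is too full."""
        if key in self.values:
            raise KeyError("key already present")
        if ITEM_HEADER + len(value) > self.free_space():
            return False  # a real tree would split the leaf here
        bisect.insort(self.keys, key)
        self.values[key] = value
        return True

    def lookup(self, key):
        """Return the value stored in this leaf for key, or None."""
        return self.values.get(key)
```

   Inserting a 160-byte value into an empty 4096-byte ToyLeaf leaves
4096 - 25 - 160 = 3911 bytes free; a later insert that does not fit
in what remains is refused.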

   Each file has file extent items in the metadata (these live in the
filesystem tree next to the inode, not in the separate extent tree,
which tracks space allocation). They keep track of the contiguous byte
sequences of the file, and where those sequences can be found on the
FS. To read a file, the FS looks up the file's extent items, and then
has to go and find the data that they point to. This involves an extra
read of the disk, which is slow. However, the metadata tree leaf is
already in RAM (because the FS has just read it). So, for performance
and space efficiency reasons, the FS can optionally store the data for
a small file as part of the "value" component of the key/value pair
for the file's extent. This means that the file's data is available
immediately, without the extra disk read.
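
   The inline-or-not decision can be sketched roughly as below. The
names (ExtentItem, make_extent_item, read_file) and the 3584-byte
threshold are made up for the example, and "disk" is just a bytearray
standing in for the data area; the real code is considerably more
involved.

```python
from dataclasses import dataclass
from typing import Optional

MAX_INLINE = 3584  # assumed threshold ("about 3.5k"); not a real constant


@dataclass
class ExtentItem:
    length: int
    inline_data: Optional[bytes] = None  # set when the data lives in metadata
    disk_offset: int = 0                 # used only when the data is not inline


def make_extent_item(data, disk, max_inline=MAX_INLINE):
    """Store small data inside the item itself, larger data on 'disk'."""
    if len(data) <= max_inline:
        return ExtentItem(length=len(data), inline_data=bytes(data))
    offset = len(disk)
    disk.extend(data)
    return ExtentItem(length=len(data), disk_offset=offset)


def read_file(item, disk, stats):
    """Return the file data, counting the extra read taken for non-inline data."""
    if item.inline_data is not None:
        return item.inline_data  # already in RAM with the metadata leaf
    stats["extra_reads"] += 1    # second trip to the disk
    return bytes(disk[item.disk_offset:item.disk_offset + item.length])
```

   Reading a 100-byte file made this way never touches "disk";
reading a 5000-byte one costs one extra read.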

   Drawbacks: metadata on btrfs is usually DUP, which means two
copies, so storing lots of medium-small files (2k-4k) inline will take
up more space than it would otherwise. Assuming 4k data blocks, such a
file would occupy one 4k block as ordinary data, but two inline copies
of it occupy 4k-8k of metadata, so you're storing two copies and not
saving enough space to make it worthwhile. It also makes it harder to
calculate the "used" vs "free" values for df.
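
   A back-of-the-envelope version of that space trade-off, as a
sketch. It assumes 4096-byte data blocks, a single copy of data, two
copies of metadata, and ignores per-item overhead; the function names
are invented for the example.

```python
def inline_cost(size, metadata_copies=2):
    """Bytes used when the file's data is stored inline in DUP metadata."""
    return size * metadata_copies


def extent_cost(size, block=4096, data_copies=1):
    """Bytes used when the data goes into whole data blocks instead."""
    blocks = -(-size // block)  # ceiling division
    return blocks * block * data_copies


def inline_saves_space(size):
    """True if inline storage is the smaller of the two, under these assumptions."""
    return inline_cost(size) < extent_cost(size)
```

   Under these assumptions a 500-byte file costs 1000 bytes inline
against 4096 as a data extent, while a 3000-byte file costs 6000
bytes inline against 4096; the break-even point is at 2048 bytes.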

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
       --- Great films about cricket: Umpire of the Rising Sun ---       
