On Sun, 17 Sep 2017 08:20:50 -0500,
Dan Douglas <orm...@gmail.com> wrote:

> On 09/17/2017 04:17 AM, Kai Krakow wrote:
> > On Sun, 17 Sep 2017 01:20:45 -0500,
> > Dan Douglas <orm...@gmail.com> wrote:
> >   
> >> On 09/16/2017 07:06 AM, Kai Krakow wrote:  
> >> [...]
> >>
> >> According to btrfs-filesystem(8), defragmentation breaks reflinks,
> >> in all but a few old kernel versions where I guess they tried to
> >> fix the problem and apparently failed.  
> > 
> > It was splitting and splicing all the reflinks which is actually a
> > tree walk with more and more extents coming into the equation, and
> > ended up doing a lot of small IO and needing a lot of memory. I
> > think you really cannot fix this when working with extents.  
> 
> I figured by "break up" they meant it eliminates the reflink by making
> a full copy... so the increased space they're talking about isn't
> really double that of the original data in other words.
> 
> >   
> >> This really makes much of what btrfs
> >> does altogether pointless if you ever defragment manually or have
> >> autodefrag enabled. Deduplication is broken for the same reason.  
> > 
> > It's much easier to fix this for deduplication: Just write your
> > common denominator of an extent to a tmp file, then walk all the
> > reflinks and share them with parts of this extent.
> > 
> > If you carefully select what to defragment, there should be no
> > problem. A defrag tool could simply skip all the shared extents. A
> > few fragments do not hurt performance at all, but what's important
> > is spatial locality. A lot of small fragments may hurt performance a
> > lot, so one could give the defragger a hint when to ignore the rule
> > and still defragment the extent. Also, when your deduplication
> > window is 1M you could probably safely defrag all extents smaller
> > than 1M.  
> 
> Yeah this sort of hurts with the way I deal with KVM image snapshots.
> I have raw base images as backing files with lots of shared and null
> data, so I run `fallocate --dig-holes' followed by `duperemove
> --dedupe-options=same' on the cow-enabled base images and hope that
> btrfs defrag can clean up the resulting fragmented mess, but it's a
> slow process and doesn't seem to do a good job.

I would be interested in your results if you try bees[1] to
deduplicate your KVM images. It should be able to dig holes and merge
blocks by reflinking. I'm not sure whether it would merge contiguous
extents back into a single extent; I think that's on a todo list. It
could then act as a reflink-aware defragger.
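
To make the "skip shared extents" rule quoted above a bit more concrete,
here is a rough Python sketch of the selection step. It is only an
illustration and not how btrfs defrag or bees actually work: the Extent
record and select_for_defrag() are made-up names, the rule is my reading
of the idea (unshared extents are always candidates, shared ones only
when they are smaller than the deduplication window), and a real tool
would have to obtain the extent list and the shared flag from the kernel.

```python
from dataclasses import dataclass
from typing import List

MIB = 1024 * 1024


@dataclass(frozen=True)
class Extent:
    """Hypothetical extent record; not a real btrfs structure."""
    offset: int   # logical offset in the file, in bytes
    length: int   # extent length in bytes
    shared: bool  # True if another file or snapshot reflinks this extent


def select_for_defrag(extents: List[Extent], dedup_window: int = MIB) -> List[Extent]:
    """Return the extents a reflink-aware defragger may rewrite.

    Assumed rule: unshared extents are always candidates; shared extents
    are skipped unless they are smaller than the deduplication window.
    """
    return [e for e in extents if not e.shared or e.length < dedup_window]


# Example: one large shared extent, one small shared fragment, one
# unshared extent. Only the last two are selected.
extents = [
    Extent(offset=0, length=4 * MIB, shared=True),
    Extent(offset=4 * MIB, length=64 * 1024, shared=True),
    Extent(offset=4 * MIB + 64 * 1024, length=2 * MIB, shared=False),
]
candidates = select_for_defrag(extents)
```

The threshold doubles as the "hint" mentioned above: raising
dedup_window trades more broken reflinks for fewer small fragments.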

It currently does not work well for mixed datasum/nodatasum workloads,
so I made a PR[2] to ignore nocow files. A more elaborate patch would
simply not try to reflink between datasum and nodatasum extents (nocow
implies nodatasum).
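
The check such a patch would need could look roughly like the following
Python sketch. This is not code from bees or from the PR: FileFlags and
may_reflink() are hypothetical names, and the only facts it encodes are
the ones stated above, namely that nocow implies nodatasum and that
extents with and without data checksums should not be reflinked with
each other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileFlags:
    """Hypothetical per-file flags; not the actual bees data model."""
    nocow: bool = False
    nodatasum: bool = False

    @property
    def has_datasum(self) -> bool:
        # nocow implies nodatasum, so either flag means "no checksums".
        return not (self.nocow or self.nodatasum)


def may_reflink(src: FileFlags, dst: FileFlags) -> bool:
    """Allow a reflink only if both sides agree on data checksums."""
    return src.has_datasum == dst.has_datasum


cow_file = FileFlags()
nocow_image = FileFlags(nocow=True)
nodatasum_file = FileFlags(nodatasum=True)
```

With these flags, may_reflink(cow_file, nocow_image) is refused, while
two checksum-less files such as nocow_image and nodatasum_file may still
share extents.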

[1]: https://github.com/Zygo/bees
[2]: https://github.com/Zygo/bees/pull/21


-- 
Regards,
Kai

Replies to list-only preferred.
