On Mon, Dec 04, 2017 at 08:34:28 +0800, Qu Wenruo wrote:

>> 1. is there any switch resulting in 'defrag only exclusive data'?
> 
> IIRC, no.

I have found a directory - the pam_abl databases - which occupies 10 MB (yes,
TEN megabytes) and released ...8.7 GB (almost NINE gigabytes) after
defrag. After defragging, the files were not snapshotted again, and I've
already lost 3.6 GB once more, so this is fully reproducible.
There are 7 files, one of which accounts for 99% of the space (10 MB). None of
them has nocow set, so they're riding all-btrfs.

I could do some debugging before I clean this up - is there anything about
the files you want me to check or would like to know?
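For reference, this is roughly the kind of data I could collect before cleaning
up - a sketch assuming filefrag (from e2fsprogs) and btrfs-progs are installed;
the hosts.db path and all numbers below are made up for illustration, and only
the summarizing step is actually run here:

```shell
#!/bin/sh
# Commands I could run against the affected directory (not executed here):
#   btrfs filesystem du -s /var/lib/pam_abl   # Total / Exclusive / Set shared
#   filefrag -v /var/lib/pam_abl/hosts.db     # per-extent listing
# Example of summarizing a filefrag -v style listing; this canned data
# stands in for real output, which depends on the filesystem:
listing='   0:        0..     255:    10240..   10495:    256:
   1:      256..     511:    98304..   98559:    256:
   2:      512..     767:    30720..   30975:    256:'
# Count the extent lines - a high count on a small file means heavy
# fragmentation.
extents=$(printf '%s\n' "$listing" | grep -c '^')
echo "hosts.db: $extents extents"
```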

The fragmentation impact is HUGE here; a 1000:1 ratio is almost a DoS
condition that a malicious user could trigger within a few hours,
or faster - I've lost 3.6 GB overnight with a reasonably small
amount of writes, and I guess it might be possible to trash an entire
filesystem within 10 minutes if doing this on purpose.

>> 3. I guess there aren't, so how could I accomplish my target, i.e.
>>    reclaiming space that was lost due to fragmentation, without breaking
>>    snapshotted CoW where it would be not only pointless, but actually harmful?
> 
> What about using old kernel, like v4.13?

Unfortunately (I guess you had 3.13 in mind), I need the new ones and
will be pushing towards 4.14.

>> 4. How can I prevent this from happening again? All the files, that are
>>    written constantly (stats collector here, PostgreSQL database and
>>    logs on other machines), are marked with nocow (+C); maybe some new
>>    attribute to mark file as autodefrag? +t?
> 
> Unfortunately, nocow only works if there is no other subvolume/inode
> referring to it.

This shouldn't be my case anymore after defrag (== breaking the shared links).
I guess there's no easy way to check the refcounts of the blocks?
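One rough check I know of: filefrag -v flags extents that are referenced more
than once as "shared", so counting those lines approximates how much of a file
is still pinned by snapshots. A sketch on a canned listing (the path and
numbers are invented; real output depends on the kernel and filesystem):

```shell
#!/bin/sh
# On a live system: filefrag -v /var/lib/pam_abl/hosts.db
# The last column carries flags; "shared" means the extent has more than
# one reference (e.g. a snapshot still holds it).
listing='   0:      0..   255:  10240..  10495:    256: shared
   1:    256..   511:  20480..  20735:    256: shared
   2:    512..   767:  30720..  30975:    256:'
total=$(printf '%s\n' "$listing" | grep -c '^')
shared=$(printf '%s\n' "$listing" | grep -c 'shared$')
echo "$shared of $total extents still shared"
```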

> But in my understanding, btrfs is not suitable for such conflicting
> situation, where you want to have snapshots of frequent partial updates.
> 
> IIRC, btrfs is better for use case where either update is less frequent,
> or update is replacing the whole file, not just part of it.
> 
> So btrfs is good for root filesystem like /etc /usr (and /bin /lib which
> is pointing to /usr/bin and /usr/lib) , but not for /var or /run.

That is consistent with my conclusions after 2 years on btrfs;
however, I didn't expect a single file to eat 1000 times more space than it
should...


I wonder how many other filesystems were trashed like this - I'm short
of ~10 GB on another system, and many other users might be affected
(the ones telling stories on the Internet about btrfs running out of space).

It is not a problem that I need to defrag a file; the problem is that I don't know:
1. whether I need to defrag,
2. *what* I should defrag,
nor do I have a tool that would defrag smartly - only the exclusive data or, in
general, only the blocks worth defragging, i.e. where the space released from
extents outweighs the space lost to inter-snapshot duplication.
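Such a "smart" pass could perhaps be approximated with a script over
`btrfs filesystem du --raw`, defragging only files whose space is almost
entirely exclusive. A sketch, assuming the --raw columns are Total / Exclusive /
Set shared / Filename in bytes; the sample numbers are invented, and the actual
defragment call is left commented out since it needs a live btrfs:

```shell
#!/bin/sh
# Canned `btrfs filesystem du --raw` style output (invented numbers):
du_output='9341206528 9330757632 10485760 ./hosts.db
524288000 1048576 523239424 ./snapshotted.db'
# Keep files that are >90% exclusive: defragging them reclaims fragmented
# extents without duplicating data still shared with snapshots.
targets=$(printf '%s\n' "$du_output" | awk '$1 > 0 && $2 / $1 > 0.9 { print $4 }')
for f in $targets; do
    echo "would defrag: $f"
    # btrfs filesystem defragment "$f"   # uncomment on the real filesystem
done
```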

I can't just defrag entire filesystem since it breaks links with snapshots.
This change was a real deal-breaker here...

Is there maybe a way to feed the deduplication code with the snapshots? There
are directories and files in the same layout, so this could be fast-tracked
for checking and deduplication.

-- 
Tomasz Pala <go...@pld-linux.org>
