Re: btrfs send extremely slow (almost stuck)

Qu Wenruo Sun, 28 Aug 2016 19:12:02 -0700


At 08/28/2016 11:38 AM, Oliver Freyermuth wrote:

Dear btrfs experts,

I just tried to make use of btrfs send / receive for incremental backups (using 
btrbk to simplify the process).
It seems that on my two machines, btrfs send gets stuck after transferring a few
GiB. It is not fully halted, but instead of making full use of the available I/O,
I get less than 500 KiB/s on average, made up of a few "full speed spikes" with
many seconds or minutes of no I/O in between.

During this "halting", btrfs send eats one full CPU core.
A "perf top" shows this is spent in "find_parent_nodes" and "__merge_refs" 
inside the kernel.
I am using btrfs-progs 4.7 and kernel 4.7.0.


This is a known problem, but unfortunately there is no good idea for solving it yet.

I sent an RFC patch to completely disable shared extent detection, but it met strong objection.

I also submitted some other ideas for fixing it, but those met strong objection as well. The objections include that this is a performance problem, not a functional problem, and that we should focus on functional problems first and postpone performance problems like this one.

Furthermore, Btrfs has focused on fast snapshot creation from the beginning of its design, and sacrificed backref walk performance for it.

So it's not an easy thing to fix.


I googled a bit and found a related patch on patchwork 
(https://patchwork.kernel.org/patch/9238987/) which seems to work around high 
load in this area and mentions that a real solution is proposed but not yet there.

Since this affects two machines of mine and backing up my root volume would 
take about 80 hours if I extrapolate the average rate, this means 
btrfs send is unusable for me.
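
The extrapolation above is simple arithmetic and can be sketched as follows. This is only an illustration: the volume size is not stated in the report, so the 140 GiB used here is a hypothetical value, and the rate is the "less than 500 KiB/s" upper bound quoted earlier.

```python
def estimate_hours(size_gib: float, rate_kib_per_s: float) -> float:
    """Estimate the transfer time in hours for a given size and average rate."""
    size_kib = size_gib * 1024 * 1024      # GiB -> KiB
    seconds = size_kib / rate_kib_per_s
    return seconds / 3600


# Hypothetical volume size (not from the report), with the reported rate bound.
hours = estimate_hours(size_gib=140, rate_kib_per_s=500)
print(f"about {hours:.0f} hours")
```

With these assumed inputs the estimate lands in the same range as the roughly 80 hours reported.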

Can I assume this is a common issue which will be fixed in a later kernel 
release (4.8, 4.9), or can I do something to my filesystems to work around this issue?

I don't expect there will even be agreement on how to fix the problem in v4.1x.

Fixes in send would bring an obvious speed improvement, but would cause incompatibility or a super complex design. Fixes in backref would mean a backref rework, which normally comes with new regressions, and we are not even sure it would really help.

If you just hate the super slow send and can accept the extra space usage, please try this RFC patch:


https://patchwork.kernel.org/patch/9245287/

This patch, as its name says, completely stops shared extent (reflink) detection. That will cause more space usage, but since it skips the super time-consuming find_parent_nodes(), it should at least work around your problem.
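
The trade-off can be illustrated with a toy model. This is not kernel code and not the patch itself: `FileExtent`, `plan_send` and the `detect_shared` switch are hypothetical names, "already seen in this stream" is a simplified stand-in for the real clone-source lookup, and one counter increment stands in for one find_parent_nodes() backref walk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileExtent:
    path: str
    extent_id: int  # hypothetical identifier of the on-disk extent
    length: int     # bytes


def plan_send(extents, detect_shared=True):
    """Return (commands, backref_walks, bytes_written) for a toy send stream."""
    seen = {}       # extent_id -> path that first carried it in this stream
    commands = []
    walks = 0
    written = 0
    for e in extents:
        if detect_shared:
            walks += 1  # stands in for one expensive backref walk
            if e.extent_id in seen:
                # Shared extent: reference the earlier copy instead of resending.
                commands.append(("clone", e.path, seen[e.extent_id]))
                continue
            seen[e.extent_id] = e.path
        # Unshared extent, or detection disabled: send the data itself.
        commands.append(("write", e.path, e.length))
        written += e.length
    return commands, walks, written


extents = [
    FileExtent("a", 1, 4096),
    FileExtent("b", 1, 4096),   # reflinked copy of "a"
    FileExtent("c", 2, 8192),
]
print(plan_send(extents, detect_shared=True)[1:])
print(plan_send(extents, detect_shared=False)[1:])
```

In this model, disabling detection removes every backref walk but sends the shared extent twice, which is the extra space usage mentioned above.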

I have another, less aggressive idea for fixing it, but since there were objections to it, I didn't code it further.

But since there are *REAL* *WORLD* users reporting this problem, I think I'd better restart the fix as an RFC.


Thanks,
Qu


One FS is only two weeks old, the other one now about 1 year. I did some 
balancing at some points of time to have more unallocated space for trimming,
and used duperemove regularly to free space. One FS has skinny extents, the 
other has not.

Mount options are "rw,noatime,compress=zlib,ssd,space_cache,commit=120".

Apart from that: No RAID or any other special configuration involved.

Cheers and any help appreciated,
        Oliver
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


