On Tue, Sep 22, 2015 at 09:52:19PM +0200, carlo von lynX wrote:
> Hello, it's me again. This time I searched the web to make sure
> I'm not making another beginner's mistake. I'm still not on the
> list, so please keep me in cc: on replies.
>
> I have optimized a btrfs subvolume with a script* that reflinks
> all files with identical contents, then I did a read-only snap
> and fed it to send/receive. The bad news: on the receiving
> side the same snapshot grew from 5.5G to 7.1G.

That's something I'd definitely expect it to be able to do. If it's
not doing it, I'd say there's something wrong. cc'ing Filipe, who is,
I think, currently the local expert on send/receive.

> I assume send/receive does not support one of the coolest
> btrfs features ever.. reflinks. Didn't find any mention on this
> on https://btrfs.wiki.kernel.org/index.php/Incremental_Backup
> or other pages. Is there any documentation that would explain
> to me why this has to be or is it just a missing feature that
> someone someday may find the time to add?
>
> Generally I find it odd that btrfs receive would not recreate
> an identical clone of the original snapshot, that would also
> allow me to continue working on a backup hard disk, then merge
> the changes back to the main disk. Instead I have to decide
> which device contains the master copy for all times and never
> make rw snapshots elsewhere. What if the master disk dies?
> Then I can turn a backup into the new master but I will have
> to re-bootstrap all other backups as they will not accept the
> non-identical parent snapshot.

That's a known drawback, and one that's been discussed on this list
already. It's fixable (within some limits), but requires a change to
the send stream format. (See my analysis below).

> Apparently I'm not the only one that thought this to be a
> defect rather than a design choice:
> http://www.spinics.net/lists/linux-btrfs/msg45175.html
>
> This actually confused me (in particular the absence of responses
> to that mail), that's why I have btrfs-progs 4.0 installed...
> but in the meantime I figured out that I expected send/receive
> to be bidirectional. So my question in this case.. is there a
> higher reasoning for the inexactness of send/receive transfers?

It's about tracking enough metadata to be sure that the send (or the
receive) is actually feasible.
See http://www.spinics.net/lists/linux-btrfs/msg44089.html for my
analysis of the problem, and (theoretical) suggestions for what the
solution should look like.

> And another classic: since the output size of the snapshot copy
> is unpredictable, running out of disk space can be frequent.
> Wouldn't it be cool if receive could resume rather than restarting
> from scratch?

Resuming is a bit tricky -- how do you know where to resume from?
Bear in mind that send simply writes its results to stdout, so it has
no knowledge of anything on the receiving side. In fact, the receiving
side may not even exist at the point that the send stream is created.

Hugo.

> But maybe I still got it all wrong in my head. If these things
> are FAQs, please add them to the FAQ document. In particular some
> criteria to decide when rsync is actually a more suitable tool
> over send/receive, which apparently under some circumstances is
> the case. In some other cases, git can be the better suited tool.
>
> Still I am very glad that you created a new alternative for data
> organization between the extremes of reckless rsync and overly
> accurate git. It's just a steep learning mountain.
>
>
> *) I used fdupes' output ran through a perl script that calls
> "cp --reflink" for each match. Would "bedup" or "duperemove"
> do a better job? bedup looks like a better long-term solution.
>
>

-- 
Hugo Mills             | Great oxymorons of the world, no. 3:
hugo@... carfax.org.uk | Military Intelligence
http://carfax.org.uk/  |
PGP: E2AB1DE4          |
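
For anyone wanting to reproduce the setup described in the footnote
above (fdupes output fed to a script that calls "cp --reflink"), here
is a minimal sketch of that idea. It is written in Python rather than
Perl and is not the original poster's script. It assumes fdupes'
default output format (one path per line, duplicate sets separated by
a blank line) and GNU cp; the function names parse_fdupes_groups,
reflink_commands and apply are made up for illustration.

```python
import shlex
import subprocess
from typing import Iterable, List


def parse_fdupes_groups(lines: Iterable[str]) -> List[List[str]]:
    """Split fdupes' default output into sets of duplicate paths.

    Assumption: one path per line, sets separated by a blank line.
    """
    groups: List[List[str]] = []
    current: List[str] = []
    for raw in lines:
        path = raw.rstrip("\n")
        if path:
            current.append(path)
        elif current:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


def reflink_commands(groups: List[List[str]]) -> List[List[str]]:
    """Build one `cp --reflink=always` command per duplicate.

    The first path of each set is kept as the source; every other
    path in the set is overwritten with a reflinked copy of it.
    """
    commands = []
    for first, *rest in groups:
        for dup in rest:
            commands.append(["cp", "--reflink=always", "--", first, dup])
    return commands


def apply(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them.

    Caveat: nothing here re-checks that the files are still identical
    when cp runs, so this is only safe on a quiescent subvolume.
    """
    for cmd in commands:
        if dry_run:
            print(" ".join(shlex.quote(arg) for arg in cmd))
        else:
            subprocess.run(cmd, check=True)
```

Typical use would be to pipe the output of "fdupes -r" over the
subvolume into a small wrapper that calls
apply(reflink_commands(parse_fdupes_groups(sys.stdin))), check the
dry-run output first, and only then re-run with dry_run=False.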