Re: historical backups with hardlinks vs cp --reflink vs snapshots

2014-05-21 Thread Russell Coker
On Tue, 20 May 2014 20:59:28 Marc MERLIN wrote:
 I just wrote a blog post about the 3 ways of doing historical snapshots:
 http://marc.merlins.org/perso/btrfs/post_2014-05-20_Historical-Snapshots-With-Btrfs.html
 I love reflink, but that forces me to use btrfs send as the only way to
 copy a filesystem without losing the reflink relationship, and I have no
 good way from user space to inspect shared blocks: to see how many are
 shared, or whether some just got duplicated in a copy.
 As a result, for now I still use hardlinks.

It would be nice if someone patched rsync to look for files with identical 
contents and use reflink or hardlinks (optionally at user request) instead of 
making multiple copies of the same data.  Also it would be nice if rsync would 
look for matching blocks in different files to save transfer.

-- 
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: historical backups with hardlinks vs cp --reflink vs snapshots

2014-05-20 Thread Marc MERLIN
On Mon, May 19, 2014 at 06:01:25PM +0200, Brendan Hide wrote:
 On 19/05/14 15:00, Scott Middleton wrote:
 On 19 May 2014 09:07, Marc MERLIN m...@merlins.org wrote:
 On Wed, May 14, 2014 at 11:36:03PM +0800, Scott Middleton wrote:
 I read so much about BtrFS that I mistook Bedup for Duperemove.
 Duperemove is actually what I am testing.
 I'm currently using programs that find files that are the same, and
 hardlink them together:
 http://marc.merlins.org/perso/linux/post_2012-05-01_Handy-tip-to-save-on-inodes-and-disk-space_-finddupes_-fdupes_-and-hardlink_py.html
 
 hardlink.py actually seems to be the fastest one (in memory and CPU) even
 though it's in python.
 I can get others to run out of RAM on my 8GB server easily :(
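The core of the hash-and-hardlink approach those tools use can be sketched in a few lines (a minimal illustration, not hardlink.py itself; real tools also pre-filter by size and device, and guard against cross-device links):

```python
# Hedged sketch: hash file contents and hardlink byte-identical files
# together, replacing duplicate copies with links to one canonical inode.
import hashlib
import os

def hardlink_dupes(root):
    seen = {}  # content digest -> canonical path
    for dirpath, _, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path) or os.path.islink(path):
                continue
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            if digest in seen and not os.path.samefile(seen[digest], path):
                os.unlink(path)             # replace the duplicate copy
                os.link(seen[digest], path) # with a hardlink to the canonical file
            else:
                seen.setdefault(digest, path)
    return seen
```

Hashing every file in full is what makes memory-hungry implementations blow up on large trees; a size-first pass cuts most of that work.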
 
 Interesting app.
 
 An issue with hardlinking (though in the backups use-case this problem isn't 
 likely to happen) is that if you modify a file, all the hardlinks get 
 changed along with it - including the ones that you don't want changed.
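That pitfall is easy to demonstrate: hardlinks share a single inode, so an in-place write through one name is visible through every other name (a rename-and-replace, as many editors do, would instead break the link):

```python
# Demonstration of the hardlink pitfall: modifying a file in place
# through one hardlink changes the content seen via all other links.
import os
import tempfile

d = tempfile.mkdtemp()
a = os.path.join(d, "a")
b = os.path.join(d, "b")
with open(a, "w") as f:
    f.write("original")
os.link(a, b)                 # b is a hardlink to a's inode

with open(b, "r+") as f:      # write in place through b ...
    f.write("CHANGED!")

with open(a) as f:
    print(f.read())           # ... and a sees the change too: CHANGED!
```

With a reflink copy (`cp --reflink`) the two names have separate inodes, so a write through one leaves the other untouched, which is exactly the behavior you want for non-backup use-cases.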
 
 @Marc: Since you've been using btrfs for a while now I'm sure you've already 
 considered whether or not a reflink copy is the better/worse option.

Yes, I have indeed considered it :)

I just wrote a blog post about the 3 ways of doing historical snapshots:
http://marc.merlins.org/perso/btrfs/post_2014-05-20_Historical-Snapshots-With-Btrfs.html

I love reflink, but that forces me to use btrfs send as the only way to
copy a filesystem without losing the reflink relationship, and I have no
good way from user space to inspect shared blocks: to see how many are
shared, or whether some just got duplicated in a copy.
As a result, for now I still use hardlinks.
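One partial answer from user space is `filefrag -v`, which asks the kernel (via FIEMAP) for a file's extents and, on recent kernels, flags extents marked as shared. A small sketch of parsing its output to count shared extents (the sample output below is illustrative; exact columns vary by e2fsprogs version, and this still won't tell you *which* files share a given extent):

```python
# Hedged sketch: count shared vs total extents in `filefrag -v` output.

def count_shared_extents(filefrag_output):
    shared = total = 0
    for line in filefrag_output.splitlines():
        fields = line.split()
        # extent rows start with an index like "0:"; skip headers
        if fields and fields[0].rstrip(":").isdigit():
            total += 1
            if "shared" in line:
                shared += 1
    return shared, total

sample = """\
Filesystem type is: 9123683e
File size of a (8192 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..       7:       3456..      3463:      8:             shared
   1:        8..      15:       9000..      9007:      8:             last,eof
"""
print(count_shared_extents(sample))   # -> (1, 2)
```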

Once bedup is a bit more ready, I may switch.

That said, duperemove is another dedup I wasn't aware of and I should
look at indeed:
https://github.com/markfasheh/duperemove/blob/master/README

Does it basically do the same work as bedup and tell btrfs to
consolidate blocks it identified as dupes?
Does it work across subvolumes?

Marc
-- 
A mouse is a device used to point at the xterm you want to type in - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901