Re: Content based storage

2010-03-20 Thread Boyd Waters
I realize that I've posted some dumb things in this thread so here's a re-cast summary: 1) In the past, I experimented with filesystem backups, using my own file-level checksumming that would detect when a file was already in the backup repository, and add a hard link rather than allocate new bloc
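
The file-level scheme described in this message can be sketched roughly as follows. This is only an illustration, not the poster's actual tool: the function names, the use of SHA-256, and the layout of the repository (one stored file per digest under a hypothetical `store_dir`) are all assumptions.

```python
import hashlib
import os
import shutil


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def backup_file(src, dest, store_dir):
    """Back up src to dest, reusing an identical file already in store_dir.

    Each distinct content is stored once under its digest (hypothetical
    layout); dest is always a hard link to that stored copy, so a repeated
    file allocates no new data blocks. Returns True if new content was
    stored, False if an existing copy was linked.
    """
    os.makedirs(store_dir, exist_ok=True)
    os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
    stored = os.path.join(store_dir, file_digest(src))
    is_new = not os.path.exists(stored)
    if is_new:
        shutil.copyfile(src, stored)
    # Hard links require dest and store_dir to be on the same filesystem,
    # and os.link raises FileExistsError if dest already exists.
    os.link(stored, dest)
    return is_new
```

Because the check is per file, a one-byte change to a large file stores a complete second copy; that limitation is what block- or extent-level approaches aim to avoid.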

Re: Content based storage

2010-03-20 Thread Ric Wheeler
On 03/20/2010 06:16 PM, Ric Wheeler wrote: On 03/20/2010 05:24 PM, Boyd Waters wrote: On Mar 20, 2010, at 9:05 AM, Ric Wheeler wrote: My dataset reported a dedup factor of 1.28 for about 4TB, meaning that almost a third of the dataset was duplicated. It is always interesting to compare thi

Re: Content based storage

2010-03-20 Thread Ric Wheeler
On 03/20/2010 05:24 PM, Boyd Waters wrote: On Mar 20, 2010, at 9:05 AM, Ric Wheeler wrote: My dataset reported a dedup factor of 1.28 for about 4TB, meaning that almost a third of the dataset was duplicated. It is always interesting to compare this to the rate you would get with old fashion

Re: Content based storage

2010-03-20 Thread Boyd Waters
On Mar 20, 2010, at 9:05 AM, Ric Wheeler wrote:
>> My dataset reported a dedup factor of 1.28 for about 4TB, meaning that
>> almost a third of the dataset was duplicated.
> It is always interesting to compare this to the rate you would get
> with old fashioned compression to see how effecti
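
For readers who want to check the figure quoted above, here is a small arithmetic sketch. It assumes the dedup factor is the ratio of logical data to physically stored data; the thread does not define it, so treat that as an assumption. The function names are made up for illustration.

```python
def duplicate_fraction(dedup_factor):
    """Fraction of the logical data that did not need to be stored.

    Assumes dedup_factor = logical size / physically stored size.
    """
    if dedup_factor < 1.0:
        raise ValueError("dedup factor must be at least 1.0")
    return 1.0 - 1.0 / dedup_factor


def extra_over_unique(dedup_factor):
    """Duplicated data expressed relative to the unique (stored) data."""
    if dedup_factor < 1.0:
        raise ValueError("dedup factor must be at least 1.0")
    return dedup_factor - 1.0


if __name__ == "__main__":
    factor = 1.28
    print(f"saved fraction of logical data: {duplicate_fraction(factor):.1%}")
    print(f"duplicates relative to unique data: {extra_over_unique(factor):.1%}")
```

Under that assumption, a factor of 1.28 means about 22% of the logical data was redundant, or equivalently 28% extra on top of the unique data, which is the sense in which "almost a third" is closest.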

Re: Content based storage

2010-03-20 Thread Ric Wheeler
On 03/19/2010 10:46 PM, Boyd Waters wrote: 2010/3/17 Hubert Kario: Read further, Sun did provide a way to enable the compare step by using "verify" instead of "on": zfs set dedup=verify I have tested ZFS deduplication on the same data set that I'm using to test btrfs. I used a 5-eleme

Re: Content based storage

2010-03-19 Thread Boyd Waters
2010/3/17 Hubert Kario: > > Read further, Sun did provide a way to enable the compare step by using > "verify" instead of "on": > zfs set dedup=verify I have tested ZFS deduplication on the same data set that I'm using to test btrfs. I used a 5-element raidz, dedup=on, which uses SHA256 for ZFS
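
The difference between trusting the checksum and adding a compare step can be illustrated with a toy in-memory block store. This is a sketch only and says nothing about how ZFS implements it internally: the `BlockStore` class, its `verify` flag, and the pluggable `hash_fn` are hypothetical names invented for this example.

```python
import hashlib


class BlockStore:
    """Toy content-addressed block store (illustrative only)."""

    def __init__(self, verify=False, hash_fn=None):
        self.verify = verify
        # SHA-256 by default; a weaker hash_fn can be passed to
        # demonstrate what happens when two blocks collide.
        self.hash_fn = hash_fn or (lambda b: hashlib.sha256(b).digest())
        # checksum -> list of distinct block contents with that checksum
        self.blocks = {}

    def write(self, data):
        """Store a block and return a (checksum, index) reference to it."""
        key = self.hash_fn(data)
        bucket = self.blocks.setdefault(key, [])
        if not bucket:
            bucket.append(data)
            return (key, 0)
        if not self.verify:
            # Trust the checksum: assume the existing block is identical.
            return (key, 0)
        # Compare step: only share a block whose bytes really match.
        for i, existing in enumerate(bucket):
            if existing == data:
                return (key, i)
        bucket.append(data)
        return (key, len(bucket) - 1)

    def read(self, ref):
        key, index = ref
        return self.blocks[key][index]
```

With a deliberately weak `hash_fn` such as `lambda b: b[:1]`, two different blocks starting with the same byte collide: the store without the compare step hands back the first block for both references, while the verifying store keeps them apart at the cost of an extra byte-for-byte comparison on every checksum match.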

Re: Content based storage

2010-03-17 Thread Hubert Kario
On Wednesday 17 March 2010 16:33:41 Leszek Ciesielski wrote: > On Wed, Mar 17, 2010 at 4:25 PM, Hubert Kario wrote: > > On Wednesday 17 March 2010 09:48:18 Heinz-Josef Claes wrote: > >> Hi, > >> > >> just want to add one correction to your thoughts: > >> > >> Storage is not cheap if you think abou

Re: Content based storage

2010-03-17 Thread Leszek Ciesielski
On Wed, Mar 17, 2010 at 4:25 PM, Hubert Kario wrote: > On Wednesday 17 March 2010 09:48:18 Heinz-Josef Claes wrote: >> Hi, >> >> just want to add one correction to your thoughts: >> >> Storage is not cheap if you think about enterprise storage on a SAN, >> replicated to another data centre. Using

Re: Content based storage

2010-03-17 Thread Hubert Kario
On Wednesday 17 March 2010 09:48:18 Heinz-Josef Claes wrote: > Hi, > > just want to add one correction to your thoughts: > > Storage is not cheap if you think about enterprise storage on a SAN, > replicated to another data centre. Using dedup on the storage boxes leads > to performance issues an

Re: Content based storage

2010-03-17 Thread Heinz-Josef Claes
Hi, just want to add one correction to your thoughts: Storage is not cheap if you think about enterprise storage on a SAN, replicated to another data centre. Using dedup on the storage boxes leads to performance issues and other problems - only NetApp is offering this at the moment and it's no

Re: Content based storage

2010-03-17 Thread David Brown
On 17/03/2010 01:45, Hubert Kario wrote: On Tuesday 16 March 2010 10:21:43 David Brown wrote: Hi, I was wondering if there has been any thought or progress in content-based storage for btrfs beyond the suggestion in the "Project ideas" wiki page? The basic idea, as I understand it, is that a l

Re: Content based storage

2010-03-17 Thread David Brown
On 16/03/2010 23:45, Fabio wrote: Some years ago I was searching for that kind of functionality and found an experimental ext3 patch to allow the so-called COW-links: http://lwn.net/Articles/76616/ I'd read about the COW patches for ext3 before. While there is certainly some similarity here,

Re: Content based storage

2010-03-16 Thread Hubert Kario
On Tuesday 16 March 2010 10:21:43 David Brown wrote: > Hi, > > I was wondering if there has been any thought or progress in > content-based storage for btrfs beyond the suggestion in the "Project > ideas" wiki page? > > The basic idea, as I understand it, is that a longer data extent > checksum i

Re: Content based storage

2010-03-16 Thread Fabio
Some years ago I was searching for that kind of functionality and found an experimental ext3 patch to allow the so-called COW-links: http://lwn.net/Articles/76616/ There was a discussion later on LWN: http://lwn.net/Articles/77972/ An approach like COW-links would break POSIX standards. I am no