Re: [GSoC 2015] Btrfs content based storage

2015-04-14 Thread harshad shirwadkar
, 2015 at 10:47 AM, David Sterba dste...@suse.cz wrote: On Fri, Mar 27, 2015 at 10:58:42AM -0400, harshad shirwadkar wrote: I am a CS graduate student from Carnegie Mellon University. I am hoping to build the feature - Content based storage mode under Google Summer of Code 2015. This project has

Re: [GSoC 2015] Btrfs content based storage

2015-04-13 Thread David Sterba
On Fri, Mar 27, 2015 at 10:58:42AM -0400, harshad shirwadkar wrote: I am a CS graduate student from Carnegie Mellon University. I am hoping to build the feature - Content based storage mode under Google Summer of Code 2015. This project has also been listed as an idea on BTRFS ideas page

[GSoC 2015] Btrfs content based storage

2015-03-27 Thread harshad shirwadkar
Hello All, I am a CS graduate student from Carnegie Mellon University. I am hoping to build the feature - Content based storage mode under Google Summer of Code 2015. This project has also been listed as an idea on BTRFS ideas page. However, I have not found a mentor yet, and without a mentor I

Re: Content based storage

2010-03-20 Thread Ric Wheeler
On 03/19/2010 10:46 PM, Boyd Waters wrote: 2010/3/17 Hubert Kario h...@qbs.com.pl: Read further, Sun did provide a way to enable the compare step by using verify instead of on: zfs set dedup=verify pool I have tested ZFS deduplication on the same data set that I'm using to test btrfs.

Re: Content based storage

2010-03-20 Thread Boyd Waters
On Mar 20, 2010, at 9:05 AM, Ric Wheeler rwhee...@redhat.com wrote: My dataset reported a dedup factor of 1.28 for about 4TB, meaning that almost a third of the dataset was duplicated. It is always interesting to compare this to the rate you would get with old fashioned compression to see

Re: Content based storage

2010-03-20 Thread Ric Wheeler
On 03/20/2010 05:24 PM, Boyd Waters wrote: On Mar 20, 2010, at 9:05 AM, Ric Wheeler rwhee...@redhat.com wrote: My dataset reported a dedup factor of 1.28 for about 4TB, meaning that almost a third of the dataset was duplicated. It is always interesting to compare this to the rate you would

Re: Content based storage

2010-03-19 Thread Boyd Waters
2010/3/17 Hubert Kario h...@qbs.com.pl: Read further, Sun did provide a way to enable the compare step by using verify instead of on: zfs set dedup=verify pool I have tested ZFS deduplication on the same data set that I'm using to test btrfs. I used a 5-element raidz, dedup=on, which uses

Re: Content based storage

2010-03-17 Thread David Brown
On 16/03/2010 23:45, Fabio wrote: Some years ago I was searching for that kind of functionality and found an experimental ext3 patch to allow the so-called COW-links: http://lwn.net/Articles/76616/ I'd read about the COW patches for ext3 before. While there is certainly some similarity

Re: Content based storage

2010-03-17 Thread David Brown
On 17/03/2010 01:45, Hubert Kario wrote: On Tuesday 16 March 2010 10:21:43 David Brown wrote: Hi, I was wondering if there has been any thought or progress in content-based storage for btrfs beyond the suggestion in the Project ideas wiki page? The basic idea, as I understand

Re: Content based storage

2010-03-17 Thread Heinz-Josef Claes
think there is enough need. my 2 cents, Heinz-Josef Claes On Wednesday 17 March 2010 09:27:15 you wrote: On 17/03/2010 01:45, Hubert Kario wrote: On Tuesday 16 March 2010 10:21:43 David Brown wrote: Hi, I was wondering if there has been any thought or progress in content-based storage

Re: Content based storage

2010-03-17 Thread Hubert Kario
On Wednesday 17 March 2010 09:48:18 Heinz-Josef Claes wrote: Hi, just want to add one correction to your thoughts: Storage is not cheap if you think about enterprise storage on a SAN, replicated to another data centre. Using dedup on the storage boxes leads to performance issues and

Re: Content based storage

2010-03-17 Thread Leszek Ciesielski
On Wed, Mar 17, 2010 at 4:25 PM, Hubert Kario h...@qbs.com.pl wrote: On Wednesday 17 March 2010 09:48:18 Heinz-Josef Claes wrote: Hi, just want to add one correction to your thoughts: Storage is not cheap if you think about enterprise storage on a SAN, replicated to another data centre.

Re: Content based storage

2010-03-17 Thread Hubert Kario
On Wednesday 17 March 2010 16:33:41 Leszek Ciesielski wrote: On Wed, Mar 17, 2010 at 4:25 PM, Hubert Kario h...@qbs.com.pl wrote: On Wednesday 17 March 2010 09:48:18 Heinz-Josef Claes wrote: Hi, just want to add one correction to your thoughts: Storage is not cheap if you think about

Content based storage

2010-03-16 Thread David Brown
Hi, I was wondering if there has been any thought or progress in content-based storage for btrfs beyond the suggestion in the Project ideas wiki page? The basic idea, as I understand it, is that a longer data extent checksum is used (long enough to make collisions unrealistic), and merge
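
A minimal sketch of the idea described in this post, assuming SHA-256 as the "longer checksum" and an in-memory dictionary standing in for on-disk extents. The names `DedupStore`, `write`, and `dedup_factor` are hypothetical and are not part of btrfs or ZFS. The optional `verify` flag imitates the byte-for-byte compare step that the thread mentions for ZFS `dedup=verify`.

```python
import hashlib


class DedupStore:
    """Toy content-addressed extent store (hypothetical, in-memory only)."""

    def __init__(self, verify: bool = False) -> None:
        self.verify = verify      # compare bytes when a checksum already exists
        self.extents = {}         # digest -> stored extent bytes
        self.refs = {}            # digest -> number of logical references
        self.logical_bytes = 0    # bytes written by callers, before merging

    def write(self, data: bytes) -> bytes:
        """Store an extent and return its digest; identical extents are merged."""
        digest = hashlib.sha256(data).digest()
        existing = self.extents.get(digest)
        if existing is None:
            # First time this content is seen: keep one physical copy.
            self.extents[digest] = data
            self.refs[digest] = 1
        else:
            # Same checksum: optionally confirm the bytes really match.
            if self.verify and existing != data:
                raise RuntimeError("checksum collision detected")
            self.refs[digest] += 1
        self.logical_bytes += len(data)
        return digest

    def read(self, digest: bytes) -> bytes:
        return self.extents[digest]

    def physical_bytes(self) -> int:
        return sum(len(d) for d in self.extents.values())

    def dedup_factor(self) -> float:
        """Logical bytes divided by physical bytes (1.0 means no duplicates)."""
        physical = self.physical_bytes()
        return self.logical_bytes / physical if physical else 1.0


store = DedupStore(verify=True)
first = store.write(b"x" * 4096)
second = store.write(b"x" * 4096)   # merged with the first extent
third = store.write(b"y" * 4096)    # new content, stored separately
```

In this sketch, writing the same 4096-byte extent twice keeps a single physical copy and raises its reference count. A real filesystem would also need copy-on-write handling when one of the merged extents is modified, which is left out here.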

Re: Content based storage

2010-03-16 Thread Fabio
on Hard Disk and make SSD devices possible for storage (because of the space efficiency). -- Fabio David Brown wrote: Hi, I was wondering if there has been any thought or progress in content-based storage for btrfs beyond the suggestion in the Project ideas wiki page? The basic idea

Re: Content based storage

2010-03-16 Thread Hubert Kario
On Tuesday 16 March 2010 10:21:43 David Brown wrote: Hi, I was wondering if there has been any thought or progress in content-based storage for btrfs beyond the suggestion in the Project ideas wiki page? The basic idea, as I understand it, is that a longer data extent checksum is used