Re: [BackupPC-users] improving the deduplication ratio

2008-04-29 Thread dan
WHS is 'OK' as a backup server in a very small, Windows-only environment. It can only back up 10 machines, and only machines running Windows XP or later. Plus it has a major file corruption bug under load that Microsoft has yet to patch since its discovery in October 2007. On Mon, Apr 2

Re: [BackupPC-users] improving the deduplication ratio

2008-04-28 Thread Kenneth Porter
--On Wednesday, April 09, 2008 4:32 PM +0200 Ludovic Drolez <[EMAIL PROTECTED]> wrote: > I've seen, in some commercial backup systems with built-in > deduplication (which often run under Linux :-), that files are split into > 128k or 256k chunks prior to deduplication. Just to name one: Windows Home Server

Re: [BackupPC-users] improving the deduplication ratio

2008-04-16 Thread Les Mikesell
Ludovic Drolez wrote: > > Yes, and that's why a database would be more efficient than hard > links... > I've just read that Diligent ProtecTIER can store all the links for a > 1 PB pool (yes, petabytes!) in just 4 GB of memory. With the links in > memory, they can > achieve very high backup speeds
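A quick back-of-envelope check of that figure (the chunk sizes below are assumptions for illustration, not numbers from ProtecTIER's documentation):

    pool_bytes = 10**15        # 1 PB of backed-up data
    index_ram = 4 * 2**30      # 4 GB of memory for the link index

    for chunk_kib in (64, 256, 1024):
        chunks = pool_bytes // (chunk_kib * 1024)
        print(f"{chunk_kib:>5} KiB chunks: {chunks:.1e} entries, "
              f"{index_ram / chunks:.2f} bytes of index per entry")

Even at 1 MiB chunks this leaves only about 4.5 bytes of RAM per entry, far too little for a full cryptographic hash per chunk, so such a system presumably keeps a very compact (and possibly lossy) fingerprint in memory and confirms candidate matches against disk.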

Re: [BackupPC-users] improving the deduplication ratio

2008-04-16 Thread Ludovic Drolez
On Wed, Apr 16, 2008 at 03:23:43PM +0200, Tino Schwarze wrote: > While it would work (and I thought about that myself), I'm not keen on > having even more directories in the BackupPC file system. We're talking > about hundreds of thousands of *additional* files here if the chunk > size is, say, 64k
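To put rough numbers on that concern (the pool size below is an assumption for illustration, not a figure from the thread):

    unique_pool_bytes = 200 * 2**30   # say, 200 GiB of unique data in the pool

    for chunk_kib in (64, 128, 256):
        files = unique_pool_bytes // (chunk_kib * 1024)
        print(f"{chunk_kib:>3} KiB chunks -> {files:,} pool files")

At 64 KiB that is over three million pool entries for 200 GiB of unique data, so the inode and seek overhead Tino describes scales up quickly as the chunk size shrinks.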

Re: [BackupPC-users] improving the deduplication ratio

2008-04-16 Thread Ludovic Drolez
On Mon, Apr 14, 2008 at 02:31:30PM -0700, Michael Barrow wrote: > > Introducing file chunking would add a new abstraction layer - a > > file would need to be split into chunks and recreated for restore. > > Tino -- thanks for posting this. These issues are exactly what I had > in mind

Re: [BackupPC-users] improving the deduplication ratio

2008-04-14 Thread Les Mikesell
Ludovic Drolez wrote: > On Wed, Apr 09, 2008 at 10:12:09AM -0500, Les Mikesell wrote: >> I'd probably look at what rdiff-backup does with incremental differences >> and, instead of chunking everything, just track changes where the >> differences are small. > > Yes, but rdiff-backup has no pooling/deduplication

Re: [BackupPC-users] improving the deduplication ratio

2008-04-14 Thread Michael Barrow
On Apr 14, 2008, at 11:20 AM, Tino Schwarze wrote: > > Of course, you shouldn't underestimate the cost of managing a lot of > small files (my pool has about 5 million files, some of them are pretty > large), so the pool will have even more files, which means more seeking > and looking up files

Re: [BackupPC-users] improving the deduplication ratio

2008-04-14 Thread Tino Schwarze
On Mon, Apr 14, 2008 at 10:09:57AM +0200, Ludovic Drolez wrote: > > How long are you willing to have your backups and restores take? If > > you do more processing on the backed-up files, you'll take a greater > > Not true: > - working with fixed-size chunks may improve speed, because algorithms

Re: [BackupPC-users] improving the deduplication ratio

2008-04-14 Thread Ludovic Drolez
On Wed, Apr 09, 2008 at 10:12:09AM -0500, Les Mikesell wrote: > I'd probably look at what rdiff-backup does with incremental differences > and, instead of chunking everything, just track changes where the > differences are small. Yes, but rdiff-backup has no pooling/deduplication. With that feature
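For context, rdiff-backup's delta tracking builds on the rsync weak rolling checksum, which lets a scanner slide a one-block window across a file in O(1) per byte to find blocks that already exist elsewhere. A simplified sketch of that checksum (librsync's real implementation differs in detail):

    MOD = 1 << 16  # 16-bit halves, as in the classic rsync weak checksum

    def weak_checksum(block):
        """Return the two 16-bit halves (a, b) of the weak checksum."""
        a = sum(block) % MOD
        b = sum((len(block) - i) * x for i, x in enumerate(block)) % MOD
        return a, b

    def roll(a, b, out_byte, in_byte, block_len):
        """Slide the window forward one byte in O(1): drop out_byte, add in_byte."""
        a = (a - out_byte + in_byte) % MOD
        b = (b - block_len * out_byte + a) % MOD
        return a, b

    # Sanity check: rolling matches recomputing from scratch.
    data = b"the quick brown fox jumps over the lazy dog"
    n = 8
    a, b = weak_checksum(data[:n])
    for k in range(1, len(data) - n + 1):
        a, b = roll(a, b, data[k - 1], data[k + n - 1], n)
        assert (a, b) == weak_checksum(data[k:k + n])

This is what lets a chunk-oblivious scheme find matching regions even when data shifts by a few bytes, something fixed-offset 128k/256k chunking cannot do.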

Re: [BackupPC-users] improving the deduplication ratio

2008-04-14 Thread Ludovic Drolez
On Wed, Apr 09, 2008 at 06:11:58PM -0700, Michael Barrow wrote: > How long are you willing to have your backups and restores take? If > you do more processing on the backed-up files, you'll take a greater Not true: - working with fixed-size chunks may improve speed, because algorithms could

Re: [BackupPC-users] improving the deduplication ratio

2008-04-09 Thread Michael Barrow
>>> I've seen, in some commercial backup systems with built-in >>> deduplication (which often run under Linux :-), that files are >>> split into >>> 128k or 256k chunks prior to deduplication. >>> >>> It's nice to improve the deduplication ratio for big log files, mbox >>> files, binary DBs not often updated

Re: [BackupPC-users] improving the deduplication ratio

2008-04-09 Thread Les Mikesell
Tino Schwarze wrote: > >> I've seen, in some commercial backup systems with built-in >> deduplication (which often run under Linux :-), that files are split into >> 128k or 256k chunks prior to deduplication. >> >> It's nice to improve the deduplication ratio for big log files, mbox >> files, binary DBs

Re: [BackupPC-users] improving the deduplication ratio

2008-04-09 Thread Tino Schwarze
On Wed, Apr 09, 2008 at 04:32:13PM +0200, Ludovic Drolez wrote: > I've seen, in some commercial backup systems with built-in > deduplication (which often run under Linux :-), that files are split into > 128k or 256k chunks prior to deduplication. > > It's nice to improve the deduplication ratio for

[BackupPC-users] improving the deduplication ratio

2008-04-09 Thread Ludovic Drolez
Hi! I've seen, in some commercial backup systems with built-in deduplication (which often run under Linux :-), that files are split into 128k or 256k chunks prior to deduplication. It's nice to improve the deduplication ratio for big log files, mbox files, binary DBs not often updated, etc. Only the
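A minimal sketch of the chunked-pool idea described in this thread, in Python. All names here are illustrative; BackupPC's real pool stores and hard-links whole files, not chunks, and uses its own hashing scheme.

    import hashlib
    import os

    CHUNK_SIZE = 256 * 1024  # 256k, one of the chunk sizes mentioned above

    def chunk_hashes(path, chunk_size=CHUNK_SIZE):
        """Split a file into fixed-size chunks, yielding (offset, digest) pairs."""
        with open(path, "rb") as f:
            offset = 0
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                yield offset, hashlib.sha1(chunk).hexdigest()
                offset += len(chunk)

    def store_chunk(pool_dir, digest, data):
        """Store a chunk under its digest; skip the write if an identical chunk exists."""
        pool_path = os.path.join(pool_dir, digest[:2], digest)
        if not os.path.exists(pool_path):  # this existence test is the deduplication
            os.makedirs(os.path.dirname(pool_path), exist_ok=True)
            with open(pool_path, "wb") as f:
                f.write(data)
        return pool_path

Because only changed chunks get new pool entries, a big mbox or log file that grows at the end shares all of its earlier chunks with previous backups; the two-level digest[:2] fan-out simply keeps any single pool directory from growing huge.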