Pieter Wuille wrote at about 13:18:33 +0100 on Tuesday, December 1, 2009:
 > What you can do is count the allocated space for each directory and file, but
 > divide the numbers for files by (nHardlinks+1). This way you end up
 > distributing the size each file takes on disk over the different backups it
 > belongs to.
 > 
 > I have a script that does this; if there's interest I'll attach it. It does
 > take a day (wild guess, never accurately measured) to go over all pc/*
 > directories (Pool is 370.65GB comprising 4237093 files and 4369
 > directories)

I am surprised that it would take a day.

The only real cost should be that of doing a 'find' and a 'stat' on
the pc tree, which I would do in Perl so that I could do the
arithmetic in place (rather than having to use a *nix find -printf to
pass the data off to another program).
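
Something like the following is what I have in mind - a rough sketch
only, untested: the pc path is a guess at your TopDir, and you may
prefer the (nHardlinks+1) divisor you describe over plain st_nlink.

    #!/usr/bin/perl
    #
    # Sketch only: walk every pc/<host>/<nnn> tree, lstat each regular
    # file and credit size/nlink of it to that backup, so blocks shared
    # through the pool's hardlinks get split across the backups that
    # reference them.
    use strict;
    use warnings;
    use File::Find;

    my $topdir = shift @ARGV || '/var/lib/backuppc/pc';  # guess at TopDir/pc

    my %usage;      # "host/backupnum" => apportioned bytes
    my $current;    # backup currently being walked

    opendir(my $pc, $topdir) or die "can't open $topdir: $!";
    for my $host (grep { !/^\./ && -d "$topdir/$_" } readdir $pc) {
        opendir(my $hd, "$topdir/$host") or next;
        for my $num (grep { /^\d+$/ } readdir $hd) {
            $current = "$host/$num";
            find(\&tally, "$topdir/$host/$num");
        }
        closedir $hd;
    }
    closedir $pc;

    sub tally {
        my @st = lstat($_) or return;    # skip anything we can't lstat
        return unless -f _;              # regular files only, not symlinks
        my ($nlink, $size) = @st[3, 7];
        $usage{$current} += $size / ($nlink || 1);
    }

    printf "%-40s %14.0f\n", $_, $usage{$_} for sort keys %usage;

Point it at TopDir/pc and it should print one apportioned total per
host/backup; the error handling and host filtering are obviously
minimal.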

Unless you have a huge number of pc's and backups, I can't imagine
this would take more than a couple of hours since your total number of
unique files is only about 4 million.

Given that you only have 4 million unique files, you could even avoid
the multiple stats at the cost of that much memory by caching the
nlinks and size by inode number.
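
The cache would just be a tweaked version of the tally() callback in
the sketch above, something like this (again untested; %usage and
$current come from that walker):

    # Inode-keyed cache: each unique file's per-link share is computed
    # once and reused for every other hardlink to the same inode that
    # shows up in later backups.
    my %share;    # "dev:ino" => size / nlink

    sub tally_cached {
        my @st = lstat($_) or return;   # one lstat per entry, to get the inode
        return unless -f _;
        my ($dev, $ino, $nlink, $size) = @st[0, 1, 3, 7];
        my $key = "$dev:$ino";
        $share{$key} = $size / ($nlink || 1) unless exists $share{$key};
        $usage{$current} += $share{$key};
    }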

Can you post your script?
