John Drescher wrote:
I have seen this discussed on this list before, and I believe there are several problems on top of the small chance that a file will have the same size and same md5sum but different contents. One is: do we search for duplicates only in the current backup job or volume, or do we include other backups and other volumes? If we include other backups, how do we handle the case where a file from job X is on a volume from job Y because of a duplicate, and now some user has purged the volume that contains job Y?

John

------------------------------------------------------------------------- Using Tomcat but need to do more? Need to support web services, security? Get stuff done quickly with pre-integrated technology to make your job easier Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642

_______________________________________________ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users
My opinion is that, as a backup solution, we could check for duplicates only within one volume, although one volume could hold data from more than one file daemon.
Otherwise we will have problems with volume retention, and this is true mainly for removable storage (tapes, external drives...). With file system storage we could use an algorithm similar to CDP to limit the number of copies held in storage, or their age, and because a file system is randomly accessible and always available, it would be easy to copy data.
Or, if database-type storage is used (why not?), we could just create and delete links to a row.
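
To make the "links to a row" idea concrete, here is a rough sketch of how it might look, assuming the file data lives in a database row keyed by its checksum and each backup that refers to it counts as one link. The table name `blob`, the `refs` column, and the helper functions are all hypothetical and are not part of Bacula.

```python
import sqlite3

def open_store():
    # Hypothetical layout: one row per distinct file content, plus a link count.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE blob ("
        " checksum TEXT PRIMARY KEY,"
        " data BLOB NOT NULL,"
        " refs INTEGER NOT NULL)"
    )
    return db

def add_link(db, checksum, data):
    # Store the data once; later calls with the same checksum only add a link.
    cur = db.execute(
        "UPDATE blob SET refs = refs + 1 WHERE checksum = ?", (checksum,)
    )
    if cur.rowcount == 0:
        db.execute(
            "INSERT INTO blob (checksum, data, refs) VALUES (?, ?, 1)",
            (checksum, data),
        )

def drop_link(db, checksum):
    # Remove one link; the row itself goes away only when no links remain.
    db.execute("UPDATE blob SET refs = refs - 1 WHERE checksum = ?", (checksum,))
    db.execute("DELETE FROM blob WHERE checksum = ? AND refs <= 0", (checksum,))

def link_count(db, checksum):
    row = db.execute(
        "SELECT refs FROM blob WHERE checksum = ?", (checksum,)
    ).fetchone()
    return row[0] if row else 0
```

With something like this, purging one job only drops its links, so a file that another job still refers to stays in storage until the last link is gone.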

As for searching for duplicates, it would just be a matter of comparing checksums. This could be done fast in SQL with B-tree indexes (I think), and if a match is found the file is not transmitted over the network, only the relevant info (filename, location, permissions, etc.).
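
A minimal sketch of that lookup, assuming a catalog table indexed on volume, checksum, and size. The table `stored_file`, the function `backup_file`, and the returned dictionaries are made up for illustration and do not reflect the real Bacula catalog schema. SQLite is used here only because its indexes are B-trees and it needs no server.

```python
import hashlib
import sqlite3

def open_catalog():
    # Hypothetical catalog: one row per file already written to a volume.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE stored_file ("
        " volume TEXT NOT NULL,"
        " size INTEGER NOT NULL,"
        " md5 TEXT NOT NULL,"
        " path TEXT NOT NULL)"
    )
    # B-tree index so the duplicate check is a keyed lookup, not a table scan.
    db.execute(
        "CREATE INDEX stored_file_lookup ON stored_file (volume, md5, size)"
    )
    return db

def backup_file(db, volume, path, content, mode):
    # Search for a duplicate on the same volume only, matching checksum and size.
    md5 = hashlib.md5(content).hexdigest()
    hit = db.execute(
        "SELECT path FROM stored_file"
        " WHERE volume = ? AND md5 = ? AND size = ? LIMIT 1",
        (volume, md5, len(content)),
    ).fetchone()
    if hit is not None:
        # Duplicate found: only the metadata would cross the network.
        return {"send": "metadata", "path": path, "mode": mode, "same_as": hit[0]}
    db.execute(
        "INSERT INTO stored_file (volume, size, md5, path) VALUES (?, ?, ?, ?)",
        (volume, len(content), md5, path),
    )
    return {"send": "data", "path": path, "mode": mode, "bytes": len(content)}
```

Restricting the query to one volume keeps the retention problem John describes out of the picture, at the cost of missing duplicates that sit on other volumes. Matching on checksum and size still leaves the small collision risk he mentions.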

-- 
Hristo Benev
IT Manager

WAVEROAD
Partners in Telecommunications

514-935-2020 x225 T
514-935-1001 F
www.waveroad.ca
[EMAIL PROTECTED]
