Hi,

I use a similar algorithm for optimizing document storage. It's pretty simple, actually: just trawl through all directories recursively and store a record for each file. You just need the path and the file hash, which you can create with:
DOCUMENT TO BLOB($t_DocPath;$x_Content)  // load the file into a BLOB
$t_FileHash:=Generate digest($x_Content;MD5 digest)  // MD5 hash of the file contents
SET BLOB SIZE($x_Content;0)  // release the BLOB's memory

Then just check for unique hashes and voilà! Using the hash will also find identical files that have different filenames. The chances of two different files accidentally generating the same hash are so close to 0 that they are for all practical purposes 0.

Now write something that moves unique data somewhere else or deletes duplicates. The whole thing is quickly written, I guess some 100 lines of code. 120 with progress bars :-)

As for running it, well, that will take some time. Don’t do it on your main work machine; it might be tied up for a while.

Hope that helped.

Cheers
Alex

> On 14.03.2017 at 07:56, Robert ListMail via 4D_Tech
> <4d_tech@lists.4d.com> wrote:
>
> I need a utility that can scan a backup drive (or index) and identify what’s
> unique to the backup volume without expecting identical pathnames on the
> other drives... So, the routine would have to query (effectively a Finder
> Search for each file) all specified drives looking for each file and
> reporting those that are missing... Basically, I need to know which data on
> this given backup drive is truly unique and therefore potentially valuable.
>
> Might there be a 4D solution? Have you dealt with large directories or many
> directories from the file system? If there is a utility already built I’m
> open to that as well.
>
> Thanks,
>
> Robert
> **********************************************************************
> 4D Internet Users Group (4D iNUG)
> FAQ: http://lists.4d.com/faqnug.html
> Archive: http://lists.4d.com/archives.html
> Options: http://lists.4d.com/mailman/options/4d_tech
> Unsub: mailto:4d_tech-unsubscr...@lists.4d.com
> **********************************************************************
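
The approach described in the reply above (walk every directory, hash each file, then compare hashes rather than pathnames) can be sketched roughly as follows. This is only an illustration in Python, not 4D and not the code from the reply: the function names file_md5, index_tree and unique_to_backup are made up for this example, and the files are read in chunks instead of being loaded whole into a BLOB, which is an assumption made here to keep memory use low on large files.

```python
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_tree(root):
    """Walk root recursively; map each content hash to the paths having it."""
    index = defaultdict(list)
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                index[file_md5(path)].append(path)
    return index


def unique_to_backup(backup_root, other_roots):
    """Return paths under backup_root whose content is found in no other root."""
    seen_elsewhere = set()
    for root in other_roots:
        # Only the hashes (the dict keys) matter for the comparison.
        seen_elsewhere.update(index_tree(root))
    backup_index = index_tree(backup_root)
    return sorted(
        path
        for file_hash, paths in backup_index.items()
        if file_hash not in seen_elsewhere
        for path in paths
    )
```

Called as unique_to_backup("/Volumes/Backup", ["/Volumes/Main"]) (both paths are placeholders), it would list the files whose content exists only on the backup volume, regardless of filename or location. Unreadable files are not handled in this sketch and would raise an error.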