On Wed, 13 Jan 2010 at 12:44:47PM +0000, Dermot wrote:
> Hi,
>
> I have a lot of PDFs that I need to catalogue and I want to ensure
> the uniqueness of each PDF. At LWP, Jonathan Rockway mentioned
> something similar with SHA1 and binary files. Am I right in thinking
> that the code below is only taking the SHA on the name of the file and
> if I want to ensure uniqueness of the content I need to do something
> similar but as a file blob?
>
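
On the question itself: if the digest is computed from the path string,
then yes, only the name is being hashed, and two copies of the same PDF
under different names will get different digests. Below is a minimal
sketch of hashing the contents instead. It assumes Digest::SHA is
installed (it ships with recent Perls); file_sha1 and group_by_digest
are names made up for this example, not anything from your code.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::SHA;

# Return the SHA-1 hex digest of a file's contents (not of its name).
sub file_sha1 {
    my ($path) = @_;
    my $sha = Digest::SHA->new(1);    # 1 selects SHA-1
    $sha->addfile($path, 'b');        # 'b' reads the file in binary mode
    return $sha->hexdigest;
}

# Group paths by content digest; returns a hash ref of digest => [paths].
sub group_by_digest {
    my (@paths) = @_;
    my %seen;
    push @{ $seen{ file_sha1($_) } }, $_ for @paths;
    return \%seen;
}

# Report any files named on the command line that share a digest.
my $groups = group_by_digest(@ARGV);
for my $digest (sort keys %$groups) {
    my @files = @{ $groups->{$digest} };
    print "$digest: @files\n" if @files > 1;
}
```

Saved under a name of your choosing and run with the PDFs as arguments,
this prints one line per group of files whose contents hash to the same
value. Files that never appear in the output had a digest no other file
shared.
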

Have a look here:

http://en.wikipedia.org/wiki/Fdupes

There are links to Perl examples that do SHA de-duplication.

-- 
Adam Trickett
Overton, HANTS, UK

A bank is a place where they lend you an umbrella in fair weather
and ask for it back when it begins to rain.
    -- Robert Frost