Hi Peter,

> My thoughts were to do an 'Ls -something' piped into a file, then
> perhaps if I could do a sort on that from the end of each line, I
> would end up with duplicates adjacent, which I could then investigate
> and clean up as required.
What do you mean by `duplicate'?  If you mean you want to group all
files called `foo' together, regardless of their possibly differing
size or content, then

    find foo bar -type f -printf '%h %f\n' | rev | sort | rev | uniq -f 1 -D

will list the directory path and filename for all files under `foo'
and `bar' that occur more than once by name; README, for example, is a
prime contender.  It won't work well with paths or filenames containing
spaces or other weird characters, but then that's why you shouldn't
have them.  :-)

On the other hand, if you want to find files that almost certainly have
the same content, regardless of their file names, then

    find foo bar -type f -print0 | xargs -r0 sha1sum | sort | uniq -D -w 40

lists those.  Note that it won't realise that two files may be hard or
symbolically linked together.

Cheers, Ralph.

-- 
Next meeting:  Bournemouth, Wednesday 2009-08-05 20:00
Dorset LUG:    http://dorset.lug.org.uk/
Chat:          http://www.mibbit.com/?server=irc.blitzed.org&channel=%23dorset
List info:     https://mailman.lug.org.uk/mailman/listinfo/dorset
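P.S.  In case it helps, here's a throw-away demonstration of the
by-name pipeline that one might run in an empty scratch directory; all
the directory names, file names and contents below are invented for
the example.

    # Two READMEs in different directories should be reported as
    # duplicates by name; unique.txt has a one-off name and shouldn't.
    mkdir -p foo bar/sub
    echo one   >foo/README
    echo two   >bar/sub/README
    echo alone >bar/unique.txt

    # %h = directory, %f = filename.  rev|sort|rev sorts each line from
    # its right-hand end so identical filenames become adjacent; then
    # uniq -f 1 skips the directory field and -D prints every line
    # whose filename occurs more than once.
    find foo bar -type f -printf '%h %f\n' | rev | sort | rev | uniq -f 1 -D
    # bar/sub README
    # foo README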
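P.P.S.  And the same sort of scratch-directory demonstration for the
by-content pipeline, again with made-up names and contents:

    # a.txt and b.txt share identical bytes; c.txt differs.
    mkdir -p foo bar
    printf 'same\n' >foo/a.txt
    printf 'same\n' >bar/b.txt
    printf 'different\n' >bar/c.txt

    # sha1sum prints "<40-hex-digest>  <path>"; sorting groups equal
    # digests together and uniq -D -w 40 prints every line whose first
    # 40 characters (the digest) occur more than once.
    find foo bar -type f -print0 | xargs -r0 sha1sum | sort | uniq -D -w 40
    # Prints the two lines for foo/a.txt and bar/b.txt (same digest);
    # bar/c.txt hashes differently and is omitted.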