On Tue, 16 Aug 2011, Hal Pomeranz wrote:

> sort -u -k1,4 inputfile >inputfile.de-duped
Wow! I have learned so much today about built-in tools that solve major
headaches in data cleaning. Running the results of uniq through sort (as
above, but specifying -k2,4) took the file from 8605 rows to 5540, and
that is down from 12,500+ originally.

My grateful thanks to all of you. Not only have I learned valuable uses
of common tools, but you've saved me days of work.

Rich
_______________________________________________
PLUG mailing list
PLUG@lists.pdxlinux.org
http://lists.pdxlinux.org/mailman/listinfo/plug
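
For anyone reading this thread later, here is a minimal sketch of the key-based de-duplication described above. The file names and the sample rows are made up for illustration (they are not the real data), and the layout is assumed to be whitespace-separated, with an ID in field 1 and the values to compare in fields 2 through 4.

```shell
# Hypothetical sample data: field 1 is a record ID, fields 2-4 are the
# values that decide whether two rows count as duplicates.
cat > sample.txt <<'EOF'
101 oak 2011-08-01 4.2
102 oak 2011-08-01 4.2
103 fir 2011-08-02 3.9
104 oak 2011-08-03 4.2
105 fir 2011-08-02 3.9
EOF

# -k2,4 restricts the sort key to fields 2 through 4, so the ID is ignored.
# -u keeps only one line from each group of lines whose keys compare equal.
sort -u -k2,4 sample.txt > sample.de-duped

# Count the surviving rows: three distinct keys remain out of five lines.
wc -l < sample.de-duped
```

Plain uniq compares whole lines, so it would treat rows 101 and 102 as different because their IDs differ. Limiting the key with -k is what lets sort -u collapse them. Which row of each duplicate group is kept should probably not be relied on, so this approach fits best when the ignored fields do not matter.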