Rich Shepard wrote:
> On Tue, 16 Aug 2011, Rich Shepard wrote:
>
>> This will work for all completely duplicated lines. I'll need to see how
>> many remain that vary in one or more columns ('fields') such as the
>> parameter, lab_id number, or qa_qc.
>
> I had manually cleaned up a bunch of lines so the source file had 12,119
> lines. After running uniq the output file has 8,605 lines, about 1/3 fewer.
>
> The need to remove almost duplicates/triplicates, based on the same values
> in three columns regardless of the rest of the contents, remains.

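For the "same values in three columns" case, here is a rough sketch of one
way it could be done, keeping the first line seen for each combination of
key values. Everything specific in it is an assumption, since the thread
has not yet said which columns matter: the "|" delimiter, the key column
positions (0, 1, 2), the function name and the sample rows are all made up
for illustration.

```python
import csv


def drop_near_duplicates(lines, key_columns, delimiter="|"):
    """Keep only the first row seen for each combination of key-column values.

    key_columns holds 0-based column indexes (hypothetical here); all other
    columns are ignored when deciding whether two rows count as duplicates.
    A row with fewer columns than the largest index raises IndexError.
    """
    seen = set()
    kept = []
    for row in csv.reader(lines, delimiter=delimiter):
        if not row:  # skip blank lines
            continue
        key = tuple(row[i] for i in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Made-up sample: the first two rows differ only in the last column.
sample_rows = [
    "site1|2011-01-05|Arsenic|0.01|lab7",
    "site1|2011-01-05|Arsenic|0.01|lab9",
    "site2|2011-01-05|Arsenic|0.02|lab7",
]

result = drop_near_duplicates(sample_rows, key_columns=(0, 1, 2))
for row in result:
    print("|".join(row))
```

Unlike uniq, this does not need the input sorted, but it does hold one key
per distinct combination in memory, which should be fine at around 12,000
lines. Which of the near-duplicates ought to survive is a separate question
that depends on the data.
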
Can we see another snapshot of the data, to see where the non-duplicate
values are? And (did I miss it?) which three columns?

Rod
--
_______________________________________________
PLUG mailing list
PLUG@lists.pdxlinux.org
http://lists.pdxlinux.org/mailman/listinfo/plug