On Thursday 13 September 2007 20:35, Roland Smith wrote:
> On Thu, Sep 13, 2007 at 10:16:40AM -0700, Kurt Buff wrote:
> > I'm trying to do some text file manipulation, and it's driving me nuts.
[snip]
> > I've looked at sort and uniq, and I've googled a fair bit but can't
> > seem to find anything that would do this.
> >
> > I don't have the perl skills, though that would be ideal.
> >
> > Any help out there?
>
> #!/usr/bin/perl
> while (<>) {
>     # Assuming no whitespace in addresses; kill everything after the first
>     # space
>     s/ .*$//;
>     # Store the name & count in a hash
>     $names{$_}++;
> }
> # Go over the hash
> while (($name,$count) = each(%names)) {
>     if ($count == 1) {
>         # print unique names.
>         print $name, "\n";
>     }
> }

Another approach in Perl would be:

#!/usr/bin/perl
my (%names, %dups);
while (<>) {
    my ($key) = split;
    $dups{$key} = 1 if $names{$key};
    $names{$key} = 1;
}
delete @names{keys %dups};
#
# keys %names is now an unordered list of only non-repeated elements
# keys %dups is an unordered list of only repeated elements

With no arguments, split splits $_ on whitespace and returns a list of
fields, which can be assigned to a list of variables. Here we only want
to capture the first field: split is simpler for this than using a
regex, and it also drops the trailing newline. The first occurrence of
$key is in parens because it's actually a list of one variable name;
without the parens the assignment would be in scalar context and $key
would get the number of fields instead of the first field.

We build two hashes: one, %names, keyed by the original names (this is
the classic way to reduce duplicates to single occurrences, since the
duplicated keys overwrite the originals), and one, %dups, whose keys are
names that were already in %names when we saw them again - the
duplicated entries.

Having done that, we use a hash slice to delete from %names all the keys
of %dups. That leaves the keys of %names holding all the entries which
appear only once (and the keys of %dups holding all the duplicated
entries, if that's useful).

Jonathan
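
PS: For anyone who wants to try the hash-slice approach above without
preparing an input file, here is a minimal self-contained sketch. The
split_unique helper and the sample addresses are made up for
illustration and are not from the original thread. It also skips blank
lines and sorts the results, which the script above does not do.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes a list of lines and returns two array refs,
# the first fields seen exactly once and the first fields seen repeatedly.
sub split_unique {
    my @lines = @_;
    my (%names, %dups);
    for my $line (@lines) {
        my ($key) = split ' ', $line;    # first whitespace-separated field
        next unless defined $key;        # skip blank lines
        $dups{$key} = 1 if $names{$key}; # seen before, so it is a repeat
        $names{$key} = 1;
    }
    delete @names{ keys %dups };         # hash slice: drop every repeated key
    return ([ sort keys %names ], [ sort keys %dups ]);
}

# Made-up sample input, for illustration only
my @sample = (
    'alice@example.com Mon',
    'bob@example.com Tue',
    'alice@example.com Wed',
    'carol@example.com Thu',
);

my ($unique, $repeated) = split_unique(@sample);
print "unique:   @$unique\n";
print "repeated: @$repeated\n";
```

With the sample above, the unique list should hold the bob and carol
addresses and the repeated list the alice address. To read a real file,
the same loop body works inside while (<>) as in the script above.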