My bbdb has gotten unwieldy and ugly-looking, so I'd like to clean it up.

The most frustrating bit of ugliness involves the "newsgroups" field in my database. My goal is to have a return-delimited, sorted, unique list of newsgroups for each poster in my database, but the real data looks ugly. I wrote a perl function to fix this (below), but I can't just apply it blindly to my database because it'll mess up other fields. Ideally, I'd like new entries in the newsgroups field to be return-delimited, sorted, and unique from the start, but the perl script solution would be great if I knew how to say "only abuse *this* field in each line" to my script.

Another bit of ugliness involves duplicate entries. I have a few sets of duplicate entries for people who have dynamic email addresses. Now that I have the ability to deal intelligently with those addresses as they arise, I need to deal with the ones that have already slipped through my fingers. I don't want duplicate entries to be deleted, just flagged and displayed, so I can massage them by hand.

It'd also be nice to say "display any entries that haven't come up in the past two years".

With all that done, my database would be trim and neat and probably a whole lot speedier. Someone out there must be doing some of these things, if not all. Help?

Jack.
-- 
Jack Twilley
jmt at twilley dot org
http colon slash slash www dot twilley dot org slash tilde jmt slash

#!/usr/bin/perl -w

sub fixit ($) {
    my($in) = @_;

    # strip spaces
    $in =~ s/ //g;

    # replace newlines with commas
    $in =~ s/\n/,/g;

    # split on commas, sort, remove duplicates, and join with a
    # literal backslash-n (the two characters \n, not a real newline)
    return join "\\n", sort keys %{ { map {$_ => 1} split /,/, $in } };
}

print STDOUT &fixit("alpha,charlie, beta\ncharlie") . "\n";
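
One way to say "only abuse *this* field in each line" might be a substitution with the /e modifier, which lets the replacement be computed by a function while everything outside the match is left untouched. The following is only a sketch under some assumptions that should be checked against the real file: that each record sits on one line, that the field appears as (newsgroups . "..."), that newlines inside the value are written as the two characters \n, and that the value contains no double quotes. The names fix_newsgroups and fix_line, and the sample record, are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same idea as fixit(), but for text as it is ASSUMED to appear in the
# database file, where a newline is written as backslash followed by n.
sub fix_newsgroups {
    my ($in) = @_;
    $in =~ s/\\n/,/g;    # literal \n separators become commas
    $in =~ s/\s+//g;     # strip all whitespace
    my %seen = map { $_ => 1 } grep { length } split /,/, $in;
    return join "\\n", sort keys %seen;
}

# Rewrite only the value of the newsgroups field; the rest of the line
# passes through unchanged.
# ASSUMPTION: the field looks like (newsgroups . "...") and the value
# holds no double quotes.
sub fix_line {
    my ($line) = @_;
    $line =~ s{(\(newsgroups \. ")([^"]*)(")}{
        my ($pre, $val, $post) = ($1, $2, $3);
        $pre . fix_newsgroups($val) . $post;
    }ge;
    return $line;
}

# Hypothetical sample record, not taken from a real database.
my $sample = '["Jack" "Twilley" nil nil nil nil ("jmt@example.org") '
           . '((newsgroups . "alpha,charlie, beta\ncharlie") '
           . '(notes . "met at a, b")) nil]';

print fix_line($sample), "\n";
# prints the record with (newsgroups . "alpha\nbeta\ncharlie") and the
# notes field exactly as it was
```

To try this on a copy of the database rather than the sample, the last statement could be replaced by a loop such as `print fix_line($_) while <>;` with the output redirected to a new file. Working on a copy seems prudent, since the layout assumed above may not match every record.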