My bbdb has gotten unwieldy and ugly-looking, so I'd like to clean
it.

The most frustrating bit of ugliness involves the "newsgroups" field
in my database.  My goal is to have a return-delimited sorted unique
list of newsgroups for each poster in my database, but the real data
looks ugly.  I wrote a perl function to fix this, but I can't just
randomly apply it to my database because it'll mess up other fields.
Ideally, I'd like new entries in the newsgroups field to be
return-delimited and sorted and unique, but the perl script solution
would be great if I knew how to say "only abuse *this* field in each
line" to my script.
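To make the "only this field" idea concrete, here is a rough sketch.
It assumes the database file holds one record per line and that the
field is stored as a pair of the form (newsgroups . "..."), with
newlines inside the string written as a literal backslash-n.  Both are
assumptions about the file layout, and fix_groups and fix_line are
made-up names for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sort and de-duplicate a list of group names separated by commas,
# real newlines, or literal backslash-n sequences.
sub fix_groups {
    my ($in) = @_;
    $in =~ s/ //g;              # strip spaces
    $in =~ s/\\n|\n/,/g;        # escaped or real newline -> comma
    my %seen = map { $_ => 1 } grep { length } split /,/, $in;
    return join "\\n", sort keys %seen;   # join with literal backslash-n
}

# Rewrite only the (newsgroups . "...") pair in one record line.
# Assumed layout; a value containing an escaped quote would not match.
sub fix_line {
    my ($line) = @_;
    $line =~ s/\(newsgroups \. "([^"]*)"\)/'(newsgroups . "' . fix_groups("$1") . '")'/ge;
    return $line;
}

# Hypothetical sample record, not real data.
my $sample = '["Jack" "Twilley" nil nil nil nil ("jmt@example.org") '
           . '((newsgroups . "alpha,charlie, beta\ncharlie") '
           . '(timestamp . "2001-05-01")) nil]';
print fix_line($sample), "\n";
```

To run it over a whole file, the last three statements could be
replaced with a loop such as "print fix_line($_) while <>;", working on
a copy of the database rather than the original.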

Another bit of ugliness involves duplicate entries.  I have a few sets
of duplicate entries for people who have dynamic email addresses.  Now
that I have the ability to intelligently deal with those addresses as
they arise, I need to deal with the ones that have since slipped
through my fingers.  I don't want duplicate entries to be deleted,
just flagged and displayed, so I can massage them by hand.
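As a sketch of the "flag, don't delete" idea: group the records by
name and show only the names that occur more than once.  This assumes
one record per line with the first and last name as the first two
fields of each record (either a quoted string or nil); find_duplicates
is a made-up name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Group record lines by (first, last) name, case-insensitively, and
# return a hash of name => [lines] for names seen more than once.
# Nothing is deleted or modified.
sub find_duplicates {
    my @lines = @_;
    my %by_name;
    for my $line (@lines) {
        # Assumed layout: ["First" "Last" ...] with nil for a missing name.
        next unless $line =~ /^\[("[^"]*"|nil) ("[^"]*"|nil) /;
        my $key = lc "$1 $2";
        next if $key eq 'nil nil';          # no name to compare on
        push @{ $by_name{$key} }, $line;
    }
    return map { $_ => $by_name{$_} }
           grep { @{ $by_name{$_} } > 1 } keys %by_name;
}

# Hypothetical sample records, not real data.
my @records = (
    '["Jack" "Twilley" nil nil nil nil ("jmt@host1.example.org") nil nil]',
    '["Some" "Body" nil nil nil nil ("sb@example.com") nil nil]',
    '["Jack" "Twilley" nil nil nil nil ("jmt@host2.example.org") nil nil]',
);

my %dups = find_duplicates(@records);
for my $name (sort keys %dups) {
    print "Possible duplicates for $name:\n";
    print "  $_\n" for @{ $dups{$name} };
}
```

Matching on name alone will also flag two different people who share
a name, which seems acceptable when the results are reviewed by hand.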

It'd also be nice to say "display any entries that haven't come up in
the past two years".  With all that done, my database would be trim
and neat and probably a whole lot speedier.
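The "past two years" check could be sketched like this, assuming each
record carries a pair of the form (timestamp . "YYYY-MM-DD") that is
updated when the entry comes up.  That layout is an assumption, and
stale_records and cutoff_date are made-up names.  Dates in that format
sort correctly as plain strings, so a string comparison is enough.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Return today's date minus two years as YYYY-MM-DD.
sub cutoff_date {
    my @t = localtime;
    $t[5] -= 2;                       # element 5 is the year field
    return strftime('%Y-%m-%d', @t);
}

# Return the record lines whose timestamp is older than $cutoff.
# Records with no timestamp pair are skipped, not reported.
sub stale_records {
    my ($cutoff, @lines) = @_;
    return grep {
        /\(timestamp \. "(\d{4}-\d\d-\d\d)"\)/ && $1 lt $cutoff
    } @lines;
}

# Hypothetical sample records, not real data.
my @records = (
    '["Old" "Entry" nil nil nil nil nil ((timestamp . "1998-03-14")) nil]',
    '["No" "Stamp" nil nil nil nil nil nil nil]',
);

print "$_\n" for stale_records(cutoff_date(), @records);
```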

Someone out there must be doing some of these things, if not all.
Help?

Jack.
-- 
Jack Twilley
jmt at twilley dot org
http colon slash slash www dot twilley dot org slash tilde jmt slash
#!/usr/bin/perl -w

sub fixit ($) {
  my($in) = @_;

  # strip spaces
  $in =~ s/ //g;

  # replace newlines with commas
  $in =~ s/\n/,/g;

  # split on commas, sort, remove duplicates, and join with a literal
  # backslash-n ("\\n" is the two characters \ and n, not a real newline)
  return join "\\n", sort keys %{ { map {$_ => 1} split /,/, $in } };
}

print fixit("alpha,charlie, beta\ncharlie") . "\n";
