Jeff 'Japhy' Pinyan wrote:
>
> On Jul 13, David Storrs said:
>
> >Given this code:
> >
> >    my $TOKEN_DELIMS = qr/[^\w\-'\$]/;
> >    my $text = # string containing the contents of an mbox file
> >
> >    my @tokens = split /$TOKEN_DELIMS/o, $text;
> >
> >I end up with a large number of null strings ('') in @tokens.  After
> >RTFMing on split, I assume this is because I am matching at the
> >beginning/end of the string.  I can't quite grok how to eliminate
> >these, however.
>
> Well, your split pattern is only matching ONE token delimiter at a time.
> That means on a string like "this!@#that", you're going to get back the
> elements "this", "", "", "that", because there's an empty string in
> between the ! and @, as well as between the @ and #.  I think you want to
> change your regex to /$TOKEN_DELIMS+/o instead.
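A minimal sketch of that difference, assuming a made-up sample string ("this!@#that", chosen to match the elements listed above, not taken from a real mbox file). The variable names $one, $many, @single, @run and @clean are hypothetical and exist only for this illustration:

```perl
use strict;
use warnings;

# Character class from the original post: anything that is not a word
# character, hyphen, apostrophe, or dollar sign counts as a delimiter.
my $one  = qr/[^\w\-'\$]/;     # matches exactly one delimiter
my $many = qr/[^\w\-'\$]+/;    # matches a whole run of delimiters

# Hypothetical sample input (single-quoted so @ is not interpolated).
my $text = 'this!@#that';

my @single = split $one,  $text;   # empty fields between !, @ and #
my @run    = split $many, $text;   # the run !@# is one separator

# A delimiter at the very start of the string still produces one
# leading empty field, even with +, so filter on length if needed.
my @clean = grep { length } split $many, "  $text";

print scalar(@single), " ", scalar(@run), " ", scalar(@clean), "\n";
```

With this input the single-delimiter pattern should give four fields (two of them empty), while the + version gives two. The grep { length } step is only one possible way to drop a leading empty field; split already discards trailing empty fields by default.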
And you are not going to need the /o option, as qr// already compiles
the regular expression.

    # \- and \$ are kept escaped, as in the original character class,
    # so that both are taken literally.
    my $TOKEN_DELIMS = qr/[^\w\-'\$]+/;
    my $text = # string containing the contents of an mbox file

    my @tokens = split /$TOKEN_DELIMS/, $text;


John
-- 
use Perl;
program
fulfillment