Re: \b confusion

Randy W. Sims Fri, 18 Jun 2004 14:24:08 -0700

After a long day, a long night, and another long day, I spew crap. And no one catches me. A couple of hours later I lie down to finally get some sleep and !! I realize I screwed up. I try to shrug it off, to no avail. I must respond...

Randy W. Sims wrote:

[EMAIL PROTECTED] wrote:
According to the principle of \b why is this doing this?
$word = "(HP)";
$word =~ s/[,\]\)\}]\b//;
$word =~ s/\b[,\]\)\}]//;
Since the parentheses are on either side of the boundary, it should take off both of them. Instead the result is: $word = "(HP". It only took off the end paren.
When I used "(HP)," the result is "(HP,"?
A second question. If I want to get rid of any non-numeric and non-alphabetic characters before and after a word, but not what's in the middle of the word (like apostrophes, dashes), what's the simplest way that works?
\b does not match at the end or the beginning of a string, so you need something like (?:^|\b) to match beginning and (?:\b|$) to match ending. The (?:...) construct is a non-capturing match.

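This is just wrong. Don't know where it came from, but it is absolutely wrong: \b does match at the beginning and at the end of a string, as long as the character there is a word character. If ever you're not sure about what an operator matches, break it down to a workable example: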

perl -e '$_="(HP)";s/\b/./g;print "$_\n"'
=> (.HP.)

perl -e '$_="HP";s/\b/./g;print "$_\n"'
=> .HP.
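The same trick can be wrapped in a small script to see where \b lands in the strings from the original question. This is only a sketch; mark_boundaries is a made-up helper name, not something from the thread or from any module.

```perl
use strict;
use warnings;

# Hypothetical helper: put a dot at every position where \b matches.
sub mark_boundaries {
    my ($s) = @_;
    $s =~ s/\b/./g;
    return $s;
}

# \b sits between a word character (letter, digit, underscore) and
# anything else, including the start or end of the string.
print mark_boundaries($_), "\n" for '(HP)', 'HP', '(HP),';
# (.HP.)
# .HP.
# (.HP.),
```

Note that there is no boundary between ")" and "," or between ")" and the end of the string, since both sides are non-word.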

Another problem is that in the case of the string '(HP),', you want to match multiple sequential occurrences, so you need to specify that in the regex with a '+' following your character class.

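Also, in a character class you don't need to escape any of these characters except the square brackets (a backslash, a leading ^, and a - between two characters are still special there), and you don't have to escape the brackets either if you move them to immediately after the opening bracket of the character class.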


I actually got these two right.
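A quick way to convince yourself of the escaping point is to compare the two spellings of the class character by character. This is a sketch under my own naming; classes_agree is a hypothetical helper, and the test characters are just the ones that show up in this thread.

```perl
use strict;
use warnings;

my $escaped   = qr/[,\]\)\}]/;   # original style: everything backslashed
my $unescaped = qr/[],)}]/;      # "]" moved to the front, no backslashes

# Hypothetical helper: true when both classes give the same answer
# (match or no match) for a single character.
sub classes_agree {
    my ($ch) = @_;
    my $e = ($ch =~ $escaped)   ? 1 : 0;
    my $u = ($ch =~ $unescaped) ? 1 : 0;
    return $e == $u;
}

for my $ch (',', ']', ')', '}', '(', 'H') {
    printf "%s  %s\n", $ch, classes_agree($ch) ? 'same' : 'DIFFERENT';
}
```

Every line should report "same": the first four characters are in both classes, the last two are in neither.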

Finally, your brackets are flipped the wrong way for the opening sequence.

This is basically the problem with the original. The first line in your original example should have the brackets flipped: it puts the closing brackets before \b, which is where the opening ones belong.

If my assumptions are right, you'll end up with something like:

$word =~ s/[],)}]+(?:\b|$)//;
$word =~ s/(?:^|\b)[[,({]+//;


And the correct solution is:

$word =~ s/\b[],)}]+//;
$word =~ s/[[,({]+\b//;
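To check the two substitutions against the strings from the question, something like the following should do. It is a sketch: strip_brackets is a hypothetical wrapper name, and I am assuming the opening class is meant to be [[,({] (the mirror image of the closing class).

```perl
use strict;
use warnings;

# Hypothetical wrapper around the two substitutions.
sub strip_brackets {
    my ($word) = @_;
    $word =~ s/\b[],)}]+//;   # closers and commas right after a word character
    $word =~ s/[[,({]+\b//;   # openers and commas right before a word character
    return $word;
}

print strip_brackets($_), "\n" for '(HP)', '(HP),', '[{HP}]';
# HP
# HP
# HP
```

Without the /g modifier each substitution removes only the first run it finds, which is enough for a single word wrapped in punctuation.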

An even better solution is to use the core Text::ParseWords[1] module or the very popular and very useful Regexp::Common[2].
Randy.
1. <http://search.cpan.org/dist/Text-ParseWords/>
2. <http://search.cpan.org/dist/Regexp-Common/>

By now there should be no doubt as to why the above two modules are the best solution... they work.
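For the second question in the quoted message (strip anything non-alphanumeric from both ends of a word but leave the middle alone), a plain-regex sketch without either module might look like this. It is my own assumption of what was wanted: trim_non_alnum is a made-up name, and "alphanumeric" is taken to mean ASCII letters and digits only.

```perl
use strict;
use warnings;

# Hypothetical helper: drop leading and trailing characters that are
# neither ASCII letters nor digits. Anchoring with ^ and $ instead of
# \b keeps apostrophes and dashes inside the word untouched.
sub trim_non_alnum {
    my ($word) = @_;
    $word =~ s/^[^A-Za-z0-9]+//;
    $word =~ s/[^A-Za-z0-9]+$//;
    return $word;
}

print trim_non_alnum($_), "\n" for '(HP),', q{"don't!"}, '--well-known--';
# HP
# don't
# well-known
```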

I still can't believe no one called me on this.

I'm going to sleep now...

RandyZZZ...
