On Fri, May 28, 2004 at 01:08:25PM -0400, Greg Rundlett <[EMAIL PROTECTED]> wrote:
NOTE: I know how to solve this problem by processing the text in two steps, first finding all occurrences of /A(.*)C/ and then searching for B in $1, but I'm wondering if there is some advanced expression for doing it in only one step.

I have an interesting little problem that I'm wondering if someone knows how to solve using regular expressions:

Given some larger text, where you have many subsections that are made up of a token A followed by an indeterminate amount of text NOT including token B and then token C, how can you find those chunks of text? I've been trying with Perl-compatible Regular Expressions through PHP, but can't come up with a way to do it.

Well, I don't know about PCRE in PHP, but in pure Perl, you could do the following: /A(?(?=B)(?>.*)|.)*C/


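This matches token A, then a possibly empty series of "stuff", then token C. The "stuff" is evaluated conditionally, one step at a time. At each step a look-ahead checks whether token B comes next. If it does, the independent subexpression (?>.*) swallows the rest of the line, including any token C that follows, and will not give any of it back, so the required token C cannot match anywhere after that B. Otherwise, the "stuff" matches a single character and the check repeats. The net effect is that a match can only end at a token C that is reached without passing a token B. Note that . does not match a newline unless you add the /s modifier, so as written each chunk has to sit on a single line.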
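For anyone who wants to try the idea outside Perl, here is a rough sketch in Python. Python's re module has no look-ahead conditional, so this is not the expression above. It swaps in a plain negative look-ahead checked before every character, which I would expect to pick out the same chunks. The function name find_chunks and the sample tokens are made up for illustration.

```python
import re

def find_chunks(text, a="A", b="B", c="C"):
    """Return every chunk that starts with token a, ends with token c,
    and has no token b in between.

    Hypothetical helper: a, b and c are literal tokens, escaped so that
    regex metacharacters in them are taken literally.
    """
    pattern = re.compile(
        re.escape(a)
        # One character at a time, but only if token b does not start here.
        + r"(?:(?!" + re.escape(b) + r").)*"
        + re.escape(c)
    )
    # No capturing groups, so findall returns the full matched chunks.
    return pattern.findall(text)

if __name__ == "__main__":
    for sample in ["AxyzC", "AxBC", "AxCyBC", "AxByC AzC"]:
        print(sample, "->", find_chunks(sample))
```

As in the Perl version, . stops at a newline here, so a chunk cannot span lines unless you compile with re.DOTALL.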

Thanks for the opportunity to learn more about Perl REs. :-)

--
Bob Bell
_______________________________________________
gnhlug-discuss mailing list
[EMAIL PROTECTED]
http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss
