
Re: Searching for what is not there using REGEX in only a single step

2004-06-01 Thread Bob Bell

On Fri, May 28, 2004 at 01:08:25PM -0400, Greg Rundlett [EMAIL PROTECTED] wrote:
> NOTE:  I know how to solve this problem by processing the text in 
> 2 steps, first finding all occurrences of /A(.*)C/ and then searching 
> for B in $1, but I'm wondering if there is some advanced expression for 
> doing it in only one step.
> 
> I have an interesting little problem that I'm wondering if someone knows 
> how to solve using regular expressions:
> 
> Given some larger text, where you have many subsections that are made up 
> of a token A followed by an indeterminate amount of text NOT including 
> token B and then token C, how can you find those chunks of text?  I've 
> been trying with Perl-compatible Regular Expressions through PHP, but 
> can't come up with a way to do it.

Well, I don't know about PCRE in PHP, but in pure Perl, you could do the 
following:

    /A(?(?=B)(?>.*)|.)*C/

This matches token A followed by token C, with a possible series of 
stuff in the middle.  The stuff is evaluated conditionally.  It uses 
look-ahead to see if what's coming matches token B, and if so it 
independently matches the rest of the line, irrevocably consuming token 
C, so that the required match to token C will fail, and the RE as 
a whole will fail to match.  Otherwise, the stuff in the middle 
matches any character, one character at a time.
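
For readers who want to try the one-step idea outside Perl, here is a minimal sketch in Python. It is not the conditional expression above: Python's re module has no lookahead conditional of the form (?(?=B)...), so the sketch swaps in a different idiom, a negative lookahead checked before each character, which Perl and PCRE also accept. The tokens `<a>`, `<b>` and `<c>` and the function name `chunks_without_b` are hypothetical stand-ins, not anything from the thread.

```python
import re

# Hypothetical stand-ins for the tokens A, B and C from the thread.
A, B, C = "<a>", "<b>", "<c>"

# A, then characters consumed one at a time, each only if token B does
# not start at that position, then C.  re.DOTALL lets "." cross
# newlines so a chunk may span several lines.
pattern = re.compile(
    re.escape(A) + r"(?:(?!" + re.escape(B) + r").)*?" + re.escape(C),
    re.DOTALL,
)

def chunks_without_b(text):
    """Return every A...C chunk that has no B between A and C."""
    return [m.group(0) for m in pattern.finditer(text)]

sample = "<a>one<c> <a>two<b>three<c> <a>four<c>"
print(chunks_without_b(sample))  # ['<a>one<c>', '<a>four<c>']
```

The middle chunk is skipped because the lookahead refuses to step over `<b>`, so the match attempt that starts at the second `<a>` can never reach a `<c>`.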

Thanks for the opportunity to learn more about Perl REs. :-)
--
Bob Bell
___
gnhlug-discuss mailing list
[EMAIL PROTECTED]
http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss

Searching for what is not there using REGEX in only a single step

2004-05-28 Thread Greg Rundlett

NOTE:  I know how to solve this problem by processing the text in 2 
steps, first finding all occurrences of /A(.*)C/ and then searching for 
B in $1, but I'm wondering if there is some advanced expression for 
doing it in only one step.

I have an interesting little problem that I'm wondering if someone knows 
how to solve using regular expressions:

Given some larger text, where you have many subsections that are made up 
of a token A followed by an indeterminate amount of text NOT including 
token B and then token C, how can you find those chunks of text?  I've 
been trying with Perl-compatible Regular Expressions through PHP, but 
can't come up with a way to do it.
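
As a baseline for comparison, the two-step approach from the NOTE might look like the sketch below, written in Python rather than PHP. The tokens and the name `two_step` are hypothetical, and the capture is made non-greedy (an assumption on my part) so that each A pairs with the nearest following C instead of the last one in the text.

```python
import re

# Hypothetical stand-ins for the tokens A, B and C.
A, B, C = "<a>", "<b>", "<c>"

# Step 1: capture whatever sits between A and the nearest following C.
between = re.compile(re.escape(A) + r"(.*?)" + re.escape(C), re.DOTALL)

def two_step(text):
    """Return the A...C chunks whose captured middle does not contain B."""
    hits = []
    for m in between.finditer(text):
        # Step 2: keep the chunk only if B is absent from the capture.
        if B not in m.group(1):
            hits.append(m.group(0))
    return hits

print(two_step("<a>one<c> <a>two<b>three<c> <a>four<c>"))
```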

For example,
I have an XML file, with a bunch of records.  Some records are fine.  
Others are missing a chunk.  I want to find the broken records and 
insert the missing tags.
Broken Record
 </fh>

   30101 Agoura Ct., #115<br /></location_addr1>
   <location_addr2></location_addr2>
Fixed Record
 </fh>
 <location id=>
   <location_name>
   </location_name>
   <location_addr1>30101 Agoura Ct., #115<br /></location_addr1>
   <location_addr2></location_addr2>
I thought I would be able to find </fh> followed by </location_addr1> 
and use a negative lookbehind assertion to say that <location_addr1> was 
not present.  However, not knowing the length of text between </fh> and 
</location_addr1> seems to make this impossible.
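
The lookbehind obstacle can be reproduced in Python, whose re module rejects any lookbehind that lacks a fixed width; Perl and PCRE impose comparable limits. The sketch assumes the tag names in the example are written with angle brackets, and `compiles` is a hypothetical helper name.

```python
import re

def compiles(pattern):
    """Report whether Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# A lookbehind over a literal has a fixed width, so it is accepted.
print(compiles(r"(?<!<location_addr1>)</location_addr1>"))    # True

# Adding ".*" makes the lookbehind variable-width, so it is rejected.
print(compiles(r"(?<!<location_addr1>.*)</location_addr1>"))  # False
```

The first pattern compiles but only checks the characters immediately before the closing tag, which is not enough when an unknown amount of text sits in between.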

--
FREePHILE
We are 'Open' for Business
Free and Open Source Software
http://www.freephile.com
(978) 270-2425
Paul Lynde to block...
-- a contestant on Hollywood Squares

Re: Searching for what is not there using REGEX in only a single step

2004-05-28 Thread Michael ODonnell



It's possible you could approach the problem more
simply, maybe like this:

   starting from every instance of </fh>, start
   gathering all text except anything that looks like
   a tag (i.e., discard all tags) up until the point
   where you find an instance of </location_addr1>.
   You're then situated where you have the desired
   text (sans tags) and you know exactly where you
   are, so you should be able to utter (that part of)
   your record with the desired format.  In other
   words, instead of rewriting just the damaged
   records, rewrite ALL the records.
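
A rough sketch of that rewrite-everything idea, in Python, is shown below. The tag spellings with angle brackets, the output layout, and the name `rewrite_all` are assumptions for illustration; in particular the sketch restores only the <location_addr1> pair, because the values for the other missing tags are not given in the thread.

```python
import re

# Assumed tag spellings; the sample in the thread shows only the names.
START, END = "</fh>", "</location_addr1>"

# Everything between START and the nearest following END.
span = re.compile(re.escape(START) + r"(.*?)" + re.escape(END), re.DOTALL)
# Anything that looks like a tag.
tag = re.compile(r"<[^>]*>")

def rewrite_all(text):
    """Re-emit every START...END span, broken or not, in one layout."""
    def emit(m):
        # Discard the tags, then collapse the leftover whitespace.
        address = " ".join(tag.sub(" ", m.group(1)).split())
        # Hypothetical output layout: only <location_addr1> is restored.
        return f"{START}\n   <location_addr1>{address}{END}"
    return span.sub(emit, text)

broken = "</fh>\n\n   30101 Agoura Ct., #115<br /></location_addr1>"
print(rewrite_all(broken))
```

Because intact and damaged records pass through the same code path, no test for the missing tag is needed. The cost is that any text inside other tags in the span (a location name, say) would be folded into the address, so a real rewrite would need to handle those fields separately.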

...just a thought, based only upon the info supplied,
FWIW, YMMV, etc...

