regex heck

Tom Allison Thu, 03 Nov 2005 03:56:30 -0800

I've been playing with some regex, Benchmark, and 'slurping' and found something that I could do (if I could get it to work) but I'm not sure I want to do it.


Benchmark:
reading a file using:


        my $line = do {local $/; <$file>};
versus;
        while (<$file>) {
        }
is ~2x faster on my machine (caveat).
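
For anyone who wants to repeat the comparison, here is a minimal sketch of such a benchmark. The temporary file, its contents, the line count, and the sub names slurp and by_line are made up for illustration; the ratio you see will depend on your machine, perl build, and file size.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempfile);

# Build a throwaway log file so the comparison is self-contained.
# The content and the line count are arbitrary sample values.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "2005/10/31/12:23:21......12345...Active Configuration failed\n"
    for 1 .. 20_000;
close $fh or die "close: $!";

# Read the whole file into one scalar by undefining $/ locally.
sub slurp {
    open my $file, '<', $path or die "open: $!";
    my $data = do { local $/; <$file> };
    close $file;
    return length $data;
}

# Read the same file one line at a time.
sub by_line {
    open my $file, '<', $path or die "open: $!";
    my $bytes = 0;
    while (my $line = <$file>) {
        $bytes += length $line;
    }
    close $file;
    return $bytes;
}

# A negative count means "run each sub for at least that many CPU seconds".
cmpthese(-1, { slurp => \&slurp, by_line => \&by_line });
```

Note that this only times the reading itself; any matching done afterwards is extra work on top in both cases.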

I want to check across multiple lines and capture about 5 elements from that 'paragraph'.


paragraph would start with a dated line like:
2005/10/31/12:23:21......12345...Active Configuration failed
--or--
2005/10/31/12:23:21......12345...Configuration Request failed

eventually followed by a line like:
2005/10/31/12:32:54..............THREAD: Complete/12345/4435/

I was thinking this could be done in one regex similar to (not entirely functional):

m|^([\d\:/]+)\tN\t(\d+)\t((?:Active )? Configuration (?:Request )?failed)(?:.+?)THREAD: Complete/$2/(\d+)|smg;

I don't have this quite working yet. I was doing OK until I started trying for multi-line matching. Unless I'm doing something obviously impossible, I'm hoping I can sort this one out before too long.
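
One way the multi-line match might be written is sketched below. It assumes the fields are separated by runs of literal dots exactly as in the sample lines (swap the separator for \t if the real data is tab-delimited), and it refers to the captured item number inside the pattern with the backreference \2 rather than $2, since $2 is interpolated before the match runs. The sample log text, including the second paragraph, is made up.

```perl
use strict;
use warnings;

# Made-up sample data in the shape of the lines quoted above.
my $log = <<'END';
2005/10/31/12:23:21......12345...Active Configuration failed
2005/10/31/12:23:22..............some unrelated detail
2005/10/31/12:32:54..............THREAD: Complete/12345/4435/
2005/10/31/12:40:00......44532...Configuration Request failed
2005/10/31/12:41:10..............THREAD: Complete/44532/4436/
END

my $re = qr{
    ^ ([\d:/]+)                      # capture 1: timestamp at start of a line
    \.+ (\d+) \.+                    # capture 2: item number between dot runs
    ( (?:Active\ )? Configuration\ (?:Request\ )? failed )   # capture 3
    .+?                              # lazily skip ahead, across newlines
    THREAD:\ Complete/ \2 / (\d+)    # same item number, then capture 4
}xms;

# Collect one record per matched paragraph.
my @found;
while ($log =~ /$re/g) {
    push @found, { when => $1, id => $2, event => $3, thread => $4 };
}

printf "%s item %s (%s) finished on thread %s\n",
    @{$_}{qw(when id event thread)} for @found;
```

The /m flag lets ^ match at every line start, /s lets the dot cross newlines, and /x allows the layout and comments, which is why the literal spaces are escaped.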

But then I realized there was another potential problem that I am not sure how to address. It is possible for multiple instances to interleave themselves such that item 12345 can have an "Active Configuration" statement and, before I find the "THREAD: Complete" for the same statement, I run into an "Active Configuration" for item 44532. To solve this one, I probably need to anchor the regex at the second match { ((?:Active )? Configuration (?:Request )? failed) } but I haven't a clue how to do this.
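
A possible way around the interleaving, offered as a sketch rather than a tested solution for real data: put the search for the completion line inside a lookahead, so the match itself only consumes the "failed" line and the next /g iteration resumes on the following line. It assumes dot-separated fields as in the sample lines, and the interleaved sample data is made up.

```perl
use strict;
use warnings;

# Made-up sample where item 44532 starts before item 12345 completes.
my $log = <<'END';
2005/10/31/12:23:21......12345...Active Configuration failed
2005/10/31/12:24:02......44532...Active Configuration failed
2005/10/31/12:32:54..............THREAD: Complete/12345/4435/
2005/10/31/12:33:10..............THREAD: Complete/44532/4436/
END

my $re = qr{
    ^ ([\d:/]+) \.+ (\d+) \.+        # timestamp and item number
    ( (?:Active\ )? Configuration\ (?:Request\ )? failed )
    (?=                              # look ahead without consuming text
        .*? THREAD:\ Complete/ \2 / (\d+)
    )
}xms;

# Map each item number to the thread that completed it.
my %thread_for;
while ($log =~ /$re/g) {
    $thread_for{$2} = $4;
}

print "$_ => $thread_for{$_}\n" for sort keys %thread_for;
```

The cost is that every start line triggers its own forward scan of the slurped text, so on a large file with distant completion lines this may give back some of the speed gained by slurping.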

The "Olde School" approach for me would be to take the 'while (<$file>)' approach and save up found bits of information into a hash until I can get all the pieces I need for an answer. But I'm enticed by the speed improvement. I have a LOT of data to read through.
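
For comparison, the line-by-line approach with a hash might look like the sketch below. The names %pending and @done are hypothetical, the sample data is made up, the fields are assumed to be dot-separated as in the sample lines, and an in-memory filehandle stands in for the real log file.

```perl
use strict;
use warnings;

# Made-up interleaved sample, read through an in-memory filehandle.
my $log = <<'END';
2005/10/31/12:23:21......12345...Active Configuration failed
2005/10/31/12:24:02......44532...Configuration Request failed
2005/10/31/12:32:54..............THREAD: Complete/12345/4435/
2005/10/31/12:33:10..............THREAD: Complete/44532/4436/
END
open my $file, '<', \$log or die "open: $!";

my %pending;   # item number => start record still waiting for completion
my @done;      # finished records, in order of completion

while (my $line = <$file>) {
    if ($line =~ m{^([\d:/]+)\.+(\d+)\.+((?:Active )?Configuration (?:Request )?failed)}) {
        # Remember the start of a paragraph, keyed by its item number.
        $pending{$2} = { when => $1, id => $2, event => $3 };
    }
    elsif ($line =~ m{THREAD: Complete/(\d+)/(\d+)}) {
        # Pair the completion with its start; skip completions never seen.
        my $start = delete $pending{$1} or next;
        push @done, { %$start, thread => $2 };
    }
}
close $file;

printf "%s item %s (%s) finished on thread %s\n",
    @{$_}{qw(when id event thread)} for @done;
```

Because each item number has its own hash slot, interleaved paragraphs need no special handling, and anything left in %pending at the end is a start line that never saw its completion.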

