I've been playing with regexes, Benchmark, and 'slurping', and found something that I could do (if I can get it to work), but I'm not sure I want to do it.

Benchmark:
reading a file using:

        my $line = do {local $/; <$file>};
versus:
        while (<$file>) {
        }
is ~2x faster on my machine (caveat).
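
For anyone who wants to repeat the comparison, here is a minimal sketch of how it might be set up with the core Benchmark module. The sample file, its contents, and the sub names slurp_file and read_by_line are made up for illustration; results will vary with file size and machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempfile);

# Build a throwaway sample file (hypothetical data, not the real log).
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "2005/10/31/12:23:21\tN\t12345\tActive Configuration failed\n"
    for 1 .. 50_000;
close $fh or die "close: $!";

# Read the whole file in one go by undefining the input record separator.
sub slurp_file {
    open my $file, '<', $path or die "open: $!";
    my $data = do { local $/; <$file> };
    close $file;
    return length $data;
}

# Read the same file one line at a time.
sub read_by_line {
    open my $file, '<', $path or die "open: $!";
    my $bytes = 0;
    while (my $line = <$file>) {
        $bytes += length $line;
    }
    close $file;
    return $bytes;
}

# Run each variant 50 times and print a comparison table.
cmpthese(50, {
    slurp   => \&slurp_file,
    by_line => \&read_by_line,
});
```

Note that this only times the reading itself; any per-line or whole-file matching done afterwards would change the picture.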

I want to check across multiple lines and capture about 5 elements from each such 'paragraph'.

A paragraph would start with a dated line like:
2005/10/31/12:23:21......12345...Active Configuration failed
--or--
2005/10/31/12:23:21......12345...Configuration Request failed

eventually followed by a line like:
2005/10/31/12:32:54..............THREAD: Complete/12345/4435/

I was thinking this could be done in one regex similar to (not entirely functional):

m|^([\d\:/]+)\tN\t(\d+)\t((?:Active )? Configuration (?:Request )? failed)(?:.+?)THREAD: Complete/$2/(\d+)|smg;

I don't have this quite working yet. I was doing OK until I started trying for multi-line matching. Unless I'm doing something obviously impossible I'm hoping I can sort this one out before too long.
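
A minimal sketch of how the single-regex idea might be made to work, under some assumptions: the fields are tab-separated with a literal "N" column (as in my attempt above), and the sample data is invented. Two things differ from my attempt. Inside a pattern, `\2` refers back to the second capture of the current match, whereas `$2` is interpolated from an earlier match before matching starts. And without /x, the stray spaces around "Configuration" are literal, so the pattern asks for two spaces in a row.

```perl
use strict;
use warnings;

# Hypothetical sample; assumes tab-separated fields with a literal "N" column.
my $log = join '',
    "2005/10/31/12:23:21\tN\t12345\tActive Configuration failed\n",
    "2005/10/31/12:25:02\tN\t12345\tsome unrelated detail\n",
    "2005/10/31/12:32:54\t\t\tTHREAD: Complete/12345/4435/\n";

my $re = qr{
    ^([\d:/]+) \t N \t (\d+) \t       # capture 1: timestamp, capture 2: item id
    ((?:Active\ )?Configuration\ (?:Request\ )?failed)   # capture 3: message
    .+?                               # lazily skip the lines in between
    THREAD:\ Complete/\2/(\d+)        # backreference to the id, capture 4
}xms;

# Collect one record per matched paragraph.
my @found;
while ($log =~ /$re/g) {
    push @found, { when => $1, id => $2, msg => $3, thread => $4 };
}
printf "%s %s %s %s\n", @{$_}{qw(when id msg thread)} for @found;
```

Each successful match consumes everything from the start line through its "THREAD: Complete" line, so the next search resumes after that point.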

But then I realized there is another potential problem that I am not sure how to address. Multiple instances can interleave: item 12345 can have an "Active Configuration" statement and, before I find the "THREAD: Complete" for that same item, I run into an "Active Configuration" for item 44532. To solve this, I probably need to anchor the regex at the second match { ((?:Active )? Configuration (?:Request )? failed) }, but I haven't a clue how to do this.

The "Olde School" approach for me would be to take the 'while (<$file>)' approach and save up found bits of information into a hash until I can get all the pieces I need for an answer. But I'm enticed by the speed improvement. I have a LOT of data to read through.
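
For comparison, a rough sketch of that line-by-line hash approach, again assuming tab-separated fields with a literal "N" column and using invented sample data. The names %pending and @done are just illustrative. Because each open item is keyed by its id, interleaved items do not interfere with each other.

```perl
use strict;
use warnings;

# Hypothetical interleaved sample; assumes tab-separated fields
# with a literal "N" column on the start lines.
my @lines = (
    "2005/10/31/12:23:21\tN\t12345\tActive Configuration failed\n",
    "2005/10/31/12:24:10\tN\t44532\tConfiguration Request failed\n",
    "2005/10/31/12:32:54\t\t\tTHREAD: Complete/12345/4435/\n",
    "2005/10/31/12:40:00\t\t\tTHREAD: Complete/44532/4436/\n",
);

my (%pending, @done);
for my $line (@lines) {
    if ($line =~ m{^([\d:/]+)\tN\t(\d+)\t((?:Active )?Configuration (?:Request )?failed)}) {
        # Start of a paragraph: remember it under its item id.
        $pending{$2} = { start => $1, id => $2, msg => $3 };
    }
    elsif ($line =~ m{^([\d:/]+)\t.*THREAD: Complete/(\d+)/(\d+)/}) {
        # End of a paragraph: finish the record only if a start was seen.
        my $rec = delete $pending{$2} or next;
        @{$rec}{qw(end thread)} = ($1, $3);
        push @done, $rec;
    }
}
print join(' ', @{$_}{qw(id start end msg thread)}), "\n" for @done;
```

With a real file, the for loop would become while (my $line = <$file>), and anything left in %pending at the end is an item that never completed.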
