I've been playing with regexes, Benchmark, and 'slurping', and found
something that I could do (if I can get it to work), but I'm not sure I
want to do it.
Benchmark:
reading a file using:
my $line = do {local $/; <$file>};
versus:
while (<$file>) {
}
is ~2x faster on my machine (caveat).
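
For reference, a minimal sketch of how such a comparison might be set
up with the core Benchmark module. The temporary file, its contents, and
the sub names slurp/by_line are made up for illustration, and the
line-by-line loop sums lengths so that both subs return something
comparable; the ratio will vary with file size and machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempfile);

# Hypothetical test data: 10,000 short lines in a temporary file.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "2005/10/31/12:23:21\t12345\tsome log text\n" for 1 .. 10_000;
close $tmp or die "close: $!";

# Slurp the whole file into one scalar.
sub slurp {
    open my $file, '<', $path or die "open $path: $!";
    my $text = do { local $/; <$file> };
    close $file;
    return length $text;
}

# Read line by line, summing the line lengths.
sub by_line {
    open my $file, '<', $path or die "open $path: $!";
    my $bytes = 0;
    while (my $line = <$file>) {
        $bytes += length $line;
    }
    close $file;
    return $bytes;
}

# Run each sub for at least 1 CPU second and print a comparison table.
cmpthese(-1, { slurp => \&slurp, by_line => \&by_line });
```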
I want to check across multiple lines and capture about 5 elements
from each 'paragraph'.
A paragraph would start with a dated line like:
2005/10/31/12:23:21......12345...Active Configuration failed
--or--
2005/10/31/12:23:21......12345...Configuration Request failed
eventually followed by a line like:
2005/10/31/12:32:54..............THREAD: Complete/12345/4435/
I was thinking this could be done in one regex similar to (not entirely
functional):
m|^([\d\:/]+)\tN\t(\d+)\t((?:Active )? Configuration (?:Request )?
failed)(?:.+?)THREAD: Complete/$2/(\d+)|smg;
I don't have this quite working yet. I was doing OK until I started
trying for multi-line matching. Unless I'm attempting something
obviously impossible, I'm hoping I can sort this one out before too long.
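
One thing that might be part of the trouble: inside a pattern, $2 is
interpolated as a variable before matching starts, whereas \2 refers
back to the second capture group of the current match. A rough sketch
using \2, assuming the runs of dots in the sample lines stand for
whitespace (tabs below) and leaving out the literal N field; the sample
text and the hash keys are invented for illustration.

```perl
use strict;
use warnings;

# Sample text built from the lines quoted above; separators are assumed
# to be tabs.
my $text = join '',
    "2005/10/31/12:23:21\t12345\tActive Configuration failed\n",
    "2005/10/31/12:25:00\t\tsome unrelated line\n",
    "2005/10/31/12:32:54\t\tTHREAD: Complete/12345/4435/\n";

my @found;
while ($text =~ m{
        ^([\d:/]+) \s+ (\d+) \s+        # capture 1: timestamp, capture 2: item id
        ((?:Active\ )?Configuration(?:\ Request)?\ failed)  # capture 3: message
        .+?                             # anything, newlines included
        THREAD:\ Complete/\2/(\d+)      # same id via backreference, capture 4
    }xmsg) {
    push @found, { when => $1, id => $2, what => $3, thread => $4 };
}

printf "%s item %s: %s (thread %s)\n", @{$_}{qw(when id what thread)}
    for @found;
```

The /x flag is only there to allow the comments; literal spaces then
have to be written as "\ ".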
But then I realized there was another potential problem that I am not
sure how to address. It is possible for multiple instances to
interleave, such that item 12345 can have an "Active Configuration"
line and, before I find the "THREAD: Complete" for that same item, I
run into an "Active Configuration" line for item 44532.
To solve this one, I probably need to anchor the regex at the second
match { ((?:Active )? Configuration (?:Request )? failed) } but I
haven't a clue how to do this.
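
One idea that might handle the interleaving is to consume only the
"failed" line and put the rest of the paragraph inside a lookahead, so
the next /g match resumes right after that line instead of after the
"THREAD: Complete". This is only a sketch under the same assumption
that the fields are whitespace-separated; the interleaved sample data
and the %thread_for hash are invented. Every start line triggers a
forward scan, so on a large file this could eat into the speed gain.

```perl
use strict;
use warnings;

# Hypothetical interleaved log: item 44532 starts before 12345 completes.
my $text = join '',
    "2005/10/31/12:23:21\t12345\tActive Configuration failed\n",
    "2005/10/31/12:24:10\t44532\tConfiguration Request failed\n",
    "2005/10/31/12:32:54\t\tTHREAD: Complete/12345/4435/\n",
    "2005/10/31/12:33:02\t\tTHREAD: Complete/44532/4436/\n";

my %thread_for;
while ($text =~ m{
        ^[\d:/]+ \s+ (\d+) \s+                  # capture 1: item id
        (?:Active\ )?Configuration(?:\ Request)?\ failed $
        (?= .*? THREAD:\ Complete/\1/(\d+) )    # look ahead without consuming
    }xmsg) {
    $thread_for{$1} = $2;
}

print "item $_ completed in thread $thread_for{$_}\n"
    for sort { $a <=> $b } keys %thread_for;
```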
The "Olde School" approach for me would be to take the 'while (<$file>)'
approach and save up found bits of information into a hash until I can
get all the pieces I need for an answer. But I'm enticed by the speed
improvement. I have a LOT of data to read through.
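
For comparison, the line-by-line version I have in mind would look
roughly like the sketch below. It reads from an in-memory string so it
runs on its own; the sample data, %pending, and @done are placeholders,
and the field layout is again assumed to be whitespace-separated.

```perl
use strict;
use warnings;

# Invented sample data; a real run would open the log file instead.
my $log = join '',
    "2005/10/31/12:23:21\t12345\tActive Configuration failed\n",
    "2005/10/31/12:24:10\t44532\tConfiguration Request failed\n",
    "2005/10/31/12:32:54\t\tTHREAD: Complete/12345/4435/\n",
    "2005/10/31/12:33:02\t\tTHREAD: Complete/44532/4436/\n";
open my $file, '<', \$log or die "open: $!";

my (%pending, @done);
while (my $line = <$file>) {
    if ($line =~ m{^([\d:/]+)\s+(\d+)\s+((?:Active )?Configuration(?: Request)? failed)}) {
        # Start of a paragraph: remember it, keyed by item id.
        $pending{$2} = { when => $1, what => $3 };
    }
    elsif ($line =~ m{THREAD: Complete/(\d+)/(\d+)/}) {
        my ($id, $thread) = ($1, $2);
        # Skip completions for which no start line was seen.
        my $start = delete $pending{$id} or next;
        push @done, { %$start, id => $id, thread => $thread };
    }
}
close $file;

printf "%s item %s: %s (thread %s)\n", @{$_}{qw(when id what thread)}
    for @done;
```

Because each item has its own slot in %pending, interleaved paragraphs
do not interfere with one another.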