Re: trying to understand how regex works
Ron Grabowski <[EMAIL PROTECTED]> wrote:

> my $regex = join '|', 'value_garbage1',
>                       'value_garbage2',
>                       'value_garbage3';
> next if /$regex/;

You might want to say "next if /$regex/o" to prevent Perl from
recompiling the pattern every time through the loop. If you're on
Perl 5.6, you could even make use of the sexy new qr{} operator, which
returns a reference to a compiled regular expression:

my $regex = join '|', ...
my $re = qr{$regex};
next if /$re/;

Tom Wyant

___
Perl-Win32-Users mailing list
[EMAIL PROTECTED]
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
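For what it's worth, here is a minimal sketch of the qr{} idea. The garbage values and the sub name is_garbage are made up for illustration. It also assumes the values are literal strings (hence quotemeta) and that a garbage value must fill a whole field between two pipes, as in the patterns from the original question.

```perl
use strict;
use warnings;

# Hypothetical garbage values; the real ones are not given in the thread.
my @garbage = ('value_garbage1', 'value_garbage2', 'value_garbage3');

# Escape each value (assuming they are literal strings), join them into one
# alternation, and compile the result once with qr{}.
my $alternation = join '|', map { quotemeta($_) } @garbage;
my $re = qr{\|(?:$alternation)\|};

# Returns 1 if the line holds a garbage value between two pipes, else 0.
sub is_garbage {
    my ($line) = @_;
    return $line =~ $re ? 1 : 0;
}

for my $line ('a|value_garbage2|c|', 'a|b|c|') {
    print is_garbage($line) ? "skip: $line\n" : "keep: $line\n";
}
```

Because the pattern is compiled once outside the loop, there is no need for /o here.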
RE: trying to understand how regex works
I'd add the check for the garbage before I split. I'm not sure whether
it would really add any time to the program's run, but I think it would
reduce the amount of checking needed after the split.

next if(/value_garbage/g); # assuming value_garbage is the exact string.

or you can use:

while (<FILE>) {
    $p = "N";
    my @f = split /\s*\|\s*/, $_ unless(m/value_garbage/g);
    if (@f != 30) { #^^
        print "Field count is ", scalar @f, " should be 30\n";
        # error processing ...
    }
    if ($f[1] =~ / ...
    ...

This is again assuming that value_garbage is a string... if not, then,
well, "if, elsif" away :)

But I would absolutely use the split function.

Joe Youngquist

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of $Bill Luebkert
Sent: Tuesday, August 13, 2002 12:39 AM
To: Dan Jablonsky
Cc: [EMAIL PROTECTED]
Subject: Re: trying to understand how regex works

Dan Jablonsky wrote:

> Hi all,
> I guess it must be a simple problem, but it's a
> mystery to me.
> I got 30 "fields" all separated by pipes in some files
> with many many lines. Some of the fields need to be
> changed, but mostly I have to drop any line that has
> certain values in certain fields.
> So I start by skipping any field that has garbage in
> it:
> open FOUT, ">>/some/path/outputfile.txt";
> open FILE "<...";
> while(<FILE>){
> p="N";
> next if (/.*?\|value_garbage1\|.*?/ ||
> /.*?\|value_garbage2\|.*?/ ||
> /.*?\|value_garbage3\|.*?/);
> #and then I continue with an if
> if(/(.*?)\|(.*?)\|30 times/){
> $p="Y";
> do something to $1; #change field 1
> do something to $3; #change field 3
> $fld1=$newfld1;
> $fld2=$2;
> $fld3=$newfld3;
> $fld4=$4;and so on
> }
> print FOUT "$fld1|$fld2|...|$fld30|\n" if ($p="Y");
> #print the whole thing to the new output
> }
>
> Well, it happens that some of the lines are completely
> out of whack and the regex simply stops there - it
> doesn't exit, no errors but goes into an infinite loop
> even though I don't know how exactly is this possible.
> My second if states clearly (or not so clearly) that
> if the line does not have 30 fields it should skip the
> block, it should NOT print anything at the handle and
> should get the next line.
> For whatever reason, the first time it encounters a
> line with less than 30 fields, it just loops without
> end.
> I tried to solve this by replacing the .*? in the
> references by the actual format of each field and
> suddenly it started working but now the regex is a
> hundred times slower and the only thing that speeds it
> up is to go back to the .*? that really goes fast as
> long as the regex "is true". I mean if I have 30
> fields all the time, the regex works OK and it goes
> very fast.
>
> Anybody cares to explain this to me?

No, but I'll offer an alternative.

while (<FILE>) {
    $p = "N";
    my @f = split /\s*\|\s*/, $_;
    if (@f != 30) {
        print "Field count is ", scalar @f, " should be 30\n";
        # error processing ...
    }
    if ($f[1] =~ / ...
    ...

--
$Bill Luebkert    ICQ=162126130
DBE Collectibles  Mailto:[EMAIL PROTECTED]
http://dbecoll.tripod.com/ (Free site for Perl)
http://www.todbe.com/
Re: trying to understand how regex works
On 13/08/2002 06:26:59 perl-win32-users-admin wrote:

>Hi all,
>I guess it must be a simple problem, but it's a
>mystery to me.
[snip question involving regex]
>
>Anybody cares to explain this to me?

Try running your script with

perl -Mre=debug scriptname.pl 2>re_debug

Make sure you redirect stderr to a file, as there's plenty of output.

--
Csaba Ráduly, Software Engineer    Sophos Anti-Virus
email: [EMAIL PROTECTED]           http://www.sophos.com
US Support: +1 888 SOPHOS 9        UK Support: +44 1235 559933
Re: trying to understand how regex works
> open FOUT, ">>/some/path/outputfile.txt";
> open FILE "<...";

open(FOUT, ">>/some/path/outputfile.txt") or die("Error: $!");
open(FILE, "<...") or die("Error: $!");

> while(<FILE>){
> p="N";
> next if (/.*?\|value_garbage1\|.*?/ ||
> /.*?\|value_garbage2\|.*?/ ||
> /.*?\|value_garbage3\|.*?/);

my $regex = join '|', 'value_garbage1',
                      'value_garbage2',
                      'value_garbage3';
next if /$regex/;

> if(/(.*?)\|(.*?)\|30 times/){
> $p="Y";
> do something to $1; #change field 1
> do something to $3; #change field 3
> $fld1=$newfld1;
> $fld2=$2;
> $fld3=$newfld3;
> $fld4=$4;and so on
> }
> print FOUT "$fld1|$fld2|...|$fld30|\n" if ($p="Y");

If you put the print inside of the if(), you don't need $p. Look into
the join() function:

print FOUT join '|', $fld1, $fld2, $fld3;
print FOUT join '|', @array;
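A small sketch of the join() suggestion, in case it helps. One reason to move the print inside the if() is that ($p="Y") is an assignment, not a comparison, so that test is always true. The sub name format_record is made up, and the trailing pipe is an assumption taken from the "...|$fld30|\n" format in the quoted print statement.

```perl
use strict;
use warnings;

# Rebuild one output record from a list of fields. The trailing pipe mirrors
# the "...|$fld30|\n" format used in the quoted print statement.
sub format_record {
    my (@fields) = @_;
    return join('|', @fields) . "|\n";
}

# Three stand-in fields instead of 30, to keep the example short.
my @fields = ('a', 'b', 'c');
print format_record(@fields);
```

With the fields held in an array, there is no need for thirty separate $fldN variables either.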
Re: trying to understand how regex works
Dan Jablonsky wrote:

> Hi all,
> I guess it must be a simple problem, but it's a
> mystery to me.
> I got 30 "fields" all separated by pipes in some files
> with many many lines. Some of the fields need to be
> changed, but mostly I have to drop any line that has
> certain values in certain fields.
> So I start by skipping any field that has garbage in
> it:
> open FOUT, ">>/some/path/outputfile.txt";
> open FILE "<...";
> while(<FILE>){
> p="N";
> next if (/.*?\|value_garbage1\|.*?/ ||
> /.*?\|value_garbage2\|.*?/ ||
> /.*?\|value_garbage3\|.*?/);
> #and then I continue with an if
> if(/(.*?)\|(.*?)\|30 times/){
> $p="Y";
> do something to $1; #change field 1
> do something to $3; #change field 3
> $fld1=$newfld1;
> $fld2=$2;
> $fld3=$newfld3;
> $fld4=$4;and so on
> }
> print FOUT "$fld1|$fld2|...|$fld30|\n" if ($p="Y");
> #print the whole thing to the new output
> }
>
> Well, it happens that some of the lines are completely
> out of whack and the regex simply stops there - it
> doesn't exit, no errors but goes into an infinite loop
> even though I don't know how exactly is this possible.
> My second if states clearly (or not so clearly) that
> if the line does not have 30 fields it should skip the
> block, it should NOT print anything at the handle and
> should get the next line.
> For whatever reason, the first time it encounters a
> line with less than 30 fields, it just loops without
> end.
> I tried to solve this by replacing the .*? in the
> references by the actual format of each field and
> suddenly it started working but now the regex is a
> hundred times slower and the only thing that speeds it
> up is to go back to the .*? that really goes fast as
> long as the regex "is true". I mean if I have 30
> fields all the time, the regex works OK and it goes
> very fast.
>
> Anybody cares to explain this to me?

No, but I'll offer an alternative.

while (<FILE>) {
    $p = "N";
    my @f = split /\s*\|\s*/, $_;
    if (@f != 30) {
        print "Field count is ", scalar @f, " should be 30\n";
        # error processing ...
    }
    if ($f[1] =~ / ...
    ...

--
$Bill Luebkert    ICQ=162126130
DBE Collectibles  Mailto:[EMAIL PROTECTED]
http://dbecoll.tripod.com/ (Free site for Perl)
http://www.todbe.com/
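A runnable sketch of the split approach, under a few assumptions. The "infinite loop" is most likely heavy backtracking: since "." also matches a pipe, a line with too few pipes lets each (.*?) stretch across delimiters, and the engine tries a huge number of combinations before it gives up. split has no such problem. The sub name parse_line is made up, the sample uses 3 fields rather than 30, and it assumes every record ends with a pipe, as in the original print statement.

```perl
use strict;
use warnings;

# Split one pipe-delimited line into fields. Returns a reference to the
# field list, or undef if the field count is not the expected one.
sub parse_line {
    my ($line, $expected) = @_;
    chomp $line;
    # Assumption: each record ends with a pipe, so drop that last delimiter.
    $line =~ s/\|\z//;
    # A limit of -1 keeps trailing empty fields, which split drops by default.
    my @f = split /\s*\|\s*/, $line, -1;
    return @f == $expected ? \@f : undef;
}

# Three fields instead of 30 to keep the demonstration short.
for my $line ("a | b | c|\n", "a|b|\n") {
    my $f = parse_line($line, 3);
    if ($f) {
        $f->[0] = uc $f->[0];    # stand-in for "do something to field 1"
        print join('|', @$f), "|\n";
    }
    else {
        print "Wrong field count, skipping line\n";
    }
}
```

A line with the wrong number of fields is rejected in a single pass over the string, so a malformed record costs about the same as a good one.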