Re: regex count

Jay Savage Thu, 25 Sep 2008 07:51:17 -0700

On Wed, Sep 24, 2008 at 10:55 PM, Stephen Reese <[EMAIL PROTECTED]> wrote:
>> Have a look at the sample data you posted and you will see where.
>>
>>
>> John
>
> I believe I found where the ']' needs to go but didn't see any extra ' '
> space.
>
> The $x count seems off. As I see it every time a regex match is made then $x
> will increase one. The match numbers results are about 5x greater then what
> they should be after correlating my grep findings with the perl output.
>
> my ( %srca, %quad, %port );
> my $x;
>
> while (<LOG>) {
>        next unless
> /Sig:\s*(\d+)\s+Subsig:\s*(\d+)\s+Sev:\s*(\d+)([^\[]+)\[([\d\.]+):(\d+)\s*->
> \s*([\d\.]+):(\d+)\]/;
>        $x++;
>        $srca{ $5 } += $x;
>        $quad{ sprintf '%-16s -> %-16s Port %-6s %-s', $5, $7, $8, $4 } +=
> $x;
> #       $port{ sprintf 'port %-6s %-16s %-s', $1, $5, $4 } += $x;
> #       $port{ sprintf 'port %-6s %-s', $1, $4 } += $x;
>        $port{ sprintf 'Sig %-6s Severity %-2s', $1, $2 } += $x;
> }
> my $n;
>
> print "\nSource Address Summary:\n";
> foreach my $i ( sort { $srca{$b} <=> $srca{$a} } keys %srca) {
>   if ($n++ >= $ntop) { last };
>   printf ("%6s: %s\n", $srca{$i},$i);
> }
> $n=0;
>
> print "Connection Summary:\n";
> foreach my $i ( sort { $quad{$b} <=> $quad{$a} } keys %quad) {
>   if ($n++ >= $ntop) { last };
>   printf ("%6s: %s\n", $quad{$i},$i);
> }
> $n=0;
>
> print "\nDestination Port Summary:\n";
> foreach my $i ( sort { $port{$b} <=> $port{$a} } keys %port) {
>   if ($n++ >= $ntop) { last };
>   printf ("%6s: %s\n", $port{$i},$i);
> }


I don't see where you're printing $x to check.

Assuming you have actually checked $x, though, the important question
isn't whether $x == `grep -c regex /your/log/file`.

The important question is whether $x == scalar keys %srca.

If those two match, then you still have a problem with your regex,
somewhere. Remember that Perl regexes are very different from grep's,
and if you use the same RE in both, you may get different results.
In particular, all those /\s*/ may be grabbing more than you think
they are. You also seem to possibly have an extra unescaped ']' in the
middle of the pattern (which leads me to ask: you do have warnings
turned on, right?).

On the other hand, if $x != scalar keys %srca (or scalar keys %quad),
then you are incrementing $x more often than you think, possibly
somewhere else in the program.

$x is probably just superfluous and distracting, anyway. The number of
key/value pairs in %srca, %quad, or %port will tell you how many times
your regex matched, provided every match produces a distinct key.
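
To make that comparison concrete, here is a minimal sketch. The sample
lines are invented purely to satisfy your regex (they are not from your
log), and %running and %per_key are names I made up for illustration.
It prints the match count next to the key count, and also shows what
"+= $x" does to the stored values: it adds the running total rather
than 1, which could be one reason the numbers look inflated.

```perl
use strict;
use warnings;

# Hypothetical sample lines, shaped only to satisfy the regex from the thread.
my @lines = (
    'Sig:3051 Subsig:1 Sev:4 TCP SYN [10.0.0.1:1234 -> 192.168.1.5:80]',
    'Sig:3051 Subsig:1 Sev:4 TCP SYN [10.0.0.1:1235 -> 192.168.1.5:80]',
    'Sig:2004 Subsig:0 Sev:2 ICMP echo [10.0.0.2:0 -> 192.168.1.5:0]',
    'Sig:3051 Subsig:1 Sev:4 TCP SYN [10.0.0.1:1236 -> 192.168.1.5:80]',
    'a line that does not match',
);

my $re = qr/Sig:\s*(\d+)\s+Subsig:\s*(\d+)\s+Sev:\s*(\d+)([^\[]+)\[([\d\.]+):(\d+)\s*->\s*([\d\.]+):(\d+)\]/;

my $x = 0;
my ( %running, %per_key );

for my $line (@lines) {
    next unless $line =~ $re;
    $x++;
    $running{$5} += $x;    # what the posted loop does: adds the running total
    $per_key{$5}++;        # adds exactly one per match
}

print "matches:          $x\n";                          # 4
print "distinct sources: ", scalar( keys %per_key ), "\n";    # 2
for my $src ( sort keys %per_key ) {
    # 10.0.0.1: 3 with ++, 7 with += $x; 10.0.0.2: 1 with ++, 3 with += $x
    printf "%-10s  ++ => %d   += \$x => %d\n",
        $src, $per_key{$src}, $running{$src};
}
```

If the "++" column agrees with your grep counts and the "+= $x" column
does not, the regex is probably fine and the accumulation is the issue.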

In fact, why bother with hashes at all, here? You're just using them
to keep track of order. That's not what hashes are for. That's what
ordered lists (arrays) are for:

    my ( @srca, @quad, @port );

    while (<LOG>) {
        # Skip any line that doesn't match the alert format.
        next unless /Sig:\s*(\d+)\s+Subsig:\s*(\d+)\s+Sev:\s*(\d+)([^\[]+)\[([\d\.]+):(\d+)\s*->\s*([\d\.]+):(\d+)\]/;
        push @srca, $5;    # source address
        push @quad, sprintf '%-16s -> %-16s Port %-6s %-s', $5, $7, $8, $4;
        push @port, sprintf 'Sig %-6s Severity %-2s', $1, $2;
    }

    # Addresses are strings, so use the default string sort rather than <=>.
    my $i = 0;
    print ++$i, ": $_\n" foreach sort @srca;
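
That said, if what you're really after is a "top N" summary per
address (which the sorted loops in your script suggest), a tally is
still the usual tool, just incremented by one per match. A rough
sketch, where top_n, @seen, and %count are hypothetical names and the
addresses are made up:

```perl
use strict;
use warnings;

# Hypothetical helper: return up to $limit [key, count] pairs,
# highest count first, ties broken by key so the order is stable.
sub top_n {
    my ( $counts, $limit ) = @_;
    my @keys = sort { $counts->{$b} <=> $counts->{$a} or $a cmp $b }
        keys %$counts;
    splice @keys, $limit if @keys > $limit;    # keep only the first $limit
    return map { [ $_, $counts->{$_} ] } @keys;
}

# Made-up source addresses standing in for $5 from each matching line.
my @seen = qw(10.0.0.1 10.0.0.2 10.0.0.1 10.0.0.3 10.0.0.1 10.0.0.2);

my %count;
$count{$_}++ for @seen;    # one increment per occurrence

printf "%6d: %s\n", $_->[1], $_->[0] for top_n( \%count, 2 );
```

With this data, 10.0.0.1 is listed first with a count of 3, followed
by 10.0.0.2 with a count of 2.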

HTH,

-- jay
--------------------------------------------------
This email and attachment(s): [ ] blogable; [ x ] ask first; [ ]
private and confidential

daggerquill [at] gmail [dot] com
http://www.tuaw.com http://www.downloadsquad.com http://www.engatiki.org

values of β will give rise to dom!
