In message <4f69f9f1.9070...@seb.ee>,
Risto Vaarandi writes:
>On 03/20/2012 09:54 PM, John P. Rouillard wrote:
>>
>> Hi all:
>>
>> When sec creates a dump file, the input sources are reported as:
>>
>> Input sources:
>> ============================================================
>> /var/log/messages (status: Open, type: regular file, device/inode:
>> 64774/8339528, received data: 180797 lines, context: _FILE_EVENT_messages)
>> /var/spool/sec/Control (status: Open, type: pipe, device/inode:
>> 64770/584644, received data: 0 lines, context: CONTROL)
>>
>> Would it be possible to also report the byte offset at which SEC is
>> reading for inputs of "type: regular file"?
>>
>> What I am trying to do is see how far SEC has to process to reach the
>> end of the file (aka realtime). Ideally I should be able to ls -l the
>> file and be able to compare that to the reported offset. The
>> difference between the two is how many bytes sec has to process to
>> reach realtime.
>>
>...
>The file position indicator should be easy to implement -- one needs to 
>invoke some extra system calls during the dump, which does not add CPU 
>overhead to regular rule matching.

Agreed.

>There is one crucial difference 
>between the number of processed lines and file position, though. The 
>former reflects lines successfully read and processed from a given file. 
>However, it is possible that the file position is located beyond the end 
>of the last processed line, since SEC implements line buffering layer on 
>top of read(2) system calls. For example, we could have a very long line 
>for which read(2) hasn't seen a terminating newline yet. Therefore, the 
>reported file position does not necessarily mean that all the data 
>before it have been processed by SEC -- it merely indicates the data 
>have been read (but could still reside in a buffer).

Hmm, maybe add an indication of how many characters are in the line
buffer? Something like:

   /var/log/messages (status: Open, type: regular file,
    device/inode: 64774/8339528, received data: 180797 lines,
    file position: 321992882 bytes, buffered: 1024 bytes,
    context: _FILE_EVENT_messages)

where "buffered" reports the number of bytes currently held in the
line buffer? (Also, IIUC this works for both --nojointbuf and
--jointbuf, since the line buffering layer is per input stream,
correct?)
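
To make the distinction between "read" and "processed" concrete, here
is a minimal Python sketch of a per-input line buffer on top of raw
reads. It is not SEC's code: the class LineBufferedInput, its methods
and the chunk size are made up for illustration. It only shows how the
file position can run ahead of the last complete line, and how a
buffered-bytes figure accounts for the gap.

```python
import os
import tempfile


class LineBufferedInput:
    """Hypothetical per-input line buffer layered on raw reads."""

    def __init__(self, path, chunk_size=4096):
        self.fd = os.open(path, os.O_RDONLY)
        self.chunk_size = chunk_size
        self.buffer = b""   # bytes read but not yet returned as a full line
        self.lines = 0      # complete lines handed to the caller

    def read_lines(self):
        """Read one chunk and return the complete lines it finishes."""
        self.buffer += os.read(self.fd, self.chunk_size)
        # Everything after the last newline stays in the buffer.
        *complete, self.buffer = self.buffer.split(b"\n")
        self.lines += len(complete)
        return complete

    def status(self):
        """Return (file position, buffered bytes, lines processed)."""
        position = os.lseek(self.fd, 0, os.SEEK_CUR)
        return position, len(self.buffer), self.lines

    def close(self):
        os.close(self.fd)


def demo():
    """Read a small file whose last line has no newline yet."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"first line\nsecond line\npartial line without newline")
        path = tmp.name
    src = LineBufferedInput(path)
    try:
        src.read_lines()
        return src.status()
    finally:
        src.close()
        os.unlink(path)
```

In demo() the file position ends up at the end of the file, but only
two lines count as processed; position minus buffered gives the offset
of the last byte actually handed on for matching.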

So a backup in processing could be seen in:

  file position < length of file

  buffered >> the expected length of a single line (i.e.
      there are complete newline-delimited lines in the buffer)
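
As a rough illustration of that check, assuming the dump gained the
"file position" and "buffered" fields suggested above (they are only a
proposal at this point), something like the following Python sketch
could compute the backlog. The function name backlog, the regular
expression and the line_len_limit threshold are all hypothetical.

```python
import re

# Field names follow the format proposed in this thread; they are an
# assumption, not existing SEC dump output.
DUMP_RE = re.compile(
    r"file position: (?P<position>\d+) bytes,\s*"
    r"buffered: (?P<buffered>\d+) bytes"
)

SAMPLE = "file position: 321992882 bytes, buffered: 1024 bytes,"


def backlog(dump_entry, file_size, line_len_limit):
    """Return (unread_bytes, unprocessed_bytes, buffer_backed_up).

    unread_bytes:      bytes in the file SEC has not read yet
    unprocessed_bytes: unread bytes plus bytes still in the line buffer
    buffer_backed_up:  True if the buffer holds more than line_len_limit
    """
    m = DUMP_RE.search(dump_entry)
    if m is None:
        raise ValueError("entry has no file position/buffered fields")
    position = int(m.group("position"))
    buffered = int(m.group("buffered"))
    unread = max(file_size - position, 0)
    return unread, unread + buffered, buffered > line_len_limit
```

In practice file_size would come from os.path.getsize() on the input
file (the same number ls -l shows), and line_len_limit would be picked
from what a long line looks like in that particular log.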

>> Which leads me to wonder if some sort
>> of profiling mode could be added to sec that tells me for every rule:
>>
>>    how many events are compared to the rule (the current metrics only
>>         tell me how many events were matched by the rule including
>>         context processing). With this info I can restructure the order
>>         to put more expensive/inefficient rules later in processing.
>>
>>    how many times an event is processed by a rule. This is sort of the
>>         inverse of the stat above, but is a good metric to use to tune
>>         rules as reducing the number of event ->  rule applications
>>         seems to be the best way to reduce processing time.
>>
>>    how long on average (in real and CPU seconds/centiseconds) it takes
>>         to process an event against a rule (provides an indication of
>>         how efficient the regexp/rule is).
>>
>> I expect this would toss performance through the floor, but I could
>> see this as being a useful offline mode to throw thousands of same
>> events against and use it to tune the ruleset order, regexps etc.
>>
>
>The rest of the ideas need some thought, though, since they add some 
>overhead to the main matching loop (even if extra code is not used, one 
>still needs to check command line flags if the measuring functionality 
>is switched on).
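
For what it's worth, the per-rule numbers I have in mind could be
gathered along these lines. This is a Python sketch, not SEC code:
ProfiledRule, run and report are hypothetical names, and stopping at
the first matching rule is an assumption made for the example. The
flag test at the top of match() is the per-event overhead mentioned
above.

```python
import re
import time


class ProfiledRule:
    """Hypothetical rule wrapper that keeps per-rule counters."""

    def __init__(self, name, pattern):
        self.name = name
        self.regex = re.compile(pattern)
        self.compared = 0    # events compared against this rule
        self.matched = 0     # events this rule matched
        self.seconds = 0.0   # wall-clock time spent matching

    def match(self, event, profile=False):
        if not profile:
            # Normal path: only the cost of checking the flag is added.
            return self.regex.search(event) is not None
        start = time.perf_counter()
        hit = self.regex.search(event) is not None
        self.seconds += time.perf_counter() - start
        self.compared += 1
        self.matched += hit
        return hit


def run(rules, events, profile=False):
    """Feed each event to the rules in order, stopping at the first match."""
    for event in events:
        for rule in rules:
            if rule.match(event, profile):
                break


def report(rules):
    """Return (name, compared, matched, average seconds) per rule."""
    return [
        (r.name, r.compared, r.matched,
         r.seconds / r.compared if r.compared else 0.0)
        for r in rules
    ]
```

A rule with a high compared count, a low matched count and a high
average time is the obvious candidate to move later in the ruleset.
time.process_time() could be used instead of time.perf_counter() to
get CPU time rather than real time.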

Is there a way in Perl to have the compile step strip lines from the
script? Sort of like how Python optimizes out "if __debug__:" blocks
when optimization is requested. That would give the profiler zero
effect unless enabled, since the code would not show up in the
compiled program at all.
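
The Python behaviour referred to here can be checked with a small
sketch like the one below. It compiles the same source at optimization
levels 0 and 1 and looks at whether the profiling call survives. The
names match, profile_hook, build and references_hook are made up for
the example; compile() with its optimize argument and the dis module
are standard Python.

```python
import dis

SOURCE = '''
def match(event):
    if __debug__:
        profile_hook(event)      # hypothetical profiling call
    return event.startswith("sshd")
'''


def build(optimize):
    """Compile SOURCE at the given optimization level.

    Returns the resulting match() function and the list that
    profile_hook appends to when it is called.
    """
    calls = []
    namespace = {"profile_hook": calls.append}
    exec(compile(SOURCE, "<rule>", "exec", optimize=optimize), namespace)
    return namespace["match"], calls


def references_hook(func):
    """True if the function's bytecode still mentions profile_hook."""
    return any(
        ins.argval == "profile_hook" for ins in dis.get_instructions(func)
    )
```

With optimize=0 the hook is called and appears in the bytecode; with
optimize=1 (what python -O does) the whole block is dropped at compile
time, so there is not even a flag test left in the matching path.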

Maybe a Perl source filter (ugly but...) could work? It may require
a second source file though.

For an idea, see the section

  USING CONTEXT: THE DEBUG FILTER

in http://perldoc.perl.org/perlfilter.html.

--
                                -- rouilj
John Rouillard
===========================================================================
My employers don't acknowledge my existence much less my opinions.

_______________________________________________
Simple-evcorr-users mailing list
Simple-evcorr-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/simple-evcorr-users
