The reason I asked about watching the analyzer process the files is so you
can determine where it slows down.  Once you know this, run just those logs
up to the point where it slows down and look at all the reports.  At least
one of those reports may have thousands of entries, and I need to know
which ones.  Internally, the log analyzer has to keep track of potentially
thousands of unique strings, and it allocates space to hold them.  You may
have hit a situation where there are more unique strings than the maps
were allotted space for, which causes collisions; this can make the
analyzer run slowly.
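
To illustrate the collision effect, here is a minimal Python sketch of a
fixed-size chained hash table.  It is not the analyzer's actual code; the
class name FixedTable, the bucket count, and the sample strings are all
made up for the illustration.  It simply counts how many string
comparisons are needed as the number of unique strings grows past what the
table was sized for.

```python
# Illustrative only: a fixed-size chained hash table, NOT the analyzer's
# real data structure. All names and sizes here are assumptions.
class FixedTable:
    def __init__(self, buckets):
        # The bucket count is fixed up front and never grows.
        self.buckets = [[] for _ in range(buckets)]
        self.comparisons = 0  # string comparisons done while searching chains

    def add(self, key):
        """Count one occurrence of key, walking its bucket's chain."""
        chain = self.buckets[hash(key) % len(self.buckets)]
        for entry in chain:
            self.comparisons += 1
            if entry[0] == key:
                entry[1] += 1
                return
        chain.append([key, 1])


def count_comparisons(unique_strings, buckets):
    """Insert that many distinct strings and report the comparison count."""
    table = FixedTable(buckets)
    for i in range(unique_strings):
        table.add(f"user{i}@example.com")  # hypothetical log values
    return table.comparisons


if __name__ == "__main__":
    for n in (100, 1_000, 20_000):
        print(n, "unique strings:", count_comparisons(n, 1024), "comparisons")
```

With 1,024 buckets, 100 unique strings need almost no comparisons, while
20,000 force every lookup to walk a chain of about 20 entries, so the work
grows much faster than the input does.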

Tripp


----- Original Message -----
From: "Darryl Koster" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Sunday, March 02, 2003 1:24 PM
Subject: RE: [IMail Forum] Log gile analyzer


>
>
> Good answer,
>
> Thanks
>
> Darryl
>
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]] On Behalf Of Smart Business
> Lists
> Sent: Sunday, March 02, 2003 12:33 PM
> To: Darryl Koster
> Subject: Re: [IMail Forum] Log gile analyzer
>
>
> Darryl,
>
> Sunday, March 2, 2003 you wrote:
> DK> No, it does not even come close to his numbers. If it did, it
> DK> would take approx. 30 seconds to parse the amount that I have
> DK> listed (5,000,000 at 250,000 per second comes to about 20 seconds).
>
>     Firstly, let me apologize for an incomplete and inaccurate
>     previous post.  There are several phases to the log analysis
>     program.  One phase is opening the files, reading the lines,
>     parsing the lines for the required reports, accumulating the data,
>     closing the file and then proceeding to the next file until all
>     files have been read.  Then there is the reporting phase which
>     produces the output of the data collected in the first.  I'll call
>     the first phase log parsing and the second, reporting.
>
>     I have been working on log parsing, and I tend to forget that
>     anything has to be done with the parsed data afterward. So when I
>     wrote that log files can be parsed at 250 K to 500 K lines per
>     second, I didn't mean throughput to a finished report, only the
>     log parsing phase. I tend to obsess over small details when I am
>     involved in something and forget the more obvious aspects.
>
>     So I didn't really mean that you could get an actual report that
>     rapidly.
>
>     But I've done some tests on my system and I'll share the results:
>
>     There are 27 reports that the IMail log analyzer can run. If only
>     one report is selected, less parsing and reporting is required
>     than if all 27 are selected.
>
>     On my system for February I have:
>        28 Files: 147,010,916 bytes, 1,560,832 lines
>                  (94.19 bytes/line)
>
>     Running the latest log analyzer from a Perl wrapper so that I can
>     time it, I get the following:
>
>         1 Report - SMTPD Connections (r1) - TXT output
>         Total Elapsed: 64.482 seconds (1 min 4.48 sec)
>         24,206 lines / sec
>         2,279,875 bytes / sec
>
>         27 Reports - (r1-r27) - TXT output
>         Total Elapsed: 103.358 seconds (1 min 43.36 sec)
>         15,101 lines / sec
>         1,422,346 bytes / sec
>         (output file = 5,877,048 bytes)
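
The wrapper used for the figures above was written in Perl and is not
shown.  As a rough equivalent, here is a Python sketch that times one run
of a command and derives lines/sec and bytes/sec.  The function names are
my own, and DEMO_CMD is only a stand-in so the sketch runs anywhere; the
analyzer's real command line is not given in this thread and would have to
be substituted.

```python
import subprocess
import sys
import time

# Stand-in command so the sketch is runnable as-is. Replace it with the
# real analyzer command line, which is not specified in this thread.
DEMO_CMD = [sys.executable, "-c", "pass"]


def throughput(elapsed_sec, total_lines, total_bytes):
    """Turn an elapsed time into lines/sec and bytes/sec."""
    return {
        "elapsed_sec": elapsed_sec,
        "lines_per_sec": total_lines / elapsed_sec,
        "bytes_per_sec": total_bytes / elapsed_sec,
    }


def time_command(cmd, total_lines, total_bytes):
    """Run cmd once, wall-clock it, and report throughput."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)  # raises if the command fails
    elapsed = time.perf_counter() - start
    return throughput(elapsed, total_lines, total_bytes)
```

Feeding the single-report figures into throughput(64.482, 1560832,
147010916) gives the same lines/sec and bytes/sec quoted above, to within
rounding.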
>
>     Averages are useful but sometimes very deceptive.  If you would
>     run the same test on a file by file basis you would find large
>     differences.  So it is a complex issue and we're not even taking
>     into account differences in hardware and load.
>
>     TXT output is faster than HTML output for rather obvious reasons.
>
>     I believe you wrote that you had 4.8 million lines for February.
>     That is a little more than 3 times what I have processed, so on
>     my system files of your size would take approximately 6 minutes
>     to process with all 27 reports.
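
The estimate above can be reproduced with a few lines of Python.  This is
only a sketch: it assumes run time scales linearly with line count, which
is a simplification, and the function name is my own.

```python
# Measured on the system described above: all 27 reports, TXT output.
MEASURED_LINES = 1_560_832
MEASURED_SECONDS = 103.358


def estimate_seconds(target_lines,
                     measured_lines=MEASURED_LINES,
                     measured_seconds=MEASURED_SECONDS):
    """Scale a measured run time to a different line count.

    Assumes time is proportional to the number of lines, which ignores
    per-file differences, hardware, and load.
    """
    return measured_seconds * target_lines / measured_lines


if __name__ == "__main__":
    seconds = estimate_seconds(4_800_000)
    print(f"{seconds:.0f} seconds, about {seconds / 60:.1f} minutes")
```

For 4.8 million lines this comes out between five and six minutes.  A run
that takes far longer than that points at something other than raw line
count.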
>
>     Tripp Allen made a very useful suggestion.  If you run these from
>     the analyzer GUI, you can watch and roughly measure the time the
>     program takes to parse each file and then to prepare and run each
>     report.  It might be useful to run one file with various options
>     to see what is taking so long on your system.  Or, for all files,
>     begin with just one report section and TXT output, then gradually
>     add more until you see where the time goes.
>
>     Normally you do not need all the report sections so eliminate all
>     that you do not absolutely require.  Likewise TXT output is faster
>     than HTML output.
>
>     In our case we run HTML files daily for our logs and have an ASP
>     application that allows us to view each log file.
>
> 60 minutes seems long but I find the log analyzer program to be pretty
> fast so I think you probably need to search for changes in your
> reporting to speed up the process rather than hoping for improvements
> in the program.
>
> HTH
>
> Terry Fritts
>
>
> To Unsubscribe: http://www.ipswitch.com/support/mailing-lists.html
> List Archive: http://www.mail-archive.com/imail_forum%40list.ipswitch.com/
> Knowledge Base/FAQ: http://www.ipswitch.com/support/IMail/
>


