Darryl,

Sunday, March 2, 2003 you wrote:
DK> No, does not even come close to his numbers, if it did it would
DK> take app 30 seconds to parse the amount that I have listed
DK> (5,000,000 at 250,000 per second comes to about 20 seconds)..

    Firstly, let me apologize for an incomplete and inaccurate
    previous post.  There are several phases to the log analysis
    program.  One phase is opening the files, reading the lines,
    parsing the lines for the required reports, accumulating the data,
    closing the file and then proceeding to the next file until all
    files have been read.  Then there is the reporting phase which
    produces the output of the data collected in the first.  I'll call
    the first phase log parsing and the second, reporting.

    I have been working on log parsing and tend to forget that
    anything has to be done with the parsed data afterwards.  So when
    I wrote that log files can be parsed at 250 K to 500 K lines per
    second, I meant the log parsing phase only, not the throughput to
    a finished report.  I obsess over little details in my programming
    when I am involved in something and forget the more obvious
    aspects.

    So I didn't really mean that you could get an actual report that
    rapidly.

    But I've done some tests on my system and I'll share the results:

    There are 27 reports that the Imail log analyzer would run. If one
    report is selected then there is less parsing and reporting
    required than if 27 reports are selected.

    On my system for February I have:
       28 Files: 147,010,916 bytes, 1,560,832 lines
                 (94.19 bytes/line)

    Running the latest log analyzer from a Perl wrapper so I can time
    it, I get the following:

        1 Report - SMTPD Connections (r1) - TXT output
        Total Elapsed: 64.482 seconds (1 min 4.48 sec)
        24,206 lines / sec
        2,279,875 bytes / sec

        27 Reports - (r1-r27) - TXT output
        Total Elapsed: 103.358 seconds (1 min 43.36 sec)
        15,101 lines / sec
        1,422,346 bytes / sec
        (output file = 5,877,048 bytes)
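
    The per-second figures above follow directly from the line and
    byte totals divided by the elapsed time.  A minimal Python sketch
    of that arithmetic, using only the numbers quoted in this message
    (the function name is just for illustration):

```python
def throughput(total_lines, total_bytes, elapsed_seconds):
    """Return (lines per second, bytes per second) for one timed run."""
    return total_lines / elapsed_seconds, total_bytes / elapsed_seconds


# February totals quoted above: 28 files.
LINES = 1_560_832
BYTES = 147_010_916

# Elapsed times quoted above for the two TXT runs.
for label, elapsed in (("1 report", 64.482), ("27 reports", 103.358)):
    lines_per_sec, bytes_per_sec = throughput(LINES, BYTES, elapsed)
    print(f"{label}: {lines_per_sec:,.0f} lines/sec, "
          f"{bytes_per_sec:,.0f} bytes/sec")
```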

    Averages are useful but sometimes very deceptive.  If you ran the
    same test file by file, you would find large differences between
    files.  So it is a complex issue, and we're not even taking into
    account differences in hardware and load.

    TXT output is faster than HTML output for rather obvious reasons.

    I believe you wrote that you had 4.8 million lines for February.
    That is a little more than 3 times what I have processed, so on
    my system files of your size would take roughly 5 to 6 minutes
    for all 27 reports.
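
    That estimate is a straight linear scaling of the 27-report run.
    A small Python sketch of the calculation, assuming run time grows
    in proportion to line count (which, given the caveat about
    averages, is only a rough guide; the function name is illustrative):

```python
def estimate_seconds(target_lines, measured_lines, measured_seconds):
    """Scale a measured run time linearly to a different line count."""
    return target_lines * measured_seconds / measured_lines


# 4.8 million lines, scaled from the 27-report TXT run quoted above.
seconds = estimate_seconds(4_800_000, 1_560_832, 103.358)
print(f"{seconds:.0f} seconds, about {seconds / 60:.1f} minutes")
```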

    Tripp Allen made a very useful suggestion.  If you run these from
    the analyzer GUI, you can watch and roughly measure the time the
    program takes to parse each file and then to prepare and run each
    report.  It might be useful to run one file with various options
    to see what is taking so long on your system.  Or, for all files,
    begin with just one report section and TXT output, then gradually
    add more until you see where the time is going.
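
    If you would rather time runs from a script than watch the GUI, a
    wrapper like the perl one mentioned earlier can be sketched in
    Python.  This is only an outline: the command shown is a
    placeholder, not the analyzer's real command line, which you
    would need to substitute yourself.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion and return (exit code, elapsed seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(argv)
    return completed.returncode, time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder command (assumption): replace with the real analyzer
    # invocation and the report options you want to compare.
    code, elapsed = time_command([sys.executable, "-c", "pass"])
    print(f"exit {code}, {elapsed:.3f} seconds")
```

    Running it once per report selection and comparing the elapsed
    times should show which sections cost the most.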

    Normally you do not need all the report sections, so eliminate
    any that you do not absolutely require.  And, as noted above, TXT
    output is faster than HTML output.

    In our case we run HTML files daily for our logs and have an ASP
    application that allows us to view each log file.

    60 minutes seems long, but I find the log analyzer program to be
    pretty fast.  I think you will gain more by changing which reports
    you run and how you output them than by hoping for improvements in
    the program.

HTH

Terry Fritts


To Unsubscribe: http://www.ipswitch.com/support/mailing-lists.html
List Archive: http://www.mail-archive.com/imail_forum%40list.ipswitch.com/
Knowledge Base/FAQ: http://www.ipswitch.com/support/IMail/
