Darryl,
Sunday, March 2, 2003 you wrote:
DK> No, does not even come close to his numbers, if it did it would
DK> take app 30 seconds to parse the amount that I have listed
DK> (5,000,000 at 250,000 per second comes to about 20 seconds)..
Firstly, let me apologize for an incomplete and inaccurate
previous post. There are several phases to the log analysis
program. One phase is opening the files, reading the lines,
parsing the lines for the required reports, accumulating the data,
closing the file and then proceeding to the next file until all
files have been read. Then there is the reporting phase which
produces the output of the data collected in the first. I'll call
the first phase log parsing and the second, reporting.
I have been working on log parsing, and I tend to forget that the
parsed data still has to be turned into reports. So when I wrote
that log files can be parsed at 250 K to 500 K lines per second, I
did not mean end-to-end throughput including the reports, only the
log parsing phase. I tend to obsess over small details when I am
involved in something and overlook the more obvious aspects.
So I did not mean that you could get an actual report that
rapidly.
But I've done some tests on my system and I'll share the results:
There are 27 reports that the IMail log analyzer can run. If one
report is selected then there is less parsing and reporting
required than if 27 reports are selected.
On my system for February I have:
28 Files: 147,010,916 bytes, 1,560,832 lines
(94.19 bytes/line)
Running the latest log analyzer from a Perl wrapper so that I can
time it, I get the following:
1 Report - SMTPD Connections (r1) - TXT output
Total Elapsed: 64.482 seconds (1 min 4.48 sec)
24,206 lines / sec
2,279,875 bytes / sec
27 Reports - (r1-r27) - TXT output
Total Elapsed: 103.358 seconds (1 min 43.36 sec)
15,101 lines / sec
1,422,346 bytes / sec
(output file = 5,877,048 bytes)
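For anyone who wants to reproduce the rates above from their own
timings, here is a minimal Python sketch of the arithmetic. The
function name is made up for illustration; the line, byte and
elapsed figures are the ones quoted above. Because the elapsed
times are rounded, the last digit of a computed rate can differ by
one from the figures listed.

```python
def throughput(total_lines, total_bytes, elapsed_seconds):
    """Return (lines per second, bytes per second) for one timed run."""
    return total_lines / elapsed_seconds, total_bytes / elapsed_seconds


# February totals quoted above: 28 files.
LINES = 1_560_832
BYTES = 147_010_916

# Elapsed times quoted above for the two TXT runs.
for label, elapsed in (("1 report", 64.482), ("27 reports", 103.358)):
    lps, bps = throughput(LINES, BYTES, elapsed)
    print(f"{label}: {lps:,.0f} lines/sec, {bps:,.0f} bytes/sec")

# Average line length across all files.
print(f"{BYTES / LINES:.2f} bytes/line")
```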
Averages are useful but sometimes very deceptive. If you ran the
same test on a file-by-file basis you would find large
differences. So it is a complex issue, and we're not even taking
into account differences in hardware and load.
TXT output is faster than HTML output for rather obvious reasons.
I believe you wrote that you had 4.8 million lines for February.
That is a little more than 3 times what I have processed, so on
my system files of your size would take roughly 5 to 6 minutes
for all 27 reports.
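The estimate is a straight linear extrapolation from my 27-report
run, sketched below in Python. It assumes the time scales in
proportion to the line count, which is only a rough guide given the
file-to-file variation mentioned above. The function name is made
up for illustration, and the 4.8 million figure is my recollection
of your post.

```python
def estimate_seconds(target_lines, measured_lines, measured_seconds):
    """Scale a measured run time linearly to a different line count."""
    return target_lines * measured_seconds / measured_lines


# Measured on my system: 27 reports, TXT output.
MEASURED_LINES = 1_560_832
MEASURED_SECONDS = 103.358

# Line count I believe you reported for February.
TARGET_LINES = 4_800_000

est = estimate_seconds(TARGET_LINES, MEASURED_LINES, MEASURED_SECONDS)
print(f"{est:.0f} seconds, about {est / 60:.1f} minutes")
```

This comes out at a little over five minutes on my hardware; a
different machine or load will shift it.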
Tripp Allen made a very useful suggestion. If you run these from
the analyzer GUI then you can watch and measure approximately the
time the program takes to parse each file and then to prepare and
run each report. It might be useful to run one file with various
options in order to see what is taking so long on your system. Or,
for all files, begin with just one report section and TXT output,
and then gradually add more sections until you see where the time
goes. Normally you do not need all the report sections, so
eliminate any that you do not absolutely require. And, as noted,
TXT output is faster than HTML output.
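If you would rather time runs from a script than watch the GUI, a
small wrapper is enough. The Python sketch below shows the idea
(my own wrapper is in Perl). The analyzer's command line is not
given in this thread, so the command here is only a stand-in that
starts the Python interpreter and exits; substitute whatever
command you actually use to run the analyzer.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion and return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    completed = subprocess.run(argv, check=False)
    return time.perf_counter() - start, completed.returncode


if __name__ == "__main__":
    # Stand-in command (an assumption): replace with your analyzer
    # invocation, once per combination of report sections to compare.
    demo_command = [sys.executable, "-c", "pass"]
    elapsed, exit_code = time_command(demo_command)
    print(f"Total Elapsed: {elapsed:.3f} seconds (exit code {exit_code})")
```

Timing the same set of files with one report section, then a few
more, then all of them, shows where the extra time is going.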
In our case we run HTML files daily for our logs and have an ASP
application that allows us to view each log file.
60 minutes seems long, but I find the log analyzer program to be
pretty fast, so I think you will probably get further by changing
your report selection to speed up the process than by hoping for
improvements in the program.
HTH
Terry Fritts