Re: Log analysis server suggestions?
On Thursday 16 February 2006 15:07, Nathan Vidican wrote:
> I would advise against trying to log everything into SQL records. Aside
> from the performance hit of translating log output into SQL
> inserts/queries, then having the SQL server write to disk anyway, it just
> complicates things unnecessarily.

You are probably right. I was thinking that it would be easier to search through in a database, but then, most of the issues we are interested in (e.g. disk failure) we want to know about *now*, rather than the sort of thing that is revealed by historical analysis.

> My advice would be to take a step back and look at what's important to
> you. [...] I find it's best to work with a mixture of things and hack
> your own scripts to fill in the gaps.

Having looked at some logs, most of the stuff we are interested in is probably specific to our setup. Log formats are so loose that I doubt any off-the-shelf log analysis tool would be much good unless it was 10x more complex than most of the software we want to log in the first place. It has surprised me how much time and effort it takes to turn logs into useful data. And I wonder how Windows admins get by at all?

Thanks for the advice

Ashley
Re: Log analysis server suggestions? [long]
On Thursday 16 February 2006 15:30, Chuck Swiger wrote:
> I'm not sure who the original poster was, but whoever is interested in
> this topic might benefit by reading a thread from the firewall-wizards
> mailing list: [snip]

Cheers, that was very useful. I've put it into our company Wiki so it can be ignored by everyone :)

I like the 3-stage processing:

> Simply design your analysis as an always 3-stage process consisting of:
> - weeding out and counting instances of uninteresting events
> - selecting, parsing sub-fields of, and processing interesting events
> - retaining events that fell through the first two steps as unusual

That solves the problem of missing logs that you didn't anticipate, although it adds a lot to the initial server configuration.

Ashley
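As an illustration, a minimal table-driven Perl sketch of those three stages. The patterns, handler names and the residue file name are made up for the example, not taken from the thread:

    #!/usr/bin/perl
    # Three-stage log triage: count the uninteresting, parse the
    # interesting, and retain whatever falls through as unusual.
    use strict;
    use warnings;

    my (%ignored, %counts);

    # Stage 1: patterns we have decided are noise (hypothetical examples).
    my @ignore = (
        qr/sshd\[\d+\]: Connection closed by/,
        qr/cron\[\d+\]: .* CMD /,
    );

    # Stage 2: patterns we know how to parse, paired with handlers.
    my @parse = (
        [ qr/sendmail\[\d+\]: .*\bstat=Sent\b/, \&count_sent_mail  ],
        [ qr/kernel: (ad\d+): .*error/,         \&alert_disk_error ],
    );

    open my $residue, '>>', 'residue.log' or die "residue.log: $!";

    LINE: while (my $line = <STDIN>) {
        for my $re (@ignore) {                  # stage 1: weed out and count
            if ($line =~ $re) { $ignored{$re}++; next LINE }
        }
        for my $p (@parse) {                    # stage 2: parse interesting
            my ($re, $handler) = @$p;
            if ($line =~ $re) { $handler->($line); next LINE }
        }
        print {$residue} $line;                 # stage 3: retain the unusual
    }

    sub count_sent_mail  { $counts{sent_mail}++ }
    sub alert_disk_error { warn "DISK ERROR: $_[0]" }

The %ignored and %counts hashes are the per-event counts the forwarded post describes; the residue file is what gets brought to a human's attention.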
Log analysis server suggestions?
Until recently I had a server running syslog-ng, set to archive all logs into server/year/month/day/ directories. Now that the server is running on amd64 we've lost our hi-res scrolling display, so I want to look at a better log watching system.

I've read about logging to a database, and I quite like the idea of storing our logs in PostgreSQL (I don't like MySQL and don't want to get involved in administering a second database). I know I can log to a PG database quite easily, but I don't know how I can get the data back out without writing manual queries. Here is what I need:

- Logs stored for the last 6 months or so, and easily searchable
- Live log watching
- Log analysis

I might try swatch for the live log watching, as this is not affected by the choice of log storage and it seems the best tool for the job. As for searching / analysis, I've seen php-syslog-ng ( http://www.vermeer.org/projects/php-syslog-ng ), which looks very basic, and phpLogCon ( http://www.phplogcon.com/ ), which does not support PG anyway. Is there anything better GUI-wise? Maybe I am best off keeping the logs in text files for now, and spending more time on swatch.

Any thoughts?

Cheers

Ashley
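On the PostgreSQL side: one common approach, since syslog-ng of that era has no native database driver, is to feed a program() destination into a small script that does the inserts. A hedged sketch; the template string, source name, table name and column layout are assumptions, not tested config, so check the template() syntax against your syslog-ng version:

    # syslog-ng.conf fragment (illustrative only)
    destination d_pgsql {
        program("/usr/local/sbin/log2pg.pl"
            template("$HOST\t$FACILITY\t$PRIORITY\t$MSG\n"));
    };
    log { source(src); destination(d_pgsql); };

and the script it feeds:

    #!/usr/bin/perl
    # log2pg.pl - read tab-separated log lines on stdin and insert them
    # into a hypothetical "logs" table in PostgreSQL via DBI.
    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:Pg:dbname=syslog', 'syslog', '',
                           { AutoCommit => 1, RaiseError => 1 });
    my $sth = $dbh->prepare(
        'INSERT INTO logs (host, facility, priority, msg)
         VALUES (?, ?, ?, ?)');

    while (my $line = <STDIN>) {
        chomp $line;
        my ($host, $fac, $pri, $msg) = split /\t/, $line, 4;
        $sth->execute($host, $fac, $pri, $msg);
    }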
Re: Log analysis server suggestions?
Ashley Moran wrote:
> Until recently I had a server running syslog-ng, set to archive all logs
> into server/year/month/day/ directories. [...] Maybe I am best off
> keeping the logs in text files for now, and spending more time on swatch.
> Any thoughts?

In my experience, logfiles are best NOT in SQL. Flat files are easy to deal with, and with a few simple Perl scripts you could accomplish all you need to. You can run a tail -f and dump output to stdout, or even pipe it to a socket and monitor remotely. Also, various programs have great open-source analysers for specific logs (e.g. Apache, sendmail, etc).

I would advise against trying to log everything into SQL records. Aside from the performance hit of translating log output into SQL inserts/queries, then having the SQL server write to disk anyway, it just complicates things unnecessarily.

My advice would be to take a step back and look at what's important to you. Decide which logs need to be monitored in real time: are there certain criteria that require immediate attention? What about alpha-numeric pager systems, or emailed warnings? Are customers going to require reports/information (e.g. web server stats, sendmail relay stats or spam logs, bandwidth usage, etc)? It of course depends on your overall system, and your users more than anything... but in the end I find it's best to work with a mixture of things and hack your own scripts to fill in the gaps.

Just my two cents, hope it helps.

--
Nathan Vidican
[EMAIL PROTECTED]
Windsor Match Plate Tool Ltd.
http://www.wmptl.com/
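A sketch of the tail-and-pipe-to-a-socket idea in Perl; the monitor host, port, log path and the interest filter are all placeholders:

    #!/usr/bin/perl
    # Follow a log file and forward matching lines to a remote monitor.
    use strict;
    use warnings;
    use IO::Socket::INET;

    my $sock = IO::Socket::INET->new(
        PeerAddr => 'monitor.example.com',    # placeholder host
        PeerPort => 5140,                     # placeholder port
        Proto    => 'tcp',
    ) or die "connect: $!";

    open my $tail, '-|', 'tail', '-F', '/var/log/messages'
        or die "tail: $!";

    while (my $line = <$tail>) {
        next unless $line =~ /error|fail|panic/i;   # crude interest filter
        print {$sock} $line;                        # remote monitoring
        print $line;                                # local echo to stdout
    }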
Re: Log analysis server suggestions? [long]
Ashley Moran [EMAIL PROTECTED] wrote:
> [ ... ]

I'm not sure who the original poster was, but whoever is interested in this topic might benefit by reading a thread from the firewall-wizards mailing list:

-------- Original Message --------
Subject: [fw-wiz] parsing logs ultra-fast inline
Date: Wed, 01 Feb 2006 16:03:38 -0500
From: Marcus J. Ranum [EMAIL PROTECTED]
To: firewall-wizards@honor.icsalabs.com

OK, based on some offline discussion with a few people about doing large amounts of system log processing inline at high speeds, I thought I'd post a few code fragments and some ideas that have worked for me in the past.

First off, if you want to handle truly ginormous amounts of log data quickly, you need to build a structure wherein you're making decisions quickly at a broad level, then drilling down based on the results of the decision. This allows you to parallelize infinitely, because all you do is make the first branch in your decision tree stripe across all your analysis engines. So, hypothetically, let's say we were handling typical UNIX syslogs at a ginormous volume: we might have one engine (CPU/process, or even a separate box/bus/backplane/CPU/drive array) responsible for (sendmail | named) and another one responsible for (apache | imapd), etc. If you put some simple counters in your analysis routines (hits versus misses) you can load-balance your first tree-branch appropriately using a flat percentage.

Also, remember, if you standardize your processing, it doesn't matter where it happens; it can happen at the edge/source or back in a central location, or any combination of the two. Simply design your analysis as an always 3-stage process consisting of:

- weeding out and counting instances of uninteresting events
- selecting, parsing sub-fields of, and processing interesting events
- retaining events that fell through the first two steps as unusual

The results of these 3 stages are:

- a set of event-IDs and counts
- a set of event-IDs and interesting fields and counts
- residual data in raw form

Back-haul the event-IDs and counts and fields and graph them or stuff them into a database, and bring the residual data to the attention of a human being.

I suppose if you needed to, you could implement a log load balancer in the form of a box that had N interfaces, collected a fat stream of log data, ran a simple program that sorted the stream into 1/N sub-streams, and forwarded them to backend engines for more involved processing. You could scale your logging architecture to very, very large loads this way. It works for Google and it'd work for you, too.

The first phase of processing is to stripe across engines if necessary; then inside each engine you stripe the processing into functional sub-parsers that deal with a given message format. The implementation is language-irrelevant, though your language choice will affect performance. Typically you write a main loop that looks like:

    while ( get a message ) {
        if (message is a sendmail message)
            parse sendmail message
        if (message is an imap message)
            parse imap message
        ...
    }

Once your system has run on a sample dataset, you will be able to determine which messages come most frequently, and you can put that test at the top of the loop. This can result in an enormous performance boost. Each sub-parse routine follows the same structure as the main loop, performing a sorted series of checks to sub-parse the fields of the message-specific formats.
For example:

    parse sendmail message( ) {
        if (message is a stat=sent message) {
            pull out recipient;
            pull out sender;
            increment message sent count;
            add message size to sender score;
            done
        }
        if (message is a stat=retry message) {
            ignore; // done
        }
        if (message is a whatever) {
            whatever;
            done
        }
        // if we fell through to here we have a new message structure
        // we have never seen before!!
        vector message to interesting folder;
    }

Variant messages are a massive pain in the butt; you need to decide whether to deal with variants as separate cases or to make the sub-parser smarter in order to deal with them. This is one of the reasons I keep saying that system log analysis is highly site-specific! If your site doesn't get system logs from a DECstation 5000 running ULTRIX 3.1D, then you don't need to parse that data. Indeed, if you build your parse tree around that notion then, if you suddenly start getting ULTRIX-format log records in your data stream, that'd be - shall we say - interesting, and you want to know about it.

I remember when I was looking at some log data at one site (Hi Abe!) we found one log message that was about 400K long, on a single line. It appeared that a fairly crucial piece of software had decided to spew a chunk of its memory in a log message, for no apparent reason.
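For what it's worth, here is a rough, runnable Perl rendering of that dispatch-then-subparse structure. It is a sketch, not Marcus's code: the sendmail/imapd regexes and the stat= field handling are illustrative guesses at common syslog formats and should be checked against real logs, and the residue file name is a placeholder.

    #!/usr/bin/perl
    # Dispatch loop plus one sub-parser: most frequent message type is
    # tested first, each sub-parser handles its known variants, and
    # anything unrecognized lands in a residue file for a human.
    use strict;
    use warnings;

    my (%hits, %sent_count, %sender_score);
    open my $residue, '>>', 'residue.log' or die "residue.log: $!";

    while (my $msg = <STDIN>) {
        if ($msg =~ /\bsendmail\[\d+\]:/) {      # most frequent at this site
            $hits{sendmail}++;
            parse_sendmail($msg);
        }
        elsif ($msg =~ /\bimapd\[\d+\]:/) {
            $hits{imap}++;
            parse_imap($msg);
        }
        else {
            $hits{other}++;
            print {$residue} $msg;               # nothing claimed it
        }
    }

    sub parse_sendmail {
        my ($msg) = @_;

        if ($msg =~ /\bstat=Sent\b/) {
            # Hedged: in real sendmail logs, from=/size= and to=/stat=
            # live on separate records joined by queue ID; they are
            # treated as one line here for brevity.
            my ($rcpt)   = $msg =~ /\bto=<?([^,>\s]+)/;
            my ($sender) = $msg =~ /\bfrom=<?([^,>\s]+)/;
            my ($size)   = $msg =~ /\bsize=(\d+)/;
            $sent_count{$rcpt || 'unknown'}++;
            $sender_score{$sender || 'unknown'} += $size || 0;
            return;
        }
        return if $msg =~ /\bstat=Deferred\b/;   # the "retry: ignore" case

        # Fell through: a message structure we have never seen before!
        print {$residue} $msg;
    }

    sub parse_imap { }                           # site-specific, omitted

Once the %hits counters have been populated on a sample run, the if/elsif ordering can be rearranged to put the biggest hitter first, which is the load-balancing trick the post describes.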
Re: Log analysis server suggestions?
Ashley Moran wrote:
> Until recently I had a server running syslog-ng, set to archive all logs
> into server/year/month/day/ directories. [...] Is there anything better
> GUI-wise? Maybe I am best off keeping the logs in text files for now, and
> spending more time on swatch. Any thoughts?

http://www.loganalysis.org and the related listserv might be well worth your time...
Re: Log analysis server suggestions?
> As for searching / analysis, I've seen php-syslog-ng (
> http://www.vermeer.org/projects/php-syslog-ng ), which looks very basic,
> and phpLogCon ( http://www.phplogcon.com/ ), which does not support PG
> anyway. Is there anything better GUI-wise?

As for the log analysis, I remember attending a security seminar where the conclusion was that a good log analysis system should let you define which events are unimportant and can be ignored, so that all other events, including the unexpected ones, are shown as important and requiring action.

Best regards,

Olivier
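That is essentially the "artificial ignorance" approach. A minimal Perl sketch, assuming a hypothetical ignore.patterns file holding one known-boring regex per line:

    #!/usr/bin/perl
    # Print only the lines that match none of the known-boring patterns;
    # whatever survives is by definition unexpected and deserves a look.
    use strict;
    use warnings;

    open my $pf, '<', 'ignore.patterns' or die "ignore.patterns: $!";
    my @boring = map { chomp; qr/$_/ } grep { /\S/ } <$pf>;
    close $pf;

    LINE: while (my $line = <>) {
        for my $re (@boring) {
            next LINE if $line =~ $re;   # known noise: drop it
        }
        print $line;                     # everything else needs human eyes
    }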