On Tue, 25 Jan 2011, Nathan Hruby wrote:

> On Mon, Jan 24, 2011 at 4:25 PM,  <[email protected]> wrote:
>> Splunk is great for doing ad-hoc searches through your logs
>> (investigations, troubleshooting, etc), but having it do event correlation
>> is really inefficient. What you do for 'event correlation' with splunk is
>> to define a search and run it repeatedly. if you have enough logs that
>> your search time is larger than what will fit in ram, this will really
>> hammer your I/O system (on multiple systems if you have scaled your splunk
>> install).
>
> Yes, complicated searches using non-indexed data or fields at query
> time can drag down the web UI.  Disabling segmenting and tuning the
> indexer to ensure it's indexing the data you search on often is
> important.  There's also some query tuning one can do to make things
> go faster, though, it's been a while since I used Splunk so I don't
> really remember what those hints are any more :)
>
> Splunk allows you to do similar (though not  as complicated) things as
> SEC does at Splunk's indexing time via multi-line events and
> transactions:
> - http://www.splunk.com/base/Documentation/latest/Knowledge/Abouttransactions
> - http://www.splunk.com/base/Documentation/latest/Admin/Indexmulti-lineevents
> which you can then run a regular job to look for and take action accordingly
>
> Speaking of the regular reports, you can also pre-bake search results
> for faster report generation when you need it:
> - http://www.splunk.com/base/Documentation/latest/Knowledge/Usesummaryindexing

My point is that the way Splunk does these things is by scheduling 
searches across the entire timeframe of logs. That is a very expensive 
thing to do, and much more expensive (in terms of processing and I/O) than 
something like SEC, which looks at the logs as they arrive and remembers 
only the context you tell it to remember.

If your data rate is low enough that the data you need to search will 
still be in Splunk's disk cache, it's not much of a problem. But as the 
rate of data goes up, this falls apart.

I have a lot of logs (we are in the process of upgrading our Splunk 
license from 300GB/day to 600GB/day), and as usual, the rate of data 
varies greatly over the day. 600GB/day is 25GB/hour on average; at peak 
times this can be 2-3 times as much data (plus the fact that Splunk then 
creates a lot of index data, which consumes even more RAM). The result is 
that the data for even a 10-minute search window is unlikely to still be 
in RAM, even for my farm, which has hundreds of gigs of RAM available 
across the machines.
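To make the back-of-envelope numbers concrete, here's a rough sketch of 
the rate arithmetic (the 3x peak multiplier is taken from the "2-3 times" 
figure above; it's an assumption for illustration, not a measured value):

```python
# Rough ingest-rate arithmetic for a 600GB/day Splunk license.
GB_PER_DAY = 600

avg_gb_per_hour = GB_PER_DAY / 24            # 25 GB/hour on average
avg_mb_per_sec = GB_PER_DAY * 1024 / 86400   # ~7.1 MB/s sustained ingest

peak_gb_per_hour = avg_gb_per_hour * 3       # assume 3x average at peak

# Raw log data arriving in a 10-minute window at peak rate,
# before Splunk's own index data is added on top:
window_gb = peak_gb_per_hour / 6             # ~12.5 GB per 10 minutes

print(f"average: {avg_gb_per_hour:.1f} GB/hour ({avg_mb_per_sec:.1f} MB/s)")
print(f"peak 10-minute window: ~{window_gb:.1f} GB of raw logs")
```

And that 12.5GB is only the raw logs; with index overhead on top, a 
10-minute window at peak quickly outruns whatever page cache a given 
indexer has free.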

Once the data you are searching no longer fits in RAM, performance drops 
drastically. When doing searches for rare events, Splunk is seek limited. 
Disks manage between 100-160 seeks/sec per spindle, but seeking through 
data that's in RAM is several orders of magnitude faster (many millions 
of seeks/sec).
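As a hypothetical illustration of that gap: suppose a rare-event search 
needs 10,000 random reads (an invented number, purely for the example), 
and compare disk-seek rates against RAM-like random access:

```python
# Hypothetical: time to service random reads for a rare-event search.
# 10,000 reads is an invented count used only to show the scale of the gap.
random_reads = 10_000

disk_seeks_per_sec = 150        # ~100-160 seeks/sec per spindle
ram_seeks_per_sec = 10_000_000  # "many millions" of random accesses/sec

disk_seconds = random_reads / disk_seeks_per_sec   # about a minute per spindle
ram_seconds = random_reads / ram_seeks_per_sec     # about a millisecond

print(f"disk: {disk_seconds:.0f} s, ram: {ram_seconds * 1000:.0f} ms")
```

Roughly a minute on one spindle versus a millisecond in RAM, which is why 
falling out of the disk cache hurts so badly.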

If you are doing a summary report on data that shows up a lot, you can 
get into a mode where you are not seek limited but are instead retrieving 
lots of data and decompressing it; there you can become CPU limited 
instead, but even then you will consume a lot of RAM very fast.

David Lang

> That all said, for pure "get mail -> see spam -> block host" kind of
> event correlation, SEC is probably easier and certainly cheaper than
> Splunk.  Splunk takes some extra out-of-the-box work to get your data
> rigged correctly but once that done and integrated into service and
> host config management, the niftyness quotient goes up much more
> quickly for splunk.
>
> -n
>
_______________________________________________
Tech mailing list
[email protected]
https://lists.lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
 http://lopsa.org/