IMHO, you really don't want your log collection tier doing data
enrichment, because then the collection tier has a dependency on the
enrichment source. If the enrichment source stops responding, your
pipeline can break. Even if the enrichment source does not fail, your
collection can only run as fast as the enrichment source responds. A
simple example is DNS lookups. For large volumes of logs, a DNS outage
can wreak havoc on your log collection tier. I have seen instances of
the log collection tier effectively DoS-ing the network because DNS
infra failed, which is the last thing you want.

Without mentioning specific tools, a good enrichment architecture
binds raw data with enrichment sources at query time. For example, you
throw log data in a database. You then have a layer of log query APIs
that pull raw data from the database, join it with static enrichment
sources (maybe also stored in the database), and also join it with
dynamic enrichment sources. At a 10,000 ft level, your log query app
makes an API call to the logQuery layer and asks for, say, firewall
logs as {TimeStamp, FirewallName, SourceIP, SourceIPCountry,
DestinationIP, DestinationIPCountry, RblRatingOfSourceIP,
RblRatingOfDestinationIP}. In the backend, the logQuery layer grabs raw
data from the database, grabs GeoIP from either a local file or some
cloud service, and grabs RBL info dynamically from a public RBL source.
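
To make the query-time join concrete, here is a rough Python sketch, not
any particular tool's API. Everything in it is an assumption made for
illustration: the firewall_logs table, the log_query function, the
in-memory GEOIP and RBL_LISTED tables standing in for a GeoIP file and a
public RBL source, and the sample row itself.

```python
import sqlite3

# Hypothetical stand-ins for enrichment sources. A real setup might read a
# GeoIP file or call a public RBL service here instead.
GEOIP = {"203.0.113.7": "NL", "198.51.100.9": "US"}
RBL_LISTED = {"203.0.113.7"}


def geoip_country(ip):
    return GEOIP.get(ip, "unknown")


def rbl_rating(ip):
    return "listed" if ip in RBL_LISTED else "not listed"


# Columns that are actually stored in the database.
RAW_COLUMNS = ("TimeStamp", "FirewallName", "SourceIP", "DestinationIP")

# Enriched field -> (raw column it is computed from, lookup function).
ENRICHERS = {
    "SourceIPCountry": ("SourceIP", geoip_country),
    "DestinationIPCountry": ("DestinationIP", geoip_country),
    "RblRatingOfSourceIP": ("SourceIP", rbl_rating),
    "RblRatingOfDestinationIP": ("DestinationIP", rbl_rating),
}


def log_query(conn, fields):
    """Return firewall logs with the requested fields, enriched at query time."""
    sql = "SELECT {} FROM firewall_logs ORDER BY TimeStamp".format(
        ", ".join(RAW_COLUMNS)
    )
    results = []
    for row in conn.execute(sql):
        raw = dict(zip(RAW_COLUMNS, row))
        out = {}
        for field in fields:
            if field in raw:
                # Stored as-is by the collection tier.
                out[field] = raw[field]
            else:
                # Computed now, from whatever the source says today.
                source_column, lookup = ENRICHERS[field]
                out[field] = lookup(raw[source_column])
        results.append(out)
    return results


def demo_db():
    """Build an in-memory database holding one made-up firewall log row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE firewall_logs "
        "(TimeStamp TEXT, FirewallName TEXT, SourceIP TEXT, DestinationIP TEXT)"
    )
    conn.execute(
        "INSERT INTO firewall_logs VALUES (?, ?, ?, ?)",
        ("2014-03-18T21:02:00", "fw1", "203.0.113.7", "198.51.100.9"),
    )
    return conn


if __name__ == "__main__":
    wanted = ["TimeStamp", "SourceIP", "SourceIPCountry", "RblRatingOfSourceIP"]
    for entry in log_query(demo_db(), wanted):
        print(entry)
```

In this sketch the collection side only ever writes the four raw columns,
so it never waits on GeoIP or RBL. Supporting another enrichment source
would mean one more entry in ENRICHERS, with the stored rows untouched.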

You can then seamlessly add more intelligence/enrichment sources to
your logQuery API and all data (new and old) is instantly enriched.
Nice, right? :)

There is a flaw in dynamic enrichment: current intelligence from the
enrichment source may be inaccurate for historic data. Again, a simple
example is DNS. Say 10.0.0.22 was called joe-workstation three months
ago. You run a query for 10.0.0.22 today and pull three months' worth
of logs. But now 10.0.0.22 is called marys-workstation. If you do the
DNS lookup at query time, you will end up with bad data: the old log
entries get labeled with the current hostname.
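
The failure mode can be reproduced in a few lines of Python. This is only
an illustration of the problem, not a fix. The dates, the DNS_HISTORY
table and the function names are all made up for the example, and
resolve_now stands in for a live DNS lookup.

```python
from datetime import date

# Hypothetical record of which hostname 10.0.0.22 had, and since when.
# The dates are invented; only the latest entry is visible to a live lookup.
DNS_HISTORY = {
    "10.0.0.22": [
        (date(2013, 12, 1), "joe-workstation"),
        (date(2014, 3, 1), "marys-workstation"),
    ],
}

# Two made-up log entries for the same IP, three months apart.
LOGS = [
    {"TimeStamp": date(2013, 12, 18), "SourceIP": "10.0.0.22"},
    {"TimeStamp": date(2014, 3, 18), "SourceIP": "10.0.0.22"},
]


def resolve_now(ip):
    """Stand-in for a DNS lookup done today: returns the current name only."""
    return DNS_HISTORY[ip][-1][1]


def enrich_at_query_time(logs):
    """Attach a hostname to every entry using only the current mapping."""
    return [dict(entry, SourceHost=resolve_now(entry["SourceIP"])) for entry in logs]


if __name__ == "__main__":
    for entry in enrich_at_query_time(LOGS):
        print(entry["TimeStamp"], entry["SourceIP"], entry["SourceHost"])
```

Both entries come back as marys-workstation, including the December one
that was really generated while the address belonged to joe-workstation.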

At my workplace, we are still evolving tools/methods to handle all
this so ask me again in a few months and maybe I will have a better
answer :) Till then, maybe someone else can chime in with a better
solution?



On Tue, Mar 18, 2014 at 9:02 PM, David Lang <[email protected]> wrote:
> On Tue, 18 Mar 2014, Otis Gospodnetic wrote:
>
>> Hi,
>>
>> Does rsyslog have anything for doing geoIP lookup?
>
>
> currently no, it does not. We have a feature speced out, but not implemented
> that would provide this capability (table lookups), but the sponsor backed
> out and nobody has picked it up (either as a sponsor or to develop it)
>
> David Lang
>
>
>> If not, how are others handling this (other than passing logs through
>> Logstash)?
>>
>> A quick Google found https://github.com/mricon/howler , but that's
>> obviously not quite the same as figuring out the
>> city/country/lat,lon/postal code info from IP addresses.
>>
>> Thanks,
>> Otis
>> --
>> Performance Monitoring * Log Analytics * Search Analytics
>> Solr & Elasticsearch Support * http://sematext.com/
>> _______________________________________________
>> rsyslog mailing list
>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>> http://www.rsyslog.com/professional-services/
>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
>> of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T
>> LIKE THAT.
>>
