Re: [rsyslog] omriemann Re: Are we building an ERK stack?

2016-12-04 Thread Bob Gregory
Hi David,

It's probably best if you _don't_ try to map syslog fields into
riemann fields because the two technologies are accomplishing different
things. Riemann is for processing metrics - numerical data about the state
of our systems, while syslog is about logs - narrative textual data about
our systems.

Service, tags, etc will need to be configured by the end-user; we shouldn't
be guessing what they might be based on our understanding of the log
message.

The reason I would need a Riemann output is that I have three use cases
where I forward data in logs to Riemann from logstash -

1) Logstash's heartbeat (so I can measure latency on my processing pipeline)
2) ERROR and CRITICAL logs so I can alert on them
3) Metrics encoded into json logs by applications.

Service is the "Thing under measurement". The closest analogue would be
programname, but one program might have many services. For example: "http
response time ms", "Bytes read", "Active users", "messages received". Each
of the keys in the key/value messages raised by impstats is a single
service.

Tags are used to aggregate and filter services; they're arbitrary bits of
data, e.g. "Message type", "User account type", "ec2 instance type", "site
map area". Our biggest use case for them is in asynchronous processing
pipelines, where we use them to tag the messages we're processing so that
we can see overall throughput and latency, but drill down when we have to.

The metric is the actual measurement; it's a number.

The closest analogue to severity is the "state", which is an arbitrary
string. Usually people use the statuses "ok", "warning", "error" etc. but
it's entirely arbitrary. They're mostly used to trigger state changes in
Riemann.

Description is a narrative description of an event. We only use these in a
single use-case, which is that we forward all logs of ERROR level and
higher to riemann so that it can count them, and send us roll-up emails
every hour, or trigger pagerduty. In this use-case, we set the description
to the incoming log message.

Lastly, the TTL is used to control how long a message should be held
in-memory by Riemann. It can be used to keep a snapshot of current state.
We use it for heartbeats - when an event's TTL expires, if we haven't
received another of the same event, we can raise an alert.
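
As a rough illustration of the heartbeat pattern described above, here is a
minimal Python sketch. It is not Riemann's implementation; the class name
HeartbeatIndex and its methods are hypothetical, and time is passed in
explicitly so the behaviour is easy to follow.

```python
from typing import Dict, List, Tuple


class HeartbeatIndex:
    """Hypothetical in-memory index keyed by (host, service).

    Each recorded event stays "alive" until its TTL runs out; a newer
    event for the same key replaces the old deadline.
    """

    def __init__(self) -> None:
        self._deadline: Dict[Tuple[str, str], float] = {}

    def record(self, host: str, service: str, ttl: float, now: float) -> None:
        # Remember the moment after which this event counts as expired.
        self._deadline[(host, service)] = now + ttl

    def expired(self, now: float) -> List[Tuple[str, str]]:
        # Keys whose TTL has passed without a fresh event arriving.
        return sorted(k for k, d in self._deadline.items() if d < now)
```

An alerting loop would call expired() periodically and raise an alert for
each key it returns.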

Hope that makes more sense - if you're interested in learning more about
Riemann, there's a great introductory video on the site. http://riemann.io/

The only fields that are required are the host, the service, and the metric.
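
To make the field descriptions above concrete, here is a minimal Python
sketch of an event and of the "forward ERROR logs" use case. The names
RiemannEvent and event_from_error_log are hypothetical, the required-field
check follows the statement above rather than any library, and service and
tags are passed in by the caller because they are meant to be configured by
the end-user.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RiemannEvent:
    """Hypothetical container for the fields described in this message."""
    host: str
    service: str                        # the "thing under measurement"
    metric: float                       # the actual measurement
    state: Optional[str] = None         # arbitrary string, e.g. "ok", "error"
    description: Optional[str] = None   # narrative text
    tags: List[str] = field(default_factory=list)
    ttl: Optional[float] = None         # seconds to keep the event in memory

    def __post_init__(self) -> None:
        # Assumption taken from the text: host and service must be set.
        if not self.host or not self.service:
            raise ValueError("host and service are required")


def event_from_error_log(host: str, service: str, message: str,
                         tags: Optional[List[str]] = None) -> RiemannEvent:
    """Build one event per ERROR log so the receiver can count them."""
    return RiemannEvent(host=host, service=service, metric=1,
                        state="error", description=message,
                        tags=list(tags or []))
```

For example, event_from_error_log("web1", "payments errors", "db timeout",
tags=["rsyslog"]) yields an event with metric 1, state "error" and the log
line as its description.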

 -- Bob



On Mon, 5 Dec 2016 at 00:06 David Lang wrote:

On Mon, 5 Dec 2016, Dave Cottlehuber wrote:

> https://github.com/algernon/riemann-c-client may be of interest to use
> it directly -- it's been dropped into collectd as a library now as well,
> and is ported to Debian & FreeBSD already, that I know of. The protobuf
> wire format is
> https://github.com/algernon/riemann-c-client/blob/master/lib/riemann/proto/riemann.proto
> if that's helpful.

it is.

> What I've found useful with collectd and riemann was to be able to set
> specific custom tags per instance (rsyslog server in our case) which
> makes the sorting in riemann very easy prior to parsing any specific
> message output. Mainly source & instance type:

it looks like the protobuf allows a lot of options in terms of how to store
the data.

We can make educated guesses as to what makes sense from the riemann point of
view, but they will only be guesses.

as far as tags go, tagging it as being from rsyslog is an obvious item, and if
we have tags from mmnormalize, they should go here. What else?

should service be the programname or the facility?

where would facility/severity be stored? is severity == metric?

what sort of stuff normally goes in the description field?

for the attributes, one obvious one is the message, but beyond that it's less
clear. Given that rsyslog internally tracks things as JSON, I think putting each
json object as an attribute makes sense, but attributes can't be nested.
Internally to rsyslog, we deal with nested objects by flattening them and
separating the tiers with a ! (i.e. {foo:{bar:baz}} == foo!bar:baz), is this
reasonable from a riemann point of view? should we use a different character
instead?

David Lang
___
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T
LIKE THAT.

Re: [rsyslog] omriemann Re: Are we building an ERK stack?

2016-12-04 Thread David Lang

On Mon, 5 Dec 2016, Dave Cottlehuber wrote:


https://github.com/algernon/riemann-c-client may be of interest to use
it directly -- it's been dropped into collectd as a library now as well,
and is ported to Debian & FreeBSD already, that I know of. The protobuf
wire format is
https://github.com/algernon/riemann-c-client/blob/master/lib/riemann/proto/riemann.proto
if that's helpful.


it is.


What I've found useful with collectd and riemann was to be able to set
specific custom tags per instance (rsyslog server in our case) which
makes the sorting in riemann very easy prior to parsing any specific
message output. Mainly source & instance type:


it looks like the protobuf allows a lot of options in terms of how to store the 
data.


We can make educated guesses as to what makes sense from the riemann point of
view, but they will only be guesses.


as far as tags go, tagging it as being from rsyslog is an obvious item, and if 
we have tags from mmnormalize, they should go here. What else?


should service be the programname or the facility?

where would facility/severity be stored? is severity == metric?

what sort of stuff normally goes in the description field?

for the attributes, one obvious one is the message, but beyond that it's less 
clear. Given that rsyslog internally tracks things as JSON, I think putting each 
json object as an attribute makes sense, but attributes can't be nested. 
Internally to rsyslog, we deal with nested objects by flattening them and 
separating the tiers with a ! (i.e. {foo:{bar:baz}} == foo!bar:baz), is this
reasonable from a riemann point of view? should we use a different character 
instead?
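
The flattening scheme described above can be sketched in a few lines of
Python. This is only an illustration, not rsyslog's internal code: the
function name flatten and its parameters are hypothetical, and values are
converted to strings on the assumption that attributes are plain string
key/value pairs.

```python
from typing import Any, Dict


def flatten(obj: Dict[str, Any], sep: str = "!", prefix: str = "") -> Dict[str, str]:
    """Flatten nested dicts into one level, joining key tiers with `sep`."""
    flat: Dict[str, str] = {}
    for key, value in obj.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into the nested object, carrying the joined key along.
            flat.update(flatten(value, sep, name))
        else:
            # Assumption: attribute values are stored as strings.
            flat[name] = str(value)
    return flat
```

With the default separator, flatten({"foo": {"bar": "baz"}}) gives
{"foo!bar": "baz"}; passing sep="." would produce "foo.bar" instead if a
different character turns out to suit riemann better.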


David Lang


Re: [rsyslog] omriemann Re: Are we building an ERK stack?

2016-12-04 Thread David Lang

On Mon, 5 Dec 2016, Dave Cottlehuber wrote:


I can't work out what rsyslog
mainly comes under.


Rsyslog is moving to ASL 2.0 (we have a few files that are GPL and are going to
need to be replaced).


David Lang


Re: [rsyslog] omriemann Re: Are we building an ERK stack?

2016-12-04 Thread Dave Cottlehuber
On Wed, 23 Nov 2016, at 19:18, David Lang wrote:
> On Wed, 23 Nov 2016, Bob Gregory wrote:
> 
> > For that, I'd like to see better support for GeoIP tagging, a Riemann
> > output plugin, some better guidance on "failed message queues", etc. etc.
> > etc.
> 
> With a bit of digging, I can't find where Riemann defines what the
> over-the-wire format is that you would need to deliver logs to it.
>
> I see hints that it uses protobuf to serialize things, and has an
> application-level ack mechanism similar to what we have in relp, but the
> levels of indirection are stacked high, and the API documentation only
> points you at the function definitions.
> 
> David Lang

Hi David, Bob,

https://github.com/algernon/riemann-c-client may be of interest to use
it directly -- it's been dropped into collectd as a library now as well,
and is ported to Debian & FreeBSD already, that I know of. The protobuf
wire format is
https://github.com/algernon/riemann-c-client/blob/master/lib/riemann/proto/riemann.proto
if that's helpful. License is LGPL3 and I can't work out what rsyslog
mainly comes under.

What I've found useful with collectd and riemann was to be able to set
specific custom tags per instance (rsyslog server in our case) which
makes the sorting in riemann very easy prior to parsing any specific
message output. Mainly source & instance type:

A+
Dave