Re: [rsyslog] Dealing with malformed messages

2016-06-24 Thread Maupertuis Philippe
My main goal is to store messages according to the sending client, even for
malformed messages.
Finding which clients are sending malformed messages would be a plus.
Broadly speaking, our naming convention says that a hostname is 8 characters
long, ending with a specific letter according to the location.
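
As a rough illustration of the convention just described, the check could look like the Python sketch below. The location letters (`p`, `l`, `m`), the restriction of the first seven characters to letters and digits, and the function names are all assumptions made for this example; the actual letters and allowed characters are not stated in this thread. Falling back to the sender's IP address is the behaviour wanted on the central server.

```python
import re

# Hypothetical location letters; the real ones are not given in this thread.
LOCATION_LETTERS = "plm"

# Exactly 8 characters: 7 letters or digits (an assumption), then one
# location letter.
HOSTNAME_RE = re.compile(r"[a-z0-9]{7}[" + LOCATION_LETTERS + r"]")


def follows_convention(hostname: str) -> bool:
    """Return True if the hostname matches the 8-character convention."""
    return HOSTNAME_RE.fullmatch(hostname.lower()) is not None


def storage_key(hostname: str, sender_ip: str) -> str:
    """Use the hostname when it is well formed, otherwise the sender's IP."""
    return hostname.lower() if follows_convention(hostname) else sender_ip
```

With these assumptions, a bogus hostname such as "ventID" would be filed under the sender's IP address, while a conforming name would keep its own directory.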

In my understanding, whenever a syslog relay is involved, care should be taken
on the first relay, because after it only the message itself is supposed to
contain information from the sender.
So maybe a less straightforward way is to add JSON fields on the first relay to
keep fromhost and/or fromhost-ip, and to reuse these additional fields on the
final central log server.
Would that be a significant overhead?
Any thoughts on this?

Philippe

> -Original Message-
> From: rsyslog-boun...@lists.adiscon.com [mailto:rsyslog-
> boun...@lists.adiscon.com] On Behalf Of David Lang
> Sent: Thursday, June 23, 2016 7:08 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] Dealing with malformed messages
>
> On Thu, 23 Jun 2016, Maupertuis Philippe wrote:
>
> > Hi list,
> > We have a central log server in place with logs going to files according
> > to the sending host.
> > The template is:
> > $template DYNfile,"/dailylog/HOSTS/%HOSTNAME%/%$YEAR%-%$MONTH%-%$DAY%/%syslogfacility-text%-%$HOUR%"
> > This works as expected for most clients.
> > However, some clients are probably sending malformed messages and we
> > end up with funny hostnames like "unicast_addr_effective_port" or
> > "ventID".
> > So I would like to retain the hostname only if it abides by our naming
> > convention and otherwise retain the sender's IP address.
>
> This is a really hard thing to do, what is your naming convention?
>
> > I would also like to build a daily list of culprits to try to correct
> > the messages on the client if possible or to set them apart. What
> > would be the best way to do that?
>
> are you asking the best way to build the list of culprits? or the best way to
> deal with them once you have the list?
>
> David Lang
> ___
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards NOTE WELL:
> This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond
> our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.

"This e-mail and the documents attached are confidential and intended solely for
the addressee; it may also be privileged. If you receive this e-mail in error,
please notify the sender immediately and destroy it. As its integrity cannot be
secured on the Internet, the Worldline liability cannot be triggered for the
message content. Although the sender endeavours to maintain a computer
virus-free network, the sender does not warrant that this transmission is
virus-free and will not be liable for any damages resulting from any virus
transmitted."

Re: [rsyslog] Dealing with malformed messages

2016-06-24 Thread David Lang

On Fri, 24 Jun 2016, Maupertuis Philippe wrote:


> My main goal is to store messages according to the sending client, even for
> malformed messages.
> Finding which clients are sending malformed messages would be a plus.
> Broadly speaking, our naming convention says that a hostname is 8 characters
> long, ending with a specific letter according to the location.
>
> In my understanding, whenever a syslog relay is involved, care should be taken
> on the first relay, because after it only the message itself is supposed to
> contain information from the sender.
> So maybe a less straightforward way is to add JSON fields on the first relay to
> keep fromhost and/or fromhost-ip, and to reuse these additional fields on the
> final central log server.
> Would that be a significant overhead?
> Any thoughts on this?


If you are using a current version, it's not a huge overhead (whether it is
significant depends on your traffic :-) and there are things available to
reduce it if you do find it significant.


Having your first-tier relay boxes take the incoming message, send it out
as JSON, and include the fromhost-ip is the only way to be sure that you are
splitting the messages per source no matter what they contain.
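
A minimal sketch of this idea in Python, purely for illustration (it is not rsyslog configuration): the relay wraps the untouched message in a JSON object together with the address it was received from, and the central server reads that field back. The `msg` key and both function names are assumptions made for this example; `fromhost-ip` mirrors the rsyslog property discussed above.

```python
import json


def wrap_on_relay(raw_message: str, sender_ip: str) -> str:
    """First-tier relay: wrap the untouched message with the peer address."""
    # "msg" is a hypothetical key name chosen for this sketch.
    return json.dumps({"fromhost-ip": sender_ip, "msg": raw_message})


def split_key_on_central(wrapped: str) -> str:
    """Central server: take the per-source key from the relay's field."""
    return json.loads(wrapped)["fromhost-ip"]
```

Because the key comes from the connection seen by the relay rather than from anything parsed out of the message, a malformed header cannot change which source the message is filed under.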


David Lang



Re: [rsyslog] mmnormalize rule database Re: mmgrok packages

2016-06-24 Thread Brian Knox
I am very much looking forward to the custom data type support!  Safe
travels Rainer!

Brian

On Fri, Jun 24, 2016 at 2:07 AM Rainer Gerhards wrote:

> Thanks all for the great discussion and effort going forward! I am in
> preparation for a trip next week and so unfortunately had limited time
> to contribute (and will be unable next week), but I am more than
> interested in helping to move this forward.
>
> Note that we currently have some rulebases inside liblognorm's git:
> https://github.com/rsyslog/liblognorm/tree/master/rulebases This might
> be the place where we can begin to actually gather a full set ... or
> we could create a new git repo. The latter might be a better idea, as
> the folks who primarily maintain it are probably quite different.
>
> Again, I am excited to see all this new activity. Also keep in mind
> that with v2 (finally to be released next month), we can have custom
> data types just like in grok, so building rules is also much easier.
> IMHO it would make sense to first build a set of custom data types
> (like we did in lognorm with the cisco address representation), and
> then base rules on that extended set of base types. This is a sample
> from the testbench of how custom types are defined:
>
> https://github.com/rsyslog/liblognorm/blob/master/tests/usrdef_twotypes.sh
>
> Also, the doc has good information on that topic:
> https://github.com/rsyslog/liblognorm/blob/master/doc/configuration.rst
>
> As I said, I will unfortunately be mostly silent up until the beginning of
> June - please don't treat this as a sign of disinterest! Again, I think
> this is an extremely valuable approach.
>
> Rainer
>
> 2016-06-23 19:25 GMT+02:00 David Lang :
> > On Thu, 23 Jun 2016, Champ Clark III wrote:
> >
> >> I assist with a project that pretty heavily depends on liblognorm called
> >> "Sagan" (http://sagan.io).
> >>
> >> While we have other "normalization" methods, we prefer liblognorm.  Our
> >> community rulebase file is at:
> >>
> >> https://github.com/beave/sagan-rules/blob/master/normalization.rulebase
> >>
> >> I agree with David, we don't want 10 different ways to normalize a Cisco
> >> log. At the same time, Cisco logs sometimes differ just enough that you
> >> _might_ need multiple ways to normalize them.
> >
> >
> > as an example of what I'm talking about.
> >
> > take the log example %ASA-6-302014 (end of TCP session)
> >
> > a few variations of which are:
> >
> >  %ASA-6-302014:Teardown TCP connection 42095195 for outside:2.2.9.2/5721 to inside:192.168.1.1/54151 duration 0:00:30 bytes 0 SYN Timeout
> >
> >  %ASA-6-302014: Teardown TCP connection 43363071 for outside:192.168.2.5\/58949(LOCAL\\D.A) to outside:192.168.2.3\/3283(LOCAL\\CP-G-SEP) duration 0:00:00 bytes 0 TCP Reset-O (D.A)
> >  %ASA-6-302014: Teardown TCP connection 51708532 for outside: 10.1.5.5/54853 to backup:192.168.2.1/4784(LOCALCP-G-SEPC999) duration 0:00:00 bytes 0
> >
> > some people will parse it so that they have the variables sourceif,
> > sourceip, sourceport, destif, destip, destport etc
> >
> > I do source:{interface,ip,port} dest:{interface,ip,port}
> >
> > this is making use of the v2 ciscointerface type
> >
> > prefix=%timestamp:date-rfc3164% %hostname:word%
> >
> > rule=cisco,disconnect: \x25ASA-6-302014\x3a Teardown %proto:word% connection %connection-id:number% for %source:cisco-interface-spec% to %dest:cisco-interface-spec% duration %duration:char-to: % bytes %bytes:number% %reason:rest%
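
To make the nested layout concrete, here is a small Python sketch that produces the same source:{interface,ip,port} / dest:{interface,ip,port} shape from a teardown line. It uses an ordinary regular expression as a stand-in and does not use liblognorm or its cisco-interface-spec type; the function names and the exact pattern are assumptions made for illustration only.

```python
import re


def endpoint(name: str) -> str:
    """Regex fragment for 'interface:ip/port', with an optional '(user)'."""
    return (
        rf"(?P<{name}_interface>[^:\s]+):\s*"
        rf"(?P<{name}_ip>\d+(?:\.\d+){{3}})\\?/"  # tolerates an escaped '\/'
        rf"(?P<{name}_port>\d+)"
        rf"(?:\((?P<{name}_user>[^)]*)\))?"
    )


TEARDOWN_RE = re.compile(
    r"%ASA-6-302014:\s*Teardown\s+(?P<proto>\S+)\s+connection\s+"
    r"(?P<connection_id>\d+)\s+for\s+"
    rf"{endpoint('source')}\s+to\s+{endpoint('dest')}"
    r"\s+duration\s+(?P<duration>\S+)\s+bytes\s+(?P<bytes>\d+)\s*(?P<reason>.*)"
)


def parse_teardown(line: str):
    """Return a nested dict for a teardown line, or None if it does not match."""
    m = TEARDOWN_RE.search(line)
    if m is None:
        return None
    g = m.groupdict()
    event = {
        "proto": g["proto"],
        "connection-id": int(g["connection_id"]),
        "duration": g["duration"],
        "bytes": int(g["bytes"]),
        "reason": g["reason"].strip(),
    }
    # Group the flat capture names into one nested object per side.
    for side in ("source", "dest"):
        event[side] = {
            "interface": g[f"{side}_interface"],
            "ip": g[f"{side}_ip"],
            "port": int(g[f"{side}_port"]),
        }
    return event
```

The flat alternative would expose sourceif, sourceip, sourceport and so on as top-level fields; the nested form keeps each endpoint together, which is the choice that would need to be applied consistently across rules.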
> >
> > So we will need to agree on whether we are going to use nesting or not (I
> > think we should), and if we do it with Cisco, we need to do it across the
> > board
> >
> > by the way, this also brings up the issue of tags for the message
> >
> >> We have talked about a "market place" for rule normalization for years
> >> now. It was always my impression that this would be part of the rsyslog
> >> team's efforts. It sounds like you have enough on your plate, and keeping
> >> track of rulebases isn't a high priority. I understand this. With Sagan,
> >> we are doing this "anyways". That is, we are creating rulebases for
> >> different types of logs either way. We commit them to the Sagan repo
> >> right now.
> >>
> >> I'd like to suggest the following for response:
> >>
> >> 1.  Split off the "normalization.rules" base from Sagan and create a new,
> >> separate github repo for it.
> >> 2.  If someone would like to add some rulebase "rules",  they can do a
> >> "pull" request.
> >> 3.  All rulebase "rules" need to have an example, anonymized log sample,
> >> used for testing.
> >> 4.  If the rules look good,  then they can be merged.
> >
> >
> > besides the pull request mechanism, I think we also need a way for people
> > who have rulesets to send them out for others to convert to pull requests.
> > I think that there is going to be a lot of tweaking/corrections to the
> > proposed rules, and a pull request isn't necessarily the best way to
> > handle
>