Re: [rsyslog] elasticsearch 2.0 and field names

2015-12-08 Thread Radu Gheorghe
On Tue, Dec 8, 2015 at 1:44 PM, Peter Portante wrote:
> On Tue, Dec 8, 2015 at 6:38 AM, Brian Knox  wrote:
>
>> As a short term solution I'm working on a small service (in golang) that
>> accepts logs over tcp, can replace characters in JSON field names in a @cee
>> syslog line, and then forward the line to another syslog destination.  In
>> tests on my laptop it handles modifying ~ 50,000 reasonably sized log lines
>> a second per connection.  It gracefully handles tcp connection issues and
>> I'll test it under adverse circumstances to make sure it's reasonably
>> robust.  I personally find this preferable to deploying logstash just to
>> substitute one character.  I'll release it open source this week in case
>> any one else needs an immediate solution to this problem like I do.
>>
>> It's less than ideal - ideally elasticsearch would support JSON rather than
>> a subset of characters JSON allows - but it solves the immediate problem
>> for us.
>>
>
> Have you brought this up with the ElasticSearch community to see what they
> say?

I think Brian touched on this already - previously dots were allowed
but it was confusing (both for users and for Elasticsearch itself)
when you ran a query or indexed a document with a dot in the field
name. For example, is "user.name" an actual field or do we have a
"user" object with "name" as the actual field that contains data?

Lucene (the search engine library on which Elasticsearch is built)
doesn't support real hierarchies. So when you index a "user" object
with a "name" field, the data you put in there is actually indexed in
a "user.name" field in Lucene. This means you can't have both
"user.name" as a separate field and the "name" field in the "user"
object in the same index. Actually, you could have both before, and
that caused quite a few bugs (imagine what happens if the two fields,
which are supposed to be identical, end up with different
definitions).
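
To make the ambiguity concrete, here is a minimal Python sketch of the
flattening described above. It is only an illustration and not
Elasticsearch or Lucene code; `flatten` is a hypothetical helper written
for this example, and it assumes documents are plain nested dicts.

```python
def flatten(doc, prefix=""):
    """Flatten nested dicts into dotted paths (illustrative only)."""
    flat = {}
    for key, value in doc.items():
        # Build the dotted path the same way for every nesting level.
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# A "user" object with a "name" field ...
nested = {"user": {"name": "radu"}}
# ... and a single field whose name contains a dot.
dotted = {"user.name": "radu"}

# Both end up under the same flat field name, so they cannot be told apart.
print(flatten(nested) == flatten(dotted))  # prints True
```

Once both shapes collapse to the same "user.name" path, nothing at the
flat level records which one the document originally used.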

With this in mind, I doubt this change will be reverted in the future; it
was done on purpose to work around issues that came up in previous
versions. One could think of workarounds, but I don't see one that
wouldn't come with its own problems. Unfortunately, Elasticsearch
changes quite a lot in backwards-incompatible ways between major
versions. This hurts the ecosystem as you can see here, but it helps
drive the project itself forward at a quicker pace.

Best regards,
Radu
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/
___
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.


Re: [rsyslog] elasticsearch 2.0 and field names

2015-12-08 Thread Brian Knox
As a short term solution I'm working on a small service (in golang) that
accepts logs over tcp, can replace characters in JSON field names in a @cee
syslog line, and then forward the line to another syslog destination.  In
tests on my laptop it handles modifying ~ 50,000 reasonably sized log lines
a second per connection.  It gracefully handles tcp connection issues and
I'll test it under adverse circumstances to make sure it's reasonably
robust.  I personally find this preferable to deploying logstash just to
substitute one character.  I'll release it open source this week in case
anyone else needs an immediate solution to this problem like I do.

It's less than ideal - ideally elasticsearch would support JSON rather than
a subset of characters JSON allows - but it solves the immediate problem
for us.
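
As a rough illustration of the key-rewriting step only, here is a short
Python sketch. It is not the Go service described above and does no
networking. It assumes the JSON payload follows an "@cee:" cookie in the
line, and that '.' is replaced with '_'; both function names are
hypothetical.

```python
import json

CEE_COOKIE = "@cee:"


def scrub_keys(value, bad=".", good="_"):
    """Recursively replace `bad` with `good` in every JSON object key."""
    if isinstance(value, dict):
        # Note: two keys that differ only in `bad` vs `good` would collide
        # here and the later one would win; a real tool has to decide how
        # to handle that.
        return {k.replace(bad, good): scrub_keys(v, bad, good)
                for k, v in value.items()}
    if isinstance(value, list):
        return [scrub_keys(v, bad, good) for v in value]
    return value


def scrub_cee_line(line, bad=".", good="_"):
    """Rewrite field names in the JSON part of one @cee syslog line."""
    head, cookie, payload = line.partition(CEE_COOKIE)
    if not cookie:
        return line  # no cookie: pass the line through untouched
    body = payload.lstrip()
    gap = payload[:len(payload) - len(body)]  # keep spacing after the cookie
    try:
        doc = json.loads(body)
    except ValueError:
        return line  # payload is not valid JSON: pass through untouched
    return head + cookie + gap + json.dumps(scrub_keys(doc, bad, good),
                                            separators=(",", ":"))


line = 'app: @cee: {"user.name": "bk", "req": {"http.status": 200}}'
print(scrub_cee_line(line))
```

Lines without the cookie, or with a payload that does not parse, are
forwarded unchanged so the rewriting never drops a log line.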

Cheers,
Brian



On Sun, Dec 6, 2015 at 2:51 PM, David Lang  wrote:

> On Sat, 5 Dec 2015, Peter Portante wrote:
>
> On Sat, Dec 5, 2015 at 5:03 PM, David Lang  wrote:
>>
>> we really need mmscrubnames or similar
>>>
>>> 1. change all names to lower case
>>> 2. replace characters that rsyslog doesn't allow in names with something
>>> 3. allow other characters to be added to the list to be replaced
>>> 4. change names that are foo!bar into multi-layer structures
>>> 5. handle the case where these changes create multiple objects with the
>>> same name (probably by appending a string until there are no longer
>>> conflicts)
>>>
>>> #1 may be able to go away in a decade or so if we allow case sensitive
>>> names as an option
>>>
>>>
>> Don't we need to make this go away sooner than later?  If rsyslog is the
>> link in the chain that prevents someone from getting the key names they
>> expect into ES, won't they find something else to replace that link?
>>
>> I have made available RPMs for EPEL 7 (which should work on RHEL 7 and
>> CentOS 7), and Fedora 21, 22, and 23.  Why not make the effort to find out
>> what breaks, and put in a switch so that folks can opt in to case-sensitive
>> names in config files?  I'd be happy to implement the switch, but would
>> need help verifying existing configurations work.
>>
>
> this will break some existing configs, won't it? If someone has something
> that's assuming everything is squished to lower case, and it becomes case
> sensitive, won't that break?
>
> We can add the new case sensitivity as an option quickly, but can't make
> it the default for quite a while (a cycle or two of the enterprise distros)
>
> #2 needs to be done on the actual variable names, not just on the ES
>>> output so that the variables can be accessed and manipulated in rsyslog
>>>
>>>
>> Why do we need to do this?  Is this because we need to reference them in
>> the configuration files?  If so, why not provide an escape syntax for the
>> configuration file?
>>
>> Do we really want rsyslog in the position where it adds restrictions to the
>> data handling pipeline because of how it operates?  I think we all agree
>> that an mmscrubnames module would be good to help put rsyslog in the
>> position of transforming data from one source to another in the overall
>> pipeline.
>>
>
> AFAIK, JSON imposes no limits on field names, so any strange character (or
> unicode character, or even control character) could be part of a field
> name. And even if the JSON spec imposes some limits, do the libraries
> impose such limits in practice?
>
> I don't think it makes sense to support all of this in rsyslog, I think
> it's reasonable to impose something sane. Other log handling software does
> this (for example, logstash doesn't allow '.' in the name, but also is case
> insensitive :-)
>
> and finally, #4 is needed to allow the work-around for problems like ES
>>> has.
>>>
>>>
>> I am not sure I follow why this allows us to work-around problems like ES
>> has.
>>
>> The dots in field names are confusing and ambiguous in ES because you can
>> reference a hierarchical set of objects in the json objects indexed.  So if
>> one has a field name with dots in it in one document and another document
>> in the index has a hierarchy with sub objects, then it is ambiguous which
>> we are dealing with, if I understand the problem correctly.
>>
>
> Ok, that explains why this is an issue, it makes sense. We have the same
> problem with '!'. It's a problem in ES because it's a new requirement,
> breaking existing input.
>
> But #4 would let us say that '.' is an illegal character, along with
> control characters, anything above plain ASCII, and other punctuation
> characters we don't allow and get them replaced by something we do allow.
>
> Folks can stay with ES 1.7 if they need the dots in names.
>>
>
> not long term.
>
> David Lang
>
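
The five mmscrubnames steps listed in the quoted message can be sketched
in Python as follows. This is only an illustration of the proposal, not an
existing rsyslog module. The set of allowed characters, the '_'
replacement, and the '_' suffix used to resolve conflicts are all
assumptions, and every function name here is hypothetical.

```python
import re

# Assumption: characters outside this set are not allowed in names.
NOT_ALLOWED = re.compile(r"[^a-z0-9_.-]")


def scrub_part(part, extra_bad="", replacement="_"):
    part = part.lower()                         # step 1: lower case
    part = NOT_ALLOWED.sub(replacement, part)   # step 2: disallowed chars
    for ch in extra_bad:                        # step 3: user-supplied list
        part = part.replace(ch, replacement)
    return part


def unique_key(target, key, suffix="_"):
    # step 5: append a string until the name no longer conflicts
    while key in target:
        key += suffix
    return key


def scrub_names(doc, extra_bad="", separator="!"):
    out = {}
    for name, value in doc.items():
        if isinstance(value, dict):
            value = scrub_names(value, extra_bad, separator)
        # step 4: foo!bar becomes {"foo": {"bar": ...}}
        parts = [scrub_part(p, extra_bad) for p in name.split(separator)]
        target = out
        for part in parts[:-1]:
            child = target.get(part)
            if not isinstance(child, dict):
                # Missing, or taken by a non-object: make a fresh object.
                part = unique_key(target, part)
                child = target[part] = {}
            target = child
        target[unique_key(target, parts[-1])] = value
    return out


doc = {"User.Name": "bk", "user_name": "x",
       "http!Status Code": 200, "http!method": "GET"}
print(scrub_names(doc, extra_bad="."))
```

Passing extra_bad="." is how the ES 2.0 case would be handled in this
sketch: '.' joins the list of characters to replace, and the conflict
between "User.Name" and "user_name" is resolved by the suffix rule.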

Re: [rsyslog] elasticsearch 2.0 and field names

2015-12-08 Thread Peter Portante
On Tue, Dec 8, 2015 at 6:38 AM, Brian Knox  wrote:

> As a short term solution I'm working on a small service (in golang) that
> accepts logs over tcp, can replace characters in JSON field names in a @cee
> syslog line, and then forward the line to another syslog destination.  In
> tests on my laptop it handles modifying ~ 50,000 reasonably sized log lines
> a second per connection.  It gracefully handles tcp connection issues and
> I'll test it under adverse circumstances to make sure it's reasonably
> robust.  I personally find this preferable to deploying logstash just to
> substitute one character.  I'll release it open source this week in case
> any one else needs an immediate solution to this problem like I do.
>
> It's less than ideal - ideally elasticsearch would support JSON rather than
> a subset of characters JSON allows - but it solves the immediate problem
> for us.
>

Have you brought this up with the ElasticSearch community to see what they
say?



Re: [rsyslog] Type of encryption algorithms used in rsyslog

2015-12-08 Thread David Lang
The 'traditional' way of doing this was to route syslog through stunnel. A few 
years ago Rsyslog gained the ability to use TLS encryption for this, but the 
configuration for this is far from as simple as it should be.


This thread from last month is probably a good place to start

http://www.gossamer-threads.com/lists/rsyslog/users/18333#18333


 On Tue, 8 Dec 2015, Girish Kumar wrote:


I am looking to encrypt the communications between rsyslog systems.

Regards,
Girish
-Original Message-
From: rsyslog-boun...@lists.adiscon.com 
[mailto:rsyslog-boun...@lists.adiscon.com] On Behalf Of David Lang
Sent: Tuesday, December 08, 2015 11:17 AM
To: rsyslog-users
Subject: Re: [rsyslog] Type of encryption algorithms used in rsyslog

On Tue, 8 Dec 2015, Girish Kumar wrote:


Hi All,

Could you please let me know the different types of encryption algorithms
used in rsyslog?
Does the user have the option to select the type of encryption algorithm to
be used?
Can it be configured in the rsyslog.conf file?
Please point me to the documentation or a tutorial where I can get the
detailed info.


are you looking for encrypting the communications between rsyslog systems? or 
encrypting the files that rsyslog writes? or ??

David Lang
