Re: get rid of _all to optimize storage and perfs (Re: Splunk vs. Elastic search performance?)

2014-07-02 Thread Steve Mayzak
All,

This seems apropos to the current discussion and could help clear up some 
confusion on recommendations etc.  We, Elasticsearch, are hosting a Webinar 
on ELK, given by the Logstash creator, Jordan Sissel.

It's today, in 40 minutes.
http://www.elasticsearch.org/webinars/introduction-elk-stack/


On Wednesday, July 2, 2014 6:08:34 AM UTC-7, Brian wrote:
>
> Patrick,
>
>> > Well, I did answer your question. But probably not from the direction
>> > you expected.
>>
>> hmm no, you didn't. My question was: "it looks like I can't
>> retrieve/display [_all fields] content. Any idea?" and you replied with
>> your logstash template where _all is disabled. I'm interested in disabling
>> _all, but that was not my question at this point.
>
> Fair enough. I don't know the inner details; I am just an enthusiastic end 
> user.
>
> To the best of my knowledge, there is no content for the _all field; I
> view this as an Elasticsearch pseudo field whose name is _all and whose
> index terms are taken from all fields (by default), but still there is no
> actual content for it.
>
> And after I got into the habit of disabling the _all field, my hands-on
> exploration of its nuances has ended. It's time for the experts to explain!
>
>> Your answer to my second message, below, is informative and interesting
>> but fails to answer my second question too. I simply asked whether I need
>> to feed the complete modified mapping of my template or if I can just push
>> the modified part (i.e. the _all:{enabled: false} part).
>
> Again, I have never done this, so I can only tell you what I do. I just
> cannot tell you all the nuances of what Elasticsearch is capable of.
>
> My recommendation is to try it. Elasticsearch is great at letting you 
> experiment and then telling you clearly if your attempt succeeds or fails.
>
> So, try your scenario. If it fails, then it didn't work or you did 
> something wrong. If it succeeds, then you can see exactly what 
> Elasticsearch actually accepted as your mapping. For example:
>
> curl 'http://localhost:9200/logstash-2014.06.30/_mapping?pretty=true' && echo
>
> This particular query looks at one of my logstash-generated indices, and 
> it lets me verify that Elasticsearch and Logstash conspired to create the 
> mappings I expected. I used this command quite a bit until I finally got 
> everything configured correctly. (I actually verify the mapping via 
> Elasticsearch Head, but under the covers it's the same command.)
>
> Brian
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/d2dd4206-c8bd-4c96-90df-5ad4a7bce5e1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: get rid of _all to optimize storage and perfs (Re: Splunk vs. Elastic search performance?)

2014-07-02 Thread Brian


Patrick,

> > Well, I did answer your question. But probably not from the direction
> > you expected.
>
> hmm no, you didn't. My question was: "it looks like I can't
> retrieve/display [_all fields] content. Any idea?" and you replied with
> your logstash template where _all is disabled. I'm interested in disabling
> _all, but that was not my question at this point.

Fair enough. I don't know the inner details; I am just an enthusiastic end 
user.

To the best of my knowledge, there is no content for the _all field; I view
this as an Elasticsearch pseudo field whose name is _all and whose index
terms are taken from all fields (by default), but still there is no actual
content for it.

And after I got into the habit of disabling the _all field, my hands-on
exploration of its nuances has ended. It's time for the experts to explain!

> Your answer to my second message, below, is informative and interesting
> but fails to answer my second question too. I simply asked whether I need
> to feed the complete modified mapping of my template or if I can just push
> the modified part (i.e. the _all:{enabled: false} part).

Again, I have never done this, so I can only tell you what I do. I just
cannot tell you all the nuances of what Elasticsearch is capable of.

My recommendation is to try it. Elasticsearch is great at letting you 
experiment and then telling you clearly if your attempt succeeds or fails.

So, try your scenario. If it fails, then it didn't work or you did 
something wrong. If it succeeds, then you can see exactly what 
Elasticsearch actually accepted as your mapping. For example:

curl 'http://localhost:9200/logstash-2014.06.30/_mapping?pretty=true' && echo

This particular query looks at one of my logstash-generated indices, and it 
lets me verify that Elasticsearch and Logstash conspired to create the 
mappings I expected. I used this command quite a bit until I finally got 
everything configured correctly. (I actually verify the mapping via 
Elasticsearch Head, but under the covers it's the same command.)
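
For checking the result of that command programmatically, here is a minimal Python sketch. It assumes the `_mapping` response is shaped as index name, then "mappings", then type name, which is how the 1.x releases discussed in this thread returned it. The `syslog` type and the sample data are hypothetical, and `all_field_status` is an illustrative helper, not part of any Elasticsearch client.

```python
import json


def all_field_status(mapping_response):
    """Map (index, type) to whether _all is enabled in a _mapping response."""
    status = {}
    for index, body in mapping_response.items():
        for doc_type, mapping in body.get("mappings", {}).items():
            # An absent "_all" key is treated here as "enabled" (the default).
            status[(index, doc_type)] = mapping.get("_all", {}).get("enabled", True)
    return status


# Hypothetical response, trimmed to the parts that matter for this check.
sample = json.loads("""
{
  "logstash-2014.06.30": {
    "mappings": {
      "_default_": { "_all": { "enabled": false } },
      "syslog": { "properties": { "message": { "type": "string" } } }
    }
  }
}
""")

for (index, doc_type), enabled in sorted(all_field_status(sample).items()):
    print(index, doc_type, "_all enabled:", enabled)
```

In practice the dictionary would come from the JSON body returned by the curl command above rather than from an inline sample.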

Brian

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/8eaefd0e-f684-4f44-9fcb-3137812a99d3%40googlegroups.com.


Re: get rid of _all to optimize storage and perfs (Re: Splunk vs. Elastic search performance?)

2014-07-01 Thread Patrick Proniewski
Brian,

On 30 juin 2014, at 22:59, Brian wrote:

> Well, I did answer your question. But probably not from the direction you 
> expected.

hmm no, you didn't. My question was: "it looks like I can't retrieve/display 
[_all fields] content. Any idea?" and you replied with your logstash template 
where _all is disabled.
I'm interested in disabling _all, but that was not my question at this point.

Your answer to my second message, below, is informative and interesting but 
fails to answer my second question too. I simply asked whether I need to feed 
the complete modified mapping of my template or if I can just push the modified 
part (i.e. the _all:{enabled: false} part).


> When I create and manage specific indices, I lock down Elasticsearch. When I 
> update the mappings, I understand that ES will not allow the mapping for an 
> existing field to be modified in an incompatible way. So I only update to add 
> new fields, and never to change or remove an existing field.
> 
> For time-based indices as used by the ELK stack, it makes the most sense to 
> me to create an on-disk mapping template. So I always disable the _all field 
> and pre-map a subset of string fields as shown in my previous post. I do this 
> because when the next day arrives and logstash causes a new index to be 
> created, that new index will also set my default mapping from the template.
> 
> I don't disable the _all field in an existing index that currently has it 
> enabled. I don't know if it would succeed or fail, but I would not expect it 
> to be successful.
> 
> Instead, based on my previous experience with ES, I disable the _all field 
> and have disabled it from the very first test deployment of the ELK stack in 
> our group. And then I configured my ES startup script to set message as the 
> default field for a Lucene query. This was already set up and working when I 
> let others have access to it for the very first time. So I don't know the 
> answer to your specific question.
> 
> But I do know that a lot of experimentation went into my ELK configurations 
> before I let anyone else look at it for the very first time. So don't be 
> afraid to change your mappings and leave the old ones behind, and re-add data 
> as needed to get everything just the way you want it.
> 
> Brian
> 
> On Monday, June 30, 2014 1:22:34 AM UTC-4, Patrick Proniewski wrote:
> Brian, 
> 
> Thank you for the reply, even if it does not answer my question. 
> 
> By the way, how am I supposed to change a mapping setting? Do I have to push 
> back the entire mapping with one line modified, or can I just push something 
> like: 
> 
> {
>   "logstash": {
>     "mappings": {
>       "_default_": {
>         "_all": {
>           "enabled": false
>         }
>       }
>     }
>   }
> }
> 
> 
> 
> On 20 juin 2014, at 23:04, Brian wrote: 
> 
> > Patrick, 
> > 
> > Here's my template, along with where the _all field is disabled. You may 
> > wish to add this setting to your own template, and then also add the index 
> > setting to ignore malformed data (if someone's log entry occasionally slips 
> > in "null" or "no-data" instead of the usual numeric value): 
> > 
> > {
> >   "automap" : {
> >     "template" : "logstash-*",
> >     "settings" : {
> >       "index.mapping.ignore_malformed" : true
> >     },
> >     "mappings" : {
> >       "_default_" : {
> >         "numeric_detection" : true,
> >         "_all" : { "enabled" : false },
> >         "properties" : {
> >           "message" : { "type" : "string" },
> >           "host" : { "type" : "string" },
> >           "UUID" : { "type" : "string", "index" : "not_analyzed" },
> >           "logdate" : { "type" : "string", "index" : "no" }
> >         }
> >       }
> >     }
> >   }
> > }
> > 
> > Brian 
> 

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/B44B497A-5DC3-4BC5-9164-7F53B5D1D6B6%40patpro.net.


Re: get rid of _all to optimize storage and perfs (Re: Splunk vs. Elastic search performance?)

2014-06-30 Thread Brian
Patrick,

Well, I did answer your question. But probably not from the direction you 
expected.

When I create and manage specific indices, I lock down Elasticsearch. When 
I update the mappings, I understand that ES will not allow the mapping for 
an existing field to be modified in an incompatible way. So I only update 
to add new fields, and never to change or remove an existing field.
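
As a rough illustration of the "only add, never change" habit described above, the following Python sketch compares the properties of an existing mapping with a proposed one before anything is sent to the cluster. It is deliberately stricter than Elasticsearch itself: any difference in an existing field is flagged, whether or not ES would accept it. The function name and the sample fields are hypothetical.

```python
def classify_changes(existing_props, proposed_props):
    """Split proposed fields into brand-new ones and ones that differ."""
    added, conflicting = [], []
    for field, definition in proposed_props.items():
        if field not in existing_props:
            added.append(field)          # new field: the safe kind of update
        elif existing_props[field] != definition:
            conflicting.append(field)    # existing field redefined: review it
    return sorted(added), sorted(conflicting)


existing = {
    "message": {"type": "string"},
    "host": {"type": "string"},
}
proposed = {
    "message": {"type": "string"},
    "host": {"type": "string", "index": "not_analyzed"},
    "UUID": {"type": "string", "index": "not_analyzed"},
}

added, conflicting = classify_changes(existing, proposed)
print("new fields:", added)
print("changed fields:", conflicting)
```

A field in the second list does not necessarily mean the update would be rejected; it only marks the places where the rule of thumb above says to stop and think.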

For time-based indices as used by the ELK stack, it makes the most sense to 
me to create an on-disk mapping template. So I always disable the _all field 
and pre-map a subset of string fields as shown in my previous post. I do 
this because when the next day arrives and logstash causes a new index to 
be created, that new index will also set my default mapping from the 
template.
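
To make the daily rollover concrete, here is a small Python sketch that generates logstash-style index names and checks them against the `logstash-*` pattern from the template. It uses `fnmatch` from the standard library as a stand-in for Elasticsearch's own wildcard matching, which is an assumption made for illustration; `daily_index` is a hypothetical helper.

```python
from datetime import date, timedelta
from fnmatch import fnmatch


def daily_index(day, prefix="logstash-"):
    """Build a daily index name in the logstash-YYYY.MM.DD style."""
    return prefix + day.strftime("%Y.%m.%d")


pattern = "logstash-*"
start = date(2014, 6, 30)

# Three consecutive days, crossing the month boundary.
names = [daily_index(start + timedelta(days=offset)) for offset in range(3)]
matches = [name for name in names if fnmatch(name, pattern)]

for name in names:
    print(name, "matches" if name in matches else "does not match", pattern)
```

Every new daily index name falls under the pattern, which is why a setting placed in the template follows the data from one day to the next without further action.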

I don't disable the _all field in an existing index that currently has it 
enabled. I don't know if it would succeed or fail, but I would not expect 
it to be successful.

Instead, based on my previous experience with ES, I disable the _all field 
and have disabled it from the very first test deployment of the ELK stack 
in our group. And then I configured my ES startup script to set message as 
the default field for a Lucene query. This was already set up and working 
when I let others have access to it for the very first time. So I don't 
know the answer to your specific question.
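
The startup-script setting is not quoted in this thread, so it is not reproduced here. As a per-request alternative, a `query_string` query accepts a `default_field` parameter. The Python sketch below only builds such a request body; the query text is a made-up example and `lucene_query` is a hypothetical helper.

```python
import json


def lucene_query(text, default_field="message"):
    """Build a search body whose Lucene query defaults to one field."""
    return {
        "query": {
            "query_string": {
                "query": text,
                # Unqualified terms are searched in this field instead of _all.
                "default_field": default_field,
            }
        }
    }


body = lucene_query("error AND timeout")
print(json.dumps(body, indent=2))
```

With `_all` disabled, unqualified terms need somewhere to go, and naming the field in the request keeps that explicit even on a node without the cluster-wide default.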

But I do know that a lot of experimentation went into my ELK configurations 
before I let anyone else look at it for the very first time. So don't be 
afraid to change your mappings and leave the old ones behind, and re-add 
data as needed to get everything just the way you want it.

Brian

On Monday, June 30, 2014 1:22:34 AM UTC-4, Patrick Proniewski wrote:
>
> Brian, 
>
> Thank you for the reply, even if it does not answer my question. 
>
> By the way, how am I supposed to change a mapping setting? Do I have to 
> push back the entire mapping with one line modified, or can I just push 
> something like: 
>
> {
>   "logstash": {
>     "mappings": {
>       "_default_": {
>         "_all": {
>           "enabled": false
>         }
>       }
>     }
>   }
> }
>
>
>
> On 20 juin 2014, at 23:04, Brian wrote: 
>
> > Patrick, 
> > 
> > Here's my template, along with where the _all field is disabled. You may
> > wish to add this setting to your own template, and then also add the index
> > setting to ignore malformed data (if someone's log entry occasionally slips
> > in "null" or "no-data" instead of the usual numeric value):
> >
> > {
> >   "automap" : {
> >     "template" : "logstash-*",
> >     "settings" : {
> >       "index.mapping.ignore_malformed" : true
> >     },
> >     "mappings" : {
> >       "_default_" : {
> >         "numeric_detection" : true,
> >         "_all" : { "enabled" : false },
> >         "properties" : {
> >           "message" : { "type" : "string" },
> >           "host" : { "type" : "string" },
> >           "UUID" : { "type" : "string", "index" : "not_analyzed" },
> >           "logdate" : { "type" : "string", "index" : "no" }
> >         }
> >       }
> >     }
> >   }
> > }
> > 
> > Brian 
>
>

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2ff289e5-baf7-4d25-8412-8fcf967440fc%40googlegroups.com.


Re: get rid of _all to optimize storage and perfs (Re: Splunk vs. Elastic search performance?)

2014-06-29 Thread Patrick Proniewski
Brian,

Thank you for the reply, even if it does not answer my question. 

By the way, how am I supposed to change a mapping setting? Do I have to push 
back the entire mapping with one line modified, or can I just push something 
like: 

{
  "logstash": {
    "mappings": {
      "_default_": {
        "_all": {
          "enabled": false
        }
      }
    }
  }
}
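
One cautious way to handle this, sketched in Python: assume that re-sending a template under the same name replaces the stored template instead of merging into it (worth verifying on your ES version), fetch the current template, merge the small override into it locally, and push back the complete body. The `deep_merge` helper and the trimmed `current` template are hypothetical, for illustration only.

```python
import copy


def deep_merge(base, override):
    """Return a copy of base with override applied recursively to nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Trimmed stand-in for the body found under "logstash" in GET _template/logstash.
current = {
    "template": "logstash-*",
    "settings": {"index.refresh_interval": "5s"},
    "mappings": {
        "_default_": {
            "_all": {"enabled": True},
            "properties": {
                "@version": {"type": "string", "index": "not_analyzed"}
            },
        }
    },
}

# The only part that should change.
override = {"mappings": {"_default_": {"_all": {"enabled": False}}}}

full_body = deep_merge(current, override)
print(full_body["mappings"]["_default_"]["_all"])
```

The merged `full_body` keeps the refresh interval and the existing properties, so nothing is lost if the server does treat the request as a full replacement.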



On 20 juin 2014, at 23:04, Brian wrote:

> Patrick,
> 
> Here's my template, along with where the _all field is disabled. You may wish 
> to add this setting to your own template, and then also add the index setting 
> to ignore malformed data (if someone's log entry occasionally slips in "null" 
> or "no-data" instead of the usual numeric value):
> 
> {
>   "automap" : {
>     "template" : "logstash-*",
>     "settings" : {
>       "index.mapping.ignore_malformed" : true
>     },
>     "mappings" : {
>       "_default_" : {
>         "numeric_detection" : true,
>         "_all" : { "enabled" : false },
>         "properties" : {
>           "message" : { "type" : "string" },
>           "host" : { "type" : "string" },
>           "UUID" : { "type" : "string", "index" : "not_analyzed" },
>           "logdate" : { "type" : "string", "index" : "no" }
>         }
>       }
>     }
>   }
> }
> 
> Brian

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/8D497ED9-54DF-48EA-AA91-44A621B72287%40patpro.net.


Re: get rid of _all to optimize storage and perfs (Re: Splunk vs. Elastic search performance?)

2014-06-20 Thread Brian
Patrick,

Here's my template, along with where the _all field is disabled. You may 
wish to add this setting to your own template, and then also add the index 
setting to ignore malformed data (if someone's log entry occasionally slips 
in "null" or "no-data" instead of the usual numeric value):

{
  "automap" : {
    "template" : "logstash-*",
    "settings" : {
      "index.mapping.ignore_malformed" : true
    },
    "mappings" : {
      "_default_" : {
        "numeric_detection" : true,
        "_all" : { "enabled" : false },
        "properties" : {
          "message" : { "type" : "string" },
          "host" : { "type" : "string" },
          "UUID" : { "type" : "string", "index" : "not_analyzed" },
          "logdate" : { "type" : "string", "index" : "no" }
        }
      }
    }
  }
}
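
For anyone who prefers to generate such a file instead of editing it by hand, here is a minimal Python sketch that writes the same template as JSON, wrapped under its name as in the message above. The target directory is an assumption: the demo writes into a scratch directory, and where your Elasticsearch installation expects on-disk templates should be checked against its documentation. `write_template` is a hypothetical helper.

```python
import json
import os
import tempfile

# Template body copied from the message above; "automap" is its name there.
TEMPLATE_NAME = "automap"
TEMPLATE_BODY = {
    "template": "logstash-*",
    "settings": {"index.mapping.ignore_malformed": True},
    "mappings": {
        "_default_": {
            "numeric_detection": True,
            "_all": {"enabled": False},
            "properties": {
                "message": {"type": "string"},
                "host": {"type": "string"},
                "UUID": {"type": "string", "index": "not_analyzed"},
                "logdate": {"type": "string", "index": "no"},
            },
        }
    },
}


def write_template(directory, name, body):
    """Write {name: body} as <directory>/<name>.json and return the path."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, name + ".json")
    with open(path, "w") as handle:
        json.dump({name: body}, handle, indent=2)
        handle.write("\n")
    return path


# Demo: write into a scratch directory, not a real Elasticsearch config tree.
scratch_dir = tempfile.mkdtemp()
written_path = write_template(scratch_dir, TEMPLATE_NAME, TEMPLATE_BODY)
print(written_path)
```

Generating the file from a dictionary has one practical benefit: the output is always valid JSON, so stray characters picked up from mail formatting cannot end up in the template.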

Brian

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/a145cb1e-4013-4a6b-a58d-9a42368d8107%40googlegroups.com.


get rid of _all to optimize storage and perfs (Re: Splunk vs. Elastic search performance?)

2014-06-20 Thread Patrick Proniewski
On 20 juin 2014, at 18:43, Brian wrote:

> Re: double the storage. I strongly recommend ELK users to disable the _all 
> field. The entire text of the log events generated by logstash ends up in the 
> message field (and not @message as many people incorrectly post). So the _all 
> field is just redundant overhead with no value add. The result is a dramatic 
> drop in database file sizes and dramatic increase in load performance. Of 
> course, you need to configure ES to use the message field as the default for 
> a Lucene Kibana query.


The "message" field can be edited during logstash filtering, but assuming it 
is enough, I would love to remove the "_all" field and point Kibana to 
"message". Oddly, I can't find the "_all" field, either in Sense or in Kibana. 
I know it's enabled: 

GET _template/logstash

{
   "logstash": {
      "order": 0,
      "template": "logstash-*",
      "settings": {
         "index.refresh_interval": "5s"
      },
      "mappings": {
         "_default_": {
            "dynamic_templates": [
               {
                  "string_fields": {
                     "mapping": {
                        "index": "analyzed",
                        "omit_norms": true,
                        "type": "string",
                        "fields": {
                           "raw": {
                              "index": "not_analyzed",
                              "ignore_above": 256,
                              "type": "string"
                           }
                        }
                     },
                     "match_mapping_type": "string",
                     "match": "*"
                  }
               }
            ],
            "properties": {
               "geoip": {
                  "dynamic": true,
                  "path": "full",
                  "properties": {
                     "location": {
                        "type": "geo_point"
                     }
                  },
                  "type": "object"
               },
               "@version": {
                  "index": "not_analyzed",
                  "type": "string"
               }
            },
            "_all": {
               "enabled": true    <--
            }
         }
      },
      "aliases": {}
   }
}

But it looks like I can't retrieve/display its content. Any idea?

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/DA2C93C0-709E-4DAA-96A3-F6AB4588FF6A%40patpro.net.