Re: get rid of _all to optimize storage and perfs (Re: Splunk vs. Elastic search performance?)
All,

This seems apropos to the current discussion and could help clear up some confusion about recommendations. We, Elasticsearch, are hosting a webinar on ELK, given by the Logstash creator, Jordan Sissel. It's today, in 40 minutes.

http://www.elasticsearch.org/webinars/introduction-elk-stack/

On Wednesday, July 2, 2014 6:08:34 AM UTC-7, Brian wrote:
> Patrick,
>
>>> Well, I did answer your question. But probably not from the direction you expected.
>> hmm no, you didn't. My question was: "it looks like I can't retrieve/display [_all fields] content. Any idea?" and you replied with your logstash template where _all is disabled. I'm interested in disabling _all, but that was not my question at this point.
>
> Fair enough. I don't know the inner details; I am just an enthusiastic end user.
>
> To the best of my knowledge, there is no content for the _all field; I view this as an Elasticsearch pseudo field whose name is _all and whose index terms are taken from all fields (by default), but still there is no actual content for it.
>
> And after I got into the habit of disabling the _all field, my hands-on exploration of its nuances has ended. It's time for the experts to explain!
>
>> Your answer to my second message, below, is informative and interesting but fails to answer my second question too. I simply asked whether I need to feed the complete modified mapping of my template or if I can just push the modified part (i.e. the _all:{enabled: false} part).
>
> Again, I have never done this, so I can only tell you what I do. I just cannot tell you all the nuances of what Elasticsearch is capable of.
>
> My recommendation is to try it. Elasticsearch is great at letting you experiment and then telling you clearly if your attempt succeeds or fails.
>
> So, try your scenario. If it fails, then it didn't work or you did something wrong. If it succeeds, then you can see exactly what Elasticsearch actually accepted as your mapping. For example:
>
>     curl 'http://localhost:9200/logstash-2014.06.30/_mapping?pretty=true' && echo
>
> This particular query looks at one of my logstash-generated indices, and it lets me verify that Elasticsearch and Logstash conspired to create the mappings I expected. I used this command quite a bit until I finally got everything configured correctly. (I actually verify the mapping via Elasticsearch Head, but under the covers it's the same command.)
>
> Brian

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/d2dd4206-c8bd-4c96-90df-5ad4a7bce5e1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Re: get rid of _all to optimize storage and perfs (Re: Splunk vs. Elastic search performance?)
Patrick,

>> Well, I did answer your question. But probably not from the direction you expected.
> hmm no, you didn't. My question was: "it looks like I can't retrieve/display [_all fields] content. Any idea?" and you replied with your logstash template where _all is disabled. I'm interested in disabling _all, but that was not my question at this point.

Fair enough. I don't know the inner details; I am just an enthusiastic end user.

To the best of my knowledge, there is no content for the _all field; I view this as an Elasticsearch pseudo field whose name is _all and whose index terms are taken from all fields (by default), but still there is no actual content for it.

And after I got into the habit of disabling the _all field, my hands-on exploration of its nuances has ended. It's time for the experts to explain!

> Your answer to my second message, below, is informative and interesting but fails to answer my second question too. I simply asked whether I need to feed the complete modified mapping of my template or if I can just push the modified part (i.e. the _all:{enabled: false} part).

Again, I have never done this, so I can only tell you what I do. I just cannot tell you all the nuances of what Elasticsearch is capable of.

My recommendation is to try it. Elasticsearch is great at letting you experiment and then telling you clearly if your attempt succeeds or fails.

So, try your scenario. If it fails, then it didn't work or you did something wrong. If it succeeds, then you can see exactly what Elasticsearch actually accepted as your mapping. For example:

    curl 'http://localhost:9200/logstash-2014.06.30/_mapping?pretty=true' && echo

This particular query looks at one of my logstash-generated indices, and it lets me verify that Elasticsearch and Logstash conspired to create the mappings I expected. I used this command quite a bit until I finally got everything configured correctly. (I actually verify the mapping via Elasticsearch Head, but under the covers it's the same command.)

Brian
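The same check can be scripted. What follows is only a minimal sketch in Python, assuming the Elasticsearch 1.x response shape for GET /<index>/_mapping (index name, then "mappings", then one entry per type). The helper names fetch_mapping and all_field_status are hypothetical, and the sample response is made up for illustration rather than taken from a real cluster.

```python
import json
from urllib.request import urlopen


def fetch_mapping(base_url, index):
    """Fetch the mapping of one index (same request as the curl command)."""
    with urlopen(f"{base_url}/{index}/_mapping") as resp:
        return json.loads(resp.read().decode("utf-8"))


def all_field_status(mapping_response):
    """Return {(index, type): enabled} for every type in a _mapping response.

    Assumption: a type with no "_all" entry uses the default, which is enabled.
    """
    status = {}
    for index, body in mapping_response.items():
        for type_name, type_mapping in body.get("mappings", {}).items():
            enabled = type_mapping.get("_all", {}).get("enabled", True)
            status[(index, type_name)] = enabled
    return status


# Illustrative response, not real output from a cluster.
sample = {
    "logstash-2014.06.30": {
        "mappings": {
            "_default_": {"_all": {"enabled": False}},
            "syslog": {"_all": {"enabled": False}, "properties": {}},
            "legacy": {"properties": {}},
        }
    }
}

print(all_field_status(sample))
```

Against a live node, all_field_status(fetch_mapping("http://localhost:9200", "logstash-2014.06.30")) would report the same information for the real index.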
Re: get rid of _all to optimize storage and perfs (Re: Splunk vs. Elastic search performance?)
Brian,

On 30 June 2014, at 22:59, Brian wrote:
> Well, I did answer your question. But probably not from the direction you expected.

hmm no, you didn't. My question was: "it looks like I can't retrieve/display [_all fields] content. Any idea?" and you replied with your logstash template where _all is disabled. I'm interested in disabling _all, but that was not my question at this point.

Your answer to my second message, below, is informative and interesting but fails to answer my second question too. I simply asked whether I need to feed the complete modified mapping of my template or if I can just push the modified part (i.e. the _all:{enabled: false} part).

> When I create and manage specific indices, I lock down Elasticsearch. When I update the mappings, I understand that ES will not allow the mapping for an existing field to be modified in an incompatible way. So I only update to add new fields, and never to change or remove an existing field.
>
> For time-based indices as used by the ELK stack, it makes the most sense to me to create an on-disk mapping template. So I always disable the _all field and pre-map a subset of string fields as shown in my previous post. I do this because when the next day arrives and logstash causes a new index to be created, that new index will also set my default mapping from the template.
>
> I don't disable the _all field in an existing index that currently has it enabled. I don't know if it would succeed or fail, but I would not expect it to be successful.
>
> Instead, based on my previous experience with ES, I disable the _all field and have disabled it from the very first test deployment of the ELK stack in our group. And then I configured my ES startup script to set message as the default field for a Lucene query. This was already set up and working when I let others have access to it for the very first time. So I don't know the answer to your specific question.
>
> But I do know that a lot of experimentation went into my ELK configurations before I let anyone else look at it for the very first time. So don't be afraid to change your mappings and leave the old ones behind, and re-add data as needed to get everything just the way you want it.
>
> Brian
Re: get rid of _all to optimize storage and perfs (Re: Splunk vs. Elastic search performance?)
Patrick,

Well, I did answer your question. But probably not from the direction you expected.

When I create and manage specific indices, I lock down Elasticsearch. When I update the mappings, I understand that ES will not allow the mapping for an existing field to be modified in an incompatible way. So I only update to add new fields, and never to change or remove an existing field.

For time-based indices as used by the ELK stack, it makes the most sense to me to create an on-disk mapping template. So I always disable the _all field and pre-map a subset of string fields as shown in my previous post. I do this because when the next day arrives and logstash causes a new index to be created, that new index will also set my default mapping from the template.

I don't disable the _all field in an existing index that currently has it enabled. I don't know if it would succeed or fail, but I would not expect it to be successful.

Instead, based on my previous experience with ES, I disable the _all field and have disabled it from the very first test deployment of the ELK stack in our group. And then I configured my ES startup script to set message as the default field for a Lucene query. This was already set up and working when I let others have access to it for the very first time. So I don't know the answer to your specific question.

But I do know that a lot of experimentation went into my ELK configurations before I let anyone else look at it for the very first time. So don't be afraid to change your mappings and leave the old ones behind, and re-add data as needed to get everything just the way you want it.

Brian

On Monday, June 30, 2014 1:22:34 AM UTC-4, Patrick Proniewski wrote:
> Brian,
>
> Thank you for the reply, even if it does not answer my question.
>
> By the way, how am I supposed to change a mapping setting? Do I have to push back the entire mapping with one line modified, or can I just push something like:
>
> {
>   "logstash": {
>     "mappings": {
>       "_default_": {
>         "_all": {
>           "enabled": false
>         }
>       }
>     }
>   }
> }
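Brian does not show the startup setting he used, so the following sidesteps it: when querying Elasticsearch directly, the default field can also be chosen per request. This is a minimal sketch, assuming the query_string query's default_field parameter. The helper name build_message_query and the query text are hypothetical.

```python
import json


def build_message_query(text, field="message", size=10):
    """Build a search body whose Lucene query defaults to `field` instead of _all."""
    return {
        "size": size,
        "query": {
            "query_string": {
                "query": text,
                # Terms without an explicit "field:" prefix are matched here.
                "default_field": field,
            }
        },
    }


body = build_message_query("error AND timeout")
print(json.dumps(body, indent=2))
```

This only helps for requests you build yourself; a node-wide or index-wide default, as Brian describes, is still what a Kibana setup would rely on.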
Re: get rid of _all to optimize storage and perfs (Re: Splunk vs. Elastic search performance?)
Brian,

Thank you for the reply, even if it does not answer my question.

By the way, how am I supposed to change a mapping setting? Do I have to push back the entire mapping with one line modified, or can I just push something like:

{
  "logstash": {
    "mappings": {
      "_default_": {
        "_all": {
          "enabled": false
        }
      }
    }
  }
}

On 20 June 2014, at 23:04, Brian wrote:
> Patrick,
>
> Here's my template, along with where the _all field is disabled. You may wish to add this setting to your own template, and then also add the index setting to ignore malformed data (if someone's log entry occasionally slips in "null" or "no-data" instead of the usual numeric value):
>
> {
>   "automap" : {
>     "template" : "logstash-*",
>     "settings" : {
>       "index.mapping.ignore_malformed" : true
>     },
>     "mappings" : {
>       "_default_" : {
>         "numeric_detection" : true,
>         "_all" : { "enabled" : false },
>         "properties" : {
>           "message" : { "type" : "string" },
>           "host" : { "type" : "string" },
>           "UUID" : { "type" : "string", "index" : "not_analyzed" },
>           "logdate" : { "type" : "string", "index" : "no" }
>         }
>       }
>     }
>   }
> }
>
> Brian
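As far as I can tell, a PUT on _template/<name> replaces the stored template as a whole rather than merging a fragment into it, so the usual route is to fetch the template, change the one setting, and send the complete body back. This is a minimal sketch, assuming the GET _template/logstash response shape seen in this thread (template name as the outer key). The helper name disable_all is hypothetical, and the input is a trimmed-down stand-in rather than a full template.

```python
import copy
import json


def disable_all(get_response, name="logstash"):
    """Return a full PUT body for _template/<name> with _all disabled.

    The input is left untouched; every other setting is carried over.
    """
    template = copy.deepcopy(get_response[name])
    default = template.setdefault("mappings", {}).setdefault("_default_", {})
    default["_all"] = {"enabled": False}
    return template


# Trimmed-down stand-in for a GET _template/logstash response.
current = {
    "logstash": {
        "order": 0,
        "template": "logstash-*",
        "settings": {"index.refresh_interval": "5s"},
        "mappings": {"_default_": {"_all": {"enabled": True}}},
    }
}

put_body = disable_all(current)
print(json.dumps(put_body, indent=2))
```

The result would be sent with PUT /_template/logstash. A template is only applied when an index is created, so the change would show up in the next daily logstash index, not in existing ones.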
Re: get rid of _all to optimize storage and perfs (Re: Splunk vs. Elastic search performance?)
Patrick,

Here's my template, along with where the _all field is disabled. You may wish to add this setting to your own template, and then also add the index setting to ignore malformed data (if someone's log entry occasionally slips in "null" or "no-data" instead of the usual numeric value):

{
  "automap" : {
    "template" : "logstash-*",
    "settings" : {
      "index.mapping.ignore_malformed" : true
    },
    "mappings" : {
      "_default_" : {
        "numeric_detection" : true,
        "_all" : { "enabled" : false },
        "properties" : {
          "message" : { "type" : "string" },
          "host" : { "type" : "string" },
          "UUID" : { "type" : "string", "index" : "not_analyzed" },
          "logdate" : { "type" : "string", "index" : "no" }
        }
      }
    }
  }
}

Brian
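The template in this message is in the on-disk form, with the template name (automap) as the outer key. Below is a minimal sketch of registering the same body through the REST API instead. It assumes a node at http://localhost:9200 (the address used elsewhere in this thread) and that the API body is the inner object without the name wrapper. The helper name build_put_template is hypothetical, and the request is prepared but not sent.

```python
import json
from urllib.request import Request

# Same content as the on-disk template, minus the outer "automap" key.
TEMPLATE = {
    "template": "logstash-*",
    "settings": {"index.mapping.ignore_malformed": True},
    "mappings": {
        "_default_": {
            "numeric_detection": True,
            "_all": {"enabled": False},
            "properties": {
                "message": {"type": "string"},
                "host": {"type": "string"},
                "UUID": {"type": "string", "index": "not_analyzed"},
                "logdate": {"type": "string", "index": "no"},
            },
        }
    },
}


def build_put_template(base_url, name, template):
    """Prepare (but do not send) a PUT /_template/<name> request."""
    data = json.dumps(template).encode("utf-8")
    return Request(
        f"{base_url}/_template/{name}",
        data=data,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = build_put_template("http://localhost:9200", "automap", TEMPLATE)
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would send it; the mapping check from earlier in the thread can then confirm what the next new index actually received.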
get rid of _all to optimize storage and perfs (Re: Splunk vs. Elastic search performance?)
On 20 June 2014, at 18:43, Brian wrote:
> Re: double the storage. I strongly recommend ELK users to disable the _all field. The entire text of the log events generated by logstash ends up in the message field (and not @message as many people incorrectly post). So the _all field is just redundant overhead with no value add. The result is a dramatic drop in database file sizes and dramatic increase in load performance. Of course, you need to configure ES to use the message field as the default for a Lucene Kibana query.

The "message" field can be edited during logstash filtering, but assuming it's enough, I would love to remove the "_all" field and point Kibana to "message".

Oddly, I can't find the "_all" field in either Sense or Kibana. I know it's enabled:

GET _template/logstash
{
   "logstash": {
      "order": 0,
      "template": "logstash-*",
      "settings": {
         "index.refresh_interval": "5s"
      },
      "mappings": {
         "_default_": {
            "dynamic_templates": [
               {
                  "string_fields": {
                     "mapping": {
                        "index": "analyzed",
                        "omit_norms": true,
                        "type": "string",
                        "fields": {
                           "raw": {
                              "index": "not_analyzed",
                              "ignore_above": 256,
                              "type": "string"
                           }
                        }
                     },
                     "match_mapping_type": "string",
                     "match": "*"
                  }
               }
            ],
            "properties": {
               "geoip": {
                  "dynamic": true,
                  "path": "full",
                  "properties": {
                     "location": {
                        "type": "geo_point"
                     }
                  },
                  "type": "object"
               },
               "@version": {
                  "index": "not_analyzed",
                  "type": "string"
               }
            },
            "_all": {
               "enabled": true    <--
            }
         }
      },
      "aliases": {}
   }
}

But it looks like I can't retrieve/display its content. Any idea?
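One likely explanation: by default the _all field is indexed but not stored, and it is not part of _source, so there is nothing to display even though it can be searched. This is a minimal sketch of a search body that targets _all explicitly, which is one way to confirm the field is indexed. The helper name build_all_query and the search term are hypothetical.

```python
import json


def build_all_query(text, size=5):
    """Build a search body that matches `text` against the _all field."""
    return {
        "size": size,
        "query": {"match": {"_all": text}},
    }


body = build_all_query("sshd")
print(json.dumps(body, indent=2))
```

Posted to /logstash-*/_search, a body like this would return hits with their _source, but no _all key would appear in them.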