Re: Question about time based indexes/rolling indexes and eviction policies?
On Friday, May 23, 2014 at 20:13 CEST, John Smith java.dev@gmail.com wrote:

> #1 I have been reading around, and some people suggest splitting the index based on time when doing log analytics. Is this built into Elasticsearch, or do I have to do it manually?

I don't believe Elasticsearch itself understands date-based indices, but Logstash does.

> If manual: PUT http://myhost:9200/myindex-(get-current-date-here)/SomeDoc/Id
> I'm pulling my data from SQL Server and am going to use either ETL or the JDBC gatherer. I suppose the ETL process needs to consider the date when it does its index PUT, and roll over so that a new index gets created?

Yes.

> And my queries need to consider this too, so that each day they know to search the new index?

Yes, unless you use an index alias like _all to search in all indices, but that obviously has performance implications and in part voids the benefits of multiple indices.

> #2 Is there such a thing as eviction policies? Basically, is there a way to check whether we are running out of disk space and either remove entries from the index or, in the above case, delete/archive indexes older than a few days?

If disk space is your limiting factor, you should find the curator script useful. You could also set the _ttl value of messages to have them automatically expire after a set time.

https://github.com/elasticsearch/curator
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-ttl-field.html

--
Magnus Bäck | Software Engineer, Development Tools
magnus.b...@sonymobile.com | Sony Mobile Communications

-- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/20140526063906.GB16396%40seldlx20533.corpusers.net.
For more options, visit https://groups.google.com/d/optout.
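For readers who want to see what the manual approach discussed above could look like, here is a minimal sketch in Python. It only builds index names and URLs and does not talk to a cluster. The `YYYY.MM.DD` suffix is an assumption that mirrors the naming Logstash uses for its daily indices; the function names, the `myhost` host, and the `myindex` prefix are hypothetical and taken from the example URL in the question.

```python
from datetime import date, timedelta
from typing import List


def daily_index_name(prefix: str, day: date) -> str:
    """Return a per-day index name, e.g. 'myindex-2014.05.23'.

    The date suffix format is an assumption (Logstash-style daily naming).
    """
    return f"{prefix}-{day:%Y.%m.%d}"


def document_url(host: str, prefix: str, day: date, doc_type: str, doc_id: str) -> str:
    """Build the URL an ETL job would PUT a document to on a given day."""
    return f"http://{host}:9200/{daily_index_name(prefix, day)}/{doc_type}/{doc_id}"


def indices_for_range(prefix: str, start: date, end: date) -> List[str]:
    """List the daily index names covering start..end, both inclusive."""
    days = (end - start).days
    return [daily_index_name(prefix, start + timedelta(days=n)) for n in range(days + 1)]


def search_url(host: str, index_names: List[str]) -> str:
    """Build a search URL restricted to the given indices (comma-separated)."""
    return f"http://{host}:9200/{','.join(index_names)}/_search"
```

Because the index name is derived from the date at write time, the rollover happens implicitly: the first document written on a new day goes to a new index name. On the query side, `indices_for_range` keeps a search limited to the days of interest instead of hitting every index.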
Re: Question about time based indexes/rolling indexes and eviction policies?
1. I will add a timeseries mode to my JDBC plugin soon. Right now you can create timestamps with bash (or your favorite shell) and append them as a suffix to the index name in the river/feeder creation call, but this can be automated. No ETA yet.

2. This is also a nifty feature. I will experiment with the JDBC plugin to see whether I can estimate the data volume to index (probably from the data volume of previous runs), or make an educated guess about data growth in the ES data folders, and have it refuse to continue if a limit is exceeded. Index data volume can fluctuate due to segment creation and merging, so this would have to include an optimization strategy, or I rely on the JDBC source.

Eviction is a harder topic, since I hesitate to create a plugin that can delete data without user interaction. Even eviction rules in a plugin configuration may contain mistakes and are risky. But I also see the usefulness of retiring indexed data by dropping it regularly. I don't want to take responsibility for this in the JDBC plugin, so this may just be another plugin implementation.

Jörg
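The two ideas in the reply above (refusing to continue past a volume limit, and dropping old indices regularly) can be sketched roughly as follows. This is not how the JDBC plugin or curator works; it is a hypothetical illustration under the assumption that indices are named `<prefix>-YYYY.MM.DD`. In line with the concern about deleting data without user interaction, the sketch only reports which indices are past retention and deletes nothing.

```python
from datetime import date, datetime, timedelta
from typing import Iterable, List


def expired_indices(names: Iterable[str], prefix: str, today: date, keep_days: int) -> List[str]:
    """Return the daily indices older than keep_days, sorted by name.

    Names that do not match '<prefix>-YYYY.MM.DD' (an assumed naming
    scheme) are skipped, so unrelated indices are never selected.
    """
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        if not name.startswith(prefix + "-"):
            continue
        suffix = name[len(prefix) + 1:]
        try:
            day = datetime.strptime(suffix, "%Y.%m.%d").date()
        except ValueError:
            continue  # not a date-suffixed index, leave it alone
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


def over_disk_limit(used_bytes: int, total_bytes: int, max_used_fraction: float) -> bool:
    """Return True when the used share of the disk exceeds the limit."""
    return used_bytes / total_bytes > max_used_fraction
```

The inputs to `over_disk_limit` could come from `shutil.disk_usage(path)` in the Python standard library, pointed at the data directory. Keeping the selection separate from any delete call means an operator can review the list before acting on it.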
Re: Question about time based indexes/rolling indexes and eviction policies?
Thanks!

On Monday, 26 May 2014 03:58:15 UTC-4, Jörg Prante wrote:

> 1. I will add a timeseries mode to my JDBC plugin soon. Right now you can create timestamps with bash (or your favorite shell) and append them as a suffix to the index name in the river/feeder creation call, but this can be automated. No ETA yet.
>
> 2. This is also a nifty feature. I will experiment with the JDBC plugin to see whether I can estimate the data volume to index (probably from the data volume of previous runs), or make an educated guess about data growth in the ES data folders, and have it refuse to continue if a limit is exceeded. Index data volume can fluctuate due to segment creation and merging, so this would have to include an optimization strategy, or I rely on the JDBC source.
>
> Eviction is a harder topic, since I hesitate to create a plugin that can delete data without user interaction. Even eviction rules in a plugin configuration may contain mistakes and are risky. But I also see the usefulness of retiring indexed data by dropping it regularly. I don't want to take responsibility for this in the JDBC plugin, so this may just be another plugin implementation.
>
> Jörg
Question about time based indexes/rolling indexes and eviction policies?
#1 I have been reading around, and some people suggest splitting the index based on time when doing log analytics. Is this built into Elasticsearch, or do I have to do it manually?

If manual:

PUT http://myhost:9200/myindex-(get-current-date-here)/SomeDoc/Id

I'm pulling my data from SQL Server and am going to use either ETL or the JDBC gatherer. I suppose the ETL process needs to consider the date when it does its index PUT, and roll over so that a new index gets created? And my queries need to consider this too, so that each day they know to search the new index?

#2 Is there such a thing as eviction policies? Basically, is there a way to check whether we are running out of disk space and either remove entries from the index or, in the above case, delete/archive indexes older than a few days?