inconsistent paging
Hi, We've noticed a strange behavior in Elasticsearch during paging. In one case we use a page size of 60 and we have 63 documents. So the first page uses size 60 and offset 0, and the second page uses size 60 and offset 60. What we see is that the results are inconsistent: on the 2nd page, we sometimes get results that were already on the 1st page. The query we use orders by a numeric field in which many documents share the same value (0). It looks like the ordering between documents with the same sort value (0) isn't consistent. Did anyone encounter such behavior? Any suggestions on resolving this? We're using version 1.3.1. Thanks, Ron -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKHuyJpcYKepYzh%2BBU2MSD2RQ19zjHYiXgf3anWBL9esq9fkGQ%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
Re: inconsistent paging
You need to use scroll if you have that requirement. See: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-scroll.html#search-request-scroll -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
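As a sketch of what a scroll-based version of that paging could look like (the index and field names here are placeholders, not taken from the original post), the first request opens a scroll context via POST /my_index/_search?scroll=1m with a body such as:

```json
{
  "size": 60,
  "sort": [ { "my_numeric_field": "asc" } ],
  "query": { "match_all": {} }
}
```

The response carries a _scroll_id; each subsequent page is then fetched from /_search/scroll?scroll=1m with that id, so every page is served from the same shard copies and the same point-in-time view of the index.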
Re: inconsistent paging
Hi Ron, The cause of this issue is that Elasticsearch uses Lucene's internal doc IDs as tie-breakers. Internal doc IDs might be completely different across replicas of the same data, so this explains why documents that have the same sort values are not consistently ordered. There are 2 potential ways to fix that problem: 1. Use scroll, as David mentioned. It will create a context around your request and will make sure that the same shards are used for all pages. However, it also gives another guarantee, which is that the same point-in-time view of the index is used for each page, and this is expensive to maintain. 2. Use a custom string value as a preference in order to always hit the same shards for a given session[1]. This helps with always hitting the same shards, like option 1, but without the additional cost of a scroll. [1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-preference.html -- Adrien Grand
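A minimal sketch of option 2 (the preference value, index name, and field name are illustrative): send every page of one user session to /my_index/_search?preference=user_1234 with the usual from/size pagination in the body:

```json
{
  "from": 60,
  "size": 60,
  "sort": [ { "my_numeric_field": "asc" } ]
}
```

Because the preference string stays the same for each page of the session, the same shard copies answer every request, so ties on the sort value are broken by the same internal doc IDs each time.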
Re: Help with the percentiles aggregation
Hi John, You should be able to do something like: { "aggs": { "verb": { "terms": { "field": "verb" }, "aggs": { "load_time_outliers": { "percentiles": { "field": "responsetime" } } } } } } This will first break down your documents according to the HTTP verb that is being used and then compute percentiles separately for each unique verb. On Fri, Aug 15, 2014 at 11:23 AM, John Ogden johnog65...@gmail.com wrote: Hi, I am trying to run a single command which calculates percentiles for multiple search queries. The data for this is an Apache log file, and I want to get the percentile response times for the gets, posts, heads (etc.) in one go. If I run this: curl -XPOST 'http://localhost:9200/_search?search_type=count&pretty=true' -d '{ "facets": { "0": { "query": { "term": { "verb": "get" } } }, "1": { "query": { "term": { "verb": "post" } } } }, "aggs": { "load_time_outlier": { "percentiles": { "field": "responsetime" } } } }' The response I get back has the counts for each subquery but only computes the aggregations over the overall dataset: "facets": { "0": { "_type": "query", "count": 5678 }, "1": { "_type": "query", "count": 1234 } }, "aggregations": { "load_time_outlier": { "values": { "1.0": 0.0, ... "99.0": 1234 } } } I can't figure out how to structure the request so that I get the percentiles separately for each of the queries. Could someone point me in the right direction please? Many thanks, John -- Adrien Grand
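With the terms-plus-percentiles request above, the response should contain one bucket per verb, each carrying its own percentiles, roughly of this shape (the counts and percentile values below are invented for illustration):

```json
{
  "aggregations": {
    "verb": {
      "buckets": [
        {
          "key": "get",
          "doc_count": 5678,
          "load_time_outliers": { "values": { "1.0": 0.5, "99.0": 230.0 } }
        },
        {
          "key": "post",
          "doc_count": 1234,
          "load_time_outliers": { "values": { "1.0": 0.9, "99.0": 410.0 } }
        }
      ]
    }
  }
}
```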
Re: impact of stored fields on performance
Hi Ashish, On Thu, Aug 14, 2014 at 12:35 AM, Ashish Mishra laughingbud...@gmail.com wrote: That sounds possible. We are using spindle disks. I have ~36GB free for the filesystem cache, and the previous data size (without the added field) was 60-65GB per node. So it's likely that 50% of queries were previously served out of the FS cache, even more if queries are unevenly distributed. Data size is now 200GB/node. So only ~18% of queries could hit the cache and the rest would incur seek times. Hmm... given this knowledge, is there a way to mitigate the effect without moving everything to SSD? Only a minority of queries return the stored field and it is not indexed. Ideally, it would be stored in separate (colocated) files from the indexed fields. That way, most queries would be unaffected and only those returning the value would incur the seek cost. I imagine indexes with _source enabled would see similar effects. Is a parent-child relationship a good way to achieve the scenario above? The parent can contain indexed fields and the child has stored fields. Not sure if this just introduces new problems. I think that you don't even need parent/child relations for this. If you identify a few large stored fields that you rarely need, you could store them in a different index with the same _id and only GET them on demand. -- Adrien Grand
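A sketch of that separate-index pattern (all index, type, and field names here are hypothetical): the searchable fields live in the main index, e.g. PUT /items/item/42 with

```json
{ "title": "red widget", "price": 9.99 }
```

while the large, rarely-returned field is stored in a sibling index under the same id, e.g. PUT /items_payload/item/42 with

```json
{ "payload": "...large stored value..." }
```

Most queries then touch only the compact items index, and GET /items_payload/item/42 fetches the big value on demand for the minority of requests that need it.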
Re: accessing field data faster in script
Script filters are inherently slow due to the fact that they cannot leverage the inverted index in order to skip efficiently over non-matching documents. Even if they were written in assembly, this would likely still be slow. What kind of filtering are you trying to do with scripts? On Thu, Aug 14, 2014 at 8:42 AM, avacados kotadia.ak...@gmail.com wrote: How can I access field data faster from a native (Java) script? Should I enable doc values? I am already using doc().getField() and casting to long. It is a date field type. But whenever my argument to the script changes, the search query has poor performance. A subsequent call with the same argument has good performance (might be because _cache is true for that script filter). Thanks. -- Adrien Grand
Re: Return selected fields from aggregation?
Can you elaborate more on what you are after? On Wed, Aug 13, 2014 at 5:16 PM, project2501 darreng5...@gmail.com wrote: The old facet DSL was very nice and easy to understand. I could declare only which fields I wanted returned. How is this done with aggregations? The docs do not say. I am only interested in the aggregation metrics, not all the document results. I tried setting size:0 but that DOES NOT EVEN WORK. Any help appreciated. Thank you, D -- Adrien Grand
inconsistent paging
We've noticed a strange behavior in Elasticsearch during paging. In one case we use a page size of 60 and we have 63 documents. So the first page uses size 60 and offset 0, and the second page uses size 60 and offset 60. What we see is that the results are inconsistent: on the 2nd page, we sometimes get results that were already on the 1st page. The query we use orders by a numeric field in which many documents share the same value (0). It looks like the ordering between documents with the same sort value (0) isn't consistent. Did anyone encounter such behavior? Any suggestions on resolving this? We're using version 1.3.1. Thanks, Ron -- View this message in context: http://elasticsearch-users.115913.n3.nabble.com/inconsistent-paging-tp4061986.html Sent from the ElasticSearch Users mailing list archive at Nabble.com.
Re: Access to AbstractAggregationBuilder.name
Hi Phil, We would indeed consider a PR for that change if it makes things easier for you. Feel free to ping me when you open it so that I don't miss it. On Wed, Aug 13, 2014 at 3:55 PM, Phil Wills otherp...@gmail.com wrote: Hello, In the Java API, AbstractAggregationBuilder's name property is protected. Is there a particular reason it can't be public or have an accessor added, or is this something you'd consider a PR for? Not having access is making things more complicated than I'd like. Thanks, Phil -- Adrien Grand
Re: inconsistent paging
You have asked the same question from another Gmail ID. Please refer to the answers over there. Thanks Vineeth
Re: accessing field data faster in script
Thanks Adrien for the reply. My script filter was: === { "script": { "script": "xyz", "params": { "startRange": 1407939675, // timestamp in milliseconds; keeps changing across queries "endRange": 1410531675 // timestamp in milliseconds; keeps changing across queries }, "lang": "native", "_cache": true // I removed this caching and I found significant performance improvement... do you know why? :-) } } === My native (Java) script code: // Return true if the date ranges overlap. === ScriptDocValues XsDocValue = (ScriptDocValues) doc().get("start_time"); long XsLong = 0L; if (XsDocValue != null && !XsDocValue.isEmpty()) { XsLong = ((ScriptDocValues.Longs) doc().get("start_time")).getValue(); } ScriptDocValues XeDocValue = (ScriptDocValues) doc().get("end_time"); long XeLong = 0L; if (XeDocValue != null && !XeDocValue.isEmpty()) { XeLong = ((ScriptDocValues.Longs) doc().get("end_time")).getValue(); } if ((endRange >= XsLong) && (startRange <= XeLong)) { return true; } ===
Using a char_filter in combination with a lowercase filter
Hi, We're using Elasticsearch with an analyzer that maps the `y` character to `ij` (a *char_filter* named char_mapper), since in Dutch these two are somewhat interchangeable. We're also using a *lowercase filter*. This is the configuration:

{
  "analysis": {
    "analyzer": {
      "index": { "type": "custom", "tokenizer": "standard", "filter": [ "lowercase", "synonym_twoway", "standard", "asciifolding" ], "char_filter": [ "char_mapper" ] },
      "index_prefix": { "type": "custom", "tokenizer": "standard", "filter": [ "lowercase", "synonym_twoway", "standard", "asciifolding", "prefixes" ], "char_filter": [ "char_mapper" ] },
      "search": { "alias": [ "default" ], "type": "custom", "tokenizer": "standard", "filter": [ "lowercase", "synonym", "synonym_twoway", "standard", "asciifolding" ], "char_filter": [ "char_mapper" ] },
      "postal_code": { "tokenizer": "keyword", "filter": [ "lowercase" ] }
    },
    "tokenizer": { "standard": { "stopwords": [ ] } },
    "filter": {
      "synonym": { "type": "synonym", "synonyms": [ "st => sint", "jp => jan pieterszoon", "mh => maarten harpertszoon" ] },
      "synonym_twoway": { "type": "synonym", "synonyms": [ "den haag, s gravenhage", "den bosch, s hertogenbosch" ] },
      "prefixes": { "type": "edgeNGram", "side": "front", "min_gram": 1, "max_gram": 30 }
    },
    "char_filter": { "char_mapper": { "type": "mapping", "mappings": [ "y => ij" ] } }
  }
}

When indexing cities, we're using this mapping:

{
  "properties": {
    "city": { "type": "multi_field", "fields": {
      "city": { "type": "string" },
      "prefix": { "type": "string", "boost": 0.5, "index_analyzer": "index_prefix" }
    } },
    "province_code": { "type": "string" },
    "unique_name": { "type": "boolean" },
    "point": { "type": "geo_point" },
    "search_terms": { "type": "multi_field", "fields": {
      "search_terms": { "type": "string" },
      "prefix": { "boost": 0.5, "index_analyzer": "index_prefix", "type": "string" }
    } }
  },
  "search_analyzer": "search",
  "index_analyzer": "index"
}

When we index all the (Dutch) cities from our data source, there are cities starting with both `IJ` and `Y` (for example, these city names exist: *IJssel*, *IJsselstein*, *Yerseke* and *Ysselsteyn*). It seems that these characters are not lowercased before the char_mapping is applied.
Querying the index results in:
/top/city/_search?q=ijsselstein - works, returns the document for IJsselstein
/top/city/_search?q=Ijsselstein - works, returns the document for IJsselstein
/top/city/_search?q=yerseke - *doesn't* work, returns nothing
/top/city/_search?q=Yerseke - *does* work, returns the document for Yerseke
/top/city/_search?q=YsselsteYn - *doesn't* work, returns nothing
/top/city/_search?q=Ysselsteyn - *does* work, returns the document for Ysselsteyn
Changing the case of any other letter doesn't affect the results. I've worked around this issue by adding the mapping "Y => ij", i.e.: "char_filter": { "char_mapper": { "type": "mapping", "mappings": [ "y => ij", "Y => ij" ] } } This solves the problem, but I'd rather see the lowercase filter applied before the mapping, or be able to make the order explicit. Is there any stance on this issue? Or is this intended behaviour? Regards, Matthias Hogerheijde
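One note on the ordering question: in Elasticsearch's analysis chain, char filters always run on the raw input before the tokenizer, and token filters such as lowercase only run afterwards on the resulting tokens, so the char_mapper necessarily sees the text before it is lowercased. That makes the two-mapping workaround from the post the practical fix; in context, the char_filter section becomes:

```json
{
  "char_filter": {
    "char_mapper": {
      "type": "mapping",
      "mappings": [ "y => ij", "Y => ij" ]
    }
  }
}
```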
Re: accessing field data faster in script
Your filter would be faster if you used range filters on the start/end dates instead of using a script. On Mon, Aug 18, 2014 at 10:52 AM, avacados kotadia.ak...@gmail.com wrote: _cache: true // I removed this caching and i found significant performance improvement... do you know why ? :-) Yes: when caching a filter, it needs to be evaluated over all documents of your index in order to be loaded into a bit set. On the other hand, when a script filter is not cached it will typically only be evaluated on documents that match the query. -- Adrien Grand
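Concretely, the script filter from the earlier message could be replaced by two range filters expressing the same interval-overlap condition (the field names are taken from that message; the timestamp values are just the example parameters):

```json
{
  "filtered": {
    "query": { "match_all": {} },
    "filter": {
      "bool": {
        "must": [
          { "range": { "start_time": { "lte": 1410531675 } } },
          { "range": { "end_time": { "gte": 1407939675 } } }
        ]
      }
    }
  }
}
```

A document's [start_time, end_time] interval overlaps [startRange, endRange] exactly when start_time <= endRange and end_time >= startRange, which is what the two clauses encode, and both can be answered from the inverted index rather than per-document script execution.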
Exception in suggester response
Hi! I tried testing the Elasticsearch suggester, but I got a strange error. user@user:/user/esconfig # curl -X POST 'localhost:9200/dwh_direct/_suggest?pretty' -d @suggester { "_shards" : { "total" : 5, "successful" : 0, "failed" : 5, "failures" : [ { "index" : "dwh_direct", "shard" : 0, "reason" : "BroadcastShardOperationFailedException[[dwh_direct][0] ]; nested: ElasticsearchException[failed to execute suggest]; nested: ClassCastException[org.elasticsearch.index.mapper.core.StringFieldMapper cannot be cast to org.elasticsearch.index.mapper.core.CompletionFieldMapper]; " }, { "index" : "dwh_direct", "shard" : 1, "reason" : "BroadcastShardOperationFailedException[[dwh_direct][1] ]; nested: ElasticsearchException[failed to execute suggest]; nested: ClassCastException[org.elasticsearch.index.mapper.core.StringFieldMapper cannot be cast to org.elasticsearch.index.mapper.core.CompletionFieldMapper]; " }, { "index" : "dwh_direct", "shard" : 2, "reason" : "BroadcastShardOperationFailedException[[dwh_direct][2] ]; nested: ElasticsearchException[failed to execute suggest]; nested: ClassCastException[org.elasticsearch.index.mapper.core.StringFieldMapper cannot be cast to org.elasticsearch.index.mapper.core.CompletionFieldMapper]; " }, { "index" : "dwh_direct", "shard" : 3, "reason" : "BroadcastShardOperationFailedException[[dwh_direct][3] ]; nested: ElasticsearchException[failed to execute suggest]; nested: ClassCastException[org.elasticsearch.index.mapper.core.StringFieldMapper cannot be cast to org.elasticsearch.index.mapper.core.CompletionFieldMapper]; " } ] } } user@user:/user/esconfig # more suggester { "my-suggest" : { "text" : "co", "completion" : { "field" : "name" } } } Is this a bug in Elasticsearch, or did I make a mistake in the configuration or the query? -- Maxim Krasovsky
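The ClassCastException (StringFieldMapper cannot be cast to CompletionFieldMapper) indicates that the name field is mapped as a plain string, while the completion suggester only works on fields mapped with type completion. A mapping along these lines would be needed, followed by reindexing (the type name your_type is a placeholder, since the actual type in dwh_direct isn't shown):

```json
{
  "your_type": {
    "properties": {
      "name": { "type": "completion" }
    }
  }
}
```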
Enhancing perf for my cluster
Hi everyone! I'm currently working on a tool with *ES and the Twitter Streaming API*, in which I try to find interesting profiles on Twitter, based on what they tweet, RT, and which of their interactions are shared/RT'd. Anyway, I use ES to index and search among tweets. To do that, I get Twitter stream data and put users and tweets (2 types) in a *single index*, linked by the user id via a parent-child relation. Actually, I thought about my indexing a lot and it is the best way to do it. - I need to update users very often (because I score them and because they update their profile quite often), so nesting the user in the tweet is not an option (too many replicas) - I could put a user's tweets directly in the user object but I would have huge objects and I don't really want that. I work on a SoYouStart server, 4c/4t 3.2GHz, 32GB RAM, 4TB HDD. My settings for the index are: settings = { 'index' : { 'number_of_replicas' : 0, 'refresh_interval' : '10s', 'routing.allocation.disable_allocation': False }, 'analysis': { 'analyzer': { 'snowFrench': { 'type': 'snowball', 'language': 'French' }, 'snowEnglish': { 'type': 'snowball', 'language': 'English' }, 'snowGerman': { 'type': 'snowball', 'language': 'German' }, 'snowRussian': { 'type': 'snowball', 'language': 'Russian' }, 'snowSpanish': { 'type': 'snowball', 'language': 'Spanish' }, 'snowJapanese': { 'type': 'snowball', 'language': 'Japanese' }, 'edgeNGramAnalyzer': { 'tokenizer': 'myEdgeNGram' }, 'name_analyzer': { 'tokenizer': 'whitespace', 'type': 'custom', 'filter': ['lowercase', 'multi_words', 'name_filter'] }, 'city_analyzer' : { 'type' : 'snowball', 'language' : 'English' } }, 'tokenizer' : { 'myEdgeNGram' : { 'type' : 'edgeNGram', 'min_gram' : 2, 'max_gram' : 5 }, 'name_tokenizer': { 'type': 'edgeNGram', 'max_gram': 100, 'min_gram': 4 } }, 'filter': { 'multi_words': { 'type': 'shingle', 'min_shingle_size': 2, 'max_shingle_size': 10 }, 'name_filter': { 'type': 'edgeNGram', 'max_gram': 100, 'min_gram': 4 } } } } And my mappings are: tweet_mapping = { '_all' : { 'enabled' : False }, '_ttl' : { 'enabled' : True, 'default' : '400d' }, '_parent' : { 'type' : 'user' }, 'properties': { 'textfr': {
'type': 'string', '_analyzer': 'snowFrench', 'copy_to': 'text' }, 'texten': { 'type': 'string', '_analyzer': 'snowEnglish', 'copy_to': 'text' }, 'textde': { 'type': 'string', '_analyzer': 'snowGerman', 'copy_to': 'text' }, 'textja': { 'type': 'string', '_analyzer': 'snowJapanese', 'copy_to': 'text' }, 'textru': { 'type': 'string', '_analyzer': 'snowRussian', 'copy_to': 'text' }, 'textes': { 'type': 'string', '_analyzer': 'snowSpanish', 'copy_to': 'text' }, 'text': { 'type': 'string', 'null_value': '', 'index': 'analyzed', 'store': 'yes' }, 'entities': { 'type': 'object', 'index': 'analyzed', 'store': 'yes', 'properties': { 'hashtags': { 'index': 'analyzed', 'store': 'yes', 'type': 'string', '_analyzer': 'edgeNGramAnalyzer' }, 'mentions': { 'index': 'not_analyzed', 'store': 'yes', 'type': 'long', 'precision_step': 64 } } }, 'lang': { 'index': 'not_analyzed', 'store': 'yes', 'type': 'string' }, 'created_at': { 'index': 'not_analyzed', 'store': 'yes', 'type': 'date', 'format' : 'dd-MM- HH:mm:ss' } } } user_mapping = { '_all' : { 'enabled' : False }, '_ttl' : { 'enabled' : True, 'default' : '600d' }, 'properties': { 'lang': { 'index': 'not_analyzed', 'store': 'yes', 'type': 'string' }, 'name': { 'index': 'analyzed', 'store': 'yes', 'type': 'string', '_analyzer': 'edgeNGramAnalyzer' }, 'screen_name': { 'index': 'analyzed', 'store': 'yes', 'type': 'string', '_analyzer': 'edgeNGramAnalyzer' }, 'descfr': { 'type': 'string', '_analyzer': 'snowFrench', 'copy_to': 'description' }, 'descen': { 'type': 'string', '_analyzer': 'snowEnglish', 'copy_to': 'description' }, 'descde': { 'type': 'string', '_analyzer': 'snowGerman', 'copy_to': 'description' }, 'descja': { 'type': 'string', '_analyzer': 'snowJapanese', 'copy_to': 'description' }, 'descru': { 'type': 'string', '_analyzer': 'snowRussian', 'copy_to': 'description' }, 'desces': { 'type': 'string', '_analyzer': 'snowSpanish', 'copy_to': 'description' }, 'description': { 'type': 'string', 'null_value': '', 'index': 'analyzed', 'store': 'yes' }, 'created_at': { 'index': 'not_analyzed', 'store':
'yes', 'type': 'date',
elasticsearch php api with multiple hosts
I followed this link to create a 2-node Elasticsearch cluster on Azure: http://thomasardal.com/running-elasticsearch-in-a-cluster-on-azure/ The installation and configuration went well. When I started to check the cluster, I found strange behaviour in the PHP client. I declared 2 hosts in the client: $ELSEARCH_SERVER = array("dns1:9200", "dns2:9200"); $params = array(); $params['hosts'] = $ELSEARCH_SERVER; $dstEl = new Elasticsearch\Client($params); The expected behaviour is that it will try to insert the documents into dns1 and, if that fails, *automatically* switch to dns2. But, for some reason, when one of the servers is down during insertion, the PHP client throws an exception that it couldn't connect to the host, and nothing more. Is there any way to make the client automatically choose an online server? thnx, Niv
How to update nest from 0.12 to 1.0
Hello. Does anyone use NEST for .NET? Please help me. Some time ago I asked how to get part of a text field. I wanted to do it with the Highlight parameter no_match_size, but that is only supported since NEST version 1.0RC1. After updating nest.dll from 0.12 to 1.0 I found that nothing works any more. Searching GitHub for a changelog didn't help.
Re: A few questions about node types + usage
Hello again Mark, Thanks for your response. Your answers really are very helpful. As with our previous conversation https://groups.google.com/d/topic/elasticsearch/ZouS4NVsTJw/discussion I am confused about how to make a client node also be master eligible. This is what I posted there, I would really like some help understanding this: I've done more investigating and it seems that a Client (AKA Query) node cannot also be a Master node. As it says here http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-discovery-zen.html#master-election *Nodes can be excluded from becoming a master by setting node.master to false. Note, once a node is a client node (node.client set to true), it will not be allowed to become a master (node.master is automatically set to false).* And from the elasticsearch.yml config file it says: *# 2. You want this node to only serve as a master: to not store any data and # to have free resources. This will be the coordinator of your cluster. # #node.master: true #node.data: false # # 3. You want this node to be neither master nor data node, but # to act as a search load balancer (fetching data from nodes, # aggregating results, etc.) # #node.master: false #node.data: false* So I'm wondering how exactly you set up your client nodes to also be master nodes. It seems like a master node can only either be purely a master or master + data. Perhaps you could show the relevant parts of one of your client node's config? Many thanks, Alex On Saturday, 16 August 2014 01:04:37 UTC+1, Mark Walkom wrote: 1 - Up to you. We use the http output and then just use a round robin A record to our 3 masters. 2 - They are routed but it makes more sense to specify. 3 - You're right, but most people only use 1 or 2 masters which is why they get recommended to have at least 3. 4 - That sounds like a lot. We use masters that double as clients and they only have 8GB, our use sounds similar and we don't have issues. 
I wouldn't bother with 3 client-only nodes to start; use them as master and client, and then if you find you are hitting memory issues due to queries you can re-evaluate things. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 15 August 2014 20:11, Alex alex@gmail.com wrote: Bump. Any help? Thanks On Wednesday, 13 August 2014 12:10:14 UTC+1, Alex wrote: Hello I would like some clarification about node types and their usage. We will have 3 client nodes and 6 data nodes. The 6 1TB data nodes can also be masters (discovery.zen.minimum_master_nodes set to 4). We will use Logstash and Kibana. Kibana will be used 24/7 by between a couple and a handful of people. Some questions: 1. Should incoming Logstash write requests be sent to the cluster in general (using the *cluster* setting in the *elasticsearch* output) or specifically to the client nodes or to the data nodes (via load balancer)? I am unsure what kind of node is best for handling writes. 2. If client nodes exist in the cluster, are Kibana requests automatically routed to them? Do I need to somehow specify to Kibana which nodes to contact? 3. I have heard different information about master nodes and the minimum_master_nodes setting. I've heard that you should have an odd number of master nodes, but I fail to see why the parity of the number of masters matters as long as minimum_master_nodes is set to at least N/2 + 1. Does it really need to be odd? 4. I have been advised that the client nodes will use a huge amount of memory (which makes sense due to the nature of the Kibana facet queries). 64GB per client node was recommended but I have no idea if that sounds right or not. I don't have the ability to actually test it right now, so any more guidance on that would be helpful. I'd be so grateful to hear from you even if you only know something about one of my queries.
Thank you for your time, Alex
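To make the combinations concrete, here is a sketch of the three elasticsearch.yml variants discussed above. The key point from the quoted docs is to set node.master and node.data explicitly and leave node.client unset, since node.client: true forces node.master to false:

```yaml
# Master-eligible node that holds no data (can also serve as a
# coordinator/"client" for searches -- what Mark describes):
node.master: true
node.data: false

# Pure client / search load balancer, never master-eligible:
node.master: false
node.data: false

# Data node that is also master-eligible:
node.master: true
node.data: true
```

Note there is no separate setting that makes a node "a client"; a non-data node already behaves as one, and it stays master-eligible as long as node.master is true.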
river-csv plugin
Hi, This is for elasticsearch-1.3.2-1.noarch. There are 2 nodes in the cluster. I have installed the river-csv plugin. When loading a file with 5 million rows, loading stops after 477400 rows. I load with:

curl -XPUT localhost:9200/_river/my_csv_river/_meta -d '
{
    "type": "csv",
    "csv_file": {
        "folder": "/u01/app/div",
        "first_line_is_header": true
    }
}'

In the logfile I see: [2014-08-18 14:44:53,216][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Stanley] [csv][my_csv_river] Going to execute new bulk composed of 100 actions [2014-08-18 14:44:53,275][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Stanley] [csv][my_csv_river] Executed bulk composed of 100 actions [2014-08-18 14:44:53,280][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Stanley] [csv][my_csv_river] Going to execute new bulk composed of 100 actions [2014-08-18 14:44:53,299][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Stanley] [csv][my_csv_river] Executed bulk composed of 100 actions [2014-08-18 14:44:53,385][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Stanley] [csv][my_csv_river] Executed bulk composed of 100 actions

./es -v indices status
name pri rep size bytes docs
green _river 1 1 15452 2
green my_csv_river 5 1 296047073 477400

Am I doing something wrong? Regards, HansP
Re: Exception in suggester response
Hello Maxim, Can you show the schema and some sample data that you have indexed? Thanks Vineeth On Mon, Aug 18, 2014 at 3:31 PM, m...@ciklum.com wrote: Hi! I am trying out the elasticsearch suggester, but I get a strange error. user@user:/user/esconfig # curl -X POST 'localhost:9200/dwh_direct/_suggest?pretty' -d @suggester { _shards : { total : 5, successful : 0, failed : 5, failures : [ { index : dwh_direct, shard : 0, reason : BroadcastShardOperationFailedException[[dwh_direct][0] ]; nested: ElasticsearchException[failed to execute suggest]; nested: ClassCastException[org.elasticsearch.index.mapper.core.StringFieldMapper cannot be cast to org.elasticsearch.index.mapper.core.CompletionFieldMapper]; }, { index : dwh_direct, shard : 1, reason : BroadcastShardOperationFailedException[[dwh_direct][1] ]; nested: ElasticsearchException[failed to execute suggest]; nested: ClassCastException[org.elasticsearch.index.mapper.core.StringFieldMapper cannot be cast to org.elasticsearch.index.mapper.core.CompletionFieldMapper]; }, { index : dwh_direct, shard : 2, reason : BroadcastShardOperationFailedException[[dwh_direct][2] ]; nested: ElasticsearchException[failed to execute suggest]; nested: ClassCastException[org.elasticsearch.index.mapper.core.StringFieldMapper cannot be cast to org.elasticsearch.index.mapper.core.CompletionFieldMapper]; }, { index : dwh_direct, shard : 3, reason : BroadcastShardOperationFailedException[[dwh_direct][3] ]; nested: ElasticsearchException[failed to execute suggest]; nested: ClassCastException[org.elasticsearch.index.mapper.core.StringFieldMapper cannot be cast to org.elasticsearch.index.mapper.core.CompletionFieldMapper]; } ] } } user@user:/user/esconfig # more suggester { my-suggest : { text : co, completion : { field : name } } } Is this a bug in elasticsearch, or did I make a mistake in the configuration or query? -- Maxim Krasovsky
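For context on the ClassCastException above: it typically means the field targeted by the completion suggester is mapped as a plain string, while the completion suggester requires a completion-typed field. A sketch of the kind of mapping it expects (the type name and analyzers here are my own assumptions, not from the thread; the field must be re-indexed after changing the mapping):

```json
{
  "mappings": {
    "my_type": {
      "properties": {
        "name": {
          "type": "completion",
          "index_analyzer": "simple",
          "search_analyzer": "simple"
        }
      }
    }
  }
}
```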
Re: EsRejectedExecutionException: rejected execution (queue capacity 1000)
You can put *threadpool.search.type: **cached* on elasticsearch.yml for unbounded queue for reads. 2014-08-10 9:52 GMT-03:00 James digital...@gmail.com: On Sat, 2014-08-09 at 23:53 -0700, Deep wrote: Hi, Elastic search internally has thread pool and a queue size is associated with each pool. You can have pools for search threads, index threads etc. Please see the elastic search documentation in link http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-threadpool.html . I think it is possible to override these properties in the elasticsearch.yml configuration file. Regards, Ishwardeep On Saturday, 9 August 2014 00:54:02 UTC+5:30, digit...@gmail.com wrote: So I've seen a few posts on this, but I've not seen any solutions posted. I've been log monitoring and I was trying to determine how to fix the below...any information would be great thank you. [2014-08-08 19:14:12,578][DEBUG][action.search.type ] [Jericho Drumm] [bro-201408032100][2], node[fgjxNK0cQ3O5Usn7wyjaMA], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@126067b7] lastShard [true] org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution (queue capacity 1000) on org.elasticsearch.search.action.SearchServiceTransportAction$23@5a879352 at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:62) at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821) at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372) at org.elasticsearch.search.action.SearchServiceTransportAction.execute(SearchServiceTransportAction.java:509) at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:203) at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80) at 
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:171) at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:153) at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59) at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:49) at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63) at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:101) at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43) at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63) at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92) at org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:212) at org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:75) at org.elasticsearch.rest.RestController.executeHandler(RestController.java:159) at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:142) at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121) at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83) at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:294) at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:44) at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at 
org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) at org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145) at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296) at
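The threadpool.search.type setting mentioned at the top of this reply goes in elasticsearch.yml on each node. A sketch of both variants (an unbounded queue hides overload rather than curing it, so raising queue_size on the default fixed pool is often the safer first step):

```yaml
# Unbounded search queue, as suggested above:
threadpool.search.type: cached

# Alternative: keep the bounded pool but raise the queue limit
# threadpool.search.type: fixed
# threadpool.search.queue_size: 2000
```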
Re: Query problem
David hi, How can I configure the mapping so that the default analyzer will be the whitespace one? On Wed, Aug 13, 2014 at 2:46 PM, David Pilato da...@pilato.fr wrote: Having no answer is not good. I think something goes wrong here. Maybe you should see something in the logs. That said, if you don't want to break your string into tokens at index time, you could set index: not_analyzed for the fields you don't want to analyze. But you should read this part of the book: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/analysis-intro.html#analysis-intro -- *David Pilato* | *Technical Advocate* | *Elasticsearch.com* @dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr On 13 August 2014 at 14:39:20, Luc Evers (lucev...@gmail.com) wrote: I would like to use elasticsearch as a NoSQL database + search engine for data coming from text files (router configs) and databases. First I moved a router config to a JSON file which I indexed. Mapping: { configs : { mappings : { test : { properties : { ConfLength : { type : string }, NVRAM : { type : string }, aaa : { type : string }, enable : { type : string }, hostname : { type : string }, lastChange : { type : string }, logging : { type : string }, model : { type : string }, policy-map : { type : string } } } } } } Document: { _index : configs, _type : test, _id : 7, _score : 1, _source : { hostname : [ hostname test-1234 ] } }, Example of a simple search: search for a hostname. If I start a query: curl -XGET 'http://127.0.0.1:9200/configs/_search?q=hostname test-1234' curl: (52) Empty reply from server. No response. If I start a second query without hostname, I get an answer: curl -XGET 'http://127.0.0.1:9200/configs/_search?q=test-1234' OK. Analyser: standard. Why can a search find test-1234 but not hostname test-1234?
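For the whitespace question above: besides dynamic templates, the index-level default analyzer can be overridden in the index settings. A sketch (this only affects indices created with these settings; existing fields keep the analyzer they were mapped with):

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "default": { "type": "whitespace" }
      }
    }
  }
}
```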
ThreadPool reject_policy
What does it work the threadpool using reject_policy *caller*? Can I catch the exception EsRejectedExecutionException (using Java api) during heavy writes? -- Atenciosamente, Sávio S. Teles de Oliveira voice: +55 62 9136 6996 http://br.linkedin.com/in/savioteles Mestrando em Ciências da Computação - UFG Arquiteto de Software CUIA Internet Brasil -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAFKmhPtnm9xV21nhvtE%3D0hv4GoLXhugNpkJXqC9Mec93892USg%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
Re: Query problem
I think this could help you: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/custom-dynamic-mapping.html -- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet | @elasticsearchfr On 18 August 2014 at 15:39:36, Luc Evers (lucev...@gmail.com) wrote: David hi, How can I configure the mapping so that the default analyzer will be the whitespace one? On Wed, Aug 13, 2014 at 2:46 PM, David Pilato da...@pilato.fr wrote: Having no answer is not good. I think something goes wrong here. Maybe you should see something in the logs. That said, if you don't want to break your string into tokens at index time, you could set index: not_analyzed for the fields you don't want to analyze. But you should read this part of the book: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/analysis-intro.html#analysis-intro -- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet | @elasticsearchfr On 13 August 2014 at 14:39:20, Luc Evers (lucev...@gmail.com) wrote: I would like to use elasticsearch as a NoSQL database + search engine for data coming from text files (router configs) and databases. First I moved a router config to a JSON file which I indexed. Mapping: { configs : { mappings : { test : { properties : { ConfLength : { type : string }, NVRAM : { type : string }, aaa : { type : string }, enable : { type : string }, hostname : { type : string }, lastChange : { type : string }, logging : { type : string }, model : { type : string }, policy-map : { type : string } } } } } } Document: { _index : configs, _type : test, _id : 7, _score : 1, _source : { hostname : [ hostname test-1234 ] } }, Example of a simple search: search for a hostname.
If I start a query: curl -XGET 'http://127.0.0.1:9200/configs/_search?q=hostname test-1234' curl: (52) Empty reply from server. No response. If I start a second query without hostname, I get an answer: curl -XGET 'http://127.0.0.1:9200/configs/_search?q=test-1234' OK. Analyser: standard. Why can a search find test-1234 but not hostname test-1234?
How to normalize score when combining regular query and function_score?
First of all, kudos on the awesome job everyone here is doing! I was wondering if you can help me solve this puzzle. Also available on Stack Overflow: http://stackoverflow.com/questions/25361795/elasticsearch-how-to-normalize-score-when-combining-regular-query-and-function Ideally, what I am trying to achieve is to assign weights to queries such that query1 constitutes 30% of the final score and query2 constitutes the other 70%, so that to achieve the maximum score a document has to have the highest possible score on both query1 and query2. My study of the documentation did not yield any hints as to how to achieve this, so let's try to solve a simpler problem. Consider a query of the following form: { query: { bool: { should: [ { function_score: { query: { match_all: {} }, script_score: { script: some_script } } }, { match: { message: this is a test } } ] } } } The script can return an arbitrary number (think: it can return something like 12392002). How do I make sure that the result from the script will not dominate the overall score? (My experiments using explain show that this can indeed happen very often.) Is there any way to normalize it? For example, instead of the script score, return the ratio to max_script_score (achieved by the document with the highest score)?
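One common workaround, sketched below (my own suggestion, not from the thread; the field popularity and the constant 1000 are placeholders): squash the unbounded script value into [0, 1) with a saturation function such as x / (x + k), then express the 30/70 preference with boosts. This bounds the script side, but note the match score itself is still unnormalized, so the split remains approximate:

```json
{
  "query": {
    "bool": {
      "should": [
        {
          "function_score": {
            "query": { "match_all": {} },
            "boost": 0.3,
            "script_score": {
              "script": "v = doc['popularity'].value; v / (v + 1000)"
            }
          }
        },
        {
          "match": {
            "message": { "query": "this is a test", "boost": 0.7 }
          }
        }
      ]
    }
  }
}
```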
help with a grok filter
Could someone help me write a grok filter for this log real quick? Here is what the log looks like: Aug 18 09:40:39 server01 webmin_log: 172.16.16.96 - username [18/Aug/2014:09:40:39 -0400] GET /right.cgi?open=systemopen=status HTTP/1.1 200 3228 Here is what I have so far: match => [ "message", "%{SYSLOGTIMESTAMP:timestamp} %{WORD:Server} webmin_log: %{IP:IP_Address} - %{USERNAME:username}" ] I am stuck at the middle part: [18/Aug/2014:09:40:39 -0400] After that I would continue with: %{WORD:method} %{URIPATHPARAM:request} HTTP/1.1 %{NUMBER:bytes} %{NUMBER:duration}
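For the bracketed timestamp, the stock HTTPDATE grok pattern matches 18/Aug/2014:09:40:39 -0400; the brackets themselves just need escaping. A sketch of the full filter (field names are my own choice, and I have assumed the two trailing numbers are the HTTP status code and response size, rather than bytes and duration):

```ruby
filter {
  grok {
    # \[%{HTTPDATE:...}\] covers the part the post is stuck on
    match => [ "message", "%{SYSLOGTIMESTAMP:timestamp} %{WORD:server} webmin_log: %{IP:client_ip} - %{USERNAME:username} \[%{HTTPDATE:access_time}\] %{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion} %{NUMBER:response} %{NUMBER:bytes}" ]
  }
}
```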
Re: Help with the percentiles aggregation
That's spot on. Thanks! On 18 Aug 2014 09:08, Adrien Grand adrien.gr...@elasticsearch.com wrote: Hi John, You should be able to do something like:

{
  "aggs": {
    "verb": {
      "terms": { "field": "verb" },
      "aggs": {
        "load_time_outliers": {
          "percentiles": { "field": "responsetime" }
        }
      }
    }
  }
}

This will first break down your documents according to the HTTP verb that is being used and then compute percentiles separately for each unique verb. On Fri, Aug 15, 2014 at 11:23 AM, John Ogden johnog65...@gmail.com wrote: Hi, I am trying to run a single command which calculates percentiles for multiple search queries. The data for this is an Apache log file, and I want to get the percentile response times for the gets, posts, heads (etc.) in one go. If I run this: curl -XPOST 'http://localhost:9200/_search?search_type=count&pretty=true' -d '{ facets: { 0: { query : { term : { verb : get } } }, 1: { query : { term : { verb : post } } } }, aggs : { load_time_outlier : { percentiles : { field : responsetime } } } }' The response I get back has the counts for each subquery but only does the aggregations for the overall dataset: facets : { 0 : { _type : query, count : 5678 }, 1 : { _type : query, count : 1234 } }, aggregations : { load_time_outlier : { values : { 1.0 : 0.0, ... 99.0 : 1234 } } } I can't figure out how to structure the request so that I get the percentiles separately for each of the queries. Could someone point me in the right direction please? Many thanks John
-- Adrien Grand
ES ignores index.query.bool.max_clause_count in elasticsearch.yml
It seems to me that ES ignores the index.query.bool.max_clause_count setting in elasticsearch.yml. Setting index.query.bool.max_clause_count: 5000 results in the following error: Caused by: org.apache.lucene.search.BooleanQuery$TooManyClauses: maxClauseCount is set to 1024 Any idea what's going wrong here?
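For reference, a sketch of the setting as it should appear. It is a node-level setting, so it has to be present in elasticsearch.yml on every node, and it only takes effect after a restart; a node still running with the default 1024 is a common cause of this error:

```yaml
index.query.bool.max_clause_count: 5000
```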
Re: Help with the percentiles aggregation
Slight follow-on: do you know if returning this sort of thing via Kibana is on the cards? Just looking for an easy way to graph the results. Thanks. On Friday, 15 August 2014 10:23:16 UTC+1, John Ogden wrote: Hi, I am trying to run a single command which calculates percentiles for multiple search queries. The data for this is an Apache log file, and I want to get the percentile response times for the gets, posts, heads (etc.) in one go. If I run this: curl -XPOST 'http://localhost:9200/_search?search_type=count&pretty=true' -d '{ facets: { 0: { query : { term : { verb : get } } }, 1: { query : { term : { verb : post } } } }, aggs : { load_time_outlier : { percentiles : { field : responsetime } } } }' The response I get back has the counts for each subquery but only does the aggregations for the overall dataset: facets : { 0 : { _type : query, count : 5678 }, 1 : { _type : query, count : 1234 } }, aggregations : { load_time_outlier : { values : { 1.0 : 0.0, ... 99.0 : 1234 } } } I can't figure out how to structure the request so that I get the percentiles separately for each of the queries. Could someone point me in the right direction please? Many thanks John
Help with multiple data ranges in a single query
I've been given a requirement to produce a single Kibana dashboard showing app response times for multiple date ranges, and I am stumped at how to proceed. The user wants to see today's graph along with the previous working day, day -7, day -28 and day -364 on the same screen, ideally all 4 metrics in the same histogram. If they select another date range, they want that to show the day -1, day -7 (etc.) results too. The only thing I've been able to come up with so far is pushing each source event into elasticsearch 4 times (once with the right timestamp, once with +1 day, once with +7 days, once with +28 days, etc.) and writing separate queries for each, but this just feels wrong. Any ideas how else the requirement could be met? Many thanks.
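An alternative to duplicating documents, sketched under assumed field names (@timestamp and responsetime are placeholders): a single date_range aggregation with one bucket per day of interest, each carrying a sub-aggregation, fetches all four periods in one query. Kibana 3 cannot render a raw aggregation like this directly, but it avoids rewriting the data:

```json
{
  "aggs": {
    "periods": {
      "date_range": {
        "field": "@timestamp",
        "ranges": [
          { "key": "today",   "from": "now/d" },
          { "key": "day-7",   "from": "now-7d/d",   "to": "now-6d/d" },
          { "key": "day-28",  "from": "now-28d/d",  "to": "now-27d/d" },
          { "key": "day-364", "from": "now-364d/d", "to": "now-363d/d" }
        ]
      },
      "aggs": {
        "response_time": { "percentiles": { "field": "responsetime" } }
      }
    }
  }
}
```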
Re: Help with the percentiles aggregation
Support for aggregations is indeed something that is on the roadmap for the next version of Kibana (Kibana 4); see this message from Rashid: https://groups.google.com/forum/#!msg/elasticsearch/I7um1mX4GSk/aUsT2EmyxysJ On Mon, Aug 18, 2014 at 4:33 PM, John Ogden johnog65...@gmail.com wrote: Slight follow on - do you know if returning this sort of stuff via Kibana is on the cards? Just looking for an easy way to graph the results. Thanks. [earlier quoted text snipped]
-- Adrien Grand
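For the per-query percentiles asked about at the top of this thread: a top-level aggregation always runs over the whole result set, but nesting the percentiles under one `filter` aggregation per verb scopes it to each subquery. A sketch using the field names from the original post:

```json
{
  "size": 0,
  "aggs": {
    "gets": {
      "filter": { "term": { "verb": "get" } },
      "aggs": { "load_time_outlier": { "percentiles": { "field": "responsetime" } } }
    },
    "posts": {
      "filter": { "term": { "verb": "post" } },
      "aggs": { "load_time_outlier": { "percentiles": { "field": "responsetime" } } }
    }
  }
}
```

A single `terms` aggregation on verb with one percentiles sub-aggregation would be even more compact, and covers heads and any other verbs automatically.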
Aggregates - include source data
Hi, From looking at the docs, it didn't seem overly clear: is it possible to include the source data in an aggregate, or is it counts only? John
Re: Aggregates - include source data
Aggregations only report counts or various metrics (see the metrics aggregations: stats, min, max, sum, percentiles, cardinality, top_hits, ...). Maybe top_hits is what you are looking for? http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-top-hits-aggregation.html On Mon, Aug 18, 2014 at 5:34 PM, John D. Ament john.d.am...@gmail.com wrote: Hi, From looking at the docs, didn't seem overly clear. Is it possible to include the data in an aggregate, or is it counts only? John -- Adrien Grand
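To make the suggestion concrete: top_hits is a metrics aggregation that returns actual documents (including their _source) per bucket. A minimal sketch, with hypothetical field names:

```json
{
  "size": 0,
  "aggs": {
    "by_category": {
      "terms": { "field": "category" },
      "aggs": {
        "examples": {
          "top_hits": {
            "size": 2,
            "_source": { "include": ["title", "category"] }
          }
        }
      }
    }
  }
}
```

Each terms bucket then carries up to two full documents alongside its count, which is the closest thing to "data in an aggregate" short of a regular search.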
Re: Optimization Questions
Hi Greg, I believe max_num_segments is technically a hint that can be overridden by the merge algorithm if it decides to. You might try simply re-running the optimize again to get from ~25 down closer to 1. Sorry, but I don't know of any way to see when the optimize is finished - it's really just forcing a merge, so looking at merge stats is what you want. Hope that helps. Andrew On Aug 15, 2014, at 8:01 PM, Gregory Sutcliffe gsutcli...@publishthis.com wrote: Hey Guys, We were doing some updates to our ES (1.3.1) clusters recently and had some questions about _optimize. We optimized with max_num_segments 1 and we're still seeing ~25 segments per shard. The index that was optimized had no writes going to it during that time; it was actually freshly re-opened after an upgrade. Also, are there any tricks to seeing when an optimize is done other than watching merge stats and disk IO? Maybe some data in Marvel? Thanks for your assistance, Greg
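Two ways to watch a forced merge from the outside - a sketch, assuming an index named `myindex` (a placeholder); the per-shard segment count should fall toward max_num_segments as the merge completes:

```shell
# Per-shard segment list; the number of rows shrinks as segments are merged away
curl 'localhost:9200/_cat/segments/myindex?v'

# Merge stats for the index; "current" returns to 0 once no merge is running
curl 'localhost:9200/myindex/_stats/merge?pretty'
```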
Re: Enhancing perf for my cluster
Hey guys, Finally I changed all my queries to constant score queries. It's way better, but still, certain pages take a lot of time running... I don't understand why, and I don't have anything in my ES logs... Now the average time to search 20 users and their mentions/timeline + scoring them is about 4s (and almost 4s for the search alone). But when it takes time, it's still 60s for 1 page!!! I tried reading the explain data I get back after the query, but there's no response time in it. How can I find a way to understand why certain queries take so much time? Thanks! On Monday, 18 August 2014 12:29:10 UTC+2, Pierrick Boutruche wrote: Hi everyone! I'm currently working on a tool with *ES and the Twitter Streaming API*, in which I try to find interesting profiles on Twitter, based on what they tweet, RT and which of their interactions are shared/RT'd. Anyway, I use ES to index and search among tweets. To do that, I get Twitter stream data and put users and tweets (2 types) in a *single index*, linked by the user id via a parent-child relation. Actually, I thought about my indexing a lot and it is the best way to do it. - I need to update users very often (because I score them and because they update their profile quite often), so nesting the user in the tweet is not an option (too many duplicated copies) - I could put a user's tweets directly in the user object but I would have huge objects and I don't really want that. I work on a SoYouStart server, 4c/4t 3.2GHz, 32GB RAM, 4TB HDD.
My settings for the index are : settings = { index : { number_of_replicas : 0, refresh_interval : '10s', routing.allocation.disable_allocation: False }, analysis: { analyzer: { snowFrench:{ type: snowball, language: French }, snowEnglish:{ type: snowball, language: English }, snowGerman:{ type: snowball, language: German }, snowRussian:{ type: snowball, language: Russian }, snowSpanish:{ type: snowball, language: Spanish }, snowJapanese:{ type: snowball, language: Japanese }, edgeNGramAnalyzer:{ tokenizer: myEdgeNGram }, name_analyzer: { tokenizer: whitespace, type: custom, filter: [lowercase, multi_words, name_filter] }, city_analyzer : { type : snowball, language : English } }, tokenizer : { myEdgeNGram : { type : edgeNGram, min_gram : 2, max_gram : 5 }, name_tokenizer: { type: edgeNGram, max_gram: 100, min_gram: 4 } }, filter: { multi_words: { type: shingle, min_shingle_size: 2, max_shingle_size: 10 }, name_filter: { type: edgeNGram, max_gram: 100, min_gram: 4 } } } } And my mappings are : tweet_mapping = { _all : { enabled : False }, _ttl : { enabled : True, default : 400d }, _parent : { type : 'user' }, properties: { textfr: { 'type': 'string', '_analyzer': 'snowFrench', 'copy_to': 'text' }, texten: { 'type': 'string', '_analyzer': 'snowEnglish', 'copy_to': 'text' }, textde: { 'type': 'string', '_analyzer': 'snowGerman', 'copy_to': 'text' }, textja: { 'type': 'string', '_analyzer': 'snowJapanese', 'copy_to': 'text' }, textru: { 'type': 'string', '_analyzer': 'snowRussian', 'copy_to': 'text' }, textes: { 'type': 'string', '_analyzer': 'snowSpanish', 'copy_to': 'text' }, text: { 'type': 'string', 'null_value': '', 'index': 'analyzed', 'store': 'yes' }, entities: { 'type': 'object', 'index': 'analyzed', 'store': 'yes', 'properties': { hashtags: { 'index': 'analyzed', 'store': 'yes', 'type': 'string', _analyzer: edgeNGramAnalyzer }, mentions: { 'index': 'not_analyzed', 'store': 'yes', 'type': 'long', 'precision_step': 64 } } }, lang: { 'index': 'not_analyzed', 
'store': 'yes', 'type': 'string' }, created_at: { 'index': 'not_analyzed', 'store': 'yes', 'type': 'date', 'format' : 'dd-MM- HH:mm:ss' } } } user_mapping = { _all : { enabled : False }, _ttl : { enabled : True, default : 600d }, properties: { lang: { 'index': 'not_analyzed', 'store': 'yes', 'type': 'string' }, name: { 'index': 'analyzed', 'store': 'yes', 'type': 'string', _analyzer: edgeNGramAnalyzer }, screen_name: { 'index': 'analyzed', 'store': 'yes', 'type': 'string', _analyzer: edgeNGramAnalyzer }, descfr: { 'type': 'string', '_analyzer':
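One way to see which queries are slow (the question above) without catching them live: enable the search slow log, which records any query or fetch phase that crosses a threshold, together with its actual took time. A sketch, assuming the index is called `tweets` (the thresholds are illustrative):

```shell
curl -XPUT 'localhost:9200/tweets/_settings' -d '{
  "index.search.slowlog.threshold.query.warn": "10s",
  "index.search.slowlog.threshold.query.info": "2s",
  "index.search.slowlog.threshold.fetch.warn": "1s"
}'
```

The offending queries then show up in the node's slow log file, which also narrows down whether the query phase or the fetch phase is eating the 60s.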
indexing problem when using logstash
I am using the following config file: filter { grok { match => [ message, (?:\?|\)C\=%{DATA:kw}\%{DATA}\sT\s%{DATA:town}\sS\s%{WORD:state}\s%{DATA}%{IP:ip} ] } grok { match => [ message, (?:\?|\)SRC\=%{DATA:src}(?:\|$) ] } } output { elasticsearch { host => localhost } stdout { codec => rubydebug } } And I thought kw, town, state, etc. would be fields in Elasticsearch. But trying http://localhost:9200/_search?q=town:* AND state:* I am getting {"took":5,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{*"total":0*,"max_score":null,"hits":[]}}
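Two things worth checking here. First, whether grok actually matched (events where a grok filter fails get tagged `_grokparsefailure`); second, the test query itself: the spaces around AND are not URL-encoded, so everything after `town:*` may never reach Elasticsearch as part of `q`. A sketch:

```shell
# Count events where grok failed to match
curl 'localhost:9200/_search?q=tags:_grokparsefailure&pretty'

# URL-encode the spaces so the full query string survives the HTTP request
curl 'localhost:9200/_search?q=town:*%20AND%20state:*&pretty'
```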
Re: help with a grok filter
On Monday, August 18, 2014 9:57:41 AM UTC-4, Kevin M wrote: Could someone help me write a grok filter for this log real quick? Here is what the log looks like: Aug 18 09:40:39 server01 webmin_log: 172.16.16.96 - username *[18/Aug/2014:09:40:39 -0400]* GET /right.cgi?open=systemopen=status HTTP/1.1 200 3228 Here is what I have so far: match => [ message, %{SYSLOGTIMESTAMP:timestamp} %{WORD:Server} webmin_log: %{IP:IP_Address} - %{USERNAME:username} *[ stuck at this middle part [18/Aug/2014:09:40:39 -0400] *] %{WORD:method} %{URIPATHPARAM:request} HTTP/1.1 %{NUMBER:bytes} %{NUMBER:duration} It is just a sequence of regular expressions catching fields one by one. Look, e.g., at my post.
Re: help with a grok filter
I don't see your post - what I am stuck with is that whenever the date changes on that log, example: *[18/Aug/2014:09:40:39 -0400]* *[20/Aug/2014:11:40:39 -0104]* *[19/Aug/2014:08:40:39 -0500]* the filter will not match it. On Monday, August 18, 2014 1:53:37 PM UTC-4, vitaly wrote: It is just a sequence of regular expressions catching fields one by one. Look, e.g., at my post. [earlier quoted text snipped]
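The bracketed part is a standard Apache-style timestamp, which the stock `HTTPDATE` grok pattern matches regardless of day, month, or timezone offset. A sketch of the full match, keeping the field names from the attempt above (treating the two trailing NUMBER fields as response code and bytes, per the sample line):

```
match => [ "message", "%{SYSLOGTIMESTAMP:timestamp} %{WORD:server} webmin_log: %{IP:ip_address} - %{USERNAME:username} \[%{HTTPDATE:access_time}\] %{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion} %{NUMBER:response} %{NUMBER:bytes}" ]
```

The literal brackets must be escaped (`\[` ... `\]`); the changing date inside them is entirely absorbed by `%{HTTPDATE}`.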
[ANN] Experimental Highlighter 0.0.11 Released
I released version 0.0.11 of the Experimental Highlighter https://github.com/wikimedia/search-highlighter we've been using. It's compatible with Elasticsearch 1.3.x and has a few new features: 1. Conditional highlighting - skip highlighting fields you aren't going to use! Save time and IO bandwidth! 2. Regular expressions - now you have two problems! Read more at the link above if you are interested. It's in use on our beta site http://en.wikipedia.beta.wmflabs.org/wiki/Main_Page so you can try it and verify that it doesn't crash and stuff. Cheers, Nik
[ANN] Elasticsearch Mapper Attachment plugin 2.2.1 released
Heya, We are pleased to announce the release of the Elasticsearch Mapper Attachment plugin, version 2.2.1. The mapper attachments plugin adds the attachment type to Elasticsearch using Apache Tika. Release Notes - Version 2.2.1: Earlier today there was an Apache POI release to address a security vulnerability. For some document types, the attachment mapper plugin will indirectly use POI. This attachment mapper plugin release forces an update to Apache POI and is a response to the POI issue. Previously, the attachment mapper did not have an explicit dependency on POI. With this release, we have added a direct dependency and set it to a recent version of POI. This will help users of the attachment mapper, who might be unaware of these vulnerabilities, avoid them. You can read more about the reported issues in CVE-2014-3529 and CVE-2014-3574. We encourage anyone using the attachment mapper plugin with untrusted documents to update the plugin. Update [80] - Update a few dependencies Doc [79] - Docs: make the welcome page more obvious Issues, pull requests and feature requests are warmly welcome on the elasticsearch-mapper-attachments project repository! For questions or comments around this plugin, feel free to use the elasticsearch mailing list! Enjoy, - The Elasticsearch team
[ANN] Elasticsearch Mapper Attachment plugin 2.3.1 released
Heya, We are pleased to announce the release of the Elasticsearch Mapper Attachment plugin, version 2.3.1. The mapper attachments plugin adds the attachment type to Elasticsearch using Apache Tika. Release Notes - Version 2.3.1: Earlier today there was an Apache POI release to address a security vulnerability. For some document types, the attachment mapper plugin will indirectly use POI. This attachment mapper plugin release forces an update to Apache POI and is a response to the POI issue. Previously, the attachment mapper did not have an explicit dependency on POI. With this release, we have added a direct dependency and set it to a recent version of POI. This will help users of the attachment mapper, who might be unaware of these vulnerabilities, avoid them. You can read more about the reported issues in CVE-2014-3529 and CVE-2014-3574. We encourage anyone using the attachment mapper plugin with untrusted documents to update the plugin. Update [80] - Update a few dependencies Doc [79] - Docs: make the welcome page more obvious Issues, pull requests and feature requests are warmly welcome on the elasticsearch-mapper-attachments project repository! For questions or comments around this plugin, feel free to use the elasticsearch mailing list! Enjoy, - The Elasticsearch team
Unassigned Node and shards
I saw this problem twice now. I start with a green two-node cluster, default 5 shards, I index about 50,000 docs, and the shards/replicas look great and well balanced across the 2 nodes. I try the same test with 8 million docs. I come back when it's done, and I see all primary shards on node1, 2 replicas on node2, and three unassigned replicas on a third, unassigned node. I will look through the logs, but I was wondering if anyone has seen something similar or has any idea where/why this is coming from before I dig?
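Before digging through the logs, two quick checks usually show where the shards went and why some are unassigned - a sketch (note that the "third unassigned node" shown by most monitoring UIs is just the bucket for shards with no home, not a real node):

```shell
# One row per shard: index, shard number, primary/replica, state, and node
curl 'localhost:9200/_cat/shards?v'

# Node count and unassigned_shards at the cluster level
curl 'localhost:9200/_cluster/health?pretty'
```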
[ANN] swift-repository-plugin v0.5 released
Hi all, Just released to Central the v0.5 of the swift-repository plugin. It mainly contains documentation updates but is also built against 1.3.2 instead of 1.1.0. https://github.com/wikimedia/search-repository-swift -Chad
Re: How to increase memory
What version of ES do you use? Jörg On Mon, Aug 18, 2014 at 9:42 PM, rookie7799 pavelbara...@gmail.com wrote: Hello there, We are having the same exact problem with a really resource-hungry query: 5 nodes with 16GB ES_HEAP_SIZE, 1.2 billion records inside 1 index with 5 shards. Whenever we start running an aggregate query the whole cluster breaks and disconnects. Why can't it just not return results and simply give an error without actually killing the entire cluster? Cheers! On Saturday, February 9, 2013 1:05:54 PM UTC-5, Igor Motov wrote: ES_HEAP_SIZE, ES_MAX_MEM and ES_MIN_MEM are environment variables. They need to be specified on the command line. For example: ES_HEAP_SIZE=4g bin/elasticsearch -f To get JVM stats, you need to set jvm=true on the stats request: curl -XGET 'http://localhost:9200/_cluster/nodes/stats?jvm=true&pretty=true' To understand how much memory you need, give it as much as you can, put some load on it and monitor jvm.mem.heap_used in the output of the stats command above. If this number ever goes and stays above 90% of available heap, it's typically a good indicator that you need more. There is a small Russian elasticsearch forum - https://groups.google.com/forum/?fromgroups=#!forum/elasticsearch-ru On Saturday, February 9, 2013 12:57:04 PM UTC-5, Николай Измайлов wrote: In continuation of the topic https://github.com/elasticsearch/elasticsearch/issues/2636#issuecomment-13332877 On the page http://www.elasticsearch.org/guide/reference/setup/installation.html it is said that it is necessary to increase ES_HEAP_SIZE, ES_MAX_MEM and ES_MIN_MEM, but I have not found this configuration in /etc/elasticsearch/elasticsearch.yml.
Here's my cluster { cluster_name : elasticsearch, nodes : { VPjABUm-REmy24NQ_AkXDQ : { timestamp : 1360432148849, name : Sin, transport_address : inet[/ip:9300], hostname : Ubuntu-1204-precise-64-minimal, indices : { store : { size : 34.6gb, size_in_bytes : 37221752556, throttle_time : 0s, throttle_time_in_millis : 0 }, docs : { count : 58480, deleted : 4759 }, indexing : { index_total : 20, index_time : 1.7s, index_time_in_millis : 1748, index_current : 0, delete_total : 0, delete_time : 0s, delete_time_in_millis : 0, delete_current : 0 }, get : { total : 2, time : 5ms, time_in_millis : 5, exists_total : 0, exists_time : 0s, exists_time_in_millis : 0, missing_total : 2, missing_time : 5ms, missing_time_in_millis : 5, current : 0 }, search : { query_total : 1726375, query_time : 7.7m, query_time_in_millis : 462631, query_current : 0, fetch_total : 61663, fetch_time : 20.9s, fetch_time_in_millis : 20955, fetch_current : 0 }, cache : { field_evictions : 0, field_size : 0b, field_size_in_bytes : 0, filter_count : 5896, filter_evictions : 0, filter_size : 511.6kb, filter_size_in_bytes : 523944, bloom_size : 22.1kb, bloom_size_in_bytes : 22640, id_cache_size : 0b, id_cache_size_in_bytes : 0 }, merges : { current : 0, current_docs : 0, current_size : 0b, current_size_in_bytes : 0, total : 0, total_time : 0s, total_time_in_millis : 0, total_docs : 0, total_size : 0b, total_size_in_bytes : 0 }, refresh : { total : 15, total_time : 143ms, total_time_in_millis : 143 }, flush : { total : 25, total_time : 3.2s, total_time_in_millis : 3205 } } } } } How do I understand how much memory I need to allocate for Elasticsearch, and is there a general description of each of these parameters? Is there a Russian community?
Top hits aggregation default sort
I'm using the top hits aggregation with a has_child query. The top_hits aggregation documentation says '*By default the hits are sorted by the score of the main query*', but I'm not seeing that in the results for my query { from: 0, size: 3, query: { has_child: { score_mode: max, type: child_type, query: { match: { myField: { query: some text } } } } }, aggs: { replies: { terms: { field: parent_type_id, size: 3 }, aggs: { topChildren: { top_hits: { size: 1 } } } } } } The has_child query returns three parent results with the following scores: - doc 1 = 0.83619833 - doc 2 = 0.7210085 - doc 3 = 0.7210085 The scores for the top hits aggregations are: - first top hit aggregation = 0.29160267 - second top hit aggregation = 0.83619833 - third top hit aggregation = 0.58320534 Shouldn't the 'second top hit aggregation' be returned first, followed by the aggregations with the score 0.7210085?
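One possible explanation worth ruling out: the top_hits results here come back in the order of the enclosing terms buckets, which are sorted by document count by default, not by score. The field-collapse example in the top_hits documentation orders the buckets by best hit score using a max sub-aggregation; a sketch adapted to the query above (this requires dynamic scripting to be enabled):

```json
"aggs": {
  "replies": {
    "terms": {
      "field": "parent_type_id",
      "size": 3,
      "order": { "best_hit": "desc" }
    },
    "aggs": {
      "best_hit": { "max": { "script": "_score" } },
      "topChildren": { "top_hits": { "size": 1 } }
    }
  }
}
```

With this, the bucket whose best hit scored 0.83619833 would come first, regardless of how many documents each bucket holds.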
Re: How to increase memory
Hi, it's 1.3.2 On Monday, August 18, 2014 5:49:03 PM UTC-4, Jörg Prante wrote: What version of ES do you use? Jörg [earlier quoted text snipped]
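On 1.3.x, one guard against a single aggregation taking the whole cluster down: tighten the fielddata circuit breaker and cap the fielddata cache, so loading field values fails the request instead of exhausting the heap. A sketch (the 40%/30% values are illustrative; the setting was renamed to indices.breaker.fielddata.limit in 1.4):

```shell
# Trip the breaker before fielddata loading can exhaust the heap
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "persistent": { "indices.fielddata.breaker.limit": "40%" }
}'

# In elasticsearch.yml (static, needs a restart): evict before filling the heap
# indices.fielddata.cache.size: 30%
```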
How to safely migrate from one mount to another mount in Elasticsearch to store the data
Hi, I have an Elasticsearch cluster of 2 nodes. I have configured them to store data at /auto/share. I want to point one of the two nodes in the cluster to some other location to store the data, say /auto/foo. What would be the best way of achieving this without losing any data? And is it possible to do that without losing any data? Thank you, Shriyansh
Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data
Do you want to copy the existing data in /auto/share to /auto/foo, or start with no data? Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 19 August 2014 08:23, shriyansh jain shriyanshaj...@gmail.com wrote: [quoted text snipped]
Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data
If you want no data in /auto/foo then just create the directory, give it the right permissions and then update the config to point to it. It's the same process you did for /auto/share. Do you have replicas set on your indexes? Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 19 August 2014 08:32, shriyansh jain shriyanshaj...@gmail.com wrote: I would prefer no data in /auto/foo, but would like to go with whichever way is efficient and more reliable. Thank you, Shriyansh
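Mark's suggestion boils down to a one-line config change on the node being moved. A minimal sketch, assuming the paths from this thread and a standard elasticsearch.yml:

```yaml
# elasticsearch.yml on the node being repointed
# (previously: path.data: /auto/share)
path.data: /auto/foo
```

After creating /auto/foo with permissions matching the old directory and restarting the node, it comes up with an empty data directory.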
Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data
Yes, I have set *index.number_of_replicas: 1*. If I just point one of the 2 nodes to some other location, won't it lose the data stored by that node? Thank you, Shriyansh
Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data
If you point the instance to a new data location then yes, it will start up with no data, but it won't lose the data completely, as it will still be located in your original /auto/share directory. However, given you have replicas set, what will happen is that when the node starts up pointing to the new location it will simply start to copy the data from the other node so that you fulfil your replica requirements. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com
Re: how to get char_filter to work?
Sorry I have not replied sooner, but I was on vacation. I would use the two-field solution, especially since you simply cannot store a stripped version. The source field is compressed, so the additional index size is content dependent. I have never used highlighting, so I cannot recommend alternative approaches. I use jsoup to strip HTML before the data reaches Elasticsearch. Not sure if it is the best, but I have been using it for years. Cheers, Ivan On Wed, Aug 13, 2014 at 8:16 AM, IronMan2014 sabdall...@gmail.com wrote: Ivan, a follow-up question: as I mentioned earlier, storing html and applying a char_filter doesn't really work, especially with highlighted fields coming back with weird html display. So I am thinking of stripping html before indexing, so there is no html in the index or the source, but I will add an extra field like html_content which is meant to store the html version and not be indexed. Do you see any problems with my approach? I see one, namely a bigger index size. What do you recommend for an ideal solution? I am still confused, as I thought this would be a common problem. On Friday, August 8, 2014 8:16:09 PM UTC-4, IronMan wrote: Thanks again. I wasn't expecting it to remove what's between the tags. I believe I understand the behavior, and maybe it's the case that I was greedy, expecting Elasticsearch to do it all. Here is the scenario I was looking for: assume I am looking to get an excerpt of text (extracted text from a document). The Elasticsearch query will give me an excerpt with html tags, but the tags are out of context, so I would have liked to display this excerpt with no html tags. I know I can probably strip the tags after the fact, but that's what I was trying to avoid. In other words, in a perfect world, I would have liked 2 versions of the document: the original html one and another stripped one. When I need to query things like excerpts, I would query the stripped one, and when I needed the html, I would query the source.
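The two-field approach Ivan recommends can be expressed directly in a 1.x mapping: one analyzed field for searching, plus a stored-but-not-indexed field holding the raw markup. A sketch, with made-up index/type layout (the field names follow the html_content idea from the thread):

```json
{
  "mappings": {
    "doc": {
      "properties": {
        "content":      { "type": "string" },
        "html_content": { "type": "string", "index": "no", "store": true }
      }
    }
  }
}
```

Here content receives the pre-stripped text and is analyzed as usual, while html_content is only retrievable, never searched.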
Hopefully I didn't make this more confusing. On Friday, August 8, 2014 4:58:03 PM UTC-4, Ivan Brusic wrote: The tokens that appear in the analyze API are the ones that are put into the inverted index. When you search for one of the terms that is not an HTML tag, there will be a match. What I don't understand after reading your original post in detail is exactly what behavior you are expecting. You indexed the phrase <html>trying out <b>Elasticsearch</b>, This is an html test</html>, but you expected a query for the term html to not match. However, the word html is clearly in the content. The html stripper will not remove the contents in between the tags, just the tags themselves. The analyze API should show you the correct terms. Lucene has more control over what information you can retrieve, but the only way to get the analyzed token stream back from Elasticsearch is to use the analyze API on the field. Most people do not want an analyzed token stream, just the original field. -- Ivan On Fri, Aug 8, 2014 at 12:01 PM, IronMike sabda...@gmail.com wrote: Also, here is a link from someone who had the same problem; I am not sure if there was a final answer to that one: http://grokbase.com/t/gg/elasticsearch/126r4kv8tx/problem-with-standard-html-strip. I have to admit that I am a bit confused now about this topic. I understand analyzers will tokenize the sentence and strip html in the case of html_strip, and _analyze works fine using the analyzer. What I am failing to understand is how I can get the results of these tokens. Isn't the whole idea to be able to search for those tokens eventually? If not, what's the solution for what I would think is a common scenario: having to index html documents, where html tags don't need to be indexed, while keeping the original html for presentational purposes? Any ideas (besides having to strip html tags manually before indexing)? On Friday, August 8, 2014 1:02:07 PM UTC-4, IronMike wrote: Thanks for explaining.
So, is there a way to get the non-html text back from the index? I thought I read that it was possible to index without the html tags while keeping the source intact. So how would I get at the index with the non-html tags, if you will? On Friday, August 8, 2014 12:52:37 PM UTC-4, Ivan Brusic wrote: The field is derived from the source and not generated from the tokens. If we indexed the sentence The quick brown foxes jumped over the lazy dogs with the english analyzer, the tokens would be http://localhost:9200/_analyze?text=The%20quick%20brown%20foxes%20jumped%20over%20the%20lazy%20dogs&analyzer=english quick brown fox jump over lazi dog After applying stopwords and stemming, the tokens do not form a sentence that looks like the original. -- Ivan On Fri, Aug 8, 2014 at 9:42 AM, IronMike sabda...@gmail.com wrote: Ivan, the search results I am showing are for the field title, not for the source. I thought I could query the field, not the source, and look at it
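Ivan strips HTML with jsoup (a Java library) before the data reaches Elasticsearch; the same pre-indexing step can be sketched with nothing but the Python standard library. The strip_tags helper and the doc layout below are illustrative, not from the thread; only the html_content field name echoes the approach discussed above.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collects only the text between tags, discarding the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags(html):
    """Return the text content of an HTML fragment with tags removed."""
    parser = TagStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


# Index the stripped text, keep the original markup in a non-indexed field.
raw = "<html>trying out <b>Elasticsearch</b>, this is an html test</html>"
doc = {"content": strip_tags(raw), "html_content": raw}
# doc["content"] == "trying out Elasticsearch, this is an html test"
```

Note that this, like the html_strip char filter, removes only the tags and keeps the text between them, matching the behavior Ivan describes.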
Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data
Why do you want to do this if you are worried about data loss? Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 19 August 2014 11:50, shriyansh jain shriyanshaj...@gmail.com wrote: As you mentioned the node will not lose the data completely; is there any possibility that it will lose some data? Thank you, Shriyansh
Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data
Just to make sure if /auto/share goes down I have data in /auto/foo. Thanks, Shriyansh
Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data
To make sure if /auto/share goes down, I have data in /auto/foo. And I am short of space on /auto/share. Mainly bcz of these 2 reasons. Thanks, Shriyansh On Monday, August 18, 2014 6:55:59 PM UTC-7, Mark Walkom wrote: Why do you want to do this if you are worried about data loss? Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com javascript: web: www.campaignmonitor.com On 19 August 2014 11:50, shriyansh jain shriyan...@gmail.com javascript: wrote: As you mentioned the node will not lose the data completely, is there any possibility that it will lose some data.? Thank you, Shriyansh On Monday, August 18, 2014 4:17:54 PM UTC-7, Mark Walkom wrote: If you point the instance to a new data location then yes, it will startup with no data, but it won't lose the data completely as it will still be located in your original /auto/share directory. However given you have replicas set what will happen is when the node starts up pointing to the new location it will simply start to copy the data from the other node so that you fulfil your replica requirements. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 19 August 2014 08:58, shriyansh jain shriyan...@gmail.com wrote: Yes, I have set *index.number_of_replicas: 1*. If I just point one of the 2 nodes to some other location, wont it lose the data stored by that node.? Thank you, Shriyansh On Monday, August 18, 2014 3:34:48 PM UTC-7, Mark Walkom wrote: If you want no data in /auto/foo then just create the directory, give it the right permissions and then update the config to point to it. It's the same process you did for /auto/share. Do you have replicas set on your indexes? Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 19 August 2014 08:32, shriyansh jain shriyan...@gmail.com wrote: I would prefer with no data in /auto/foo.? 
But would like to go with way, which is efficient and more reliable. Thank you, Shriyansh On Monday, August 18, 2014 3:26:39 PM UTC-7, Mark Walkom wrote: Do you want to copy the existing data in /auto/share to /auto/foo, or start with no data? Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 19 August 2014 08:23, shriyansh jain shriyan...@gmail.com wrote: Hi, I have a Elasticsearch Cluster of 2 nodes. I have configured them to store data at the location which is /auto/share. I want to point one of the two nodes in the cluster to some other location to store the data say /auto/foo. What would be the best way of achieving the above task without loosing any data.? And is it possible to do that without loosing any data.? Thank you, Shriyansh -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearc...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/415f8d41-4fa 9-4f6d-86b9-41b2059ab67f%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/415f8d41-4fa9-4f6d-86b9-41b2059ab67f%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearc...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/ msgid/elasticsearch/2dbadd5f-5e23-4e6b-8cf5-9a8bb31c4328%40goo glegroups.com https://groups.google.com/d/msgid/elasticsearch/2dbadd5f-5e23-4e6b-8cf5-9a8bb31c4328%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. 
Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data
This is why you have replicas: they give you redundancy at a higher level than the filesystem. If you are still concerned then you should add another node and increase your replicas. Playing around on the FS to create replicas is only extra management overhead and likely to end up causing more problems than it's worth.

Regards,
Mark Walkom
Infrastructure Engineer, Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com

On 19 August 2014 11:59, shriyansh jain shriyanshaj...@gmail.com wrote:
Just to make sure that if /auto/share goes down I have data in /auto/foo.

Thanks,
Shriyansh
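Mark's suggestion of adding another node and increasing replicas can be applied while the cluster is running. A hedged sketch of the settings call (the index name "myindex" and the localhost endpoint are placeholders, not from the thread; verify against your own cluster before running):

```shell
# Illustrative only: raise the replica count for one index via the
# update index settings API. With 3 nodes, number_of_replicas: 2 puts
# a copy of every shard on every node.
curl -XPUT 'http://localhost:9200/myindex/_settings' -d '
{
  "index": { "number_of_replicas": 2 }
}'
```

Elasticsearch will then copy the extra replica shards onto the other nodes in the background; no restart is needed.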
Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data
Apart from replicas, that's really outside the scope of what ES provides.

Regards,
Mark Walkom
Infrastructure Engineer, Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com

On 19 August 2014 12:12, shriyansh jain shriyanshaj...@gmail.com wrote:
I got your point, sir, but if my entire /auto/share goes down, then I won't have any chance to recover the data in /auto/share. Is there any other way to recover the data?

Thanks,
Shriyansh
Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data
Thank you for helping me out. I really appreciate it.

Regards,
Shriyansh

On Monday, August 18, 2014 7:23:50 PM UTC-7, Mark Walkom wrote:
Apart from replicas, that's really outside the scope of what ES provides.
Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data
I would like to know one more thing: what would be the steps if I want to copy the data from /auto/share to /auto/foo for a particular node?

Thanks,
Shriyansh

On Monday, August 18, 2014 3:26:39 PM UTC-7, Mark Walkom wrote:
Do you want to copy the existing data in /auto/share to /auto/foo, or start with no data?
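The thread never spells out the copy steps, so here is a hedged sketch of one possible sequence for a single node. The paths come from the thread; everything else (the service name, the rsync invocation, a 1.x-era install) is an assumption and should be checked against your own setup before running anything:

```shell
# Hypothetical migration of one node's data directory (untested sketch).

# 1. Stop the node so the on-disk index files are no longer being written.
sudo service elasticsearch stop

# 2. Copy the data to the new mount, preserving ownership and permissions.
sudo rsync -a /auto/share/ /auto/foo/

# 3. Point the node at the new location in elasticsearch.yml:
#      path.data: /auto/foo

# 4. Restart; the node recovers its shards from the copied files, and any
#    gap since the shutdown is filled from the replica on the other node.
sudo service elasticsearch start
```

With index.number_of_replicas: 1 on a two-node cluster, stopping one node this way leaves the other node serving a full copy of the data for the duration of the copy.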
Re: Using a char_filter in combination with a lowercase filter
Char filters are applied before the text is tokenized, and therefore before the normal token filters are applied, which is why they are a separate class of filter. With Lucene, the order is: char filters -> tokenizer -> token filters.

Have you looked into the ICU analyzer? http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-icu-plugin.html I have no idea how well it works with Dutch.

Cheers,
Ivan

On Mon, Aug 18, 2014 at 2:14 AM, Matthias Hogerheijde matthias.hogerhei...@goabout.com wrote:
Hi,

We're using Elasticsearch with an analyzer that maps the `y` character to `ij` (a char_filter named char_mapper), since in Dutch these two are somewhat interchangeable. We're also using a lowercase filter. This is the configuration:

    {
      "analysis": {
        "analyzer": {
          "index": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase", "synonym_twoway", "standard", "asciifolding"],
            "char_filter": ["char_mapper"]
          },
          "index_prefix": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase", "synonym_twoway", "standard", "asciifolding", "prefixes"],
            "char_filter": ["char_mapper"]
          },
          "search": {
            "alias": ["default"],
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase", "synonym", "synonym_twoway", "standard", "asciifolding"],
            "char_filter": ["char_mapper"]
          },
          "postal_code": {
            "tokenizer": "keyword",
            "filter": ["lowercase"]
          }
        },
        "tokenizer": {
          "standard": { "stopwords": [] }
        },
        "filter": {
          "synonym": {
            "type": "synonym",
            "synonyms": ["st => sint", "jp => jan pieterszoon", "mh => maarten harpertszoon"]
          },
          "synonym_twoway": {
            "type": "synonym",
            "synonyms": ["den haag, s gravenhage", "den bosch, s hertogenbosch"]
          },
          "prefixes": {
            "type": "edgeNGram",
            "side": "front",
            "min_gram": 1,
            "max_gram": 30
          }
        },
        "char_filter": {
          "char_mapper": {
            "type": "mapping",
            "mappings": ["y => ij"]
          }
        }
      }
    }

When indexing cities, we're using this mapping:

    {
      "properties": {
        "city": {
          "type": "multi_field",
          "fields": {
            "city": { "type": "string" },
            "prefix": { "type": "string", "boost": 0.5, "index_analyzer": "index_prefix" }
          }
        },
        "province_code": { "type": "string" },
        "unique_name": { "type": "boolean" },
        "point": { "type": "geo_point" },
        "search_terms": {
          "type": "multi_field",
          "fields": {
            "search_terms": { "type": "string" },
            "prefix": { "type": "string", "boost": 0.5, "index_analyzer": "index_prefix" }
          }
        }
      },
      "search_analyzer": "search",
      "index_analyzer": "index"
    }

When we index all the (Dutch) cities from our data source, there are cities starting with both `IJ` and `Y` (for example, these city names exist: IJssel, IJsselstein, Yerseke and Ysselsteyn). It seems that these characters are not lowercased before the char mapping is applied. Querying the index:

    /top/city/_search?q=ijsselstein  - works, returns the document for IJsselstein
    /top/city/_search?q=Ijsselstein  - works, returns the document for IJsselstein
    /top/city/_search?q=yerseke      - doesn't work, returns nothing
    /top/city/_search?q=Yerseke      - does work, returns the document for Yerseke
    /top/city/_search?q=YsselsteYn   - doesn't work, returns nothing
    /top/city/_search?q=Ysselsteyn   - does work, returns the document for Ysselsteyn

Changing the case of any other letter doesn't affect the results. I've worked around this issue by adding the mapping "Y => ij", i.e.:

    "char_filter": {
      "char_mapper": {
        "type": "mapping",
        "mappings": ["y => ij", "Y => ij"]
      }
    }

This solves the problem, but I'd rather see the lowercase filter applied before the mapping, or be able to make the order explicit. Is there any stance on this issue? Or is this intended behaviour?

Regards,
Matthias Hogerheijde
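Ivan's ordering (char filters -> tokenizer -> token filters) fully explains the observed results. A toy Python model of that pipeline makes it concrete; this is not Lucene's implementation, and the whitespace split and replace-based mapper are simplified stand-ins for the real standard tokenizer and mapping char filter:

```python
# Toy model of the analysis order: char filter first (on raw, case-sensitive
# text), tokenizer second, lowercase token filter last.

def char_mapper(text, mappings):
    """Apply character-level mappings to the raw input, like a mapping char_filter."""
    for src, dst in mappings:
        text = text.replace(src, dst)
    return text

def analyze(text, mappings):
    text = char_mapper(text, mappings)      # 1. char filter (sees original case)
    tokens = text.split()                   # 2. tokenizer (whitespace stand-in)
    return [t.lower() for t in tokens]      # 3. lowercase token filter

# With only "y => ij", the capital "Y" in "Yerseke" passes through the char
# filter untouched; lowercasing happens too late for the mapping to fire.
print(analyze("Yerseke", [("y", "ij")]))                  # 'Y' not mapped
print(analyze("yerseke", [("y", "ij")]))                  # 'y' mapped to 'ij'
print(analyze("Yerseke", [("y", "ij"), ("Y", "ij")]))     # workaround applied
```

Indexed "Yerseke" becomes the token "yerseke" while the query analyzer turns the query "yerseke" into "ijerseke", so they never match; adding the "Y => ij" mapping makes both sides produce "ijerseke", which is exactly the workaround Matthias found.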
Re: A few questions about node types + usage
Master, data and client are really just abstractions of different combinations of node.data and node.master values. A node with node.master=true and node.data=false can handle both cluster management and queries.

Regards,
Mark Walkom
Infrastructure Engineer, Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com

On 18 August 2014 22:49, Alex alex.mon...@gmail.com wrote:
Hello again Mark,

Thanks for your response. Your answers really are very helpful. As with our previous conversation https://groups.google.com/d/topic/elasticsearch/ZouS4NVsTJw/discussion I am confused about how to make a client node also be master-eligible. This is what I posted there; I would really like some help understanding this:

I've done more investigating and it seems that a Client (AKA Query) node cannot also be a Master node. As it says here http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-discovery-zen.html#master-election:

"Nodes can be excluded from becoming a master by setting node.master to false. Note, once a node is a client node (node.client set to true), it will not be allowed to become a master (node.master is automatically set to false)."

And from the elasticsearch.yml config file:

    # 2. You want this node to only serve as a master: to not store any data and
    #    to have free resources. This will be the coordinator of your cluster.
    #
    #node.master: true
    #node.data: false
    #
    # 3. You want this node to be neither master nor data node, but
    #    to act as a search load balancer (fetching data from nodes,
    #    aggregating results, etc.)
    #
    #node.master: false
    #node.data: false

So I'm wondering how exactly you set up your client nodes to also be master nodes. It seems like a master node can only be purely a master, or master + data. Perhaps you could show the relevant parts of one of your client nodes' config?

Many thanks,
Alex

On Saturday, 16 August 2014 01:04:37 UTC+1, Mark Walkom wrote:
1 - Up to you.
We use the http output and then just use a round-robin A record to our 3 masters.
2 - They are routed, but it makes more sense to specify.
3 - You're right, but most people only use 1 or 2 masters, which is why they get recommended to have at least 3.
4 - That sounds like a lot. We use masters that double as clients and they only have 8GB; our use sounds similar and we don't have issues. I wouldn't bother with 3 client-only nodes to start; use them as master and client, and then if you find you are hitting memory issues due to queries you can re-evaluate things.

Regards,
Mark Walkom
Infrastructure Engineer, Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com

On 15 August 2014 20:11, Alex alex@gmail.com wrote:
Bump. Any help? Thanks

On Wednesday, 13 August 2014 12:10:14 UTC+1, Alex wrote:
Hello, I would like some clarification about node types and their usage. We will have 3 client nodes and 6 data nodes. The 6 1TB data nodes can also be masters (discovery.zen.minimum_master_nodes set to 4). We will use Logstash and Kibana. Kibana will be used 24/7 by between a couple and handfuls of people. Some questions:

1. Should incoming Logstash write requests be sent to the cluster in general (using the cluster setting in the elasticsearch output), or specifically to the client nodes, or to the data nodes (via a load balancer)? I am unsure what kind of node is best for handling writes.
2. If client nodes exist in the cluster, are Kibana requests automatically routed to them? Do I need to somehow specify to Kibana which nodes to contact?
3. I have heard different information about master nodes and the minimum_master_nodes setting. I've heard that you should have an odd number of master nodes, but I fail to see why the parity of the number of masters matters as long as minimum_master_nodes is set to at least N/2 + 1. Does it really need to be odd?
4. I have been advised that the client nodes will use a huge amount of memory (which makes sense, given the nature of the Kibana facet queries). 64GB per client node was recommended, but I have no idea if that sounds right or not. I don't have the ability to actually test it right now, so any more guidance on that would be helpful.

I'd be grateful to hear from you even if you only know something about one of my queries.

Thank you for your time,
Alex
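On question 3, the parity point can be shown with the quorum arithmetic from the question itself (N/2 + 1). A short sketch, with illustrative node counts:

```python
# Quorum arithmetic behind minimum_master_nodes: a strict majority of the
# master-eligible nodes, i.e. N // 2 + 1.

def quorum(n):
    """Smallest minimum_master_nodes value that prevents split-brain."""
    return n // 2 + 1

def tolerated_failures(n):
    """Master-eligible nodes that can fail while a quorum can still form."""
    return n - quorum(n)

# An even count buys no extra fault tolerance over the odd count just below
# it (4 masters tolerate 1 failure, same as 3), which is why odd numbers of
# master-eligible nodes are usually recommended.
for n in range(2, 7):
    print(n, "masters -> quorum", quorum(n), "| tolerates", tolerated_failures(n), "failures")
```

For the setup in the question, quorum(6) is 4, which matches the planned discovery.zen.minimum_master_nodes: 4; an odd number isn't strictly required, it's just that even counts waste a node's worth of fault tolerance.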