Re: Questions about dedicated master client node
If it is good enough for you, it is good enough for you. I will just give you one anecdote: we implemented 3 dedicated client nodes on a 9 data node cluster and got a 2x performance improvement. Moving the query coordination, the network I/O (it has to receive data from every shard), and the combination of results (aggs and sorts) off of the nodes providing the results is very helpful.

James

On Sat, May 30, 2015 at 9:11 AM, Xudong You xudong@gmail.com wrote:

> Thanks Nikolas,
>
> What do you think about a dedicated client node (the so-called load balance node)? Is there any benefit to a dedicated client node? It seems to me that round robin across the data nodes is good enough.
>
> On Friday, May 29, 2015 at 10:55:01 PM UTC+8, Nikolas Everett wrote:
>
> > Dedicated master nodes are super convenient if you have the IT infrastructure to host them on shared machines, because they are very low load and it's useful to be able to restart the master nodes quickly. We don't have that kind of infrastructure and our cluster is pretty large; not having it has bitten us once or twice, but it's not a huge problem.
> >
> > On Fri, May 29, 2015 at 10:44 AM, Xudong You xudon...@gmail.com wrote:
> >
> > > Right now we only need 4 ES nodes due to the small data volume, and all 4 nodes are master data nodes.
> > >
> > > Q1: I am wondering whether, in this case, it is necessary to have dedicated master and client nodes. Is there any benefit to having dedicated master nodes? Someone said that dedicated master nodes (say, three master nodes) help to avoid the split brain issue, but even with NO dedicated master nodes we can also avoid split brain by setting discovery.zen.minimum_master_nodes to an appropriate value.
> > >
> > > Q2: Similarly, is a dedicated client node really necessary in our 4-node case? Is there any benefit to allocating a dedicated client node?
> > >
> > > Thanks!
> > >
> > > --
> > > Please update your bookmarks! We have moved to https://discuss.elastic.co/
> > > ---
> > > You received this message because you are subscribed to the Google Groups elasticsearch group.
> > > To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearc...@googlegroups.com.
> > > To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/de7db788-a6d2-48c2-934b-bc5f7ae311a9%40googlegroups.com.
> > > For more options, visit https://groups.google.com/d/optout.
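
A side note on Q1: the usual rule of thumb for discovery.zen.minimum_master_nodes is a strict majority of the master-eligible nodes. Here is a minimal Python sketch of that arithmetic. The function names are made up for illustration, and the node counts are the 4-node cluster and the "three master nodes" case mentioned in this thread.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Strict majority (quorum) of the master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerated_master_failures(master_eligible: int) -> int:
    """How many master-eligible nodes can be lost while a quorum remains."""
    return master_eligible - minimum_master_nodes(master_eligible)


# 3 dedicated masters versus 4 combined master/data nodes.
for count in (3, 4):
    print(count, minimum_master_nodes(count), tolerated_master_failures(count))
```

Under this rule, 4 combined master/data nodes need a value of 3 and 3 dedicated masters need a value of 2. Both layouts survive the loss of one master-eligible node, so the setting alone covers split brain; what dedicated masters add is what Nikolas describes above: low load and quick restarts.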
Re: Aggregation profiling?
I don't have an answer, but I really like this question. I too would love to see more query and aggregation profiling tools for performance optimization purposes. Also, I assume you have already looked at this, but have you made sure you are not evicting anything from your in-memory field data?

James

On Mon, May 25, 2015 at 4:08 PM, Mike Sukmanowsky mike.sukmanow...@gmail.com wrote:

> I don't believe there are any current endpoints in the API that support this, but are there plans to add better profiling information to ES aggregation queries?
>
> We'll see some agg queries return in 11s, then 5s, then 11s again. Sometimes we can see associated filter cache expirations, but it's really hard to line these up with one specific query in our production environment, since multiple users are executing queries simultaneously.
>
> It'd be really helpful to optionally see where aggregation queries are spending the bulk of their time, to help us understand what to improve in the future. Is there anything we can do here right now?
>
> --
> Mike Sukmanowsky
> Aspiring Digital Carpenter
> e: mike.sukmanow...@gmail.com
> facebook http://facebook.com/mike.sukmanowsky | twitter http://twitter.com/msukmanowsky | LinkedIn http://www.linkedin.com/profile/view?id=10897143 | github https://github.com/msukmanowsky
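
One way to follow up on the eviction question is to compare eviction counters before and after a slow run. The sketch below only parses a node stats response that has already been fetched. It assumes the response has the shape I would expect from the 1.x node stats API (GET /_nodes/stats/indices), with "fielddata" and "filter_cache" sections that each carry an "evictions" counter. The sample data and node names are invented.

```python
def cache_evictions(node_stats: dict) -> dict:
    """Map node name -> (fielddata evictions, filter cache evictions)."""
    result = {}
    for node_id, node in node_stats.get("nodes", {}).items():
        indices = node.get("indices", {})
        fielddata = indices.get("fielddata", {}).get("evictions", 0)
        filter_cache = indices.get("filter_cache", {}).get("evictions", 0)
        # Fall back to the node id when the response carries no name.
        result[node.get("name", node_id)] = (fielddata, filter_cache)
    return result


# Invented sample in the assumed response shape, not real cluster output.
sample = {
    "nodes": {
        "abc123": {
            "name": "data-1",
            "indices": {
                "fielddata": {"memory_size_in_bytes": 512000000, "evictions": 0},
                "filter_cache": {"memory_size_in_bytes": 64000000, "evictions": 42},
            },
        },
        "def456": {
            "name": "data-2",
            "indices": {
                "fielddata": {"memory_size_in_bytes": 734000000, "evictions": 7},
                "filter_cache": {"memory_size_in_bytes": 61000000, "evictions": 0},
            },
        },
    }
}

for name, (fd, fc) in sorted(cache_evictions(sample).items()):
    print(name, "fielddata evictions:", fd, "filter cache evictions:", fc)
```

A fielddata counter that grows between the 5s and 11s runs would point at field data being reloaded. It still would not attribute the cost to one specific query, which is the gap Mike describes.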
Re: Shard query cache size
Hi Adrien, thanks very much for this clarification. I am always trying to learn more about how Elasticsearch works, and that clarification was very helpful.

James

On Thu, May 21, 2015 at 6:12 PM, Adrien Grand adr...@elastic.co wrote:

> On Thu, May 21, 2015 at 11:49 PM, James Macdonald james.macdon...@geofeedia.com wrote:
>
> > Hi, I am a little confused by your response. Are you saying that query/filter caches are invalidated across all data in a shard every time the refresh interval ticks over?
>
> Sorry for the confusion:
> - the query cache caches entire requests per index, and is completely invalidated across all data every time the refresh interval ticks over AND there have been changes since the last refresh
> - the filter cache caches matching documents per segment; it is invalidated per segment only when a segment goes away (typically because it's been merged into a larger segment), which is infrequent for large segments
> - the fielddata cache caches the document-value mapping per segment and has the same invalidation rules as the filter cache
>
> > I was under the impression that all field data and caching related operations were performed at the Lucene index segment level, and that the caches would only be invalidated for a given segment if that segment had changed since the last refresh. Since most data is stored in large segments that don't take fresh writes and seldom merge, this would mean that most caches are good for long periods of time, even if the shard is under constant indexing load. Am I mistaken?
>
> This is right for the fielddata and filter caches, but not for the query cache.
>
> --
> Adrien
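
The two invalidation rules Adrien describes can be sketched as a toy model in Python. This is only an illustration of the rules as stated in the thread, not Elasticsearch's implementation; the class and method names are invented.

```python
class ShardRequestCache:
    """Toy query cache: whole responses, dropped on refresh if anything changed."""

    def __init__(self):
        self._entries = {}
        self._changed_since_refresh = False

    def put(self, request, response):
        self._entries[request] = response

    def get(self, request):
        return self._entries.get(request)

    def index_document(self):
        # Any write marks the shard as changed until the next refresh.
        self._changed_since_refresh = True

    def refresh(self):
        # Everything is invalidated, but only if there were changes.
        if self._changed_since_refresh:
            self._entries.clear()
            self._changed_since_refresh = False


class PerSegmentCache:
    """Toy filter/fielddata cache: entries live as long as their segment does."""

    def __init__(self):
        self._entries = {}

    def put(self, segment, key, value):
        self._entries[(segment, key)] = value

    def get(self, segment, key):
        return self._entries.get((segment, key))

    def drop_segments(self, segments):
        # Called when segments go away, for example after a merge.
        gone = set(segments)
        self._entries = {k: v for k, v in self._entries.items() if k[0] not in gone}


query_cache = ShardRequestCache()
query_cache.put("agg-request", "result")
query_cache.refresh()  # nothing changed, so the entry survives
survived_idle_refresh = query_cache.get("agg-request") is not None
query_cache.index_document()
query_cache.refresh()  # a write happened, so the whole cache is dropped
survived_after_write = query_cache.get("agg-request") is not None

filter_cache = PerSegmentCache()
filter_cache.put("big-segment", "status:active", "bitset-a")
filter_cache.put("small-segment", "status:active", "bitset-b")
filter_cache.drop_segments(["small-segment"])  # only the merged-away segment is lost
big_segment_kept = filter_cache.get("big-segment", "status:active") is not None
small_segment_kept = filter_cache.get("small-segment", "status:active") is not None

print(survived_idle_refresh, survived_after_write, big_segment_kept, small_segment_kept)
```

In the model, a single write followed by a refresh empties the request cache, while the per-segment cache keeps its entries for the large segment. That is the distinction between James's expectation and the query cache's actual behaviour.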
Re: Shard query cache size
Hi, I am a little confused by your response. Are you saying that query/filter caches are invalidated across all data in a shard every time the refresh interval ticks over?

I was under the impression that all field data and caching related operations were performed at the Lucene index segment level, and that the caches would only be invalidated for a given segment if that segment had changed since the last refresh. Since most data is stored in large segments that don't take fresh writes and seldom merge, this would mean that most caches are good for long periods of time, even if the shard is under constant indexing load. Am I mistaken?

Thanks,
James

On Thu, May 21, 2015 at 9:28 AM, Adrien Grand adr...@elastic.co wrote:

> It depends on how likely it is for you to run the same aggregation again. Note that this cache is fully invalidated at every refresh (meaning either every second by default, or every time that you update/add/remove documents if you perform less than 1 operation per second). So this cache will only be used if you are likely to run the exact same request twice or more in a short period of time. We assumed that this situation is not common, so a small cache would be enough for the maybe 4 or 5 requests that would be run again and again. You can increase it if you think it will be helpful in your case, although I would advise you to be careful; maybe the memory would be better spent on e.g. the filesystem cache.
>
> On Thu, May 21, 2015 at 3:41 PM, Mike Sukmanowsky mike.sukmanow...@gmail.com wrote:
>
> > Hi all,
> >
> > We store Marvel-style timeseries data in Elasticsearch and make very heavy use of aggregations (all queries are effectively aggregations). We've been playing around with the shard query cache and have a question.
> >
> > Is there a reason the shard query cache is set to such a low level of JVM heap by default? 1% seems awfully low, unless ES assumes most people aren't making heavy use of aggregations. Is there any harm in us significantly boosting this from 1% to, say, 15% of heap?
> >
> > Most of our machines have 30GB of RAM and heap at 50% of that (15GB), so the query cache is 150MB by default. I think we'd like to experiment with growing that to at least 10% of heap to have 1GB in use for this cache.
> >
> > Mike
>
> --
> Adrien
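
For reference, the sizes discussed above work out as follows. This is a small Python sketch of the arithmetic only: the function name is made up, the 15GB heap is Mike's figure, and decimal megabytes (1GB = 1000MB) are assumed so that 1% matches the 150MB he quotes.

```python
def query_cache_mb(heap_mb: int, percent: int) -> int:
    """Size of a cache that is limited to a whole percentage of the heap."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return heap_mb * percent // 100


HEAP_MB = 15_000  # 50% of a 30GB machine, as described in the thread

for percent in (1, 10, 15):
    print(f"{percent}% of heap = {query_cache_mb(HEAP_MB, percent)}MB")
```

So 10% of a 15GB heap is 1.5GB and 15% is 2.25GB. In the 1.x releases the node-level limit is, as far as I know, set with indices.cache.query.size in elasticsearch.yml; Adrien's caveat still applies, since the cache is dropped on every refresh that follows a change.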
Re: Aggregation not limited to filter?
I had a similar problem recently and solved it by moving my filter into a filtered query (leaving the query as a match_all); see the documentation here: http://www.elastic.co/guide/en/elasticsearch/reference/1.5/query-dsl-filtered-query.html

I am not certain why filters do not restrict the scope of the aggregations while queries do, but I suspect that it interprets the filter (when not wrapped in a filtered query) as a post_filter (http://www.elastic.co/guide/en/elasticsearch/reference/1.x/search-request-post-filter.html). Maybe someone else actually knows why.

Hope that helps,
James

On Fri, Apr 10, 2015 at 11:39 AM, James Green james.mk.gr...@gmail.com wrote:

> I must be doing something stupid!
>
> Using the Java client, I can perform a search with a filter and iterate over the hits. I see exactly the right source documents.
>
> If I add an aggregation, I see the expected keyAsText string, but the docCount reflects the volume as if the filter had not been applied.
>
> I expected the aggregation to be restricted to the results within that filter?
>
> Thanks,
> James
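
To make the difference concrete, here is a sketch of the two request bodies, written as Python dictionaries in the 1.x query DSL. The helper names, the field names ("status", "category") and the aggregation name are hypothetical; only the "filtered", "post_filter" and "aggs" keys come from the documentation linked above.

```python
import json


def filtered_query_body(filter_clause: dict, aggs: dict) -> dict:
    """Filter inside a filtered query: hits AND aggregations are restricted."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": filter_clause,
            }
        },
        "aggs": aggs,
    }


def post_filter_body(filter_clause: dict, aggs: dict) -> dict:
    """Filter as post_filter: only the hits are restricted, not the aggregations."""
    return {
        "query": {"match_all": {}},
        "post_filter": filter_clause,
        "aggs": aggs,
    }


# Hypothetical filter and aggregation, for illustration only.
only_active = {"term": {"status": "active"}}
by_category = {"by_category": {"terms": {"field": "category"}}}

print(json.dumps(filtered_query_body(only_active, by_category), indent=2))
print(json.dumps(post_filter_body(only_active, by_category), indent=2))
```

Per the post filter documentation, a post_filter is applied to the hits after the aggregations have been calculated, which would explain doc counts that ignore the filter. The first body is the shape that fixed it for me.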