I have just created an issue: https://github.com/elasticsearch/elasticsearch/issues/6463
Regards.

On Thursday, June 5, 2014 at 8:02:05 PM UTC-3, Mark Walkom wrote:
>
> This would probably be worth raising as a GitHub issue -
> https://github.com/elasticsearch/
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com
> web: www.campaignmonitor.com
>
>
> On 5 June 2014 22:38, Marcelo Paes Rech <marcelo...@gmail.com> wrote:
>
>> Hi Jörg. Thanks for your reply again.
>>
>> As I said, I had already used the ids filter, but I got the same behaviour.
>>
>> I realized what was wrong. Maybe it is a bug in ES, maybe not. When I executed the filter I included the "from" and "size" attributes. In this case "size" was 999999, but the final result would be just 10 documents. Apparently ES pre-allocates the objects I say I will use (maybe for performance reasons), but if the final result is smaller than the requested total (999999), ES doesn't release the remaining pre-allocated objects until the memory (heap) is full.
>>
>> I changed the size attribute to 10 and the heap became stable.
>>
>> That's it. Thanks.
>>
>> Regards.
>>
>> On Wednesday, June 4, 2014 at 7:54:15 PM UTC-3, Jörg Prante wrote:
>>>
>>> Why do you use a terms filter on the _id field and not the ids filter? The ids filter is more efficient, since it reuses the _uid field, which is cached by default.
>>>
>>> Do the terms in the query vary from query to query? If so, caching might kill your heap.
>>>
>>> Another possible issue is that your query is not distributed to all shards. If the query does not vary from user to user in your test, you have created a "hot spot": all the load from the 100 users would go to a limited number of nodes with a limited shard count.
>>>
>>> The search thread pool seems small at 50 if you execute searches for 100 users in parallel; this can lead to congestion of the search module. Why don't you use 100 (at least)?
>>>
>>> Jörg
>>>
>>>
>>> On Wed, Jun 4, 2014 at 2:40 PM, Marcelo Paes Rech <marcelo...@gmail.com> wrote:
>>>
>>>> Hi Jörg. Thanks for your reply.
>>>>
>>>> Here is my filter:
>>>>
>>>> {
>>>>   "filter" : {
>>>>     "terms" : {
>>>>       "_id" : [ "QSxrbEM8TKe5zr8931xBjA", "wj63ghegRwC6qLsWq2chkA",
>>>>         "hYEhDbAqQwSRxhYfvDgFkg", "4bZmPE1fTYqijphRyyWiuQ",
>>>>         "Fhq53yYyT3CEw6vclKu_NA", "XL2atBraTEyx57MefjFVhA",
>>>>         "951i0dZkT064FlQkzHnnWA", "O8Ixbir1TrGT_IA3wKfsHg",
>>>>         "8k4U7KsuTmsThqxy-5YaKw", "GNOoQTHglf22kzcE7EOf8g",
>>>>         "-RQeY48fTg2kYnh2M4E1cQ", "u8DGBdfVR9WRVj6d9E4Ebw",
>>>>         "WFHSXd7UQvCMYFBhFcTsng", "qnQ7q7FyTsg397lM1EWgqA",
>>>>         "wRQtUzdMRy2qOkMCNxdpgA", "Ll83iglxSUS_Gs7mjkMt8w",
>>>>         "d2sxZ1oBTfuvAfov5EJ0iw", "cyht-vB4Q-mMSg9N5jcGXg",
>>>>         "bNSVaO47QTOCkfJhWo0qjg", "BHuhm55IRerKnynJ8WgFTw",
>>>>         "fHKA4PF2QteWm8E7dW7CAw", "DLE6A7tyQJ-zcKcCa6IPSA",
>>>>         "qfelTW7-SuGRQ0GKbngARA", "R7VHHJhYsUqfuxYof8BJ8w",
>>>>         "W4PqiJfPSlSFjVKFsGkA4Q", "Juq62zOsRdheuW3O6Gb2KA",
>>>>         "U9v0IKj_RrgRNjE31ZTt2g", "uNHa0kOOT5qjPpzxZcs35A",
>>>>         "SwOgVNgIRwyVU3pEEycBuQ", "LaEpxFGIQgCArsNZ2rd4Pw",
>>>>         "CiJ9gouZsbmTtxTWx7w6lA", "TaQV_I01RfCq3B6uAtIBoQ",
>>>>         "9Jpjo5k-RlGfLVLF6nDgze", "57YpjRdASsrrae-RD3spog",
>>>>         "bmA4EWFSTiKUaDzaNcCFKQ", "Fui9z_UbRe6AY1VhAr8Crw",
>>>>         "2PORr5BzSDOmBXgmQkO5Zg", "snfwTmtuTv-uj5mOWSJpgA",
>>>>         "0nHIrtePSaeW8aWArh_Mrg", "s0g9QHnjTgWX3rCIu1g0Hg",
>>>>         "Jl67fACuQvCFgZxXAFtDOg" ],
>>>>       "_cache" : true,
>>>>       "_cache_key" : "my_terms_cache"
>>>>     }
>>>>   }
>>>> }
>>>>
>>>> I already used the *ids filter* but I got the same behaviour.
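For reference, a minimal sketch of the same lookup with both suggestions from this thread applied: the ids filter Jörg recommends, wrapped in a filtered query so it is applied during the query phase rather than as a post filter, and a size that matches the number of documents actually needed, which is the change Marcelo describes above as making the heap stable. Only the first two IDs from the filter above are shown; the exact request shape is an assumption based on Elasticsearch 1.x syntax, not something posted in the thread:

{
  "from" : 0,
  "size" : 10,
  "query" : {
    "filtered" : {
      "query" : { "match_all" : {} },
      "filter" : {
        "ids" : {
          "values" : [ "QSxrbEM8TKe5zr8931xBjA", "wj63ghegRwC6qLsWq2chkA" ]
        }
      }
    }
  }
}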
>>>> One thing that I realized is that one of the cluster's nodes keeps growing its search thread pool (something like Queue: 50 and Count: 47) while the others don't (something like Queue: 0 and Count: 1). If I remove this node from the cluster, another one starts showing the same problem.
>>>>
>>>> My current environment is:
>>>> - 7 data nodes with 16GB (8GB for ES) and 8 cores each;
>>>> - 4 load-balancer nodes (no data, no master) with 4GB (3GB for ES) and 8 cores each;
>>>> - 4 master nodes (master only, no data) with 4GB (3GB for ES) and 8 cores each;
>>>> - search thread pool size 47 (the other pools are at the standard config);
>>>> - index with 7 shards and 2 replicas;
>>>> - 14.6GB index size (14,524,273 documents).
>>>>
>>>> I'm executing this filter with 50 concurrent users.
>>>>
>>>> Regards
>>>>
>>>> On Tuesday, June 3, 2014 at 8:33:45 PM UTC-3, Jörg Prante wrote:
>>>>>
>>>>> Can you show your test code?
>>>>>
>>>>> You seem to be looking at the wrong settings - by adjusting node count, shard count, and replica count alone, you cannot find the maximum node performance. Concurrency settings, index optimizations, query optimizations, thread pooling, and most of all fast disk subsystem I/O are important.
>>>>>
>>>>> Jörg
>>>>>
>>>>>
>>>>> On Wed, Jun 4, 2014 at 12:18 AM, Marcelo Paes Rech <marcelo...@gmail.com> wrote:
>>>>>
>>>>>> Thanks for your reply, Nikolas. It helps a lot.
>>>>>>
>>>>>> And what about the number of documents, or the size, of each shard? And the need for no-data nodes or master-only nodes - when are they necessary?
>>>>>>
>>>>>> In some tests I did, when I increased the number of requests (e.g. 100 users at the same moment, repeated again and again), 5 nodes with 1 shard and 2 replicas each and 16GB RAM (8GB for ES and 8GB for the OS) weren't enough. The response time started to exceed 5s (I think less than 1s would be acceptable in this case).
>>>>>>
>>>>>> This test had a lot of documents (something like 14 million).
>>>>>>
>>>>>> Thanks. Regards.
>>>>>>
>>>>>> On Monday, June 2, 2014 at 5:09:04 PM UTC-3, Nikolas Everett wrote:
>>>>>>>
>>>>>>> On Mon, Jun 2, 2014 at 3:52 PM, Marcelo Paes Rech <marcelo...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi guys,
>>>>>>>>
>>>>>>>> I'm looking for an article or a guide on the best cluster configuration. I have read a lot of articles along the lines of "change this configuration" and "you must create X shards per node", but I haven't seen anything like an official Elasticsearch guide to creating a cluster.
>>>>>>>>
>>>>>>>> What I would like to know is information like:
>>>>>>>> - How do I calculate how many shards will be good for the cluster?
>>>>>>>> - How many shards do we need per node? And if this is variable, how do I calculate it?
>>>>>>>> - How much memory do I need per node, and how many nodes?
>>>>>>>>
>>>>>>>> I think Elasticsearch is well documented, but the documentation is very fragmented.
>>>>>>>
>>>>>>> For some of these, that is because "it depends" is the answer. For example, you'll want larger heaps for aggregations and faceting.
>>>>>>>
>>>>>>> There are some rules of thumb:
>>>>>>> 1. Set Elasticsearch's heap memory to 1/2 of RAM, but not more than 30GB. Bigger than that and the JVM can't do pointer compression, so you effectively lose RAM.
>>>>>>> 2. #1 implies that having much more than 60GB of RAM on each node doesn't make a big difference. It helps, but it's not really as good as having more nodes.
>>>>>>> 3. The most efficient way of sharding is likely one shard on each node. So if you have 9 nodes and a replication factor of 2 (so 3 total copies), then 3 shards are likely to be more efficient than 2 or 4. But this only really matters when those shards get lots of traffic.
>>>>>>>
>>>>>>> And it breaks down a bit when you get lots of nodes, and in the presence of routing. It's complicated.
>>>>>>>
>>>>>>> But these are really just starting points, safe-ish defaults.
>>>>>>>
>>>>>>> Nik
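To make Nikolas's rules concrete: for rule 1, a 64GB machine would get roughly a 30GB heap (on the 1.x series typically set via the ES_HEAP_SIZE environment variable), leaving the rest to the OS file cache; for rule 3, with 9 data nodes and 2 replicas, 3 primary shards give 3 x (1 + 2) = 9 shard copies, i.e. one per node. A sketch of index settings for that layout follows - the 9-node cluster and the create-index request around this body are assumptions for illustration, not details from the thread:

{
  "settings" : {
    "index" : {
      "number_of_shards" : 3,
      "number_of_replicas" : 2
    }
  }
}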
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/c727d421-6426-4c52-b3ce-f8533e3dc68b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.