I'm going through a cluster sizing exercise and I'm looking for some pointers.
I have tested the target hardware with a single shard loaded with production data, simulating some of our actual queries with Gatling. Our heaviest queries are a blend of top_children queries with filters, match_all queries with query_string filters, and query_string queries with has_parent query_string filters. From those tests I've identified an optimal shard size, which gives me the target number of shards for the index (num_shards = index_size / max_shard_size).

Some other notes:
- Memory on the box is split 50/50 between the JVM heap and the OS.
- The available OS page cache can hold maybe two shards' worth of data.
- There is plenty of headroom left for the field and filter caches.
- Heap and GC seem to be behaving nicely, with a slow, consistent step curve in Bigdesk.
- The tests suggest the box is CPU bound rather than disk-IO bound: iostat shows no disk reads during high-load periods, Bigdesk shows most time spent in user space, and the highest activity appears to be around young-generation collections.

At this point, what are the most important things to consider when deciding how many of these shards to co-locate on the same node?
- Would adding more shards to the same box invalidate the tests I just ran?
- Should I limit the number of shards per node to what fits in the page cache? That really blows up the number of nodes I'd need, which is pretty costly. Is that approach really worth it?
- Since I'm CPU bound, shouldn't I be looking at more cores if I want more shards per box?

Any pointers are welcome.
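For concreteness, here is a hedged sketch of what one of the heavy query shapes described above might look like in the pre-2.0 query DSL: a query_string query filtered by a has_parent query_string filter. All type and field names here are hypothetical, not taken from the original post.

```python
# Hedged sketch of a query_string query with a has_parent query_string
# filter (pre-2.0 "filtered" query DSL). Type/field names are invented.
query = {
    "query": {
        "filtered": {
            "query": {
                "query_string": {"query": "title:elasticsearch"}  # hypothetical field
            },
            "filter": {
                "has_parent": {
                    "parent_type": "blog",  # hypothetical parent type
                    "query": {
                        "query_string": {"query": "author:john"}  # hypothetical field
                    },
                }
            },
        }
    }
}
print(sorted(query["query"]["filtered"].keys()))  # ['filter', 'query']
```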
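The num_shards = index_size / max_shard_size calculation from the post can be sketched with hypothetical numbers (the actual index and shard sizes are not given in the post):

```python
import math

# Hypothetical sizing numbers, for illustration only.
index_size_gb = 500       # total expected index size (assumed)
max_shard_size_gb = 30    # optimal shard size found via the Gatling tests (assumed)

# Round up: a fractional shard still needs a whole shard.
num_shards = math.ceil(index_size_gb / max_shard_size_gb)
print(num_shards)  # 17
```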
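The "limit shards per node to what fits in page cache" trade-off can also be put into numbers. This is a sketch under assumed per-node figures (RAM size, shard size, and total shard count are invented, not from the post), using the 50/50 heap/OS split described above:

```python
import math

# Hypothetical per-node numbers, for illustration only.
ram_gb = 128                  # total RAM per node (assumed)
heap_gb = ram_gb // 2         # 50/50 split between JVM heap and OS, per the post
page_cache_gb = ram_gb - heap_gb
shard_size_gb = 30            # optimal shard size from the load tests (assumed)
total_shards = 17             # target shard count for the index (assumed)

# Shards that can sit fully resident in the page cache on one node.
shards_per_node = page_cache_gb // shard_size_gb
# Node count if we cap each node at that number of shards.
nodes_needed = math.ceil(total_shards / shards_per_node)
print(shards_per_node, nodes_needed)  # 2 9
```

This makes the cost concern explicit: capping at fully-cached shards roughly doubles or triples the node count compared with packing shards more densely and accepting some cache misses.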