Re: Clustering/Sharding impact on query performance

Kireet Reddy Mon, 21 Jul 2014 08:47:35 -0700

My working assumption had been that elasticsearch executes queries across all 
shards in parallel and then merges the results. So maybe shards <= cpu cores 
would help in this case where there is only one concurrent query. But I have 
never tested this assumption, out of curiosity during the 20 shard test did you 
still only see 1 cpu being used? Did you try 2 shards and get the same results?


On Jul 20, 2014, at 1:01 AM, 'Fin Sekun' via elasticsearch 
<elasticsearch@googlegroups.com> wrote:

> Hi Kireet, thanks for your answer and sorry for the late response. More 
> shards doesn't help. It will slow down the system because each shard takes 
> quite some overhead to maintain a Lucene index and, the smaller the shards, 
> the bigger the overhead. Having more shards enhances the indexing performance 
> and allows to distribute a big index across machines, but I don't have a 
> cluster with a lot of machines. I could observe this negative effects while 
> testing with 20 shards.
> 
> It would be very cool if somebody could answer/comment to the question 
> summarized at the end of my post. Thanks again.
> 
> 
> 
> 
> 
> On Friday, July 11, 2014 3:02:50 AM UTC+2, Kireet Reddy wrote:
> I would test using multiple primary shards on a single machine. Since your 
> dataset seems to fit into RAM, this could help for these longer latency 
> queries.
> 
> On Thursday, July 10, 2014 12:24:26 AM UTC-7, Fin Sekun wrote:
> Any hints?
> 
> 
> 
> On Monday, July 7, 2014 3:51:19 PM UTC+2, Fin Sekun wrote:
> 
> Hi,
> 
> 
> SCENARIO
> 
> Our Elasticsearch database has ~2.5 million entries. Each entry has the three 
> analyzed fields "match", "sec_match" and "thi_match" (all contains 3-20 
> words) that will be used in this query:
> https://gist.github.com/anonymous/a8d1142512e5625e4e91
> 
> 
> ES runs on two types of servers:
> (1) Real servers (system has direct access to real CPUs, no virtualization) 
> of newest generation - Very performant!
> (2) Cloud servers with virtualized CPUs - Poor CPUs, but this is generic for 
> cloud services.
> 
> See https://gist.github.com/anonymous/3098b142c2bab51feecc for (1) and (2) 
> CPU details.
> 
> 
> ES settings:
> ES version 1.2.0 (jdk1.8.0_05)
> ES_HEAP_SIZE = 512m (we also tested with 1024m with same results)
> vm.max_map_count = 262144
> ulimit -n 64000
> ulimit -l unlimited
> index.number_of_shards: 1
> index.number_of_replicas: 0
> index.store.type: mmapfs
> threadpool.search.type: fixed
> threadpool.search.size: 75
> threadpool.search.queue_size: 5000
> 
> 
> Infrastructure:
> As you can see above, we don't use the cluster feature of ES (1 shard, 0 
> replicas). The reason is that our hosting infrastructure is based on 
> different providers.
> Upside: We aren't dependent on a single hosting provider. Downside: Our 
> servers aren't in the same LAN.
> 
> This means:
> - We cannot use ES sharding, because synchronisation via WAN (internet) seems 
> not a useful solution.
> - So, every ES-server has the complete dataset and we configured only one 
> shard and no replicas for higher performance.
> - We have a distribution process that updates the ES data on every host 
> frequently. This process is fine for us, because updates aren't very often 
> and perfect just-in-time ES synchronisation isn't necessary for our business 
> case.
> - If a server goes down/crashs, the central loadbalancer removes it (the 
> resulting minimal packet lost is acceptable).
>  
> 
> 
> 
> PROBLEM
> 
> For long query terms (6 and more keywords), we have very high CPU loads, even 
> on the high performance server (1), and this leads to high response times: 
> 1-4sec on server (1), 8-20sec on server (2). The system parameters while 
> querying:
> - Very high load (usually 100%) for the thread responsible CPU (the other 
> CPUs are idle in our test scenario)
> - No I/O load (the harddisks are fine)
> - No RAM bottlenecks
> 
> So, we think the file caching is working fine, because we have no I/O 
> problems and the garbage collector seams to be happy (jstat shows very few 
> GCs). The CPU is the problem, and ES hot-threads point to the Scorer module:
> https://gist.github.com/anonymous/9cecfd512cb533114b7d 
> 
> 
> 
> 
> SUMMARY/ASSUMPTIONS
> 
> - Our database size isn't very big and the query not very complex.
> - ES is designed for huge amount of data, but the key is clustering/sharding: 
> Data distribution to many servers means smaller indices, smaller indices 
> leads to fewer CPU load and short response times.
> - So, our database isn't big, but to big for a single CPU and this means 
> especially low performance (virtual) CPUs can only be used in sharding 
> environments.
> 
> If we don't want to lost the provider independency, we have only the 
> following two options:
> 
> 1) Simpler query (I think not possible in our case)
> 2) Smaller database
> 
> 
> 
> 
> QUESTIONS
> 
> Are our assumptions correct? Especially:
> 
> - Is clustering/sharding (also small indices) the main key to performance, 
> that means the only possibility to prevent overloaded (virtual) CPUs?
> - Is it right that clustering is only useful/possible in LANs?
> - Do you have any ES configuration or architecture hints regarding our 
> preference for using multiple hosting providers?
> 
> 
> 
> Thank you. Rgds
> Fin
> 
> 
> -- 
> You received this message because you are subscribed to a topic in the Google 
> Groups "elasticsearch" group.
> To unsubscribe from this topic, visit 
> https://groups.google.com/d/topic/elasticsearch/TpEboTt_8FY/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to 
> elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/7e1b3a52-23b4-4a22-8433-985e07ae7904%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ED431C9F-670C-4D2C-8F44-1F9E82A8D155%40feedly.com.
For more options, visit https://groups.google.com/d/optout.

Re: Clustering/Sharding impact on query performance

Reply via email to