Hi!

We're using Elasticsearch for an open source geocoder called photon. We were 
using Solr previously, but we switched to Elasticsearch some time ago, and 
I am now using multi_match's cross_fields 
<http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html#type-cross-fields>
 
query (which is great, by the way, as it sorts out most of the problems we had 
before).

I compared the performance of both implementations, and it turned out that 
the Elasticsearch one is about 5 times slower than its Solr counterpart. The 
dataset (100,000,000 documents) is identical, as is the size of both indices. 
On the Solr side, I am using an edismax 
<https://github.com/komoot/photon/blob/deprecated-solr-version/solrconfig/collection1/conf/solrconfig.xml#L122>
 
query, whilst it is a cross_fields 
<https://github.com/christophlingg/photon/blob/komoot/website/photon/app.py#L25>
 query on 
Elasticsearch. Average query time is 120ms vs. 1000s.

I set the limit on open file descriptors to 64k. During the benchmark 
there is (almost) no IO, whilst CPU usage is very high (> 75%, 12 cores). As 
cross_fields is a very recent feature, I tried out best_fields 
<http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html#type-best-fields>
 as 
well, but the benchmark results weren't better.
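
For reference, here is a minimal sketch of the two query variants compared above, written as plain Python dicts. The field names, the sample search text and the helper name multi_match_body are made up for illustration; the actual query is the one in the linked app.py. The only thing that changes between the two runs is the "type" parameter.

```python
def multi_match_body(text, fields, match_type="cross_fields", operator="and"):
    """Build a multi_match request body; only `type` differs between variants."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "type": match_type,       # "cross_fields" or "best_fields"
                "fields": list(fields),
                "operator": operator,
            }
        }
    }


# Hypothetical field names, not the real photon mapping.
FIELDS = ["name", "street", "city", "country"]

cross_fields_body = multi_match_body("berlin hauptstrasse", FIELDS)
best_fields_body = multi_match_body(
    "berlin hauptstrasse", FIELDS, match_type="best_fields"
)
```

Either body can be sent to the index's _search endpoint as JSON, so the benchmark can keep everything else constant while switching the type.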

Do you have any ideas on how I can dig deeper into performance issues like 
this in Elasticsearch? Do you have experience with both queries that you can 
share with me?

Thanks for your help!
Christoph
