RE: distributed search is significantly slower than direct search

Elran Dvir Tue, 12 Nov 2013 23:53:39 -0800

Erick, Thanks for your response.

We are upgrading our system using Solr.
We need to preserve old functionality.  Our client displays 5K documents and 
groups them.


Is there a way to refactor the code in order to speed up distributed document 
fetching?

Thanks. 

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, October 30, 2013 3:17 AM
To: solr-user@lucene.apache.org
Subject: Re: distributed search is significantly slower than direct search

You can't. There will inevitably be some overhead in the distributed case. That 
said, 7 seconds is quite long.

5,000 rows is excessive, and probably where your issue is. You're having to go 
out and fetch the docs across the wire. Perhaps there is some batching that 
could be done there; I don't know whether this is one document per request or 
not.
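
One client-side form of batching, which is not necessarily what is meant above, is to page through the result set with Solr's standard `start` and `rows` parameters instead of asking for all 5,000 rows at once. The sketch below only builds the request URLs; `PagedSelectUrls`, `pageUrls`, and the page size of 500 are hypothetical choices for illustration, and the base URL and `shards` value are copied from the original question. Paging does not reduce the total number of documents fetched, so it would mainly help the client show the first results sooner.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits one large select request into smaller pages. */
final class PagedSelectUrls {

    /**
     * Builds one select URL per page, using Solr's start/rows paging parameters.
     * The last page is shortened so the pages never cover more than totalRows.
     */
    static List<String> pageUrls(String baseSelectUrl, String shards,
                                 int totalRows, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<String> urls = new ArrayList<>();
        for (int start = 0; start < totalRows; start += pageSize) {
            int rows = Math.min(pageSize, totalRows - start);
            urls.add(baseSelectUrl + "?q=*:*&start=" + start
                    + "&rows=" + rows + "&shards=" + shards);
        }
        return urls;
    }

    public static void main(String[] args) {
        // Values taken from the question; the page size is an assumption.
        for (String url : pageUrls("http://127.0.0.1:8983/solr/template/select",
                                   "127.0.0.1:8983/solr/core1", 5000, 500)) {
            System.out.println(url);
        }
    }
}
```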

Why 5K docs?

Best,
Erick


On Tue, Oct 29, 2013 at 2:54 AM, Elran Dvir <elr...@checkpoint.com> wrote:

> Hi all,
>
> I am using Solr 4.4 with multi cores. One core (called template) is my 
> "routing" core.
>
> When I run
> http://127.0.0.1:8983/solr/template/select?rows=5000&q=*:*&shards=127.0.0.1:8983/solr/core1,
> it consistently takes about 7s.
> When I run http://127.0.0.1:8983/solr/core1/select?rows=5000&q=*:*, it 
> consistently takes about 40ms.
>
> I profiled the distributed query.
> This is the distributed query process (I hope the terms are accurate):
> When Solr identifies a distributed query, it sends the query to the 
> shard and gets the matching shard docs.
> Then it sends another query to the shard to get the Solr documents.
> Most of the time is spent in the last stage, in the "process" function of 
> "QueryComponent", in this loop:
>
> for (int i=0; i<idArr.size(); i++) {
>         int id = req.getSearcher().getFirstMatch(
>                 new Term(idField.getName(),
>                         idField.getType().toInternal(idArr.get(i))));
>
> How can I make my distributed query as fast as the direct one?
>
> Thanks.
>
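
The two-phase flow described in the quoted message can be illustrated with a toy model. This is a sketch and not Solr code: `TwoPhaseFetchSketch`, `ToyShard`, `matchingKeys`, `fetchByKeys`, and `fetchDirect` are hypothetical names, and a hash map stands in for the per-ID term lookup that the profiled `getFirstMatch` loop performs against the index. It only shows that the distributed path pays one key lookup per requested row, while the direct path already holds internal document IDs and skips that step.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a two-phase distributed fetch; an illustration only. */
final class TwoPhaseFetchSketch {

    /** Hypothetical stand-in for one shard. */
    static final class ToyShard {
        private final List<String> uniqueKeys = new ArrayList<>();      // internal id -> unique key
        private final List<String> storedDocs = new ArrayList<>();      // internal id -> stored doc
        private final Map<String, Integer> keyIndex = new HashMap<>();  // unique key -> internal id
        int keyLookups = 0; // number of per-key lookups performed so far

        void add(String key, String body) {
            keyIndex.put(key, storedDocs.size());
            uniqueKeys.add(key);
            storedDocs.add(body);
        }

        /** Phase 1: unique keys of the first `rows` matches (match-all query here). */
        List<String> matchingKeys(int rows) {
            return new ArrayList<>(uniqueKeys.subList(0, Math.min(rows, uniqueKeys.size())));
        }

        /** Phase 2: map each unique key back to an internal id, then load the doc. */
        List<String> fetchByKeys(List<String> keys) {
            List<String> docs = new ArrayList<>();
            for (String key : keys) {
                keyLookups++;
                Integer internalId = keyIndex.get(key);
                if (internalId != null) {
                    docs.add(storedDocs.get(internalId));
                }
            }
            return docs;
        }

        /** Direct path: internal ids are already known, so no key lookups happen. */
        List<String> fetchDirect(int rows) {
            return new ArrayList<>(storedDocs.subList(0, Math.min(rows, storedDocs.size())));
        }
    }

    /** Runs both phases against a single shard, as the routing core would. */
    static List<String> distributedFetch(ToyShard shard, int rows) {
        List<String> keys = shard.matchingKeys(rows); // first request to the shard
        return shard.fetchByKeys(keys);               // second request to the shard
    }

    public static void main(String[] args) {
        ToyShard shard = new ToyShard();
        for (int i = 0; i < 5000; i++) {
            shard.add("doc-" + i, "body " + i);
        }
        List<String> docs = distributedFetch(shard, 5000);
        System.out.println(docs.size() + " docs, " + shard.keyLookups + " key lookups");
    }
}
```

In this model a lookup is a cheap hash-map access; in a real index each lookup is a term seek, so 5,000 of them per request may plausibly account for much of the gap reported above.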

