You are asking for 5000 docs, right? That forces us to look up 5000 
external-to-internal ids. I think this has always had a cost, but it’s obviously 
worse if you ask for a ton of results. I don’t think the single-node case has to 
do this. And if we had something like Searcher leases (we will eventually), I 
think we could avoid it and just use internal ids.
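
To make that concrete, here's a rough sketch of what each of those lookups does
on the shard side. This is illustration only, not Solr's actual code path: it
uses Lucene 4.x-era APIs from memory (names/signatures may be slightly off) and
just assumes the unique key field is called "id".

import java.io.IOException;

import org.apache.lucene.index.DocsEnum;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.util.BytesRef;

class ExternalIdLookup {
    // Roughly what the second (fetch) phase pays per returned row: seek the
    // unique key in the term dictionary, then read its posting to get the docid.
    static int externalToInternal(IndexReader reader, String uniqueKey) throws IOException {
        Terms terms = MultiFields.getTerms(reader, "id");       // merged view over all segments
        if (terms == null) return -1;
        TermsEnum te = terms.iterator(null);
        if (!te.seekExact(new BytesRef(uniqueKey))) return -1;  // term-dictionary seek (block loads happen here)
        DocsEnum docs = te.docs(MultiFields.getLiveDocs(reader), null);
        int docid = docs.nextDoc();
        return docid == DocIdSetIterator.NO_MORE_DOCS ? -1 : docid;
    }
}

Doing that a few thousand times per request, through a multi-segment terms view,
is where the seekExact/loadBlock time reported further down the thread goes.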

- Mark

On Nov 17, 2013, at 12:44 PM, Yuval Dotan <yuvaldo...@gmail.com> wrote:

> Hi Tomás
> This is just a test environment meant only to reproduce the issue I am
> currently investigating.
> The number of documents should grow substantially (billions of docs).
> 
> 
> 
> On Sun, Nov 17, 2013 at 7:12 PM, Tomás Fernández Löbbe <
> tomasflo...@gmail.com> wrote:
> 
>> Hi Yuval, quick question. You say that your core has 750k docs and is around
>> 400MB? Is this some kind of test dataset that you expect to grow
>> significantly? For an index of this size I wouldn't use distributed
>> search; a single shard should be fine.
>> 
>> 
>> Tomás
>> 
>> 
>> On Sun, Nov 17, 2013 at 6:50 AM, Yuval Dotan <yuvaldo...@gmail.com> wrote:
>> 
>>> Hi,
>>> 
>>> I isolated the case
>>> 
>>> Installed on a new machine (2 x Xeon E5410 2.33GHz)
>>> 
>>> I have an environment with 12GB of memory.
>>> 
>>> I assigned 6GB of memory to Solr, and I’m not running any other
>>> memory-consuming process, so no memory issues should arise.
>>> 
>>> Removed all indexes apart from two:
>>> 
>>> emptyCore – empty – used for routing
>>> 
>>> core1 – holds the stored data – has ~750,000 docs and a size of 400MB
>>> 
>>> Again, this is a single machine that holds both indexes.
>>> 
>>> The query
>>> 
>>> http://localhost:8210/solr/emptyCore/select?rows=5000&q=*:*&shards=127.0.0.1:8210/solr/core1&wt=json
>>> has a QTime of ~3 seconds,
>>> 
>>> and the direct query
>>> http://localhost:8210/solr/core1/select?rows=5000&q=*:*&wt=json
>>> has a QTime of ~15 ms - a difference of two orders of magnitude.
>>> 
>>> I ran the long query several times and got an improvement of about a sec
>>> (33%) but that’s it.
>>> 
>>> I need to better understand why this is happening.
>>> 
>>> I tried looking at Solr code and debugging the issue but with no success.
>>> 
>>> The one thing I did notice is that the getFirstMatch method - which receives
>>> the doc's unique id, searches the term dictionary and returns the internal
>>> id - takes most of the time for some reason.
>>> 
>>> I am pretty stuck and would appreciate any ideas
>>> 
>>> My only solution for the moment is to bypass the distributed query and
>>> implement code in my own app that directly queries the relevant cores and
>>> handles the sorting etc. - something along the lines of the sketch below.
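>>> 
>>> Very rough SolrJ sketch of that workaround - hypothetical, not our real code:
>>> the core URLs, the "timestamp" sort field and the naive merge are placeholders.
>>> 
>>> import java.util.ArrayList;
>>> import java.util.Collections;
>>> import java.util.Comparator;
>>> import java.util.List;
>>> 
>>> import org.apache.solr.client.solrj.SolrQuery;
>>> import org.apache.solr.client.solrj.impl.HttpSolrServer;
>>> import org.apache.solr.common.SolrDocument;
>>> 
>>> public class DirectCoreQuery {
>>>     public static void main(String[] args) throws Exception {
>>>         String[] coreUrls = { "http://localhost:8210/solr/core1" }; // add the other cores here
>>>         final String sortField = "timestamp";                       // placeholder sort field
>>>         int rows = 5000;
>>> 
>>>         List<SolrDocument> all = new ArrayList<SolrDocument>();
>>>         for (String url : coreUrls) {
>>>             HttpSolrServer server = new HttpSolrServer(url);
>>>             SolrQuery q = new SolrQuery("*:*");
>>>             q.setRows(rows);
>>>             q.setSort(sortField, SolrQuery.ORDER.desc);
>>>             all.addAll(server.query(q).getResults());   // one direct, non-distributed request per core
>>>             server.shutdown();
>>>         }
>>> 
>>>         // merge client-side: re-sort the combined results and keep the global top `rows`
>>>         Collections.sort(all, new Comparator<SolrDocument>() {
>>>             @SuppressWarnings({ "unchecked", "rawtypes" })
>>>             public int compare(SolrDocument a, SolrDocument b) {
>>>                 Comparable va = (Comparable) a.getFieldValue(sortField);
>>>                 Comparable vb = (Comparable) b.getFieldValue(sortField);
>>>                 return vb.compareTo(va);                 // descending, matching the per-core sort
>>>             }
>>>         });
>>>         List<SolrDocument> top = all.subList(0, Math.min(rows, all.size()));
>>>         System.out.println("merged " + top.size() + " docs");
>>>     }
>>> }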
>>> 
>>> Thanks
>>> 
>>> 
>>> 
>>> 
>>> On Sat, Nov 16, 2013 at 2:39 PM, Michael Sokolov <
>>> msoko...@safaribooksonline.com> wrote:
>>> 
>>>> Did you say what the memory profile of your machine is? How much memory,
>>>> and how large are the shards? This is just a random guess, but it might be
>>>> that if you are memory-constrained, there is a lot of thrashing caused by
>>>> paging (swapping?) in and out the sharded indexes while a single index can
>>>> be scanned linearly, even if it does need to be paged in.
>>>> 
>>>> -Mike
>>>> 
>>>> 
>>>> On 11/14/2013 8:10 AM, Elran Dvir wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> We tried returning just the id field and got exactly the same performance.
>>>>> Our system is distributed, but all shards are on a single machine, so
>>>>> network issues are not a factor.
>>>>> The code we found where Solr is spending its time is on the shard and not
>>>>> on the routing core; again, all shards are local.
>>>>> We investigated the getFirstMatch() method and noticed that
>>>>> MultiTermsEnum.reset (inside MultiTerms.iterator) and MultiTermsEnum.seekExact
>>>>> take 99% of the time.
>>>>> Inside these methods, the call to
>>>>> BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock takes most of the time.
>>>>> Out of the 7-second run these methods take ~5 seconds, and
>>>>> BinaryResponseWriter.write takes the rest (~2 seconds).
>>>>> 
>>>>> We tried increasing cache sizes and got hits, but it only improved the
>>>>> query time by about a second (~6s), so no major effect.
>>>>> We are not indexing during our tests; the performance is similar.
>>>>> (How do we measure doc size? And is it important, given that the
>>>>> performance is the same when returning only the id field?)
>>>>> 
>>>>> We still don't completely understand why the query takes this much longer
>>>>> although the cores are on the same machine.
>>>>> 
>>>>> Is there a way to improve the performance (code, configuration, query)?
>>>>> 
>>>>> -----Original Message-----
>>>>> From: idokis...@gmail.com [mailto:idokis...@gmail.com] On Behalf Of
>>>>> Manuel Le Normand
>>>>> Sent: Thursday, November 14, 2013 1:30 AM
>>>>> To: solr-user@lucene.apache.org
>>>>> Subject: Re: distributed search is significantly slower than direct search
>>>>> 
>>>>> It's surprising such a query takes so long. I would assume that after
>>>>> trying q=*:* consistently you should be getting cache hits and times should
>>>>> be faster. Take a look in the admin UI at how your query/doc caches perform.
>>>>> Moreover, the query in itself is just asking for the first 5000 docs that
>>>>> were indexed (returning the first [docid]s), so it seems all this time is
>>>>> wasted on transfer. Out of these 7 secs, how much is spent on the above
>>>>> method? What do you return by default? How big is every doc you display in
>>>>> your results?
>>>>> It might be that both collections are competing for the same resources.
>>>>> Try elaborating on your use-case.
>>>>> 
>>>>> Anyway, it seems like you just made a test to see what the performance hit
>>>>> would be in a distributed environment, so I'll try to explain some things we
>>>>> encountered in our benchmarks, with a case that at least resembles yours in
>>>>> the number of docs fetched.
>>>>> 
>>>>> We retrieve 2000 docs on every query, running over 40 shards. This means
>>>>> every shard is actually transferring 2000 docs to our frontend on every
>>>>> document-match request (the first phase you were referring to). Even if
>>>>> lazily loaded, reading 2000 ids (on 40 servers) and lazy-loading the fields
>>>>> is a tough job. Waiting for the slowest shard to respond, then sorting the
>>>>> docs and reloading (lazy or not) the top 2000 docs can take a long time.
>>>>> 
>>>>> Our times are 4-8 secs, but it's not really possible to compare cases. We've
>>>>> taken a few steps that improved things along the way, steps that led to
>>>>> others. These were our starters:
>>>>> 
>>>>>    1. Profile these queries from different servers and Solr instances; try
>>>>>       to put your finger on which collection is working hard and why. Check
>>>>>       if you're stuck on components that don't add value for you but are
>>>>>       used by default.
>>>>>    2. Consider eliminating the documentCache. It loads lots of (partly lazy)
>>>>>       documents whose probability of secondary usage is low - there's no
>>>>>       such thing as "popular docs" when requesting so many docs. You may be
>>>>>       able to use that memory in a better way.
>>>>>    3. Bottleneck check - server metrics such as CPU user time / iowait,
>>>>>       packets transferred over the network, page faults etc. are excellent
>>>>>       for understanding whether the disk/network/CPU is slowing you down.
>>>>>       Then upgrade the hardware on one of the shards and check if it helps
>>>>>       by comparing the upgraded shard's QTime to the others.
>>>>>    4. Warm up the index after committing - benchmark how queries perform
>>>>>       before and after some warm-up, say a few hundred queries (from your
>>>>>       previous system), in order to warm up the OS cache (assuming you're
>>>>>       using NRTCachingDirectoryFactory). A rough sketch of such a warm-up
>>>>>       loop follows below.
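>>>>> 
>>>>>    Purely as an illustration of item 4 - hypothetical: the file name and
>>>>>    URL are made up, and it assumes each line of the file is a plain q=
>>>>>    query string from the previous system.
>>>>> 
>>>>>    import java.io.BufferedReader;
>>>>>    import java.io.FileReader;
>>>>> 
>>>>>    import org.apache.solr.client.solrj.SolrQuery;
>>>>>    import org.apache.solr.client.solrj.impl.HttpSolrServer;
>>>>> 
>>>>>    public class WarmUp {
>>>>>        public static void main(String[] args) throws Exception {
>>>>>            // replay a few hundred old queries after a commit so the OS page
>>>>>            // cache (and Solr caches) are warm before real traffic arrives
>>>>>            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/core1");
>>>>>            BufferedReader queries = new BufferedReader(new FileReader("old-system-queries.txt"));
>>>>>            String line;
>>>>>            int count = 0;
>>>>>            while ((line = queries.readLine()) != null && count < 500) {
>>>>>                server.query(new SolrQuery(line.trim())); // response ignored; the point is touching the index
>>>>>                count++;
>>>>>            }
>>>>>            queries.close();
>>>>>            server.shutdown();
>>>>>        }
>>>>>    }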
>>>>> 
>>>>> 
>>>>> Good luck,
>>>>> Manu
>>>>> 
>>>>> 
>>>>> On Wed, Nov 13, 2013 at 2:38 PM, Erick Erickson <
>>> erickerick...@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> One thing you can try, and this is more diagnostic than a cure, is to
>>>>>> return just the id field (and ensure that lazy field loading is true).
>>>>>> That'll tell you whether the issue is actually fetching the documents
>>>>>> off disk and decompressing them, although frankly that's unlikely since
>>>>>> you can get your 5,000 rows from a single machine quickly.
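>>>>>> 
>>>>>> For example, your original query trimmed down to just the id field:
>>>>>> 
>>>>>> http://127.0.0.1:8983/solr/template/select?rows=5000&q=*:*&fl=id&shards=127.0.0.1:8983/solr/core1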
>>>>>> 
>>>>>> The code you found where Solr is spending its time - is that on the
>>>>>> "routing" core or on the shards? I actually have a hard time
>>>>>> understanding how that code could take so long; it doesn't seem
>>>>>> right.
>>>>>> 
>>>>>> You are transferring 5,000 docs across the network, so it's possible
>>>>>> that your network is just slow; that's certainly a difference between
>>>>>> the local and remote cases, but that's a stab in the dark.
>>>>>> 
>>>>>> Not much help I know,
>>>>>> Erick
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Wed, Nov 13, 2013 at 2:52 AM, Elran Dvir <elr...@checkpoint.com>
>>>>>> wrote:
>>>>>> 
>>>>>> Erick, Thanks for your response.
>>>>>>> 
>>>>>>> We are upgrading our system using Solr.
>>>>>>> We need to preserve old functionality. Our client displays 5K
>>>>>>> documents and groups them.
>>>>>>> 
>>>>>>> Is there a way to refactor the code in order to improve distributed
>>>>>>> document fetching?
>>>>>>> 
>>>>>>> Thanks.
>>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: Erick Erickson [mailto:erickerick...@gmail.com]
>>>>>>> Sent: Wednesday, October 30, 2013 3:17 AM
>>>>>>> To: solr-user@lucene.apache.org
>>>>>>> Subject: Re: distributed search is significantly slower than direct search
>>>>>> 
>>>>>>> You can't. There will inevitably be some overhead in the distributed case.
>>>>>>> 
>>>>>>> That said, 7 seconds is quite long.
>>>>>>> 
>>>>>>> 5,000 rows is excessive, and probably where your issue is. You're
>>>>>>> having to go out and fetch the docs across the wire. Perhaps there
>>>>>>> is some batching that could be done there; I don't know whether this
>>>>>>> is one document per request or not.
>>>>>>> 
>>>>>>> Why 5K docs?
>>>>>>> 
>>>>>>> Best,
>>>>>>> Erick
>>>>>>> 
>>>>>>> 
>>>>>>> On Tue, Oct 29, 2013 at 2:54 AM, Elran Dvir <elr...@checkpoint.com>
>>>>>>> 
>>>>>> wrote:
>>>>>> 
>>>>>>> Hi all,
>>>>>>>> 
>>>>>>>> I am using Solr 4.4 with multi cores. One core (called template)
>>>>>>>> is my "routing" core.
>>>>>>>> 
>>>>>>>> When I run
>>>>>>>> 
>>>>>>>> http://127.0.0.1:8983/solr/template/select?rows=5000&q=*:*&shards=127.0.0.1:8983/solr/core1,
>>>>>>>> it consistently takes about 7s.
>>>>>>>> When I run
>>>>>>>> http://127.0.0.1:8983/solr/core1/select?rows=5000&q=*:*, it
>>>>>>>> consistently takes about 40ms.
>>>>>>>> 
>>>>>>>> I profiled the distributed query.
>>>>>>>> This is the distributed query process (I hope the terms are accurate):
>>>>>>>> When Solr identifies a distributed query, it sends the query to the
>>>>>>>> shard and gets the matching shard docs. Then it sends another query to
>>>>>>>> the shard to fetch the Solr documents.
>>>>>>>> Most time is spent in this last stage, in the "process" method of
>>>>>>>> "QueryComponent", in:
>>>>>>>> 
>>>>>>>> for (int i = 0; i < idArr.size(); i++) {
>>>>>>>>     int id = req.getSearcher().getFirstMatch(
>>>>>>>>             new Term(idField.getName(),
>>>>>>>>                      idField.getType().toInternal(idArr.get(i))));
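>>>>>>>> (For each returned unique key, this does one term-dictionary lookup to
>>>>>>>> map the external id back to an internal Lucene docid.)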
>>>>>>>> 
>>>>>>>> How can I make my distributed query as fast as the direct one?
>>>>>>>> 
>>>>>>>> Thanks.
>>>>>>>> 
>>>>>>>> 
