Re: Distributed Search

Yonik Seeley Wed, 25 Feb 2009 09:22:39 -0800

On Wed, Feb 25, 2009 at 11:52 AM, Mark Miller <markrmil...@gmail.com> wrote:
> "You are not supposed to have duplicates" is a bit strong - I was reading too
> much into something Yonik had mentioned in the past. It looks like it's supposed
> to become more useful:


Well, perhaps slightly more deterministic, so that running the same query
twice returns the same results.
I think we should stick with the position that duplicate docs across
shards are an error, but that we handle them gracefully w/o blowing up.
Things like facet counts, paging, etc. will be slightly off.
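
To make "deterministic" concrete, here is a minimal sketch of one way a merge
step could pick between duplicates. It is not the actual Solr code: the names
ShardMerge, Hit, and merge are hypothetical, and the rule that the shard whose
name sorts first wins is an assumption, loosely modeled on the shard
compareTo() check in the commented-out TODO quoted below.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of deterministic duplicate handling; not Solr's merge code. */
final class ShardMerge {

    /** One hit as returned by a shard: shard name, unique key, and score. */
    static final class Hit {
        final String shard;
        final String id;
        final float score;

        Hit(String shard, String id, float score) {
            this.shard = shard;
            this.id = id;
            this.score = score;
        }
    }

    /**
     * Merges hits from all shards, keeping exactly one hit per unique key.
     * Assumption: when two shards return the same id, the hit from the shard
     * whose name sorts first is kept, so the result does not depend on which
     * shard happened to respond first.
     */
    static List<Hit> merge(List<Hit> hits) {
        Map<String, Hit> byId = new LinkedHashMap<>();
        for (Hit h : hits) {
            // Resolve a duplicate id by comparing shard names, not arrival order.
            byId.merge(h.id, h, (prev, cur) ->
                    prev.shard.compareTo(cur.shard) <= 0 ? prev : cur);
        }
        List<Hit> merged = new ArrayList<>(byId.values());
        // Highest score first; ties are broken by id to keep the order stable.
        merged.sort(Comparator.comparingDouble((Hit h) -> -h.score)
                .thenComparing((Hit h) -> h.id));
        return merged;
    }
}
```

With this rule, the same set of hits produces the same merged list no matter
which shard responds first. Facet counts and the total hit count would still
be off when duplicates exist, since those are summed per shard before any
such merge.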

-Yonik
Lucene/Solr? http://www.lucidimagination.com



> I think Yonik might have to clear this up, but it looks like the current
> implementation is not deterministic, and he has it listed as a TODO:
>
>           // make which duplicate is used deterministic based on shard
>           // if (prevShard.compareTo(srsp.shard) >= 0) {
>           //  TODO: remove previous from priority queue
>           //  continue;
>           // }
>
>
> Mark Miller wrote:
>>
>> I don't think you're supposed to have duplicate keys? I think it's supposed
>> to work more as a graceful failure than a feature you should count on. IDs
>> should be unique across the collection.
>>
>>>>
>>>
>>> Ok, now I'm confused: if the shard the document comes from is
>>> non-deterministic, how can you use this 'trick'? (Except that the first
>>> shard, which is smaller, usually responds faster, which would mean it'll
>>> work most of the time (BAD!).) Or was Koji's memory incorrect, and the
>>> shard mentioned first is always the authoritative shard when
>>> encountering duplicate keys?
>>>
>>> Regards,
>>>
>>> gwk
>>>
>>
>>
>
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
>
>
>
