Re: Distributed Search

Mark Miller Wed, 25 Feb 2009 09:31:02 -0800

Fair enough. We should update the Wiki then? I think it currently reads as if it's a supported feature rather than something you should avoid.


--
- Mark


http://www.lucidimagination.com



Yonik Seeley wrote:

On Wed, Feb 25, 2009 at 11:52 AM, Mark Miller <markrmil...@gmail.com> wrote:

"You are not supposed to have duplicates" is a bit strong - I was reading
too much into something Yonik had mentioned in the past. It looks like it's
supposed to become more useful:


Well, perhaps slightly more deterministic so that two queries return
the same results.
I think we should stick with the position that duplicate docs in
shards is an error, but that we handle it gracefully w/o blowing up.
Things like facet counts, paging, etc, will be slightly off.
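To illustrate why counts drift: a minimal sketch, assuming the coordinator only receives a number per shard and adds them up, so it has no way to notice that the same id was counted twice. The class and method names here are made up for the example and are not Solr code.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FacetDuplicateSketch {

    /** Adds up per-shard counts; a doc present in two shards is counted twice. */
    static int summedCount(List<List<String>> idsPerShard) {
        int total = 0;
        for (List<String> ids : idsPerShard) {
            total += ids.size();
        }
        return total;
    }

    /** Counts distinct ids across all shards, i.e. the number one would expect. */
    static int distinctCount(List<List<String>> idsPerShard) {
        Set<String> seen = new HashSet<>();
        for (List<String> ids : idsPerShard) {
            seen.addAll(ids);
        }
        return seen.size();
    }
}
```

With one id present in two shards, the summed count comes out one higher than the distinct count, which is the "slightly off" above.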

-Yonik
Lucene/Solr? http://www.lucidimagination.com

I think Yonik might have to clear this up, but it looks like the current
implementation is not deterministic, and he has it listed as a TODO:

          // make which duplicate is used deterministic based on shard
          // if (prevShard.compareTo(srsp.shard) >= 0) {
          //  TODO: remove previous from priority queue
          //  continue;
          // }
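For what it's worth, here is a minimal standalone sketch of one way such a tie-break could work. It is not the Solr implementation: ShardDoc, merge, and the rule "the shard whose name sorts first wins" are all assumptions made for illustration, and it uses a plain map rather than the priority queue the TODO refers to.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DedupMergeSketch {

    /** Hypothetical hit: unique key, the shard that returned it, and its score. */
    static final class ShardDoc {
        final String id;
        final String shard;
        final float score;

        ShardDoc(String id, String shard, float score) {
            this.id = id;
            this.shard = shard;
            this.score = score;
        }
    }

    /**
     * Merges hits from all shards, keeping one document per id.
     * For a duplicate id, the copy from the shard whose name sorts first
     * is kept, so the result does not depend on which shard answered first.
     */
    static List<ShardDoc> merge(List<ShardDoc> hits) {
        Map<String, ShardDoc> byId = new LinkedHashMap<>();
        for (ShardDoc doc : hits) {
            ShardDoc prev = byId.get(doc.id);
            if (prev != null && prev.shard.compareTo(doc.shard) <= 0) {
                continue; // the copy we already hold comes from the preferred shard
            }
            byId.put(doc.id, doc);
        }
        List<ShardDoc> merged = new ArrayList<>(byId.values());
        // Highest score first; ties broken by id so the order is stable too.
        merged.sort((a, b) -> {
            int byScore = Float.compare(b.score, a.score);
            return byScore != 0 ? byScore : a.id.compareTo(b.id);
        });
        return merged;
    }
}
```

Feeding the same hits in a different arrival order gives the same merged list, which is the property the commented-out check seems to be after.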


Mark Miller wrote:

I don't think you're supposed to have duplicate keys? I think it's supposed
to work more as a graceful failure than a feature you should count on. IDs
should be unique across the collection.

OK, now I'm confused: if the shard the document comes from is
non-deterministic, how can you use this 'trick'? (Except that the first
shard, being smaller, usually responds faster, which would mean it works
most of the time (BAD!).) Or was Koji's memory incorrect, and the shard
mentioned first is always the authoritative shard when duplicate keys are
encountered?

Regards,

gwk

--
- Mark

http://www.lucidimagination.com
