Thanks!  We made variants of this and a couple of other files.

As to why we have the same document in different shards with different
contents: once you hit a certain index size and ingest rate, it is easiest
to create a series of indexes and leave the older ones alone. In the future,
please consider this as a legitimate use case instead of simply a mistake.

Thanks again,

Lance

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Yonik Seeley
Sent: Saturday, September 13, 2008 5:50 AM
To: solr-user@lucene.apache.org
Subject: Re: Adding bias to Distributed search feature?

On Thu, Sep 11, 2008 at 10:31 PM, Lance Norskog <[EMAIL PROTECTED]> wrote:
> Is it possible to add a bias to the ordering in the distributed search 
> feature? That is, if the search finds the same content in two 
> different indexes, it always favors the document from the first index over
> the second.

Handling duplicates is not currently done as a feature, but as a check
against a mistake.
It's not currently deterministic... first one returned will win.

Here's the relevant code from QueryComponent:

          String prevShard = uniqueDoc.put(id, srsp.getShard());
          if (prevShard != null) {
            // duplicate detected
            numFound--;

            // For now, just always use the first encountered since we can't currently
            // remove the previous one added to the priority queue.  If we switched
            // to the Java5 PriorityQueue, this would be easier.
            continue;
            // make which duplicate is used deterministic based on shard
            // if (prevShard.compareTo(srsp.shard) >= 0) {
            //  TODO: remove previous from priority queue
            //  continue;
            // }
          }
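
As a rough illustration of the deterministic variant hinted at in the commented-out lines above, here is a minimal standalone sketch. It is not Solr code: ShardBiasDedup, Hit, and merge are hypothetical names, and the rule that the lexicographically smaller shard name wins is an assumption chosen only to make the outcome independent of response order. It also sidesteps the priority-queue removal problem by collecting winners in a map first, which the real merge code does not do.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Solr: resolves duplicate document ids
// across shards with a fixed bias instead of "first response wins".
final class ShardBiasDedup {

    // Hypothetical stand-in for one document returned by one shard.
    static final class Hit {
        final String id;
        final String shard;
        final float score;

        Hit(String id, String shard, float score) {
            this.id = id;
            this.shard = shard;
            this.score = score;
        }
    }

    // Keeps exactly one hit per id. When the same id arrives from two
    // shards, the hit whose shard name sorts first is kept (assumed rule),
    // regardless of the order in which the shard responses arrived.
    static List<Hit> merge(List<Hit> hits) {
        Map<String, Hit> winners = new LinkedHashMap<>();
        for (Hit hit : hits) {
            Hit prev = winners.get(hit.id);
            if (prev == null || hit.shard.compareTo(prev.shard) < 0) {
                // Replacing the value of an existing key keeps that key's
                // original position in a LinkedHashMap.
                winners.put(hit.id, hit);
            }
        }
        return new ArrayList<>(winners.values());
    }
}
```

Calling ShardBiasDedup.merge on the combined shard responses yields one hit per id. Sorting by score and adjusting numFound would still have to happen afterwards.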
