Thanks! We made variants of this and a couple of other files. As to why we have the same document in different shards with different contents: once you hit a certain index size and ingest rate, it is easiest to create a series of indexes and leave the older ones alone. In the future, please consider this as a legitimate use case instead of simply a mistake.
Thanks again,

Lance

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley
Sent: Saturday, September 13, 2008 5:50 AM
To: solr-user@lucene.apache.org
Subject: Re: Adding bias to Distributed search feature?

On Thu, Sep 11, 2008 at 10:31 PM, Lance Norskog <[EMAIL PROTECTED]> wrote:
> Is it possible to add a bias to the ordering in the distributed search
> feature? That is, if the search finds the same content in two
> different indexes, it always favors the document from the first index over the second.

Handling duplicates is not currently done as a feature, but as a check against a mistake. It's not currently deterministic... first one returned will win. Here's the relevant code from QueryComponent:

    String prevShard = uniqueDoc.put(id, srsp.getShard());
    if (prevShard != null) {
      // duplicate detected
      numFound--;

      // For now, just always use the first encountered since we can't currently
      // remove the previous one added to the priority queue. If we switched
      // to the Java5 PriorityQueue, this would be easier.
      continue;
      // make which duplicate is used deterministic based on shard
      // if (prevShard.compareTo(srsp.shard) >= 0) {
      //   TODO: remove previous from priority queue
      //   continue;
      // }
    }
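
For anyone who needs the biased behaviour asked about in this thread, one possible approach is to deduplicate by id before ranking, so that nothing ever has to be removed from a priority queue. The sketch below is a standalone illustration and not Solr's QueryComponent code: the class ShardBiasMerge, the Hit type, and the shardOrder list are hypothetical names, and it assumes all hits from all shards fit in memory before merging.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Solr.
class ShardBiasMerge {

    /** One search hit as returned by a single shard. */
    static final class Hit {
        final String id;
        final String shard;
        final float score;

        Hit(String id, String shard, float score) {
            this.id = id;
            this.shard = shard;
            this.score = score;
        }
    }

    /** Position of a shard in the preference list; unknown shards rank last. */
    static int rank(String shard, List<String> shardOrder) {
        int i = shardOrder.indexOf(shard);
        return i < 0 ? Integer.MAX_VALUE : i;
    }

    /**
     * Merges hits from all shards. When two hits share an id, the hit from the
     * shard listed earlier in shardOrder wins, regardless of arrival order.
     */
    static List<Hit> merge(List<Hit> hits, List<String> shardOrder) {
        Map<String, Hit> best = new LinkedHashMap<>();
        for (Hit hit : hits) {
            Hit prev = best.get(hit.id);
            if (prev == null || rank(hit.shard, shardOrder) < rank(prev.shard, shardOrder)) {
                best.put(hit.id, hit);
            }
        }
        List<Hit> merged = new ArrayList<>(best.values());
        // Rank only after deduplication: highest score first, ties broken by id.
        merged.sort(Comparator.comparingDouble((Hit h) -> -h.score)
                              .thenComparing(h -> h.id));
        return merged;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit("doc1", "old", 0.9f),   // arrives first, but from the less preferred shard
                new Hit("doc1", "new", 0.5f),
                new Hit("doc2", "old", 0.7f));
        for (Hit h : merge(hits, Arrays.asList("new", "old"))) {
            System.out.println(h.id + " from " + h.shard + " score " + h.score);
        }
    }
}
```

The size of the merged list plays the role of the corrected numFound here. The trade-off is memory: this holds every candidate hit until all shards have responded, which the priority-queue approach in the quoted code avoids.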