In QueryComponent.mergeIds, a document whose uniqueKey duplicates that of an already merged document is removed. The current implementation keeps the first one encountered:

    String prevShard = uniqueDoc.put(id, srsp.getShard());
    if (prevShard != null) {
      // duplicate detected
      numFound--;
      collapseList.remove(id+"");
      docs.set(i, null); // remove it.
      // For now, just always use the first encountered since we can't currently
      // remove the previous one added to the priority queue. If we switched
      // to the Java5 PriorityQueue, this would be easier.
      continue;
      // make which duplicate is used deterministic based on shard
      // if (prevShard.compareTo(srsp.shard) >= 0) {
      //   TODO: remove previous from priority queue
      //   continue;
      // }
    }
mergeIds iterates over the shard responses with

    for (ShardResponse srsp : sreq.responses)

but the order of sreq.responses may differ from one request to the next. That is, the results from shard1 and shard2 may swap positions. So when a uniqueKey (such as a url) occurs in both shard1 and shard2, which copy is used is unpredictable. The scores of the two copies differ because idf differs between the shards, so the same query can return different results. One possible solution is to sort the ShardResponse objects by shard name before merging.
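The proposed fix can be sketched as follows. This is only an illustration of the idea, not a patch against Solr: FakeShardResponse is a hypothetical stand-in for Solr's ShardResponse that carries just a shard name and the uniqueKey values it returned, and this mergeIds is a simplified assumption that ignores scores and the priority queue. It sorts a copy of the responses by shard name and then keeps the first copy encountered, as the current code does.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DeterministicMergeSketch {

    // Hypothetical stand-in for Solr's ShardResponse: a shard name plus the
    // uniqueKey values of the documents that shard returned.
    static final class FakeShardResponse {
        final String shard;
        final List<String> ids;

        FakeShardResponse(String shard, List<String> ids) {
            this.shard = shard;
            this.ids = ids;
        }
    }

    // Simplified merge (an assumption, not Solr's code): returns a map from
    // uniqueKey to the shard whose copy of the document is kept.
    static Map<String, String> mergeIds(List<FakeShardResponse> responses) {
        // Sort a copy by shard name so the iteration order no longer depends
        // on the order in which the shard responses happened to arrive.
        List<FakeShardResponse> ordered = new ArrayList<>(responses);
        ordered.sort(Comparator.comparing((FakeShardResponse r) -> r.shard));

        Map<String, String> uniqueDoc = new LinkedHashMap<>();
        for (FakeShardResponse srsp : ordered) {
            for (String id : srsp.ids) {
                // Keep the first copy encountered; later duplicates are dropped.
                uniqueDoc.putIfAbsent(id, srsp.shard);
            }
        }
        return uniqueDoc;
    }
}
```

With this ordering, a url present in both shard1 and shard2 is always taken from shard1, whichever response comes first in the input list.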