Hi everyone,

There are a couple of notes on the limitations of distributed search at http://wiki.apache.org/solr/DistributedSearch which I'm having trouble understanding.
1. "When duplicate doc IDs are received, Solr chooses the first doc and discards subsequent ones."

"Received" here is from the perspective of the base Solr instance at query time, right? I.e., if you inadvertently indexed two versions of a document with the same unique ID but different contents to two shards, then at query time the "first" document (putting aside for the moment what exactly "first" means) would win. Am I reading this right?

2. "The index could change between stages, e.g. a document that matched a query and was subsequently changed may no longer match but will still be retrieved."

I have no idea what this second statement means.

And one other question about shards:

3. The examples I've seen documented do not illustrate sharded, multicore setups, only sharded monolithic cores. I assume sharding works with multicore as well (i.e. the two issues are orthogonal). Is this right?

Any help on interpreting the above would be much appreciated.

Thank you,
-Babak