On Mon, Sep 15, 2008 at 5:43 PM, Lance Norskog <[EMAIL PROTECTED]> wrote:
> As to why we have the same document in different shards with different
> contents: once you hit a certain index size and ingest rate, it is easiest
> to create a series of indexes and leave the older ones alone. In the future,
> please consider this as a legitimate use case instead of simply a mistake.
I think the issue with cleanly handling duplicates is that it would be much harder to do in the general case. For example, facet counts... we have no mechanism to take duplicates into account there, and I'm afraid people would expect it if it were considered legit to have dups. It seems best to evaluate the cost and complexity of handling dups on a case-by-case basis.

-Yonik
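
To make the facet-count concern concrete, here is a rough sketch in plain Python (not Solr code). The shard contents, the field values, and the "newest shard wins" rule are all assumptions made up for illustration. It shows that a merge step which only sees per-shard counts ends up counting a duplicated document twice, while a correct count would need the document ids behind every count.

```python
from collections import Counter

# Hypothetical data: each shard maps a document id to its facet field value.
# "doc2" lives in both shards with different contents (the case discussed above).
shard_old = {"doc1": "books", "doc2": "books"}
shard_new = {"doc2": "music", "doc3": "music"}


def naive_merged_facets(shards):
    """Sum per-shard facet counts; the merger never sees document ids."""
    total = Counter()
    for shard in shards:
        total.update(Counter(shard.values()))
    return total


def deduped_facets(shards):
    """Count each document id once.

    Assumption for this sketch: shards are listed oldest first, and the
    copy in the later shard replaces the copy in the earlier one.
    """
    merged = {}
    for shard in shards:
        merged.update(shard)
    return Counter(merged.values())


naive = naive_merged_facets([shard_old, shard_new])
deduped = deduped_facets([shard_old, shard_new])

print(dict(naive))    # doc2 is counted under both "books" and "music"
print(dict(deduped))  # doc2 is counted once, under "music"
```

The deduplicated version only works here because every document id is available in one place. Doing the same across real shards would mean shipping ids (or something equivalent) for every matching document, which is the kind of cost that seems better weighed case by case.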