Duplicate unique ID in implicit collection - Illegal?

2014-12-10 Thread Damien Dykman
Hi all,

With an implicit collection, is it legal to index the same document
(same unique ID) in 2 different shards? I know, it kind of defeats the
purpose of having a unique ID...

The reason I'm doing this is that I want to move a single document
from one shard to another. During the transition period, I'd use a search
criterion to specify which shard to target when looking up that document.
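
To make the shard-targeting idea concrete, here is a minimal Python sketch that only builds the request URL. It assumes the standard `shards` request parameter is used to restrict the query to one logical shard; the host, collection name, shard name, and document ID are all hypothetical placeholders, not values from my setup.

```python
from urllib.parse import urlencode

def shard_query_url(base_url, collection, query, shard, rows=10):
    """Build a /select URL restricted to a single logical shard."""
    params = {
        "q": query,
        "shards": shard,  # limit the distributed request to this shard
        "rows": rows,
        "wt": "json",
    }
    return f"{base_url}/{collection}/select?{urlencode(params)}"

# Hypothetical host, collection, shard, and document ID.
url = shard_query_url(
    "http://localhost:8983/solr", "mycollection", "id:doc42", "shard2"
)
print(url)
```

Only the URL is constructed here; sending the request and handling the response are left out on purpose.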

At search time, I do notice some weird behavior. The facets do take the
duplicates into account, but the number of results varies, for instance
depending on the rows=xx parameter. But that doesn't surprise me too
much, given the non-uniqueness of the unique ID.

So my actual question is the following: if my search query guarantees
there will be no duplicate matches, is my search result going to be
consistent? That's assuming it's legal to have duplicates across
shards from an indexing point of view.
 
Thanks,
Damien


Re: Duplicate unique ID in implicit collection - Illegal?

2014-12-10 Thread Alexandre Rafalovitch
On 10 December 2014 at 10:53, Damien Dykman damien.dyk...@gmail.com wrote:
> The facets do take into
> account the duplicate nature but the number of results varies, for
> instance depending on parameter rows=xx.

The facets take deleted but not yet expunged documents (those in
segments that have not been merged yet) into account. This is one of
the limitations of Lucene's write-once segment architecture. So you
might be seeing this happen independently of your question.

Regards,
   Alex.


Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


Re: Duplicate unique ID in implicit collection - Illegal?

2014-12-10 Thread Chris Hostetter

: With an implicit collection, is it legal to index the same document
: (same unique ID) in 2 different shards? I know, it kind of defeats the
: purpose of having a unique ID...

Each doc (defined by uniqueKey) must exist in one and only one shard ... 
when this constraint is violated, you'll start to get undefined behavior 
at request time.
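
As a rough illustration of that constraint, here is a minimal Python sketch that reports uniqueKey values present in more than one shard. It assumes the IDs have already been collected per shard by some other means (for example, one non-distributed query per shard with `distrib=false`); the shard names and IDs below are made-up sample data.

```python
from collections import defaultdict

def find_cross_shard_duplicates(ids_by_shard):
    """Return {doc_id: sorted shard names} for IDs seen in more than one shard."""
    shards_by_id = defaultdict(set)
    for shard, ids in ids_by_shard.items():
        for doc_id in ids:
            shards_by_id[doc_id].add(shard)
    return {
        doc_id: sorted(shards)
        for doc_id, shards in shards_by_id.items()
        if len(shards) > 1
    }

# Made-up sample data: "c" violates the one-shard-per-doc constraint.
sample = {"shard1": ["a", "b", "c"], "shard2": ["c", "d"]}
print(find_cross_shard_duplicates(sample))  # {'c': ['shard1', 'shard2']}
```

An empty result means every ID in the sample lives in exactly one shard.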

: So my actual question is the following: if my search query guarantees
: there will be no duplicate matches, is my search result going to be
: consistent? That's assuming it's legal to have duplicates across
: shards from an indexing point of view.

The answer is "probably", but that's just an implementation detail at
the moment: Solr does its best to account for weird situations like this,
but nothing in the architecture / design guarantees that for you in the
future.


-Hoss
http://www.lucidworks.com/