[jira] [Updated] (SOLR-4260) Inconsistent numDocs between leader/replica

Markus Jelsma (JIRA) Fri, 04 Jan 2013 10:00:20 -0800

     [ https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Markus Jelsma updated SOLR-4260:
--------------------------------

    Description: 
After wiping all cores and reindexing some 3.3 million docs from Nutch using 
CloudSolrServer we see inconsistencies between the leader and replica for some 
shards.

Each core holds about 3.3k documents. For some reason 5 out of 10 shards have a 
small deviation in the number of documents. The leader and replica deviate by 
roughly 10-20 documents, not more.

Results hopping ranks in the result set for identical queries got my attention: 
there were small IDF differences for exactly the same record, causing it to 
shift positions in the result set. No records were indexed during those tests. 
Consecutive catch-all queries also return different numDocs values.

We're running a 10-node test cluster with 10 shards and a replication factor of 
two, and we frequently reindex using a fresh build from trunk. I had not seen 
this issue for quite some time until a few days ago.

  was:
After wiping all cores and reindexing some 3.3 million docs from Nutch using 
CloudSolrServer we see inconsistencies between the leader and replica for some 
shards.

Each core holds about 3.3k documents. For some reason 5 out of 10 shards have a 
small deviation in the number of documents. The leader and replica deviate by 
roughly 10-20 documents, not more.

Results hopping ranks in the result set for identical queries got my attention: 
there were small IDF differences for exactly the same record, causing it to 
shift positions in the result set. No records were indexed during those tests.

We're running a 10-node test cluster with 10 shards and a replication factor of 
two, and we frequently reindex using a fresh build from trunk. I had not seen 
this issue for quite some time until a few days ago.

    
> Inconsistent numDocs between leader/replica
> -------------------------------------------
>
>                 Key: SOLR-4260
>                 URL: https://issues.apache.org/jira/browse/SOLR-4260
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 5.0
>         Environment: 5.0.0.2013.01.04.15.31.51
>            Reporter: Markus Jelsma
>            Priority: Critical
>             Fix For: 5.0
>
>
> After wiping all cores and reindexing some 3.3 million docs from Nutch using 
> CloudSolrServer we see inconsistencies between the leader and replica for 
> some shards.
> Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
> a small deviation in the number of documents. The leader and replica deviate 
> by roughly 10-20 documents, not more.
> Results hopping ranks in the result set for identical queries got my 
> attention: there were small IDF differences for exactly the same record, 
> causing it to shift positions in the result set. No records were indexed 
> during those tests. Consecutive catch-all queries also return different 
> numDocs values.
> We're running a 10-node test cluster with 10 shards and a replication factor 
> of two, and we frequently reindex using a fresh build from trunk. I had not 
> seen this issue for quite some time until a few days ago.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
