[ https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821304#comment-13821304 ]
Mark Miller commented on SOLR-4260:
-----------------------------------

I have recently done a bit of work on making it easier to turn on some debug logging that I use for this type of thing. I also have some specific local Jenkins jobs I run on dedicated hardware to help track down problems - I have been collecting a lot of logs over the last few weeks.

There is a lot more that needs to be done though. I'm hoping to start a wiki page on how I have gone about tracking this type of thing down in the past so that perhaps it's easier for others to get involved. Hopefully Yonik can add any of his useful tricks to that as well.

> Inconsistent numDocs between leader and replica
> -----------------------------------------------
>
>                 Key: SOLR-4260
>                 URL: https://issues.apache.org/jira/browse/SOLR-4260
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 5.0
>         Environment: 5.0.0.2013.01.04.15.31.51
>            Reporter: Markus Jelsma
>            Priority: Critical
>             Fix For: 5.0
>
>         Attachments: 192.168.20.102-replica1.png,
> 192.168.20.104-replica2.png, clusterstate.png
>
>
> After wiping all cores and reindexing some 3.3 million docs from Nutch using
> CloudSolrServer we see inconsistencies between the leader and replica for
> some shards.
> Each core holds about 3.3k documents. For some reason 5 out of 10 shards have
> a small deviation in the number of documents. The leader and slave deviate
> by roughly 10-20 documents, not more.
> Results hopping ranks in the result set for identical queries got my
> attention: there were small IDF differences for exactly the same record,
> causing a record to shift positions in the result set. During those tests no
> records were indexed. Consecutive catch-all queries also return different
> numDocs values.
> We're running a 10 node test cluster with 10 shards and a replication factor
> of two and frequently reindex using a fresh build from trunk. I've not seen
> this issue for quite some time until a few days ago.
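
For anyone trying to reproduce the report, a minimal sketch of how per-core counts might be compared. It is not taken from the issue. It assumes each core is asked for a match-all count with `distrib=false`, so the request is answered by that core alone, and that the counts have already been collected into a dictionary. The helper names `count_query_url` and `find_deviations`, the core URL layout, and the sample numbers are all hypothetical.

```python
from urllib.parse import urlencode


def count_query_url(core_url):
    """Build a match-all count query answered by this core only.

    Assumption: core_url looks like http://host:8983/solr/<core>.
    """
    params = {"q": "*:*", "rows": 0, "wt": "json", "distrib": "false"}
    return core_url.rstrip("/") + "/select?" + urlencode(params)


def find_deviations(counts):
    """Return {shard: spread} for shards whose cores report different counts.

    counts maps shard name -> {core name: document count}.
    """
    deviations = {}
    for shard, per_core in counts.items():
        if not per_core:
            continue  # nothing reported for this shard
        values = list(per_core.values())
        spread = max(values) - min(values)
        if spread > 0:
            deviations[shard] = spread
    return deviations


if __name__ == "__main__":
    # Made-up counts for illustration only, not data from the cluster.
    sample = {
        "shard1": {"leader": 1000, "replica": 1000},
        "shard2": {"leader": 1000, "replica": 985},
    }
    print(count_query_url("http://localhost:8983/solr/collection1"))
    print(find_deviations(sample))
```

The count for each core would be read from `response.numFound` in the JSON reply. Running the comparison while no indexing is in progress avoids mistaking in-flight updates for a real inconsistency.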