[ https://issues.apache.org/jira/browse/SOLR-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13657460#comment-13657460 ]

Yonik Seeley edited comment on SOLR-4744 at 5/15/13 12:26 PM:
--------------------------------------------------------------

Nice job tracking that down.

bq. Replicate to sub shard leader synchronously (before local update)

This seems like the right fix (I had thought it was this way already).  This 
should include returning failure to the client, of course.
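
To make the ordering concrete, here is a minimal sketch of "forward to the sub shard leader first, then update locally, and fail the request if the forward fails". It is not the actual DistributedUpdateProcessor code: the SubShardLeader interface, the ParentLeader class, and the in-memory maps are hypothetical stand-ins chosen only to illustrate the idea.

```java
import java.util.HashMap;
import java.util.Map;

public class SyncSubShardForwarding {

    /** Hypothetical stand-in for a sub shard leader; not a Solr class. */
    interface SubShardLeader {
        void add(String id, long version) throws Exception;
    }

    /** Hypothetical parent shard leader that keeps its "index" in a map. */
    static class ParentLeader {
        final Map<String, Long> localIndex = new HashMap<>();
        private final SubShardLeader subShard;

        ParentLeader(SubShardLeader subShard) {
            this.subShard = subShard;
        }

        /**
         * Forwards the add to the sub shard leader and blocks until it
         * returns. The local update happens only after the forward succeeds.
         * Returns false (failure reported to the client) otherwise.
         */
        boolean add(String id, long version) {
            try {
                subShard.add(id, version); // synchronous, before local update
            } catch (Exception e) {
                return false; // local index is left untouched
            }
            localIndex.put(id, version);
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Long> subIndex = new HashMap<>();

        // Sub shard leader that accepts the update.
        ParentLeader healthy = new ParentLeader((id, v) -> subIndex.put(id, v));
        System.out.println("healthy add accepted: " + healthy.add("169", 1L));

        // Sub shard leader that rejects the update.
        ParentLeader broken = new ParentLeader((id, v) -> {
            throw new IllegalStateException("sub shard leader unavailable");
        });
        System.out.println("broken add accepted: " + broken.add("169", 2L));
        System.out.println("broken local index: " + broken.localIndex);
    }
}
```

With this ordering the parent never holds a document that the sub shard leader has not already accepted, and the client sees the failure instead of a silent divergence.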

                
      was (Author: ysee...@gmail.com):
    Nice job tracking that down.

bq. Replicate to sub shard leader synchronously (before local update)

This seems like the right fix (I had thought it was this way already).
                  
> Version conflict error during shard split test
> ----------------------------------------------
>
>                 Key: SOLR-4744
>                 URL: https://issues.apache.org/jira/browse/SOLR-4744
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.3
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Shalin Shekhar Mangar
>            Priority: Minor
>             Fix For: 4.4
>
>
> ShardSplitTest fails sometimes with the following error:
> {code}
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.861; 
> org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state 
> invoked for collection: collection1
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.861; 
> org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state shard1 
> to inactive
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.861; 
> org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state 
> shard1_0 to active
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.861; 
> org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state 
> shard1_1 to active
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.873; 
> org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp= 
> path=/update params={wt=javabin&version=2} {add=[169 (1432319507166134272)]} 
> 0 2
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; 
> org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: 
> WatchedEvent state:SyncConnected type:NodeDataChanged 
> path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; 
> org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: 
> WatchedEvent state:SyncConnected type:NodeDataChanged 
> path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; 
> org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: 
> WatchedEvent state:SyncConnected type:NodeDataChanged 
> path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; 
> org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: 
> WatchedEvent state:SyncConnected type:NodeDataChanged 
> path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; 
> org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: 
> WatchedEvent state:SyncConnected type:NodeDataChanged 
> path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; 
> org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: 
> WatchedEvent state:SyncConnected type:NodeDataChanged 
> path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.884; 
> org.apache.solr.update.processor.LogUpdateProcessor; 
> [collection1_shard1_1_replica1] webapp= path=/update 
> params={distrib.from=http://127.0.0.1:41028/collection1/&update.distrib=FROMLEADER&wt=javabin&distrib.from.parent=shard1&version=2}
>  {} 0 1
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.885; 
> org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp= 
> path=/update 
> params={distrib.from=http://127.0.0.1:41028/collection1/&update.distrib=FROMLEADER&wt=javabin&distrib.from.parent=shard1&version=2}
>  {add=[169 (1432319507173474304)]} 0 2
> [junit4:junit4]   1> ERROR - 2013-04-14 19:05:26.885; 
> org.apache.solr.common.SolrException; shard update error StdNode: 
> http://127.0.0.1:41028/collection1_shard1_1_replica1/:org.apache.solr.common.SolrException:
>  version conflict for 169 expected=1432319507173474304 actual=-1
> [junit4:junit4]   1>  at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:404)
> [junit4:junit4]   1>  at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
> [junit4:junit4]   1>  at 
> org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:332)
> [junit4:junit4]   1>  at 
> org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:306)
> [junit4:junit4]   1>  at 
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> [junit4:junit4]   1>  at 
> java.util.concurrent.FutureTask.run(FutureTask.java:166)
> [junit4:junit4]   1>  at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> [junit4:junit4]   1>  at 
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> [junit4:junit4]   1>  at 
> java.util.concurrent.FutureTask.run(FutureTask.java:166)
> [junit4:junit4]   1>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
> [junit4:junit4]   1>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> [junit4:junit4]   1>  at java.lang.Thread.run(Thread.java:679)
> [junit4:junit4]   1> 
> [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.886; 
> org.apache.solr.update.processor.DistributedUpdateProcessor; try and ask 
> http://127.0.0.1:41028 to recover
> {code}
> The failure is hard to reproduce and very timing sensitive. These kinds of 
> failures have always been seen right after the "updateshardstate" action.
