On Aug 14, 2013, at 9:01 AM, Manuel Le Normand <manuel.lenorm...@gmail.com> wrote:
> Does this sound like the scenario that happened:
> By removing the index dir from replica 2 I also removed the tlog

Did you also remove the tlog dir? It's normally:

data/index
data/tlog

> from which the zookeeper extracts the version of the two replicas and
> decides which one should be elected to leader. As replica 2 had no tlog,
> the zk didn't have any way to compare the 2 registered replicas, so it
> just picked one of the replicas arbitrarily to lead, resulting in
> electing empty replicas.

If one had no tlog, it should have recovered from the one that still had a
tlog.

> How does the zookeeper compare the 2 tlogs to know which one is more
> recent? Does it not rely on the version number shown in the admin UI?

It looks at recent ids in the tlogs of both and compares them.

- Mark


> On Wed, Aug 14, 2013 at 11:00 AM, Manuel Le Normand <
> manuel.lenorm...@gmail.com> wrote:
>
>> Hello,
>> My Solr cluster runs on RH Linux with a Tomcat 7 servlet container.
>> numShards=40, replicationFactor=2, 40 servers, each hosting 2 replicas.
>> Solr 4.3.
>>
>> For experimental reasons I split my cluster into 2 sub-clusters, each
>> containing a single replica of each shard.
>> When connecting these sub-clusters back together the sync failed (more
>> than 100 docs indexed per shard), so a replication process started on
>> sub-cluster #2. Because of the transient storage the replication process
>> needs, I removed all the index from sub-cluster #2 before connecting it
>> back, then I connected sub-cluster #2's servers in 3-4 bulks to avoid
>> high disk load. The first bulk replications worked well, but after a
>> while an internal script pkilled all the Solr instances, some while
>> replicating. After starting the servlet back up I discovered the
>> disaster - on part of the replicas that were in a replicating stage
>> there was a wrong ZooKeeper leader election - good-state replicas
>> (sub-cluster #1) replicated from empty replicas (sub-cluster #2),
>> ending up removing all documents in these shards!!
>>
>> These are the logs from solr-prod32 (sub-cluster #2 - bad state) - the
>> shard1_replica1 is elected to be leader although it was not the leader
>> before the replication process (and shouldn't have the higher version
>> number):
>>
>> 2013-08-13 13:39:15.838 [INFO ] org.apache.solr.cloud.ShardLeaderElectionContext Enough replicas found to continue.
>> 2013-08-13 13:39:15.838 [INFO ] org.apache.solr.cloud.ShardLeaderElectionContext I may be the new leader - try and sync
>> 2013-08-13 13:39:15.839 [INFO ] org.apache.solr.cloud.SyncStrategy Sync replicas to http://solr-prod32:8080/solr/raw_shard1_replica1/
>> 2013-08-13 13:39:15.841 [INFO ] org.apache.solr.client.solrj.impl.HttpClientUtil Creating new http client, config:maxConnectionsPerHost=20&maxConnections=10000&connTimeout=30000&socketTimeout=30000&retry=false
>> 2013-08-13 13:39:15.844 [INFO ] org.apache.solr.update.PeerSync PeerSync: core=raw_shard1_replica1 url=http://solr-prod32:8080/solr START replicas=[http://solr-prod02:8080/solr/raw_shard1_replica2/] nUpdates=100
>> 2013-08-13 13:39:15.847 [INFO ] org.apache.solr.update.PeerSync PeerSync: core=raw_shard1_replica1 url=http://solr-prod32:8080/solr DONE. We have no versions. sync failed.
>> 2013-08-13 13:39:15.847 [INFO ] org.apache.solr.cloud.SyncStrategy Leader's attempt to sync with shard failed, moving to the next candidate
>> 2013-08-13 13:39:15.847 [INFO ] org.apache.solr.cloud.ShardLeaderElectionContext We failed sync, but we have no versions - we can't sync in that case - we were active before, so become leader anyway
>> 2013-08-13 13:39:15.847 [INFO ] org.apache.solr.cloud.ShardLeaderElectionContext I am the new leader: http://solr-prod32:8080/solr/raw_shard1_replica1/
>> 2013-08-13 13:39:15.847 [INFO ] org.apache.solr.common.cloud.SolrZkClient makePath: /collections/raw/leaders/shard1
>> 2013-08-13 13:39:17.423 [INFO ] org.apache.solr.common.cloud.ZkStateReader A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 40)
>>
>> While in solr-prod02 (sub-cluster #1 - good state) I get:
>>
>> 2013-08-13 13:39:15.671 [INFO ] org.apache.solr.cloud.ZkController publishing core=raw_shard1_replica2 state=down
>> 2013-08-13 13:39:15.671 [INFO ] org.apache.solr.cloud.ZkController numShards not found on descriptor - reading it from system property
>> 2013-08-13 13:39:15.673 [INFO ] org.apache.solr.core.CoreContainer registering core: raw_shard1_replica2
>> 2013-08-13 13:39:15.673 [INFO ] org.apache.solr.cloud.ZkController Register replica - core:raw_shard1_replica2 address: http://solr-prod02:8080/solr collection:raw shard:shard1
>> 2013-08-13 13:39:17.423 [INFO ] org.apache.solr.common.cloud.ZkStateReader A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 40)
>> 2013-08-13 13:39:17.480 [INFO ] org.apache.solr.cloud.ZkController We are http://solr-prod02:8080/solr/raw_shard1_replica2/ and leader is http://solr-prod32:8080/solr/raw_shard1_replica1/
>> 2013-08-13 13:39:17.481 [INFO ] org.apache.solr.cloud.ZkController No LogReplay needed for core=raw_shard1_replica2
>> 2013-08-13 13:39:17.481 [INFO ] org.apache.solr.cloud.ZkController Core needs to recover:raw_shard1_replica2
>> 2013-08-13 13:39:17.481 [INFO ] org.apache.solr.update.DefaultSolrCoreState Running recovery - first canceling any ongoing recovery
>> 2013-08-13 13:39:17.485 [INFO ] org.apache.solr.common.cloud.ZkStateReader Updating cloud state from ZooKeeper...
>> 2013-08-13 13:39:17.485 [INFO ] org.apache.solr.cloud.RecoveryStrategy Starting recovery process. core=raw_shard1_replica2
>>
>> Why was the leader elected wrongly??
>>
>> Thanks
>>
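
A rough sketch of the decision that the "we were active before, so become
leader anyway" line above describes, for readers trying to follow why the
empty replica won the election. This is illustrative Java only, not the
actual Solr 4.3 ShardLeaderElectionContext code; the class, method and
variable names below are invented:

    import java.util.Collections;
    import java.util.List;

    // Illustrative only - not the real Solr code. It mirrors what the logs
    // above describe: a leader candidate first tries a PeerSync against the
    // other replica; if its own tlog holds no versions the sync is reported
    // as failed, yet the candidate may still take leadership because it was
    // marked active before the restart.
    public class LeaderElectionSketch {

        static boolean shouldBecomeLeader(List<Long> myRecentTlogVersions,
                                          boolean wasActiveBeforeRestart,
                                          boolean peerSyncSucceeded) {
            if (peerSyncSucceeded) {
                return true;                    // in sync with the other replica
            }
            if (myRecentTlogVersions.isEmpty()) {
                // "We failed sync, but we have no versions - we can't sync in
                // that case - we were active before, so become leader anyway"
                return wasActiveBeforeRestart;  // an empty replica can still win
            }
            return false;                       // sync failed and we do have versions
        }

        public static void main(String[] args) {
            // The wiped replica: empty tlog, sync failed, but it was active
            // before the pkill, so it becomes leader - matching the
            // solr-prod32 log above.
            System.out.println(shouldBecomeLeader(
                    Collections.<Long>emptyList(), true, false));  // prints: true
        }
    }

The real election flow has more to it (for example the "moving to the next
candidate" step logged by SyncStrategy), but the empty-tlog branch is the
one that mattered here: once the wiped replica on solr-prod32 became leader,
the intact replica on solr-prod02 went into recovery against it, which is
how the documents in those shards were lost.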