Well, I don't know what happened, but suddenly the shard replicated. I was trying to copy the index to a new one with stream2es, and when the copy started the cluster state turned green.
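For anyone landing here later: stream2es was the tool used for the copy. Assuming its documented `es` source/target mode, and with placeholder host and index names, the copy would look roughly like this (a sketch, not the exact commands used):

```shell
# Sketch: copy MY_INDEX into a fresh index with stream2es.
# Host and index names are placeholders; adjust --source/--target to taste.
curl -O https://download.elasticsearch.org/stream2es/stream2es && chmod +x stream2es
./stream2es es \
  --source http://localhost:9200/MY_INDEX \
  --target http://localhost:9200/MY_INDEX_COPY
```

Copying every document re-indexes it through the target's primaries, which is presumably why the copy coincided with the replica recovering.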
The problem, for me, is solved.

On Thursday, July 24, 2014 11:13:41 AM UTC-3, Antonio Augusto Santos wrote:
>
> Dear,
>
> I upgraded from 1.2.2 to 1.3.0 today and ran into one issue.
> I have a 3-node cluster (one node without data) running on CentOS 6.5. I did
> a rolling upgrade: first I upgraded the no-data node (server_0), then
> shut down server_1, upgraded it (with the RPM), and restarted it. All shards
> came back to life with no problem, and my cluster was green.
> Then I shut down node 3 (server_2), upgraded it, and restarted ES. After that,
> almost everything is back to normal except one shard of one index, and
> I'm getting the following error on server_2:
>
> [2014-07-24 10:47:59,575][WARN ][cluster.action.shard ] [server_2] [MY_INDEX][0] received shard failed for [MY_INDEX][0], node[Y0EJ2oh2QI-cdh2Jxi9z4A], [R], s[INITIALIZING], indexUUID [OR_0aHy6TIiHZVK_9PaPBQ],
> reason [Failed to start shard, message [RecoveryFailedException[[MY_INDEX][0]: Recovery failed from [server_1][pkRqLLmtS8iUxF80uQJzFw][server_1][inet[/XXX.XXX.XXX.001:9300]]{master=true} into [server_2][Y0EJ2oh2QI-cdh2Jxi9z4A][server_2][inet[/XXX.XXX.XXX.002:9300]]{master=true}];
> nested: RemoteTransportException[[server_1][inet[/XXX.XXX.XXX.001:9300]][index/shard/recovery/startRecovery]];
> nested: RecoveryEngineException[[MY_INDEX][0] Phase[2] Execution failed];
> nested: RemoteTransportException[[server_2][inet[/XXX.XXX.XXX.002:9300]][index/shard/recovery/prepareTranslog]];
> nested: EngineCreationFailureException[[MY_INDEX][0] failed to open reader on writer];
> nested: FileNotFoundException[No such file [_drr_Lucene45_0.dvm]]; ]]
>
> From what I gather from the message, server_1 is trying to send
> *_drr_Lucene45_0.dvm* to server_2 but can't find it. I looked in
> */var/lib/elasticsearch/MY_CLUSTER/nodes/0/indices/MY_INDEX/0/index* on
> server_1, and there is no such file, but there is a *_drr_Lucene49_0.dvm*.
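As an editorial aside: those Lucene docvalues filenames encode a segment name, a codec format, and a generation, which is what makes the mismatch visible. A small illustrative helper (the function and regex are mine, not part of any Elasticsearch tool) shows how to pull the parts apart and compare what recovery asked for against what is actually on disk:

```python
import re

# Lucene docvalues files are named _<segment>_<Format>_<gen>.<ext>,
# e.g. _drr_Lucene45_0.dvm. This hypothetical helper splits one apart.
DV_FILE = re.compile(
    r"^_(?P<segment>[^_]+)_(?P<codec>Lucene\d+)_(?P<gen>\d+)\.(?P<ext>dvd|dvm)$"
)

def parse_dv_filename(name):
    m = DV_FILE.match(name)
    if not m:
        raise ValueError("not a docvalues file: %s" % name)
    return m.groupdict()

wanted  = parse_dv_filename("_drr_Lucene45_0.dvm")  # what recovery asked for
on_disk = parse_dv_filename("_drr_Lucene49_0.dvm")  # what server_1 actually has

# Same segment (_drr), same generation (0), but a different docvalues codec:
# the segment on server_1 was written with the newer Lucene49 format.
assert wanted["segment"] == on_disk["segment"]
assert wanted["gen"] == on_disk["gen"]
assert wanted["codec"] != on_disk["codec"]
```

That pattern (the upgraded primary holding a Lucene49-format file while the recovering replica still expects the Lucene45 name) is consistent with the two nodes briefly disagreeing about the segment's format during the rolling upgrade.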
> I've checked, and both servers are running 1.3.0:
>
> # ps -ef | grep elastic
> 498 121131 1 55 10:46 ? 00:11:41 /usr/bin/java -Xms5g -Xmx5g -Xss256k
>   -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
>   -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
>   -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC
>   -Djna.tmpdir=/usr/share/elasticsearch/tmp
>   -Djava.io.tmpdir=/usr/share/elasticsearch/tmp -Delasticsearch
>   -Des.pidfile=/var/run/elasticsearch/elasticsearch.pid
>   -Des.path.home=/usr/share/elasticsearch
>   -cp :/usr/share/elasticsearch/lib/elasticsearch-1.3.0.jar:/usr/share/elasticsearch/lib/*:/usr/share/elasticsearch/lib/sigar/*
>   -Des.default.path.home=/usr/share/elasticsearch
>   -Des.default.path.logs=/var/log/elasticsearch
>   -Des.default.path.data=/var/lib/elasticsearch
>   -Des.default.path.work=/usr/share/elasticsearch/tmp
>   -Des.default.path.conf=/etc/elasticsearch
>   org.elasticsearch.bootstrap.Elasticsearch
>
> # ps -ef | grep elastic
> 498 25573 1 59 10:41 ? 00:15:15 /usr/bin/java -Xms5g -Xmx5g -Xss256k
>   (same flags and classpath as above, including elasticsearch-1.3.0.jar)
>
> I've already restarted ES on server_2, to no avail.
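As a sanity check beyond eyeballing `ps` output, the running version can be read straight out of the classpath, since the Elasticsearch jar is named after its version. A small illustrative snippet (the helper and regex are mine):

```python
import re

# The elasticsearch jar on the classpath names the running version,
# e.g. .../lib/elasticsearch-1.3.0.jar. Extract it from a ps(1) line.
def es_version_from_ps(ps_line):
    m = re.search(r"elasticsearch-(\d+\.\d+\.\d+)\.jar", ps_line)
    return m.group(1) if m else None

ps_line = ("/usr/bin/java -Xms5g -Xmx5g -cp "
           ":/usr/share/elasticsearch/lib/elasticsearch-1.3.0.jar:"
           "/usr/share/elasticsearch/lib/*")
print(es_version_from_ps(ps_line))  # -> 1.3.0
```

Alternatively, `curl localhost:9200` on each node reports `version.number` directly from the running process, which avoids any doubt about which jar was actually loaded.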
> I haven't restarted server_1 because I'm afraid of losing the data that is
> there (my searches appear to be working OK and returning the expected
> results).
>
> Maybe something went wrong during the update? Any suggestions on how to
> fix this problem?
>
> Cheers

To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/8f632464-d749-4794-a5b4-fca31e475470%40googlegroups.com.
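In case anyone else hits the same stuck replica: what effectively resolved it here was forcing the replica to be rebuilt from the primary. A more direct way to do that with the 1.x settings API is to drop the index's replica count to 0 (deleting the broken copy) and then raise it back (a sketch; the host and index name are placeholders, and it assumes the primary on server_1 is healthy):

```shell
# Sketch only: drop replicas so the broken replica copy is discarded...
curl -XPUT 'http://localhost:9200/MY_INDEX/_settings' -d '{
  "index": { "number_of_replicas": 0 }
}'
# ...then restore them so a fresh replica is copied from the primary.
curl -XPUT 'http://localhost:9200/MY_INDEX/_settings' -d '{
  "index": { "number_of_replicas": 1 }
}'
# Watch the cluster go yellow -> green while the new replica recovers.
curl 'http://localhost:9200/_cluster/health?pretty'
```

Because only the replica was failing (the primary kept serving searches, as noted above), this does not touch the good copy of the data.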