Hi Alexander,

Here is the stack trace for the NullPointerException:
[23:24:38,929][DEBUG][action.bulk ] [Rasputin, Mikhail] [17f85dcb67b64a13bfef2be74595087e][0], node[a-eZTR9XRiWq-o0QmsM2aA], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.bulk.BulkShardRequest@22b11bbf]
java.lang.NullPointerException
    at org.elasticsearch.action.bulk.TransportBulkAction$2.onResponse(TransportBulkAction.java:247)
    at org.elasticsearch.action.bulk.TransportBulkAction$2.onResponse(TransportBulkAction.java:242)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performReplicas(TransportShardReplicationOperationAction.java:607)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:533)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:430)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
[23:24:38,940][DEBUG][action.bulk ] [Rasputin, Mikhail] [17f85dcb67b64a13bfef2be74595087e][0], node[a-eZTR9XRiWq-o0QmsM2aA], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.bulk.BulkShardRequest@768475c4]
java.lang.NullPointerException
    at org.elasticsearch.action.bulk.TransportBulkAction$2.onResponse(TransportBulkAction.java:247)
    at org.elasticsearch.action.bulk.TransportBulkAction$2.onResponse(TransportBulkAction.java:242)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performReplicas(TransportShardReplicationOperationAction.java:607)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:533)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:430)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

Thanks,
Rohit

On Fri, Jun 20, 2014 at 12:02 AM, Alexander Reelsen <a...@spinscale.de> wrote:

> Hey,
>
> the exception you showed, can possibly happen, when you remove an alias.
> However you mentioned NullPointerException in your first post, which is not
> contained in the stacktrace, so it seems, that one is still missing.
>
> Also, please retry with a newer version of Elasticsearch.
>
> --Alex
>
> On Thu, Jun 19, 2014 at 5:13 AM, Rohit Jaiswal <rohit.jais...@gmail.com> wrote:
>
>> Hi Alexander,
>> We sent you the stack trace. Can you please enlighten us on this?
>>
>> Thanks,
>> Rohit
>>
>> On Mon, Jun 16, 2014 at 10:25 AM, Rohit Jaiswal <rohit.jais...@gmail.com> wrote:
>>
>>> Hi Alexander,
>>> Thanks for your reply. We plan to upgrade in the long run, however we need
>>> to fix the data loss problem on 0.90.2 in the immediate term.
>>>
>>> Here is the stack trace -
>>>
>>> 10:09:37.783 PM
>>>
>>> [22:09:37,783][WARN ][indices.cluster ] [Storm] [b7a76aa06cfd4048987d1117f3e0433a][0] failed to start shard
>>> org.elasticsearch.indices.recovery.RecoveryFailedException: [b7a76aa06cfd4048987d1117f3e0433a][0]: Recovery failed from [Jeffrey Mace][_jjr5BYJQjO6QzzheyDmhw][inet[/10.4.35.200:9300]] into [Storm][FiW6mbR5ThqqSii5Wc28lQ][inet[/10.4.40.95:9300]]
>>>     at org.elasticsearch.indices.recovery.RecoveryTarget.doRecovery(RecoveryTarget.java:293)
>>>     at org.elasticsearch.indices.recovery.RecoveryTarget.access$300(RecoveryTarget.java:62)
>>>     at org.elasticsearch.indices.recovery.RecoveryTarget$2.run(RecoveryTarget.java:163)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>>>     at java.lang.Thread.run(Unknown Source)
>>> Caused by: org.elasticsearch.transport.RemoteTransportException: [Jeffrey Mace][inet[/10.4.35.200:9300]][index/shard/recovery/startRecovery]
>>> Caused by: org.elasticsearch.index.engine.RecoveryEngineException: [b7a76aa06cfd4048987d1117f3e0433a][0] Phase[2] Execution failed
>>>     at org.elasticsearch.index.engine.robin.RobinEngine.recover(RobinEngine.java:1147)
>>>     at org.elasticsearch.index.shard.service.InternalIndexShard.recover(InternalIndexShard.java:526)
>>>     at org.elasticsearch.indices.recovery.RecoverySource.recover(RecoverySource.java:116)
>>>     at org.elasticsearch.indices.recovery.RecoverySource.access$1600(RecoverySource.java:60)
>>>     at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:328)
>>>     at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:314)
>>>     at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:265)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>>>     at java.lang.Thread.run(Unknown Source)
>>> Caused by: org.elasticsearch.transport.RemoteTransportException: [Storm][inet[/10.4.40.95:9300]][index/shard/recovery/translogOps]
>>> Caused by: org.elasticsearch.indices.InvalidAliasNameException: [b7a76aa06cfd4048987d1117f3e0433a] Invalid alias name [1a4077872e41c0634cee780c1e5fc263bdd5f14b15ac9239480547ab2d3601eb], Unknown alias name was passed to alias Filter
>>>     at org.elasticsearch.index.aliases.IndexAliasesService.aliasFilter(IndexAliasesService.java:99)
>>>     at org.elasticsearch.index.shard.service.InternalIndexShard.prepareDeleteByQuery(InternalIndexShard.java:382)
>>>     at org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryOperation(InternalIndexShard.java:628)
>>>     at org.elasticsearch.indices.recovery.RecoveryTarget$TranslogOperationsRequestHandler.messageReceived(RecoveryTarget.java:447)
>>>     at org.elasticsearch.indices.recovery.RecoveryTarget$TranslogOperationsRequestHandler.messageReceived(RecoveryTarget.java:416)
>>>     at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:265)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>>>     at java.lang.Thread.run(Unknown Source)
>>> [22:09:37,799][WARN ][cluster.action.shard ] [Storm] sending failed shard for [b7a76aa06cfd4048987d1117f3e0433a][0], node[FiW6mbR5ThqqSii5Wc28lQ], [R], s[INITIALIZING], reason [Failed to start shard, message [RecoveryFailedException[[b7a76aa06cfd4048987d1117f3e0433a][0]: Recovery failed from [Jeffrey Mace][_jjr5BYJQjO6QzzheyDmhw][inet[/10.4.35.200:9300]] into [Storm][FiW6mbR5ThqqSii5Wc28lQ][inet[/10.4.40.95:9300]]]; nested: RemoteTransportException[[Jeffrey Mace][inet[/10.4.35.200:9300]][index/shard/recovery/startRecovery]]; nested: RecoveryEngineException[[b7a76aa06cfd4048987d1117f3e0433a][0] Phase[2] Execution failed]; nested: RemoteTransportException[[Storm][inet[/10.4.40.95:9300]][index/shard/recovery/translogOps]]; nested: InvalidAliasNameException[[b7a76aa06cfd4048987d1117f3e0433a] Invalid alias name [1a4077872e41c0634cee780c1e5fc263bdd5f14b15ac9239480547ab2d3601eb], Unknown alias name was passed to alias Filter]; ]]
>>> [22:09:38,025][WARN ][indices.cluster ] [Storm] [b7a76aa06cfd4048987d1117f3e0433a][0] failed to start shard
>>> org.elasticsearch.indices.recovery.RecoveryFailedException: [b7a76aa06cfd4048987d1117f3e0433a][0]: Recovery failed from [Jeffrey Mace][_jjr5BYJQjO6QzzheyDmhw][inet[/10.4.35.200:9300]] into [Storm][FiW6mbR5ThqqSii5Wc28lQ][inet[/10.4.40.95:9300]]
>>>     at org.elasticsearch.indices.recovery.RecoveryTarget.doRecovery(RecoveryTarget.java:293)
>>>     at org.elasticsearch.indices.recovery.RecoveryTarget.access$300(RecoveryTarget.java:62)
>>>     at org.elasticsearch.indices.recovery.RecoveryTarget$2.run(RecoveryTarget.java:163)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>>>     at java.lang.Thread.run(Unknown Source)
>>> Caused by: org.elasticsearch.transport.RemoteTransportException: [Jeffrey Mace][inet[/10.4.35.200:9300]][index/shard/recovery/startRecovery]
>>> Caused by: org.elasticsearch.index.engine.RecoveryEngineException: [b7a76aa06cfd4048987d1117f3e0433a][0] Phase[2] Execution failed
>>>     at org.elasticsearch.index.engine.robin.RobinEngine.recover(RobinEngine.java:1147)
>>>     at org.elasticsearch.index.shard.service.InternalIndexShard.recover(InternalIndexShard.java:526)
>>>     at org.elasticsearch.indices.recovery.RecoverySource.recover(RecoverySource.java:116)
>>>     at org.elasticsearch.indices.recovery.RecoverySource.access$1600(RecoverySource.java:60)
>>>     at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:328)
>>>     at org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:314)
>>>     at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:265)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>>>     at java.lang.Thread.run(Unknown Source)
>>> Caused by: org.elasticsearch.transport.RemoteTransportException: [Storm][inet[/10.4.40.95:9300]][index/shard/recovery/translogOps]
>>> Caused by: org.elasticsearch.indices.InvalidAliasNameException: [b7a76aa06cfd4048987d1117f3e0433a] Invalid alias name [1a4077872e41c0634cee780c1e5fc263bdd5f14b15ac9239480547ab2d3601eb], Unknown alias name was passed to alias Filter
>>>     at org.elasticsearch.index.aliases.IndexAliasesService.aliasFilter(IndexAliasesService.java:99)
>>>     at org.elasticsearch.index.shard.service.InternalIndexShard.prepareDeleteByQuery(InternalIndexShard.java:382)
>>>     at org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryOperation(InternalIndexShard.java:628)
>>>     at org.elasticsearch.indices.recovery.RecoveryTarget$TranslogOperationsRequestHandler.messageReceived(RecoveryTarget.java:447)
>>>     at org.elasticsearch.indices.recovery.RecoveryTarget$TranslogOperationsRequestHandler.messageReceived(RecoveryTarget.java:416)
>>>     at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:265)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>>>     at java.lang.Thread.run(Unknown Source)
>>>
>>> [22:09:38,042][WARN ][cluster.action.shard ] [Storm] sending failed shard for [b7a76aa06cfd4048987d1117f3e0433a][0], node[FiW6mbR5ThqqSii5Wc28lQ], [R], s[INITIALIZING], reason [Failed to start shard, message [RecoveryFailedException[[b7a76aa06cfd4048987d1117f3e0433a][0]: Recovery failed from [Jeffrey Mace][_jjr5BYJQjO6QzzheyDmhw][inet[/10.4.35.200:9300]] into [Storm][FiW6mbR5ThqqSii5Wc28lQ][inet[/10.4.40.95:9300]]]; nested: RemoteTransportException[[Jeffrey Mace][inet[/10.4.35.200:9300]][index/shard/recovery/startRecovery]]; nested: RecoveryEngineException[[b7a76aa06cfd4048987d1117f3e0433a][0] Phase[2] Execution failed]; nested: RemoteTransportException[[Storm][inet[/10.4.40.95:9300]][index/shard/recovery/translogOps]]; nested: InvalidAliasNameException[[b7a76aa06cfd4048987d1117f3e0433a] Invalid alias name [1a4077872e41c0634cee780c1e5fc263bdd5f14b15ac9239480547ab2d3601eb], Unknown alias name was passed to alias Filter]; ]]
>>>
>>> Let us know..
>>>
>>> Thanks,
>>> Rohit
>>>
>>> On Mon, Jun 16, 2014 at 6:13 AM, Alexander Reelsen <a...@spinscale.de> wrote:
>>>
>>>> Hey,
>>>>
>>>> without stack traces it is pretty hard to see the actual problem, do
>>>> you have them around (on one node this exception has happened, so it should
>>>> have been logged into the elasticsearch logfile as well). Also, you should
>>>> really upgrade if possible, as releases after 0.90.2 have seen many many
>>>> improvements.
>>>>
>>>> --Alex
>>>>
>>>> On Mon, Jun 9, 2014 at 4:15 AM, Rohit Jaiswal <rohit.jais...@gmail.com> wrote:
>>>>
>>>>> Hello Everyone,
>>>>> We lost data after restarting Elasticsearch cluster. Restarting is a part of deploying our software stack.
>>>>>
>>>>> We have a 20-node cluster running 0.90.2 and we have Splunk configured to index ES logs.
>>>>>
>>>>> Looking at the Splunk logs, we could find the following *error a day before the deployment* (restart) -
>>>>>
>>>>> [cluster.action.shard ] [Rictor] sending failed shard for [c0a71ddaa70b463a9a179c36c7fc26e3][2], node[nJvnclczRNaLbETunjlcWw], [R], s[STARTED], reason [Failed to perform [bulk/shard] on replica, message [RemoteTransportException; nested: ResponseHandlerFailureTransportException; nested: NullPointerException; ]]
>>>>>
>>>>> [cluster.action.shard ] [Kiss] received shard failed for [c0a71ddaa70b463a9a179c36c7fc26e3][2], node[nJvnclczRNaLbETunjlcWw], [R], s[STARTED], reason [Failed to perform [bulk/shard] on replica, message [RemoteTransportException; nested: ResponseHandlerFailureTransportException; nested: NullPointerException; ]]
>>>>>
>>>>> Further, *a day after the deploy,* we see the same errors on another node -
>>>>>
>>>>> [cluster.action.shard ] [Contrary] received shard failed for [a58f9413315048ecb0abea48f5f6aae7][1], node[3UbHwVCkQvO3XroIl-awPw], [R], s[STARTED], reason [Failed to perform [bulk/shard] on replica, message [RemoteTransportException; nested: ResponseHandlerFailureTransportException; nested: NullPointerException; ]]
>>>>>
>>>>> *Immediately next, the following error is seen*. This error is seen repeatedly on a couple of other nodes as well -
>>>>>
>>>>> failed to start shard
>>>>>
>>>>> [cluster.action.shard ] [Copperhead] sending failed shard for [a58f9413315048ecb0abea48f5f6aae7][0], node[EuRzr3MLQiSS6lzTZJbiKw], [R], s[INITIALIZING], reason [Failed to start shard, message [RecoveryFailedException[[a58f9413315048ecb0abea48f5f6aae7][0]: Recovery failed from [Frank Castle][dlv2mPypQaOxLPQhHQ67Fw][inet[/10.2.136.81:9300]] into [Copperhead][EuRzr3MLQiSS6lzTZJbiKw][inet[/10.3.207.55:9300]]]; nested: RemoteTransportException[[Frank Castle][inet[/10.2.136.81:9300]][index/shard/recovery/startRecovery]]; nested: RecoveryEngineException[[a58f9413315048ecb0abea48f5f6aae7][0] Phase[2] Execution failed]; nested: RemoteTransportException[[Copperhead][inet[/10.3.207.55:9300]][index/shard/recovery/translogOps]]; nested: InvalidAliasNameException[[a58f9413315048ecb0abea48f5f6aae7] *Invalid alias name [fbf1e55418a2327d308e7632911f9bb8bfed58059dd7f1e4abd3467c5f8519c3], Unknown alias name was passed to alias Filter]; ]]*
>>>>>
>>>>> *During this time, we could not access previously indexed documents.*
>>>>> I looked up the alias error, looks like it is related to https://github.com/elasticsearch/elasticsearch/issues/1198 (Delete By Query wrongly persisted to translog # 1198), but this should be fixed in ES 0.18.0 and, we are using 0.90.2, so why is ES encountering this issue?
>>>>>
>>>>> What do we need to do to set this right and get back lost data? Please help.
>>>>>
>>>>> Thanks.
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google Groups "elasticsearch" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
>>>>> To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/00e54753-ab89-4f63-a39e-0931e8f7e2f0%40googlegroups.com
>>>>> For more options, visit https://groups.google.com/d/optout.
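For anyone triaging similar logs, the shard-failure lines quoted in this thread can be summarized with a short script. The following is only a minimal Python sketch, not anything from Elasticsearch itself: the function name `summarize_shard_failure` and both regular expressions are assumptions based on the 0.90.2 log format shown in the quoted messages, and other versions may format these lines differently.

```python
import re

# Assumption: each link in the failure reason is written as "nested: ExceptionName",
# as in the log lines quoted in this thread.
NESTED = re.compile(r"nested:\s*([A-Za-z_][A-Za-z0-9_.$]*)")

# Assumption: the index name and shard number follow "failed shard for" or
# "shard failed for" in the form "[index][shard]".
SHARD = re.compile(r"(?:failed shard|shard failed) for \[([^\]]+)\]\[(\d+)\]")


def summarize_shard_failure(line):
    """Return (index, shard, nested_exceptions) for a shard-failure log line.

    Returns None when the line does not look like a shard-failure message.
    Only exceptions introduced by "nested:" are collected; the outermost
    exception in the message is not part of the returned list.
    """
    match = SHARD.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2)), NESTED.findall(line)


if __name__ == "__main__":
    sample = (
        "[cluster.action.shard ] [Rictor] sending failed shard for "
        "[c0a71ddaa70b463a9a179c36c7fc26e3][2], node[nJvnclczRNaLbETunjlcWw], "
        "[R], s[STARTED], reason [Failed to perform [bulk/shard] on replica, "
        "message [RemoteTransportException; nested: "
        "ResponseHandlerFailureTransportException; nested: NullPointerException; ]]"
    )
    print(summarize_shard_failure(sample))
```

Grouping the results by index and by the last nested exception would be one way to separate the NullPointerException failures on replicas from the InvalidAliasNameException failures during recovery.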