[ https://issues.apache.org/jira/browse/HDDS-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756458#comment-16756458 ]
Bharat Viswanadham commented on HDDS-1031:
------------------------------------------

Hi [~arpitagarwal],
I missed this comment and committed the patch. The same error is seen in all the other test case failures too. It appears only intermittently, and I am not sure why. I need to dig in to find the root cause. I will run the test locally to confirm whether it passes.

> Update ratis version to fix a DN restart Bug
> --------------------------------------------
>
>                 Key: HDDS-1031
>                 URL: https://issues.apache.org/jira/browse/HDDS-1031
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Bharat Viswanadham
>            Assignee: Bharat Viswanadham
>            Priority: Major
>         Attachments: HDDS-1031.00.patch
>
>
> This is related to RATIS-460.
> When a datanode is restarted after ratis has taken a snapshot, we see the
> stack trace below, and the DN won't boot up. For more info, refer to RATIS-460.
>
> {code:java}
> java.io.IOException: java.lang.IllegalStateException: lastEntry =
> 72856=72856: [77969640-aad9-4678-813b-8fb35bd5f568:172.27.37.0:9858,
> 7c6ae4fe-7db5-4e97-a407-0a9edff70c2c:172.27.35.192:9858,
> add14303-ecdf-4aed-84b7-abc3152177f6:172.27.37.128:9858], old=null,
> lastEntry.index >= logIndex = 0
>     at org.apache.ratis.util.IOUtils.asIOException(IOUtils.java:54)
>     at org.apache.ratis.util.IOUtils.toIOException(IOUtils.java:61)
>     at org.apache.ratis.util.IOUtils.getFromFuture(IOUtils.java:70)
>     at org.apache.ratis.server.impl.RaftServerProxy.getImpls(RaftServerProxy.java:283)
>     at org.apache.ratis.server.impl.RaftServerProxy.start(RaftServerProxy.java:295)
>     at org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.start(XceiverServerRatis.java:427)
>     at org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.start(OzoneContainer.java:149)
>     at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.start(DatanodeStateMachine.java:165)
>     at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$startDaemon$0(DatanodeStateMachine.java:334)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalStateException: lastEntry = 72856=72856:
> [77969640-aad9-4678-813b-8fb35bd5f568:172.27.37.0:9858,
> 7c6ae4fe-7db5-4e97-a407-0a9edff70c2c:172.27.35.192:9858,
> add14303-ecdf-4aed-84b7-abc3152177f6:172.27.37.128:9858], old=null,
> lastEntry.index >= logIndex = 0
>     at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:72)
>     at org.apache.ratis.server.impl.ConfigurationManager.addConfiguration(ConfigurationManager.java:54)
>     at org.apache.ratis.server.impl.ServerState.setRaftConf(ServerState.java:352)
>     at org.apache.ratis.server.impl.ServerState.setRaftConf(ServerState.java:347)
>     at org.apache.ratis.server.storage.RaftLog.lambda$open$6(RaftLog.java:237)
>     at org.apache.ratis.server.storage.LogSegment.lambda$loadSegment$0(LogSegment.java:140)
>     at org.apache.ratis.server.storage.LogSegment.readSegmentFile(LogSegment.java:121)
>     at org.apache.ratis.server.storage.LogSegment.loadSegment(LogSegment.java:137)
>     at org.apache.ratis.server.storage.RaftLogCache.loadSegment(RaftLogCache.java:272)
>     at org.apache.ratis.server.storage.SegmentedRaftLog.loadLogSegments(SegmentedRaftLog.java:159)
>     at org.apache.ratis.server.storage.SegmentedRaftLog.openImpl(SegmentedRaftLog.java:129)
>     at org.apache.ratis.server.storage.RaftLog.open(RaftLog.java:233)
>     at org.apache.ratis.server.impl.ServerState.initLog(ServerState.java:191)
>     at org.apache.ratis.server.impl.ServerState.<init>(ServerState.java:114)
>     at org.apache.ratis.server.impl.RaftServerImpl.<init>(RaftServerImpl.java:103)
>     at org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$2(RaftServerProxy.java:207)
>     at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
>     at java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1582)
>     at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>     at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>     at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
>     at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
> 2019-01-29 01:43:41,137 [main] ERROR - Exception in HddsDatanodeService.
> java.lang.NullPointerException
>     at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.join(DatanodeStateMachine.java:363)
>     at org.apache.hadoop.ozone.HddsDatanodeService.join(HddsDatanodeService.java:270)
>     at org.apache.hadoop.ozone.HddsDatanodeService.main(HddsDatanodeService.java:127)
> {code}
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)