[ 
https://issues.apache.org/jira/browse/HDFS-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105447#comment-16105447
 ] 

Chen Liang commented on HDFS-12216:
-----------------------------------

Thanks [~msingh] for filing this! I was also just about to do this :). I spent 
some time looking at the {{TestKeys}} failure, so just to add some info here.

Looks like {{testPutAndGetKeyWithDnRestart}} is the one that fails all the 
time. It seems that after {{restartDatanode(cluster, 0, helper.client);}} gets 
called, the call to {{xceiverClientManager.acquireClient(pipeline);}} fails. 
The exact place where the exception is thrown is 
{{XceiverClientManager#getClient()}}. It looks like XceiverClient tries to 
connect to the same server port as before the restart, but the other end is 
simply no longer listening (Connection refused). I am currently thinking that 
either the dn restart is not functioning properly somewhere, or the 
XceiverClient should be connecting to a different port after the restart. 
I haven't looked deeper though. Hope this helps.
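
To make the second possibility concrete, here is a minimal sketch of how a 
client that remembers a server port can end up with Connection refused once 
the server has gone away, even if a new listener comes back on another port. 
This is not Ozone code: {{StalePortSketch}}, {{canConnect}}, {{listen}} and 
{{demo}} are hypothetical names, and the assumption that the restarted server 
binds to an OS-assigned port is mine, not something confirmed from the test.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration only; none of these names exist in Ozone.
final class StalePortSketch {

  /** Returns true if a TCP connection to the given loopback port succeeds. */
  static boolean canConnect(int port) {
    try (Socket socket = new Socket()) {
      socket.connect(
          new InetSocketAddress(InetAddress.getLoopbackAddress(), port), 1000);
      return true;
    } catch (IOException e) {
      // Typically java.net.ConnectException: Connection refused.
      return false;
    }
  }

  /** Opens a listener on an OS-assigned (ephemeral) loopback port. */
  static ServerSocket listen() throws IOException {
    return new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
  }

  /**
   * Returns {cached port reachable before stop, cached port reachable after
   * stop, new port reachable after restart}.
   */
  static boolean[] demo() {
    try {
      ServerSocket first = listen();
      int cachedPort = first.getLocalPort();   // what the client remembers
      boolean before = canConnect(cachedPort);
      first.close();                           // the "datanode" stops
      boolean stale = canConnect(cachedPort);  // nobody is listening here now
      try (ServerSocket second = listen()) {   // the "datanode" comes back
        boolean fresh = canConnect(second.getLocalPort());
        return new boolean[] {before, stale, fresh};
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    boolean[] r = demo();
    System.out.println("before stop, cached port reachable:  " + r[0]);
    System.out.println("after stop, cached port reachable:   " + r[1]);
    System.out.println("after restart, new port reachable:   " + r[2]);
  }
}
```

If the real cause is along these lines, the client would need to look up the 
datanode's current port after the restart instead of reusing the one it saw 
before. If the datanode is supposed to come back on the same port, then the 
restart itself is the more likely culprit.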

> Ozone: TestKeys and TestKeysRatis are failing consistently
> ----------------------------------------------------------
>
>                 Key: HDFS-12216
>                 URL: https://issues.apache.org/jira/browse/HDFS-12216
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>    Affects Versions: HDFS-7240
>            Reporter: Mukul Kumar Singh
>            Assignee: Mukul Kumar Singh
>             Fix For: HDFS-7240
>
>
> TestKeys and TestKeysRatis are failing consistently as noted in test logs for 
> HDFS-12183
> TestKeysRatis is failing because of the following error
> {code}
> 2017-07-28 23:11:28,783 [StateMachineUpdater-127.0.0.1:55793] ERROR impl.StateMachineUpdater (ExitUtils.java:terminate(80)) - Terminating with exit status 2: StateMachineUpdater-127.0.0.1:55793: the StateMachineUpdater hits Throwable
> org.iq80.leveldb.DBException: Closed
>       at org.fusesource.leveldbjni.internal.JniDB.put(JniDB.java:123)
>       at org.apache.hadoop.utils.LevelDBStore.put(LevelDBStore.java:98)
>       at org.apache.hadoop.ozone.container.common.impl.KeyManagerImpl.putKey(KeyManagerImpl.java:90)
>       at org.apache.hadoop.ozone.container.common.impl.Dispatcher.handlePutKey(Dispatcher.java:547)
>       at org.apache.hadoop.ozone.container.common.impl.Dispatcher.keyProcessHandler(Dispatcher.java:206)
>       at org.apache.hadoop.ozone.container.common.impl.Dispatcher.dispatch(Dispatcher.java:110)
>       at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatch(ContainerStateMachine.java:94)
>       at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.applyTransaction(ContainerStateMachine.java:81)
>       at org.apache.ratis.server.impl.RaftServerImpl.applyLogToStateMachine(RaftServerImpl.java:913)
>       at org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:142)
>       at java.lang.Thread.run(Thread.java:745)
> {code}
> whereas TestKeys is failing because of
> {code}
> 2017-07-28 23:14:20,889 [Thread-486] INFO  scm.XceiverClientManager (XceiverClientManager.java:getClient(158)) - exception java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused: /127.0.0.1:55914
> {code}


