Dian Fu created YARN-3930: ----------------------------- Summary: FileSystemNodeLabelsStore should make sure edit log file closed when exception is thrown Key: YARN-3930 URL: https://issues.apache.org/jira/browse/YARN-3930 Project: Hadoop YARN Issue Type: Sub-task Reporter: Dian Fu Assignee: Dian Fu
When I test the node label feature in my local environment, I encountered the following exception: {code} at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2426) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2222) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:2523) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2498) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:662) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:418) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2174) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2170) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2168) at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.handleStoreEvent(CommonNodeLabelsManager.java:196) at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager$ForwardingEventHandler.handle(CommonNodeLabelsManager.java:168) at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager$ForwardingEventHandler.handle(CommonNodeLabelsManager.java:163) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:176) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108) at java.lang.Thread.run(Thread.java:745) {code} The reason is that HDFS throws an exception when calling {{ensureAppendEditlogFile}} because of some reason which causes the edit log output stream isn't closed. This caused that the next time we call {{ensureAppendEditlogFile}}, lease recovery will failed because we are just the lease holder. -- This message was sent by Atlassian JIRA (v6.3.4#6332)