[ https://issues.apache.org/jira/browse/YARN-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630017#comment-14630017 ]
Wangda Tan commented on YARN-3930:
----------------------------------

[~dian.fu],
Thanks for working on the JIRA. Patch looks good, will commit soon.

> FileSystemNodeLabelsStore should make sure edit log file closed when exception is thrown
> -----------------------------------------------------------------------------------------
>
>                 Key: YARN-3930
>                 URL: https://issues.apache.org/jira/browse/YARN-3930
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, client, resourcemanager
>            Reporter: Dian Fu
>            Assignee: Dian Fu
>         Attachments: YARN-3930.001.patch
>
>
> While testing the node label feature in my local environment, I encountered the following exception:
> {code}
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2426)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2222)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:2523)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2498)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:662)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:418)
> at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2174)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2170)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2168)
> at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.handleStoreEvent(CommonNodeLabelsManager.java:196)
> at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager$ForwardingEventHandler.handle(CommonNodeLabelsManager.java:168)
> at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager$ForwardingEventHandler.handle(CommonNodeLabelsManager.java:163)
> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:176)
> at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> The cause is that HDFS throws an exception during a call to {{ensureAppendEditlogFile}}, which leaves the edit log output stream unclosed. As a result, the next call to {{ensureAppendEditlogFile}} fails during lease recovery, because this client is itself the current lease holder.
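
The failure mode described in the issue can be illustrated with a small, self-contained sketch. This is not the attached patch and does not use any Hadoop API: {{FakeEditLog}}, {{appendLeaky}} and {{appendSafe}} are hypothetical names invented for illustration, and the "lease" is simulated with a plain boolean. It only shows why a stream left open after an exception makes the next append attempt fail, and how closing it in a {{finally}} block avoids that.

```java
import java.io.Closeable;
import java.io.IOException;

class EditLogCloseSketch {

    /** Hypothetical stand-in for the store's edit log stream; not a Hadoop class. */
    public static final class FakeEditLog implements Closeable {
        private boolean open;

        /** Mimics an append call: refuses to reopen while an earlier stream is still open. */
        public void openForAppend() throws IOException {
            if (open) {
                throw new IOException("lease recovery failed: caller already holds the lease");
            }
            open = true;
        }

        /** Writes a record, or throws when asked to simulate a failure. */
        public void write(String record, boolean fail) throws IOException {
            if (fail) {
                throw new IOException("simulated failure while writing " + record);
            }
        }

        public boolean isOpen() {
            return open;
        }

        @Override
        public void close() {
            open = false;
        }
    }

    /** Problematic pattern: close() is skipped whenever write() throws. */
    public static void appendLeaky(FakeEditLog log, String record, boolean fail) throws IOException {
        log.openForAppend();
        log.write(record, fail);
        log.close();
    }

    /** Safer pattern: the stream is closed even when write() throws. */
    public static void appendSafe(FakeEditLog log, String record, boolean fail) throws IOException {
        log.openForAppend();
        try {
            log.write(record, fail);
        } finally {
            log.close();
        }
    }

    public static void main(String[] args) {
        FakeEditLog leaky = new FakeEditLog();
        try {
            appendLeaky(leaky, "r1", true);
        } catch (IOException e) {
            System.out.println("leaky, first call: " + e.getMessage());
        }
        try {
            appendLeaky(leaky, "r2", false);
        } catch (IOException e) {
            System.out.println("leaky, second call: " + e.getMessage());
        }

        FakeEditLog safe = new FakeEditLog();
        try {
            appendSafe(safe, "r1", true);
        } catch (IOException e) {
            System.out.println("safe, first call: " + e.getMessage());
        }
        System.out.println("safe stream still open: " + safe.isOpen());
    }
}
```

In the leaky variant the first failed call leaves the simulated stream open, so the second call is rejected before it can write anything. In the safe variant the stream is closed on the failure path, so a later call can reopen it.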