liuxiaoping created MAPREDUCE-6745:
--------------------------------------
Summary: Job directories should be cleaned in the staging directory
/tmp/hadoop-yarn/staging after a MapReduce job finishes successfully
Key: MAPREDUCE-6745
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6745
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mr-am
Affects Versions: 2.7.2
Environment: Suse 11 sp3
Reporter: liuxiaoping
Priority: Blocker
If the MapReduce client sets mapreduce.task.files.preserve.failedtasks=true,
the temporary job directory is not deleted from the staging directory
/tmp/hadoop-yarn/staging.
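
For reference, a minimal client-side sketch of how this flag is typically
enabled (the job name and job setup below are placeholders, not taken from
this report):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class PreserveFailedTaskFilesExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Enabling this flag triggers the reported behaviour: the job's
        // temporary directory under /tmp/hadoop-yarn/staging is left behind
        // even when the job succeeds.
        conf.setBoolean("mapreduce.task.files.preserve.failedtasks", true);

        Job job = Job.getInstance(conf, "example-job"); // placeholder job name
        // ... normal job setup (mapper, reducer, input/output paths) omitted ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }
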
Over time, the job files accumulate, eventually leading to the exception
below:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemExceededException): The directory item limit of /tmp/hadoop-yarn/staging/username/.staging is exceeded: limit=1048576 items=1048576
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:936)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addLastINode(FSDirectory.java:981)
        at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.unprotectedMkdir(FSDirMkdirOp.java:237)
        at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.createSingleDirectory(FSDirMkdirOp.java:191)
        at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.createChildrenDirectories(FSDirMkdirOp.java:166)
        at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:97)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3788)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:986)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:624)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:624)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:973)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2088)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2084)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1672)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2082)
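
As an interim workaround (not a fix for the bug itself), stale job
directories can be pruned with the standard FileSystem API. This is only a
sketch; the staging path and the seven-day age threshold are assumptions
for illustration:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class StagingDirCleaner {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Assumed staging directory of the affected user; adjust as needed.
        Path staging = new Path("/tmp/hadoop-yarn/staging/username/.staging");
        long cutoff = System.currentTimeMillis() - 7L * 24 * 60 * 60 * 1000; // 7 days (arbitrary)

        for (FileStatus status : fs.listStatus(staging)) {
          // Leftover job directories are named job_<clusterTimestamp>_<sequence>.
          if (status.isDirectory()
              && status.getPath().getName().startsWith("job_")
              && status.getModificationTime() < cutoff) {
            fs.delete(status.getPath(), true); // recursive delete of the stale job directory
          }
        }
        fs.close();
      }
    }

Running this against a live cluster removes the staging data of the matched
jobs, so it should only be pointed at directories that are known to be stale.
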
The official description of the configuration
mapreduce.task.files.preserve.failedtasks is:
Should the files for failed tasks be kept. This should only be used on jobs
that are failing, because the storage is never reclaimed.
It also prevents the map outputs from being erased from the reduce
directory as they are consumed.
According to this description, I think the temporary files of successful
tasks should not be kept.
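
A minimal sketch of the decision I would expect at job completion; the class,
method, and parameter names below are illustrative only and do not correspond
to the actual MRAppMaster code:

    import org.apache.hadoop.conf.Configuration;

    public class StagingCleanupPolicy {
      // Hypothetical helper: the staging directory should be kept only when
      // the job failed AND mapreduce.task.files.preserve.failedtasks is set.
      // Successful jobs should always have their staging directory removed.
      public static boolean shouldDeleteStagingDir(Configuration conf, boolean jobSucceeded) {
        boolean keepFailedTaskFiles =
            conf.getBoolean("mapreduce.task.files.preserve.failedtasks", false);
        return jobSucceeded || !keepFailedTaskFiles;
      }
    }
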