[
https://issues.apache.org/jira/browse/MAPREDUCE-6745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15455156#comment-15455156
]
liuxiaoping commented on MAPREDUCE-6745:
When a task fails but the job succeeds, the .staging dir shouldn't be kept.
I think it would be a good idea to add a parameter
"mapreduce.tasks.files.preserve.failedjobs".
> Job directories should be cleaned from the staging directory /tmp/hadoop-yarn/staging
> after a MapReduce job finishes successfully
> ---------------------------------------------------------------------------
>
> Key: MAPREDUCE-6745
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6745
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mr-am
> Affects Versions: 2.7.2
> Environment: Suse 11 sp3
> Reporter: liuxiaoping
> Priority: Blocker
>
> If the MapReduce client sets mapreduce.task.files.preserve.failedtasks=true,
> the temporary job directory will not be deleted from the staging directory
> /tmp/hadoop-yarn/staging.
> Over time the job files accumulate, eventually leading to the exception
> below:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemExceededException):
> The directory item limit of /tmp/hadoop-yarn/staging/username/.staging is
> exceeded: limit=1048576 items=1048576
> at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:936)
> at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addLastINode(FSDirectory.java:981)
> at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.unprotectedMkdir(FSDirMkdirOp.java:237)
> at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.createSingleDirectory(FSDirMkdirOp.java:191)
> at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.createChildrenDirectories(FSDirMkdirOp.java:166)
> at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:97)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3788)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:986)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:624)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:624)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:973)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2088)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2084)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1672)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2082)
>
>
> The official description of the configuration
> mapreduce.task.files.preserve.failedtasks is as follows:
> Should the files for failed tasks be kept. This should only be used on
> jobs that are failing, because the storage is never reclaimed.
> It also prevents the map outputs from being erased from the reduce
> directory as they are consumed.
>
> According to this description, I think the temporary files for successful
> tasks shouldn't be kept.
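For context, a minimal sketch of how a client job ends up in this situation,
assuming a stock Hadoop 2.x client; the class and job names are illustrative:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class PreserveExample {
      public static void main(String[] args) throws Exception {
        // Minimal sketch: a client job configured with failed-task
        // preservation on. With this flag set, the job's .staging
        // directory is left behind even when the job succeeds, which is
        // the accumulation described above.
        Configuration conf = new Configuration();
        conf.setBoolean("mapreduce.task.files.preserve.failedtasks", true);
        Job job = Job.getInstance(conf, "example-job"); // illustrative name
        // ... set mapper, reducer, and input/output paths here,
        // then submit with job.waitForCompletion(true).
      }
    }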
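Until the cleanup behavior is fixed, one workaround is to prune stale
.staging entries by modification time. A sketch using the FileSystem API;
the staging path, class name, and seven-day cutoff are assumptions for
illustration:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Workaround sketch: delete leftover job directories under .staging
    // that have not been modified for more than seven days.
    public class StagingDirCleaner {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Illustrative path; substitute the affected user's staging dir.
        Path staging = new Path("/tmp/hadoop-yarn/staging/username/.staging");
        long cutoffMs = System.currentTimeMillis() - 7L * 24 * 60 * 60 * 1000;
        for (FileStatus status : fs.listStatus(staging)) {
          if (status.isDirectory() && status.getModificationTime() < cutoffMs) {
            fs.delete(status.getPath(), true); // recursive delete
          }
        }
      }
    }

Deleting the .staging directory of an in-flight job would break it, so the
cutoff should comfortably exceed the longest expected job runtime.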