[jira] [Commented] (MAPREDUCE-6745) Job directories should be cleaned in staging directory /tmp/hadoop-yarn/staging after MapReduce job finishes successfully

2016-09-01 Thread liuxiaoping (JIRA)

[ https://issues.apache.org/jira/browse/MAPREDUCE-6745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15455156#comment-15455156 ]

liuxiaoping commented on MAPREDUCE-6745:
--

When a task fails but the job succeeds, that .staging dir shouldn't be kept. 
I think it is a good idea to add a parameter 
"mapreduce.tasks.files.preserve.failedjobs".

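A minimal sketch of the semantics being proposed (the property name 
"mapreduce.tasks.files.preserve.failedjobs" is only the suggestion from this 
thread, not an existing Hadoop key, and the class below is hypothetical):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.JobStatus;

// Hypothetical sketch: keep the job's .staging dir only when the job
// itself failed. The property name is the one proposed in this thread.
public class PreserveFailedJobsCheck {
  static boolean keepStagingDir(Configuration conf, JobStatus.State finalState) {
    boolean preserveFailedJobs =
        conf.getBoolean("mapreduce.tasks.files.preserve.failedjobs", false);
    return preserveFailedJobs && finalState == JobStatus.State.FAILED;
  }
}
{code}
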
> Job directories should be cleaned in staging directory /tmp/hadoop-yarn/staging 
> after MapReduce job finishes successfully
> ---------------------------------------------------------------------------------
>
> Key: MAPREDUCE-6745
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6745
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.7.2
> Environment: Suse 11 sp3
>Reporter: liuxiaoping
>Priority: Blocker
>
> If the MapReduce client sets mapreduce.task.files.preserve.failedtasks=true, 
> the temporary job directory will not be deleted from the staging directory 
> /tmp/hadoop-yarn/staging.
> Over time the job files accumulate, eventually leading to the exception 
> below:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemExceededException):
> The directory item limit of /tmp/hadoop-yarn/staging/username/.staging is 
> exceeded: limit=1048576 items=1048576
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:936)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.addLastINode(FSDirectory.java:981)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.unprotectedMkdir(FSDirMkdirOp.java:237)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.createSingleDirectory(FSDirMkdirOp.java:191)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.createChildrenDirectories(FSDirMkdirOp.java:166)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:97)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3788)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:986)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:624)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:624)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:973)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2088)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2084)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1672)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2082)
>   
>   
> The official description of the configuration 
> mapreduce.task.files.preserve.failedtasks is as follows:
> Should the files for failed tasks be kept. This should only be used on 
> jobs that are failing, because the storage is never reclaimed. 
> It also prevents the map outputs from being erased from the reduce 
> directory as they are consumed.
>   
> According to this description, I think the temporary files for successful 
> tasks shouldn't be kept.
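
For reference, a minimal client-side sketch (class name hypothetical, standard 
HDFS FileSystem API) that reports how many entries a user's .staging directory 
holds; the path and the 1048576 limit (dfs.namenode.fs-limits.max-directory-items) 
are the ones from the exception above:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Counts the entries under the per-user .staging directory so its growth
// can be watched before the NameNode's directory item limit is reached.
public class StagingDirItemCount {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Path taken from the report; substitute the actual user name.
    Path staging = new Path("/tmp/hadoop-yarn/staging/username/.staging");
    int items = fs.listStatus(staging).length;
    System.out.println(staging + ": " + items + " entries (limit 1048576)");
  }
}
{code}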





[jira] [Commented] (MAPREDUCE-6745) Job directories should be cleaned in staging directory /tmp/hadoop-yarn/staging after MapReduce job finishes successfully

2016-08-02 Thread mujunchao (JIRA)

[ https://issues.apache.org/jira/browse/MAPREDUCE-6745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15405252#comment-15405252 ]

mujunchao commented on MAPREDUCE-6745:
--

We move the .staging dir away once the job has finished or failed. As the job 
is no longer alive, I think there is no need to keep the .staging dir at that 
point.
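
Roughly, the decision looks like this (a simplified sketch, not the actual 
MRAppMaster code):

{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Simplified sketch of the cleanup described above: once the job reaches a
// terminal state the AM deletes its staging directory, unless the preserve
// flag is set -- in which case .staging is kept for every job, successful
// or not, and the storage is never reclaimed.
public class StagingCleanupSketch {
  static void cleanupStagingDir(Configuration conf, FileSystem fs,
      Path jobStagingDir) throws IOException {
    if (conf.getBoolean("mapreduce.task.files.preserve.failedtasks", false)) {
      return; // preserved regardless of the job's outcome
    }
    fs.delete(jobStagingDir, true); // recursive delete of the job's files
  }
}
{code}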






[jira] [Commented] (MAPREDUCE-6745) Job directories should be cleaned in staging directory /tmp/hadoop-yarn/staging after MapReduce job finishes successfully

2016-08-02 Thread Akira Ajisaka (JIRA)

[ https://issues.apache.org/jira/browse/MAPREDUCE-6745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15405195#comment-15405195 ]

Akira Ajisaka commented on MAPREDUCE-6745:
--

However, the document is confusing to me. I'd like to add a parameter 
"mapreduce.tasks.files.preserve.failedjobs" to keep the .staging dir only for 
the failing jobs. What do you think?






[jira] [Commented] (MAPREDUCE-6745) Job directories should be cleaned in staging directory /tmp/hadoop-yarn/staging after MapReduce job finishes successfully

2016-08-02 Thread Akira Ajisaka (JIRA)

[ https://issues.apache.org/jira/browse/MAPREDUCE-6745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15405191#comment-15405191 ]

Akira Ajisaka commented on MAPREDUCE-6745:
--

Probably MAPREDUCE-6607 is related. As I commented 
[there|https://issues.apache.org/jira/browse/MAPREDUCE-6607?focusedCommentId=15140967&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15140967],
if the parameter is set, the files for failed tasks are kept. However, the 
files in .staging can be used by all tasks, so it is difficult to tell which 
files in .staging belong to the failed tasks. That's why, if the parameter is 
set, all the files in .staging are preserved for now.
Therefore you need to set the parameter to true only for the failing jobs to 
avoid this issue. I think this is what the document is trying to say.
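
In other words, enable it per job while debugging rather than cluster-wide in 
mapred-site.xml. For example (a sketch using the standard Job API; job setup 
details elided):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Enable preservation only for the one job being debugged; leaving this on
// for every job is what fills /tmp/hadoop-yarn/staging over time.
public class DebugJobSubmit {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.setBoolean("mapreduce.task.files.preserve.failedtasks", true);
    Job job = Job.getInstance(conf, "failing-job-under-debug");
    // ... set mapper/reducer and input/output paths, then:
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
{code}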







[jira] [Commented] (MAPREDUCE-6745) Job directories should be cleaned in staging directory /tmp/hadoop-yarn/staging after MapReduce job finishes successfully

2016-07-28 Thread mujunchao (JIRA)

[ https://issues.apache.org/jira/browse/MAPREDUCE-6745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398603#comment-15398603 ]

mujunchao commented on MAPREDUCE-6745:
--

Nice catch!
Once the item limit of "/tmp/hadoop-yarn/staging/username/.staging" is 
exceeded, every new job fails.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org