[jira] [Commented] (HDFS-16363) An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path.

2021-12-01 Thread Zhen Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451804#comment-17451804
 ] 

Zhen Wang commented on HDFS-16363:
--

In the org.apache.hadoop.io.SequenceFile.Sorter.MergeQueue#merge method, tmpFilename only gets the path string. Related change: 
[https://github.com/wForget/hadoop/commit/f03a8922b14030f52d913cbd9dcc130260654ba3]
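
For context, the sketch below illustrates the failure mode visible in the stack trace quoted below: SequenceFile$Sorter$MergeQueue#merge asks LocalDirAllocator for a place to write its intermediate merge file, and LocalDirAllocator can only resolve the requested path underneath the local directories named by its context property, so a fully qualified hdfs:// staging path cannot be placed. This is an illustration only; the context key, local directory and paths are placeholders, not the reporter's values or the exact Hadoop internals.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.LocalDirAllocator;
import org.apache.hadoop.fs.Path;

/**
 * Minimal sketch of the failure mode, with placeholder values.
 * LocalDirAllocator picks one of the local directories listed under its context
 * property and appends the requested path beneath it, so it expects a relative
 * local path rather than a fully qualified hdfs:// URI.
 */
public class StagingDirSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Hypothetical context property and local scratch directory, for illustration only.
    conf.set("example.local.dirs", "/tmp/example-local");
    LocalDirAllocator allocator = new LocalDirAllocator("example.local.dirs");

    // A relative path is placed under /tmp/example-local as intended.
    Path ok = allocator.getLocalPathForWrite("_distcp-0/intermediate.1", 1024L, conf);
    System.out.println("allocated: " + ok);

    // Passing a fully qualified HDFS staging path (which is what effectively reaches
    // the allocator from SequenceFile$Sorter$MergeQueue#merge when
    // yarn.app.mapreduce.am.staging-dir points at HDFS) fails in the reported
    // environment with DiskChecker$DiskErrorException, matching the log quoted below:
    // the allocator ends up asking DiskChecker to create /system/... locally.
    Path bad = allocator.getLocalPathForWrite(
        "hdfs://example-ns/system/mapred/user/.staging/_distcp-0/intermediate.1",
        1024L, conf);
    System.out.println("not reached in the reported environment: " + bad);
  }
}
{code}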

> An exception occurs in the distcp task of a large number of files, when 
> yarn.app.mapreduce.am.staging-dir is set to the hdfs path.
> --
>
> Key: HDFS-16363
> URL: https://issues.apache.org/jira/browse/HDFS-16363
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 3.2.2
>Reporter: Zhen Wang
>Priority: Major
> Attachments: image-2021-12-01-15-07-42-432.png, 
> image-2021-12-01-15-09-54-965.png, image-2021-12-01-15-14-25-549.png
>
>
> An exception occurs in the distcp task of a large number of files, when 
> yarn.app.mapreduce.am.staging-dir is set to the hdfs path.
>  
> task log:
> {code:java}
> 21/12/01 13:56:04 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 
> 24631997; dirCnt = 1750444
> 21/12/01 13:56:04 INFO tools.SimpleCopyListing: Build file listing completed.
> 21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.mb is deprecated. 
> Instead, use mapreduce.task.io.sort.mb
> 21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.factor is 
> deprecated. Instead, use mapreduce.task.io.sort.factor
> 21/12/01 13:57:57 WARN fs.LocalDirAllocator$AllocatorPerContext: Disk Error 
> Exception: 
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create 
> directory: /system/mapred/XX/.staging/_distcp-260350640
>         at 
> org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:98)
>         at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:77)
>         at 
> org.apache.hadoop.util.BasicDiskValidator.checkStatus(BasicDiskValidator.java:32)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createPath(LocalDirAllocator.java:367)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:447)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3549)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3343)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3319)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2882)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2921)
>         at 
> org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:476)
>         at 
> org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:450)
>         at 
> org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:155)
>         at 
> org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:93)
>         at 
> org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:89)
>         at 
> org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
>         at 
> org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:368)
>         at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:96)
>         at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:205)
>         at org.apache.hadoop.tools.DistCp.execute(DistCp.java:182)
>         at org.apache.hadoop.tools.DistCp.run(DistCp.java:153)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>         at org.apache.hadoop.tools.DistCp.main(DistCp.java:441)
> 21/12/01 13:57:57 ERROR tools.DistCp: Exception encountered 
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any 
> valid local directory for hdfs://rbf-XX/system/mapred/XX/
> .staging/_distcp-260350640/intermediate.1
>         at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:463)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3549)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3343)
>         at 
> 

[jira] [Comment Edited] (HDFS-16363) An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path.

2021-11-30 Thread Zhen Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451558#comment-17451558
 ] 

Zhen Wang edited comment on HDFS-16363 at 12/1/21, 7:15 AM:


DEBUG

org.apache.hadoop.util.DiskChecker#checkDirInternal(java.io.File): 

!image-2021-12-01-15-09-54-965.png!

org.apache.hadoop.fs.LocalDirAllocator.AllocatorPerContext#createPath:

!image-2021-12-01-15-07-42-432.png!

org.apache.hadoop.io.SequenceFile.Sorter.MergeQueue#merge:

!image-2021-12-01-15-14-25-549.png!
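
For readers without the screenshots, the chain above can also be probed in isolation. The snippet below is a hypothetical sketch, not the reporter's code, and the path is a placeholder: DiskChecker.checkDir tries to create the given directory on the local filesystem and throws DiskChecker$DiskErrorException ("Cannot create directory: ...") when it cannot, which is the WARN seen in the task log quoted in this issue.

{code:java}
import java.io.File;
import org.apache.hadoop.util.DiskChecker;

/**
 * Hypothetical standalone probe mirroring the debug points above.
 * DiskChecker.checkDir attempts to create the given directory on the local
 * filesystem and checks access, throwing DiskErrorException otherwise.
 */
public class DiskCheckerProbe {
  public static void main(String[] args) throws Exception {
    // Placeholder path; in the report this is the HDFS staging path stripped of
    // its scheme, which the local filesystem cannot create.
    DiskChecker.checkDir(new File("/system/mapred/example/.staging/_distcp-0"));
    System.out.println("directory is usable");
  }
}
{code}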


was (Author: wforget):
DEBUG

org.apache.hadoop.util.DiskChecker#checkDirInternal(java.io.File): 

!image-2021-12-01-15-09-54-965.png!

org.apache.hadoop.fs.LocalDirAllocator.AllocatorPerContext#createPath:

!image-2021-12-01-15-07-42-432.png!

> An exception occurs in the distcp task of a large number of files, when 
> yarn.app.mapreduce.am.staging-dir is set to the hdfs path.
> --
>
> Key: HDFS-16363
> URL: https://issues.apache.org/jira/browse/HDFS-16363
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 3.2.2
>Reporter: Zhen Wang
>Priority: Major
> Attachments: image-2021-12-01-15-07-42-432.png, 
> image-2021-12-01-15-09-54-965.png, image-2021-12-01-15-14-25-549.png
>
>
> An exception occurs in the distcp task of a large number of files, when 
> yarn.app.mapreduce.am.staging-dir is set to the hdfs path.
>  
> task log:
> {code:java}
> 21/12/01 13:56:04 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 
> 24631997; dirCnt = 1750444
> 21/12/01 13:56:04 INFO tools.SimpleCopyListing: Build file listing completed.
> 21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.mb is deprecated. 
> Instead, use mapreduce.task.io.sort.mb
> 21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.factor is 
> deprecated. Instead, use mapreduce.task.io.sort.factor
> 21/12/01 13:57:57 WARN fs.LocalDirAllocator$AllocatorPerContext: Disk Error 
> Exception: 
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create 
> directory: /system/mapred/XX/.staging/_distcp-260350640
>         at 
> org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:98)
>         at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:77)
>         at 
> org.apache.hadoop.util.BasicDiskValidator.checkStatus(BasicDiskValidator.java:32)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createPath(LocalDirAllocator.java:367)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:447)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3549)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3343)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3319)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2882)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2921)
>         at 
> org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:476)
>         at 
> org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:450)
>         at 
> org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:155)
>         at 
> org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:93)
>         at 
> org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:89)
>         at 
> org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
>         at 
> org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:368)
>         at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:96)
>         at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:205)
>         at org.apache.hadoop.tools.DistCp.execute(DistCp.java:182)
>         at org.apache.hadoop.tools.DistCp.run(DistCp.java:153)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>         at org.apache.hadoop.tools.DistCp.main(DistCp.java:441)
> 21/12/01 13:57:57 ERROR tools.DistCp: Exception encountered 
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any 
> valid local directory for hdfs://rbf-XX/system/mapred/XX/
> .staging/_distcp-260350640/intermediate.1
>         at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:463)
>         at 
> 

[jira] [Updated] (HDFS-16363) An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path.

2021-11-30 Thread Zhen Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhen Wang updated HDFS-16363:
-
Description: 
An exception occurs in the distcp task of a large number of files, when 
yarn.app.mapreduce.am.staging-dir is set to the hdfs path.

 

task log:
{code:java}
21/12/01 13:56:04 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 
24631997; dirCnt = 1750444
21/12/01 13:56:04 INFO tools.SimpleCopyListing: Build file listing completed.
21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.mb is deprecated. 
Instead, use mapreduce.task.io.sort.mb
21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.factor is deprecated. 
Instead, use mapreduce.task.io.sort.factor
21/12/01 13:57:57 WARN fs.LocalDirAllocator$AllocatorPerContext: Disk Error 
Exception: 
org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create directory: 
/system/mapred/XX/.staging/_distcp-260350640
        at 
org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:98)
        at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:77)
        at 
org.apache.hadoop.util.BasicDiskValidator.checkStatus(BasicDiskValidator.java:32)
        at 
org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createPath(LocalDirAllocator.java:367)
        at 
org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:447)
        at 
org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
        at 
org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
        at 
org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3549)
        at 
org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3343)
        at 
org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3319)
        at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2882)
        at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2921)
        at 
org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:476)
        at 
org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:450)
        at 
org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:155)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:93)
        at 
org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:89)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
        at 
org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:368)
        at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:96)
        at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:205)
        at org.apache.hadoop.tools.DistCp.execute(DistCp.java:182)
        at org.apache.hadoop.tools.DistCp.run(DistCp.java:153)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.hadoop.tools.DistCp.main(DistCp.java:441)
21/12/01 13:57:57 ERROR tools.DistCp: Exception encountered 
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid 
local directory for hdfs://rbf-XX/system/mapred/XX/
.staging/_distcp-260350640/intermediate.1
        at 
org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:463)
        at 
org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
        at 
org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
        at 
org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3549)
        at 
org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3343)
        at 
org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3319)
        at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2882)
        at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2921)
        at 
org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:476)
        at 
org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:450)
        at 
org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:155)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:93)
        at 
org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:89)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
        at 
org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:368)
        at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:96)
        at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:205)
        at org.apache.hadoop.tools.DistCp.execute(DistCp.java:182)
        at 

[jira] [Comment Edited] (HDFS-16363) An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path.

2021-11-30 Thread Zhen Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451558#comment-17451558
 ] 

Zhen Wang edited comment on HDFS-16363 at 12/1/21, 7:09 AM:


DEBUG

org.apache.hadoop.util.DiskChecker#checkDirInternal(java.io.File): 

!image-2021-12-01-15-09-54-965.png!

org.apache.hadoop.fs.LocalDirAllocator.AllocatorPerContext#createPath:

!image-2021-12-01-15-07-42-432.png!


was (Author: wforget):
DEBUG

org.apache.hadoop.fs.LocalDirAllocator.AllocatorPerContext#createPath:

!image-2021-12-01-15-07-42-432.png!

> An exception occurs in the distcp task of a large number of files, when 
> yarn.app.mapreduce.am.staging-dir is set to the hdfs path.
> --
>
> Key: HDFS-16363
> URL: https://issues.apache.org/jira/browse/HDFS-16363
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 3.2.2
>Reporter: Zhen Wang
>Priority: Major
> Attachments: image-2021-12-01-15-07-42-432.png, 
> image-2021-12-01-15-09-54-965.png
>
>
> An exception occurs in the distcp task of a large number of files, when 
> yarn.app.mapreduce.am.staging-dir is set to the hdfs path.
>  
> task log:
> {code:java}
> 21/12/01 13:56:04 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 
> 24631997; dirCnt = 1750444
> 21/12/01 13:56:04 INFO tools.SimpleCopyListing: Build file listing completed.
> 21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.mb is deprecated. 
> Instead, use mapreduce.task.io.sort.mb
> 21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.factor is 
> deprecated. Instead, use mapreduce.task.io.sort.factor
> 21/12/01 13:57:57 WARN fs.LocalDirAllocator$AllocatorPerContext: Disk Error 
> Exception: 
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create 
> directory: /system/mapred/aa/.staging/_distcp-260350640
>         at 
> org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:98)
>         at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:77)
>         at 
> org.apache.hadoop.util.BasicDiskValidator.checkStatus(BasicDiskValidator.java:32)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createPath(LocalDirAllocator.java:367)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:447)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3549)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3343)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3319)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2882)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2921)
>         at 
> org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:476)
>         at 
> org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:450)
>         at 
> org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:155)
>         at 
> org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:93)
>         at 
> org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:89)
>         at 
> org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
>         at 
> org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:368)
>         at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:96)
>         at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:205)
>         at org.apache.hadoop.tools.DistCp.execute(DistCp.java:182)
>         at org.apache.hadoop.tools.DistCp.run(DistCp.java:153)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>         at org.apache.hadoop.tools.DistCp.main(DistCp.java:441)
> 21/12/01 13:57:57 ERROR tools.DistCp: Exception encountered 
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any 
> valid local directory for hdfs://rbf-XX/system/mapred/aa/
> .staging/_distcp-260350640/intermediate.1
>         at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:463)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
>         at 
> 

[jira] [Commented] (HDFS-16363) An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path.

2021-11-30 Thread Zhen Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451558#comment-17451558
 ] 

Zhen Wang commented on HDFS-16363:
--

DEBUG

org.apache.hadoop.fs.LocalDirAllocator.AllocatorPerContext#createPath:

!image-2021-12-01-15-07-42-432.png!

> An exception occurs in the distcp task of a large number of files, when 
> yarn.app.mapreduce.am.staging-dir is set to the hdfs path.
> --
>
> Key: HDFS-16363
> URL: https://issues.apache.org/jira/browse/HDFS-16363
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 3.2.2
>Reporter: Zhen Wang
>Priority: Major
> Attachments: image-2021-12-01-15-07-42-432.png
>
>
> An exception occurs in the distcp task of a large number of files, when 
> yarn.app.mapreduce.am.staging-dir is set to the hdfs path.
>  
> task log:
> {code:java}
> 21/12/01 13:56:04 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 
> 24631997; dirCnt = 1750444
> 21/12/01 13:56:04 INFO tools.SimpleCopyListing: Build file listing completed.
> 21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.mb is deprecated. 
> Instead, use mapreduce.task.io.sort.mb
> 21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.factor is 
> deprecated. Instead, use mapreduce.task.io.sort.factor
> 21/12/01 13:57:57 WARN fs.LocalDirAllocator$AllocatorPerContext: Disk Error 
> Exception: 
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create 
> directory: /system/mapred/aa/.staging/_distcp-260350640
>         at 
> org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:98)
>         at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:77)
>         at 
> org.apache.hadoop.util.BasicDiskValidator.checkStatus(BasicDiskValidator.java:32)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createPath(LocalDirAllocator.java:367)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:447)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3549)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3343)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3319)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2882)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2921)
>         at 
> org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:476)
>         at 
> org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:450)
>         at 
> org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:155)
>         at 
> org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:93)
>         at 
> org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:89)
>         at 
> org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
>         at 
> org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:368)
>         at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:96)
>         at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:205)
>         at org.apache.hadoop.tools.DistCp.execute(DistCp.java:182)
>         at org.apache.hadoop.tools.DistCp.run(DistCp.java:153)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>         at org.apache.hadoop.tools.DistCp.main(DistCp.java:441)
> 21/12/01 13:57:57 ERROR tools.DistCp: Exception encountered 
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any 
> valid local directory for hdfs://rbf-XX/system/mapred/aa/
> .staging/_distcp-260350640/intermediate.1
>         at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:463)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3549)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3343)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3319)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2882)
>         at 
> 

[jira] [Updated] (HDFS-16363) An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path.

2021-11-30 Thread Zhen Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhen Wang updated HDFS-16363:
-
Attachment: image-2021-12-01-15-07-42-432.png

> An exception occurs in the distcp task of a large number of files, when 
> yarn.app.mapreduce.am.staging-dir is set to the hdfs path.
> --
>
> Key: HDFS-16363
> URL: https://issues.apache.org/jira/browse/HDFS-16363
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 3.2.2
>Reporter: Zhen Wang
>Priority: Major
> Attachments: image-2021-12-01-15-07-42-432.png
>
>
> An exception occurs in the distcp task of a large number of files, when 
> yarn.app.mapreduce.am.staging-dir is set to the hdfs path.
>  
> task log:
> {code:java}
> 21/12/01 13:56:04 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 
> 24631997; dirCnt = 1750444
> 21/12/01 13:56:04 INFO tools.SimpleCopyListing: Build file listing completed.
> 21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.mb is deprecated. 
> Instead, use mapreduce.task.io.sort.mb
> 21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.factor is 
> deprecated. Instead, use mapreduce.task.io.sort.factor
> 21/12/01 13:57:57 WARN fs.LocalDirAllocator$AllocatorPerContext: Disk Error 
> Exception: 
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create 
> directory: /system/mapred/aa/.staging/_distcp-260350640
>         at 
> org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:98)
>         at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:77)
>         at 
> org.apache.hadoop.util.BasicDiskValidator.checkStatus(BasicDiskValidator.java:32)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createPath(LocalDirAllocator.java:367)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:447)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3549)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3343)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3319)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2882)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2921)
>         at 
> org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:476)
>         at 
> org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:450)
>         at 
> org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:155)
>         at 
> org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:93)
>         at 
> org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:89)
>         at 
> org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
>         at 
> org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:368)
>         at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:96)
>         at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:205)
>         at org.apache.hadoop.tools.DistCp.execute(DistCp.java:182)
>         at org.apache.hadoop.tools.DistCp.run(DistCp.java:153)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>         at org.apache.hadoop.tools.DistCp.main(DistCp.java:441)
> 21/12/01 13:57:57 ERROR tools.DistCp: Exception encountered 
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any 
> valid local directory for hdfs://rbf-XX/system/mapred/aa/
> .staging/_distcp-260350640/intermediate.1
>         at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:463)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
>         at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3549)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3343)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3319)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2882)
>         at 
> org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2921)
>         at 
> 

[jira] [Updated] (HDFS-16363) An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path.

2021-11-30 Thread Zhen Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhen Wang updated HDFS-16363:
-
Description: 
An exception occurs in the distcp task of a large number of files, when 
yarn.app.mapreduce.am.staging-dir is set to the hdfs path.

 

task log:
{code:java}
21/12/01 13:56:04 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 
24631997; dirCnt = 1750444
21/12/01 13:56:04 INFO tools.SimpleCopyListing: Build file listing completed.
21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.mb is deprecated. 
Instead, use mapreduce.task.io.sort.mb
21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.factor is deprecated. 
Instead, use mapreduce.task.io.sort.factor
21/12/01 13:57:57 WARN fs.LocalDirAllocator$AllocatorPerContext: Disk Error 
Exception: 
org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create directory: 
/system/mapred/aa/.staging/_distcp-260350640
        at 
org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:98)
        at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:77)
        at 
org.apache.hadoop.util.BasicDiskValidator.checkStatus(BasicDiskValidator.java:32)
        at 
org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createPath(LocalDirAllocator.java:367)
        at 
org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:447)
        at 
org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
        at 
org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
        at 
org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3549)
        at 
org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3343)
        at 
org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3319)
        at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2882)
        at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2921)
        at 
org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:476)
        at 
org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:450)
        at 
org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:155)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:93)
        at 
org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:89)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
        at 
org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:368)
        at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:96)
        at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:205)
        at org.apache.hadoop.tools.DistCp.execute(DistCp.java:182)
        at org.apache.hadoop.tools.DistCp.run(DistCp.java:153)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.hadoop.tools.DistCp.main(DistCp.java:441)
21/12/01 13:57:57 ERROR tools.DistCp: Exception encountered 
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid 
local directory for hdfs://rbf-XX/system/mapred/aa/
.staging/_distcp-260350640/intermediate.1
        at 
org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:463)
        at 
org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
        at 
org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
        at 
org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3549)
        at 
org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3343)
        at 
org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3319)
        at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2882)
        at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2921)
        at 
org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:476)
        at 
org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:450)
        at 
org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:155)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:93)
        at 
org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:89)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
        at 
org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:368)
        at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:96)
        at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:205)
        at org.apache.hadoop.tools.DistCp.execute(DistCp.java:182)
        at 

[jira] [Updated] (HDFS-16363) An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path.

2021-11-30 Thread Zhen Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhen Wang updated HDFS-16363:
-
Description: 
An exception occurs in the distcp task of a large number of files, when 
yarn.app.mapreduce.am.staging-dir is set to the hdfs path.

 

task log:
{code:java}
21/12/01 13:56:04 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 
24631997; dirCnt = 1750444
21/12/01 13:56:04 INFO tools.SimpleCopyListing: Build file listing completed.
21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.mb is deprecated. 
Instead, use mapreduce.task.io.sort.mb
21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.factor is deprecated. 
Instead, use mapreduce.task.io.sort.factor
21/12/01 13:57:57 WARN fs.LocalDirAllocator$AllocatorPerContext: Disk Error 
Exception: 
org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create directory: 
/system/mapred/aa/.staging/_distcp-260350640
        at 
org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:98)
        at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:77)
        at 
org.apache.hadoop.util.BasicDiskValidator.checkStatus(BasicDiskValidator.java:32)
        at 
org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createPath(LocalDirAllocator.java:367)
        at 
org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:447)
        at 
org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
        at 
org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
        at 
org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3549)
        at 
org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3343)
        at 
org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3319)
        at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2882)
        at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2921)
        at 
org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:476)
        at 
org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:450)
        at 
org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:155)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:93)
        at 
org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:89)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
        at 
org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:368)
        at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:96)
        at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:205)
        at org.apache.hadoop.tools.DistCp.execute(DistCp.java:182)
        at org.apache.hadoop.tools.DistCp.run(DistCp.java:153)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.hadoop.tools.DistCp.main(DistCp.java:441)
21/12/01 13:57:57 ERROR tools.DistCp: Exception encountered 
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid 
local directory for hdfs://rbf-XX/system/mapred/aa/
.staging/_distcp-260350640/intermediate.1
        at 
org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:463)
        at 
org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
        at 
org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
        at 
org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3549)
        at 
org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3343)
        at 
org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3319)
        at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2882)
        at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2921)
        at 
org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:476)
        at 
org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:450)
        at 
org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:155)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:93)
        at 
org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:89)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
        at 
org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:368)
        at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:96)
        at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:205)
        at org.apache.hadoop.tools.DistCp.execute(DistCp.java:182)
        at 

[jira] [Updated] (HDFS-16363) An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path.

2021-11-30 Thread Zhen Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhen Wang updated HDFS-16363:
-
Description: 
An exception occurs in the distcp task of a large number of files, when 
yarn.app.mapreduce.am.staging-dir is set to the hdfs path.

 

error:

 
{code:java}
21/12/01 13:56:04 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 
24631997; dirCnt = 1750444
21/12/01 13:56:04 INFO tools.SimpleCopyListing: Build file listing completed.
21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.mb is deprecated. 
Instead, use mapreduce.task.io.sort.mb
21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.factor is deprecated. 
Instead, use mapreduce.task.io.sort.factor
21/12/01 13:57:57 WARN fs.LocalDirAllocator$AllocatorPerContext: Disk Error 
Exception: 
org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create directory: 
/system/mapred/aa/.staging/_distcp-260350640
        at 
org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:98)
        at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:77)
        at 
org.apache.hadoop.util.BasicDiskValidator.checkStatus(BasicDiskValidator.java:32)
        at 
org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createPath(LocalDirAllocator.java:367)
        at 
org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:447)
        at 
org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
        at 
org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
        at 
org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3549)
        at 
org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3343)
        at 
org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3319)
        at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2882)
        at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2921)
        at 
org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:476)
        at 
org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:450)
        at 
org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:155)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:93)
        at 
org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:89)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
        at 
org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:368)
        at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:96)
        at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:205)
        at org.apache.hadoop.tools.DistCp.execute(DistCp.java:182)
        at org.apache.hadoop.tools.DistCp.run(DistCp.java:153)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.hadoop.tools.DistCp.main(DistCp.java:441)
21/12/01 13:57:57 ERROR tools.DistCp: Exception encountered 
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid 
local directory for hdfs://rbf-XX/system/mapred/aa/
.staging/_distcp-260350640/intermediate.1
        at 
org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:463)
        at 
org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
        at 
org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
        at 
org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3549)
        at 
org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3343)
        at 
org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3319)
        at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2882)
        at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2921)
        at 
org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:476)
        at 
org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:450)
        at 
org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:155)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:93)
        at 
org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:89)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
        at 
org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:368)
        at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:96)
        at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:205)
        at org.apache.hadoop.tools.DistCp.execute(DistCp.java:182)
        at 

[jira] [Created] (HDFS-16363) An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path.

2021-11-30 Thread Zhen Wang (Jira)
Zhen Wang created HDFS-16363:


 Summary: An exception occurs in the distcp task of a large number 
of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path.
 Key: HDFS-16363
 URL: https://issues.apache.org/jira/browse/HDFS-16363
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: distcp
Affects Versions: 3.2.2
Reporter: Zhen Wang


An exception occurs in the distcp task of a large number of files, when 
yarn.app.mapreduce.am.staging-dir is set to the hdfs path.
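
For reference, the configuration involved looks like the following; the value is a placeholder standing in for the reporter's redacted hdfs://rbf-XX/... path. DistCp keeps its _distcp-* listing metadata under <staging dir>/<user>/.staging (see /system/mapred/XX/.staging/_distcp-260350640 in the logs above), which appears to be how the HDFS staging path reaches the local-directory allocation code path shown in the stack traces.

{code:java}
import org.apache.hadoop.conf.Configuration;

/**
 * Illustration only: the property whose value triggers the report when it points
 * at an HDFS path. The URI below is a placeholder, not the reporter's cluster.
 */
public class StagingDirConfig {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.set("yarn.app.mapreduce.am.staging-dir", "hdfs://example-ns/system/mapred");
    System.out.println(conf.get("yarn.app.mapreduce.am.staging-dir"));
  }
}
{code}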



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org