[ https://issues.apache.org/jira/browse/HDFS-17216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17831838#comment-17831838 ]
ASF GitHub Bot commented on HDFS-17216:
---------------------------------------

dineshchitlangia merged PR #6138:
URL: https://github.com/apache/hadoop/pull/6138

> When distcp handle the small files, the bandwidth parameter will be invalid,
> resulting in serious overspeed behavior
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-17216
>                 URL: https://issues.apache.org/jira/browse/HDFS-17216
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: distcp
>    Affects Versions: 3.3.4
>            Reporter: xiaojunxiang
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: DiscpAnalyze.jpg
>
>
> When distcp copies small files (file size slightly smaller than the
> bandwidth limit), the throttler only starts to throttle after 1 second, and
> the throttling is applied to each file separately. Such a file finishes
> copying before the throttler ever engages, so the bandwidth limit has no
> effect. distcp can then fill the cluster bandwidth and crush production
> traffic, which is a terrible thing.
> Also, it takes time to set up the IO pipeline for each file, so you should
> not test with very small files: the setup cost slows the transfer, and once
> the bandwidth limit kicks in it amplifies the impact of small files on the
> measured rate.
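
The behaviour reported in the issue can be illustrated with a small simulation. This is a simplified model, not Hadoop's code: it assumes the throttle computes its rate from whole elapsed seconds, compares the raw byte count against the limit during the first second, and is reset for every file. The function names, the 64 KiB chunk size, the 50 ms sleep, and the 1000 MiB/s link speed are all hypothetical values chosen for illustration.

```python
MIB = 1024 * 1024


def simulate_copy_seconds(file_size, max_bytes_per_sec, link_bytes_per_sec,
                          chunk=64 * 1024):
    """Return the simulated seconds needed to copy one file through a
    per-file throttle (assumed model, see the notes above)."""
    clock = 0.0       # simulated time since this file's stream was opened
    bytes_read = 0
    while bytes_read < file_size:
        # Throttle check before every read.
        while True:
            elapsed = int(clock)  # whole seconds only (assumption)
            # During the first second the byte count itself is the "rate".
            rate = bytes_read if elapsed == 0 else bytes_read / elapsed
            if rate <= max_bytes_per_sec:
                break
            clock += 0.05  # simulated 50 ms sleep
        n = min(chunk, file_size - bytes_read)
        clock += n / link_bytes_per_sec  # time to move the chunk at link speed
        bytes_read += n
    return clock


def effective_rate(file_sizes, max_bytes_per_sec, link_bytes_per_sec):
    """Average bytes per second over a sequence of files copied one by one,
    each with a fresh throttle."""
    total_time = sum(
        simulate_copy_seconds(size, max_bytes_per_sec, link_bytes_per_sec)
        for size in file_sizes
    )
    return sum(file_sizes) / total_time


limit = 10 * MIB     # configured bandwidth limit, bytes per second
link = 1000 * MIB    # hypothetical raw link speed, bytes per second

# 100 files just under the limit versus one file of the same total size.
small_rate = effective_rate([9 * MIB] * 100, limit, link)
large_rate = effective_rate([900 * MIB], limit, link)

print(f"small files: {small_rate / MIB:.1f} MiB/s")
print(f"one large file: {large_rate / MIB:.1f} MiB/s")
```

Under these assumptions the single large file stays at or below the configured limit, while the small files are never throttled and run at roughly the raw link speed, which matches the overspeed described in the issue.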