steveloughran commented on a change in pull request #1794: HADOOP-15887: Add an option to avoid writing data locally in Distcp
URL: https://github.com/apache/hadoop/pull/1794#discussion_r363378165
##########
File path: hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm
##########
@@ -362,6 +362,7 @@ Command Line Options
 | `-copybuffersize <copybuffersize>` | Size of the copy buffer to use. By default, `<copybuffersize>` is set to 8192B | |
 | `-xtrack <path>` | Save information about missing source files to the specified path. | This option is only valid with `-update` option. This is an experimental property and it cannot be used with `-atomic` option. |
 | `-direct` | Write directly to destination paths | Useful for avoiding potentially very expensive temporary file rename operations when the destination is an object store |
+| `-noLocalWrite` | Write data to target cluster with data locality disabled. | If this option is set, the distcp task will not write data replication to local datanode to avoid datanode being imbalanced. This option is suggested to be specified when the data to copy is very large and the DistCp job runs on the target cluster. |

Review comment:
suggest: Write data to an HDFS cluster with data locality disabled. | If this option is set, the distcp tasks will not write data blocks to their local datanodes, so avoiding datanodes becoming imbalanced. Recommended when the amount of data to copy is very large, the target cluster is HDFS and the DistCp job runs on that target cluster.