[ https://issues.apache.org/jira/browse/HADOOP-15292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16387182#comment-16387182 ]
Virajith Jalaparti commented on HADOOP-15292: --------------------------------------------- Attached patch fixes this issue by replacing the positioned read with an initial seek, and reading the remaining data directly from the open stream. > Distcp's use of pread is slowing it down. > ----------------------------------------- > > Key: HADOOP-15292 > URL: https://issues.apache.org/jira/browse/HADOOP-15292 > Project: Hadoop Common > Issue Type: Bug > Reporter: Virajith Jalaparti > Priority: Major > Attachments: HADOOP-15292.000.patch > > > Distcp currently uses positioned-reads (in > RetriableFileCopyCommand#copyBytes) when the source offset is > 0. This > results in unnecessary overheads (new BlockReader being created on the > client-side, multiple readBlock() calls to the Datanodes, each of requires > the creation of a BlockSender and an inputstream to the ReplicaInfo). -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org