[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434276#comment-13434276 ]
Eli Collins commented on HDFS-3788:
-----------------------------------

bq. Do you mean "fs -get"? No way, they should have the same code path. Are you sure that both server and client were running trunk?

Yes, hadoop fs -get of a 3 GB file works, but distcp of the directory containing that file fails. And yes, I'm using a trunk build for everything, running via a pseudo-distributed tarball install on my laptop.

Can you explain what the bug is and what the relevant fix does? I don't see why we were not setting the Content-Length header, since we set it unconditionally on the server side.

> distcp can't copy large files using webhdfs due to missing Content-Length header
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-3788
>                 URL: https://issues.apache.org/jira/browse/HDFS-3788
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>    Affects Versions: 0.23.3, 2.0.0-alpha
>            Reporter: Eli Collins
>            Assignee: Tsz Wo (Nicholas), SZE
>            Priority: Critical
>         Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814.patch
>
>
> The following command fails when data1 contains a 3 GB file. It passes when using hftp or when the directory contains only smaller (< 2 GB) files, so this looks like a webhdfs issue with large files.
> {{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 hdfs://localhost:8020/user/eli/data2}}
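The 2 GB boundary in the report is consistent with an int-sized content length: java.net.URLConnection#getContentLength() returns an int and reports -1 once the value no longer fits, and a response sent with chunked transfer encoding carries no Content-Length header at all. Below is a minimal, hypothetical Java sketch for probing the headers of a WebHDFS OPEN response; the host, path, and file name are assumptions taken from the report, and this illustrates how a missing or overflowed header would manifest, not the fix in the attached patches.

{code}
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical standalone probe for the WebHDFS Content-Length symptom;
// not part of h3788_20120813.patch or h3788_20120814.patch.
public class WebHdfsHeaderCheck {
  public static void main(String[] args) throws IOException {
    // Assumed host/path, matching the pseudo-distributed setup in the report.
    URL url = new URL(
        "http://eli-thinkpad:50070/webhdfs/v1/user/eli/data1/big.bin?op=OPEN");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setInstanceFollowRedirects(true); // OPEN redirects to a datanode
    conn.connect();

    // getContentLength() returns an int: it yields -1 both when the header
    // value exceeds Integer.MAX_VALUE (~2 GB) and when the server omits the
    // header entirely and streams with chunked transfer encoding instead.
    int intLen = conn.getContentLength();
    long longLen = conn.getContentLengthLong(); // Java 7+, long-safe variant
    System.out.println("Content-Length (int):  " + intLen);
    System.out.println("Content-Length (long): " + longLen);
    System.out.println("Transfer-Encoding:     "
        + conn.getHeaderField("Transfer-Encoding"));
    conn.disconnect();
  }
}
{code}

If the failing case printed -1 for the int length alongside a chunked Transfer-Encoding, that would match the reported symptom: a client that insists on a Content-Length header would succeed on small files and fail only past the 2 GB mark.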