[jira] [Updated] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated HDFS-3788:

    Target Version/s: (was: 2.0.2-alpha)

distcp can't copy large files using webhdfs due to missing Content-Length header

    Key: HDFS-3788
    URL: https://issues.apache.org/jira/browse/HDFS-3788
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: webhdfs
    Affects Versions: 0.23.3, 2.0.0-alpha
    Reporter: Eli Collins
    Assignee: Tsz Wo (Nicholas), SZE
    Priority: Critical
    Fix For: 0.23.3, 2.0.2-alpha
    Attachments: 20120814NullEntity.patch, distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814b.patch, h3788_20120814.patch, h3788_20120815.patch, h3788_20120816.patch

The following command fails when data1 contains a 3gb file. It passes when using hftp or when the directory just contains smaller (2gb) files, so looks like a webhdfs issue with large files.

{{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 hdfs://localhost:8020/user/eli/data2}}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated HDFS-3788:

    Fix Version/s: 0.23.3

Merged to 23.

distcp can't copy large files using webhdfs due to missing Content-Length header

    Key: HDFS-3788
    URL: https://issues.apache.org/jira/browse/HDFS-3788
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: webhdfs
    Affects Versions: 0.23.3, 2.0.0-alpha
    Reporter: Eli Collins
    Assignee: Tsz Wo (Nicholas), SZE
    Priority: Critical
    Fix For: 0.23.3, 2.2.0-alpha
    Attachments: 20120814NullEntity.patch, distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814b.patch, h3788_20120814.patch, h3788_20120815.patch, h3788_20120816.patch
[jira] [Updated] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HDFS-3788:

    Attachment: h3788_20120815.patch

h3788_20120815.patch: use Long for file length.

distcp can't copy large files using webhdfs due to missing Content-Length header

    Key: HDFS-3788
    URL: https://issues.apache.org/jira/browse/HDFS-3788
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: webhdfs
    Affects Versions: 0.23.3, 2.0.0-alpha
    Reporter: Eli Collins
    Assignee: Tsz Wo (Nicholas), SZE
    Priority: Critical
    Attachments: 20120814NullEntity.patch, distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814b.patch, h3788_20120814.patch, h3788_20120815.patch
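Editor's illustration for the "use Long for file length" note above. The patch itself is not reproduced in this thread, so the following is only a minimal Java sketch of why a 3GB length needs a 64-bit type and why a nullable Long is convenient when the length may be unknown. The class and method names (FileLengthSketch, parseLength) are hypothetical and are not taken from the Hadoop source.

```java
final class FileLengthSketch {
    /**
     * Parses a file length given as a decimal string.
     * Returns null when no length is available, which a primitive
     * long could not express without a sentinel value.
     * (Hypothetical helper, not the code from the patch.)
     */
    public static Long parseLength(String value) {
        if (value == null) {
            return null;
        }
        return Long.parseLong(value.trim());
    }

    public static void main(String[] args) {
        long threeGb = 3L * 1024 * 1024 * 1024; // 3221225472 bytes

        // A 32-bit int tops out at 2GB - 1, so a 3GB length does not fit.
        System.out.println("Integer.MAX_VALUE = " + Integer.MAX_VALUE);
        System.out.println("3GB as long       = " + threeGb);
        System.out.println("3GB cast to int   = " + (int) threeGb);

        // Parsing the same value as an int fails outright.
        try {
            Integer.parseInt(Long.toString(threeGb));
        } catch (NumberFormatException e) {
            System.out.println("Integer.parseInt rejects it: " + e.getMessage());
        }

        System.out.println("parseLength       = " + parseLength(Long.toString(threeGb)));
        System.out.println("parseLength(null) = " + parseLength(null));
    }
}
```

The point of the sketch is only the type choice: a 2gb file sits at the edge of the int range and a 3gb file is past it, which matches the sizes named in the issue description.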
[jira] [Updated] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HDFS-3788:

    Attachment: h3788_20120816.patch

Fair enough. I added spaces in the comment so that it is consistent with the existing code. I did not change the for-loop since there is no existing for-loop having the space.

h3788_20120816.patch

distcp can't copy large files using webhdfs due to missing Content-Length header

    Key: HDFS-3788
    URL: https://issues.apache.org/jira/browse/HDFS-3788
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: webhdfs
    Affects Versions: 0.23.3, 2.0.0-alpha
    Reporter: Eli Collins
    Assignee: Tsz Wo (Nicholas), SZE
    Priority: Critical
    Attachments: 20120814NullEntity.patch, distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814b.patch, h3788_20120814.patch, h3788_20120815.patch, h3788_20120816.patch
[jira] [Updated] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HDFS-3788:

    Resolution: Fixed
    Fix Version/s: 2.2.0-alpha
    Hadoop Flags: Reviewed
    Status: Resolved (was: Patch Available)

I have committed this.

distcp can't copy large files using webhdfs due to missing Content-Length header

    Key: HDFS-3788
    URL: https://issues.apache.org/jira/browse/HDFS-3788
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: webhdfs
    Affects Versions: 0.23.3, 2.0.0-alpha
    Reporter: Eli Collins
    Assignee: Tsz Wo (Nicholas), SZE
    Priority: Critical
    Fix For: 2.2.0-alpha
    Attachments: 20120814NullEntity.patch, distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814b.patch, h3788_20120814.patch, h3788_20120815.patch, h3788_20120816.patch
[jira] [Updated] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HDFS-3788:

    Attachment: 20120814NullEntity.patch

20120814NullEntity.patch: This is the mock test I mentioned. I used it to figure out when the web server switches to chunked transfer encoding when the size = 2GB - 1 but not 2GB. Unfortunately, the patch adds some test code to DatanodeWebHdfsMethods, so it won't be committed.

distcp can't copy large files using webhdfs due to missing Content-Length header

    Key: HDFS-3788
    URL: https://issues.apache.org/jira/browse/HDFS-3788
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: webhdfs
    Affects Versions: 0.23.3, 2.0.0-alpha
    Reporter: Eli Collins
    Assignee: Tsz Wo (Nicholas), SZE
    Priority: Critical
    Attachments: 20120814NullEntity.patch, distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814b.patch, h3788_20120814.patch
[jira] [Updated] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HDFS-3788:

    Attachment: h3788_20120814.patch

Thanks Daryn for taking a look. Here is a new patch: h3788_20120814.patch

I have run the 3GB file test included in the previous patch. The test will not be committed since it takes 10 minutes.

distcp can't copy large files using webhdfs due to missing Content-Length header

    Key: HDFS-3788
    URL: https://issues.apache.org/jira/browse/HDFS-3788
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: webhdfs
    Affects Versions: 0.23.3, 2.0.0-alpha
    Reporter: Eli Collins
    Assignee: Tsz Wo (Nicholas), SZE
    Priority: Critical
    Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814.patch
[jira] [Updated] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HDFS-3788:

    Attachment: h3788_20120814b.patch

h3788_20120814b.patch: fixes an NPE bug in the previous patch.

distcp can't copy large files using webhdfs due to missing Content-Length header

    Key: HDFS-3788
    URL: https://issues.apache.org/jira/browse/HDFS-3788
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: webhdfs
    Affects Versions: 0.23.3, 2.0.0-alpha
    Reporter: Eli Collins
    Assignee: Tsz Wo (Nicholas), SZE
    Priority: Critical
    Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814b.patch, h3788_20120814.patch
[jira] [Updated] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated HDFS-3788:

    Affects Version/s: 0.23.3

This affects 0.23 as well.

distcp can't copy large files using webhdfs due to missing Content-Length header

    Key: HDFS-3788
    URL: https://issues.apache.org/jira/browse/HDFS-3788
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: webhdfs
    Affects Versions: 0.23.3, 2.0.0-alpha
    Reporter: Eli Collins
    Priority: Critical
    Attachments: distcp-webhdfs-errors.txt
[jira] [Updated] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HDFS-3788:

    Attachment: h3788_20120813.patch

h3788_20120813.patch: check content-length only for non-chunked transfer encoding.

distcp can't copy large files using webhdfs due to missing Content-Length header

    Key: HDFS-3788
    URL: https://issues.apache.org/jira/browse/HDFS-3788
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: webhdfs
    Affects Versions: 0.23.3, 2.0.0-alpha
    Reporter: Eli Collins
    Priority: Critical
    Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch
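Editor's illustration for the patch note above ("check content-length only for non-chunked transfer encoding"). The patch is not quoted in this thread, so this is a hedged Java sketch of the idea rather than the committed code: a client that insists on Content-Length should skip that check when the response declares chunked transfer encoding, because a chunked response does not carry a Content-Length header. The class name, method name, and exception message are assumptions made for the example.

```java
import java.io.IOException;
import java.util.Map;

final class ContentLengthCheck {
    /**
     * Returns the declared body length, or null when the response uses
     * chunked transfer encoding and therefore has no Content-Length.
     *
     * Simplifications in this sketch (not the patch's actual code):
     * header names are looked up with exact case, and only a bare
     * "chunked" Transfer-Encoding value is recognised.
     */
    public static Long expectedLength(Map<String, String> headers) throws IOException {
        String transferEncoding = headers.get("Transfer-Encoding");
        if (transferEncoding != null && transferEncoding.trim().equalsIgnoreCase("chunked")) {
            return null; // length unknown up front; read until the stream ends
        }
        String contentLength = headers.get("Content-Length");
        if (contentLength == null) {
            // Illustrative message only.
            throw new IOException("Content-Length header is missing");
        }
        return Long.parseLong(contentLength.trim());
    }

    public static void main(String[] args) throws IOException {
        // Chunked response: no length check is applied.
        System.out.println(expectedLength(Map.of("Transfer-Encoding", "chunked")));
        // Non-chunked response: the declared length is required and parsed as a long.
        System.out.println(expectedLength(Map.of("Content-Length", "3221225472")));
    }
}
```

A caller would treat a null result as "length unknown" and read to end of stream, and would only compare bytes read against the declared length when one is returned.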
[jira] [Updated] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HDFS-3788:

    Assignee: Tsz Wo (Nicholas), SZE
    Status: Patch Available (was: Open)

distcp can't copy large files using webhdfs due to missing Content-Length header

    Key: HDFS-3788
    URL: https://issues.apache.org/jira/browse/HDFS-3788
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: webhdfs
    Affects Versions: 2.0.0-alpha, 0.23.3
    Reporter: Eli Collins
    Assignee: Tsz Wo (Nicholas), SZE
    Priority: Critical
    Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch
[jira] [Updated] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins updated HDFS-3788:

    Attachment: distcp-webhdfs-errors.txt

Full logs attached.

distcp can't copy large files using webhdfs due to missing Content-Length header

    Key: HDFS-3788
    URL: https://issues.apache.org/jira/browse/HDFS-3788
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: webhdfs
    Affects Versions: 2.0.0-alpha
    Reporter: Eli Collins
    Priority: Critical
    Attachments: distcp-webhdfs-errors.txt