[ https://issues.apache.org/jira/browse/HDFS-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415162#comment-13415162 ]
Daryn Sharp commented on HDFS-3577:
-----------------------------------

The {{BoundedInputStream}} is a no-op when the constructor without a length is used, so I think this:
{code}
final InputStream is = cl == null? new BoundedInputStream(in) : new BoundedInputStream(in, Long.parseLong(cl));
{code}
can be:
{code}
final InputStream is = cl == null? in : new BoundedInputStream(in, Long.parseLong(cl));
{code}

A chunk size can be specified for a {{HttpURLConnection}}, and we should be able to enable keep-alive on the socket (I thought it was the default?) to avoid opening a new connection for every chunk. I don't know anything about {{MessageBodyWriter}} et al., so if my suggestion isn't feasible and someone else OKs the {{MessageBodyWriter}}, I'm fine with it.

> WebHdfsFileSystem can not read files larger than 24KB
> -----------------------------------------------------
>
>                 Key: HDFS-3577
>                 URL: https://issues.apache.org/jira/browse/HDFS-3577
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Tsz Wo (Nicholas), SZE
>            Priority: Blocker
>         Attachments: h3577_20120705.patch, h3577_20120708.patch, h3577_20120714.patch
>
>
> When reading a file large enough that the httpserver running
> webhdfs/httpfs uses chunked transfer encoding (more than 24K in the case of
> webhdfs), the WebHdfsFileSystem client fails with an IOException with the
> message *Content-Length header is missing*.
> It looks like WebHdfsFileSystem is delegating opening of the inputstream to the
> *ByteRangeInputStream.URLOpener* class, which checks for the *Content-Length*
> header, but when using chunked transfer encoding the *Content-Length* header
> is not present and the *URLOpener.openInputStream()* method throws an
> exception.
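
To illustrate the simplification suggested in the comment above, here is a minimal, self-contained sketch of the idea: when no Content-Length value is available (as with chunked transfer encoding), return the raw stream unchanged; otherwise cap reads at the advertised length. The names {{LimitedInputStream}}, {{ContentLengthStreams}} and {{wrap}} are hypothetical and exist only for this example. {{LimitedInputStream}} is a simplified stand-in for the {{BoundedInputStream}} used in the patch, not its actual implementation, and this is not the code from any of the attached patches.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical stand-in for a length-bounded stream (illustration only).
 * It reports end-of-stream once 'limit' bytes have been read.
 * Note: skip() and available() are not bounded in this sketch.
 */
class LimitedInputStream extends FilterInputStream {
    private long remaining;

    LimitedInputStream(InputStream in, long limit) {
        super(in);
        this.remaining = limit;
    }

    @Override
    public int read() throws IOException {
        if (remaining <= 0) {
            return -1; // limit reached: behave like end-of-stream
        }
        int b = super.read();
        if (b >= 0) {
            remaining--;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        if (remaining <= 0) {
            return -1; // limit reached: behave like end-of-stream
        }
        // Never ask the underlying stream for more than the remaining budget.
        int n = super.read(buf, off, (int) Math.min(len, remaining));
        if (n > 0) {
            remaining -= n;
        }
        return n;
    }
}

/** Hypothetical helper mirroring the ternary expression from the comment. */
class ContentLengthStreams {
    /**
     * @param in the response body stream
     * @param cl the Content-Length header value, or null if the header is
     *           absent (for example, with chunked transfer encoding)
     */
    static InputStream wrap(InputStream in, String cl) {
        // No header: pass the stream through untouched instead of
        // wrapping it in a bounded stream that imposes no bound.
        return cl == null ? in : new LimitedInputStream(in, Long.parseLong(cl));
    }
}
```

With this shape, a caller would pass {{conn.getHeaderField("Content-Length")}} as {{cl}}; a null value then simply yields the connection's own input stream, which is the behavior the suggested one-line change aims for.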