[ https://issues.apache.org/jira/browse/HADOOP-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16211366#comment-16211366 ]
John Zhuge commented on HADOOP-14872:
-------------------------------------

The changes can be divided into 2 independent groups:

Enhance StreamCapabilities and add it to input streams
* FSDataInputStream.java
* StreamCapabilities.java
* filesystem.md
* DFSInputStream.java
* DFSOutputStream.java
* BlockBlobAppendStream.java

CryptoInputStream to support unbuffer
* CryptoInputStream.java
* CryptoStreamsTestBase.java
* TestCryptoStreams.java
* TestCryptoStreamsForLocalFS.java
* TestCryptoStreamsNormal.java

If the CryptoInputStream unbuffer change has to be reverted for any reason, we will still have the StreamCapabilities enhancements.

> CryptoInputStream should implement unbuffer
> -------------------------------------------
>
>                 Key: HADOOP-14872
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14872
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 2.6.4
>            Reporter: John Zhuge
>            Assignee: John Zhuge
>         Attachments: HADOOP-14872.001.patch, HADOOP-14872.002.patch,
> HADOOP-14872.003.patch, HADOOP-14872.004.patch, HADOOP-14872.005.patch,
> HADOOP-14872.006.patch, HADOOP-14872.007.patch, HADOOP-14872.008.patch,
> HADOOP-14872.009.patch, HADOOP-14872.010.patch, HADOOP-14872.011.patch,
> HADOOP-14872.012.patch
>
>
> Discovered in IMPALA-5909.
> Opening an encrypted HDFS file returns a chain of wrapped input streams:
> {noformat}
> HdfsDataInputStream
>   CryptoInputStream
>     DFSInputStream
> {noformat}
> If an application such as Impala or HBase calls HdfsDataInputStream#unbuffer,
> FSDataInputStream#unbuffer will be called:
> {code:java}
>     try {
>       ((CanUnbuffer)in).unbuffer();
>     } catch (ClassCastException e) {
>       throw new UnsupportedOperationException("this stream does not " +
>           "support unbuffering.");
>     }
> {code}
> If the {{in}} class does not implement CanUnbuffer, an
> UnsupportedOperationException (UOE) will be thrown. If the application is not
> careful, tons of UOEs will show up in logs.
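The failure mode described above can be reproduced without Hadoop on the classpath. The following is only a sketch with stand-in types: the nested `CanUnbuffer` is a local one-method interface that mirrors the Hadoop one, and `UnbufferDemo`, `PlainStream`, `WrapperWithoutUnbuffer`, and `tryUnbuffer` are hypothetical names invented for illustration. It shows the cast-based dispatch throwing for a wrapper that does not implement the interface, plus an `instanceof` guard that a caller could use to avoid the exception.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.InputStream;

// Hypothetical demo class; none of these types are the real Hadoop classes.
final class UnbufferDemo {

    // Local stand-in mirroring Hadoop's one-method CanUnbuffer interface.
    interface CanUnbuffer {
        void unbuffer();
    }

    // Plays the role of an inner stream that supports unbuffer.
    static class PlainStream extends ByteArrayInputStream implements CanUnbuffer {
        boolean unbuffered = false;

        PlainStream(byte[] data) {
            super(data);
        }

        @Override
        public void unbuffer() {
            unbuffered = true;
        }
    }

    // Plays the role of a wrapper that does NOT implement CanUnbuffer.
    static class WrapperWithoutUnbuffer extends FilterInputStream {
        WrapperWithoutUnbuffer(InputStream in) {
            super(in);
        }
    }

    // Same cast-and-catch pattern as the snippet quoted in the description.
    static void unbuffer(InputStream in) {
        try {
            ((CanUnbuffer) in).unbuffer();
        } catch (ClassCastException e) {
            throw new UnsupportedOperationException(
                "this stream does not support unbuffering.");
        }
    }

    // Caller-side guard: returns true only if unbuffer was actually invoked.
    static boolean tryUnbuffer(InputStream in) {
        if (in instanceof CanUnbuffer) {
            ((CanUnbuffer) in).unbuffer();
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        InputStream wrapped =
            new WrapperWithoutUnbuffer(new PlainStream(new byte[0]));
        try {
            unbuffer(wrapped);
        } catch (UnsupportedOperationException e) {
            System.out.println("UOE: " + e.getMessage());
        }
        System.out.println("guarded call unbuffered: " + tryUnbuffer(wrapped));
    }
}
```

Note that the guard only inspects the outermost stream. In this sketch the inner `PlainStream` is never unbuffered through the wrapper, which is the gap the issue proposes to close.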
> In comparison, opening a non-encrypted HDFS file returns this chain:
> {noformat}
> HdfsDataInputStream
>   DFSInputStream
> {noformat}
> DFSInputStream implements CanUnbuffer.
> It is good for CryptoInputStream to implement CanUnbuffer for 3 reasons:
> * Release buffer, cache, or any other resource when instructed
> * Call unbuffer on its wrapped DFSInputStream
> * Avoid the UOE described above. Applications may not handle the UOE very
> well.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
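The reasons listed in the issue description can be illustrated with a small wrapper that implements the interface itself. This is a sketch under stated assumptions, not the attached patch: `CryptoUnbufferSketch`, `InnerStream`, `BufferingWrapper`, and `holdsBuffer` are hypothetical names, the nested `CanUnbuffer` is a local stand-in for the Hadoop interface, and the 8192-byte buffer is an arbitrary placeholder for whatever resources a real stream holds.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.InputStream;

// Hypothetical sketch; not the real CryptoInputStream.
final class CryptoUnbufferSketch {

    // Local stand-in mirroring Hadoop's one-method CanUnbuffer interface.
    interface CanUnbuffer {
        void unbuffer();
    }

    // Plays the role of the wrapped stream that already supports unbuffer.
    static class InnerStream extends ByteArrayInputStream implements CanUnbuffer {
        boolean unbuffered = false;

        InnerStream(byte[] data) {
            super(data);
        }

        @Override
        public void unbuffer() {
            unbuffered = true;
        }
    }

    // Wrapper that releases its own buffer and forwards the call downward.
    static class BufferingWrapper extends FilterInputStream implements CanUnbuffer {
        // Placeholder for buffers or caches held by the wrapper (size is arbitrary).
        private byte[] buffer = new byte[8192];

        BufferingWrapper(InputStream in) {
            super(in);
        }

        boolean holdsBuffer() {
            return buffer != null;
        }

        @Override
        public void unbuffer() {
            // Reason 1: release resources held by this layer.
            buffer = null;
            // Reason 2: propagate to the wrapped stream when it supports it.
            if (in instanceof CanUnbuffer) {
                ((CanUnbuffer) in).unbuffer();
            }
            // Reason 3: because this class implements CanUnbuffer, a
            // cast-based caller no longer hits a ClassCastException here.
        }
    }

    public static void main(String[] args) {
        InnerStream inner = new InnerStream(new byte[0]);
        BufferingWrapper wrapper = new BufferingWrapper(inner);
        wrapper.unbuffer();
        System.out.println("wrapper holds buffer: " + wrapper.holdsBuffer());
        System.out.println("inner unbuffered: " + inner.unbuffered);
    }
}
```

A real implementation would presumably also need to re-acquire its buffers lazily on the next read; the sketch omits that step.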