[ https://issues.apache.org/jira/browse/HADOOP-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208272#comment-16208272 ]
Xiao Chen commented on HADOOP-14872:
------------------------------------

Thanks for the continued work on this, John. My belated review, mostly cosmetic:

- We should verify the exception, and log it using the standard loggers rather than printing to {{System.err}}:
{code:title=CryptoStreamsTestBase:testUnbuffer}
// Test buffered read again after unbuffer
try {
  // Must seek to the beginning first
  ((Seekable) in).seek(0);
  // Cannot read again if seek fails
  readCheck(in);
} catch (Exception e) {
  System.err.println(e);
}
{code}
- Suggest including {{Locale.ENGLISH}} in {{capability.toLowerCase()}}; Steve's example above also does this. Hadoop's {{StringUtils#toLowerCase}} makes this a little easier to write, but also conflicts with the Apache Commons StringUtils in some classes. :(
- From existing code: the javadoc for hsync says 'flush out the data in client's buffer and the disk device'. I feel {{Syncable}}'s javadoc is more accurate: 'flush out the data in client's user buffer all the way to the disk device (but the disk may have it in its cache).' We should also update the documentation about this.
- In {{CryptoInputStream}}, rename the method: s/cleanDescriptorPool/clearDecryptorPool/g.
- Maybe we should provide a 1-sentence explanation on readahead, dropbehind and unbuffer, similar to what hsync and hflush have.

> CryptoInputStream should implement unbuffer
> -------------------------------------------
>
>                 Key: HADOOP-14872
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14872
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 2.6.4
>            Reporter: John Zhuge
>            Assignee: John Zhuge
>         Attachments: HADOOP-14872.001.patch, HADOOP-14872.002.patch,
> HADOOP-14872.003.patch, HADOOP-14872.004.patch, HADOOP-14872.005.patch,
> HADOOP-14872.006.patch, HADOOP-14872.007.patch, HADOOP-14872.008.patch,
> HADOOP-14872.009.patch, HADOOP-14872.010.patch, HADOOP-14872.011.patch,
> HADOOP-14872.012.patch
>
> Discovered in IMPALA-5909.
> Opening an encrypted HDFS file returns a chain of wrapped input streams:
> {noformat}
> HdfsDataInputStream
> CryptoInputStream
> DFSInputStream
> {noformat}
> If an application such as Impala or HBase calls HdfsDataInputStream#unbuffer,
> FSDataInputStream#unbuffer will be called:
> {code:java}
> try {
>   ((CanUnbuffer)in).unbuffer();
> } catch (ClassCastException e) {
>   throw new UnsupportedOperationException("this stream does not " +
>       "support unbuffering.");
> }
> {code}
> If the {{in}} class does not implement CanUnbuffer, a UOE will be thrown. If
> the application is not careful, tons of UOEs will show up in the logs.
> In comparison, opening a non-encrypted HDFS file returns this chain:
> {noformat}
> HdfsDataInputStream
> DFSInputStream
> {noformat}
> DFSInputStream implements CanUnbuffer.
> It is good for CryptoInputStream to implement CanUnbuffer for three reasons:
> * Release buffers, caches, or any other resources when instructed
> * Be able to call its wrapped DFSInputStream's unbuffer
> * Avoid the UOE described above. Applications may not handle the UOE very
> well.

-- 
This message was sent by Atlassian JIRA
(v6.4.14#64029)
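The delegation the quoted description asks for can be sketched in isolation. This is a minimal, self-contained illustration, not the actual Hadoop code: {{InnerStream}} and {{WrappingStream}} are hypothetical stand-ins for DFSInputStream and CryptoInputStream, and only the unbuffer plumbing is shown.

```java
// Hypothetical stand-ins illustrating the delegation pattern: the wrapping
// stream implements CanUnbuffer, frees its own resources, then forwards
// unbuffer() to the wrapped stream when that stream supports it.
public class UnbufferSketch {

    interface CanUnbuffer {
        void unbuffer();
    }

    // Stand-in for DFSInputStream: already supports unbuffer.
    static class InnerStream implements CanUnbuffer {
        boolean buffered = true;

        @Override
        public void unbuffer() {
            buffered = false; // drop read buffers and cached blocks
        }
    }

    // Stand-in for CryptoInputStream: frees its own resources, then
    // delegates so the wrapped stream can free its resources too.
    static class WrappingStream implements CanUnbuffer {
        private final Object in;
        boolean poolCleared = false;

        WrappingStream(Object in) {
            this.in = in;
        }

        @Override
        public void unbuffer() {
            poolCleared = true; // e.g. clear the decryptor pool
            if (in instanceof CanUnbuffer) {
                // Check the type instead of letting a ClassCastException
                // surface as an UnsupportedOperationException.
                ((CanUnbuffer) in).unbuffer();
            }
        }
    }

    public static void main(String[] args) {
        InnerStream inner = new InnerStream();
        WrappingStream outer = new WrappingStream(inner);
        outer.unbuffer();
        System.out.println(outer.poolCleared + " " + !inner.buffered);
        // prints "true true"
    }
}
```

One call on the outermost stream releases resources at every layer, which is exactly why implementing CanUnbuffer on the wrapper matters for callers like Impala that hold many open streams.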