[ https://issues.apache.org/jira/browse/HADOOP-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16176240#comment-16176240 ]
Steve Loughran commented on HADOOP-14872: ----------------------------------------- This is *exactly* what I'm thinking of! # we should add a comment to CanUnbuffer saying "if you implement this then do stream capabilities...." # And that crypto test of yours should probe the stream capabilities in an assert to make sure it works now that we are adding read-stream capabilities, maybe we should define the naming scheme here. I'm thinking * Any filesystem must prefix with their schema, e.g. "s3a", "adl" for a schema-specific option * And we could use "in:" for input options * no prefix: output, FS independent > CryptoInputStream should implement unbuffer > ------------------------------------------- > > Key: HADOOP-14872 > URL: https://issues.apache.org/jira/browse/HADOOP-14872 > Project: Hadoop Common > Issue Type: Improvement > Components: fs > Affects Versions: 2.6.4 > Reporter: John Zhuge > Assignee: John Zhuge > Attachments: HADOOP-14872.001.patch, HADOOP-14872.002.patch, > HADOOP-14872.003.patch, HADOOP-14872.004.patch > > > Discovered in IMPALA-5909. > Opening an encrypted HDFS file returns a chain of wrapped input streams: > {noformat} > HdfsDataInputStream > CryptoInputStream > DFSInputStream > {noformat} > If an application such as Impala or HBase calls HdfsDataInputStream#unbuffer, > FSDataInputStream#unbuffer will be called: > {code:java} > try { > ((CanUnbuffer)in).unbuffer(); > } catch (ClassCastException e) { > throw new UnsupportedOperationException("this stream does not " + > "support unbuffering."); > } > {code} > If the {{in}} class does not implement CanUnbuffer, UOE will be thrown. If > the application is not careful, tons of UOEs will show up in logs. > In comparison, opening an non-encrypted HDFS file returns this chain: > {noformat} > HdfsDataInputStream > DFSInputStream > {noformat} > DFSInputStream implements CanUnbuffer. > It is good for CryptoInputStream to implement CanUnbuffer for 2 reasons: > * Release buffer, cache, or any other resource when instructed > * Able to call its wrapped DFSInputStream unbuffer > * Avoid the UOE described above. Applications may not handle the UOE very > well. -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org