[ https://issues.apache.org/jira/browse/HDFS-7911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358951#comment-14358951 ]

Sean Busbey commented on HDFS-7911:
-----------------------------------

We should pursue a more general fix and instead move the synchronization into 
FSDataOutputStream, as suggested in HADOOP-11708.
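
A minimal sketch of that idea (illustrative only; the class name 
SynchronizedOutputStream and the lock placement are assumptions, not the 
committed HADOOP-11708 patch): serialize write() and flush() on a single lock 
at the wrapper level, so a non-thread-safe underlying stream such as 
CryptoOutputStream never runs encrypt() from two threads at once.

{code}
import java.io.IOException;
import java.io.OutputStream;

// Sketch only: one lock guards every path that can reach encrypt() in the
// wrapped stream. SynchronizedOutputStream is a hypothetical name.
public class SynchronizedOutputStream extends OutputStream {
  private final OutputStream out;      // e.g. a CryptoOutputStream
  private final Object lock = new Object();

  public SynchronizedOutputStream(OutputStream out) {
    this.out = out;
  }

  @Override
  public void write(int b) throws IOException {
    synchronized (lock) {              // writer threads serialize here
      out.write(b);
    }
  }

  @Override
  public void write(byte[] b, int off, int len) throws IOException {
    synchronized (lock) {
      out.write(b, off, len);
    }
  }

  @Override
  public void flush() throws IOException {
    synchronized (lock) {              // the flusher takes the same lock,
      out.flush();                     // so flush() and write() never race
    }
  }
}
{code}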

> Buffer Overflow when running HBase on HDFS Encryption Zone
> ----------------------------------------------------------
>
>                 Key: HDFS-7911
>                 URL: https://issues.apache.org/jira/browse/HDFS-7911
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: encryption
>    Affects Versions: 2.6.0
>            Reporter: Xiaoyu Yao
>            Assignee: Yi Liu
>            Priority: Blocker
>
> Created an HDFS encryption zone (EZ) for HBase under /apps/hbase; basic 
> testing passed, including creating tables, listing them, adding a few rows, 
> and scanning them. However, when bulk loading hundreds of thousands of rows, 
> after 10 minutes or so we get the following error on the Region Server that 
> owns the table.
> {code}
> 2015-03-02 10:25:47,784 FATAL [regionserver60020-WAL.AsyncSyncer0] wal.FSHLog: Error while AsyncSyncer sync, request close of hlog
> java.io.IOException: java.nio.BufferOverflowException
>         at org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.process(JceAesCtrCryptoCodec.java:156)
>         at org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.encrypt(JceAesCtrCryptoCodec.java:127)
>         at org.apache.hadoop.crypto.CryptoOutputStream.encrypt(CryptoOutputStream.java:162)
>         at org.apache.hadoop.crypto.CryptoOutputStream.flush(CryptoOutputStream.java:232)
>         at org.apache.hadoop.crypto.CryptoOutputStream.hflush(CryptoOutputStream.java:267)
>         at org.apache.hadoop.crypto.CryptoOutputStream.sync(CryptoOutputStream.java:262)
>         at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:123)
>         at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:165)
>         at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncSyncer.run(FSHLog.java:1241)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: java.nio.BufferOverflowException
>         at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:357)
>         at javax.crypto.CipherSpi.bufferCrypt(CipherSpi.java:823)
>         at javax.crypto.CipherSpi.engineUpdate(CipherSpi.java:546)
>         at javax.crypto.Cipher.update(Cipher.java:1760)
>         at org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.process(JceAesCtrCryptoCodec.java:145)
>         ... 9 more
> {code}
> It looks like the HBase WAL (Write-Ahead Log) use case is broken on 
> CryptoOutputStream. The use case has one flusher thread that keeps calling 
> hflush() on the WAL file while other roller threads write concurrently to 
> the same file handle.
> As the class comment says, *"CryptoOutputStream encrypts data. It is not 
> thread-safe."* I checked the code, and the buffer overflow appears to come 
> from a race between CryptoOutputStream#write() and 
> CryptoOutputStream#flush(), since both can call CryptoOutputStream#encrypt(). 
> The inBuffer/outBuffer of the CryptoOutputStream are not thread-safe: they 
> can be modified by encrypt() during a flush() while a write() is arriving 
> from another thread.
> I have validated this with multi-threaded unit tests that mimic the HBase WAL 
> use case. For a file not under an encryption zone (*DFSOutputStream*), the 
> multi-threaded flusher/writer works fine. For a file under an encryption zone 
> (*CryptoOutputStream*), the multi-threaded flusher/writer randomly fails with 
> a buffer overflow/underflow.
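> A minimal sketch of the shape of such a test (illustrative only; the actual 
> unit tests are not shown here, and fs/path are assumed to point at a file 
> inside an encryption zone):
> {code}
> // One writer thread and one flusher thread share the same stream, which
> // mirrors the HBase WAL pattern described above.
> final FSDataOutputStream out = fs.create(path);
> final AtomicBoolean done = new AtomicBoolean(false);
>
> Thread writer = new Thread(() -> {
>   byte[] row = new byte[1024];
>   try {
>     while (!done.get()) {
>       out.write(row);          // write() can call encrypt()
>     }
>   } catch (IOException e) {
>     e.printStackTrace();       // BufferOverflow/Underflow surfaces here
>   }
> });
>
> Thread flusher = new Thread(() -> {
>   try {
>     while (!done.get()) {
>       out.hflush();            // hflush() -> flush() also calls encrypt()
>     }
>   } catch (IOException e) {
>     e.printStackTrace();
>   }
> });
>
> writer.start();
> flusher.start();
> Thread.sleep(10000);           // let the two threads race for a while
> done.set(true);
> writer.join();
> flusher.join();
> out.close();
> {code}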



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
