[
https://issues.apache.org/jira/browse/HBASE-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837335#comment-13837335
]
Andrew Purtell edited comment on HBASE-10062 at 12/3/13 6:12 AM:
-----------------------------------------------------------------
We can deduce the encrypted data size by subtracting the encryption header size
from the remainder of the block on disk, the size of which is recorded in the
block header. I am testing a change locally that uses this as the block crypto
header:
{noformat}
// +--------------------------+
// | byte iv length |
// +--------------------------+
// | iv data ... |
// +--------------------------+
// | encrypted block data ... |
// +--------------------------+
{noformat}
We use an IV length of 0 to detect the special case where the block encoder is
required to encode a block length of zero. It happens (at least in unit tests).
Will submit to HadoopQA if all is well locally.
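For illustration only, here is a minimal sketch of how a reader might walk the header above and deduce the encrypted data size. It is not the patch under test. The class name CryptoHeaderSketch, the method parse, and the parameter onDiskSizeWithoutHeader are hypothetical names. The sketch assumes the IV length byte is unsigned and that an IV length of 0 means the block carries no encrypted data.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch, not HBase code: parses the proposed crypto header
//   | byte iv length | iv data ... | encrypted block data ... |
final class CryptoHeaderSketch {

    /** Result of parsing: the IV and the deduced length of the ciphertext. */
    static final class Parsed {
        final byte[] iv;
        final int encryptedLength;

        Parsed(byte[] iv, int encryptedLength) {
            this.iv = iv;
            this.encryptedLength = encryptedLength;
        }
    }

    /**
     * @param buf positioned at the first byte after the block header
     * @param onDiskSizeWithoutHeader size of the remainder of the block on
     *        disk, as recorded in the block header (hypothetical name)
     */
    static Parsed parse(ByteBuffer buf, int onDiskSizeWithoutHeader) {
        if (onDiskSizeWithoutHeader < 1) {
            throw new IllegalArgumentException("no room for the IV length byte");
        }
        // Assumption: the single length byte is read as an unsigned value.
        int ivLength = buf.get() & 0xFF;
        if (ivLength == 0) {
            // Assumption: IV length 0 marks the zero-length block special case.
            return new Parsed(new byte[0], 0);
        }
        // Encrypted size = remainder of block - crypto header (1 byte + IV).
        int encryptedLength = onDiskSizeWithoutHeader - 1 - ivLength;
        if (encryptedLength < 0) {
            throw new IllegalArgumentException("IV length exceeds block size");
        }
        byte[] iv = new byte[ivLength];
        buf.get(iv); // leaves buf positioned at the encrypted block data
        return new Parsed(iv, encryptedLength);
    }
}
```

With a 16 byte IV and 49 bytes remaining in the block, this yields 49 - 1 - 16 = 32 bytes of encrypted data, with no plaintext length stored anywhere.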
Previously we were in a mixed state: knowing the plaintext length meant we
could read to the end of the encrypted data and ignore any additional
information stuffed into the block after it, but we had to actually decrypt
the data to get there. Now we simply won't allow that kind of extension, and
we save some space per block.
> Reconsider storing plaintext length in the encrypted block header
> -----------------------------------------------------------------
>
> Key: HBASE-10062
> URL: https://issues.apache.org/jira/browse/HBASE-10062
> Project: HBase
> Issue Type: Improvement
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Priority: Minor
> Fix For: 0.98.0
>
> Attachments: 10062.patch
>
>
> After HBASE-7544, if an HFile belongs to an encrypted family, it is encrypted
> on a per block basis. The encrypted blocks include the following header:
> {noformat}
> // +--------------------------+
> // | vint plaintext length |
> // +--------------------------+
> // | vint iv length |
> // +--------------------------+
> // | iv data ... |
> // +--------------------------+
> // | encrypted block data ... |
> // +--------------------------+
> {noformat}
> The reason for storing the plaintext length is so we can create a decryption
> stream over the encrypted block data and, no matter the internal details of
> the crypto algorithm (whether it adds padding, etc.) after reading the
> expected plaintext bytes we know the reader is finished. However my colleague
> Jerry Chen pointed out today this construction mandates the block be
> processed exactly that way. Storing and using the encrypted data length
> instead could provide more implementation flexibility down the road.