[
https://issues.apache.org/jira/browse/HBASE-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Purtell updated HBASE-10062:
-----------------------------------
Attachment: 10062.patch
We can deduce the encrypted data size by subtracting the encryption header size
from the remainder of the block on disk, the size of which is recorded in the
block header. I am testing a change locally that uses this as the block crypto
header:
{noformat}
// +--------------------------+
// | byte iv length           |
// +--------------------------+
// | iv data ...              |
// +--------------------------+
// | encrypted block data ... |
// +--------------------------+
{noformat}
We use an IV length of 0 to flag the special case where the block encoder is
required to encode a zero-length block. It happens (at least in unit tests).
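As a rough illustration of the arithmetic described above (not the code in the attached patch), here is a minimal Java sketch that reads the proposed header and deduces the encrypted data size. The class name BlockCryptoHeaderSketch, the parse method, and the remaining parameter are all hypothetical; the sketch assumes the caller already knows, from the block header, how many bytes remain on disk from the start of the crypto header, and it treats an IV length of 0 as the empty-block marker.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; this is not the code in 10062.patch.
final class BlockCryptoHeaderSketch {

  /** Result of parsing: the IV plus the deduced length of the encrypted data. */
  static final class Parsed {
    final byte[] iv;
    final int encryptedLength;
    final boolean emptyBlock;

    Parsed(byte[] iv, int encryptedLength, boolean emptyBlock) {
      this.iv = iv;
      this.encryptedLength = encryptedLength;
      this.emptyBlock = emptyBlock;
    }
  }

  /**
   * @param buf       buffer positioned at the first byte of the crypto header
   * @param remaining bytes left in the block on disk, counted from the crypto
   *                  header (assumed to be derived from the block header)
   */
  static Parsed parse(ByteBuffer buf, int remaining) {
    if (remaining < 1) {
      throw new IllegalArgumentException("no room for the IV length byte");
    }
    // One unsigned byte holds the IV length.
    int ivLength = buf.get() & 0xff;
    // Encrypted size = what is left on disk minus the crypto header size.
    int encryptedLength = remaining - 1 - ivLength;
    if (encryptedLength < 0) {
      throw new IllegalArgumentException("IV length exceeds remaining block size");
    }
    byte[] iv = new byte[ivLength];
    buf.get(iv);
    // Assumption: an IV length of 0 marks a zero-length block.
    return new Parsed(iv, encryptedLength, ivLength == 0);
  }
}
```

After parse returns, the buffer is positioned at the first byte of the encrypted block data, so a reader could hand exactly encryptedLength bytes to whatever decryption path it prefers.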
Will submit to HadoopQA if all is well locally.
> Reconsider storing plaintext length in the encrypted block header
> -----------------------------------------------------------------
>
> Key: HBASE-10062
> URL: https://issues.apache.org/jira/browse/HBASE-10062
> Project: HBase
> Issue Type: Improvement
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Priority: Minor
> Fix For: 0.98.0
>
> Attachments: 10062.patch
>
>
> After HBASE-7544, if an HFile belongs to an encrypted family, it is encrypted
> on a per block basis. The encrypted blocks include the following header:
> {noformat}
> // +--------------------------+
> // | vint plaintext length    |
> // +--------------------------+
> // | vint iv length           |
> // +--------------------------+
> // | iv data ...              |
> // +--------------------------+
> // | encrypted block data ... |
> // +--------------------------+
> {noformat}
> The reason for storing the plaintext length is so we can create a decryption
> stream over the encrypted block data and, no matter the internal details of
> the crypto algorithm (whether it adds padding, etc.), after reading the
> expected plaintext bytes we know the reader is finished. However, my colleague
> Jerry Chen pointed out today that this construction mandates the block be
> processed exactly that way. Storing and using the encrypted data length
> instead could provide more implementation flexibility down the road.