[ 
https://issues.apache.org/jira/browse/HADOOP-15006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310101#comment-16310101
 ] 

Steve Moist commented on HADOOP-15006:
--------------------------------------

{quote}
Before worrying about these, why not conduct some experiments? You could take 
S3A and modify it to always encrypt client side with the same key, then run as 
many integration tests as you can against it (Hive, Spark, impala, ...), and 
see what fails. I think that should be a first step to anything client-side 
related
{quote}

I wrote a simple proof of concept back in May using the HDFS Crypto Streams 
wrapping the S3 streams with a fixed AES key and IV.  I was able to run the S3 
integration tests without issue, run teragen/terasort/teravalidate without 
issue, and write various files (of differing sizes) and compare the checksums.  
It gave me enough confidence back then to move forward with writing the 
original proposal.  Unfortunately, I seem to have misplaced the work since it's 
been so long.  I'll work on re-creating it in the next few weeks and post it 
here; I've got a deadline I have to focus on for now instead.  Besides, 
AES/CTR/NoPadding should generate a ciphertext the same size as the plaintext, 
unlike the AWS SDK's AES/CBC/PKCS5Padding, which is causing the file size to 
change.
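
To make the size difference concrete, here is a minimal sketch using only the 
standard JCE API (not the HDFS crypto streams and not the AWS SDK).  The class 
name CipherSizeDemo, the helper encryptedLength, and the all-zero key and IV 
are made up for illustration and are not from the proof of concept; a fixed 
key and IV must never be used for real data.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

class CipherSizeDemo {
    // Fixed all-zero 128-bit key and IV: illustration only, insecure in practice.
    private static final byte[] KEY = new byte[16];
    private static final byte[] IV = new byte[16];

    /** Encrypts plainLength zero bytes and returns the ciphertext length. */
    static int encryptedLength(String transformation, int plainLength) {
        try {
            Cipher cipher = Cipher.getInstance(transformation);
            cipher.init(Cipher.ENCRYPT_MODE,
                    new SecretKeySpec(KEY, "AES"),
                    new IvParameterSpec(IV));
            return cipher.doFinal(new byte[plainLength]).length;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Compare stream-mode (CTR) and padded block-mode (CBC) output sizes.
        for (int n : new int[] {0, 1, 15, 16, 17, 1000}) {
            System.out.printf("%5d bytes -> CTR %5d, CBC/PKCS5 %5d%n",
                    n,
                    encryptedLength("AES/CTR/NoPadding", n),
                    encryptedLength("AES/CBC/PKCS5Padding", n));
        }
    }
}
```

CTR output always matches the input length, while CBC with PKCS5 padding 
rounds up to the next multiple of the 16-byte block size and adds a full extra 
block when the input is already block-aligned.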

> Encrypt S3A data client-side with Hadoop libraries & Hadoop KMS
> ---------------------------------------------------------------
>
>                 Key: HADOOP-15006
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15006
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/s3, kms
>            Reporter: Steve Moist
>            Priority: Minor
>         Attachments: S3-CSE Proposal.pdf
>
>
> This is for the proposal to introduce Client Side Encryption to S3 in such a 
> way that it can leverage HDFS transparent encryption, use the Hadoop KMS to 
> manage keys, use the `hdfs crypto` command line tools to manage encryption 
> zones in the cloud, and enable distcp to copy from HDFS to S3 (and 
> vice-versa) with data still encrypted.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org