[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425286#comment-13425286
 ] 

Benoy Antony commented on MAPREDUCE-4491:
-----------------------------------------

To Rob's questions:

Different encryption keys for different files: At this point, the PGPCodec 
supports only one secret key or key pair for all input files. 
What we need is the ability to specify a secret key or key pair per input file. 
Another enhancement would be to specify a secret key or key pair for each phase, 
such as map->output and reduce->output.
As you mentioned, this mapping has to be specified via configuration.
I'll try to add these two enhancements. 
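
To make the configuration idea concrete, here is a minimal sketch of how a 
per-file and per-phase key mapping could be resolved. It is not taken from the 
patch: the class name PerFileKeyResolver and the property names 
(pgp.key.alias.default, pgp.key.alias.path.<prefix>, pgp.key.alias.phase.<phase>) 
are hypothetical, and plain java.util.Properties stands in for the job 
configuration. The longest matching path prefix wins, and the default alias is 
used when nothing matches.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper: maps an input path or a job phase to a key alias.
class PerFileKeyResolver {
    // Assumed property names, chosen only for this illustration.
    static final String DEFAULT_KEY = "pgp.key.alias.default";
    static final String PATH_PREFIX = "pgp.key.alias.path.";
    static final String PHASE_PREFIX = "pgp.key.alias.phase.";

    private final Properties conf;
    private final String defaultAlias;
    private final Map<String, String> pathToAlias = new LinkedHashMap<>();

    PerFileKeyResolver(Properties conf) {
        this.conf = conf;
        this.defaultAlias = conf.getProperty(DEFAULT_KEY);
        // Collect every "path prefix -> alias" entry from the configuration.
        for (String name : conf.stringPropertyNames()) {
            if (name.startsWith(PATH_PREFIX)) {
                pathToAlias.put(name.substring(PATH_PREFIX.length()),
                                conf.getProperty(name));
            }
        }
    }

    // Returns the alias of the most specific (longest) matching path prefix,
    // or the default alias when no prefix matches.
    String aliasFor(String inputPath) {
        String bestPrefix = null;
        String alias = defaultAlias;
        for (Map.Entry<String, String> e : pathToAlias.entrySet()) {
            String prefix = e.getKey();
            boolean longer = bestPrefix == null
                    || prefix.length() > bestPrefix.length();
            if (inputPath.startsWith(prefix) && longer) {
                bestPrefix = prefix;
                alias = e.getValue();
            }
        }
        return alias;
    }

    // Returns the alias configured for a phase such as "map.output",
    // falling back to the default alias.
    String aliasForPhase(String phase) {
        return conf.getProperty(PHASE_PREFIX + phase, defaultAlias);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(DEFAULT_KEY, "shared-key");
        conf.setProperty(PATH_PREFIX + "/data/payroll/", "payroll-key");
        conf.setProperty(PHASE_PREFIX + "map.output", "shuffle-key");
        PerFileKeyResolver resolver = new PerFileKeyResolver(conf);
        System.out.println(resolver.aliasFor("/data/payroll/jan.pgp"));
        System.out.println(resolver.aliasForPhase("map.output"));
    }
}
```

The resolver only returns an alias. Fetching the actual key material for that 
alias would still go through the keystore, so the mapping stays free of secrets.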

Decryption/encryption of different columns within the same file: This is 
actually left to the MapReduce programmer, who has to decrypt and encrypt the 
fields programmatically. The programmer can choose to use different keys for 
different fields in the MapReduce program. Multiple keys can be read from the 
keystore and then retrieved in the mapper/reducer using the credentials API.  
In a higher-level interface like Hive, it may be possible to add metadata that 
specifies the key name. Another reviewer has also recommended adding this 
capability to Hive: identify an encrypted field and specify the key (by name) 
to be used to decrypt/encrypt it.
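
As an illustration of the field-level approach, here is a minimal sketch of 
encrypting individual fields under different named keys. Several things in it 
are assumptions, not part of the patch: the class name FieldCrypto, the key 
aliases, the choice of AES/GCM from the JDK instead of PGP, and the use of a 
plain Map of alias to raw key bytes as a stand-in for whatever the credentials 
API hands to the mapper/reducer.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: encrypts/decrypts single fields with a key chosen by alias.
class FieldCrypto {
    private static final int IV_LEN = 12;     // bytes, usual size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RNG = new SecureRandom();

    // Stand-in for keys obtained in the task; alias -> raw AES key bytes.
    private final Map<String, byte[]> keysByAlias;

    FieldCrypto(Map<String, byte[]> keysByAlias) {
        this.keysByAlias = keysByAlias;
    }

    private SecretKey key(String alias) {
        byte[] raw = keysByAlias.get(alias);
        if (raw == null) {
            throw new IllegalArgumentException("no key for alias " + alias);
        }
        return new SecretKeySpec(raw, "AES");
    }

    // Returns Base64(iv || ciphertext+tag) so the value fits in a text column.
    String encrypt(String alias, String plain) {
        try {
            byte[] iv = new byte[IV_LEN];
            RNG.nextBytes(iv); // fresh IV for every field value
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, key(alias),
                   new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = c.doFinal(plain.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[IV_LEN + ct.length];
            System.arraycopy(iv, 0, out, 0, IV_LEN);
            System.arraycopy(ct, 0, out, IV_LEN, ct.length);
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("field encryption failed", e);
        }
    }

    // Reverses encrypt(); fails if the wrong key is used or data was altered.
    String decrypt(String alias, String encoded) {
        try {
            byte[] in = Base64.getDecoder().decode(encoded);
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.DECRYPT_MODE, key(alias),
                   new GCMParameterSpec(TAG_BITS, in, 0, IV_LEN));
            byte[] pt = c.doFinal(in, IV_LEN, in.length - IV_LEN);
            return new String(pt, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("field decryption failed", e);
        }
    }

    public static void main(String[] args) {
        Map<String, byte[]> keys = new HashMap<>();
        byte[] ssnKey = new byte[16];
        byte[] salaryKey = new byte[16];
        RNG.nextBytes(ssnKey);     // demo keys only; real keys come
        RNG.nextBytes(salaryKey);  // from the keystore
        keys.put("ssn-key", ssnKey);
        keys.put("salary-key", salaryKey);
        FieldCrypto crypto = new FieldCrypto(keys);
        String protectedSsn = crypto.encrypt("ssn-key", "123-45-6789");
        System.out.println(crypto.decrypt("ssn-key", protectedSsn));
    }
}
```

In a mapper, each column would be passed through encrypt or decrypt with the 
alias assigned to that column, so a record can carry fields protected by 
different keys.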

Thanks for the review and recommendations, Rob. Please let me know if I have 
not answered the question correctly.
                
> Encryption and Key Protection
> -----------------------------
>
>                 Key: MAPREDUCE-4491
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: documentation, security, task-controller, tasktracker
>            Reporter: Benoy Antony
>            Assignee: Benoy Antony
>         Attachments: Hadoop_Encryption.pdf
>
>
> When dealing with sensitive data, the data must be kept encrypted 
> wherever it is stored. A common use case is to pull encrypted data out of a 
> datasource and store it in HDFS for analysis. The keys are stored in an 
> external keystore. 
> The feature adds a customizable framework to integrate different types of 
> keystores, support for Java KeyStore, read keys from keystores, and transport 
> keys from JobClient to Tasks.
> The feature adds PGP encryption as a codec and additional utilities to 
> perform encryption related steps.
> The design document is attached. It explains the requirement, design and use 
> cases.
> Kindly review and comment. Collaboration is very much welcome.
> I have a tested patch for this for 1.1 and will upload it soon as a starting 
> point for further refinement. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
