[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759634#action_12759634
 ] 

Devaraj Das commented on MAPREDUCE-1026:
----------------------------------------

bq. 1. Use a job specific random key, which is included in the URL of the fetch.
Yes.
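
A minimal sketch of what point 1 could look like, assuming the key is simply carried as a query parameter. The class name, the `key` parameter, and the URL layout are hypothetical and are not taken from any patch on this issue:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical sketch: a per-job random key that travels in the fetch URL.
class ShuffleFetchUrlSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Generates 128 random bits and encodes them so they are safe inside a URL. */
    static String newJobKey() {
        byte[] raw = new byte[16];
        RANDOM.nextBytes(raw);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
    }

    /** Builds the fetch URL; the path and parameter names are assumptions. */
    static String fetchUrl(String host, int port, String jobId, String mapId,
                           int reduce, String jobKey) {
        return "http://" + host + ":" + port + "/mapOutput?job=" + jobId
            + "&map=" + mapId + "&reduce=" + reduce + "&key=" + jobKey;
    }

    /** Serving side: compare the presented key with the key stored for the job. */
    static boolean keyMatches(String expected, String presented) {
        // MessageDigest.isEqual avoids an early exit on the first differing byte.
        return MessageDigest.isEqual(
            expected.getBytes(StandardCharsets.UTF_8),
            presented.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String jobKey = newJobKey();
        System.out.println(fetchUrl("tasktracker-host", 50060, "job_0001",
            "attempt_0001_m_000000_0", 0, jobKey));
    }
}
```

The serving side would reject any fetch whose key does not match the one recorded for that job.
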
bq. 2. Allow jobs to request encryption of the map output using a second job 
specific random key. I assume the configuration boolean would be something like 
mapred.job.shuffle.encrypt.
Yes.

bq. If the outputs are encrypted, I assume that we checksum the unencrypted 
data and include the checksum in the encryption.
I am not sure this is required. The encrypted bytes are checksummed 
automatically as we write them to disk. Do we need to build the extra logic 
for checksumming the unencrypted bytes? That could be a big deal when we have 
multiple map output spills that we merge at the end and spill to disk. I 
propose we just live with the automatic checksum of the encrypted bytes.
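
To illustrate the proposal, here is a rough sketch in which the checksum covers only the encrypted bytes and is verified before decryption. The class and method names are hypothetical, and AES in CTR mode with CRC32 is an assumption made for the example; nothing here reflects the cipher or checksum the final patch would use:

```java
import java.security.GeneralSecurityException;
import java.util.zip.CRC32;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: checksum the ciphertext, not the plaintext.
class EncryptedSpillSketch {

    /** Encrypts or decrypts with AES/CTR (cipher choice is an assumption). */
    static byte[] crypt(int mode, byte[] key, byte[] iv, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("cipher failure", e);
        }
    }

    /** CRC32 over whatever bytes are written to disk, here the ciphertext. */
    static long checksum(byte[] bytes) {
        CRC32 crc = new CRC32();
        crc.update(bytes, 0, bytes.length);
        return crc.getValue();
    }

    /** Verifies the stored checksum against the ciphertext, then decrypts. */
    static byte[] readVerified(byte[] key, byte[] iv, byte[] encrypted, long storedCrc) {
        if (checksum(encrypted) != storedCrc) {
            throw new IllegalStateException("checksum mismatch on encrypted bytes");
        }
        return crypt(Cipher.DECRYPT_MODE, key, iv, encrypted);
    }
}
```

With this layout the merge of multiple spills never needs a separate checksum of the plaintext. A CRC only catches accidental corruption, though; it is not a cryptographic integrity check.
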

> Shuffle should be secure
> ------------------------
>
>                 Key: MAPREDUCE-1026
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1026
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: security
>            Reporter: Owen O'Malley
>            Assignee: Devaraj Das
>
> Since the user's data is available via http from the TaskTrackers, we should 
> require a job-specific secret to access it.
