[ https://issues.apache.org/jira/browse/MAPREDUCE-1026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759634#action_12759634 ]
Devaraj Das commented on MAPREDUCE-1026:
----------------------------------------

bq. 1. Use a job specific random key, which is included in the URL of the fetch.

Yes.

bq. 2. Allow jobs to request encryption of the map output using a second job specific random key. I assume the configuration boolean would be something like mapred.job.shuffle.encrypt.

Yes.

bq. If the outputs are encrypted, I assume that we checksum the unencrypted data and include the checksum in the encryption.

I am not sure this is required. The encrypted bytes are checksummed automatically as we write them to disk. Do we need to build the extra logic of checksumming the unencrypted bytes? That might be a big deal when we have multiple map output spills that we merge at the end and spill to disk. I propose we just live with the (automatic) checksum of the encrypted bytes.

> Shuffle should be secure
> ------------------------
>
>                 Key: MAPREDUCE-1026
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1026
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: security
>            Reporter: Owen O'Malley
>            Assignee: Devaraj Das
>
> Since the user's data is available via http from the TaskTrackers, we should
> require a job-specific secret to access it.
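
For illustration only, point 1 (a job-specific random key carried in the fetch URL) could look roughly like the sketch below. It is not the patch for this issue. The class name, the URL path, and the query parameter names (in particular "key") are assumptions made up for this example, as is the choice of a 128-bit key.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical sketch: not Hadoop code, all names are illustrative.
class ShuffleKeySketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Generates a random 128-bit job key, encoded as URL-safe Base64 without padding. */
    static String newJobKey() {
        byte[] raw = new byte[16];
        RANDOM.nextBytes(raw);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
    }

    /** Builds a fetch URL that carries the job key. Path and parameter names are assumed. */
    static String fetchUrl(String trackerHostPort, String jobId, String mapId,
                           int reduce, String jobKey) {
        return "http://" + trackerHostPort + "/mapOutput"
                + "?job=" + jobId
                + "&map=" + mapId
                + "&reduce=" + reduce
                + "&key=" + jobKey;
    }

    /** Serving side: compares the presented key with the job's key in constant time. */
    static boolean isAuthorized(String expectedKey, String presentedKey) {
        if (expectedKey == null || presentedKey == null) {
            return false;
        }
        return MessageDigest.isEqual(
                expectedKey.getBytes(StandardCharsets.UTF_8),
                presentedKey.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String jobKey = newJobKey();
        System.out.println(fetchUrl("tracker.example.com:50060", "job_1", "map_0", 0, jobKey));
        System.out.println("correct key accepted: " + isAuthorized(jobKey, jobKey));
        System.out.println("wrong key accepted:   " + isAuthorized(jobKey, newJobKey()));
    }
}
```

Note that a key placed in the URL is only as private as the transport and any request logs, which is one reason the optional encryption in point 2 uses a second, separate key.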