[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758545#action_12758545
 ] 

Kan Zhang commented on MAPREDUCE-1026:
--------------------------------------

I had a rough idea for this when I opened HADOOP-4991. Briefly:
1. The output of Map tasks of a job should be accessed only by Reduce tasks of 
the same job.
2. Since this access is currently done over HTTP, I suggest we use the HTTP 
DIGEST authentication mechanism defined in RFC 2617. This is better than HTTP 
BASIC authentication because with HTTP DIGEST the secret key is never sent to 
the server in the clear, and it allows for mutual authentication.
3. We should use whatever key length is recommended by the standard and the 
JCE implementation.
4. The key is per-job and should be chosen by the JobTracker at job submission 
and persisted in the job conf in such a way that only tasks of that job and the 
TT/JT can access it. I favor having the JT choose the key, rather than the 
JobClient, for two reasons:
- The key is considered an internal detail of the M/R framework and should be 
transparent to anyone outside the M/R cluster, including the JobClient.
- There is no need to worry about the key being accidentally disclosed at the 
client site before or after it is submitted to the JT.
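
To make points 2 to 4 concrete, here is a minimal sketch in Java of what the 
key generation and the client-side digest computation might look like. It is 
not Hadoop code. The class name, the method names, the job id used as the 
username, the "shuffle" realm and the request URI are all hypothetical, and 
HmacSHA1 is only an assumed algorithm for picking up the JCE default key 
length. The response formula is the RFC 2617 one for qop=auth.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical sketch only; none of these names exist in Hadoop.
class ShuffleDigestSketch {

    // Per-job secret, as the JT might generate it at job submission.
    // Relies on the JCE provider's default key length for the algorithm
    // (HmacSHA1 is an assumption made for this sketch).
    static String newJobSecret() {
        try {
            SecretKey key = KeyGenerator.getInstance("HmacSHA1").generateKey();
            return Base64.getEncoder().encodeToString(key.getEncoded());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Lower-case hex MD5 of a string, as used throughout RFC 2617.
    static String md5Hex(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(s.getBytes(StandardCharsets.ISO_8859_1));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // RFC 2617 request-digest for qop=auth:
    //   HA1 = MD5(user:realm:secret), HA2 = MD5(method:uri)
    //   response = MD5(HA1:nonce:nc:cnonce:auth:HA2)
    // The secret itself never appears on the wire, only this hash.
    static String digestResponse(String user, String realm, String secret,
                                 String method, String uri, String nonce,
                                 String nc, String cnonce) {
        String ha1 = md5Hex(user + ":" + realm + ":" + secret);
        String ha2 = md5Hex(method + ":" + uri);
        return md5Hex(ha1 + ":" + nonce + ":" + nc + ":" + cnonce
                + ":auth:" + ha2);
    }

    public static void main(String[] args) {
        String secret = newJobSecret(); // would be chosen by the JT
        // Job id, realm, URI, nonce and cnonce below are made-up values.
        String response = digestResponse("job_0001", "shuffle", secret,
                "GET", "/mapOutput?job=job_0001&map=m_000000&reduce=0",
                "servernonce", "00000001", "clientnonce");
        System.out.println(response);
    }
}
```

In this sketch the Reduce task would send only the computed response in its 
Authorization header, and the TT, which can read the same per-job secret, 
would recompute the value and compare.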

> Shuffle should be secure
> ------------------------
>
>                 Key: MAPREDUCE-1026
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1026
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: security
>            Reporter: Owen O'Malley
>            Assignee: Devaraj Das
>
> Since the user's data is available via http from the TaskTrackers, we should 
> require a job-specific secret to access it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.