[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752353#comment-13752353
 ] 

Avner BenHanoch commented on MAPREDUCE-5329:
--------------------------------------------

Hi Siddharth,

*Regarding your previous comment:*
{quote}
Isn't the ShuffleToken already specific to a shuffle provider (specifically an 
Aux service) - in terms of YARN
{quote}

I don’t think there is an architectural reason to make _shuffleToken_ 
specific to a shuffle provider.  *It is safe and reasonable to have one 
_shuffleToken_ for all shuffle-providers and for all shuffle-consumers of a 
job.*
Looking at the current code, _TaskAttemptImpl_ creates _shuffleToken_ on the 
fly for shuffle needs, based on the job token and on user credentials.  The 
_shuffleToken_ is stored in the _TokenCache/Credentials map_ under the general 
shuffle label _"MapReduceShuffleToken"_.
Also, on the shuffle-consumer side, _shuffleToken_ is part of the ReduceTask 
and is not specific to a shuffle-consumer in any way.

The only thing that may relate _shuffleToken_ to a specific aux-service is an 
existing bias in the current code. The code for serializing/deserializing 
_shuffleToken_ is located in 2 static methods in the _AuxiliaryService_ class 
_ShuffleHandler_.  However, I think there is no real justification for that, 
and it is very easy to move these 2 simple methods to a more appropriate place 
such as the _Token_ class itself.
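
To illustrate how small these two methods are, here is a minimal sketch of token serialization and deserialization living next to the token itself. _SimpleToken_, its method names, and the byte layout are hypothetical stand-ins chosen for illustration; this is not Hadoop's _Token_ class and not the existing _ShuffleHandler_ code.

```java
import java.nio.ByteBuffer;

/** Hypothetical stand-in for a shuffle token; NOT Hadoop's Token class. */
final class SimpleToken {
    final byte[] identifier;
    final byte[] password;

    SimpleToken(byte[] identifier, byte[] password) {
        this.identifier = identifier.clone();
        this.password = password.clone();
    }

    /**
     * Serialize as: int length + identifier bytes, int length + password bytes.
     * The layout is an assumption made for this sketch only.
     */
    ByteBuffer toServiceData() {
        ByteBuffer buf = ByteBuffer.allocate(
                4 + identifier.length + 4 + password.length);
        buf.putInt(identifier.length).put(identifier);
        buf.putInt(password.length).put(password);
        buf.flip(); // make the buffer readable from position 0
        return buf;
    }

    /** Inverse of toServiceData(); leaves the caller's buffer position untouched. */
    static SimpleToken fromServiceData(ByteBuffer data) {
        ByteBuffer in = data.duplicate();
        byte[] id = new byte[in.getInt()];
        in.get(id);
        byte[] pw = new byte[in.getInt()];
        in.get(pw);
        return new SimpleToken(id, pw);
    }
}
```

Nothing in this sketch depends on a particular shuffle provider, which is the point: any provider or consumer could call the same two methods on the token.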



*Regarding your current comment:*
{quote}
If the MR AM knows how to handle multiple services, and informs the NM about 
all of these, APPLICATION_INIT should just go out to all of them. Sounds like 
this jira can be used to fix this?
{quote}

I think that sending the _APPLICATION_INIT_ message is worth an issue of its 
own, and fixing it would help several aux-services regardless of any additional 
fix for the service port (for example, aux-services that rely on RDMA and do 
not use TCP at all, or aux-services that can read the port from conf).

The suggested new MAPREDUCE issue goes one step further: it is about 
*configuring multiple aux-services* with a service port, regardless of 
shuffle-providers and regardless of _APPLICATION_INIT_.  Besides, the new issue 
is blocked on YARN-1065.  Hence, I prefer to fix this little issue without 
waiting for a comprehensive fix for all the “neighbor” issues.

Thanks for your help,
Avner

                
> APPLICATION_INIT is never sent to AuxServices other than the builtin 
> ShuffleHandler
> -----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5329
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5329
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: 2.0.4-alpha
>            Reporter: Avner BenHanoch
>
> APPLICATION_INIT is never sent to AuxServices other than the built-in 
> ShuffleHandler.  This means that 3rd party ShuffleProvider(s) will not be 
> able to function, because APPLICATION_INIT enables the AuxiliaryService to 
> map jobId->userId. This is needed for properly finding the MOFs of a job per 
> reducers' requests.
> NOTE: The built-in ShuffleHandler does get APPLICATION_INIT events due to a 
> hard-coded expression in the Hadoop code. The current TaskAttemptImpl.java 
> code explicitly calls serviceData.put (ShuffleHandler.MAPREDUCE_SHUFFLE_SERVICEID, 
> ...) and ignores any additional AuxiliaryService. As a result, only the 
> built-in ShuffleHandler will get APPLICATION_INIT events.  Any 3rd party 
> AuxiliaryService will never get APPLICATION_INIT events.
> I think this can be solved in one of two ways:
> 1. Change TaskAttemptImpl.java to loop over all Auxiliary Services and 
> register each of them, by calling serviceData.put (…) in a loop.
> 2. Change AuxServices.java similar to the fix in: MAPREDUCE-2668  
> "APPLICATION_STOP is never sent to AuxServices".  This means that when the 
> 'handle' method gets an APPLICATION_INIT event, it will demultiplex it to all 
> Aux Services regardless of the value in event.getServiceID().
> I prefer the 2nd solution.  I welcome any ideas.  I can provide the 
> needed patch for any option that people like.
> See [Pluggable Shuffle in Hadoop 
> documentation|http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/PluggableShuffleAndPluggableSort.html]
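
A rough sketch of the 2nd solution (demultiplexing APPLICATION_INIT to every registered aux-service) is shown below. All types here (AuxEventType, AuxEvent, AuxService, UserMappingService, AuxDispatcher) are simplified, hypothetical stand-ins written for illustration; they are not the real YARN classes, and the actual AuxServices.java handler may differ.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

enum AuxEventType { APPLICATION_INIT, APPLICATION_STOP }

/** Hypothetical event: carries the target service ID plus job and user. */
final class AuxEvent {
    final AuxEventType type;
    final String serviceId;
    final String jobId;
    final String user;

    AuxEvent(AuxEventType type, String serviceId, String jobId, String user) {
        this.type = type;
        this.serviceId = serviceId;
        this.jobId = jobId;
        this.user = user;
    }
}

/** Hypothetical aux-service contract, reduced to the two callbacks we need. */
interface AuxService {
    void initApp(String user, String jobId);
    void stopApp(String jobId);
}

/** Example service that keeps the jobId -> userId mapping a shuffle provider needs. */
final class UserMappingService implements AuxService {
    private final Map<String, String> userByJob = new HashMap<>();

    @Override public void initApp(String user, String jobId) { userByJob.put(jobId, user); }
    @Override public void stopApp(String jobId) { userByJob.remove(jobId); }

    String userOf(String jobId) { return userByJob.get(jobId); }
}

/** Dispatcher that ignores event.serviceId and fans out to all services. */
final class AuxDispatcher {
    private final Map<String, AuxService> services = new LinkedHashMap<>();

    void register(String serviceId, AuxService service) {
        services.put(serviceId, service);
    }

    void handle(AuxEvent event) {
        switch (event.type) {
            case APPLICATION_INIT:
                // Demultiplex: every service learns jobId -> user,
                // not only the one named in event.serviceId.
                for (AuxService s : services.values()) {
                    s.initApp(event.user, event.jobId);
                }
                break;
            case APPLICATION_STOP:
                for (AuxService s : services.values()) {
                    s.stopApp(event.jobId);
                }
                break;
        }
    }
}
```

With this shape, an APPLICATION_INIT event addressed to the built-in service ID would still reach a 3rd party service registered under a different ID, so that service can resolve jobId->userId when reducers request MOFs.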

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
