[ https://issues.apache.org/jira/browse/MAPREDUCE-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752353#comment-13752353 ]
Avner BenHanoch commented on MAPREDUCE-5329:
--------------------------------------------

Hi Siddharth,

*Regarding your previous comment:*
{quote}
Isn't the ShuffleToken already specific to a shuffle provider (specifically an Aux service) - in terms of YARN
{quote}
I don’t think there is an architectural reason for making _shuffleToken_ specific to a shuffle provider. *It is safe and reasonable to have one _shuffleToken_ for all shuffle-providers and for all shuffle-consumers of a job.*

Looking at the current code, _TaskAttemptImpl_ creates _shuffleToken_ on the fly for shuffle needs, based on the job token and on user credentials. The _shuffleToken_ is stored in the _TokenCache/Credentials map_ under the general shuffle label _"MapReduceShuffleToken"_. Also, on the shuffle-consumer side, _shuffleToken_ is part of the ReduceTask and is not specific to a shuffle-consumer in any way.

The only thing that may tie _shuffleToken_ to a specific aux-service is an existing bias in the current code: the code for serializing/deserializing _shuffleToken_ is located in 2 static methods of the _AuxiliaryService_ class _ShuffleHandler_. However, I think there is no real justification for that, and it is very easy to move these 2 simple methods to a more appropriate place, such as the _Token_ class itself.

*Regarding your current comment:*
{quote}
If the MR AM knows how to handle multiple services, and informs the NM about all of these, APPLICATION_INIT should just go out to all of them. Sounds like this jira can be used to fix this?
{quote}
I think that sending the _APPLICATION_INIT_ message is worth an issue of its own, and fixing it would help several aux-services considerably, regardless of any additional fix for the service port (for example, aux-services that rely on RDMA and do not use TCP at all, or aux-services that can read the port from conf).
The purpose of the suggested new MAPREDUCE issue is one step further: *configuring multiple aux-services* with a service port, regardless of shuffle-providers and regardless of _APPLICATION_INIT_. Besides, the new issue is blocked on YARN-1065. Hence, I prefer to fix this small issue without waiting for a comprehensive fix for all the “neighbor” issues.

Thanks for your help,
Avner

> APPLICATION_INIT is never sent to AuxServices other than the builtin ShuffleHandler
> -----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5329
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5329
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: 2.0.4-alpha
>            Reporter: Avner BenHanoch
>
> APPLICATION_INIT is never sent to AuxServices other than the built-in ShuffleHandler. This means that 3rd party ShuffleProvider(s) will not be able to function, because APPLICATION_INIT enables the AuxiliaryService to map jobId->userId. This is needed for properly finding the MOFs of a job per reducers' requests.
> NOTE: The built-in ShuffleHandler does get APPLICATION_INIT events, due to a hard-coded expression in the Hadoop code. The current TaskAttemptImpl.java code explicitly calls serviceData.put (ShuffleHandler.MAPREDUCE_SHUFFLE_SERVICEID, ...) and ignores any additional AuxiliaryService. As a result, only the built-in ShuffleHandler will get APPLICATION_INIT events. Any 3rd party AuxiliaryService will never get APPLICATION_INIT events.
> I think a solution can be in one of two ways:
> 1. Change TaskAttemptImpl.java to loop over all Auxiliary Services and register each of them, by calling serviceData.put (…) in a loop.
> 2. Change AuxServices.java similarly to the fix in MAPREDUCE-2668 "APPLICATION_STOP is never sent to AuxServices". This means that when the 'handle' method gets an APPLICATION_INIT event, it will demultiplex it to all Aux Services regardless of the value in event.getServiceID().
> I prefer the 2nd solution. I welcome any ideas. I can provide the needed patch for any option that people like.
> See [Pluggable Shuffle in Hadoop documentation|http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/PluggableShuffleAndPluggableSort.html]
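
For illustration, the second option in the issue description (demultiplexing APPLICATION_INIT to every registered aux service, as is already done for APPLICATION_STOP) might look roughly like the sketch below. This is not the actual AuxServices.java code: every type here (`AuxDemuxSketch`, `AuxService`, `AuxEvent`, `Dispatcher`, `RecordingService`) and both service ids are hypothetical stand-ins, simplified so the example compiles without any Hadoop dependency.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-ins. These are NOT the real Hadoop/YARN classes.
final class AuxDemuxSketch {

    enum EventType { APPLICATION_INIT, APPLICATION_STOP }

    /** Stand-in for an auxiliary service running inside the NodeManager. */
    interface AuxService {
        void initApp(String user, String appId);
        void stopApp(String appId);
    }

    /** Stand-in for the event handed to the dispatcher. */
    static final class AuxEvent {
        final EventType type;
        final String user;
        final String appId;
        final String serviceId; // the single service id named by the sender

        AuxEvent(EventType type, String user, String appId, String serviceId) {
            this.type = type;
            this.user = user;
            this.appId = appId;
            this.serviceId = serviceId;
        }
    }

    /** Test double that records every call it receives. */
    static final class RecordingService implements AuxService {
        final List<String> calls = new ArrayList<>();

        @Override
        public void initApp(String user, String appId) {
            calls.add("init:" + user + ":" + appId);
        }

        @Override
        public void stopApp(String appId) {
            calls.add("stop:" + appId);
        }
    }

    /** Stand-in for the component that routes events to aux services. */
    static final class Dispatcher {
        private final Map<String, AuxService> services = new LinkedHashMap<>();

        void register(String serviceId, AuxService service) {
            services.put(serviceId, service);
        }

        void handle(AuxEvent event) {
            switch (event.type) {
                case APPLICATION_INIT:
                    // Option 2: ignore event.serviceId and fan out to all services,
                    // so third-party services can also learn the appId -> user mapping.
                    for (AuxService s : services.values()) {
                        s.initApp(event.user, event.appId);
                    }
                    break;
                case APPLICATION_STOP:
                    // Same fan-out pattern for stop events.
                    for (AuxService s : services.values()) {
                        s.stopApp(event.appId);
                    }
                    break;
            }
        }
    }

    public static void main(String[] args) {
        Dispatcher dispatcher = new Dispatcher();
        RecordingService builtin = new RecordingService();
        RecordingService thirdParty = new RecordingService();
        dispatcher.register("builtin_shuffle", builtin);         // hypothetical id
        dispatcher.register("third_party_shuffle", thirdParty);  // hypothetical id

        // The sender names only the built-in service, yet both services are initialized.
        dispatcher.handle(new AuxEvent(
                EventType.APPLICATION_INIT, "alice", "app_1", "builtin_shuffle"));
        System.out.println(builtin.calls);
        System.out.println(thirdParty.calls);
    }
}
```

The trade-off in this sketch is that every registered service sees every application, including services the job never uses; the first option (registering each service explicitly from the AM side) would avoid that at the cost of changing TaskAttemptImpl.java.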