[ 
https://issues.apache.org/jira/browse/TEZ-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13916724#comment-13916724
 ] 

Bikas Saha commented on TEZ-698:
--------------------------------

Yes. Wordcount has examples, though some more may be needed for configuring 
security.
Wordcount also uses ShuffleInput/SortedOutput. These depend on existing helpers 
like MRHelpers.createUserPayloadFromConf(mapStageConf), 
MultiStageMRConfToTezTranslator.translateVertexConfToTez(finalReduceConf, 
mapStageConf), etc.

There are 2 alternatives for the utility method. 1) Have a helper class like 
the existing MRHelpers. 2) Put the utility in the class itself, e.g. 
MRInput.createInstance() or MapProcessor.createInstance(). The advantage of the 
second one is that it's self-contained, and new inputs etc. can be added 
without changing existing utility methods. I would prefer 2).
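
To make the difference between the two alternatives concrete, here is a minimal 
sketch. It does not use the real Tez classes: ConfiguredInput, SketchHelpers, 
SketchMRInput and the "sketch.*" keys are all hypothetical names made up for 
illustration, and the real createInstance() signatures may well look different.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a configured input (class name + payload).
// This is NOT a Tez class; it only exists to keep the sketch self-contained.
final class ConfiguredInput {
    final String inputClassName;
    final Map<String, String> payload;

    ConfiguredInput(String inputClassName, Map<String, String> payload) {
        this.inputClassName = inputClassName;
        this.payload = payload;
    }
}

// Alternative 2: the input class carries its own factory method, so adding a
// new input type means adding a new class and touching nothing else.
final class SketchMRInput {
    private SketchMRInput() {}

    static ConfiguredInput createInstance(String inputFormat, String path) {
        Map<String, String> payload = new LinkedHashMap<>();
        payload.put("sketch.input.format", inputFormat); // assumed key name
        payload.put("sketch.input.path", path);          // assumed key name
        return new ConfiguredInput(SketchMRInput.class.getName(), payload);
    }
}

// Alternative 1: a central helper class. Every new input type needs another
// method here, so the helper has to change whenever an input is added.
final class SketchHelpers {
    private SketchHelpers() {}

    static ConfiguredInput createMRInput(String inputFormat, String path) {
        return SketchMRInput.createInstance(inputFormat, path);
    }
}
```

With 2), a caller would write something like 
SketchMRInput.createInstance("TextInputFormat", "/data/in") and would not need 
to know about any shared helper class.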



> Make it easy to create and configure 
> MRInput/MROutput/ShuffleInput/SortedOutput
> -------------------------------------------------------------------------------
>
>                 Key: TEZ-698
>                 URL: https://issues.apache.org/jira/browse/TEZ-698
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>
> We have moved away from MR, and it's not necessary for anyone to write 
> mappers and reducers or to configure them. But MR input and output and 
> Shuffle-related inputs/outputs are still used. Currently we have to invoke a 
> host of methods to configure them. If we can have a single API to make these 
> configs then it would really help. Secondly, for IO pairs like 
> ShuffleInput/SortedOutput, their configs are related (e.g. KV types), so it 
> may be useful to have a combined API that generates configs for both in a 
> single call.
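
As a rough illustration of the combined API idea in the description above, the 
sketch below builds both sides of an output/input pair from one call, so the 
key and value types cannot drift apart. SketchEdgeConfig and the "sketch.*" 
keys are hypothetical names for this example only, not existing Tez APIs or 
configuration properties.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder for the two related configs of one IO pair.
final class SketchEdgeConfig {
    final Map<String, String> outputConf; // producer side (sorted output)
    final Map<String, String> inputConf;  // consumer side (shuffle input)

    private SketchEdgeConfig(Map<String, String> outputConf,
                             Map<String, String> inputConf) {
        this.outputConf = Collections.unmodifiableMap(outputConf);
        this.inputConf = Collections.unmodifiableMap(inputConf);
    }

    // Single entry point: the KV types are given once and written to both sides.
    static SketchEdgeConfig create(Class<?> keyClass, Class<?> valueClass) {
        Map<String, String> shared = new LinkedHashMap<>();
        shared.put("sketch.key.class", keyClass.getName());     // assumed key name
        shared.put("sketch.value.class", valueClass.getName()); // assumed key name

        Map<String, String> output = new LinkedHashMap<>(shared);
        output.put("sketch.role", "sorted-output");

        Map<String, String> input = new LinkedHashMap<>(shared);
        input.put("sketch.role", "shuffle-input");

        return new SketchEdgeConfig(output, input);
    }
}
```

A caller would then use SketchEdgeConfig.create(String.class, Long.class) and 
hand outputConf to the producing vertex and inputConf to the consuming one.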



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)