[jira] [Updated] (MAPREDUCE-199) Locality hints for Reduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated MAPREDUCE-199:
------------------------------
    Assignee:     (was: Harsh J)

> Locality hints for Reduce
> -------------------------
>
>                 Key: MAPREDUCE-199
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-199
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: applicationmaster, mrv2
>            Reporter: Benjamin Reed
>         Attachments: MAPREDUCE-199.patch, MAPREDUCE-199.patch
>
>
> It would be nice if we could add a method to OutputFormat that would allow a
> job to indicate where a reducer for a given partition should run. This
> is similar to the getSplits() method on InputFormat. In our application the
> reducer is using other data in addition to the map outputs during processing,
> and data accesses could be made more efficient if the JobTracker scheduled
> the reducers to run on specific hosts.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-199) Locality hints for Reduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

eric baldeschwieler updated MAPREDUCE-199:
------------------------------------------

I can see the value of matching reduce outputs to region servers. This does seem like a compelling use case.

That said, the MR interface is already very broad. Let's let any extensions to the API bake for a while to make sure we are doing the right thing. It's a lot easier to add things to the config or API than to take them out.

Using the same abstractions / API as the Map would be nice if doable.

> Locality hints for Reduce
> -------------------------
>
>                 Key: MAPREDUCE-199
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-199
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: applicationmaster, mrv2
>            Reporter: Benjamin Reed
>            Assignee: Harsh J
>         Attachments: MAPREDUCE-199.patch, MAPREDUCE-199.patch

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-199) Locality hints for Reduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated MAPREDUCE-199:
------------------------------
    Attachment: MAPREDUCE-199.patch

Here's a first shot at it. Adds support for locality hints via Job configuration (rather than an API). Supports one host hint per partition at the moment.

For those who feel they would benefit from this: would you like to have multi-host locality hinting support for each partition, akin to maps?

Patch still needs some work, but hopefully the approach is right. Yet to test with a built cluster to observe the locality pick-up, but the framework around these areas is pretty nice (compared to MR1's JIP classes and such).

> Locality hints for Reduce
> -------------------------
>
>                 Key: MAPREDUCE-199
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-199
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: applicationmaster, mrv2
>            Reporter: Benjamin Reed
>         Attachments: MAPREDUCE-199.patch
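For readers following the thread, here is a rough sketch of what "one host hint per partition via Job configuration" could look like. It is not taken from the attached patch: the property prefix, the class name, and the extractHints() helper are all hypothetical, and a plain Map stands in for the Hadoop job configuration so the example stays self-contained.

```java
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

// Illustrative only: names below are assumptions, not the patch's actual API.
class ReduceLocalityHintsSketch {

    // Hypothetical property prefix; the key used by the real patch is not stated in this thread.
    static final String HINT_PREFIX = "example.reduce.locality.hint.";

    /**
     * Collects at most one host hint per reduce partition.
     * Partitions without a (non-blank) hint are simply absent from the result,
     * which a scheduler could treat as "no locality preference".
     */
    static Map<Integer, String> extractHints(Map<String, String> conf, int numReduces) {
        Objects.requireNonNull(conf, "conf must not be null");
        if (numReduces < 0) {
            throw new IllegalArgumentException("numReduces must be >= 0, got " + numReduces);
        }
        Map<Integer, String> hints = new TreeMap<>();
        for (int partition = 0; partition < numReduces; partition++) {
            String host = conf.get(HINT_PREFIX + partition);
            if (host != null && !host.trim().isEmpty()) {
                hints.put(partition, host.trim());
            }
        }
        return hints;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new TreeMap<>();
        conf.put(HINT_PREFIX + 0, "host-a.example.com");
        conf.put(HINT_PREFIX + 2, "host-c.example.com");
        // Partition 1 has no hint, so it will not appear in the printed map.
        System.out.println(extractHints(conf, 3));
    }
}
```

Keeping the hints in configuration, as the comment above describes, avoids widening the OutputFormat API; the trade-off is that a single string value per partition does not extend naturally to the multi-host hints asked about.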
[jira] [Updated] (MAPREDUCE-199) Locality hints for Reduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated MAPREDUCE-199:
------------------------------
    Attachment: MAPREDUCE-199.patch

- Fixed a blooper in the test (the Reduce task attempt impl. constructor had some args switched, failing the test).
- Added a precondition check to the static method that extracts reducer hints, just as a completeness check. Thanks to Sho for this.

> Locality hints for Reduce
> -------------------------
>
>                 Key: MAPREDUCE-199
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-199
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: applicationmaster, mrv2
>            Reporter: Benjamin Reed
>            Assignee: Harsh J
>         Attachments: MAPREDUCE-199.patch, MAPREDUCE-199.patch
[jira] [Updated] (MAPREDUCE-199) Locality hints for Reduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated MAPREDUCE-199:
------------------------------------
    Component/s: mrv2
                 applicationmaster

> Locality hints for Reduce
> -------------------------
>
>                 Key: MAPREDUCE-199
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-199
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: applicationmaster, mrv2
>            Reporter: Benjamin Reed