Is there a description of how MapReduce under Hadoop 2.0 assigns mapper tasks
to preferred nodes? I think someone on the list mentioned previously that it
attempts to assign "one HDFS block per mapper task", but given that each data
split's block can have replicas on multiple nodes, how does MapReduce balance
an even task assignment across nodes against optimizing data locality?
Thanks,
John Lilley
Chief Architect, RedPoint Global Inc.
1515 Walnut Street | Suite 200 | Boulder, CO 80302
T: +1 303 541 1516 | M: +1 720 938 5761 | F: +1 781 705 2077
Skype: jlilley.redpoint | john.lil...@redpoint.net | www.redpoint.net