[ http://issues.apache.org/jira/browse/HADOOP-195?page=comments#action_12378315 ]
paul sutter commented on HADOOP-195:
------------------------------------

eric,

most of my suggestions relate to the copy phase of the sort path, not the sort itself. once that is working, i can make sort suggestions (although my best sort suggestion is for you guys to talk with david cossock about sorts).

this whole area is critical. on that cluster, owen's 2TB should sort in 10 minutes, and the data should be copied in less than that time, for a total run time of <20 minutes.

pleased that yahoo has resources to apply.

paul

> transfer map output transfer with http instead of rpc
> -----------------------------------------------------
>
>          Key: HADOOP-195
>          URL: http://issues.apache.org/jira/browse/HADOOP-195
>      Project: Hadoop
>         Type: Improvement
>   Components: mapred
>     Versions: 0.2
>     Reporter: Owen O'Malley
>     Assignee: Owen O'Malley
>      Fix For: 0.3
>
> The data transfer of the map output should be transferred via http instead of
> rpc, because rpc is very slow for this application and the timeout behavior
> is suboptimal. (server sends data and client ignores it because it took more
> than 10 seconds to be received.)
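
As a rough check on the figures in the comment above, the copy target (2TB moved in under 10 minutes) implies an aggregate transfer rate that can be worked out directly. The sketch below is only illustrative arithmetic: it assumes a decimal terabyte (10^12 bytes), and the function names are hypothetical. The size of "that cluster" is not stated in the thread, so the per-node helper takes the node count as a parameter instead of guessing it.

```python
# Back-of-the-envelope arithmetic for the copy-phase target in the comment.
# Assumption: "2TB" means 2 * 10**12 bytes (decimal terabytes).

TB = 10**12  # bytes per decimal terabyte (assumed unit)


def required_throughput(total_bytes: float, seconds: float) -> float:
    """Aggregate bytes/second needed to move total_bytes within seconds."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return total_bytes / seconds


def per_node_throughput(aggregate_bps: float, nodes: int) -> float:
    """Even split of the aggregate rate across nodes (node count is not in the thread)."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return aggregate_bps / nodes


# 2TB copied in 10 minutes, the upper bound given in the comment.
aggregate = required_throughput(2 * TB, 10 * 60)
print(f"{aggregate / 1e9:.2f} GB/s aggregate")  # prints "3.33 GB/s aggregate"
```

Since the comment says the copy should take less than 10 minutes, this rate is a lower bound on what the transfer path would need to sustain across the cluster.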
