[ https://issues.apache.org/jira/browse/HADOOP-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661538#action_12661538 ]
Jothi Padmanabhan commented on HADOOP-1338:
-------------------------------------------
I tested a patch where
* Each copier thread is assigned one host
* Each copier thread would pull 'n' map outputs from its assigned host (until a
specific size threshold had been pulled) before moving on to the next host
* Each fetch would be one map request/response (as it exists in the trunk)
With the above patch, I did not observe any improvement at all (across a
variety of map sizes with the loadgen program). The underlying assumption was
that, since each thread holds on to a single host, HTTP keep-alive (handled
transparently by the JVM's java.net layer) would kick in and turn most of the
per-map connection setups into no-ops, as these are connections to the same
host/port. However, keep-alive does not appear to be kicking in, and we see
similar shuffle times with and without this patch.
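For reference, the behavior we were counting on is the JVM's transparent
keep-alive pooling: a socket goes back into the per-host:port pool (and gets
reused by the next request) only if the previous response body was fully
drained and the stream closed. A minimal sketch of a copier loop written
against that behavior, with an illustrative mapOutput URL rather than the
exact servlet parameters:

{code:java}
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Sketch: fetch several map outputs from one host, relying on the JVM's
// keep-alive cache (enabled by default via the http.keepAlive property).
// The underlying socket is reused only when each response body is fully
// consumed and the stream is closed; otherwise every fetch opens a fresh
// TCP connection. The URL parameters below are illustrative.
public class KeepAliveFetchSketch {
  static void fetchFromHost(String host, int port, String[] mapIds,
      int reduce) throws Exception {
    byte[] buf = new byte[64 * 1024];
    for (String mapId : mapIds) {
      URL url = new URL("http://" + host + ":" + port
          + "/mapOutput?map=" + mapId + "&reduce=" + reduce);
      HttpURLConnection conn = (HttpURLConnection) url.openConnection();
      InputStream in = conn.getInputStream();
      try {
        // Drain the body completely so the socket can go back to the pool.
        while (in.read(buf) >= 0) {
          // hand the bytes to the in-memory/on-disk shuffle buffer here
        }
      } finally {
        in.close(); // returns the connection to the keep-alive cache
      }
    }
  }
}
{code}

If a fetch aborts mid-stream, the socket is discarded rather than pooled, and
the next request pays the full connection-setup cost again.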
We did another test where the code was hacked so that the copier fetches a
configurable number of maps at a time and the TT replies to this request by
clubbing the requested map outputs together into a single response. The
received map outputs were simply discarded at the reducer (written neither to
disk nor to memory) so that we measured only the network performance. The
results:
||Number of Maps Per Fetch||Average Shuffle Time (mm:ss)||Worst-case Shuffle Time (mm:ss)||
|1|1:27|4:20|
|2|1:11|2:11|
|4|0:47|1:11|
|8|0:29|0:41|
From this it does appear that we would benefit from modifying the fetch
protocol to fetch several maps in one shot, using the same connection.
Thoughts?
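To make the proposed protocol change concrete, here is a rough client-side
sketch of such a batched fetch. The maps= query parameter and the
[mapId][length][payload] framing are assumptions for illustration, not the
exact protocol of the hack above:

{code:java}
import java.io.DataInputStream;
import java.io.EOFException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical batched fetch: one request names several maps, and the TT
// streams them back as [UTF mapId][long byteCount][payload] frames over a
// single connection. Parameter name and framing are assumptions.
public class BatchFetchSketch {
  static void fetchBatch(String host, int port, String[] mapIds, int reduce)
      throws Exception {
    StringBuilder maps = new StringBuilder();
    for (int i = 0; i < mapIds.length; i++) {
      if (i > 0) maps.append(',');
      maps.append(mapIds[i]);
    }
    URL url = new URL("http://" + host + ":" + port
        + "/mapOutput?maps=" + maps + "&reduce=" + reduce);
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    DataInputStream in = new DataInputStream(conn.getInputStream());
    try {
      byte[] buf = new byte[64 * 1024];
      for (int i = 0; i < mapIds.length; i++) {
        String mapId = in.readUTF();    // which map this frame belongs to
        long remaining = in.readLong(); // payload size for this map output
        while (remaining > 0) {
          int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
          if (n < 0) {
            throw new EOFException("truncated output for map " + mapId);
          }
          remaining -= n;
          // pass the bytes to the merge phase here
        }
      }
    } finally {
      in.close();
    }
  }
}
{code}

The server side would cap the total bytes per response, as the issue
description below suggests, so that a single jetty thread is not held
indefinitely.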
> Improve the shuffle phase by using the "connection: keep-alive" and doing
> batch transfers of files
> --------------------------------------------------------------------------------------------------
>
> Key: HADOOP-1338
> URL: https://issues.apache.org/jira/browse/HADOOP-1338
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Devaraj Das
> Assignee: Jothi Padmanabhan
>
> We should do transfers of map outputs at the granularity of
> *total-bytes-transferred* rather than the current way of transferring a
> single file and then closing the connection to the server. A single
> TaskTracker might have a couple of map output files for a given reduce, and
> we should transfer multiple of them (up to a certain total size) in a single
> connection to the TaskTracker. Using HTTP/1.1's keep-alive connection would
> help since it would keep the connection open for more than one file transfer.
> We should limit the transfers to a certain size so that we don't hold up a
> jetty thread indefinitely (and cause timeouts for other clients).
> Overall, this should give us improved performance.