[jira] [Comment Edited] (TEZ-1157) Optimize broadcast :- Tasks pertaining to same job in same machine should not download multiple copies of broadcast data

Rajesh Balamohan (JIRA) Wed, 28 May 2014 18:53:32 -0700

    [ 
https://issues.apache.org/jira/browse/TEZ-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011964#comment-14011964
 ]


Rajesh Balamohan edited comment on TEZ-1157 at 5/29/14 1:52 AM:
----------------------------------------------------------------

Other containers first check whether the data for a specific source is already 
available locally (or is being written, i.e. a partial write).  If so, they 
wait until the local copy is complete and do not make remote HTTP calls. 
Tailing is not supported in option 2 (to reduce complexity); it would need a 
progress indicator, which has the associated issues you mentioned.  It would 
also need premature-EOF handling, which requires a good amount of change to the 
existing codebase.
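
To make the check-then-wait behaviour concrete, here is a rough sketch. It is 
not taken from the attached patch: the class name, the ".partial"/".data" 
file-naming convention, the Downloader interface and the polling interval are 
all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class LocalBroadcastCopy {

    /** Hypothetical stand-in for the remote HTTP fetch. */
    public interface Downloader {
        void downloadTo(Path target) throws IOException;
    }

    /**
     * Returns the local copy for sourceId. The first caller downloads it;
     * later callers wait for the finished file instead of going remote.
     */
    public static Path obtain(Path dir, String sourceId, Downloader remote,
                              long timeoutMs)
            throws IOException, InterruptedException {
        Path done = dir.resolve(sourceId + ".data");        // complete copy
        Path partial = dir.resolve(sourceId + ".partial");  // being written

        if (Files.exists(done)) {
            return done;
        }
        try {
            // Atomic create: only one container wins the right to download.
            Files.createFile(partial);
        } catch (FileAlreadyExistsException e) {
            // Someone else is writing. No tailing: poll until it is complete.
            long deadline = System.currentTimeMillis() + timeoutMs;
            while (System.currentTimeMillis() < deadline) {
                if (Files.exists(done)) {
                    return done;
                }
                Thread.sleep(50);
            }
            throw new IOException("Timed out waiting for local copy of "
                    + sourceId);
        }
        try {
            remote.downloadTo(partial);
            // Publish atomically so waiters never see a half-written file.
            Files.move(partial, done, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            Files.deleteIfExists(partial);  // let another container retry
            throw e;
        }
        return done;
    }
}
```

A writer that dies without cleaning up leaves a stale ".partial" file behind, 
so waiters only recover through the timeout; a real implementation would need 
something better there.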

Using the existing ShuffleHandler service would be beneficial when containers 
start at different times.  When containers start at the same time, they would 
not realize that the data is already being downloaded locally by one of them, 
so they would end up downloading it remotely.  

If we had a custom shuffle handler with a tailing feature, it could serve 
partially downloaded data to other containers.  An added advantage of the 
custom shuffle handler approach is that it can serve containers on the local 
node as well as on other nodes in the cluster.  

In short, data has to be copied to the local node to avoid duplicate downloads, 
but it should be served via a custom shuffle handler in the NM.  Other 
containers would try to fetch the data from the local NM URL and, upon failure, 
fall back to the remote URL.   Please let me know your thoughts.
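
A minimal sketch of the local-first fetch with remote fallback might look like 
the following. The class, the Opener interface and the URL parameters are 
hypothetical names for illustration only; they are not existing Tez or 
ShuffleHandler APIs.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;

public class LocalFirstFetcher {

    /** Hypothetical abstraction over "open an HTTP stream for this URL". */
    public interface Opener {
        InputStream open(String url) throws IOException;
    }

    /** Plain HTTP opener; no timeouts or retries, to keep the sketch short. */
    public static InputStream openHttp(String url) throws IOException {
        return URI.create(url).toURL().openStream();
    }

    /**
     * Try the shuffle handler on the local NM first; on any I/O failure,
     * fall back to the remote URL that originally hosts the data.
     */
    public static InputStream fetch(String localNmUrl, String remoteUrl,
                                    Opener opener) throws IOException {
        try {
            return opener.open(localNmUrl);
        } catch (IOException localFailure) {
            try {
                return opener.open(remoteUrl);
            } catch (IOException remoteFailure) {
                // Keep the local error around for debugging.
                remoteFailure.addSuppressed(localFailure);
                throw remoteFailure;
            }
        }
    }
}
```

A caller would pass LocalFirstFetcher::openHttp as the opener; tests can pass 
a stub instead.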


> Optimize broadcast :- Tasks pertaining to same job in same machine should not 
> download multiple copies of broadcast data
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: TEZ-1157
>                 URL: https://issues.apache.org/jira/browse/TEZ-1157
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>              Labels: performance
>         Attachments: TEZ-1152.WIP.patch
>
>
> Currently, tasks belonging to the same job and running on the same machine 
> each download their own copy of the broadcast data.  An optimization would be 
> to download one copy per machine and have the rest of the tasks refer to it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
