[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565792#comment-13565792
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-4892:
----------------------------------------------------

In my example, the maxMapperSize is one split (the default).

bq. In general, and especially in skewed distributions, we will be running the 
inner for loop (that tries to group blocks) multiple times with no useful 
outcome.
I don't see how there will be iterations with no outcome. We will always try to 
allocate to some node or break if we cannot allocate any.

OTOH, we can improve on allocating one split at a time: in each pass, allocate 
as many as possible to each node, but no more than the minimum across all nodes.
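
A rough sketch of that batched round-robin idea, to make the proposal concrete. 
This is not the code in the attached patch. It assumes a simplified model in 
which a split is a fixed number of blocks (blocksPerSplit) rather than a byte 
size, and the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not the code in MAPREDUCE-4892.1.patch.
class BatchedRoundRobinSketch {

  // Number of splits a node could still form from its unassigned local blocks.
  static int formable(Set<Integer> local, Set<Integer> assigned, int blocksPerSplit) {
    int free = 0;
    for (int b : local) {
      if (!assigned.contains(b)) free++;
    }
    return free / blocksPerSplit;
  }

  // Returns node -> number of splits created on that node.
  static Map<String, Integer> allocate(Map<String, Set<Integer>> nodeToBlocks,
                                       int blocksPerSplit) {
    Map<String, Integer> splits = new LinkedHashMap<>();
    for (String node : nodeToBlocks.keySet()) splits.put(node, 0);
    Set<Integer> assigned = new HashSet<>();
    while (true) {
      // Batch size for this pass: the minimum across nodes that can still form a split.
      int batch = Integer.MAX_VALUE;
      for (Set<Integer> local : nodeToBlocks.values()) {
        int n = formable(local, assigned, blocksPerSplit);
        if (n > 0) batch = Math.min(batch, n);
      }
      if (batch == Integer.MAX_VALUE) break; // no node can form a split: stop

      for (Map.Entry<String, Set<Integer>> e : nodeToBlocks.entrySet()) {
        // Re-check here: earlier nodes in this pass may have consumed shared replicas.
        int n = Math.min(batch, formable(e.getValue(), assigned, blocksPerSplit));
        int take = n * blocksPerSplit;
        for (int b : e.getValue()) {
          if (take == 0) break;
          if (assigned.add(b)) take--; // add() is true only for unassigned blocks
        }
        splits.merge(e.getKey(), n, Integer::sum);
      }
    }
    return splits;
  }
}
```

Every pass in which the batch size is defined creates at least one split, so 
the loop has no empty iterations. The resulting distribution still depends on 
node order and on which replicas overlap, which this sketch does not try to 
optimize.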

bq. Secondly, for large jobs with multi-wave mappers, effects of non-local 
reads or delay scheduling might be small compared to the overall cost 
(including multiple runs). So this change mainly boils down to effects on 
single mapper wave jobs or small jobs.
That's not true; locality affects large jobs too, unless your definition of 
"large" means reading little and processing for a long time.

bq. In these cases, it is arguable that having a few rack local mappers (which 
significantly increases the on-time start of the mapper) might be more 
beneficial than having the mapper wait on a machine because it was node local 
to it. Likelihood of waiting is high since the machine has many mappers already 
assigned to it.
That's a general argument against node locality itself, one that is taken care 
of by other mechanisms like locality-wait/delay.

                
> CombineFileInputFormat node input split can be skewed on small clusters
> -----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4892
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4892
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>             Fix For: 3.0.0
>
>         Attachments: MAPREDUCE-4892.1.patch
>
>
> The CombineFileInputFormat split generation logic tries to group blocks by 
> node in order to create splits. It iterates through the nodes and creates 
> splits on them until there aren't enough blocks left on a node that can be 
> grouped into a valid split. If the first few nodes have a lot of blocks on 
> them then they can end up getting a disproportionately large share of the 
> total number of splits created. This can result in poor locality of maps. 
> This problem is likely to happen on small clusters where it's easier to create 
> a skew in the distribution of blocks on nodes.
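
The skew described in the issue can be illustrated with a small model of 
node-by-node grouping. This is only a sketch under assumptions, not Hadoop's 
actual CombineFileInputFormat code: blocks are plain integers, a split is a 
fixed number of blocks (blocksPerSplit), and the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the behavior described in the issue; not Hadoop's code.
class GreedyNodeSplitSketch {

  // Visits nodes in order and lets each node form as many splits as it can
  // before moving on. Returns node -> number of splits created on that node.
  static Map<String, Integer> allocate(Map<String, Set<Integer>> nodeToBlocks,
                                       int blocksPerSplit) {
    Map<String, Integer> splits = new LinkedHashMap<>();
    Set<Integer> assigned = new HashSet<>();
    for (Map.Entry<String, Set<Integer>> e : nodeToBlocks.entrySet()) {
      // Blocks with a replica on this node that no earlier node has consumed.
      List<Integer> free = new ArrayList<>();
      for (int b : e.getValue()) {
        if (!assigned.contains(b)) free.add(b);
      }
      int created = free.size() / blocksPerSplit;
      assigned.addAll(free.subList(0, created * blocksPerSplit));
      splits.put(e.getKey(), created);
    }
    return splits;
  }
}
```

With node A holding replicas of blocks 0 to 7, node B holding replicas of 
blocks 4 to 7, and two blocks per split, A is visited first and takes all four 
splits, leaving B with none.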

