[ https://issues.apache.org/jira/browse/MRQL-54?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Leonidas Fegaras updated MRQL-54:
---------------------------------
    Attachment: MRQL-54.patch

> Adjust the split size of a map-reduce input file based on the number of 
> requested nodes
> ---------------------------------------------------------------------------------------
>
>                 Key: MRQL-54
>                 URL: https://issues.apache.org/jira/browse/MRQL-54
>             Project: MRQL
>          Issue Type: Improvement
>          Components: Run-Time/MapReduce
>    Affects Versions: 0.9.4
>            Reporter: Leonidas Fegaras
>            Assignee: Leonidas Fegaras
>            Priority: Critical
>         Attachments: MRQL-54.patch
>
>
> This patch fixes a performance problem reported by Eldon Carman. It improves 
> the degree of parallelism of map tasks in map-reduce mode. Previously, 
> mapred.min.split.size was set to 256 MB before each map-reduce task, which 
> prevented the mappers from using all the requested cluster nodes (although 
> the number of reducers was set correctly using setNumReduceTasks). Now 
> mapred.min.split.size and mapred.max.split.size are both set based on 
> the input size and the number of requested nodes.
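
To make the effect described above concrete, here is a minimal sketch in plain Java. It is not the code in MRQL-54.patch. The class and method names (SplitSizeSketch, targetSplitSize) are hypothetical, and the ceiling-division policy is an assumption about what "based on the input size and the number of requested nodes" could mean. The computeSplitSize method mirrors the formula used by Hadoop's new-API FileInputFormat, max(minSize, min(maxSize, blockSize)), and is reimplemented locally so the example runs without Hadoop on the classpath.

```java
// Hypothetical illustration only; names and policy are assumptions,
// not taken from MRQL-54.patch.
final class SplitSizeSketch {

    // Same formula as Hadoop's new-API FileInputFormat.computeSplitSize,
    // copied here so the sketch has no Hadoop dependency.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Assumed policy: aim for about one split per requested node,
    // using ceiling division so the split count does not exceed the node count.
    static long targetSplitSize(long inputBytes, int requestedNodes) {
        if (inputBytes < 0 || requestedNodes <= 0) {
            throw new IllegalArgumentException("need inputBytes >= 0 and requestedNodes > 0");
        }
        long size = (inputBytes + requestedNodes - 1) / requestedNodes;
        return Math.max(1L, size); // a split size of 0 would be meaningless
    }

    public static void main(String[] args) {
        final long MB = 1024L * 1024L;
        long inputBytes = 1024 * MB; // illustrative 1 GB input
        long blockSize = 64 * MB;    // illustrative HDFS block size
        int nodes = 16;              // illustrative number of requested nodes

        // Old behaviour: a fixed 256 MB minimum overrides the block size.
        long oldSplit = computeSplitSize(blockSize, 256 * MB, Long.MAX_VALUE);
        System.out.println("fixed minimum: " + (inputBytes / oldSplit) + " splits");

        // New behaviour (sketch): min and max both set to the same target.
        long target = targetSplitSize(inputBytes, nodes);
        long newSplit = computeSplitSize(blockSize, target, target);
        System.out.println("node-based:    " + (inputBytes / newSplit) + " splits");

        // With Hadoop available, the target would be applied to a job's
        // Configuration along these lines:
        //   conf.setLong("mapred.min.split.size", target);
        //   conf.setLong("mapred.max.split.size", target);
    }
}
```

With these illustrative numbers, the fixed 256 MB minimum yields only 4 splits for a 1 GB input, so at most 4 of the 16 requested nodes can run a mapper. Setting both bounds to the same node-based target pins the split size regardless of the block size, which is why both properties need to be set.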



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
