[ https://issues.apache.org/jira/browse/MRQL-54?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Leonidas Fegaras updated MRQL-54:
---------------------------------
    Attachment: MRQL-54.patch

> Adjust the split size of a map-reduce input file based on the number of requested nodes
> ---------------------------------------------------------------------------------------
>
>                 Key: MRQL-54
>                 URL: https://issues.apache.org/jira/browse/MRQL-54
>             Project: MRQL
>          Issue Type: Improvement
>          Components: Run-Time/MapReduce
>    Affects Versions: 0.9.4
>            Reporter: Leonidas Fegaras
>            Assignee: Leonidas Fegaras
>            Priority: Critical
>         Attachments: MRQL-54.patch
>
>
> This patch fixes a performance problem reported by Eldon Carman. It improves
> the degree of parallelism of map tasks in map-reduce mode. Before this patch,
> mapred.min.split.size was set to 256 MB before each map-reduce task, which
> prevented the mappers from using all requested cluster nodes (although the
> number of reducers was set correctly using setNumReduceTasks). Now
> mapred.min.split.size and mapred.max.split.size are both set based on the
> input size and the number of requested nodes.
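
The message above does not show how the patch derives the split size, so the following is only a minimal sketch of one plausible approach, assuming the input is divided evenly so that there is roughly one split per requested node. The class SplitSizeSketch and the methods splitSize and splitProperties are hypothetical names for illustration and are not taken from MRQL-54.patch. Only the two property names come from the issue text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SplitSizeSketch {

    // Hypothetical helper: bytes per split so that about `nodes` splits
    // cover the whole input (assumption: even division, rounded up).
    static long splitSize(long inputBytes, int nodes) {
        if (inputBytes < 0 || nodes <= 0) {
            throw new IllegalArgumentException("inputBytes must be >= 0 and nodes > 0");
        }
        // Ceiling division, with a floor of 1 byte so the size is never zero.
        return Math.max(1L, (inputBytes + nodes - 1) / nodes);
    }

    // Builds the two settings named in the issue. Both get the same value,
    // which pins the split size instead of leaving a range.
    static Map<String, String> splitProperties(long inputBytes, int nodes) {
        long size = splitSize(inputBytes, nodes);
        Map<String, String> props = new LinkedHashMap<>();
        props.put("mapred.min.split.size", Long.toString(size));
        props.put("mapred.max.split.size", Long.toString(size));
        return props;
    }

    public static void main(String[] args) {
        long oneGiB = 1L << 30;
        // 1 GiB of input on 8 requested nodes yields 128 MiB splits.
        System.out.println(splitProperties(oneGiB, 8));
    }
}
```

In a real job these values would presumably be written to the Hadoop job configuration before submission rather than to a plain map. With a fixed 256 MB minimum, a 1 GiB input can produce at most 4 splits, which would leave half of 8 requested nodes without a map task. Sizing the split from the input and the node count avoids that.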