The maximum number of tasks running at once per node is dictated by the
following properties, shown here as name = value:

mapred.tasktracker.map.tasks.maximum = 6
mapred.tasktracker.reduce.tasks.maximum = 4
I do not work with EC2, so I do not know how to adjust it there.
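
To make the arithmetic concrete, here is a rough sketch of how per-node slot limits translate into cluster-wide concurrency. It assumes the values 6 and 4 quoted above apply to every node and that all 3 nodes run a TaskTracker; neither may hold on an EMR cluster, so treat the numbers as illustrative only. The helper names (concurrent_slots, waves) are made up for this example and are not part of Hadoop or Hive.

```python
import math


def concurrent_slots(nodes: int, slots_per_node: int) -> int:
    """Cluster-wide number of tasks that can run at the same time."""
    return nodes * slots_per_node


def waves(total_tasks: int, slots: int) -> int:
    """Number of scheduling rounds needed to run total_tasks in `slots` slots."""
    return math.ceil(total_tasks / slots)


# Assumed values: 3 worker nodes, 6 map slots and 4 reduce slots per node.
NODES = 3
map_slots = concurrent_slots(NODES, 6)
reduce_slots = concurrent_slots(NODES, 4)

# Task counts taken from the question: 34 map tasks and 2 reduce tasks.
map_waves = waves(34, map_slots)
reduce_waves = waves(2, reduce_slots)

print(f"map slots: {map_slots}, map waves: {map_waves}")
print(f"reduce slots: {reduce_slots}, reduce waves: {reduce_waves}")
```

Under those assumptions, fewer map tasks than the total would be running at any one moment simply because the cluster has fewer map slots than map tasks.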
Hive prints a message like this while the query runs:
Number of reduce tasks not
Hi,
Is there any page/document that describes the methods/techniques used by
Hive to arrive at the optimum number of map tasks & optimum number of reduce
tasks?
I'm running a 3-node Amazon EMR cluster, and Hive has determined that 34 map
& 2 reduce tasks are optimum. Out of the 34 map tasks only