I think mapred.max.split.size is not set by default. The max split size is not 
the same as the HDFS block size.


From: Daniel,Wu [mailto:hadoop...@163.com]
Sent: Tuesday, August 23, 2011 11:44 PM
To: user@hive.apache.org
Subject: Re:RE: Why a sql only use one map task?

I checked my settings; they are all at the default values. So per the book "Hadoop: 
The Definitive Guide", the split size should be 64 MB. The file size is about 
500 MB, so that's about 8 splits. And from the map job information (after the map 
job is done), I can see it reads 8 splits from one node. But it still starts 
only one map task.
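For reference, here is a minimal sketch of the split arithmetic above, assuming the classic FileInputFormat sizing formula max(minSize, min(maxSize, blockSize)); the function names are illustrative, not Hadoop's actual API:

```python
# Sketch of how FileInputFormat sizes splits (simplified).
# Assumes splitSize = max(minSize, min(maxSize, blockSize)), with the
# old defaults minSize = 1 and maxSize effectively unbounded.
def compute_split_size(block_size, min_size, max_size):
    return max(min_size, min(max_size, block_size))

def num_splits(file_size, split_size):
    # Ceiling division; real Hadoop also applies a ~1.1 "slop" factor,
    # which gives the same answer here.
    return -(-file_size // split_size)

MB = 1024 * 1024
split = compute_split_size(block_size=64 * MB, min_size=1, max_size=2**63 - 1)
print(num_splits(500 * MB, split))  # prints 8
```

With all defaults the split size collapses to the 64 MB block size, which is why a ~500 MB file should produce about 8 splits, and hence more than one map task.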


At 2011-08-24 02:28:18, "Aggarwal, Vaibhav" <vagg...@amazon.com> wrote:

If your files are actually splittable, you can set mapred.max.split.size 
appropriately to create more splits.
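For example, in a Hive session one might try something like the following (the values are illustrative; 128 MB is shown in bytes):

```sql
-- Hedged sketch: cap the split size at 128 MB so a ~500 MB splittable
-- file yields ~4 splits. This property name is for older Hadoop/Hive
-- releases; newer ones spell it mapreduce.input.fileinputformat.split.maxsize.
SET mapred.max.split.size=134217728;
SELECT COUNT(*) FROM sales;
```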

Thanks
Vaibhav

From: Daniel,Wu [mailto:hadoop...@163.com]
Sent: Tuesday, August 23, 2011 6:51 AM
To: hive
Subject: Why a sql only use one map task?

  I ran the following simple SQL:
select count(*) from sales;
And the job information shows it uses only one map task.

The underlying Hadoop cluster has 3 data nodes, so I expected Hive to kick off 3 
map tasks, one on each node. What can make Hive run only one map task? Do I need 
to set something to kick off multiple map tasks? I haven't changed the Hive 
config from its defaults.

