From: "Daniel,Wu"
Date: Thu, 25 Aug 2011 20:02:43
To:
Reply-To: user@hive.apache.org
Subject: Re:Re:Re: Re: RE: Why a sql only use one map task?
After I set
set mapred.min.split.size=2;
it kicks off 3 map tasks (the file I have is 500 MB). So it looks like we
need to set mapred.min.split.size, rather than mapred.map.tasks, to control
how many map tasks are launched.
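Editor's sketch of why mapred.map.tasks acts only as a hint. It assumes the
split-size rule of the old-API FileInputFormat, max(minSize, min(goalSize,
blockSize)), applied to a single uncompressed file. The function names and
the 64 MB block size are illustrative assumptions, and the sketch is not
expected to reproduce the exact task counts reported in this thread.

```python
# Rough model of how the old-API FileInputFormat sizes splits for ONE file.
# Assumption: a single splittable file; real jobs sum over all input files.

SPLIT_SLOP = 1.1  # the last split may be up to 10% larger than split_size


def compute_split_size(goal_size, min_size, block_size):
    """Split size = max(min_size, min(goal_size, block_size))."""
    return max(min_size, min(goal_size, block_size))


def count_splits(file_size, block_size, requested_maps=1, min_split_size=1):
    """Estimate the number of splits (and therefore map tasks) for one file.

    requested_maps models mapred.map.tasks, min_split_size models
    mapred.min.split.size (both in bytes where applicable).
    """
    goal_size = file_size // max(requested_maps, 1)
    split_size = compute_split_size(goal_size, min_split_size, block_size)

    remaining = file_size
    splits = 0
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    if remaining > 0:
        splits += 1  # trailing partial split
    return splits


if __name__ == "__main__":
    MB = 1024 * 1024
    # Asking for fewer maps than there are blocks does not change the result,
    # because goal_size is capped by block_size.
    print(count_splits(500 * MB, 64 * MB, requested_maps=2))
    print(count_splits(500 * MB, 64 * MB, requested_maps=3))
    # Raising the minimum split size above the block size does reduce it.
    print(count_splits(500 * MB, 64 * MB, min_split_size=256 * MB))
```

In this model, lowering the requested map count below the block count has no
effect, while raising the minimum split size above the block size reduces the
number of splits.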
At 2011-08-25 19:38:30,"Daniel,Wu" wrote:
It works after I set it as you said, but it looks like I can't control the
number of map tasks: it always uses 9 maps, even if I set
set mapred.map.tasks=2;
Kind    % Complete  Num Tasks  Pending  Running  Complete  Killed  Failed/Killed Task Attempts
map     100.00%     9          0        0        9         0       0 / 0
reduce  100.00%     1          0        0        1         0       0 / 0
At 2011-08-25 06:35:38,
This may be because CombineHiveInputFormat is combining your splits into one
map task. If you don't want that to happen, do:
hive> set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat
2011/8/24 Daniel,Wu
I pasted the information below; the map capacity is 6. No matter how I set
mapred.map.tasks (for example, to 3), it doesn't work: the job always uses
1 map task (please see the completed job information).
Cluster Summary (Heap Size is 16.81 MB/966.69 MB)
Running Map Tasks  Running Reduce Tasks  Total
What about your total Map Task Capacity?
You can check it at http://your_jobtracker:50030/jobtracker.jsp
2011/8/24 Daniel,Wu :
> I checked my settings; all are at their default values. So per the book
> "Hadoop: The Definitive Guide", the split size should be 64M. And the file
> size is about 500
If you actually have splittable files, you can create more splits by setting
the following property appropriately:
mapred.max.split.size
Thanks
Vaibhav
From: Daniel,Wu [mailto:hadoop...@163.com]
Sent: Tuesday, August 23, 2011 6:51 AM
To: hive
Subject: Why a sql only use one map task?
I run the fo
Hey, are you storing the data in a zipped (compressed) format?
If yes, that is why the input is handled as a single split by one map task.
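Editor's sketch of the point above: a file compressed with a non-splittable
codec such as gzip is read by a single map task regardless of its size. The
function name, the file names, and the suffix list are illustrative
assumptions; the estimate for splittable files is a plain block count and
ignores details such as the split slop and combined splits.

```python
import math

# Assumption for illustration: only gzip is listed as non-splittable here.
# Whether other codecs can be split depends on the codec and Hadoop version.
NON_SPLITTABLE_SUFFIXES = (".gz",)


def estimate_map_tasks(file_name, file_size, block_size):
    """Rough map-task estimate for one input file.

    A non-splittable compressed file must be read from start to finish by
    one mapper, so it yields exactly one split. Otherwise assume roughly
    one split per block.
    """
    if file_name.endswith(NON_SPLITTABLE_SUFFIXES):
        return 1
    return max(1, math.ceil(file_size / block_size))


if __name__ == "__main__":
    MB = 1024 * 1024
    # Hypothetical file names for a table stored as one ~500 MB file.
    print(estimate_map_tasks("sales.gz", 500 * MB, 64 * MB))
    print(estimate_map_tasks("sales.txt", 500 * MB, 64 * MB))
```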
2011/8/23 Daniel,Wu
> I run the following simple sql
> select count(*) from sales;
> And the job information shows it only uses one map task.
>
> The underlying hadoop has 3 data/data nodes. So I expect