Re: Re:Re: Re: RE: Why a sql only use one map task?

2011-08-25 Thread bejoy_ks
From: "Daniel,Wu" Date: Thu, 25 Aug 2011 20:02:43 To: Reply-To: user@hive.apache.org Subject: Re:Re:Re: Re: RE: Why a sql only use one map task? after I set set mapred.min.split.size=2; Then it will kick off 3 map tasks (the file I have is 500M). So looks like

Re:Re:Re: Re: RE: Why a sql only use one map task?

2011-08-25 Thread Daniel,Wu
After I set mapred.min.split.size=2; it kicks off 3 map tasks (the file I have is 500M). So it looks like we need to set mapred.min.split.size instead of mapred.map.tasks to control how many maps to kick off. At 2011-08-25 19:38:30,"Daniel,Wu" wrote: It works, after I set
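
As background for the split-size settings discussed in this thread, here is a minimal sketch of the split-size rule used by Hadoop's new-API FileInputFormat, which picks max(minSize, min(maxSize, blockSize)). The function names, the 64 MB default block size, and the single 500 MB splittable file are assumptions for illustration. The sketch ignores Hadoop's slack on the last split and does not model CombineHiveInputFormat or HiveInputFormat, so it will not necessarily reproduce the map counts reported above.

```python
MB = 1024 * 1024
LONG_MAX = 2**63 - 1  # stand-in for an unset maximum split size


def compute_split_size(block_size, min_size, max_size):
    """Split size rule: max(minSize, min(maxSize, blockSize))."""
    return max(min_size, min(max_size, block_size))


def estimate_num_splits(file_size, block_size=64 * MB,
                        min_size=1, max_size=LONG_MAX):
    """Rough split count for one splittable file (hypothetical helper).

    Uses ceiling division and ignores the small slack Hadoop allows
    on the final split.
    """
    split_size = compute_split_size(block_size, min_size, max_size)
    return -(-file_size // split_size)  # ceiling division on integers


if __name__ == "__main__":
    size = 500 * MB
    print(estimate_num_splits(size))                     # defaults
    print(estimate_num_splits(size, min_size=256 * MB))  # larger minimum, fewer splits
    print(estimate_num_splits(size, max_size=32 * MB))   # smaller maximum, more splits
```

Under this rule a minimum below the block size leaves the split size unchanged, while a maximum below the block size increases the number of splits.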

Re:Re: Re: RE: Why a sql only use one map task?

2011-08-25 Thread Daniel,Wu
It works, after I set as you said, but looks like I can't control the map task, it always use 9 maps, even if I set set mapred.map.tasks=2;

Kind    % Complete  Num Tasks  Pending  Running  Complete  Killed  Failed/Killed Task Attempts
map     100.00%     9          0        0        9         0       0 / 0
reduce  100.00%     1          0        0        1         0       0 / 0

At 2011-08-25 06:35:38,

Re: Re: RE: Why a sql only use one map task?

2011-08-24 Thread Ashutosh Chauhan
This may be because CombineHiveInputFormat is combining your splits in one map task. If you don't want that to happen, do:

hive> set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat

2011/8/24 Daniel,Wu
> I pasted the inform I pasted blow, the map capacity is 6. And no matter how >

Re:Re: RE: Why a sql only use one map task?

2011-08-24 Thread Daniel,Wu
I pasted the information below; the map capacity is 6. And no matter how I set mapred.map.tasks, such as 3, it doesn't work, as it always uses 1 map task (please see the completed job information). Cluster Summary (Heap Size is 16.81 MB/966.69 MB) Running Map Tasks | Running Reduce Tasks | Total

Re: RE: Why a sql only use one map task?

2011-08-24 Thread wd
What about your total Map Task Capacity? You can check it at http://your_jobtracker:50030/jobtracker.jsp 2011/8/24 Daniel,Wu : > I checked my setting, all are with the default value. So per the book of > "Hadoop the definitive guide", the split size should be 64M. And the file > size is about 500

RE: Why a sql only use one map task?

2011-08-23 Thread Aggarwal, Vaibhav
If you actually have splittable files, you can create more splits by setting mapred.max.split.size appropriately. Thanks Vaibhav From: Daniel,Wu [mailto:hadoop...@163.com] Sent: Tuesday, August 23, 2011 6:51 AM To: hive Subject: Why a sql only use one map task? I run the fo

Re: Why a sql only use one map task?

2011-08-23 Thread Vikas Srivastava
Hey, are you storing the data in a zipped format? If yes, that is why it is only split into a single map. 2011/8/23 Daniel,Wu > I run the following simple sql > select count(*) from sales; > And the job information shows it only uses one map task. > > The underlying hadoop has 3 data/data nodes. So I expect