I tried that but it does not seem to work.

hive> set mapred.map.tasks=100;
hive> select count(1) from FCT_PRSC where date_key>='2009121600' and
date_key<'2009121700';
Total MapReduce jobs = 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>


And this is what I see:
Jobid:                      job_200912162225_0033
Priority:                   NORMAL
User:                       dwhs
Name:                       select count(1) from ...ate_key<'2009121700' (1/1)
Map % Complete:             1.06%
Map Total:                  657
Maps Completed:             7
Reduce % Complete:          0.00%
Reduce Total:               1
Reduces Completed:          0

As you can see, the "Map Total" is 657, not the 100 I asked for.
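If I understand correctly, mapred.map.tasks is only a hint to the InputFormat:
the real mapper count comes from the number of input splits. The knobs below
are the ones I believe apply (property names are from stock Hadoop/Hive and
may vary by version), following the same pattern Hive prints for reducers:

In order to raise the minimum split size (in bytes), so fewer splits/mappers
are created:
  set mapred.min.split.size=<number>
In order to have Hive combine small input files into fewer splits (only in
later Hive releases):
  set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;

With many small files, raising the split size alone will not help, since each
file still produces at least one split; that is what the combine input format
is for.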


Lee Sagi | Data Warehouse Tech Lead & Architect | Work: 650-616-6575 |
Cell: 718-930-7947

-----Original Message-----
From: Edward Capriolo [mailto:edlinuxg...@gmail.com] 
Sent: Thursday, December 17, 2009 11:03 AM
To: hive-user@hadoop.apache.org
Subject: Re: Throttling hive queries

You should be able to:

hive> set mapred.map.tasks=1000;
hive> set mapred.reduce.tasks=5;

In some cases the number of mappers is controlled by the input files (pre Hadoop 0.20).
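To illustrate: in the old-API FileInputFormat, the requested map count only
feeds into a "goal" split size, which is then clamped by the block size and
the minimum split size. Here is a rough Python sketch of that logic (it
ignores split slop and non-splittable files; the 64 MB block size is just an
assumed default, not something read from your cluster):

```python
# Sketch of how Hadoop's old-API FileInputFormat picks a split size.
# mapred.map.tasks only influences goal_size; the block size and minimum
# split size can override it, which is why the actual mapper count can
# far exceed (or undercut) the requested number.

def compute_split_size(total_size, requested_maps, block_size, min_size=1):
    goal_size = total_size // max(requested_maps, 1)
    return max(min_size, min(goal_size, block_size))

def num_splits(file_sizes, requested_maps,
               block_size=64 * 1024 * 1024, min_size=1):
    total = sum(file_sizes)
    split = compute_split_size(total, requested_maps, block_size, min_size)
    # each file is split independently: ceil(size / split) splits per file
    return sum(-(-size // split) for size in file_sizes)

# 657 block-sized files: asking for 100 maps cannot reduce the count,
# because the split size is capped at the block size.
print(num_splits([64 * 1024 * 1024] * 657, 100))  # -> 657
```

So a query over 657 blocks of data will launch 657 mappers no matter what
mapred.map.tasks says, unless the split size (or input format) is changed.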


On Thu, Dec 17, 2009 at 1:58 PM, Sagi, Lee <ls...@shopping.com> wrote:
> Is there a way to throttle hive queries?
>
> For example, I want to tell hive to not use more than 1000 mappers and
> 5 reducers for a particular query (or session).
>
