Re: Throttling hive queries

2009-12-17 Thread Todd Lipcon
Hi Sagi,

Any chance you're running on a directory that has 614 small files?

-Todd

On Thu, Dec 17, 2009 at 2:30 PM, Sagi, Lee wrote:
> Todd, Here is the job info.
>
> Counter                Map          Reduce     Total
> File Systems
>   HDFS bytes read      199,115,508  0          199,115,508
>   HDFS bytes written   0            9,665,472  9,665,472
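A rough back-of-the-envelope check (assuming the 614-small-files guess is right, which the thread does not confirm) shows why the mapper count would balloon despite the modest input size:

    199,115,508 bytes / 614 files ≈ 320 KB per file
    => every file is far smaller than one 64 MB block
    => plain FileInputFormat creates at least one split per file, so roughly 614 map tasks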

[ANNOUNCE] Hive 0.4.1 released

2009-12-17 Thread Zheng Shao
Hi Folks,

We have released the rc3 candidate as Hive 0.4.1. You can download it from the download page.

http://hadoop.apache.org/hive/releases.html#Download

Thanks,
Zheng

RE: Throttling hive queries

2009-12-17 Thread Sagi, Lee
Todd, Here is the job info.

Counter                Map          Reduce       Total
File Systems
  HDFS bytes read      199,115,508  0            199,115,508
  HDFS bytes written   0            9,665,472    9,665,472
  Local bytes read     0            321,210,205  321,210,205
  Local bytes written

Re: Throttling hive queries

2009-12-17 Thread Todd Lipcon
Hi Lee,

The MapReduce framework in general makes it hard for you to assign fewer mappers than there are blocks in the input data when using FileInputFormat. Is your input set about 42GB with a 64M block size, or 84G with a 128M block size?

-Todd

On Thu, Dec 17, 2009 at 11:32 AM, Sagi, Lee wrote:
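A quick way to answer Todd's question is to measure the table's directory and block size directly. The commands below are a sketch; the warehouse path is an assumption, not taken from the thread:

    # total bytes under the table's directory
    hadoop fs -dus /user/hive/warehouse/fct_prsc

    # per-file and per-block breakdown (also exposes a many-small-files layout)
    hadoop fsck /user/hive/warehouse/fct_prsc -files -blocks

The block size itself is whatever dfs.block.size is set to on the cluster (67108864 for 64 MB, 134217728 for 128 MB). With plain FileInputFormat the number of map tasks is roughly total input size divided by block size, but never fewer than the number of non-empty input files.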

RE: Throttling hive queries

2009-12-17 Thread Sagi, Lee
Here is the query that I am running, just in case someone has an idea of how to improve it.

SELECT
  CONCAT(CONCAT('"', PRSS.DATE_KEY), '"'),
  CONCAT(CONCAT('"', PRSC.DATE_KEY), '"'),
  CONCAT(CONCAT('"', PRSS.VOTF_REQUEST_ID), '"'),
  CONCAT(CONCAT('"', PRSC.VOTF_REQUEST_ID), '"

RE: Throttling hive queries

2009-12-17 Thread Sagi, Lee
I tried that but it does not seem to work.

hive> set mapred.map.tasks=100;
hive> select count(1) from FCT_PRSC where date_key >= '2009121600' and date_key < '2009121700';
Total MapReduce jobs = 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in
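As Todd's reply above explains, with FileInputFormat the mapper count cannot be pushed below the number of input blocks/files, so mapred.map.tasks alone will not help here. One workaround (not proposed in the thread; the staging table name and reducer count are illustrative, and it assumes DISTRIBUTE BY and rand() are available in this Hive version) is to compact the partition into a handful of larger files first by forcing a reduce stage:

    set mapred.reduce.tasks=10;
    INSERT OVERWRITE TABLE fct_prsc_compacted
    SELECT * FROM fct_prsc
    WHERE date_key >= '2009121600' AND date_key < '2009121700'
    DISTRIBUTE BY rand();

The DISTRIBUTE BY rand() forces a reduce phase, so the output lands in roughly 10 files; subsequent queries against the compacted table then run with about that many mappers instead of one per small file.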

Re: Throttling hive queries

2009-12-17 Thread Edward Capriolo
You should be able to do:

hive> set mapred.map.tasks=1000;
hive> set mapred.reduce.tasks=5;

In some cases the number of mappers is controlled by the input files (pre Hadoop 0.20).

On Thu, Dec 17, 2009 at 1:58 PM, Sagi, Lee wrote:
> Is there a way to throttle hive queries?
>
> For example, I want to tell hive to not use m
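A minimal session sketch of the settings Edward describes (the values are just the ones from the original question; the behavior notes are hedged): mapred.reduce.tasks is honored directly, while mapred.map.tasks is only a hint when splits are derived from files and blocks.

    hive> set mapred.map.tasks=1000;   -- a hint; actual mapper count follows the input splits
    hive> set mapred.reduce.tasks=5;   -- enforced; the job runs with 5 reduce tasks
    hive> SELECT ... ;                 -- later queries in this session pick up both settings

If the goal is a session-wide ceiling rather than an exact count, hive.exec.reducers.max caps the number of reducers the compiler will pick on its own.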

Throttling hive queries

2009-12-17 Thread Sagi, Lee
Is there a way to throttle hive queries?

For example, I want to tell Hive to not use more than 1000 mappers and 5 reducers for a particular query (or session).