Hi Lee,

The MapReduce framework in general makes it hard to assign fewer mappers than there are blocks in the input data when using FileInputFormat. Is your input set about 42GB with a 64MB block size, or 84GB with a 128MB block size?
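If the goal is to cap the mapper count, one lever worth trying is raising the minimum split size so each map task covers several blocks. This is only a sketch: mapred.min.split.size is a standard Hadoop property, but whether your particular Hive/Hadoop combination honors it for your input format is something to verify, and the value below is illustrative.

    hive> set mapred.min.split.size=268435456;
    hive> set mapred.map.tasks=1000;

268435456 bytes is 256MB, so each map task would cover four 64MB blocks. Note that mapred.map.tasks remains a hint only; when the two disagree, the number of splits usually wins.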
-Todd

On Thu, Dec 17, 2009 at 11:32 AM, Sagi, Lee <ls...@shopping.com> wrote:
> Here is the query that I am running, just in case someone has an idea of
> how to improve it.
>
> SELECT
>     CONCAT(CONCAT('"', PRSS.DATE_KEY), '"'),
>     CONCAT(CONCAT('"', PRSC.DATE_KEY), '"'),
>     CONCAT(CONCAT('"', PRSS.VOTF_REQUEST_ID), '"'),
>     CONCAT(CONCAT('"', PRSC.VOTF_REQUEST_ID), '"'),
>     CONCAT(CONCAT('"', PRSS.PRS_REQUEST_ID), '"'),
>     CONCAT(CONCAT('"', PRSC.PRS_REQUEST_ID), '"'),
>     ...
>     ...
>     ...
> FROM
>     FCT_PRSS PRSS FULL OUTER JOIN FCT_PRSC PRSC ON
>     (PRSS.PRS_REQUEST_ID = PRSC.PRS_REQUEST_ID)
> WHERE (PRSS.date_key >= '2009121600' AND
>        PRSS.date_key < '2009121700') OR
>       (PRSC.date_key >= '2009121600' AND
>        PRSC.date_key < '2009121700')
>
>
> Lee Sagi | Data Warehouse Tech Lead & Architect | Work: 650-616-6575 |
> Cell: 718-930-7947
>
> -----Original Message-----
> From: Edward Capriolo [mailto:edlinuxg...@gmail.com]
> Sent: Thursday, December 17, 2009 11:03 AM
> To: hive-user@hadoop.apache.org
> Subject: Re: Throttling hive queries
>
> You should be able to:
>
> hive> set mapred.map.tasks=1000;
> hive> set mapred.reduce.tasks=5;
>
> In some cases the number of mappers is controlled by the input files
> (pre-Hadoop 0.20).
>
>
> On Thu, Dec 17, 2009 at 1:58 PM, Sagi, Lee <ls...@shopping.com> wrote:
> > Is there a way to throttle hive queries?
> >
> > For example, I want to tell Hive to not use more than 1000 mappers and
> > 5 reducers for a particular query (or session).
> >
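On the query itself, one idea worth testing is pushing the date_key filters into subqueries so the FULL OUTER JOIN only shuffles one day of data from each table. This is a sketch only, using the table and column names from the query above, and it is not exactly equivalent at the edges: a row inside the date window that joins to a row outside it would come back with NULLs instead of its partner's columns.

    SELECT
        ...
    FROM
        (SELECT * FROM FCT_PRSS
          WHERE date_key >= '2009121600' AND date_key < '2009121700') PRSS
        FULL OUTER JOIN
        (SELECT * FROM FCT_PRSC
          WHERE date_key >= '2009121600' AND date_key < '2009121700') PRSC
        ON (PRSS.PRS_REQUEST_ID = PRSC.PRS_REQUEST_ID)

Because a WHERE clause on a FULL OUTER JOIN cannot be pushed below the join, the original query joins the full tables first and filters afterwards, so pre-filtering can cut the join input dramatically if the tables hold many days of history.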