Hi Sagi,
Any chance you're running on a directory that has 614 small files?
-Todd
On Thu, Dec 17, 2009 at 2:30 PM, Sagi, Lee wrote:
> Todd, Here is the job info.
>
>
>
> Counter                            Map          Reduce     Total
> File Systems  HDFS bytes read      199,115,508  0          199,115,508
>               HDFS bytes written   0            9,665,472  9,665,472
Hi Folks,
We have released the rc3 candidate as Hive 0.4.1. You can download it
from the download page.
http://hadoop.apache.org/hive/releases.html#Download
Thanks,
Zheng
Todd, Here is the job info.
Counter                             Map          Reduce       Total
File Systems  HDFS bytes read       199,115,508  0            199,115,508
              HDFS bytes written    0            9,665,472    9,665,472
              Local bytes read      0            321,210,205  321,210,205
              Local bytes written
Hi Lee,
In general, the MapReduce framework makes it hard to assign fewer
mappers than there are blocks in the input data when using FileInputFormat.
Is your input set about 42 GB with a 64 MB block size, or 84 GB with a
128 MB block size?
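As a back-of-the-envelope check (this assumes the plain FileInputFormat
behavior of one split per HDFS block, with no small files involved):

  map tasks ≈ ceil(total input bytes / block size)
  42 GB / 64 MB  ≈ 672 splits
  84 GB / 128 MB ≈ 672 splits

Note that splits never span files, so a directory of many small files
gives you at least one mapper per file no matter what the block size is.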
-Todd
On Thu, Dec 17, 2009 at 11:32 AM, Sagi, Lee wrote:
Here is the query that I am running, just in case someone has an idea of
how to improve it.
SELECT
CONCAT(CONCAT('"', PRSS.DATE_KEY), '"'),
CONCAT(CONCAT('"', PRSC.DATE_KEY), '"'),
CONCAT(CONCAT('"', PRSS.VOTF_REQUEST_ID), '"'),
CONCAT(CONCAT('"', PRSC.VOTF_REQUEST_ID), '"
I tried that but it does not seem to work.
hive> set mapred.map.tasks=100;
hive> select count(1) from FCT_PRSC where date_key>='2009121600' and
date_key<'2009121700';
Total MapReduce jobs = 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in
You should be able to do:
hive> set mapred.map.tasks=1000;
hive> set mapred.reduce.tasks=5;
In some cases the number of mappers is controlled by the input files (pre Hadoop 0.20).
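A rough sketch of the knobs involved (treat the exact property names as
assumptions to verify against your Hadoop/Hive versions):

Cap the reducers for the session:

  hive> set mapred.reduce.tasks=5;

mapred.map.tasks is only a hint with FileInputFormat, so to actually cut
the mapper count you raise the minimum split size so that several blocks
land in one split (this helps with large files, not with many small files):

  hive> set mapred.min.split.size=268435456;

Or let Hive derive the reducer count itself but bound it:

  hive> set hive.exec.reducers.max=5;
  hive> set hive.exec.reducers.bytes.per.reducer=1000000000;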
On Thu, Dec 17, 2009 at 1:58 PM, Sagi, Lee wrote:
> Is there a way to throttle hive queries?
>
> For example, I want to tell hive to not use m
Is there a way to throttle hive queries?
For example, I want to tell hive to not use more than 1000 mappers and 5
reducers for a particular query (or session).