That only works for count(1), though. For anything else Hive still runs two
map-reduce jobs.

See HIVE-223, which does what Qing is asking for. It is not committed yet, but 
you can try out the patch. With it you get one map-reduce job with the option 
mentioned below, and also one map-reduce job with 
hive.groupby.skewindata=false when map-side aggregation is not used.
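
A rough sketch of the non map-side case, assuming the HIVE-223 patch is 
applied (mytable and col are placeholder names, not from a real schema):

hive> set hive.map.aggr=false;
hive> set hive.groupby.skewindata=false;
hive> explain select col, count(1) from mytable group by col;

The stages listed in the explain output should show whether the plan is one 
map-reduce job or two.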

________________________________
From: Zheng Shao [mailto:zsh...@gmail.com]
Sent: Wednesday, February 18, 2009 8:14 PM
To: hive-user@hadoop.apache.org
Subject: Re: Is there a way to hint Hive the reduce key will be evenly 
distributed?

Hi Qing,

The easiest way to do that is to enable map side aggregation. In that case Hive 
will do one map-reduce job.

hive> set hive.map.aggr=true;
hive> explain select count(1) from mytable;

Zheng
On Wed, Feb 18, 2009 at 7:52 PM, Qing Yan <qing...@gmail.com> wrote:

So it does not have to run map reduce twice *every time*.



Thank you!



--
Yours,
Zheng
