Re: confused on different behavior of Bucketized tables do not support INSERT INTO

2012-05-31 Thread Bruce Bian
I'm using Hive 0.9.0. On Thursday, May 31, 2012, Bruce Bian wrote: Hi, I've got a table vt_new_data which is defined as follows: CREATE TABLE VT_NEW_DATA ( V_ACCOUNT_NUM string, V_ACCOUNT_MODIFIER_NUM string, V_DEPOSIT_TYPE_CD string, V_DEPOSIT_TERM int

Re: confused on different behavior of Bucketized tables do not support INSERT INTO

2012-05-31 Thread Bruce Bian
(I mean, why not support INSERT INTO a bucketized table from the same table)? And isn't that error message kind of misleading? On Thu, May 31, 2012 at 6:43 PM, Bruce Bian weidong@gmail.com wrote: I'm using Hive 0.9.0. On Thursday, May 31, 2012, Bruce Bian wrote: Hi, I've got a table

Condition for doing a sort merge bucket map join

2012-05-22 Thread Bruce Bian
Hi, I've got 7 large tables to join (each ~10G in size) into one table, all with the same 2 join keys. I've read some documents on sort merge bucket map join, but failed to trigger it. I've bucketed all 7 tables into 20 buckets and sorted by one of the join keys, set
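
For readers following this thread, here is a minimal HiveQL sketch of the setup usually described for a sort merge bucket map join in Hive of this era. It is not taken from the thread: the table names t1 and t2 and the column names k, v, w are hypothetical, and the settings are the ones commonly documented for this feature, so check them against your Hive version. The sketch assumes both tables are bucketed and sorted on the same join column with the same bucket count.

```sql
-- Hypothetical tables: both bucketed AND sorted on the join key k,
-- with the same number of buckets (20, as in the question).
CREATE TABLE t1 (k STRING, v STRING)
CLUSTERED BY (k) SORTED BY (k ASC) INTO 20 BUCKETS;

CREATE TABLE t2 (k STRING, w STRING)
CLUSTERED BY (k) SORTED BY (k ASC) INTO 20 BUCKETS;

-- Set these before loading the tables so the written files
-- actually follow the declared bucketing and sort order.
SET hive.enforce.bucketing = true;
SET hive.enforce.sorting = true;

-- Settings commonly documented for enabling the sort merge bucket map join.
SET hive.optimize.bucketmapjoin = true;
SET hive.optimize.bucketmapjoin.sortedmerge = true;
SET hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;

-- In Hive releases of this era the map join is requested with a hint.
SELECT /*+ MAPJOIN(b) */ a.k, a.v, b.w
FROM t1 a JOIN t2 b ON a.k = b.k;
```

Running EXPLAIN on the final query is one way to check whether the plan actually uses a sorted merge join operator rather than a regular map join.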

Re: how is number of mappers determined in mapside join?

2012-03-20 Thread Bruce Bian
memory issues pretty soon. Consider increasing it to at least 64 MB, though all larger clusters use either 128 or 256 MB blocks. Hope it helps! Regards, Bejoy -- From: Bruce Bian weidong@gmail.com To: user@hive.apache.org; Bejoy Ks bejoy...@yahoo.com

how is number of mappers determined in mapside join?

2012-03-19 Thread Bruce Bian
Hi there, when I'm executing the following queries in Hive: set hive.auto.convert.join = true; CREATE TABLE IDAP_ROOT as SELECT a.*, b.acnt_no FROM idap_pi_root a LEFT OUTER JOIN idap_pi_root_acnt b ON a.acnt_id = b.acnt_id; the number of mappers run in the map-side join is 3. How is it determined?

Re: how is number of mappers determined in mapside join?

2012-03-19 Thread Bruce Bian
-- From: Bruce Bian weidong@gmail.com To: user@hive.apache.org Sent: Monday, March 19, 2012 2:42 PM Subject: how is number of mappers determined in mapside join? Hi there, when I'm executing the following queries in Hive: set hive.auto.convert.join = true

Reduce the number of map/reduce jobs during join

2012-03-13 Thread Bruce Bian
, Mar 13, 2012 at 9:54 PM, Bruce Bian weidong@gmail.com wrote: Hi there, when I'm using Hive to run a query as follows, 6 Map/Reduce jobs are launched, one for each join, and it processes ~460M of data in ~950 seconds, which I think is way too slow for a cluster with 5 slaves and 24GB memory