[jira] [Updated] (HIVE-6668) When auto join convert is on and noconditionaltask is off, ConditionalResolverCommonJoin fails to resolve map joins.

2014-03-20 Thread Harish Butani (JIRA)

 [ https://issues.apache.org/jira/browse/HIVE-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harish Butani updated HIVE-6668:


Status: Open  (was: Patch Available)

 When auto join convert is on and noconditionaltask is off, ConditionalResolverCommonJoin fails to resolve map joins.

              Key: HIVE-6668
              URL: https://issues.apache.org/jira/browse/HIVE-6668
          Project: Hive
       Issue Type: Bug
 Affects Versions: 0.13.0, 0.14.0
         Reporter: Yin Huai
         Assignee: Navis
         Priority: Blocker
          Fix For: 0.13.0

      Attachments: HIVE-6668.1.patch.txt, HIVE-6668.2.patch.txt, HIVE-6668.3.patch.txt


 I tried the following query today ...
 {code:sql}
 set mapred.job.map.memory.mb=2048;
 set mapred.job.reduce.memory.mb=2048;
 set mapred.map.child.java.opts=-server -Xmx3072m -Djava.net.preferIPv4Stack=true;
 set mapred.reduce.child.java.opts=-server -Xmx3072m -Djava.net.preferIPv4Stack=true;
 set mapred.reduce.tasks=60;
 set hive.stats.autogather=false;
 set hive.exec.parallel=false;
 set hive.enforce.bucketing=true;
 set hive.enforce.sorting=true;
 set hive.map.aggr=true;
 set hive.optimize.bucketmapjoin=true;
 set hive.optimize.bucketmapjoin.sortedmerge=true;
 set hive.mapred.reduce.tasks.speculative.execution=false;
 set hive.auto.convert.join=true;
 set hive.auto.convert.sortmerge.join=true;
 set hive.auto.convert.sortmerge.join.noconditionaltask=false;
 set hive.auto.convert.join.noconditionaltask=false;
 set hive.auto.convert.join.noconditionaltask.size=1;
 set hive.optimize.reducededuplication=true;
 set hive.optimize.reducededuplication.min.reducer=1;
 set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
 set hive.mapjoin.smalltable.filesize=4500;
 set hive.optimize.index.filter=false;
 set hive.vectorized.execution.enabled=false;
 set hive.optimize.correlation=false;
 SELECT
   i_item_id,
   s_state,
   avg(ss_quantity) agg1,
   avg(ss_list_price) agg2,
   avg(ss_coupon_amt) agg3,
   avg(ss_sales_price) agg4
 FROM store_sales
 JOIN date_dim ON (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
 JOIN item ON (store_sales.ss_item_sk = item.i_item_sk)
 JOIN customer_demographics ON (store_sales.ss_cdemo_sk = customer_demographics.cd_demo_sk)
 JOIN store ON (store_sales.ss_store_sk = store.s_store_sk)
 WHERE
   cd_gender = 'F' AND
   cd_marital_status = 'U' AND
   cd_education_status = 'Primary' AND
   d_year = 2002 AND
   s_state IN ('GA', 'PA', 'LA', 'SC', 'MI', 'AL')
 GROUP BY i_item_id, s_state WITH ROLLUP
 ORDER BY
   i_item_id,
   s_state
 LIMIT 100;
 {code}
 The log shows ...
 {code}
 14/03/14 17:05:02 INFO plan.ConditionalResolverCommonJoin: Failed to resolve driver alias (threshold : 4500, length mapping : {store=94175, store_sales=48713909726, item=39798667, customer_demographics=1660831, date_dim=2275902})
 Stage-27 is filtered out by condition resolver.
 14/03/14 17:05:02 INFO exec.Task: Stage-27 is filtered out by condition resolver.
 Stage-28 is filtered out by condition resolver.
 14/03/14 17:05:02 INFO exec.Task: Stage-28 is filtered out by condition resolver.
 Stage-3 is selected by condition resolver.
 {code}
 Stage-3 is a reduce join, but the resolver should have picked the map join.
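
 To make the resolver log line above easier to read, here is a minimal Python sketch that extracts the threshold and the per-alias sizes it reports. It is not Hive code and does not reproduce the logic of ConditionalResolverCommonJoin; `parse_resolver_log` is a hypothetical helper written only for this illustration, and the log text is copied from the report with the wrapped lines rejoined.

```python
import re

# Log line copied from the report (wrapped lines rejoined into one string).
LOG_LINE = (
    "14/03/14 17:05:02 INFO plan.ConditionalResolverCommonJoin: Failed to resolve "
    "driver alias (threshold : 4500, length mapping : {store=94175, "
    "store_sales=48713909726, item=39798667, customer_demographics=1660831, "
    "date_dim=2275902})"
)


def parse_resolver_log(line):
    """Hypothetical helper: return (threshold, {alias: size}) from the log line."""
    threshold = int(re.search(r"threshold : (\d+)", line).group(1))
    mapping_text = re.search(r"length mapping : \{(.*?)\}", line).group(1)
    sizes = {}
    for pair in mapping_text.split(", "):
        alias, size = pair.split("=")
        sizes[alias] = int(size)
    return threshold, sizes


threshold, sizes = parse_resolver_log(LOG_LINE)

# Aliases whose reported size is at or below the threshold from the log.
under_threshold = sorted(a for a, s in sizes.items() if s <= threshold)
smallest_alias = min(sizes, key=sizes.get)

# Print the aliases from smallest to largest next to the threshold.
for alias, size in sorted(sizes.items(), key=lambda kv: kv[1]):
    status = "<= threshold" if size <= threshold else "> threshold"
    print(f"{alias:<24}{size:>14}  {status}")
```

 This only tabulates the numbers the log already contains, so that the sizes can be compared against the value set for hive.mapjoin.smalltable.filesize in the query settings.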



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6668) When auto join convert is on and noconditionaltask is off, ConditionalResolverCommonJoin fails to resolve map joins.

2014-03-18 Thread Navis (JIRA)

 [ https://issues.apache.org/jira/browse/HIVE-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-6668:


Attachment: HIVE-6668.3.patch.txt



[jira] [Updated] (HIVE-6668) When auto join convert is on and noconditionaltask is off, ConditionalResolverCommonJoin fails to resolve map joins.

2014-03-16 Thread Navis (JIRA)

 [ https://issues.apache.org/jira/browse/HIVE-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-6668:


Attachment: HIVE-6668.2.patch.txt

 When auto join convert is on and noconditionaltask is off, ConditionalResolverCommonJoin fails to resolve map joins.

              Key: HIVE-6668
              URL: https://issues.apache.org/jira/browse/HIVE-6668
          Project: Hive
       Issue Type: Bug
 Affects Versions: 0.13.0, 0.14.0
         Reporter: Yin Huai
         Assignee: Navis
         Priority: Blocker
          Fix For: 0.13.0

      Attachments: HIVE-6668.1.patch.txt, HIVE-6668.2.patch.txt




[jira] [Updated] (HIVE-6668) When auto join convert is on and noconditionaltask is off, ConditionalResolverCommonJoin fails to resolve map joins.

2014-03-16 Thread Navis (JIRA)

 [ https://issues.apache.org/jira/browse/HIVE-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-6668:


Status: Patch Available  (was: Open)

kick test



[jira] [Updated] (HIVE-6668) When auto join convert is on and noconditionaltask is off, ConditionalResolverCommonJoin fails to resolve map joins.

2014-03-15 Thread Navis (JIRA)

 [ https://issues.apache.org/jira/browse/HIVE-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-6668:


Attachment: HIVE-6668.1.patch.txt

 When auto join convert is on and noconditionaltask is off, ConditionalResolverCommonJoin fails to resolve map joins.

              Key: HIVE-6668
              URL: https://issues.apache.org/jira/browse/HIVE-6668
          Project: Hive
       Issue Type: Bug
 Affects Versions: 0.13.0, 0.14.0
         Reporter: Yin Huai
         Priority: Blocker
          Fix For: 0.13.0

      Attachments: HIVE-6668.1.patch.txt

