[ 
https://issues.apache.org/jira/browse/HIVE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154930#comment-15154930
 ] 

Chaoyu Tang commented on HIVE-13082:
------------------------------------

[~gopalv] The query does not seem to work in Hive; it throws the following error:
{code}
hive> explain select table1.id, table1.val, table1.val1 from table1 left semi join table3 on table1.dimid = table3.id where table3.id = 1;
FAILED: SemanticException [Error 10004]: Line 1:118 Invalid table alias or column reference 'table3': (possible column names are: id, val, val1, dimid)
{code}
Neither does this equivalent query:
{code}
explain select table1.id, table1.val, table1.val1 from table1 where table1.dimid in (select id from table3) and table3.id =1;
FAILED: SemanticException [Error 10004]: Line 1:112 Invalid table alias or column reference 'table3': (possible column names are: id, val, val1, dimid)
{code}
The 2nd query also seems to be invalid in MySQL, failing with "Unknown column 
'table3.id' in 'where clause'", and the 1st one does not work in MySQL either, 
probably because 'semi join' is not supported SQL syntax in MySQL.

If the query is rewritten as follows, it works:
{code}
select table1.id, table1.val, table1.val1 from table1 where table1.dimid in 
(select id from table3 where table3.id =1);
{code}
That the 1st and 2nd queries do not work in Hive (or MySQL) seems to me 
unrelated to constant folding and instead a syntax issue, right? 
BTW, what is the -ve test? Thanks
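
For what it's worth, the scoping behaviour described above can be reproduced outside Hive. The sketch below is only an illustration: it assumes SQLite (through Python's sqlite3 module) as a stand-in engine, and the rows inserted into table1 and table3 are made up. It does not exercise Hive's left semi join syntax or the constant folding path; it only shows that a filter on table3 is rejected in the outer WHERE clause and accepted inside the subquery.

```python
import sqlite3


def run_scoping_demo():
    """Run the failing and the rewritten query against an in-memory SQLite DB.

    Returns (error_message, rows): the error raised by the query that
    references table3 outside the subquery, and the rows returned by the
    rewritten query. Table contents are hypothetical sample data.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE table1 (id INTEGER, val TEXT, val1 TEXT, dimid INTEGER);
        CREATE TABLE table3 (id INTEGER);
        INSERT INTO table1 VALUES (1, 't1val01', 't1val11', 1);
        INSERT INTO table1 VALUES (2, 't1val02', 't1val12', 2);
        INSERT INTO table3 VALUES (1);
        INSERT INTO table3 VALUES (2);
        """
    )

    # table3 is only in scope inside the subquery, so referencing it in the
    # outer WHERE clause is rejected when the statement is compiled.
    bad_query = (
        "select table1.id, table1.val, table1.val1 from table1 "
        "where table1.dimid in (select id from table3) and table3.id = 1"
    )
    try:
        conn.execute(bad_query)
        error = None
    except sqlite3.OperationalError as exc:
        error = str(exc)

    # Moving the filter inside the subquery keeps table3 in scope.
    good_query = (
        "select table1.id, table1.val, table1.val1 from table1 "
        "where table1.dimid in (select id from table3 where table3.id = 1)"
    )
    rows = conn.execute(good_query).fetchall()
    conn.close()
    return error, rows


if __name__ == "__main__":
    error, rows = run_scoping_demo()
    print("outer reference to table3:", error)
    print("filter inside subquery:", rows)
```

The error text differs between engines, so the sketch says nothing about how Hive's SemanticException is produced.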


> Enable constant propagation optimization in query with left semi join
> ---------------------------------------------------------------------
>
>                 Key: HIVE-13082
>                 URL: https://issues.apache.org/jira/browse/HIVE-13082
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 2.0.0
>            Reporter: Chaoyu Tang
>            Assignee: Chaoyu Tang
>         Attachments: HIVE-13082.1.patch, HIVE-13082.patch
>
>
> Currently constant folding is only allowed for inner or unique joins. I think 
> it is also applicable to, and should be allowed for, left semi joins. Otherwise 
> a query like the following, which combines an inner join with a left semi 
> join, will fail:
> {code} 
> select table1.id, table1.val, table2.val2 from table1 inner join table2 on 
> table1.val = 't1val01' and table1.id = table2.id left semi join table3 on 
> table1.dimid = table3.id;
> {code}
> with errors:
> {code}
> java.lang.Exception: java.lang.RuntimeException: Error in configuring object
>       at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) ~[hadoop-mapreduce-client-common-2.6.0.jar:?]
>       at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) [hadoop-mapreduce-client-common-2.6.0.jar:?]
> Caused by: java.lang.RuntimeException: Error in configuring object
>       at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) ~[hadoop-common-2.6.0.jar:?]
>       at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) ~[hadoop-common-2.6.0.jar:?]
>       at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) ~[hadoop-common-2.6.0.jar:?]
>       at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446) ~[hadoop-mapreduce-client-core-2.6.0.jar:?]
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) ~[hadoop-mapreduce-client-core-2.6.0.jar:?]
>       at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243) ~[hadoop-mapreduce-client-common-2.6.0.jar:?]
>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[?:1.7.0_45]
>       at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[?:1.7.0_45]
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[?:1.7.0_45]
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[?:1.7.0_45]
>       at java.lang.Thread.run(Thread.java:744) ~[?:1.7.0_45]
> ...
> Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>       at java.util.ArrayList.rangeCheck(ArrayList.java:635) ~[?:1.7.0_45]
>       at java.util.ArrayList.get(ArrayList.java:411) ~[?:1.7.0_45]
>       at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118) ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>       at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:109) ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>       at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:326) ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>       at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:311) ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>       at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:181) ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>       at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.initializeOp(CommonJoinOperator.java:319) ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>       at org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.initializeOp(AbstractMapJoinOperator.java:78) ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>       at org.apache.hadoop.hive.ql.exec.MapJoinOperator.initializeOp(MapJoinOperator.java:138) ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>       at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:355) ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
>       at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:504) ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
