[ 
https://issues.apache.org/jira/browse/HIVE-26582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612566#comment-17612566
 ] 

Stamatis Zampetakis commented on HIVE-26582:
--------------------------------------------

[~kkasa] my bad, I didn't pay enough attention to the input query. Sorry about 
that.

I am thinking that maybe we could use the {{RelMdMaxRowCount}} metadata to 
prune expressions the way {{PruneEmptyRules}} does, whenever the returned 
maximum row count is zero. This could be an improvement to {{PruneEmptyRules}} 
or a new rule altogether. This may not be a solution to this ticket, but it 
may be an optimization worth adding. WDYT?
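
To make the idea concrete, here is a minimal sketch written as a self-contained toy model, not as Calcite or Hive code. Everything in it ({{Node}}, {{Scan}}, {{Limit}}, {{Union}}, {{Join}}, {{prune}}) is a hypothetical stand-in; in Calcite the bound would presumably be read through {{RelMetadataQuery.getMaxRowCount(RelNode)}} and the rewrite would live in a planner rule. The sketch uses {{LIMIT 0}} rather than an empty table, because a plain scan has no known upper bound without statistics.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MaxRowCountPruning {

    /** Toy plan node; a hypothetical stand-in for a Calcite RelNode. */
    interface Node {
        /** Upper bound on the number of rows this node can produce. */
        double maxRowCount();
    }

    static final class Values implements Node {
        final int rows;
        Values(int rows) { this.rows = rows; }
        public double maxRowCount() { return rows; }
        @Override public String toString() { return rows == 0 ? "Empty" : "Values(" + rows + ")"; }
    }

    static final class Scan implements Node {
        final String table;
        Scan(String table) { this.table = table; }
        // Assumption: no statistics are available, so the bound is infinite.
        public double maxRowCount() { return Double.POSITIVE_INFINITY; }
        @Override public String toString() { return "Scan(" + table + ")"; }
    }

    static final class Limit implements Node {
        final Node input;
        final int fetch;
        Limit(Node input, int fetch) { this.input = input; this.fetch = fetch; }
        public double maxRowCount() { return Math.min(input.maxRowCount(), fetch); }
        @Override public String toString() { return "Limit(" + fetch + ", " + input + ")"; }
    }

    static final class Union implements Node {
        final List<Node> inputs;
        Union(List<Node> inputs) { this.inputs = inputs; }
        public double maxRowCount() {
            double sum = 0;
            for (Node in : inputs) sum += in.maxRowCount();
            return sum;
        }
        @Override public String toString() { return "Union" + inputs; }
    }

    /** Inner or cross join: empty as soon as one side is empty. */
    static final class Join implements Node {
        final Node left;
        final Node right;
        Join(Node left, Node right) { this.left = left; this.right = right; }
        public double maxRowCount() {
            double l = left.maxRowCount();
            double r = right.maxRowCount();
            // Guard explicitly: 0 * infinity would be NaN.
            return (l == 0 || r == 0) ? 0 : l * r;
        }
        @Override public String toString() { return "Join(" + left + ", " + right + ")"; }
    }

    /** Bottom-up rewrite: any subtree whose max row count is zero becomes Empty. */
    static Node prune(Node n) {
        if (n.maxRowCount() == 0) return new Values(0);
        if (n instanceof Union) {
            List<Node> kept = new ArrayList<>();
            for (Node in : ((Union) n).inputs) {
                Node p = prune(in);
                if (p.maxRowCount() > 0) kept.add(p); // drop provably empty branches
            }
            // kept is never empty here: an all-empty union was handled above.
            return kept.size() == 1 ? kept.get(0) : new Union(kept);
        }
        if (n instanceof Join) {
            Join j = (Join) n;
            return new Join(prune(j.left), prune(j.right));
        }
        if (n instanceof Limit) {
            Limit l = (Limit) n;
            return new Limit(prune(l.input), l.fetch);
        }
        return n;
    }

    public static void main(String[] args) {
        Node plan = new Join(
            new Union(Arrays.asList(new Scan("tmp1"), new Limit(new Scan("tmp2"), 0))),
            new Scan("c"));
        System.out.println(plan + " => " + prune(plan));
    }
}
```

In this toy model the union branch with {{LIMIT 0}} is dropped and the union collapses to its single remaining input, so the cross join never sees a provably empty input. Whether the same effect is better achieved inside {{PruneEmptyRules}} or in a separate rule is left open.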

> Cartesian join fails if the query has an empty table when cartesian product 
> edge is used
> ----------------------------------------------------------------------------------------
>
>                 Key: HIVE-26582
>                 URL: https://issues.apache.org/jira/browse/HIVE-26582
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Tez
>            Reporter: Sourabh Badhya
>            Priority: Major
>
> The following example fails when "hive.tez.cartesian-product.enabled" is true 
> - 
> Test command - 
> {code:java}
> mvn test -Dtest=TestMiniLlapCliDriver -Dqfile=file.q -Dtest.output.overwrite=true {code}
> Query - file.q
> {code:java}
> set hive.tez.cartesian-product.enabled=true;
> create table c (a1 int) stored as orc;
> create table tmp1 (a int) stored as orc;
> create table tmp2 (a int) stored as orc;
> insert into table c values (3);
> insert into table tmp1 values (3);
> with
> first as (
> select a1 from c where a1 = 3
> ),
> second as (
> select a from tmp1
> union all
> select a from tmp2
> )
> select a from second cross join first; {code}
> The following stack trace is seen - 
> {code:java}
> Caused by: java.lang.IllegalArgumentException: Number of items is 0. Should be positive
>         at org.apache.tez.common.Preconditions.checkArgument(Preconditions.java:38)
>         at org.apache.tez.runtime.library.utils.Grouper.init(Grouper.java:41)
>         at org.apache.tez.runtime.library.cartesianproduct.FairCartesianProductEdgeManager.initialize(FairCartesianProductEdgeManager.java:66)
>         at org.apache.tez.runtime.library.cartesianproduct.CartesianProductEdgeManager.initialize(CartesianProductEdgeManager.java:51)
>         at org.apache.tez.dag.app.dag.impl.Edge.initialize(Edge.java:213)
>         ... 22 more{code}
> This error occurs because one of the tables (tmp2 in this case) has 
> 0 rows in it. 
> The query works fine when the config hive.tez.cartesian-product.enabled is 
> set to false.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
