Chen Luo created ASTERIXDB-2779: ----------------------------------- Summary: Join Condition Is Not Identified for TPC-H Q18 Key: ASTERIXDB-2779 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2779 Project: Apache AsterixDB Issue Type: Bug Components: COMP - Compiler Reporter: Chen Luo Attachments: tpch_ddl.sql
The query optimizer fails to identify the join condition for TPC-H Q18, and thus produced a query plan with a cartisian product. The DDLs are attached below. The original Q18 is as follows: {code:java} use tpch; WITH tmp AS ( SELECT l_orderkey l_orderkey, sum(l.l_quantity) t_sum_quantity FROM LineItem AS l GROUP BY l.l_orderkey as l_orderkey )SELECT c_name c_name, c_custkey c_custkey, o_orderkey o_orderkey, o_orderdate o_orderdate, o_totalprice o_totalprice, sum(l.l_quantity) sum_quantity FROM Customer c, Orders o, tmp t, LineItem l WHERE c.c_custkey = o.o_custkey AND o.o_orderkey = t.l_orderkey AND t.t_sum_quantity > 300 AND l.l_orderkey = t.l_orderkey GROUP BY c.c_name AS c_name,c.c_custkey AS c_custkey, o.o_orderkey AS o_orderkey,o.o_orderdate AS o_orderdate, o.o_totalprice AS o_totalprice ORDER BY o_totalprice DESC,o_orderdate LIMIT 100 ; {code} However, the query condition was correctly identified after Q18 is refactored as follows: {code:java} use tpch; WITH tmp AS ( SELECT l_orderkey, sum(l.l_quantity) t_sum_quantity FROM LineItem AS l GROUP BY l.l_orderkey as l_orderkey HAVING sum(l.l_quantity)>300 )SELECT c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l.l_quantity) sum_quantity FROM Customer c JOIN Orders o ON c.c_custkey = o.o_custkey JOIN tmp t ON o.o_orderkey = t.l_orderkey JOIN LineItem l ON t.l_orderkey = l.l_orderkey GROUP BY c.c_name AS c_name,c.c_custkey AS c_custkey, o.o_orderkey AS o_orderkey,o.o_orderdate AS o_orderdate, o.o_totalprice AS o_totalprice ORDER BY o_totalprice DESC,o_orderdate LIMIT 100 ; {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)