Robert Hou created DRILL-7109:
---------------------------------

             Summary: Statistics adds external sort, which spills to disk
                 Key: DRILL-7109
                 URL: https://issues.apache.org/jira/browse/DRILL-7109
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 1.16.0
            Reporter: Robert Hou
            Assignee: Gautam Parai
             Fix For: 1.16.0

TPCH query 4 with sf 100 runs many times slower. One issue is that an extra external sort has been added, and both external sorts spill to disk. Also, the hash join sees 100x more data.

Here is the query:
{noformat}
select
  o.o_orderpriority,
  count(*) as order_count
from
  orders o
where
  o.o_orderdate >= date '1996-10-01'
  and o.o_orderdate < date '1996-10-01' + interval '3' month
  and exists (
    select
      *
    from
      lineitem l
    where
      l.l_orderkey = o.o_orderkey
      and l.l_commitdate < l.l_receiptdate
  )
group by
  o.o_orderpriority
order by
  o.o_orderpriority;
{noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)