jiaan.geng created SPARK-44571:
----------------------------------

             Summary: Eliminate the Join by Combine multiple Aggregates
                 Key: SPARK-44571
                 URL: https://issues.apache.org/jira/browse/SPARK-44571
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 3.5.0
            Reporter: jiaan.geng


Recently, I investigated test case q28, which belongs to the TPC-DS queries.

The query contains multiple scalar subqueries with aggregation, connected by 
inner joins.
If we can merge the filters and aggregates, we can scan the data source only 
once and eliminate the joins, thereby avoiding the shuffle. This change should 
improve performance.
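
The idea can be sketched outside Spark. The following is only an illustration 
under stated assumptions, not the proposed optimizer rule: it uses Python's 
built-in sqlite3 module instead of Spark, a hypothetical two-column 
store_sales table with made-up rows, and only two quantity buckets. Each 
subquery is a global aggregate that returns exactly one row, so the join of 
those subqueries can be rewritten as a single scan with conditional aggregates.

```python
import sqlite3

# Hypothetical miniature of the table scanned by q28 (rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE store_sales (ss_quantity INTEGER, ss_list_price REAL)")
conn.executemany(
    "INSERT INTO store_sales VALUES (?, ?)",
    [(1, 10.0), (3, 20.0), (5, 30.0), (7, 40.0), (9, 60.0), (12, 5.0)],
)

# Shape before the rewrite: one filtered aggregate per bucket, each scanning
# the table separately, combined with a join.
joined_sql = """
SELECT b1_lp, b1_cnt, b2_lp, b2_cnt
FROM (SELECT AVG(ss_list_price) AS b1_lp, COUNT(ss_list_price) AS b1_cnt
      FROM store_sales WHERE ss_quantity BETWEEN 0 AND 5) b1,
     (SELECT AVG(ss_list_price) AS b2_lp, COUNT(ss_list_price) AS b2_cnt
      FROM store_sales WHERE ss_quantity BETWEEN 6 AND 10) b2
"""

# Shape after the rewrite: the filters are merged with OR and pushed into the
# aggregates as CASE expressions, so the table is scanned once and no join
# is needed. Rows outside a bucket yield NULL, which AVG and COUNT ignore.
merged_sql = """
SELECT AVG(CASE WHEN ss_quantity BETWEEN 0 AND 5 THEN ss_list_price END) AS b1_lp,
       COUNT(CASE WHEN ss_quantity BETWEEN 0 AND 5 THEN ss_list_price END) AS b1_cnt,
       AVG(CASE WHEN ss_quantity BETWEEN 6 AND 10 THEN ss_list_price END) AS b2_lp,
       COUNT(CASE WHEN ss_quantity BETWEEN 6 AND 10 THEN ss_list_price END) AS b2_cnt
FROM store_sales
WHERE ss_quantity BETWEEN 0 AND 5 OR ss_quantity BETWEEN 6 AND 10
"""

joined_row = conn.execute(joined_sql).fetchone()
merged_row = conn.execute(merged_sql).fetchone()
print(joined_row == merged_row)
```

Both statements return the same single row here. The equivalence relies on the 
subqueries having no GROUP BY; grouped aggregates would need additional 
conditions that this sketch does not cover.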



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
