date:20221111

Re: [Spark Core] Adaptive dynamic partition pruning

2022-11-11 Thread Jie Han

Hmmm… Sorry, I don't have an idea. Maybe we can try a subquery? I'm not sure whether it will work :( . We need help from other members of the community. > On Nov 12, 2022, at 00:10, hajyoussef amine wrote: > > Hi Jie, > Let's suppose we have ((dimension_table Join fact_table1) join fact_table2). > In the

Re: [Spark Core] Adaptive dynamic partition pruning

2022-11-11 Thread hajyoussef amine

Hi Jie, Let's suppose we have ((dimension_table Join fact_table1) join fact_table2). In the case where (dimension_table JOIN fact_table1) is small enough, the result can ideally be treated as another dimension table and thus used to prune fact_table2. I can't find an easy way to implement it
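One manual workaround for the case described above is to materialize the join keys of the small intermediate result on the driver and apply them to fact_table2 as an explicit filter before the second join. The sketch below is not Spark's built-in dynamic partition pruning and not something proposed in this thread; the helper names, the `max_keys` threshold, and the column name are hypothetical. It only relies on the standard DataFrame methods `select`, `distinct`, `limit`, `collect`, `where`, and `Column.isin`, and it only helps if the key column is a partition column of fact_table2.

```python
def collect_pruning_keys(small_df, key_col, max_keys=10_000):
    """Collect the distinct join keys of a small DataFrame on the driver.

    Returns None when there are more than max_keys distinct values, so the
    caller can fall back to a plain join (max_keys is an assumed threshold).
    """
    # Ask for one row more than the limit to detect "too many keys" cheaply.
    rows = small_df.select(key_col).distinct().limit(max_keys + 1).collect()
    if len(rows) > max_keys:
        return None
    return [row[0] for row in rows]


def prune_fact_table(fact_df, key_col, keys):
    """Filter fact_df to the given keys, or return it unchanged if keys is None."""
    if keys is None:
        return fact_df
    # An IN-list of literals on a partition column can be used by the scan
    # as a partition filter, so only matching partitions should be read.
    return fact_df.where(fact_df[key_col].isin(keys))


# Hypothetical usage, assuming an active SparkSession named `spark` and
# tables that share a partition column called "part_key":
#
#   small = spark.table("dimension_table").join(spark.table("fact_table1"), "part_key")
#   keys = collect_pruning_keys(small, "part_key")
#   fact2 = prune_fact_table(spark.table("fact_table2"), "part_key", keys)
#   result = small.join(fact2, "part_key")
```

The trade-off is an extra Spark job to compute the keys, and the first join is evaluated twice unless `small` is cached. Running `explain` on `result` is a reasonable way to check whether the filter actually shows up as a partition filter on the fact_table2 scan.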

Re: [Spark Core] Adaptive dynamic partition pruning

2022-11-11 Thread Jie Han

FYI, https://medium.com/@prabhakaran.electric/spark-3-0-feature-dynamic-partition-pruning-dpp-to-avoid-scanning-irrelevant-data-1a7bbd006a89 This blog may

Re: [Spark Core] Adaptive dynamic partition pruning

2022-11-11 Thread hajyoussef amine

Hi Jie, Thank you for the response. Dynamic partition pruning works for the first join but not the second one. So in the example I shared above, big_table is partition-pruned but bigger_table is not. Here's the result of running explain extended on the following query: Select * FROM