Re: [Spark Core] Adaptive dynamic partition pruning

hajyoussef amine Fri, 11 Nov 2022 08:13:44 -0800

Hi Jie,
Let's suppose we have ((dimension_table JOIN fact_table1) JOIN
fact_table2). When the result of (dimension_table JOIN fact_table1) is small
enough, it could ideally be treated as another dimension table and
used to prune fact_table2. I haven't found an easy way to implement
this, though.
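
To make the idea concrete, here is a toy sketch in plain Python. It is not Spark code and does not use any Spark API; the table contents, the row threshold, and the helper names (hash_join, prune_partitions, SMALL_JOIN_MAX_ROWS) are all made up for illustration. It only shows the intended behaviour: if the first join turns out small at runtime, its join keys are used to skip partitions of fact_table2.

```python
from collections import defaultdict

# Hypothetical threshold: below this row count the intermediate join is
# considered "small enough" to act like a dimension table.
SMALL_JOIN_MAX_ROWS = 1000


def hash_join(left, right, left_key, right_key):
    """Inner-join two lists of dict rows on the given key columns."""
    index = defaultdict(list)
    for left_row in left:
        index[left_row[left_key]].append(left_row)
    joined = []
    for right_row in right:
        for left_row in index.get(right_row[right_key], []):
            joined.append({**left_row, **right_row})
    return joined


def prune_partitions(partitions, allowed_keys):
    """Keep only the partitions whose partition value is in allowed_keys."""
    return {key: rows for key, rows in partitions.items() if key in allowed_keys}


# Made-up data: a filtered dimension table and a first fact table.
dimension_table = [
    {"dim_id": 1, "region": "EU"},
    {"dim_id": 2, "region": "US"},
]
fact_table1 = [
    {"dim_id": 1, "day": "2022-11-01"},
    {"dim_id": 1, "day": "2022-11-02"},
    {"dim_id": 2, "day": "2022-11-03"},
]
# fact_table2 is modelled as a dict of partitions keyed by "day".
fact_table2_partitions = {
    "2022-11-01": [{"day": "2022-11-01", "amount": 10}],
    "2022-11-02": [{"day": "2022-11-02", "amount": 20}],
    "2022-11-03": [{"day": "2022-11-03", "amount": 30}],
    "2022-11-04": [{"day": "2022-11-04", "amount": 40}],
}

# Step 1: (dimension_table JOIN fact_table1), with a selective dimension filter.
filtered_dimension = [row for row in dimension_table if row["region"] == "EU"]
small_join = hash_join(filtered_dimension, fact_table1, "dim_id", "dim_id")

# Step 2: if the intermediate result is small, reuse its keys as a pruning
# filter on fact_table2; otherwise fall back to scanning every partition.
if len(small_join) <= SMALL_JOIN_MAX_ROWS:
    pruning_keys = {row["day"] for row in small_join}
    scanned_partitions = prune_partitions(fact_table2_partitions, pruning_keys)
else:
    scanned_partitions = fact_table2_partitions

# Step 3: the final join only reads the partitions that survived pruning.
fact_table2_rows = [row for rows in scanned_partitions.values() for row in rows]
result = hash_join(small_join, fact_table2_rows, "day", "day")
```

The decision in step 2 depends on the actual size of the first join, which is only known once it has run, so that is the part I would expect to need adaptive (runtime) planning.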



On Fri, Nov 11, 2022 at 4:32 PM Jie Han <tunyu...@gmail.com> wrote:

> FYI,
> https://medium.com/@prabhakaran.electric/spark-3-0-feature-dynamic-partition-pruning-dpp-to-avoid-scanning-irrelevant-data-1a7bbd006a89
>
> This blog may be helpful. Dynamic pruning usually works for star-schema
> queries. In your case the fact table is big_table, which is joined to the
> others, so there is only one SubqueryBroadcast dynamic pruning plan, placed
> before big_table's scan, and none for the others.
>
> I’m not sure that I’m correct. Hope it’s helpful to you.
>
> On Nov 11, 2022, at 21:43, hajyoussef amine <hajyoussef.am...@gmail.com> wrote:
>
> SubqueryBroadcast
>
>
>
