[
https://issues.apache.org/jira/browse/SPARK-36786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Asif updated SPARK-36786:
-------------------------
Description:
h2. Q1. What are you trying to do? Articulate your objectives using absolutely
no jargon.
The aim is to improve the compile-time performance of a query which, in WorkDay's use case, takes > 24 hrs (and eventually fails), down to < 8 min.
To explain the problem, I will first provide some context.
The query plan in our production system is huge, with nested *case when* expressions (the level of nesting can be > 8), where each *case when* can sometimes have > 1000 branches.
The plan could look like:
{quote}Project1
|
Filter1
|
Project2
|
Filter2
|
Project3
|
Filter3
|
Join
{quote}
Now the optimizer has a Batch of rules, intended to run at most 100 times.
*Also note that the batch will continue to run until one of two conditions is satisfied,*
*i.e. either numIter == 100 or inputPlan == outputPlan (idempotency is achieved).*
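The behaviour of such a batch can be sketched with a simplified Python model of the fixed-point loop (the function and parameter names here are illustrative, not Catalyst's actual RuleExecutor code):

```python
def run_batch(plan, rules, max_iterations=100):
    """Keep applying every rule in the batch until the plan stops changing
    (inputPlan == outputPlan, i.e. idempotency) or the iteration cap is hit."""
    for _ in range(max_iterations):
        new_plan = plan
        for rule in rules:
            new_plan = rule(new_plan)
        if new_plan == plan:   # fixed point reached: stop early
            return new_plan
        plan = new_plan
    return plan                # gave up after max_iterations
```

The important consequence is that every extra iteration forced by one rule (here, PushDownPredicate) re-runs all the other rules in the batch as well.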
One of the early rules is *PushDownPredicate*, followed by *CollapseProject*.
The first issue is with the *PushDownPredicate* rule.
*Though the intention of the rule is to combine the filters as well as push them down in a single pass, in practice that is not achieved.*
The reason is explained below.
bq. The intention of chaining the CombineFilters rule with the pushdown rules is that, if a case like Filter1 -> Filter2 -> Filter3 -> Project -> BaseRelation arises, CombineFilters will match the Filter -> Filter pattern and return a single combined Filter.
bq. And since CombineFilters had a match, the orElse (chained) rules are skipped. The combined filter and Filter3 are then adjacent, so CombineFilters would merge them into a final Filter, which would then get pushed down by the orElse chained rules; thus, in a single tree traversal, the rule would both combine the filters and push them down.
bq. But in practice, that is not achieved.
bq. The reason is that when the tree Filter1 -> Filter2 -> Filter3 -> Project -> BaseRelation is traversed from top to bottom,
bq. the first node to be analyzed by the rule is Filter1, which satisfies the CombineFilters rule, thereby combining Filter1 and Filter2 into, say, Filter12.
bq. So in effect, when the rule operated on Filter1, it subsumed Filter2, creating a new tree Filter12 -> Filter3 -> Project -> BaseRelation.
bq. The next invocation of the rule is then on the new child of the node which replaced Filter1, i.e. NOT on Filter12 but on Filter3. This is because Filter1 was effectively replaced by Filter12, whose child is Filter3.
bq. As a result, when the single pass of the rule ends, the tree looks like
bq. Filter12 -> Project -> Filter3 -> BaseRelation, requiring another iteration.
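This behaviour can be reproduced with a toy Python model (not Spark code) that represents a linear plan as a parent-first list and applies CombineFilters-orElse-pushdown in one top-down pass; as described above, after a node is rewritten, the traversal resumes at the rewritten node's child:

```python
def one_pass(plan):
    """One top-down traversal over a linear plan (a list of node names,
    parent first).  Combines adjacent Filters and pushes a Filter below a
    Project, mimicking CombineFilters.orElse(pushdown).  After rewriting a
    node, traversal continues at the child of the rewritten node."""
    plan = list(plan)
    i = 0
    while i < len(plan) - 1:
        node, child = plan[i], plan[i + 1]
        if node.startswith("Filter") and child.startswith("Filter"):
            # CombineFilters: merge the two adjacent filters into one
            plan[i:i + 2] = [node + "+" + child]
        elif node.startswith("Filter") and child.startswith("Project"):
            # pushdown: swap the filter below the project
            plan[i], plan[i + 1] = plan[i + 1], plan[i]
        i += 1   # resume at the (new) child, never at the rewritten node
    return plan

plan = ["Filter1", "Filter2", "Filter3", "Project", "BaseRelation"]
print(one_pass(plan))
# ['Filter1+Filter2', 'Project', 'Filter3', 'BaseRelation']
# Filter3 was pushed below Project before it could be combined,
# so another full iteration of the batch is required.
```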
bq. Apart from the above issue, this way of pushing down Filters has an inherent inefficiency.
bq. When a Filter is pushed below a Project node, the Filter expression needs to be re-aliased in terms of the expressions behind the Project's Aliases.
bq. Thus, as the filter keeps getting pushed down, its expression tree keeps growing, and each subsequent re-aliasing becomes costlier as the tree size increases.
bq. The gist is that, in the above case, the tree being visited (substituted into) keeps growing, while the substitution value (a subtree) is relatively small, and the traversal happens from top to bottom.
bq. Ideally, the re-aliasing should happen only at the end, i.e. when the filter has reached its final resting place, while keeping track of all the Projects encountered until then. If we then collapse the collected Projects from bottom to top (instead of top to bottom, as today), the tree being substituted into (visited) is small while the substitution value is large; and since that tree is traversed from bottom to top, we never have to traverse the substitution value. This makes a huge difference for abnormally complex filters.
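The cost difference can be demonstrated with a toy expression model (not Spark code; the tuple representation and function names are invented for illustration). An expression is either a column reference ("col", name) or a binary node (op, left, right), and a Project is a dict mapping its output column names to expressions over its child's columns:

```python
visited = 0  # nodes visited across substitute() calls, to compare the work

def substitute(expr, aliases):
    """Replace every column reference in expr via the alias map, counting
    visited nodes.  Replacement subtrees are not re-traversed."""
    global visited
    visited += 1
    if expr[0] == "col":
        return aliases.get(expr[1], expr)
    op, left, right = expr
    return (op, substitute(left, aliases), substitute(right, aliases))

def realias_eagerly(filt, projects):
    """Current behaviour: re-alias the (growing) filter at every Project."""
    for p in projects:                     # top to bottom
        filt = substitute(filt, p)
    return filt

def realias_deferred(filt, projects):
    """Proposed behaviour: collapse the Projects bottom-up into a single
    alias map, then re-alias the filter exactly once."""
    collapsed = projects[-1]
    for p in reversed(projects[:-1]):      # bottom to top
        collapsed = {name: substitute(e, collapsed) for name, e in p.items()}
    return substitute(filt, collapsed)

p1 = {"x": ("add", ("col", "y"), ("col", "y"))}
p2 = {"y": ("add", ("col", "z"), ("col", "z"))}
p3 = {"z": ("add", ("col", "w"), ("col", "w"))}
filt = ("gt", ("col", "x"), ("col", "x"))

eager = realias_eagerly(filt, [p1, p2, p3]); eager_cost = visited
visited = 0
deferred = realias_deferred(filt, [p1, p2, p3]); deferred_cost = visited
assert eager == deferred           # both strategies yield the same filter
print(eager_cost, deferred_cost)   # prints: 25 9
```

Even on this tiny example the eager strategy visits 25 nodes versus 9 for the deferred one, because it re-walks the ever-growing filter tree at every Project; the gap widens rapidly with more Projects and larger filter expressions, which is exactly the shape of the production plans described above.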
h2. Q2. What problem is this proposal NOT designed to solve?
It is not going to change any runtime profile.
h2. Q3. How is it done today, and what are the limits of current practice?
As mentioned above, currently PushDownPredicate pushes one filter at a time, and at each Project it materializes the re-aliased filter. This results in a large number of iterations to achieve idempotency. Moreover, the immediate materialization of the Filter after each Project pass results in unnecessary traversals of the filter expression tree, using the expensive transformUp, and the filter's expression tree is bound to keep growing as it is pushed down.
h2. Q4. What is new in your approach and why do you think it will be successful?
In the new approach, we push all the filters down in a single pass, and do not materialize a filter as it passes through a Project. Instead, we keep collecting the Projects in sequential order and materialize the final filter once its final position is reached (above a Join in the case of 2.4, or above the base relation, etc.).
When this approach is coupled with the logic of identifying those Project operators whose expressions will not mutate (which I will share later), so that rules like
NullPropagation,
OptimizeIn,
LikeSimplification,
BooleanSimplification,
SimplifyConditionals,
RemoveDispensableExpressions,
SimplifyBinaryComparison,
SimplifyCaseConversionExpressions,
SimplifyExtractValueOps
are applied only in the first pass on the expressions of such a Project operator, the compilation time of the offending queries has been reduced from 24 hrs or more to under 8 mins.
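The rule-skipping idea can be illustrated with a small hypothetical sketch (this is not Spark's API; `apply_expression_rules`, `mutable`, and `simplified` are invented names for illustration):

```python
def apply_expression_rules(plan_nodes, rules):
    """Apply the expression-simplification rules only to nodes whose
    expressions may still change; a node known to be immutable is
    simplified once and then skipped in every later iteration."""
    for node in plan_nodes:
        if node["simplified"] and not node["mutable"]:
            continue                      # expressions cannot change: skip
        for rule in rules:
            node["expr"] = rule(node["expr"])
        node["simplified"] = True

calls = {"n": 0}

def simplify(expr):                       # stand-in for NullPropagation etc.
    calls["n"] += 1
    return expr

nodes = [{"expr": "e1", "mutable": False, "simplified": False},
         {"expr": "e2", "mutable": True,  "simplified": False}]

for _ in range(3):                        # three optimizer iterations
    apply_expression_rules(nodes, [simplify])

print(calls["n"])  # 4: the immutable node is visited once, the mutable one three times
```

With the nine rules listed above all using this gate, the saving compounds: each skipped node avoids nine full expression-tree traversals per iteration.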
h2. Q5. Who cares? If you are successful, what difference will it make?
For my company, WorkDay, it will fix the currently failing plans, which fail due to OOM, with compilation times running into 24 hrs or so. I have a PR for this locally and will publish it in some time.
h2. Q6. What are the risks?
The risk in the change to PushDownPredicate is very low.
The follow-up proposal, identifying Project operators whose expressions will be immutable so that the above set of rules runs only once, involves some relatively complex logic, but with extra test coverage it should be safe.
h2. Q7. How long will it take?
The basic changes are already in place. The tests will take time: around 10-15 days.
h2. Q8. What are the mid-term and final “exams” to check for success?
All tests should pass.
The perf benefit should justify the changes.
was:
h2. Q1. What are you trying to do? Articulate your objectives using absolutely
no jargon.
The aim is to improve the compile time performance of query which in WorkDay's
use case takes > 24 hrs ( & eventually fails) , to < 8 min.
To explain the problem, I will provide the context.
The query plan in our production system, is huge, with nested *case when*
expressions ( level of nesting could be > 8) , where each *case when* can have
branches sometimes > 1000.
The plan could look like
{quote}Project1
|
Filter 1
|
Project2
|
Filter2
|
Project3
|
Filter3
|
Join
{quote}
Now the optimizer has a Batch of Rules , intended to run at max 100 times.
*Also note that the, the batch will continue to run till one of the condition
is satisfied*
*i.e either numIter == 100 || inputPlan == outputPlan (idempotency is
achieved)*
One of the early Rule is *PushDownPredicateRule.*
**Followed by **CollapseProject**.
The first issue is *PushDownPredicate* rule.
It picks one filter at a time and pushes it to the lowest level (I understand that in 3.1 it pushes through a Join, while in 2.4 it stops at the Join); in either case, it picks 1 filter at a time, starting from the top, in each iteration.
*The above comment is no longer true in the 3.1 release, as it now combines filters, so it does push all the encountered filters in a single pass. But it still materializes the filter on each push by re-aliasing.*
So if there are, say, 50 Projects interspersed with Filters, idempotency is guaranteed not to be achieved until around 49 iterations. Moreover, CollapseProject will also modify the tree on each iteration, as a filter gets removed from within a Project.
Moreover, on each movement of a filter through a Project, the filter is re-aliased using transformUp. transformUp is very expensive compared to transformDown. And as the filter keeps getting pushed down, its size increases.
To optimize this rule, 2 things are needed:
# Instead of pushing one filter at a time, collect all the filters as we traverse the tree, within that same iteration.
# Do not re-alias the filters on each push. Collect the sequence of Projects they have passed through, and when the filters have reached their resting place, do the re-aliasing by processing the collected Projects in a bottom-up manner.
This will result in achieving idempotency in a couple of iterations.
*How reducing the number of iterations helps performance:*
There are many rules, like *NullPropagation, OptimizeIn, SimplifyConditionals (and around 6 more such rules)*, which traverse the tree using transformUp, and they run unnecessarily in each iteration, even when the expressions in an operator have not changed since the previous runs.
*I have a different proposal, which I will share later, on how to avoid running the above rules unnecessarily when it can be guaranteed that the expressions in an operator are not going to mutate.*
The above has been identified as the cause of our huge compilation time.
h2. Q2. What problem is this proposal NOT designed to solve?
It is not going to change any runtime profile.
h2. Q3. How is it done today, and what are the limits of current practice?
Like mentioned above , currently PushDownPredicate pushes one filter at a time
& at each Project , it materialized the re-aliased filter. This results in
large number of iterations to achieve idempotency as well as immediate
materialization of Filter after each Project pass,, results in unnecessary tree
traversals of filter expression that too using transformUp. and the expression
tree of filter is bound to keep increasing as it is pushed down.
h2. Q4. What is new in your approach and why do you think it will be successful?
In the new approach we push all the filters down in a single pass. And do not
materialize filters as it pass through Project. Instead keep collecting
projects in sequential order and materialize the final filter once its final
position is achieved ( above a join , in case of 2.1 , or above the base
relation etc).
This approach when coupled with the logic of identifying those Project operator
whose expressions will not mutate ( which I will share later) , so that rules
like
NullPropagation,
OptimizeIn.,
LikeSimplification.,
BooleanSimplification.,
SimplifyConditionals.,
RemoveDispensableExpressions.,
SimplifyBinaryComparison.,
SimplifyCaseConversionExpressions.,
SimplifyExtractValueOps
are applied only in first pass on the expressions of that Project operator,
the compilation time of offending queries have been reduced to under 8 mins
from 24 hrs or more.
h2. Q5. Who cares? If you are successful, what difference will it make?
For My company WorkDay, it will solve the currently failing plans due to OOM &
compilation time running into 24 hrs or so. I have a PR for this locally, will
publish it in some time.
h2. Q6. What are the risks?
The risk in the change of PushDownPredicate is very low.
For the next proposal of identifying Project operator whose expressions will
be immutable such that the above set of rules run only once has some relatively
complex logic but with extra tests coverage it should be safe.
h2. Q7. How long will it take?
The basic changes are already in place. tests will take time. around 10 -15
days.
h2. Q8. What are the mid-term and final “exams” to check for success?
All tests should pass.
The perf benefit should justify the changes.
> SPIP: Improving the compile time performance, by improving a couple of
> rules, from 24 hrs to under 8 minutes
> -------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-36786
> URL: https://issues.apache.org/jira/browse/SPARK-36786
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.4.1, 3.1.2
> Reporter: Asif
> Priority: Major
> Labels: SPIP
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)