[jira] [Commented] (SPARK-36786) SPIP: Improving the compile time performance, by improving a couple of rules, from 24 hrs to under 8 minutes
[ https://issues.apache.org/jira/browse/SPARK-36786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781976#comment-17781976 ] Asif commented on SPARK-36786: --
I had put this on the back burner as my changes were on 3.2, so I have to do a merge onto latest. Whatever optimizations I did on 3.2 are still applicable, as the drawbacks still exist, but the changes are going to be a little extensive. If there is interest in it I can pick it up after some days, as right now I am occupied with another SPIP, which proposes changes for improving the performance of broadcast hash joins on non-partition-column joins.
> SPIP: Improving the compile time performance, by improving a couple of rules, from 24 hrs to under 8 minutes
> -
>
> Key: SPARK-36786
> URL: https://issues.apache.org/jira/browse/SPARK-36786
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.4.1, 3.1.2
> Reporter: Asif
> Priority: Major
> Labels: SPIP
>
> h2. Q1. What are you trying to do? Articulate your objectives using absolutely no jargon.
> The aim is to improve the compile-time performance of a query which, in Workday's use case, takes > 24 hrs (and eventually fails), to < 8 min.
> To explain the problem, I will provide the context.
> The query plan in our production system is huge, with nested *case when* expressions (the level of nesting can be > 8), where each *case when* can sometimes have > 1000 branches. The plan can look like
> {quote}Project1
> |
> Filter1
> |
> Project2
> |
> Filter2
> |
> Project3
> |
> Filter3
> |
> Join
> {quote}
> Now the optimizer has a Batch of rules intended to run at most 100 times. *Also note that the batch will continue to run till one of these conditions is satisfied: either numIter == 100 or inputPlan == outputPlan (idempotency is achieved).*
> One of the early rules is *PushDownPredicate*, followed by *CollapseProject*.
> The first issue is the *PushDownPredicate* rule. It picks one filter at a time and pushes it to the lowest level (I understand that in 3.1 it pushes through a Join, while in 2.4 it stops at the Join); in either case it picks one filter at a time, starting from the top, in each iteration. *The preceding comment is no longer true in the 3.1 release, as the rule now combines filters, so it does push all the encountered filters in a single pass. But it still materializes the filter on each push by re-aliasing.*
> So if there are, say, 50 Projects interspersed with Filters, idempotency is guaranteed not to be achieved until around 49 iterations. Moreover, CollapseProject will also modify the tree on each iteration, as a filter gets removed from within the Projects. And on each movement of a filter through the Project tree, the filter is re-aliased using transformUp. transformUp is very expensive compared to transformDown, and as the filter keeps getting pushed down, its size increases.
> To optimize this rule, two things are needed (see the sketches below):
> # Instead of pushing one filter at a time, collect all the filters as we traverse the tree in that same iteration.
> # Do not re-alias the filters on each push. Collect the sequence of Projects each filter has passed through, and when the filters have reached their resting place, do the re-aliasing once by processing the collected Projects in a down-to-up manner.
> This will result in achieving idempotency in a couple of iterations.
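The fixed-point behavior described above can be made concrete with a small sketch. The following runToFixedPoint is a hypothetical, simplified stand-in for Spark's actual RuleExecutor (which adds plan-change tracking, metrics, and logging); it only illustrates the numIter == 100 || inputPlan == outputPlan termination condition:

{code:scala}
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule

// Simplified fixed-point batch loop: keep re-running the batch's rules
// until the plan stops changing (idempotency) or the iteration cap is hit.
def runToFixedPoint(
    plan: LogicalPlan,
    rules: Seq[Rule[LogicalPlan]],
    maxIterations: Int = 100): LogicalPlan = {
  var current = plan
  var iteration = 0
  var converged = false
  while (!converged && iteration < maxIterations) {
    val next = rules.foldLeft(current) { (p, rule) => rule(p) }
    converged = next.fastEquals(current) // inputPlan == outputPlan
    current = next
    iteration += 1
  }
  current
}
{code}

With ~50 Filter/Project pairs and a rule that moves each filter only one step per pass, converged stays false for roughly 49 iterations, and each of those iterations re-runs every rule in the batch over the whole plan.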
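The proposed deferred re-aliasing can also be sketched. Here aliasMap and realiasOnce are hypothetical helper names, not Spark APIs or the actual patch; they only illustrate the "collect the Projects, re-alias once" idea:

{code:scala}
import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute, Expression, NamedExpression}

// Record the aliases a Project defines instead of immediately rewriting
// the filter condition each time the filter is pushed below a Project.
def aliasMap(projectList: Seq[NamedExpression]): Map[Attribute, Expression] =
  projectList.collect { case a: Alias => (a.toAttribute, a.child) }.toMap

// Once the filter has reached its resting place, substitute through the
// collected maps in the order the filter crossed the Projects, so the
// expensive transformUp rewrite happens exactly once per filter.
def realiasOnce(
    condition: Expression,
    crossedProjects: Seq[Map[Attribute, Expression]]): Expression =
  crossedProjects.foldLeft(condition) { (cond, aliases) =>
    cond.transformUp { case attr: Attribute => aliases.getOrElse(attr, attr) }
  }
{code}

The design point is that each intermediate rewrite in the current rule copies the (growing) condition tree with transformUp; batching the substitutions removes all but the final rewrite.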
> *How does reducing the number of iterations help performance?*
> There are many rules, like *NullPropagation, OptimizeIn, SimplifyConditionals (... there are around 6 more such rules)*, which traverse the tree using transformUp, and they run unnecessarily in each iteration even when the expressions in an operator have not changed since the previous runs.
> *I have a different proposal, which I will share later, as to how to avoid running the above rules unnecessarily, if it can be guaranteed that the expressions in the operator are not going to mutate.*
> The cause of our huge compilation time has been identified as the above.
>
> h2. Q2. What problem is this proposal NOT designed to solve?
> It is not going to change any runtime profile.
> h2. Q3. How is it done today, and what are the limits of current practice?
> As mentioned above, currently PushDownPredicate pushes one filter at a time, and at each Project it materializes the re-aliased filter. This results in a large number of iterations to achieve idempotency, and the immediate materialization of the filter after each Project pass results in unnecessary tree traversals of the filter expression, using transformUp at that; the expression tree of the filter is bound to keep increasing as it is pushed down.
> h2. Q4. What is new in your approach and why do you think it will be successful?
> In the new approach we push all the filters down in a single pass, and we do not materialize the filters as they pass through the Projects. Instead we keep collecting the Projects in sequential order and materialize the final filter once it has reached its final resting place.
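As a rough illustration of the Q4 approach, the following collectFilters is a hypothetical sketch (not the actual patch): it walks the plan once, detaching every Filter and accumulating its condition, with the one-time re-aliasing of the collected conditions deferred as in the realiasOnce sketch above:

{code:scala}
import org.apache.spark.sql.catalyst.expressions.Expression
import org.apache.spark.sql.catalyst.plans.logical.{Filter, LogicalPlan, Project}

// Single downward pass: detach every Filter and remember its condition,
// rather than letting the rule move one filter per batch iteration.
// NOTE: a collected condition still references the outputs of the Projects
// it was lifted over, so it must be re-aliased once when it finally lands.
def collectFilters(plan: LogicalPlan): (LogicalPlan, Seq[Expression]) = plan match {
  case Filter(condition, child) =>
    val (stripped, conditions) = collectFilters(child)
    (stripped, condition +: conditions)
  case p @ Project(_, child) =>
    val (stripped, conditions) = collectFilters(child)
    (p.copy(child = stripped), conditions)
  case other =>
    (other, Seq.empty)
}
{code}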
[jira] [Commented] (SPARK-36786) SPIP: Improving the compile time performance, by improving a couple of rules, from 24 hrs to under 8 minutes
[ https://issues.apache.org/jira/browse/SPARK-36786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781966#comment-17781966 ] Abhinav Kumar commented on SPARK-36786: ---
[~ashahid7] [~adou...@sqli.com] where are we on this one?