GitHub user maryannxue opened a pull request: https://github.com/apache/spark/pull/22060
[DO NOT MERGE][TEST ONLY] Add once-policy rule check

## What changes were proposed in this pull request?

Rules like `HandleNullInputsForUDF` (https://issues.apache.org/jira/browse/SPARK-24891) do not stabilize (they can keep applying new changes to a plan indefinitely) and can cause problems such as SQL cache mismatches. Ideally, all rules, whether in a once-policy batch or a fixed-point-policy batch, should stabilize after the number of runs specified. Once-policy should be considered a performance improvement: an assumption that the rule can stabilize after just one run, rather than an assumption that the rule won't be applied more than once. Those once-policy rules should also run fine under a fixed-point policy.

We already have a check for fixed-point batches that throws an exception if the maximum number of runs is reached and the plan is still changing. This PR adds a similar check for once-policy batches, which throws an exception if the plan changes between the first run and the second run of a once-policy rule. The test results will show which analysis rules fail this check so that we can fix them later.

## How was this patch tested?

N/A

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maryannxue/spark once_policy

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22060.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #22060

----
commit 323656872799b8dd636061220f3ed139379c9c79
Author: maryannxue <maryannxue@...>
Date: 2018-08-09T05:20:32Z

    Add once-policy batch check
----
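
For readers unfamiliar with the once-policy versus fixed-point distinction described in this PR, the idea of the check can be sketched as a toy model. This is not Spark's `RuleExecutor` and not the code in this patch: the names `Batch`, `execute`, `NonIdempotentBatchError`, `wrap_always`, and `wrap_once` are hypothetical, plans are modelled as plain tuples, and the check is simplified to "one run beyond the allowed number must leave the plan unchanged".

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Toy model only: a "plan" is an immutable tuple of operator names.
Plan = Tuple[str, ...]
Rule = Callable[[Plan], Plan]


@dataclass
class Batch:
    name: str
    max_iterations: int  # 1 models a once-policy batch, >1 a fixed-point batch
    rules: List[Rule]


class NonIdempotentBatchError(Exception):
    """Raised when a batch still changes the plan after its allowed runs."""


def apply_rules(rules: List[Rule], plan: Plan) -> Plan:
    # Apply every rule of the batch once, in order.
    for rule in rules:
        plan = rule(plan)
    return plan


def execute(batch: Batch, plan: Plan) -> Plan:
    current = plan
    for _ in range(batch.max_iterations):
        result = apply_rules(batch.rules, current)
        if result == current:
            return result  # the plan stabilized within the allowed runs
        current = result
    # Allowed runs are used up. One extra run must be a no-op; for a
    # once-policy batch this compares the first run with the second run.
    if apply_rules(batch.rules, current) != current:
        raise NonIdempotentBatchError(
            f"batch '{batch.name}' still changes the plan after "
            f"{batch.max_iterations} run(s)"
        )
    return current


def wrap_always(plan: Plan) -> Plan:
    # Hypothetical non-stabilizing rule: adds a null check on every run.
    return ("null_check",) + plan


def wrap_once(plan: Plan) -> Plan:
    # Hypothetical stabilizing rule: adds the null check only if it is missing.
    return plan if plan[:1] == ("null_check",) else ("null_check",) + plan
```

In this model, `execute(Batch("once", 1, [wrap_once]), ("udf",))` returns the wrapped plan, while the same batch built with `wrap_always` raises `NonIdempotentBatchError`, which is the kind of failure the test run in this PR is meant to surface.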