GitHub user maryannxue opened a pull request:

    https://github.com/apache/spark/pull/22060

    [DO NOT MERGE][TEST ONLY] Add once-policy rule check

    ## What changes were proposed in this pull request?
    
    Rules like `HandleNullInputsForUDF` 
(https://issues.apache.org/jira/browse/SPARK-24891) do not stabilize (they can 
keep applying new changes to a plan indefinitely) and can cause problems such 
as SQL cache mismatches.
    Ideally, all rules, whether in a once-policy batch or a fixed-point-policy 
batch, should stabilize after the number of runs specified. The once policy 
should be considered a performance improvement: an assumption that the rule 
stabilizes after just one run, rather than an assumption that the rule won't be 
applied more than once. Once-policy rules should therefore also run fine under 
the fixed-point policy.
    Currently there is already a check for the fixed-point policy that throws 
an exception if the maximum number of runs is reached and the plan is still 
changing. This PR adds a similar check for the once policy that throws an 
exception if the plan changes between the first run and the second run of a 
once-policy rule.
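
    The two checks described above can be sketched with a toy executor. This 
is only an illustration under stated assumptions, not the code in this PR: 
`Plan`, `Rule`, `Batch`, `Once`, `FixedPoint` and `execute` are hypothetical 
stand-ins defined here, and a plan is modeled as a plain list of strings.

```scala
// Toy model only: every name below is a hypothetical stand-in defined in this
// sketch, not Spark's RuleExecutor API.
object OncePolicyCheck {
  type Plan = List[String]

  final case class Rule(name: String, transform: Plan => Plan)

  sealed trait Strategy
  case object Once extends Strategy
  final case class FixedPoint(maxIterations: Int) extends Strategy

  final case class Batch(name: String, strategy: Strategy, rules: Seq[Rule])

  // One pass: apply every rule of the batch in order.
  private def runOnce(batch: Batch, plan: Plan): Plan =
    batch.rules.foldLeft(plan)((p, r) => r.transform(p))

  def execute(batch: Batch, plan: Plan): Plan = batch.strategy match {
    case Once =>
      val first = runOnce(batch, plan)
      // Once-policy check: a second pass must leave the plan unchanged.
      val second = runOnce(batch, first)
      if (second != first) {
        throw new IllegalStateException(
          s"Once-policy batch '${batch.name}' changed the plan on its second run")
      }
      first

    case FixedPoint(maxIterations) =>
      var current = plan
      var changed = true
      var runs = 0
      while (changed && runs < maxIterations) {
        val next = runOnce(batch, current)
        changed = next != current
        current = next
        runs += 1
      }
      // Fixed-point check: fail if the last allowed run still changed the plan.
      if (changed) {
        throw new IllegalStateException(
          s"Batch '${batch.name}' did not stabilize after $maxIterations runs")
      }
      current
  }
}
```

    In this model, a rule such as `Rule("Distinct", _.distinct)` passes the 
once-policy check, while a rule that prepends a new element on every run (a 
rough analogue of a rule that keeps wrapping an expression) fails it.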
    
    From the test results, we can find out which analysis rules fail this 
check, so we can fix them later.
    
    ## How was this patch tested?
    
    N/A


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maryannxue/spark once_policy

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22060.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22060
    
----
commit 323656872799b8dd636061220f3ed139379c9c79
Author: maryannxue <maryannxue@...>
Date:   2018-08-09T05:20:32Z

    Add once-policy batch check

----

