[GitHub] spark pull request #22060: [DO NOT MERGE][TEST ONLY] Add once-policy rule ch...

2018-11-10 Thread maryannxue
Github user maryannxue closed the pull request at:

https://github.com/apache/spark/pull/22060


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22060: [DO NOT MERGE][TEST ONLY] Add once-policy rule ch...

2018-08-09 Thread maryannxue
GitHub user maryannxue opened a pull request:

https://github.com/apache/spark/pull/22060

[DO NOT MERGE][TEST ONLY] Add once-policy rule check

## What changes were proposed in this pull request?

Rules like `HandleNullInputsForUDF` 
(https://issues.apache.org/jira/browse/SPARK-24891) do not stabilize (they can 
keep applying new changes to a plan indefinitely) and can cause problems such 
as SQL cache mismatches.
Ideally, every rule, whether in a once-policy batch or a fixed-point-policy 
batch, should stabilize after the specified number of runs. The once policy 
should be considered a performance improvement: an assumption that the rule 
stabilizes after just one run, rather than an assumption that the rule won't be 
applied more than once. Once-policy rules should therefore also run fine under 
a fixed-point policy.
We already have a check for fixed-point batches that throws an exception 
if the maximum number of runs is reached and the plan is still changing. This 
PR adds a similar check for once-policy batches, which throws an exception if 
the plan changes between the first run and the second run of a once-policy rule.

From the test results of this PR, we can find out which analysis rules fail 
this check, so that we can fix them later.
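
To make the idea concrete, here is a minimal, self-contained sketch of such a stability check. It is a toy model and not Spark's actual `RuleExecutor`: the plan is just a `String`, and `Rule`, `Batch`, `Once`, `FixedPoint`, and `execute` are hypothetical names that only mirror the concepts described above.

```scala
// Toy model only: NOT Spark's RuleExecutor. All names here are hypothetical.
object OncePolicyCheckSketch {
  // A "plan" is just a String in this sketch.
  type Plan = String

  // A rule rewrites a plan; returning an equal plan means "no change".
  final case class Rule(name: String, rewrite: Plan => Plan)

  sealed trait Strategy
  case object Once extends Strategy
  final case class FixedPoint(maxIterations: Int) extends Strategy

  final case class Batch(name: String, strategy: Strategy, rules: Seq[Rule])

  // Apply every rule of the batch once, in order.
  private def runOnce(batch: Batch, plan: Plan): Plan =
    batch.rules.foldLeft(plan)((p, rule) => rule.rewrite(p))

  def execute(batch: Batch, plan: Plan): Plan = batch.strategy match {
    case Once =>
      val first = runOnce(batch, plan)
      // The added check: a second run must not change the plan again.
      val second = runOnce(batch, first)
      if (second != first) {
        throw new IllegalStateException(
          s"Once-policy batch '${batch.name}' changed the plan on a second run")
      }
      first

    case FixedPoint(maxIterations) =>
      var current = plan
      var iteration = 0
      var changed = true
      while (changed && iteration < maxIterations) {
        val next = runOnce(batch, current)
        changed = next != current
        current = next
        iteration += 1
      }
      // The existing check: the plan was still changing at the last allowed run.
      if (changed) {
        throw new IllegalStateException(
          s"Batch '${batch.name}' did not stabilize in $maxIterations runs")
      }
      current
  }
}
```

In this sketch, an idempotent rule such as `Rule("Trim", _.trim)` passes the once-policy check, while a rule that wraps the plan on every run, such as `Rule("Wrap", p => s"NullCheck($p)")`, fails it with an `IllegalStateException`. The second rule is only an analogy for a non-stabilizing rule, not a reproduction of `HandleNullInputsForUDF`.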

## How was this patch tested?

N/A


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maryannxue/spark once_policy

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22060.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22060


commit 323656872799b8dd636061220f3ed139379c9c79
Author: maryannxue 
Date:   2018-08-09T05:20:32Z

Add once-policy batch check



