GitHub user gengliangwang opened a pull request:

    https://github.com/apache/spark/pull/19475

    [SPARK-22257][SQL]Reserve all non-deterministic expressions in ExpressionSet

    ## What changes were proposed in this pull request?
    
    Non-deterministic expressions should never be considered as contained in an [[ExpressionSet]].
    This is consistent with how `semanticEquals` is defined between two expressions.
    Otherwise, combining expressions removes non-deterministic expressions that should be preserved.
    For example, combining the filters of
    ```scala
    testRelation.where(Rand(0) > 0.1).where(Rand(0) > 0.1)
    ```
    should result in
    ```scala
    testRelation.where(Rand(0) > 0.1 && Rand(0) > 0.1)
    ```
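    
    The intended set semantics can be sketched with a toy model. This is only an illustration under stated assumptions: `Expr`, `Attr`, `GreaterThan`, `ToyExpressionSet`, and this simplified `Rand` are hypothetical stand-ins, not Spark's Catalyst classes, and the real `ExpressionSet` compares canonicalized expressions rather than plain case-class equality.
    
    ```scala
    // Toy model only: these types are hypothetical, not Spark's Catalyst classes.
    sealed trait Expr { def deterministic: Boolean }
    final case class Attr(name: String) extends Expr { val deterministic = true }
    final case class Rand(seed: Long) extends Expr { val deterministic = false }
    final case class GreaterThan(left: Expr, right: Double) extends Expr {
      // A comparison is deterministic only if its child expression is.
      val deterministic: Boolean = left.deterministic
    }
    
    // Deduplicates deterministic expressions, keeps every non-deterministic one,
    // and never reports a non-deterministic expression as contained.
    final class ToyExpressionSet private (
        private val deterministicSeen: Set[Expr],
        val kept: Vector[Expr]) {
    
      def contains(e: Expr): Boolean =
        e.deterministic && deterministicSeen.contains(e)
    
      def +(e: Expr): ToyExpressionSet =
        if (!e.deterministic) new ToyExpressionSet(deterministicSeen, kept :+ e)
        else if (deterministicSeen.contains(e)) this
        else new ToyExpressionSet(deterministicSeen + e, kept :+ e)
    }
    
    object ToyExpressionSet {
      val empty: ToyExpressionSet = new ToyExpressionSet(Set.empty, Vector.empty)
      def apply(es: Expr*): ToyExpressionSet = es.foldLeft(empty)(_ + _)
    }
    ```
    
    In this sketch, adding `GreaterThan(Rand(0), 0.1)` twice keeps both copies, while adding a deterministic filter twice keeps only one.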
    
    ## How was this patch tested?
    
    Unit test


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gengliangwang/spark non-deterministic-expressionSet

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19475.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19475
    
----
commit 262a0647da01f3e2edae6cb7ab9b66954a899067
Author: Wang Gengliang <ltn...@gmail.com>
Date:   2017-10-11T21:01:54Z

    Reserve all non-deterministic expressions in ExpressionSet.

commit f97fb9808fdeb2a9d46cd70105c7d05b876ad3fa
Author: Wang Gengliang <ltn...@gmail.com>
Date:   2017-10-11T22:32:15Z

    revise comments

----

