GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/20021

    [SPARK-22668][SQL] Ensure no global variables in arguments of method split 
by CodegenContext.splitExpressions()

    ## What changes were proposed in this pull request?
    
    Passing global variables to a split method is dangerous: any mutation of 
the argument inside the split method is lost, which may lead to unexpected 
behavior.
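
    A minimal Java sketch of the hazard, not taken from the patch. The class, field, and method names are all hypothetical stand-ins for generated code: `value` plays the role of a global variable (a mutable field of the generated class), and the two methods play the role of split methods.

```java
class GlobalArgDemo {
    // Stands in for a codegen "global variable": a mutable field of the generated class.
    int value = 0;

    // The parameter reuses the field's name, so it shadows the field.
    // The assignment updates only the local copy and is lost on return.
    void splitShadowing(int value) {
        value = value + 1;
    }

    // A fresh parameter name leaves the field visible, so the write reaches it.
    void splitFresh(int input) {
        value = input + 1;
    }

    public static void main(String[] args) {
        GlobalArgDemo d = new GlobalArgDemo();
        d.splitShadowing(d.value);
        System.out.println(d.value); // prints 0: the mutation was ignored
        d.splitFresh(d.value);
        System.out.println(d.value); // prints 1
    }
}
```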
    
    To prevent this, one approach is to make sure no expression outputs 
global variables, i.e. to localize the lifetime of mutable states in 
expressions.
    
    Another approach is to make sure, when calling `ctx.splitExpressions`, 
that we don't use the children's output as parameter names.
    
    Approach 1 is actually hard to do, as we need to check all expressions and 
operators that support whole-stage codegen. Approach 2 is easier, as 
`ctx.splitExpressions` does not have many callers.
    
    Besides, approach 2 is more flexible, as the children's output may be 
something else that can't be a parameter name: a literal, an inlined statement 
(`a + 1`), etc.
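
    Approach 2 can be sketched as follows. This is an illustration under assumptions, not the code in this patch: `SplitArgsSketch`, `paramsAndArgs`, the `arg_` prefix, and `apply_0` are hypothetical names. The idea is that each child output is passed only as an argument at the call site, while the split method declares a freshly named parameter for it.

```java
final class SplitArgsSketch {
    // Hypothetical helper: returns {parameter list, argument list} for a split method.
    // Child outputs are used only as call-site arguments, never as parameter names.
    static String[] paramsAndArgs(String[] javaTypes, String[] childOutputs) {
        StringBuilder params = new StringBuilder();
        StringBuilder args = new StringBuilder();
        for (int i = 0; i < childOutputs.length; i++) {
            if (i > 0) {
                params.append(", ");
                args.append(", ");
            }
            // Fresh name derived from the position, independent of the child output.
            params.append(javaTypes[i]).append(" arg_").append(i);
            args.append(childOutputs[i]);
        }
        return new String[] { params.toString(), args.toString() };
    }

    public static void main(String[] unused) {
        String[] r = paramsAndArgs(
            new String[] { "int", "int", "int" },
            // A global variable, a literal, and an inlined statement.
            new String[] { "globalValue", "1", "(a + 1)" });
        // prints: private void apply_0(int arg_0, int arg_1, int arg_2) { ... }
        System.out.println("private void apply_0(" + r[0] + ") { ... }");
        // prints: apply_0(globalValue, 1, (a + 1));
        System.out.println("apply_0(" + r[1] + ");");
    }
}
```

    With this shape, a literal or an inlined statement is as acceptable a child output as a variable name, since none of them ever appears in the method signature.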
    
    close https://github.com/apache/spark/pull/19865
    close https://github.com/apache/spark/pull/19938
    
    ## How was this patch tested?
    
    Existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark codegen

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20021.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20021
    
----
commit aadb838c20a3e64b6eed3bcb2d32a461e2851575
Author: Wenchen Fan <wenc...@databricks.com>
Date:   2017-12-19T15:18:13Z

    Ensure no global variables in arguments of method split by 
CodegenContext.splitExpressions()

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
