GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/20021
[SPARK-22668][SQL] Ensure no global variables in arguments of method split by CodegenContext.splitExpressions()

## What changes were proposed in this pull request?

Passing global variables to a split method is dangerous: any mutation of them inside the method is ignored, which may lead to unexpected behavior.

To prevent this, one approach is to make sure no expression outputs global variables, i.e. localizing the lifetime of mutable states in expressions.

Another approach is, when calling `ctx.splitExpressions`, to make sure we don't use the children's output as parameter names.

Approach 1 is actually hard to do, as we would need to check all expressions and operators that support whole-stage codegen. Approach 2 is easier, as `ctx.splitExpressions` does not have many callers. Besides, approach 2 is more flexible, as the children's output may be something else that can't be a parameter name: a literal, an inlined statement (a + 1), etc.

close https://github.com/apache/spark/pull/19865
close https://github.com/apache/spark/pull/19938

## How was this patch tested?

existing tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark codegen

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20021.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20021

----
commit aadb838c20a3e64b6eed3bcb2d32a461e2851575
Author: Wenchen Fan <wenc...@databricks.com>
Date:   2017-12-19T15:18:13Z

    Ensure no global variables in arguments of method split by CodegenContext.splitExpressions()

----
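
For illustration only, here is a minimal Java sketch of the hazard described in the pull request text: a mutable field (a "global variable" in generated code) is passed to a split method under its own name, so the parameter shadows the field and the mutation is lost. The class and method names (`SplitShadowingDemo`, `splitWithGlobalAsParam`, `splitWithFreshParam`) are hypothetical and are not taken from Spark's generated code; this assumes the problem shows up as plain Java parameter shadowing.

```java
// Hypothetical stand-in for a generated class; not actual Spark output.
class SplitShadowingDemo {
    // "Global variable": mutable state kept as a field of the generated class.
    private int value = 0;

    // Risky shape: the parameter has the same name as the field.
    // Inside the method, `value` refers to the parameter, so the field is never updated.
    private void splitWithGlobalAsParam(int value) {
        value = value + 1; // mutates the local copy only
    }

    // Safer shape: the parameter has a fresh name, so the write to `value` reaches the field.
    private void splitWithFreshParam(int input) {
        value = input + 1;
    }

    int runRisky() {
        value = 0;
        splitWithGlobalAsParam(value);
        return value; // the field is unchanged
    }

    int runSafe() {
        value = 0;
        splitWithFreshParam(value);
        return value; // the field was updated
    }

    public static void main(String[] args) {
        SplitShadowingDemo demo = new SplitShadowingDemo();
        System.out.println(demo.runRisky()); // prints 0
        System.out.println(demo.runSafe());  // prints 1
    }
}
```

In this sketch the risky variant compiles without error, which is what makes the problem easy to miss: the only symptom is that the field keeps its old value after the call.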