GitHub user dvogelbacher opened a pull request: https://github.com/apache/spark/pull/21993
[SPARK-24983][SQL] Add configuration for maximum number of leaf expressions in collapsed project nodes

## What changes were proposed in this pull request?

Add a configuration option for the maximum number of leaf expressions in collapsed project nodes. If collapsing (via the `CollapseProject` optimizer rule) would produce a project node with more leaf expressions than the configured maximum, the projects are not collapsed. This protects against an exponential explosion in the number of leaf expressions when collapsing many (binary) expressions that refer to the same columns (see https://issues.apache.org/jira/browse/SPARK-24983).

## How was this patch tested?

Added a new unit test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dvogelbacher/spark dv/limitProjectCollapse

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21993.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #21993

----
commit 096340b2d9b65d32cd033f4f9c1d2668336d1fbf
Author: David Vogelbacher <dvogelbacher@...>
Date:   2018-08-03T19:40:25Z

    implement config setting limiting number of leaf expressions after collapsing

commit b7ced5433a55d188889e333806c84a3bdcb19e15
Author: David Vogelbacher <dvogelbacher@...>
Date:   2018-08-03T20:26:11Z

    refactor a bit

----
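
The blow-up described in the pull request description can be illustrated with a toy model. The sketch below is not Spark code and does not use any Spark API: `Col`, `Add`, `substitute`, `collapse_chain`, and the `max_leaves` parameter are all hypothetical names chosen for illustration. It assumes a chain of projections in which each new column is defined as the previous column added to itself, so that inlining one projection into the next doubles the number of leaf expressions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Col:
    """Leaf expression: a reference to a column (hypothetical toy type)."""
    name: str


@dataclass(frozen=True)
class Add:
    """Binary expression with two children (hypothetical toy type)."""
    left: "Expr"
    right: "Expr"


Expr = Union[Col, Add]


def count_leaves(expr: Expr) -> int:
    """Count the leaf (column reference) nodes in an expression tree."""
    if isinstance(expr, Col):
        return 1
    return count_leaves(expr.left) + count_leaves(expr.right)


def substitute(expr: Expr, aliases: dict) -> Expr:
    """Inline alias definitions, as collapsing two projections would."""
    if isinstance(expr, Col):
        return aliases.get(expr.name, expr)
    return Add(substitute(expr.left, aliases), substitute(expr.right, aliases))


def collapse_chain(depth: int, max_leaves: Optional[int] = None) -> Tuple[Expr, int]:
    """Stack `depth` projections with c_i = c_{i-1} + c_{i-1} and inline them.

    If `max_leaves` is set (an assumed stand-in for the proposed limit),
    stop before the collapsed expression would exceed it. Returns the
    collapsed expression and the number of projections that were inlined.
    """
    collapsed: Expr = Col("c0")
    for i in range(1, depth + 1):
        prev = f"c{i - 1}"
        upper = Add(Col(prev), Col(prev))
        candidate = substitute(upper, {prev: collapsed})
        if max_leaves is not None and count_leaves(candidate) > max_leaves:
            return collapsed, i - 1
        collapsed = candidate
    return collapsed, depth


if __name__ == "__main__":
    unlimited, _ = collapse_chain(10)
    print(count_leaves(unlimited))  # 1024, i.e. 2**10

    limited, inlined = collapse_chain(10, max_leaves=100)
    print(count_leaves(limited), inlined)  # 64 6
```

In this model, ten stacked projections collapse into a single expression with 2**10 leaves, whereas a limit of 100 stops the inlining after six projections at 64 leaves. How the actual patch counts leaves and where it applies the check are not shown in this message, so treat the sketch only as an illustration of the growth pattern.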