GitHub user dvogelbacher opened a pull request:

    https://github.com/apache/spark/pull/21993

    [SPARK-24983][SQL] Add configuration for maximum number of leaf expressions 
in collapsed project nodes

    ## What changes were proposed in this pull request?
    
    Add a configuration option for the maximum number of leaf expressions in
    collapsed project nodes. If a collapsed project node (the result of the
    `CollapseProject` optimizer rule) would have more leaf expressions than the
    configured maximum, the projects are not collapsed. This protects against an
    exponentially growing number of leaf expressions when collapsing many
    (binary) expressions that refer to the same columns (see
    https://issues.apache.org/jira/browse/SPARK-24983).
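
To illustrate the blow-up described above, here is a minimal toy model in plain Python. It is not Spark's Catalyst code: `Col`, `Add`, `inline`, `count_leaves`, and `collapse_chain` are hypothetical names invented for this sketch, and the `max_leaves` parameter only stands in for the proposed configuration option. It models stacked projects of the form `x1 = x0 + x0`, `x2 = x1 + x1`, and so on, where collapsing inlines each alias into the project above it.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Col:
    """A leaf expression: a reference to a column (hypothetical toy type)."""
    name: str


@dataclass(frozen=True)
class Add:
    """A binary expression with two children (hypothetical toy type)."""
    left: "Expr"
    right: "Expr"


Expr = Union[Col, Add]


def count_leaves(expr: Expr) -> int:
    """Count the leaf (column reference) nodes in an expression tree."""
    if isinstance(expr, Col):
        return 1
    return count_leaves(expr.left) + count_leaves(expr.right)


def inline(expr: Expr, aliases: dict) -> Expr:
    """Replace column references by the expressions that define them."""
    if isinstance(expr, Col):
        return aliases.get(expr.name, expr)
    return Add(inline(expr.left, aliases), inline(expr.right, aliases))


def collapse_chain(depth: int, max_leaves: int):
    """Collapse the chain x1 = x0 + x0, x2 = x1 + x1, ... bottom-up.

    Stops at the first level whose collapsed form would exceed max_leaves.
    Returns (number of levels collapsed, leaf count of the collapsed expression).
    """
    current = Add(Col("x0"), Col("x0"))  # lowest project, defines x1
    collapsed = 1
    for level in range(2, depth + 1):
        below = f"x{level - 1}"
        upper = Add(Col(below), Col(below))
        candidate = inline(upper, {below: current})
        if count_leaves(candidate) > max_leaves:
            break  # the guard: keep the remaining projects uncollapsed
        current = candidate
        collapsed = level
    return collapsed, count_leaves(current)


if __name__ == "__main__":
    print(collapse_chain(10, 10**9))  # effectively unlimited
    print(collapse_chain(10, 100))    # guard stops the collapse early
```

In this toy model each level doubles the leaf count, so ten stacked projects collapse into a single expression with 1024 leaves when the limit is effectively unbounded, while a limit of 100 stops the collapse after six levels at 64 leaves.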
    
    ## How was this patch tested?
    Added a new unit test.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dvogelbacher/spark dv/limitProjectCollapse

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21993.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21993
    
----
commit 096340b2d9b65d32cd033f4f9c1d2668336d1fbf
Author: David Vogelbacher <dvogelbacher@...>
Date:   2018-08-03T19:40:25Z

    implement config setting limiting number of leaf expressions after 
collapsing

commit b7ced5433a55d188889e333806c84a3bdcb19e15
Author: David Vogelbacher <dvogelbacher@...>
Date:   2018-08-03T20:26:11Z

    refactor a bit

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
