GitHub user viirya opened a pull request:

    https://github.com/apache/spark/pull/22344

    [SPARK-25352][SQL] Perform ordered global limit when limit number is bigger 
than topKSortFallbackThreshold

    ## What changes were proposed in this pull request?
    
    We have optimization on global limit to evenly distribute limit rows across 
all partitions. This optimization doesn't work for ordered results.
    
    For a query ending with sort + limit, in most cases it is performed by 
`TakeOrderedAndProjectExec`.
    
    But if limit number is bigger than `SQLConf.TOP_K_SORT_FALLBACK_THRESHOLD`, 
global limit will be used. At this moment, we need to do ordered global limit.
    
    ## How was this patch tested?
    
    Unit tests.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 SPARK-25352

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22344.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22344
    
----
commit 8d49c1afdbd6c0219d6cc182e53311201f73489f
Author: Liang-Chi Hsieh <viirya@...>
Date:   2018-09-06T04:43:15Z

    Do ordered global limit when limit number is bigger than 
topKSortFallbackThreshold.

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to