[ 
https://issues.apache.org/jira/browse/IMPALA-3471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17134613#comment-17134613
 ] 

Tim Armstrong commented on IMPALA-3471:
---------------------------------------

I think we still want to use the regular external sort implementation for large 
limits, but we could make some further optimisations to reduce how much data 
gets spilled. Specifically, in SortCurrentInputRun() we could truncate the 
in-memory sorted run to OFFSET + LIMIT rows, and then apply the same limit 
again when merging sorted runs.
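
A minimal sketch of those two steps, to make the idea concrete. This is not 
Impala's sorter code: Row, Run, SortAndTruncateRun() and MergeWithLimit() are 
hypothetical names, rows are plain integers, and everything lives in memory 
rather than in spillable buffers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical stand-ins: a "row" is just an integer sort key.
using Row = int64_t;
using Run = std::vector<Row>;

// Sorts one in-memory run and drops every row that cannot be among the first
// `keep` rows of the final result (keep = OFFSET + LIMIT). A row ranked past
// `keep` within its own run is also past `keep` globally.
void SortAndTruncateRun(Run* run, size_t keep) {
  if (run->size() > keep) {
    // Only the first `keep` rows need to end up in sorted order.
    std::partial_sort(run->begin(), run->begin() + keep, run->end());
    run->resize(keep);
  } else {
    std::sort(run->begin(), run->end());
  }
}

// K-way merge of sorted runs that stops after OFFSET + LIMIT rows and
// returns only the rows after the offset.
std::vector<Row> MergeWithLimit(const std::vector<Run>& runs, size_t offset,
                                size_t limit) {
  using Entry = std::pair<Row, size_t>;  // (row, index of the run it came from)
  std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
  std::vector<size_t> pos(runs.size(), 0);
  for (size_t i = 0; i < runs.size(); ++i) {
    assert(std::is_sorted(runs[i].begin(), runs[i].end()));
    if (!runs[i].empty()) heap.push(Entry(runs[i][0], i));
  }

  std::vector<Row> out;
  const size_t keep = offset + limit;
  size_t emitted = 0;
  while (!heap.empty() && emitted < keep) {
    Entry top = heap.top();
    heap.pop();
    if (emitted >= offset) out.push_back(top.first);
    ++emitted;
    const size_t i = top.second;
    if (++pos[i] < runs[i].size()) heap.push(Entry(runs[i][pos[i]], i));
  }
  return out;
}
```

With OFFSET 1 and LIMIT 2, each run only ever holds 3 rows after truncation, 
and the merge reads at most 3 rows in total before stopping.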

There are additional tricks that we could add to optimise this further for 
spilling sorts, mostly various ways of tracking an upper bound on the first 
row that would fall past the OFFSET + LIMIT threshold.
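
One possible form of such a bound, as a sketch only: once any sorted run 
holds at least OFFSET + LIMIT rows, its row at position OFFSET + LIMIT is an 
upper bound on the final cutoff, so later input rows above it can be dropped 
before they are buffered. CutoffFilter and its methods are hypothetical names, 
not existing Impala classes, and rows are again plain integers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins: a "row" is just an integer sort key.
using Row = int64_t;
using Run = std::vector<Row>;

// Tracks an upper bound on the row at position OFFSET + LIMIT of the final
// sorted output and filters incoming rows against it.
class CutoffFilter {
 public:
  explicit CutoffFilter(size_t keep) : keep_(keep), has_bound_(false), bound_(0) {}

  // Called with a fully sorted run. If the run has at least `keep` rows, its
  // keep-th row bounds the global cutoff from above: there are already `keep`
  // rows that are <= that value.
  void ObserveSortedRun(const Run& run) {
    assert(std::is_sorted(run.begin(), run.end()));
    if (keep_ == 0 || run.size() < keep_) return;
    const Row candidate = run[keep_ - 1];
    bound_ = has_bound_ ? std::min(bound_, candidate) : candidate;
    has_bound_ = true;
  }

  // Returns false if the row is provably past the threshold. Rows equal to
  // the bound are admitted so that ties are left for the merge to resolve.
  bool Admit(Row row) const { return !has_bound_ || row <= bound_; }

 private:
  size_t keep_;     // OFFSET + LIMIT
  bool has_bound_;  // false until a long enough run has been observed
  Row bound_;       // tightest bound seen so far
};
```

The bound only ever tightens, so the fraction of input that is discarded 
early grows as more runs are sorted.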

> TopN should be able to spill
> ----------------------------
>
>                 Key: IMPALA-3471
>                 URL: https://issues.apache.org/jira/browse/IMPALA-3471
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 2.6.0
>            Reporter: Jim Apple
>            Priority: Minor
>
> TopN nodes store OFFSET + LIMIT tuples in memory (in fact, in a vector that 
> will throw an exception if allocation fails). It would be nice to check 
> allocations before they fail and spill when there isn't enough memory.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
