On Sun, Sep 14, 2014 at 9:32 AM, Peter Geoghegan <p...@heroku.com> wrote:

> I think we might be better off if a tuplesort function was called
> shortly after tuplesort_begin_heap() is called. How top-n heap sorts
> work is something that largely lives in tuplesort's head. Today, we
> call tuplesort_set_bound() to hint to tuplesort "By the way, this is a
> top-n heap sort applicable sort". I think that with this patch, we
> should then hint (where applicable) "by the way, you won't actually be
> required to sort those first n indexed attributes; rather, you can
> expect to scan those in logical order. You may work the rest out
> yourself, and may be clever about exploiting the sorted-ness of the
> first few columns". The idea of managing a bunch of tiny sorts from
> with ExecSort(), and calling the new function tuplesort_reset() seems
> questionable. tuplesortstate is supposed to be private/opaque to
> nodeSort.c, and the current design strains that.
>
> I'd like to keep nodeSort.c simple. I think it's pretty clear that the
> guts of this do not belong within ExecSort(), in any case. Also, the
> additions there should be much better commented, wherever they finally
> end up.
>

As I understand, you propose to incapsulate partial sort algorithm into
tuplesort. However, in this case we anyway need some significant change of
its interface: let tuplesort decide when it's able to return tuple.
Otherwise, we would miss significant part of LIMIT clause optimization.
tuplesort_set_bound() can't solve all the cases. There could be other
planner nodes between the partial sort and LIMIT.

------
With best regards,
Alexander Korotkov.

Reply via email to