Re: [HACKERS] [PATCH] Incremental sort

2017-10-03 Thread Alexander Korotkov
On Tue, Oct 3, 2017 at 2:52 PM, Robert Haas  wrote:

> On Mon, Oct 2, 2017 at 12:37 PM, Alexander Korotkov
>  wrote:
> > I've applied patch on top of c12d570f and rerun the same benchmarks.
> > CSV-file with results is attached.  There is no dramatical changes.
> There
> > is still minority of performance regression cases while majority of cases
> > has improvement.
>
> Yes, I think these results look pretty good.  But are these times in
> seconds?  You might need to do some testing with bigger sorts.


Good point.  I'll rerun benchmarks with larger dataset size.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] [PATCH] Incremental sort

2017-10-03 Thread Robert Haas
On Mon, Oct 2, 2017 at 12:37 PM, Alexander Korotkov
 wrote:
> I've applied patch on top of c12d570f and rerun the same benchmarks.
> CSV-file with results is attached.  There is no dramatical changes.  There
> is still minority of performance regression cases while majority of cases
> has improvement.

Yes, I think these results look pretty good.  But are these times in
seconds?  You might need to do some testing with bigger sorts.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Incremental sort

2017-09-30 Thread Alexander Korotkov
On Sat, Sep 16, 2017 at 2:46 AM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

> On Thu, Sep 14, 2017 at 2:48 AM, Alexander Korotkov <
> a.korot...@postgrespro.ru> wrote:
>
>> Patch rebased to current master is attached.  I'm going to improve my
>> testing script and post new results.
>>
>
> New benchmarking script and results are attached.  There new dataset
> parameter is introduced: skew factor.  Skew factor defines skew in
> distribution of groups sizes.
> My idea of generating is just usage of power function where power is from
> 0 to 1.  Following formula is used to get group number for particular item
> number i.
> [((i / number_of_indexes) ^ power) * number_of_groups]
> For example, power = 1/6 gives following distribution of groups sizes:
> group numbergroup size
> 0   2
> 1   63
> 2   665
> 3   3367
> 4   11529
> 5   31031
> 6   70993
> 7   144495
> 8   269297
> 9   468558
>
> For convenience, instead of power itself, I use skew factor where power =
> 1.0 / (1.0 + skew).  Therefore, with skew = 0.0, distribution of groups
> sizes is uniform.  Larger skew gives more skewed distribution (and that
> seems to be quite intuitive).  For, negative skew, group sizes are mirrored
> as for corresponding positive skew.  For example, skew factor = -5.0 gives
> following groups sizes distribution:
> group numbergroup size
> 0   468558
> 1   269297
> 2   144495
> 3   70993
> 4   31031
> 5   11529
> 6   3367
> 7   665
> 8   63
> 9   2
>
> Results shows that between 2172 test cases, in 2113 incremental sort gives
> speedup while in 59 it causes slowdown.  The following 4 test cases show
> most significant slowdown (>10% of time).
>
> Table   GroupedCols GroupCount Skew PreorderedFrac
> FullSortMedian IncSortMedian TimeChangePercent
> int4|int4|numeric   1  100  -10  0
> 1.5688240528  2.0607631207 31.36
> text|int8|text|int4 110  0
> 1.7785198689 <(778)%20519-8689>  2.1816160679 22.66
> int8|int8|int4  1   10  -10  0
>  1.136412859  1.3166360855 15.86
> numeric|text|int4|int8  2   10  -10  1
> 0.4403841496  0.5070910454 15.15
>
> As you can see, 3 of this 4 test cases have skewed distribution while one
> of them is related to costly location-aware comparison of text.  I've no
> particular idea of how to cope these slowdowns.  Probably, it's OK to have
> slowdown in some cases while have speedup in majority of cases (assuming
> there is an option to turn off new behavior).  Probably, we should teach
> optimizer more about skewed distributions of groups, but that doesn't seem
> feasible for me.
>
> Any thoughts?
>

BTW, replacement selection sort was removed by 8b304b8b.  I think it worth
to rerun benchmarks after that, because results might be changed.  Will do.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] [PATCH] Incremental sort

2017-05-08 Thread Robert Haas
On Fri, May 5, 2017 at 11:13 AM, Alexander Korotkov
 wrote:
> Incremental sort is faster in vast majority of cases.  It appears to be
> slower only when whose dataset is one sort group.  In this case incremental
> sort is useless, and it should be considered as misuse of incremental sort.
> Slowdown is related to the fact that we anyway have to do extra comparisons,
> unless we somehow push our comparison result into qsort itself and save some
> cpu cycles (but that would be unreasonable break of encapsulation).  Thus,
> in such cases regression seems to be inevitable anyway.  I think we could
> evade this regression during query planning.  If we see that there would be
> only few groups, we should choose plain sort instead of incremental sort.

I'm sorry that I don't have time to review this in detail right now,
but it sounds like you are doing good work to file down cases where
this might cause regressions, which is great.  Regarding the point in
the paragraph above, I'd say that it's OK for the planner to be
responsible for picking between Sort and Incremental Sort in some way.
It is, after all, the planner's job to decide between different
strategies for executing the same query and, of course, sometimes it
will be wrong, but that's OK as long as it's not wrong too often (or
by too much, hopefully).  It may be a little difficult to get this
right, though, because I'm not sure that the information you need
actually exists (or is reliable).  For example, consider the case
where we need to sort 100m rows and there are 2 groups.  If 1 group
contains 1 row and the other group contains all of the rest, there is
really no point in an incremental sort.  On the other hand, if each
group contains 50m rows and we can get the data presorted by the
grouping column, there might be a lot of point to an incremental sort,
because two 50m-row sorts might be a lot cheaper than one 100m sort.
More generally, it's quite easy to imagine situations where the
individual groups can be quicksorted but sorting all of the rows
requires I/O, even when the number of groups isn't that big.  On the
other hand, the real sweet spot for this is probably the case where
the number of groups is very large, with many single-row groups or
many groups with just a few rows each, so if we can at least get this
to work in those cases that may be good enough.  On the third hand,
when costing aggregation, I think we often underestimate the number of
groups and there might well be similar problems here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Incremental sort

2017-04-27 Thread Alexander Korotkov
On Thu, Apr 27, 2017 at 5:06 PM, Robert Haas  wrote:

> On Wed, Apr 26, 2017 at 11:39 AM, Alexander Korotkov
>  wrote:
> > But I'd like to make incremental sort not slower than quicksort in case
> of
> > presorted data.  New idea about it comes to my mind.  Since cause of
> > incremental sort slowness in this case is too frequent reset of
> tuplesort,
> > then what if we would artificially put data in larger groups.  Attached
> > revision of patch implements this: it doesn't stop to accumulate tuples
> to
> > tuplesort until we have MIN_GROUP_SIZE tuples.
> >
> > Now, incremental sort is not slower than quicksort.  And this seems to be
> > cool.
> > However, in the LIMIT case we will pay the price of fetching some extra
> > tuples from outer node.  But, that doesn't seem to hurt us too much.
> >
> > Any thoughts?
>
> Nice idea.



Cool.
Than I'm going to make a set of synthetic performance tests in order to
ensure that there is no regression.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] [PATCH] Incremental sort

2017-04-27 Thread Robert Haas
On Wed, Apr 26, 2017 at 11:39 AM, Alexander Korotkov
 wrote:
> But I'd like to make incremental sort not slower than quicksort in case of
> presorted data.  New idea about it comes to my mind.  Since cause of
> incremental sort slowness in this case is too frequent reset of tuplesort,
> then what if we would artificially put data in larger groups.  Attached
> revision of patch implements this: it doesn't stop to accumulate tuples to
> tuplesort until we have MIN_GROUP_SIZE tuples.
>
> Now, incremental sort is not slower than quicksort.  And this seems to be
> cool.
> However, in the LIMIT case we will pay the price of fetching some extra
> tuples from outer node.  But, that doesn't seem to hurt us too much.
>
> Any thoughts?

Nice idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Incremental sort

2017-04-26 Thread Alexander Korotkov
On Wed, Apr 26, 2017 at 8:20 PM, Peter Geoghegan  wrote:

> On Wed, Apr 26, 2017 at 10:10 AM, Alexander Korotkov
>  wrote:
> > OK, I get it.  Our qsort is so fast not only on 100% presorted case.
> > However, that doesn't change many things in context of incremental sort.
>
> The important point is to make any presorted test case only ~99%
> presorted, so as to not give too much credit to the "high risk"
> presort check optimization.
>
> The switch to insertion sort that we left in (not the bad one removed
> by a3f0b3d -- the insertion sort that actually comes from the B
> paper) does "legitimately" make sorting faster with presorted cases.


I'm still focusing on making incremental sort not slower than qsort with
presorted optimization.  Independently on whether this is "high risk"
optimization or not...
However, adding more test cases is always good.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] [PATCH] Incremental sort

2017-04-26 Thread Peter Geoghegan
On Wed, Apr 26, 2017 at 10:10 AM, Alexander Korotkov
 wrote:
> OK, I get it.  Our qsort is so fast not only on 100% presorted case.
> However, that doesn't change many things in context of incremental sort.

The important point is to make any presorted test case only ~99%
presorted, so as to not give too much credit to the "high risk"
presort check optimization.

The switch to insertion sort that we left in (not the bad one removed
by a3f0b3d -- the insertion sort that actually comes from the B
paper) does "legitimately" make sorting faster with presorted cases.

-- 
Peter Geoghegan

VMware vCenter Server
https://www.vmware.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Incremental sort

2017-04-26 Thread Alexander Korotkov
On Wed, Apr 26, 2017 at 7:56 PM, Peter Geoghegan  wrote:

> On Wed, Apr 26, 2017 at 8:39 AM, Alexander Korotkov
>  wrote:
> > That appears to be wrong.  I intended to make cost_sort prefer plain sort
> > over incremental sort for this dataset size.  But, that appears to be not
> > always right solution.  Quick sort is so fast only on presorted data.
>
> As you may know, I've often said that the precheck for sorted input
> added to our quicksort implementation by a3f0b3d is misguided. It
> sometimes throws away a ton of work if the presorted input isn't
> *perfectly* presorted. This happens when the first out of order tuple
> is towards the end of the presorted input.
>
> I think that it isn't fair to credit our qsort with doing so well on a
> 100% presorted case, because it doesn't do the necessary bookkeeping
> to not throw that work away completely in certain important cases.
>

OK, I get it.  Our qsort is so fast not only on 100% presorted case.
However, that doesn't change many things in context of incremental sort.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] [PATCH] Incremental sort

2017-04-26 Thread Peter Geoghegan
On Wed, Apr 26, 2017 at 8:39 AM, Alexander Korotkov
 wrote:
> That appears to be wrong.  I intended to make cost_sort prefer plain sort
> over incremental sort for this dataset size.  But, that appears to be not
> always right solution.  Quick sort is so fast only on presorted data.

As you may know, I've often said that the precheck for sorted input
added to our quicksort implementation by a3f0b3d is misguided. It
sometimes throws away a ton of work if the presorted input isn't
*perfectly* presorted. This happens when the first out of order tuple
is towards the end of the presorted input.

I think that it isn't fair to credit our qsort with doing so well on a
100% presorted case, because it doesn't do the necessary bookkeeping
to not throw that work away completely in certain important cases.

-- 
Peter Geoghegan

VMware vCenter Server
https://www.vmware.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Incremental sort

2017-04-03 Thread Andres Freund
On 2017-04-03 22:18:21 -0400, Robert Haas wrote:
> On Mon, Apr 3, 2017 at 5:09 PM, Andres Freund  wrote:
> > To me this hasn't gotten even remotely enough performance evaluation.
> > And I don't think it's fair to characterize it as pending since 2013,
> > given it was essentially "waiting on author" for most of that.
> 
> This is undeniably a patch which has been kicking around for a lot of
> time without getting a lot of attention, and if it just keeps getting
> punted down the road, it's never going to become committable.

Indeed, it's old.  And it hasn't gotten enough timely feedback.

But I don't think the wait time can meaningfully be measured by
subtracting two dates:
The first version of the patch, as a PoC, has been posted 2013-12-14,
which then got a good amount of feedback & revisions, and then stalled
till 2014-07-12.  There a few back-and forths yielded a new version.
>From 2014-09-15 till 2015-10-16 the patch stalled, waiting on its
author.  That version had open todos ([1]), as had the version from
2016-03-13 [2], which weren't addressed 2016-03-30 - unfortunately that
was pretty much when the tree was frozen.  2016-09-13 a rebased patch
was sent, some minor points were raised 2016-10-02 (unaddressed), a
larger review was done 2016-12-01 ([5]), unaddressed till 2017-02-18.
At that point we're in this thread.

There's obviously some long waiting-on-author periods in there.  And
some long needs-review periods.


> Alexander's questions upthread about what decisions the committer who
> took an interest (Heikki) would prefer never really got an answer, for
> example.  I don't deny that there may be some work left to do here,
> but I think blaming the author for a week's delay when this has been
> ignored so often for so long is unfair.

I'm not trying to blame Alexander for a week's worth of delay, at all.
It's just that, well, we're past the original code-freeze date, three
days before the "final" code freeze. I don't think fairness is something
we can achieve at this point :(.  Given the risk of regressions -
demonstrated in this thread although partially adressed - and the very
limited amount of benchmarking done, it seems unlikely that this is
going to be merged.

Regards,

Andres


[1] 
http://archives.postgresql.org/message-id/CAPpHfdvhwMsG69exCRUGK3ms-ng0PSPcucH5FU6tAaM-qL-1%2Bw%40mail.gmail.com
[2] 
http://archives.postgresql.org/message-id/CAPpHfdvzjYGLTyA-8ib8UYnKLPrewd9Z%3DT4YJNCRWiHWHHweWw%40mail.gmail.com
[3] 
http://archives.postgresql.org/message-id/CAPpHfdtCcHZ-mLWzsFrRCvHpV1LPSaOGooMZ3sa40AkwR=7...@mail.gmail.com
[4] 
http://archives.postgresql.org/message-id/capphfdvj1tdi2wa64zbbp5-yg-uzarxzk3k7j7zt-crx6ys...@mail.gmail.com
[5] 
http://archives.postgresql.org/message-id/CA+TgmoZapyHRm7NVyuyZ+yAV=u1a070bogre7pkgyraegr4...@mail.gmail.com
[6] 
http://archives.postgresql.org/message-id/CAPpHfds1waRZ=NOmueYq0sx1ZSCnt+5QJvizT8ndT2=etze...@mail.gmail.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Incremental sort

2017-04-03 Thread Robert Haas
On Mon, Apr 3, 2017 at 5:09 PM, Andres Freund  wrote:
> To me this hasn't gotten even remotely enough performance evaluation.
> And I don't think it's fair to characterize it as pending since 2013,
> given it was essentially "waiting on author" for most of that.

This is undeniably a patch which has been kicking around for a lot of
time without getting a lot of attention, and if it just keeps getting
punted down the road, it's never going to become committable.
Alexander's questions upthread about what decisions the committer who
took an interest (Heikki) would prefer never really got an answer, for
example.  I don't deny that there may be some work left to do here,
but I think blaming the author for a week's delay when this has been
ignored so often for so long is unfair.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Incremental sort

2017-04-03 Thread Alexander Korotkov
On Tue, Apr 4, 2017 at 12:09 AM, Andres Freund  wrote:

> On 2017-04-04 00:04:09 +0300, Alexander Korotkov wrote:
> > > >Thank you!
> > > >I already sent version of patch after David's reminder.
> > > >Please find rebased patch in the attachment.
> > >
> > > Cool. I think that's still a bit late for v10?
> > >
> >
> > I don't know.  ISTM, that I addressed all the issues raised by reviewers.
> > Also, this patch is pending since late 2013.  It would be very nice to
> > finally get it in...
>
> To me this hasn't gotten even remotely enough performance evaluation.
>

I'm ready to put my efforts on that.


> And I don't think it's fair to characterize it as pending since 2013,
>

Probably, this duration isn't good characteristic at all.


> given it was essentially "waiting on author" for most of that.
>

What makes you think so?  Do you have some statistics?  Or is it just
random assumption?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] [PATCH] Incremental sort

2017-04-03 Thread Andres Freund
Hi,

On 2017-04-04 00:04:09 +0300, Alexander Korotkov wrote:
> > >Thank you!
> > >I already sent version of patch after David's reminder.
> > >Please find rebased patch in the attachment.
> >
> > Cool. I think that's still a bit late for v10?
> >
> 
> I don't know.  ISTM, that I addressed all the issues raised by reviewers.
> Also, this patch is pending since late 2013.  It would be very nice to
> finally get it in...

To me this hasn't gotten even remotely enough performance evaluation.
And I don't think it's fair to characterize it as pending since 2013,
given it was essentially "waiting on author" for most of that.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Incremental sort

2017-04-03 Thread Alexander Korotkov
On Mon, Apr 3, 2017 at 10:05 PM, Andres Freund  wrote:

> On April 3, 2017 12:03:56 PM PDT, Alexander Korotkov <
> a.korot...@postgrespro.ru> wrote:
> >On Mon, Apr 3, 2017 at 9:34 PM, Andres Freund 
> >wrote:
> >
> >> On 2017-03-29 00:17:02 +0300, Alexander Korotkov wrote:
> >> > On Tue, Mar 28, 2017 at 5:27 PM, David Steele 
> >> wrote:
> >> > > On 3/20/17 10:19 AM, Heikki Linnakangas wrote:
> >> > >
> >> > >> On 03/20/2017 11:33 AM, Alexander Korotkov wrote:
> >> > >>
> >> > >>> Please, find rebased patch in the attachment.
> >> > >>>
> >> > >>
> >> > >> I had a quick look at this.
> >> > >>
> >> > >
> >> > > <...>
> >> > >
> >> > > According to 'perf', 85% of the CPU time is spent in
> >ExecCopySlot(). To
> >> > >> alleviate that, it might be worthwhile to add a special case for
> >when
> >> > >> the group contains exactly one group, and not put the tuple to
> >the
> >> > >> tuplesort in that case. Or if we cannot ensure that the
> >Incremental
> >> Sort
> >> > >> is actually faster, the cost model should probably be smarter,
> >to
> >> avoid
> >> > >> picking an incremental sort when it's not a win.
> >> > >>
> >> > >
> >> > > This thread has been idle for over a week.  Please respond with a
> >new
> >> > > patch by 2017-03-30 00:00 AoE (UTC-12) or this submission will be
> >> marked
> >> > > "Returned with Feedback".
> >>
> >> > Thank you for reminder!
> >>
> >> I've just done so.  Please resubmit once updated, it's a cool
> >feature.
> >>
> >
> >Thank you!
> >I already sent version of patch after David's reminder.
> >Please find rebased patch in the attachment.
>
> Cool. I think that's still a bit late for v10?
>

I don't know.  ISTM, that I addressed all the issues raised by reviewers.
Also, this patch is pending since late 2013.  It would be very nice to
finally get it in...

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] [PATCH] Incremental sort

2017-04-03 Thread Andres Freund


On April 3, 2017 12:03:56 PM PDT, Alexander Korotkov 
 wrote:
>On Mon, Apr 3, 2017 at 9:34 PM, Andres Freund 
>wrote:
>
>> On 2017-03-29 00:17:02 +0300, Alexander Korotkov wrote:
>> > On Tue, Mar 28, 2017 at 5:27 PM, David Steele 
>> wrote:
>> > > On 3/20/17 10:19 AM, Heikki Linnakangas wrote:
>> > >
>> > >> On 03/20/2017 11:33 AM, Alexander Korotkov wrote:
>> > >>
>> > >>> Please, find rebased patch in the attachment.
>> > >>>
>> > >>
>> > >> I had a quick look at this.
>> > >>
>> > >
>> > > <...>
>> > >
>> > > According to 'perf', 85% of the CPU time is spent in
>ExecCopySlot(). To
>> > >> alleviate that, it might be worthwhile to add a special case for
>when
>> > >> the group contains exactly one group, and not put the tuple to
>the
>> > >> tuplesort in that case. Or if we cannot ensure that the
>Incremental
>> Sort
>> > >> is actually faster, the cost model should probably be smarter,
>to
>> avoid
>> > >> picking an incremental sort when it's not a win.
>> > >>
>> > >
>> > > This thread has been idle for over a week.  Please respond with a
>new
>> > > patch by 2017-03-30 00:00 AoE (UTC-12) or this submission will be
>> marked
>> > > "Returned with Feedback".
>>
>> > Thank you for reminder!
>>
>> I've just done so.  Please resubmit once updated, it's a cool
>feature.
>>
>
>Thank you!
>I already sent version of patch after David's reminder.
>Please find rebased patch in the attachment.

Cool. I think that's still a bit late for v10?

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Incremental sort

2017-04-03 Thread Andres Freund
On 2017-03-29 00:17:02 +0300, Alexander Korotkov wrote:
> On Tue, Mar 28, 2017 at 5:27 PM, David Steele  wrote:
> 
> > Hi Alexander,
> >
> > On 3/20/17 10:19 AM, Heikki Linnakangas wrote:
> >
> >> On 03/20/2017 11:33 AM, Alexander Korotkov wrote:
> >>
> >>> Please, find rebased patch in the attachment.
> >>>
> >>
> >> I had a quick look at this.
> >>
> >
> > <...>
> >
> > According to 'perf', 85% of the CPU time is spent in ExecCopySlot(). To
> >> alleviate that, it might be worthwhile to add a special case for when
> >> the group contains exactly one group, and not put the tuple to the
> >> tuplesort in that case. Or if we cannot ensure that the Incremental Sort
> >> is actually faster, the cost model should probably be smarter, to avoid
> >> picking an incremental sort when it's not a win.
> >>
> >
> > This thread has been idle for over a week.  Please respond with a new
> > patch by 2017-03-30 00:00 AoE (UTC-12) or this submission will be marked
> > "Returned with Feedback".

> Thank you for reminder!

I've just done so.  Please resubmit once updated, it's a cool feature.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Incremental sort

2017-03-28 Thread Alexander Korotkov
On Tue, Mar 28, 2017 at 5:27 PM, David Steele  wrote:

> Hi Alexander,
>
> On 3/20/17 10:19 AM, Heikki Linnakangas wrote:
>
>> On 03/20/2017 11:33 AM, Alexander Korotkov wrote:
>>
>>> Please, find rebased patch in the attachment.
>>>
>>
>> I had a quick look at this.
>>
>
> <...>
>
> According to 'perf', 85% of the CPU time is spent in ExecCopySlot(). To
>> alleviate that, it might be worthwhile to add a special case for when
>> the group contains exactly one group, and not put the tuple to the
>> tuplesort in that case. Or if we cannot ensure that the Incremental Sort
>> is actually faster, the cost model should probably be smarter, to avoid
>> picking an incremental sort when it's not a win.
>>
>
> This thread has been idle for over a week.  Please respond with a new
> patch by 2017-03-30 00:00 AoE (UTC-12) or this submission will be marked
> "Returned with Feedback".


Thank you for reminder!

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] [PATCH] Incremental sort

2017-03-28 Thread David Steele

Hi Alexander,

On 3/20/17 10:19 AM, Heikki Linnakangas wrote:

On 03/20/2017 11:33 AM, Alexander Korotkov wrote:

Please, find rebased patch in the attachment.


I had a quick look at this.


<...>


According to 'perf', 85% of the CPU time is spent in ExecCopySlot(). To
alleviate that, it might be worthwhile to add a special case for when
the group contains exactly one group, and not put the tuple to the
tuplesort in that case. Or if we cannot ensure that the Incremental Sort
is actually faster, the cost model should probably be smarter, to avoid
picking an incremental sort when it's not a win.


This thread has been idle for over a week.  Please respond with a new 
patch by 2017-03-30 00:00 AoE (UTC-12) or this submission will be marked 
"Returned with Feedback".


--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Incremental sort

2017-03-20 Thread Heikki Linnakangas

On 03/20/2017 11:33 AM, Alexander Korotkov wrote:

Please, find rebased patch in the attachment.


I had a quick look at this.

* I'd love to have an explanation of what an Incremental Sort is, in the 
file header comment for nodeIncrementalSort.c.


* I didn't understand the maxMem stuff in tuplesort.c. The comments 
there use the phrase "on-disk memory", which seems like an oxymoron. 
Also, "maximum status" seems weird, as it assumes that there's a natural 
order to the states.


* In the below example, the incremental sort is significantly slower 
than the Seq Scan + Sort you get otherwise:


create table foo (a int4, b int4, c int4);
insert into sorttest select g, g, g from generate_series(1, 100) g;
vacuum foo;
create index i_sorttest on sorttest (a, b, c);
set work_mem='100MB';


postgres=# explain select count(*) from (select * from sorttest order by 
a, c) as t;
  QUERY PLAN 


---
 Aggregate  (cost=138655.68..138655.69 rows=1 width=8)
   ->  Incremental Sort  (cost=610.99..124870.38 rows=1102824 width=12)
 Sort Key: sorttest.a, sorttest.c
 Presorted Key: sorttest.a
 ->  Index Only Scan using i_sorttest on sorttest 
(cost=0.43..53578.79 rows=1102824 width=12)

(5 rows)

Time: 0.409 ms
postgres=# select count(*) from (select * from sorttest order by a, c) as t;
  count
-
 100
(1 row)

Time: 387.091 ms


postgres=# explain select count(*) from (select * from sorttest order by 
a, c) as t;
  QUERY PLAN 


---
 Aggregate  (cost=130063.84..130063.85 rows=1 width=8)
   ->  Sort  (cost=115063.84..117563.84 rows=100 width=12)
 Sort Key: sorttest.a, sorttest.c
 ->  Seq Scan on sorttest  (cost=0.00..15406.00 rows=100 
width=12)

(4 rows)

Time: 0.345 ms
postgres=# select count(*) from (select * from sorttest order by a, c) as t;
  count
-
 100
(1 row)

Time: 231.668 ms

According to 'perf', 85% of the CPU time is spent in ExecCopySlot(). To 
alleviate that, it might be worthwhile to add a special case for when 
the group contains exactly one group, and not put the tuple to the 
tuplesort in that case. Or if we cannot ensure that the Incremental Sort 
is actually faster, the cost model should probably be smarter, to avoid 
picking an incremental sort when it's not a win.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Incremental sort (was: PoC: Partial sort)

2017-03-20 Thread Mithun Cy
On Mon, Feb 27, 2017 at 8:29 PM, Alexander Korotkov
 wrote:

This patch needs to be rebased.

1. It fails while applying as below

patching file src/test/regress/expected/sysviews.out
Hunk #1 FAILED at 70.
1 out of 1 hunk FAILED -- saving rejects to file
src/test/regress/expected/sysviews.out.rej
patching file src/test/regress/sql/inherit.sql

2. Also, there are compilation errors due to new commits.

-fwrapv -fexcess-precision=standard -O2 -I../../../../src/include
-D_GNU_SOURCE   -c -o createplan.o createplan.c
createplan.c: In function ‘create_gather_merge_plan’:
createplan.c:1510:11: warning: passing argument 3 of ‘make_sort’ makes
integer from pointer without a cast [enabled by default]
   gm_plan->nullsFirst);
   ^
createplan.c:235:14: note: expected ‘int’ but argument is of type ‘AttrNumber *’
 static Sort *make_sort(Plan *lefttree, int numCols, int skipCols,
  ^
createplan.c:1510:11: warning: passing argument 4 of ‘make_sort’ from
incompatible pointer type [enabled by default]
   gm_plan->nullsFirst);

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Incremental sort (was: PoC: Partial sort)

2017-02-19 Thread Robert Haas
On Sat, Feb 18, 2017 at 4:01 PM, Alexander Korotkov
 wrote:
> I decided to start new thread for this patch for following two reasons.
>  * It's renamed from "Partial sort" to "Incremental sort" per suggestion by
> Robert Haas [1].  New name much better characterizes the essence of
> algorithm.
>  * I think it's not PoC anymore.  Patch received several rounds of review
> and now it's in the pretty good shape.
>
> Attached revision of patch has following changes.
>  * According to review [1], two new path and plan nodes are responsible for
> incremental sort: IncSortPath and IncSort which are inherited from SortPath
> and Sort correspondingly.  That allowed to get rid of set of hacks with
> minimal code changes.
>  * According to review [1] and comment [2], previous tuple is stored in
> standalone tuple slot of SortState rather than just HeapTuple.
>  * New GUC parameter enable_incsort is introduced to control planner ability
> to choose incremental sort.
>  * Test of postgres_fdw with not pushed down cross join is corrected.  It
> appeared that with incremental sort such query is profitable to push down.
> I changed ORDER BY columns so that index couldn't be used.  I think this
> solution is more elegant than setting enable_incsort = off.

I usually advocate for spelling things out instead of abbreviating, so
I guess I'll stay true to form here and suggest that abbreviating
incremental to inc doesn't seem like a great idea.  Is that sort
incrementing, incremental, incredible, incautious, or incorporated?

The first hunk in the patch, a change in the postgres_fdw regression
test output, looks an awful lot like a bug: now the query that
formerly returned various different numbers is returning all zeroes.
It might not actually be a bug, because you've also changed the test
query (not sure why), but anyway the new regression test output that
is all zeroes seems less useful for catching bugs in, say, the
ordering of the results than the old output where the different rows
were different.

I don't know of any existing cases where the same executor file is
responsible for executing more than 1 different type of executor node.
I was imagining a more-complete separation of the new executor node.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Incremental sort (was: PoC: Partial sort)

2017-02-18 Thread Alexander Korotkov
Hi all!

I decided to start new thread for this patch for following two reasons.
 * It's renamed from "Partial sort" to "Incremental sort" per suggestion by
Robert Haas [1].  New name much better characterizes the essence of
algorithm.
 * I think it's not PoC anymore.  Patch received several rounds of review
and now it's in the pretty good shape.

Attached revision of patch has following changes.
 * According to review [1], two new path and plan nodes are responsible for
incremental sort: IncSortPath and IncSort which are inherited from SortPath
and Sort correspondingly.  That allowed to get rid of set of hacks with
minimal code changes.
 * According to review [1] and comment [2], previous tuple is stored in
standalone tuple slot of SortState rather than just HeapTuple.
 * New GUC parameter enable_incsort is introduced to control planner
ability to choose incremental sort.
 * Test of postgres_fdw with not pushed down cross join is corrected.  It
appeared that with incremental sort such query is profitable to push down.
I changed ORDER BY columns so that index couldn't be used.  I think this
solution is more elegant than setting enable_incsort = off.

Also patch has set of assorted code and comments improvements.

Links
1.
https://www.postgresql.org/message-id/CA+TgmoZapyHRm7NVyuyZ+yAV=u1a070bogre7pkgyraegr4...@mail.gmail.com
2.
https://www.postgresql.org/message-id/cam3swzql4yd2sndhemcgl0q2b2otdkuvv_l6zg_fcgoluwm...@mail.gmail.com

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


incremental-sort-1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers