Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2023-01-18 Thread Tom Lane
Dean Rasheed writes: > I think the behaviour of an ORDER BY in the query can also be pretty > surprising. Indeed. The fundamental question is this: in > SELECT ARRAY[random(), random(), random()] > FROM generate_series(1, 3) > ORDER BY random(); are those four occurrences of random() supposed

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2023-01-18 Thread Dean Rasheed
On Wed, 18 Jan 2023 at 09:49, David Rowley wrote: > > On Wed, 18 Jan 2023 at 22:37, Richard Guo wrote: > > I'm still confused that when the same scenario happens with ORDER BY in > > an aggregate function, like in query 1, the result is different in that > > we would get an unsorted output. > >

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2023-01-18 Thread David Rowley
On Wed, 18 Jan 2023 at 22:37, Richard Guo wrote: > I'm still confused that when the same scenario happens with ORDER BY in > an aggregate function, like in query 1, the result is different in that > we would get an unsorted output. > > I wonder if we should avoid this inconsistent behavior. It

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2023-01-18 Thread Richard Guo
On Tue, Jan 17, 2023 at 3:05 PM Tom Lane wrote: > Richard Guo writes: > > BTW, I wonder if we should have checked CoercionForm before > > fix_upper_expr_mutator steps into CoerceViaIO->arg to adjust the expr > > there. > > I will just quote what it says in primnodes.h: > > * NB: equal()

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2023-01-16 Thread Tom Lane
Richard Guo writes: > BTW, I wonder if we should have checked CoercionForm before > fix_upper_expr_mutator steps into CoerceViaIO->arg to adjust the expr > there. I will just quote what it says in primnodes.h: * NB: equal() ignores CoercionForm fields, therefore this *must* not carry * any

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2023-01-16 Thread Richard Guo
On Tue, Jan 17, 2023 at 11:39 AM David Rowley wrote: > On Tue, 17 Jan 2023 at 13:16, Dean Rasheed > wrote: > > I took a look at this, and I agree that the best solution is probably > > to have make_pathkeys_for_groupagg() ignore Aggrefs that contain > > volatile functions. > > Thanks for giving

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2023-01-16 Thread David Rowley
On Tue, 17 Jan 2023 at 13:16, Dean Rasheed wrote: > > On Wed, 11 Jan 2023 at 05:24, David Rowley wrote: > > > > I'm wondering if 1349d279 should have just never opted to presort > > Aggrefs which have volatile functions so that the existing behaviour > > of unordered output is given always and

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2023-01-16 Thread Dean Rasheed
On Wed, 11 Jan 2023 at 05:24, David Rowley wrote: > > I'm wondering if 1349d279 should have just never opted to presort > Aggrefs which have volatile functions so that the existing behaviour > of unordered output is given always and nobody is fooled into thinking > this works correctly only to be

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2023-01-11 Thread Richard Guo
On Wed, Jan 11, 2023 at 12:13 PM David Rowley wrote: > postgres=# set enable_presorted_aggregate=0; > SET > postgres=# select string_agg(random()::text, ',' order by random()) > from generate_series(1,3); > string_agg >

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2023-01-10 Thread David Rowley
On Wed, 11 Jan 2023 at 17:32, Tom Lane wrote: > > David Rowley writes: > > I think whatever the fix is here, we should likely ensure that the > > results are consistent regardless of which Aggrefs are the presorted > > ones. Perhaps the easiest way to do that, and to ensure we call the > >

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2023-01-10 Thread Tom Lane
David Rowley writes: > I think whatever the fix is here, we should likely ensure that the > results are consistent regardless of which Aggrefs are the presorted > ones. Perhaps the easiest way to do that, and to ensure we call the > volatile functions are called the same number of times would

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2023-01-10 Thread David Rowley
On Wed, 11 Jan 2023 at 15:46, Richard Guo wrote: > However the scan/join plan's > tlist does not contain random(), which I think we need to fix. I was wondering if that's true and considered that we don't want to evaluate random() for the sort then again when doing the aggregate transitions, but

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2023-01-10 Thread Richard Guo
On Tue, Jan 10, 2023 at 6:12 PM Dean Rasheed wrote: > While doing some random testing, I noticed that the following is broken in > HEAD: > > SELECT COUNT(DISTINCT random()) FROM generate_series(1,10); > > ERROR: ORDER/GROUP BY expression not found in targetlist > > It appears to have been

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2023-01-10 Thread Dean Rasheed
While doing some random testing, I noticed that the following is broken in HEAD: SELECT COUNT(DISTINCT random()) FROM generate_series(1,10); ERROR: ORDER/GROUP BY expression not found in targetlist It appears to have been broken by 1349d279, though I haven't looked at the details. I'm

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-12-15 Thread David Rowley
On Fri, 16 Dec 2022 at 00:10, David Rowley wrote: > v3-0002 removes the 1.5 x cost pessimism from incremental sort and > also rewrites how we make incremental sort paths. I've now gone > through the remaining places where we create an incremental sort path > to give all those the same treatment

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-12-15 Thread David Rowley
On Tue, 13 Dec 2022 at 20:53, David Rowley wrote: > I think what we need to do is: Do our best to give incremental sort > the most realistic costs we can and accept that it might choose a > worse plan in some cases. Users can turn it off if they really have no > other means to convince the

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-12-12 Thread David Rowley
On Wed, 9 Nov 2022 at 14:58, David Rowley wrote: > v2 attached. I've been looking at this again and this time around understand why the * 1.5 pessimism factor was included in the incremental sort code. If we create a table with a very large skew in the number of rows per what will be our

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-11-08 Thread David Rowley
On Tue, 8 Nov 2022 at 19:51, Richard Guo wrote: > For unsorted paths, the original logic here is to explicitly add a Sort > path only for the cheapest-total path. This patch changes that and may > add a Sort path for other paths besides the cheapest-total path. I > think this may introduce in

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-11-08 Thread Ronan Dunklau
Le mardi 8 novembre 2022, 02:31:12 CET David Rowley a écrit : > 1. Adjusts add_paths_to_grouping_rel so that we don't add a Sort path > when we can add an Incremental Sort path instead. This removes quite a > few redundant lines of code. This seems sensible > 2. Removes the * 1.5 fuzz-factor in

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-11-07 Thread Richard Guo
On Tue, Nov 8, 2022 at 9:31 AM David Rowley wrote: > I've been playing around with the attached patch which does: > > 1. Adjusts add_paths_to_grouping_rel so that we don't add a Sort path > when we can add an Incremental Sort path instead. This removes quite a > few redundant lines of code.

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-11-07 Thread Pavel Luzanov
On 08.11.2022 04:31, David Rowley wrote: I've been playing around with the attached patch which does: 1. Adjusts add_paths_to_grouping_rel so that we don't add a Sort path when we can add an Incremental Sort path instead. This removes quite a few redundant lines of code. 2. Removes the * 1.5

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-11-07 Thread David Rowley
On Tue, 8 Nov 2022 at 03:53, Ronan Dunklau wrote: > I can't see why an incrementalsort could be more expensive than a full sort, > using the same presorted path. It looks to me that in that case we should > always prefer an incrementalsort. Maybe we should bound incremental sorts cost > to make

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-11-07 Thread Pavel Luzanov
On 07.11.2022 20:30, Ronan Dunklau wrote: What I meant here is that disabling seqscans, the planner still chooses a full sort over a partial sort. The underlying index is the same, it is just a matter of choosing a Sort node over an IncrementalSort node. This, I think, is wrong: I can't see how

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-11-07 Thread Ronan Dunklau
Le lundi 7 novembre 2022, 17:58:50 CET Pavel Luzanov a écrit : > > I can't see why an incrementalsort could be more expensive than a full > > sort, using the same presorted path. > > The only reason I can see is the number of buffers to read. In the plan > with incremental sort we read the whole

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-11-07 Thread Pavel Luzanov
Hi, On 07.11.2022 17:53, Ronan Dunklau wrote: In your exact use case, the combo incremental-sort + Index scan is evaluated to cost more than doing a full sort + seqscan. I think we can trace that back to incremental sort being pessimistic about it's performance. If you try the same query,

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-11-07 Thread Ronan Dunklau
Le samedi 5 novembre 2022, 09:51:23 CET Pavel Luzanov a écrit : > While playing with the patch I found a situation where the performance > may be degraded compared to previous versions. > > The test case below. > If you create a proper index for the query (a,c), version 16 wins. On my > notebook,

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-11-05 Thread Pavel Luzanov
Hello, While playing with the patch I found a situation where the performance may be degraded compared to previous versions. The test case below. If you create a proper index for the query (a,c), version 16 wins. On my notebook, the query runs ~50% faster. But if there is no index (a,c), but

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-08-17 Thread David Rowley
On Wed, 17 Aug 2022 at 13:57, Justin Pryzby wrote: > But in a few places, it removes the locally-computed group_pathkeys: > > - List *group_pathkeys = root->group_pathkeys; > I noticed because that creates a new shadow variable, which seems accidental. Thanks for the

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-08-16 Thread Justin Pryzby
On Tue, Aug 02, 2022 at 11:21:04PM +1200, David Rowley wrote: > I've now pushed the patch. I've not studied the patch at all. But in a few places, it removes the locally-computed group_pathkeys: - List *group_pathkeys = root->group_pathkeys; However it doesn't do

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-08-02 Thread David Rowley
On Wed, 3 Aug 2022 at 07:31, Zhihong Yu wrote: > On Tue, Aug 2, 2022 at 11:02 AM Zhihong Yu wrote: >> I was looking at the final patch and noticed that setno field in >> agg_presorted_distinctcheck struct is never used. > Looks like setoff field is not used either. Thanks for the report. It

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-08-02 Thread Zhihong Yu
On Tue, Aug 2, 2022 at 11:02 AM Zhihong Yu wrote: > >> Hi, David: > > I was looking at the final patch and noticed that setno field > in agg_presorted_distinctcheck struct is never used. > > Looks like it was copied from neighboring struct. > > Can you take a look at the patch ? > > Thanks > >>

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-08-02 Thread Zhihong Yu
> > > Hi, David: I was looking at the final patch and noticed that setno field in agg_presorted_distinctcheck struct is never used. Looks like it was copied from neighboring struct. Can you take a look at the patch ? Thanks > drop-setno-from-agg_presorted_distinctcheck.patch Description:

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-08-02 Thread David Rowley
On Wed, 3 Aug 2022 at 01:19, Tom Lane wrote: > > David Rowley writes: > > I chatted to Andres and Thomas about this last week and their view > > made me think it might not be quite as clear-cut as "just bump it up a > > bunch because it's ridiculously low" that I had in mind. They > > mentioned

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-08-02 Thread Tom Lane
David Rowley writes: > On Mon, 1 Aug 2022 at 03:49, Tom Lane wrote: >> Likewise, it might be >> better to fix DEFAULT_FDW_TUPLE_COST beforehand, to detangle what >> the effects of that are. > I chatted to Andres and Thomas about this last week and their view > made me think it might not be

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-08-02 Thread David Rowley
On Mon, 1 Aug 2022 at 03:49, Tom Lane wrote: > Are you going to push the other patch (adjusting > select_outer_pathkeys_for_merge) first, so that we can see the residual > plan changes that this patch creates? I'm not entirely comfortable > with the regression test changes as posted. Yes, I

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-07-31 Thread Tom Lane
David Rowley writes: > I'd like to take a serious look at pushing this patch on the first few > days of August, so if anyone is following along here that might have > objections, can you do so before then? Are you going to push the other patch (adjusting select_outer_pathkeys_for_merge) first,

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-07-28 Thread David Rowley
On Wed, 27 Jul 2022 at 15:16, Richard Guo wrote: > That makes sense. The patch looks in a good shape to me in this part. Thanks for giving it another look. I'm also quite happy with the patch now. The 2 plan changes are explained. I have a patch on another thread [1] for the change in the Merge

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-07-26 Thread Richard Guo
On Wed, Jul 27, 2022 at 6:46 AM David Rowley wrote: > On Tue, 26 Jul 2022 at 19:39, Richard Guo wrote: > > Also I'm wondering if it's possible to take into consideration the > > ordering indicated by existing indexes when determining the pathkeys. So > > that for the query below we can avoid

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-07-26 Thread David Rowley
On Tue, 26 Jul 2022 at 19:39, Richard Guo wrote: > Also I'm wondering if it's possible to take into consideration the > ordering indicated by existing indexes when determining the pathkeys. So > that for the query below we can avoid the Incremental Sort node if we > consider that there is an

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-07-26 Thread David Rowley
On Tue, 26 Jul 2022 at 12:01, Zhihong Yu wrote: > sort order the the planner chooses is simply : there is duplicate `the` I think the first "the" should be "that" > + /* mark this aggregate is covered by 'currpathkeys' */ > > is covered by -> as covered by I think it was

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-07-26 Thread Richard Guo
On Tue, Jul 26, 2022 at 7:38 AM David Rowley wrote: > On Fri, 22 Jul 2022 at 21:33, Richard Guo wrote: > > I can see this problem with > > the query below: > > > > select max(b order by b), max(a order by a) from t group by a; > > > > When processing the first aggregate, we compose the

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-07-25 Thread Zhihong Yu
On Mon, Jul 25, 2022 at 4:39 PM David Rowley wrote: > On Fri, 22 Jul 2022 at 21:33, Richard Guo wrote: > > I can see this problem with > > the query below: > > > > select max(b order by b), max(a order by a) from t group by a; > > > > When processing the first aggregate, we compose the

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-07-25 Thread David Rowley
On Fri, 22 Jul 2022 at 21:33, Richard Guo wrote: > I can see this problem with > the query below: > > select max(b order by b), max(a order by a) from t group by a; > > When processing the first aggregate, we compose the 'currpathkeys' as > {a, b} and mark this aggregate in 'aggindexes'. When

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-07-22 Thread Richard Guo
On Wed, Jul 20, 2022 at 1:27 PM David Rowley wrote: > I've been working on this patch again. There was a bit of work to do > to rebase it atop db0d67db2. The problem there was that since this > patch appends pathkeys to suit ORDER BY / DISTINCT aggregates to the > query's group_pathkeys,

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-07-21 Thread David Rowley
On Thu, 4 Nov 2021 at 20:59, Ronan Dunklau wrote: > I took some time to toy with this again. > > At first I thought that charging a discount in foreign grouping paths for > Aggref targets (since they are computed remotely) would be a good idea, > similar to what is done for the grouping keys. > >

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-07-19 Thread David Rowley
On Thu, 4 Nov 2021 at 20:59, Ronan Dunklau wrote: > I took some time to toy with this again. > > At first I thought that charging a discount in foreign grouping paths for > Aggref targets (since they are computed remotely) would be a good idea, > similar to what is done for the grouping keys.

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-04-06 Thread David Rowley
On Thu, 31 Mar 2022 at 06:36, Greg Stark wrote: > > This patch is now failing in the sqljson regression test. It looks > like it's just the ordering of the elements in json_arrayagg() calls > which may actually be a faulty test that needs more ORDER BY clauses > rather than any issues with the

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-03-30 Thread Greg Stark
This patch is now failing in the sqljson regression test. It looks like it's just the ordering of the elements in json_arrayagg() calls which may actually be a faulty test that needs more ORDER BY clauses rather than any issues with the code. Nonetheless it's something that needs to be addressed

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-11-04 Thread Ronan Dunklau
Le jeudi 22 juillet 2021, 10:42:49 CET Ronan Dunklau a écrit : > Le jeudi 22 juillet 2021, 09:38:50 CEST David Rowley a écrit : > > On Thu, 22 Jul 2021 at 02:01, Ronan Dunklau wrote: > > > I tested the 0001 patch against both HEAD and my proposed bugfix for > > > postgres_fdw. > > > > > > There

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-22 Thread Ronan Dunklau
Le jeudi 22 juillet 2021, 09:38:50 CEST David Rowley a écrit : > On Thu, 22 Jul 2021 at 02:01, Ronan Dunklau wrote: > > I tested the 0001 patch against both HEAD and my proposed bugfix for > > postgres_fdw. > > > > There is a problem that the ordered aggregate is not pushed down anymore. > > The

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-22 Thread David Rowley
On Thu, 22 Jul 2021 at 02:01, Ronan Dunklau wrote: > I tested the 0001 patch against both HEAD and my proposed bugfix for > postgres_fdw. > > There is a problem that the ordered aggregate is not pushed down anymore. The > underlying Sort node is correctly pushed down though. > > This comes from

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-21 Thread Ronan Dunklau
Le mercredi 21 juillet 2021, 04:52:43 CEST David Rowley a écrit : > Thanks for finding this. I've made a few changes to make this case > work as you describe. Please see attached v6 patches. > > Because I had to add additional code to take the GROUP BY pathkeys > into account when choosing the

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-20 Thread David Rowley
On Mon, 19 Jul 2021 at 18:32, Ronan Dunklau wrote: > regression=# explain select sum(unique1 order by ten, two), sum(unique1 order > by four), sum(unique1 order by two, four) from tenk2 group by ten; >QUERY PLAN >

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-20 Thread David Rowley
On Mon, 19 Jul 2021 at 18:32, Ronan Dunklau wrote: > It means the logic for appending the order by pathkeys to the existing group > by pathkeys would ideally need to remove the redundant group by keys from the > order by keys, considering this example: > > regression=# explain select sum(unique1

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-19 Thread Ronan Dunklau
> Looks like I did a sloppy job of that. I messed up the condition in > standard_qp_callback() that sets the ORDER BY aggregate pathkeys so > that it accidentally set them when there was an unsortable GROUP BY > clause, as highlighted by the postgres_fdw tests failing. I've now > added a comment

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-16 Thread David Rowley
On Fri, 16 Jul 2021 at 22:00, David Rowley wrote: > > On Fri, 16 Jul 2021 at 18:04, David Rowley wrote: > > What we maybe could consider instead would be to pick the first Aggref > > then look for the most sorted derivative of that then tally up the > > number of Aggrefs that can be sorted using

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-16 Thread David Rowley
On Fri, 16 Jul 2021 at 18:04, David Rowley wrote: > What we maybe could consider instead would be to pick the first Aggref > then look for the most sorted derivative of that then tally up the > number of Aggrefs that can be sorted using those pathkeys, then repeat > that process for the remaining

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-16 Thread David Rowley
On Fri, 16 Jul 2021 at 01:02, Ronan Dunklau wrote: > The approach of building a pathkey for the first order by we find, then > appending to it as needed seems sensible but I'm a bit worried about users > starting to rely on this as an optimization. Even if we don't document it, > people may start

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-15 Thread Ronan Dunklau
Le mardi 13 juillet 2021, 06:44:12 CEST David Rowley a écrit : > I've attached the updated patches. The approach of building a pathkey for the first order by we find, then appending to it as needed seems sensible but I'm a bit worried about users starting to rely on this as an optimization.

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-13 Thread Ranier Vilela
Em ter., 13 de jul. de 2021 às 22:15, David Rowley escreveu: > On Tue, 13 Jul 2021 at 23:45, Ranier Vilela wrote: > > The question not answered is if *argno* can '>=' that > pertrans->numTransInputs, > > before entering the loop? > > If *can*, the loop might be useless in that case. > > > >> >

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-13 Thread David Rowley
On Tue, 13 Jul 2021 at 23:45, Ranier Vilela wrote: > The question not answered is if *argno* can '>=' that > pertrans->numTransInputs, > before entering the loop? > If *can*, the loop might be useless in that case. > >> >> >> Note that we're doing argno++ inside the loop. > > Another question

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-13 Thread Ranier Vilela
Em ter., 13 de jul. de 2021 às 01:44, David Rowley escreveu: > Thanks for having a look at this. > > On Tue, 13 Jul 2021 at 11:04, Ranier Vilela wrote: > >> 0001 Adds planner support for ORDER BY aggregates. > > > > /* Normal transition function without ORDER BY / DISTINCT. */ > > Is it

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-12 Thread David Rowley
Thanks for having a look at this. On Tue, 13 Jul 2021 at 11:04, Ranier Vilela wrote: >> 0001 Adds planner support for ORDER BY aggregates. > > /* Normal transition function without ORDER BY / DISTINCT. */ > Is it possible to avoid entering to initialize args if 'argno >= >

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-12 Thread Ranier Vilela
Em seg., 12 de jul. de 2021 às 09:04, David Rowley escreveu: > On Sun, 13 Jun 2021 at 03:07, David Rowley wrote: > > > > Please find attached my WIP patch. It's WIP due to what I mentioned > > in the above paragraph and also because I've not bothered to add JIT > > support for the new

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-12 Thread David Rowley
On Sun, 13 Jun 2021 at 03:07, David Rowley wrote: > > Please find attached my WIP patch. It's WIP due to what I mentioned > in the above paragraph and also because I've not bothered to add JIT > support for the new expression evaluation steps. I've split this patch into two parts. 0001 Adds

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-06 Thread David Rowley
On Mon, 5 Jul 2021 at 18:38, Ronan Dunklau wrote: > I think the overhead occurs because in the ExecAgg case, we use the > tuplesort_*_datum API as an optimization when we have a single column as an > input, which the ExecSort code doesn't. Maybe it would be worth it to try to > use that API in

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-06 Thread Ronan Dunklau
> If you make a separate thread and CF entry, please CC me and add me as > a reviewer on the CF entry. Ok, I started a new thread and added it to the next CF: https:// commitfest.postgresql.org/34/3237/ -- Ronan Dunklau

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-05 Thread James Coleman
On Mon, Jul 5, 2021 at 8:08 AM Ronan Dunklau wrote: > > > Ok, I reproduced that case, just not using a group by: by adding the group > > by a sort node is added in both cases (master and your patch), except that > > with your patch we sort on both keys and that doesn't really incur a > >

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-05 Thread James Coleman
On Sat, Jun 12, 2021 at 11:07 AM David Rowley wrote: > > A few years ago I wrote a patch to implement the missing aggregate > combine functions for array_agg and string_agg [1]. In the end, the > patch was rejected due to some concern [2] that if we allow these > aggregates to run in parallel

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-05 Thread Ranier Vilela
Em seg., 5 de jul. de 2021 às 12:07, Ronan Dunklau escreveu: > Le lundi 5 juillet 2021, 16:51:59 CEST Ranier Vilela a écrit : > > >Please find attached a POC patch to do just that. > > > > > >The switch to the single-datum tuplesort is done when there is only one > > >attribute, it is byval (to

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-05 Thread Ronan Dunklau
Le lundi 5 juillet 2021, 16:51:59 CEST Ranier Vilela a écrit : > >Please find attached a POC patch to do just that. > > > >The switch to the single-datum tuplesort is done when there is only one > >attribute, it is byval (to avoid having to deal with copy of the > > references > > >everywhere)

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-05 Thread Ranier Vilela
>Please find attached a POC patch to do just that. >The switch to the single-datum tuplesort is done when there is only one >attribute, it is byval (to avoid having to deal with copy of the references >everywhere) and we are not in bound mode (to also avoid having to move things >around). Hi,

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-05 Thread Ronan Dunklau
> Ok, I reproduced that case, just not using a group by: by adding the group > by a sort node is added in both cases (master and your patch), except that > with your patch we sort on both keys and that doesn't really incur a > performance penalty. > > I think the overhead occurs because in the

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-05 Thread Ronan Dunklau
Le vendredi 2 juillet 2021, 10:39:44 CEST David Rowley a écrit : > On Fri, 2 Jul 2021 at 19:54, Ronan Dunklau wrote: > > I don't know if it's acceptable, but in the case where you add both an > > aggregate with an ORDER BY clause, and another aggregate without the > > clause, the output for the

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-02 Thread Tom Lane
Gavin Flower writes: > On 2/07/21 8:39 pm, David Rowley wrote: >> That's a good question. There was an argument in [1] that mentions >> that there might be a group of people who rely on aggregation being >> done in a certain order where they're not specifying an ORDER BY >> clause in the

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-02 Thread Gavin Flower
On 2/07/21 8:39 pm, David Rowley wrote: On Fri, 2 Jul 2021 at 19:54, Ronan Dunklau wrote: I don't know if it's acceptable, but in the case where you add both an aggregate with an ORDER BY clause, and another aggregate without the clause, the output for the unordered one will change and use the

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-02 Thread David Rowley
On Fri, 2 Jul 2021 at 19:54, Ronan Dunklau wrote: > I don't know if it's acceptable, but in the case where you add both an > aggregate with an ORDER BY clause, and another aggregate without the clause, > the output for the unordered one will change and use the same ordering, maybe > suprising the

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-02 Thread Ronan Dunklau
> > This allows us to give presorted input to both aggregates in the following > case: > > SELECT agg(a ORDER BY a),agg2(a ORDER BY a,b) ... > > but just the first agg in this one: > > SELECT agg(a ORDER BY a),agg2(a ORDER BY c) ... I don't know if it's acceptable, but in the case where you

Add proper planner support for ORDER BY / DISTINCT aggregates

2021-06-12 Thread David Rowley
A few years ago I wrote a patch to implement the missing aggregate combine functions for array_agg and string_agg [1]. In the end, the patch was rejected due to some concern [2] that if we allow these aggregates to run in parallel then it might mess up the order in which values are being