On Mon, Jul 3, 2023 at 5:15 PM David Rowley wrote:
>
> On Fri, 30 Jun 2023 at 18:45, John Naylor
wrote:
> I looked over your patch and don't see anything to report aside from
> the unfinished/undecided part around the tiebreak function for
> tuplesort_begin_index_hash().
I went ahead and added
On Wed, Feb 15, 2023 at 11:23 AM John Naylor
wrote:
>
>
> On Tue, Feb 14, 2023 at 5:46 PM David Rowley wrote:
> >
> > I didn't do a detailed review of the sort patch, but I did wonder
> > about the use of the name "fallback" in the new functions. The
> > comment in the following snippet from qso
On Thu, Feb 16, 2023 at 10:03 AM David Rowley wrote:
> I suspect it's slower because the final sort must sort the entire
> array still without knowledge that portions of it are pre-sorted. It
> would be very interesting to improve this and do some additional work
> and keep track of the "memtups
On Thu, 16 Feb 2023 at 00:45, John Naylor wrote:
> Okay, here's a rerun including the sort hack, and it looks like incremental
> sort is only ahead with the smallest set, otherwise same or maybe slightly
> slower:
>
> HEAD:
>
> 4 ^ 8: latency average = 113.461 ms
> 5 ^ 8: latency average = 786.0
On Wed, Feb 15, 2023 at 3:02 PM David Rowley wrote:
>
> On Wed, 15 Feb 2023 at 17:23, John Naylor
wrote:
> > HEAD:
> >
> > 4 ^ 8: latency average = 113.976 ms
> > 5 ^ 8: latency average = 783.830 ms
> > 6 ^ 8: latency average = 3990.351 ms
> > 7 ^ 8: latency average = 15793.629 ms
> >
> > Skip re
I wrote:
> it might be worthwhile to "zoom in" with more measurements, but haven't
done that yet.
I've attached the script and image for 1 million / random / varying the mod
by quarter-log intervals. Unfortunately I didn't get as good results as
yesterday. Immediately going from mod 1 to mod 2, so
On Wed, 15 Feb 2023 at 17:23, John Naylor wrote:
> HEAD:
>
> 4 ^ 8: latency average = 113.976 ms
> 5 ^ 8: latency average = 783.830 ms
> 6 ^ 8: latency average = 3990.351 ms
> 7 ^ 8: latency average = 15793.629 ms
>
> Skip rechecking first key:
>
> 4 ^ 8: latency average = 107.028 ms
> 5 ^ 8: late
On Tue, Feb 14, 2023 at 5:46 PM David Rowley wrote:
>
> I didn't do a detailed review of the sort patch, but I did wonder
> about the use of the name "fallback" in the new functions. The
> comment in the following snippet from qsort_tuple_unsigned_compare()
> makes me think "tiebreak" is a better
On Tue, 14 Feb 2023 at 17:21, John Naylor wrote:
> Not rechecking seems to eliminate the regression in 4 cases, and reduce it in
> the other 2 cases. For those 2 cases (10e6 rows, random, mod 10 and 100), it
> might be worthwhile to "zoom in" with more measurements, but haven't done
> that yet.
On Thu, Jan 26, 2023 at 9:11 AM David Rowley wrote:
>
> I'm unsure if 69749243 might be partially to blame here as it favours
> single-key sorts. If you look at qsort_tuple_signed_compare(), you'll
> see that the tiebreak function will be called only when it's needed
> and there are > 1 sort keys
> On 30/01/23 11:01, John Naylor wrote:
> Since David referred to L3 size as the starting point of a possible
configuration parameter, that's actually cache-conscious.
Okay, makes sense. I am correcting error on my part.
> I'm not close enough to this thread to guess at the right direction
On Sun, Jan 29, 2023 at 7:05 PM Ankit Kumar Pandey
wrote:
>
>
> > On 26/01/23 07:40, David Rowley wrote:
> > Sorting in smaller batches that better fit into CPU cache.
>
> More reading yielded that we are looking for cache-oblivious
> sorting algorithm.
Since David referred to L3 size as the sta
> On 28/01/23 14:31, John Naylor wrote:
> I recently brought up this older thread (mostly about #1), partly
because of the issues discovered above, and partly because I hope to
make progress on it before feature freeze (likely early April):
>
https://www.postgresql.org/message-id/CAApHDvqXcmz
On Sat, Jan 28, 2023 at 3:21 PM Ankit Kumar Pandey
wrote:
>
> > On 26/01/23 07:40, David Rowley wrote:
> > We might want to look at 1) Expanding
> > on what 69749243 did and considering if we want sort specialisations
> > that are specifically for 1 column and another set for multi-columns.
> > T
On 26/01/23 07:40, David Rowley wrote:
You can see from the results that the patched version is not looking
particularly great. I did a 1 million row test and a 10 million row
test. work_mem was 4GB for each, so the sorts never went to disk.
Yes, its lackluster gains are very limited (pret
I think more benchmarking is required
so we can figure out if this is a corner case or a common case
I did some more benchmarks:
#1. AIM: Pushdown column whose size is very high
create table test(a int, b int, c text);
insert into test select a,b,c from generate_series(1,1000)a,
generate_s
On 19/01/23 08:58, David Rowley wrote:
The problem is that the next item looked at is 1 and the value 2 is skipped.
Okay, I see the issue.
I think you are misinterpreting the results, but the main point
remains - it's slower. The explain analyze timing shows the time
between outputting
On Thu, 19 Jan 2023 at 06:27, Ankit Kumar Pandey wrote:
> Hmm, not really sure why did I miss that. I tried this again (added
> following in postgres.c above
>
> PortalStart)
>
> List* l = NIL;
> l = lappend(l, 1);
> l = lappend(l, 2);
> l = lappend(l, 3);
> l = lappend(l, 4);
>
> ListCell *lc;
>
On 18/01/23 15:12, David Rowley wrote:
I also thought I'd better test that foreach_delete_current() works
with foreach_reverse(). I can confirm that it *does not* work
correctly. I guess maybe you only tested the fact that it deleted the
current item and not that the subsequent loop correct
On Tue, 10 Jan 2023 at 06:15, Ankit Kumar Pandey wrote:
>
>
> > On 09/01/23 17:53, David Rowley wrote:
> > I gave some thought to whether doing foreach_delete_current() is safe
> > within a foreach_reverse() loop. I didn't try it, but I couldn't see
> > any reason why not. It is pretty late here
On 16/01/23 09:48, David Rowley wrote:
I don't think we should touch this. It could significantly increase
the number of indexes that we consider when generating paths on base
relations and therefore *significantly* increase the number of paths
we consider during the join search. What I had i
On Mon, 16 Jan 2023 at 06:52, Ankit Kumar Pandey wrote:
> Hence, input_rel->pathlist returns null for select distinct b,a from ab
> where a < 10; even if index is created on a.
>
> In order to tackle this, I have added index_pathkeys in indexpath node
> itself.
I don't think we should touch this.
On 13/01/23 07:48, David Rowley wrote:
It would just care if the
pathkeys were present and return a list of pathkeys not contained so
that an incremental sort could be done only on the returned list and a
Unique on an empty returned list.
In create_final_distinct_paths, presorted keys are
On 13/01/23 07:48, David Rowley wrote:
I don't think you can claim that one so easily. The two should have
quite different scaling characteristics which will be more evident
with a larger number of input rows. Also, Hash Aggregate makes use of
work_mem * hash_mem_multiplier, whereas sort us
On Wed, 11 Jan 2023 at 19:21, Ankit Kumar Pandey wrote:
> HashAgg has better cost than Unique even with incremental sort (tried
> with other case
>
> where we have more columns pushed down but still hashAgg wins).
I don't think you can claim that one so easily. The two should have
quite differen
On 11/01/23 06:18, David Rowley wrote:
Not sure if we should be trying to improve that in this patch. I just
wanted to identify it as something else that perhaps could be done.
This could be within reach but still original problem of having hashagg
removing
any gains from this remains.
On Tue, 10 Jan 2023 at 18:36, Tom Lane wrote:
>
> David Rowley writes:
> > Ideally, our sort costing would just be better, but I think that
> > raises the bar a little too high to start thinking of making
> > improvements to that for this patch.
>
> It's trickier than it looks, cf f4c7c410e. But
On Wed, 11 Jan 2023 at 08:17, Ankit Kumar Pandey wrote:
>
>
> > On 10/01/23 10:53, David Rowley wrote:
>
> > the total cost is the same for both of these, but the execution time
> > seems to vary quite a bit.
>
> This is really weird, I tried it different ways (to rule out any issues
> due to
>
>
On 10/01/23 10:53, David Rowley wrote:
the total cost is the same for both of these, but the execution time
seems to vary quite a bit.
This is really weird, I tried it different ways (to rule out any issues
due to
caching) and execution time varied in spite of having same cost.
Maybe
On 10/01/23 10:53, David Rowley wrote:
On Tue, 10 Jan 2023 at 06:15, Ankit Kumar Pandey wrote:
> Do we have any pending items for this patch now?
I'm just wondering if not trying this when the query has a DISTINCT
clause is a copout. What I wanted to avoid was doing additional
sorting wor
David Rowley writes:
> Ideally, our sort costing would just be better, but I think that
> raises the bar a little too high to start thinking of making
> improvements to that for this patch.
It's trickier than it looks, cf f4c7c410e. But if you just want
to add a small correction based on number
On Tue, 10 Jan 2023 at 06:15, Ankit Kumar Pandey wrote:
> Do we have any pending items for this patch now?
I'm just wondering if not trying this when the query has a DISTINCT
clause is a copout. What I wanted to avoid was doing additional
sorting work for WindowAgg just to have it destroyed by H
On 09/01/23 17:53, David Rowley wrote:
We need to keep looping backwards until we find the first WindowClause
which does not contain the pathkeys of the ORDER BY.
I found the cause of confusion, *first* WindowClause means first from
forward direction. Since we are looping from backward, I
On Mon, 9 Jan 2023 at 21:34, Ankit Kumar Pandey wrote:
>
>
> > On 09/01/23 03:48, David Rowley wrote:
> > 3. I don't think you need to set "index" on every loop. Why not just
> set it to foreach_current_index(l) - 1; break;
>
> Consider this query
>
> EXPLAIN (COSTS OFF)
> SELECT empno,
>
On 09/01/23 03:48, David Rowley wrote:
On Mon, 9 Jan 2023 at 06:17, Ankit Kumar Pandey wrote:
I have addressed all points 1-6 in the attached patch.
A few more things:
1. You're still using the 'i' variable in the foreach loop.
foreach_current_index() will work.
2. I think the "in
On Mon, 9 Jan 2023 at 06:17, Ankit Kumar Pandey wrote:
> I have addressed all points 1-6 in the attached patch.
A few more things:
1. You're still using the 'i' variable in the foreach loop.
foreach_current_index() will work.
2. I think the "index" variable needs a better name. sort_pushdown_id
On Mon, 9 Jan 2023 at 05:06, Vik Fearing wrote:
> +EXPLAIN (COSTS OFF)
> +SELECT empno,
> + depname,
> + min(salary) OVER (PARTITION BY depname ORDER BY empno) depminsalary,
> + sum(salary) OVER (PARTITION BY depname) depsalary,
> + count(*) OVER (ORDER BY enroll_date DESC)
On 1/8/23 18:05, Ankit Kumar Pandey wrote:
On 08/01/23 21:36, Vik Fearing wrote:
On 1/8/23 11:21, Ankit Kumar Pandey wrote:
Please find attached patch with addressed issues mentioned before.
I am curious about this plan:
+-- ORDER BY's in multiple Window functions can be combined in
On 08/01/23 16:33, David Rowley wrote:
On Sun, 8 Jan 2023 at 23:21, Ankit Kumar Pandey wrote:
Please find attached patch with addressed issues mentioned before.
Here's a quick review:
1. You can use foreach_current_index(l) to obtain the index of the list element.
2. I'd rather see you
On 08/01/23 21:36, Vik Fearing wrote:
On 1/8/23 11:21, Ankit Kumar Pandey wrote:
Please find attached patch with addressed issues mentioned before.
I am curious about this plan:
+-- ORDER BY's in multiple Window functions can be combined into one
+-- if they are subset of QUERY's OR
On 1/8/23 11:21, Ankit Kumar Pandey wrote:
Please find attached patch with addressed issues mentioned before.
I am curious about this plan:
+-- ORDER BY's in multiple Window functions can be combined into one
+-- if they are subset of QUERY's ORDER BY
+EXPLAIN (COSTS OFF)
+SELECT empno,
+
On Sun, 8 Jan 2023 at 23:21, Ankit Kumar Pandey wrote:
> Please find attached patch with addressed issues mentioned before.
Here's a quick review:
1. You can use foreach_current_index(l) to obtain the index of the list element.
2. I'd rather see you looping backwards over the list in the first
Hi David,
Please find attached patch with addressed issues mentioned before.
Things resolved:
1. Correct position of window function from where order by push down can
happen
is determined, this fixes issue mentioned in case #1.
While writing test cases, I found that optimization do not hap
On 08/01/23 04:06, David Rowley wrote:
On Sun, 8 Jan 2023 at 05:45, Ankit Kumar Pandey wrote:
Attached patch with test cases.
I can look at this in a bit more detail if you find a way to fix the
case you mentioned earlier. i.e, push the sort down to the deepest
WindowAgg that has pathkeys
On 08/01/23 03:56, David Rowley wrote:
> (your email client still seems broken)
I am looking at this again, will be changing client for here onward.
You might need to have another loop before the foreach loop that loops
backwards through the WindowClauses and remembers the index of the
Windo
On Sun, 8 Jan 2023 at 05:45, Ankit Kumar Pandey wrote:
> Attached patch with test cases.
I can look at this in a bit more detail if you find a way to fix the
case you mentioned earlier. i.e, push the sort down to the deepest
WindowAgg that has pathkeys contained in the query's ORDER BY
pathkeys.
(your email client still seems broken)
On Sun, 8 Jan 2023 at 05:27, Ankit Kumar Pandey wrote:
>
>
> While writing test cases, I found that optimization do not happen for
> case #1
>
> (which is prime candidate for such operation) like
>
> EXPLAIN (COSTS OFF)
> SELECT empno,
> depname,
>
On Sun, 8 Jan 2023 at 01:52, Ankit Kumar Pandey wrote:
> I am just wondering though, why can we not do top-N sort
> in optimized version if we include limit clause? Is top-N sort is
> limited to non strict sorting or
> cases last operation before limit is sort? .
Maybe the sort bound can be pushe
On 07/01/23 21:57, Ankit Kumar Pandey wrote:
On 07/01/23 09:58, David Rowley wrote:
>
> The attached patch has no tests added. It's going to need some of
> those.
While writing test cases, I found that optimization do not happen for
case #1
(which is prime candidate for such operation) like
E
On 07/01/23 09:58, David Rowley wrote:
The attached patch has no tests added. It's going to need some of
those.
While writing test cases, I found that optimization do not happen for
case #1
(which is prime candidate for such operation) like
EXPLAIN (COSTS OFF)
SELECT empno,
depnam
On 07/01/23 17:28, David Rowley wrote:
Your email client seems to be adding additional vertical space to your
emails. I've removed the additional newlines in the quotes. Are you
able to fix the client so it does not do that?
I have adjusted my mail client, hope it is better now?
On Sun, 8 Jan
Your email client seems to be adding additional vertical space to your
emails. I've removed the additional newlines in the quotes. Are you
able to fix the client so it does not do that?
On Sun, 8 Jan 2023 at 00:10, Ankit Kumar Pandey wrote:
>
> On 07/01/23 09:58, David Rowley wrote:
> > You also
On 07/01/23 07:59, David Rowley wrote:
On Thu, 5 Jan 2023 at 04:11, Ankit Kumar Pandey wrote:
Attaching test cases for this (+ small change in doc).
Tested this in one of WIP branch where I had modified
select_active_windows and it failed
as expected.
Please let me know if something can be
Thanks for looking into this.
On 07/01/23 09:58, David Rowley wrote:
On Sat, 7 Jan 2023 at 02:11, Ankit Kumar Pandey wrote:
On 05/01/23 12:53, David Rowley wrote:
We *can* reuse Sorts where a more strict or equivalent sort order is
available. The question is how do we get the final WindowCla
On Sat, 7 Jan 2023 at 02:11, Ankit Kumar Pandey wrote:
> > On 05/01/23 12:53, David Rowley wrote:
> >>
> >> We *can* reuse Sorts where a more strict or equivalent sort order is
> >> available. The question is how do we get the final WindowClause to do
> >> something slightly more strict to save h
On Thu, 5 Jan 2023 at 04:11, Ankit Kumar Pandey wrote:
>
> Attaching test cases for this (+ small change in doc).
>
> Tested this in one of WIP branch where I had modified
> select_active_windows and it failed
>
> as expected.
>
> Please let me know if something can be improved in this.
Thanks fo
Sorry if multiple mails has been sent for this.
On 05/01/23 12:53, David Rowley wrote:
We *can* reuse Sorts where a more strict or equivalent sort order is
available. The question is how do we get the final WindowClause to do
something slightly more strict to save having to do anything for t
On Thu, 5 Jan 2023 at 19:14, Ankit Kumar Pandey wrote:
> We are already doing something like I mentioned.
>
> Consider this example:
>
> explain SELECT rank() OVER (ORDER BY a), count(*) OVER (ORDER BY a,b)
> FROM abcd;
> QUERY PLAN
> --
On 05/01/23 12:53, David Rowley wrote:
We *can* reuse Sorts where a more strict or equivalent sort order is
available. The question is how do we get the final WindowClause to do
something slightly more strict to save having to do anything for the
ORDER BY. One way you might think would be to
On Thu, 5 Jan 2023 at 16:12, Tom Lane wrote:
>
> David Rowley writes:
> > Additionally, it's also not that clear to me that sorting by more
> > columns in the sort below the WindowAgg would always be a win over
> > doing the final sort for the ORDER BY. What if the WHERE clause (that
> > could n
On 05/01/23 07:48, Vik Fearing wrote:
On 1/4/23 13:07, Ankit Kumar Pandey wrote:
Also, one thing, consider the following query:
explain analyze select row_number() over (order by a,b),count(*) over
(order by a) from abcd order by a,b,c;
In this case, sorting is done on (a,b) followed by in
David Rowley writes:
> Additionally, it's also not that clear to me that sorting by more
> columns in the sort below the WindowAgg would always be a win over
> doing the final sort for the ORDER BY. What if the WHERE clause (that
> could not be pushed down before a join) filtered out the vast maj
On Thu, 5 Jan 2023 at 15:18, Vik Fearing wrote:
>
> On 1/4/23 13:07, Ankit Kumar Pandey wrote:
> > Also, one thing, consider the following query:
> >
> > explain analyze select row_number() over (order by a,b),count(*) over
> > (order by a) from abcd order by a,b,c;
> >
> > In this case, sorting i
Vik Fearing writes:
> On 1/4/23 13:07, Ankit Kumar Pandey wrote:
>> Also, one thing, consider the following query:
>> explain analyze select row_number() over (order by a,b),count(*) over
>> (order by a) from abcd order by a,b,c;
>> In this case, sorting is done on (a,b) followed by incremental s
On 1/4/23 13:07, Ankit Kumar Pandey wrote:
Also, one thing, consider the following query:
explain analyze select row_number() over (order by a,b),count(*) over
(order by a) from abcd order by a,b,c;
In this case, sorting is done on (a,b) followed by incremental sort on c
at final stage.
If
Attaching test cases for this (+ small change in doc).
Tested this in one of WIP branch where I had modified
select_active_windows and it failed
as expected.
Please let me know if something can be improved in this.
Regards,
Ankit Kumar Pandey
From 7647759eb92e1a0560bcff73b4169be8694f83d8 Mo
On 04/01/23 09:32, David Rowley wrote:
It looks like that works by accident. I see no mention of this either
in the comments or in [1].
This kind of troubles me because function name
/select_active_windows///doesn't tell me if its only job is
to reorder window clauses for optimizing sort.
On Wed, 4 Jan 2023 at 03:11, Ankit Kumar Pandey wrote:
> #2. If order by clause in the Window function is superset of order by in query
>
> explain analyze select a,row_number() over (order by a,b,c),count(*) over
> (order by a,b) from abcd order by a;
>
>
Hi,
On 03/01/23 08:21, David Rowley wrote:
I do think you'll likely want to put any WindowClauses which have
pathkeys which are a true subset or true superset of the ORDER BY /
DISTINCT pathkeys last. If they're a superset then we won't need to
perform any additional ordering for the DISTINCT
On 03/01/23 08:21, David Rowley wrote:
On Mon, 26 Dec 2022 at 02:04, Ankit Kumar Pandey wrote:
Point #1
In the above query Oracle 10g performs 2 sorts, DB2 and Sybase perform 3
sorts. We also perform 3.
This shouldn't be too hard to do. See the code in
select_active_windows(). You'll likel
On Mon, 26 Dec 2022 at 02:04, Ankit Kumar Pandey wrote:
> Point #1
>
> In the above query Oracle 10g performs 2 sorts, DB2 and Sybase perform 3
> sorts. We also perform 3.
This shouldn't be too hard to do. See the code in
select_active_windows(). You'll likely want to pay attention to the
DISTIN
71 matches
Mail list logo