Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2024-01-28 Thread Richard Guo
On Mon, Jan 29, 2024 at 11:20 AM vignesh C wrote: > On Mon, 29 Jan 2024 at 08:01, Richard Guo wrote: > > On Sat, Jan 27, 2024 at 10:08 AM vignesh C wrote: > >> I have changed the status of the commitfest entry to "Committed" as I > >> noticed the patch has already been committed. > > > > Well,

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2024-01-28 Thread vignesh C
On Mon, 29 Jan 2024 at 08:01, Richard Guo wrote: > > > On Sat, Jan 27, 2024 at 10:08 AM vignesh C wrote: >> >> On Mon, 8 Jan 2024 at 22:21, Tom Lane wrote: >> > >> > Richard Guo writes: >> > > On Sun, Jan 7, 2024 at 6:41 AM Tom Lane wrote: >> > >> Thanks for the report! I guess we need

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2024-01-28 Thread Richard Guo
On Sat, Jan 27, 2024 at 10:08 AM vignesh C wrote: > On Mon, 8 Jan 2024 at 22:21, Tom Lane wrote: > > > > Richard Guo writes: > > > On Sun, Jan 7, 2024 at 6:41 AM Tom Lane wrote: > > >> Thanks for the report! I guess we need something like the attached. > > > > > +1. > > > > Pushed, thanks

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2024-01-26 Thread vignesh C
On Mon, 8 Jan 2024 at 22:21, Tom Lane wrote: > > Richard Guo writes: > > On Sun, Jan 7, 2024 at 6:41 AM Tom Lane wrote: > >> Thanks for the report! I guess we need something like the attached. > > > +1. > > Pushed, thanks for looking at it. I have changed the status of the commitfest entry to

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2024-01-08 Thread Tom Lane
Richard Guo writes: > On Sun, Jan 7, 2024 at 6:41 AM Tom Lane wrote: >> Thanks for the report! I guess we need something like the attached. > +1. Pushed, thanks for looking at it. >> I'm surprised that this hasn't been noticed before; was the case >> really unreachable before? > It seems

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2024-01-08 Thread Richard Guo
On Sun, Jan 7, 2024 at 6:41 AM Tom Lane wrote: > Alexander Lakhin writes: > > Please look at the following query: > > CREATE TABLE t(i int); > > INSERT INTO t VALUES (1); > > VACUUM ANALYZE t; > > > WITH ir AS (INSERT INTO t VALUES (2) RETURNING i) > > SELECT * FROM ir WHERE i = 2; > > > which

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2024-01-06 Thread Tom Lane
Alexander Lakhin writes: > Please look at the following query: > CREATE TABLE t(i int); > INSERT INTO t VALUES (1); > VACUUM ANALYZE t; > WITH ir AS (INSERT INTO t VALUES (2) RETURNING i) > SELECT * FROM ir WHERE i = 2; > which produces ERROR:  no relation entry for relid 1 > starting from

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2024-01-06 Thread Alexander Lakhin
Hello Tom and Richard, 17.11.2023 22:42, Tom Lane wrote: OK. I pushed the patch after a bit more review: we can simplify things some more by using the subroot->parse querytree for all tests. After the previous refactoring, it wasn't buying us anything to do some initial tests with the raw

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-11-20 Thread Richard Guo
On Tue, Nov 21, 2023 at 1:46 AM Tom Lane wrote: > * Do we really need to use make_tlist_from_pathtarget? Why isn't > the tlist of the cteplan good enough (indeed, more so)? I think you are right. The cteplan->targetlist is built for the CTE's best path by build_path_tlist(), which is almost

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-11-20 Thread Tom Lane
Richard Guo writes: > On Fri, Nov 17, 2023 at 11:38 AM Tom Lane wrote: >> That line of argument also leads to the conclusion that it'd be >> okay to expose info about the ordering of the CTE result to the >> upper planner. > In the light of this conclusion, I had a go at propagating the

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-11-19 Thread Richard Guo
On Fri, Nov 17, 2023 at 11:38 AM Tom Lane wrote: > That line of argument also leads to the conclusion that it'd be > okay to expose info about the ordering of the CTE result to the > upper planner. This patch doesn't do that, and I'm not sufficiently > excited about the issue to go write some

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-11-17 Thread Tom Lane
Richard Guo writes: > On Fri, Nov 17, 2023 at 11:38 AM Tom Lane wrote: >> That line of argument also leads to the conclusion that it'd be >> okay to expose info about the ordering of the CTE result to the >> upper planner. This patch doesn't do that, and I'm not sufficiently >> excited about

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-11-16 Thread Richard Guo
On Fri, Nov 17, 2023 at 11:38 AM Tom Lane wrote: > That line of argument also leads to the conclusion that it'd be > okay to expose info about the ordering of the CTE result to the > upper planner. This patch doesn't do that, and I'm not sufficiently > excited about the issue to go write some

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-11-16 Thread Laurenz Albe
On Thu, 2023-11-16 at 22:38 -0500, Tom Lane wrote: > That line of argument also leads to the conclusion that it'd be > okay to expose info about the ordering of the CTE result to the > upper planner.  [...]  The fence is sort of one-way > in this line of thinking: information can propagate up to

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-11-16 Thread David G. Johnston
On Thursday, November 16, 2023, Tom Lane wrote: > > That line of argument also leads to the conclusion that it'd be > okay to expose info about the ordering of the CTE result to the > upper planner. This patch doesn't do that, and I'm not sufficiently > excited about the issue to go write some

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-11-16 Thread Tom Lane
Richard Guo writes: > On Fri, Nov 17, 2023 at 2:16 AM Tom Lane wrote: >> So you could argue that there's more to do here, but I'm hesitant >> to go further. Part of the point of MATERIALIZED is to be an >> optimization fence, so breaking down that fence is something to be >> wary of. Maybe we

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-11-16 Thread Richard Guo
On Fri, Nov 17, 2023 at 2:16 AM Tom Lane wrote: > So you could argue that there's more to do here, but I'm hesitant > to go further. Part of the point of MATERIALIZED is to be an > optimization fence, so breaking down that fence is something to be > wary of. Maybe we shouldn't even take this

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-11-16 Thread Tom Lane
Richard Guo writes: > I think the second plan (patched) makes more sense. In the first plan > (unpatched), the HashAggregate node actually does not reduce the the > number of rows because it groups by 'unique1', but planner does not know > that because it lacks statistics for Vars referencing

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-11-16 Thread Richard Guo
On Thu, Nov 9, 2023 at 6:45 AM Tom Lane wrote: > The existing RTE_SUBQUERY stanza has most of what we need for this, > so I experimented with extending that to also handle RTE_CTE. It > seems to work, though I soon found out that it needed tweaking for > the case where the CTE is

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-11-08 Thread Tom Lane
Jian Guo writes: > I found a new approach to fix this issue, which seems better, so I would like > to post another version of the patch here. The origin patch made the > assumption of the values of Vars from CTE must be unique, which could be very > wrong. This patch examines variables for

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-09-07 Thread Jian Guo
Hi Jian He, Thanks for fixing the compiler warnings, seems the CI used a little old compiler and complained: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] But later C standard have relaxed the requirements for this, ISO C99 and later standard allow

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-08-21 Thread Jian Guo
Sure, Tomas. Here is the PG Commitfest link: https://commitfest.postgresql.org/44/4510/ From: Tomas Vondra Sent: Monday, August 21, 2023 18:56 To: Jian Guo ; Hans Buschmann ; pgsql-hackers@lists.postgresql.org Cc: Zhenghua Lyu Subject: Re: Wrong rows

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-08-21 Thread Tomas Vondra
On 8/21/23 10:16, Jian Guo wrote: > Hi hackers, > > I found a new approach to fix this issue, which seems better, so I would > like to post another version of the patch here. The origin patch made > the assumption of the values of Vars from CTE must be unique, which > could be very wrong. This

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-08-21 Thread Jian Guo
Hi hackers, I found a new approach to fix this issue, which seems better, so I would like to post another version of the patch here. The origin patch made the assumption of the values of Vars from CTE must be unique, which could be very wrong. This patch examines variables for Vars inside CTE,

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-08-14 Thread Tomas Vondra
Hi, I haven't looked at the patch, but please add the patch to the next commit fest (2023-09), so that we don't lose track of it. See https://commitfest.postgresql.org regards Tomas On 8/14/23 13:12, Jian Guo wrote: > Hi hackers, > > I have written a patch to add stats info for Vars in

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-08-14 Thread Jian Guo
Hi hackers, I have written a patch to add stats info for Vars in CTEs. With this patch, the join size estimation on the upper of CTE scans became more accurate. In the function selfuncs.c:eqjoinsel it uses the number of the distinct values of the two join variables to estimate join size, and

Re: AW: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-02-09 Thread Tomas Vondra
> > FWIW I suggest you provide the data in a form that's easier to use (like > a working SQL script). More people are likely to look and help than when > they have to extract stuff from an e-mail, fill in missing pieces etc. > BTW if anyone wants to play with this, here are the SQL scripts I

Re: AW: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-02-09 Thread Tomas Vondra
On 2/9/23 10:03, Hans Buschmann wrote: > Hello Tomas, > > > Thank you for looking at. > > > First, I miscalculated the factor which should be about 50, not 500. Sorry. > > Then I want to show you the table definitions (simple, very similar, > ommited child_tables and additional indexes,

Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-02-08 Thread Tomas Vondra
On 2/8/23 14:55, Hans Buschmann wrote: > During data refactoring of our Application I encountered $subject when > joining 4 CTEs with left join or inner join. > > > 1. Background > > PG 15.1 on Windows x64 (OS seems no to have no meening here) > > > I try to collect data from 4 (analyzed) 

Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-02-08 Thread Hans Buschmann
During data refactoring of our Application I encountered $subject when joining 4 CTEs with left join or inner join. 1. Background PG 15.1 on Windows x64 (OS seems no to have no meening here) I try to collect data from 4 (analyzed) tables (up,li,in,ou) by grouping certain data (4 CTEs