Re: Eager aggregation, take 3

2024-09-25 Thread Richard Guo
On Fri, Sep 13, 2024 at 3:48 PM Tender Wang wrote: > Since MERGE/SPLIT partition has been reverted, the tests *partition_merge* > and *partition_split* should be removed > from parallel_schedule. After doing the above, the 0002 patch can be applied. Yeah, that's what I need to do. Thanks Ric

Re: Eager aggregation, take 3

2024-09-25 Thread Richard Guo
On Wed, Sep 11, 2024 at 10:52 AM Tender Wang wrote: > 1. In make_one_rel(), we have the below codes: > /* > * Build grouped base relations for each base rel if possible. > */ > setup_base_grouped_rels(root); > > As far as I know, each base rel only has one grouped base relation, if > possible. >

Re: Eager aggregation, take 3

2024-09-24 Thread Richard Guo
On Thu, Sep 5, 2024 at 9:40 AM Tender Wang wrote: > 1. in setup_eager_aggregation(), before calling create_agg_clause_infos(), > it does > some checks if eager aggregation is available. Can we move those checks into > a function, > for example, can_eager_agg(), like can_partial_agg() does? We

Re: Eager aggregation, take 3

2024-09-24 Thread Richard Guo
On Wed, Aug 28, 2024 at 9:01 PM Robert Haas wrote: > On Tue, Aug 27, 2024 at 11:57 PM Tender Wang wrote: > > I haven't look all of them. I just pick few simple plan test(e.g. 19.sql, > > 45.sql). > > For example, 19.sql, eager agg pushdown doesn't get large gain, but a little > > performance reg

Re: Why don't we consider explicit Incremental Sort?

2024-09-23 Thread Richard Guo
On Mon, Sep 23, 2024 at 10:01 PM Tomas Vondra wrote: > You don't need any skew. Consider this perfectly uniform dataset: > > create table t (a int, b int); > insert into t select 10 * random(), 100 * random() > from generate_series(1,100) s(i); > create index on t (a); > vacuum analyze;

Re: Why don't we consider explicit Incremental Sort?

2024-09-22 Thread Richard Guo
On Sun, Sep 22, 2024 at 1:38 PM David Rowley wrote: > Just looking at the commit message: > > > The rationale is based on the assumption that incremental sort is > > always faster than full sort when there are presorted keys, a premise > > that has been applied in various parts of the code. This

Re: Why don't we consider explicit Incremental Sort?

2024-09-20 Thread Richard Guo
On Sun, Sep 15, 2024 at 6:01 AM Tomas Vondra wrote: > Hmmm ... I wasn't involved in that discussion and I haven't studied the > thread now, but this seems a bit weird to me. If the presorted keys have > low cardinality, can't the full sort be faster? > > Maybe it's not possible with the costing ch

Re: Why don't we consider explicit Incremental Sort?

2024-09-13 Thread Richard Guo
On Fri, Sep 13, 2024 at 7:51 PM Tomas Vondra wrote: > Sure, an incremental sort can improve things if things go well, no doubt > about that. But how significant can the improvement be, especially if we > need to sort everything? In your example it's ~15%, which is nice, and > maybe the benefits ca

Re: Why don't we consider explicit Incremental Sort?

2024-09-13 Thread Richard Guo
On Fri, Sep 13, 2024 at 7:35 PM Tomas Vondra wrote: > On 9/13/24 06:04, Richard Guo wrote: > > It seems to me that some parts of our code assume that, for a given > > input path that is partially ordered, an incremental sort is always > > preferable to a full sort (see com

Re: Why don't we consider explicit Incremental Sort?

2024-09-13 Thread Richard Guo
On Mon, Sep 9, 2024 at 6:22 PM Tomas Vondra wrote: > I have not thought about this particular case too much, but how likely > is it that mergejoin will win for this plan in practice? If we consider > a query that only needs a fraction of rows (say, thanks to LIMIT), > aren't we likely to pick a ne

Re: Why don't we consider explicit Incremental Sort?

2024-09-12 Thread Richard Guo
On Mon, Sep 9, 2024 at 6:22 PM Tomas Vondra wrote: > I think we intentionally added incremental sort ... incrementally ;-) Haha, right. > I think one challenge with this case is that create_mergejoin_plan > creates these Sort plans explicitly - it's not "pathified" so it doesn't > go through the

Allow pushdown of HAVING clauses with grouping sets

2024-09-10 Thread Richard Guo
In some cases, we may want to transfer a HAVING clause into WHERE in hopes of eliminating tuples before aggregation instead of after. Previously, we couldn't do this if there were any nonempty grouping sets, because we didn't have a way to tell if the HAVING clause referenced any columns that were

Re: Wrong results with grouping sets

2024-09-09 Thread Richard Guo
On Wed, Sep 4, 2024 at 9:16 AM Richard Guo wrote: > I'm seeking the possibility to push 0001 and 0002 sometime this month. > Please let me know if anyone thinks this is unreasonable. > > For 0003, it might be extended to remove all no-op PHVs except those > that ar

Why don't we consider explicit Incremental Sort?

2024-09-09 Thread Richard Guo
While looking at label_sort_with_costsize() due to another issue, I noticed that we build explicit Sort nodes but not explicit Incremental Sort nodes. I wonder why this is the case. It seems to me that Incremental Sorts are preferable in some cases. For example: create table t (a int, b int); c

Re: On disable_cost

2024-09-08 Thread Richard Guo
On Fri, Sep 6, 2024 at 5:27 PM Richard Guo wrote: > On Fri, Sep 6, 2024 at 5:00 PM Alexander Lakhin wrote: > > static void > > label_sort_with_costsize(PlannerInfo *root, Sort *plan, double limit_tuples) > > { > > ... > > cost_sort(&sort_path, r

Re: On disable_cost

2024-09-06 Thread Richard Guo
On Fri, Sep 6, 2024 at 5:27 PM Richard Guo wrote: > On Fri, Sep 6, 2024 at 5:00 PM Alexander Lakhin wrote: > > static void > > label_sort_with_costsize(PlannerInfo *root, Sort *plan, double limit_tuples) > (I'm a little surprised that this does not cause any plan diffs in

Re: On disable_cost

2024-09-06 Thread Richard Guo
On Fri, Sep 6, 2024 at 5:00 PM Alexander Lakhin wrote: > static void > label_sort_with_costsize(PlannerInfo *root, Sort *plan, double limit_tuples) > { > ... > cost_sort(&sort_path, root, NIL, >lefttree->total_cost, >plan->plan.disabled_nodes, >

Re: Support run-time partition pruning for hash join

2024-09-06 Thread Richard Guo
On Fri, Sep 6, 2024 at 9:22 AM David Rowley wrote: > Maybe instead of inventing a very pessimistic part prune Hash Join, it > might be better to make the above work without the LATERAL + OFFSET 0 > by creating the parameterized paths Seq Scan paths. That's going to be > an immense help when the no

Re: Inconsistency between try_mergejoin_path and create_mergejoin_plan

2024-09-03 Thread Richard Guo
On Tue, Sep 3, 2024 at 5:51 PM Richard Guo wrote: > The new test case fails starting from adf97c156, and we have to > install a hash opfamily and a hash function for the hacked int8alias1 > type to make the test case work again. > > Now, I'm more dubious about whether we reall

Re: Wrong results with grouping sets

2024-09-03 Thread Richard Guo
On Tue, Aug 6, 2024 at 4:17 PM Richard Guo wrote: > I fixed this issue in v13 by performing the replacement of GROUP Vars > after we've done with expression preprocessing on targetlist and > havingQual. An ensuing effect of this approach is that a HAVING > clause may contain exp

Re: Inconsistency between try_mergejoin_path and create_mergejoin_plan

2024-09-03 Thread Richard Guo
On Fri, Jul 26, 2024 at 3:56 PM Richard Guo wrote: > Do you think it works if we place this test in equivclass.sql and > write a comment explaining why it's there, like attached? Now I’m > also starting to wonder if this change actually warrants such a test. The new test case

Re: Remove no-op PlaceHolderVars

2024-09-03 Thread Richard Guo
On Tue, Sep 3, 2024 at 11:31 AM Tom Lane wrote: > Yeah. I've been mulling over how we could do this, and the real > problem is that the expression containing the PHV *has* been fully > preprocessed by the time we get to outer join strength reduction > (cf. file header comment in prepjointree.c).

Remove no-op PlaceHolderVars

2024-09-02 Thread Richard Guo
In [1] there was a short discussion about removing no-op PlaceHolderVars. This thread is for continuing that discussion. We may insert PlaceHolderVars when pulling up a subquery that is within the nullable side of an outer join: if subquery pullup needs to replace a subquery-referencing Var that

Re: Eager aggregation, take 3

2024-08-28 Thread Richard Guo
On Wed, Aug 28, 2024 at 9:01 PM Robert Haas wrote: > On Tue, Aug 27, 2024 at 11:57 PM Tender Wang wrote: > > I haven't look all of them. I just pick few simple plan test(e.g. 19.sql, > > 45.sql). > > For example, 19.sql, eager agg pushdown doesn't get large gain, but a little > > performance reg

Re: Eager aggregation, take 3

2024-08-28 Thread Richard Guo
On Wed, Aug 28, 2024 at 11:57 AM Tender Wang wrote: > Rectenly, I do some benchmark tests, mainly on tpch and tpcds. > tpch tests have no plan diff, so I do not continue to test on tpch. > tpcds(10GB) tests have 22 plan diff as below: > 4.sql, 5.sql, 8.sql,11.sql,19.sql,23.sql,31.sql, > 33.sql,39

Re: Eager aggregation, take 3

2024-08-28 Thread Richard Guo
On Fri, Aug 23, 2024 at 11:59 PM Robert Haas wrote: > Here are some initial, high-level thoughts about this patch set. Thank you for your review and feedback! It helps a lot in moving this work forward. > 1. As far as I can see, there's no real performance testing on this > thread. I expect tha

Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.

2024-08-28 Thread Richard Guo
On Thu, Aug 29, 2024 at 4:47 AM Tom Lane wrote: > In the meantime, I think this test case is mighty artificial, > and it wouldn't bother me any to just take it out again for the > time being. Yeah, I think we can remove the 't1.two+t2.two' test case if we go with your proposed patch to address th

Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.

2024-08-28 Thread Richard Guo
On Wed, Aug 28, 2024 at 7:58 AM David Rowley wrote: > On Wed, 28 Aug 2024 at 11:37, Tom Lane wrote: > > Oh, scratch that, I see you mean this is an additional way to do it > > not the only way to do it. But I'm confused why it works for > > t1.two+1 AS c1 > > but not > > t1.two+t

Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.

2024-08-28 Thread Richard Guo
On Wed, Aug 28, 2024 at 12:15 AM Tom Lane wrote: > If we > are willing to accept a HEAD-only fix, it'd likely be better to > attack the other end and make it possible to remove no-op PHVs. > I think that'd require marking PHVs that need to be kept because > they are serving to isolate subexpressio

Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.

2024-08-27 Thread Richard Guo
On Wed, Aug 28, 2024 at 11:30 AM Richard Guo wrote: > On Wed, Aug 28, 2024 at 5:52 AM Tom Lane wrote: > > I realized that actually we do have the mechanism for making that > > work: we could apply add_nulling_relids to the expression, if it > > meets those same conditi

Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.

2024-08-27 Thread Richard Guo
On Wed, Aug 28, 2024 at 5:52 AM Tom Lane wrote: > I realized that actually we do have the mechanism for making that > work: we could apply add_nulling_relids to the expression, if it > meets those same conditions. I think this should work, as long as we apply add_nulling_relids only to Vars/PHVs

Re: Redundant Result node

2024-08-26 Thread Richard Guo
On Thu, Aug 22, 2024 at 9:02 PM Ranier Vilela wrote: > Em qui., 22 de ago. de 2024 às 04:34, Richard Guo > escreveu: >> This does not seem right to me, as PathTargets are not canonical, so >> we cannot guarantee that two identical PathTargets will have the same >> poi

Re: Redundant Result node

2024-08-26 Thread Richard Guo
On Mon, Aug 26, 2024 at 8:58 PM Peter Eisentraut wrote: > On 23.08.24 10:27, Richard Guo wrote: > > Fair point. After looking at the code for a while, I believe it is > > sufficient to compare PathTarget.exprs after we've checked that the > > two targe

Re: Redundant Result node

2024-08-23 Thread Richard Guo
On Fri, Aug 23, 2024 at 11:56 AM Tom Lane wrote: > Richard Guo writes: > > I agree that it’s always desirable to postpone work from path-creation > > time to plan-creation time. In this case, however, it’s a little > > different. The projection step could actually be avoid

Re: Redundant Result node

2024-08-22 Thread Richard Guo
On Fri, Aug 23, 2024 at 11:19 AM Tom Lane wrote: > Richard Guo writes: > > ... we'll always make a separate ProjectionPath on top of the SortPath > > in create_ordered_paths. It’s only when we create the plan node for > > the projection step in createplan.c that we

Re: Redundant Result node

2024-08-22 Thread Richard Guo
On Thu, Aug 22, 2024 at 3:34 PM Richard Guo wrote: > /* Add projection step if needed */ > if (sorted_path->pathtarget != target) > sorted_path = apply_projection_to_path(root, ordered_rel, >sorted_path, target); &g

Re: Redundant Result node

2024-08-22 Thread Richard Guo
On Thu, Aug 22, 2024 at 8:03 PM David Rowley wrote: > On Thu, 22 Aug 2024 at 23:33, Peter Eisentraut wrote: > > > I wonder if we need to invent a function to compare two PathTargets. > > > > Wouldn't the normal node equal() work? > > It might. I think has_volatile_expr might be missing a > pg_nod

Redundant Result node

2024-08-22 Thread Richard Guo
I ran into a query plan where the Result node seems redundant to me: create table t (a int, b int, c int); insert into t select i%10, i%10, i%10 from generate_series(1,100)i; create index on t (a, b); analyze t; set enable_hashagg to off; set enable_seqscan to off; explain (verbose, costs off) s

Re: Small code simplification

2024-08-21 Thread Richard Guo
On Wed, Aug 21, 2024 at 6:25 PM Tender Wang wrote: >Oversight in 7ff9afbbd; I think we can do the same way for the > ATExecAddColumn(). LGTM. Pushed. Thanks Richard

Re: A problem about partitionwise join

2024-08-11 Thread Richard Guo
On Mon, Mar 25, 2024 at 7:09 PM Ashutosh Bapat wrote: > I think we need some way to avoid two different ways of looking up partition > keys - if we can't teach the EC machinery to produce clauses with partition > keys (always), we need to teach EC to contain partition keys in case of outer > jo

Re: A problem about partitionwise join

2024-08-11 Thread Richard Guo
On Sat, Aug 10, 2024 at 6:22 AM Alexey Dvoichenkov wrote: > I haven't read the entire thread so I might be missing something, but > one interesting consequence of this patch is that it kind of breaks > the initial pruning of generic plans. Given a query such as SELECT > ... WHERE A.PK = B.PK AND A

Re: Wrong results with grouping sets

2024-08-06 Thread Richard Guo
On Wed, Jun 5, 2024 at 5:42 PM Richard Guo wrote: > I found a bug in the v6 patch. The following query would trigger the > Assert in make_restrictinfo that the given subexpression should not be > an AND clause. > > select max(a) from t group by a > b and a = b having a > b a

Re: Wrong results with grouping sets

2024-08-02 Thread Richard Guo
I've been looking at cases where there are grouping-set keys that reduce to Consts, and I noticed a plan with v11 patch that is not very great. explain (verbose, costs off) select 1 as one group by rollup(one) order by one nulls first; QUERY PLAN --- Sort

Re: Query results vary depending on the plan cache mode used

2024-08-01 Thread Richard Guo
On Thu, Aug 1, 2024 at 10:34 PM Tom Lane wrote: > Richard Guo writes: > > While working on the grouping sets patches for queries with GROUP BY > > items that are constants, I noticed $subject on master. As an > > example, consider > > This particular example seems lik

Query results vary depending on the plan cache mode used

2024-08-01 Thread Richard Guo
While working on the grouping sets patches for queries with GROUP BY items that are constants, I noticed $subject on master. As an example, consider prepare q1(int) as select $1 as c1, $1 as c2 from generate_series(1,2) t group by rollup(c1); set plan_cache_mode to force_custom_plan; execute q1(

Re: Short-circuit sort_inner_and_outer if there are no mergejoin clauses

2024-07-30 Thread Richard Guo
On Fri, Apr 26, 2024 at 2:57 PM Richard Guo wrote: > I doubt that there is measurable performance improvement. But I found > that throughout the run of the regression tests, sort_inner_and_outer is > called a total of 44,424 times. Among these calls, there are 11,064 > ins

Re: Trivial revise for the check of parameterized partial paths

2024-07-30 Thread Richard Guo
On Thu, Jan 25, 2024 at 3:21 PM Richard Guo wrote: > On Sun, Jan 21, 2024 at 8:36 PM vignesh C wrote: >> I'm seeing that there has been no activity in this thread for nearly 7 >> months, I'm planning to close this in the current commitfest unless >> someone

Re: A problem about partitionwise join

2024-07-30 Thread Richard Guo
On Wed, May 8, 2024 at 5:01 PM Richard Guo wrote: > Below is what I got on my local machine. > > -- on master > > measurement | average | maximum | minimum | std_dev | > std_dev_as_perc_of_avg > ---+--+--+--+-+---

Re: Reuse child_relids in try_partitionwise_join was Re: Assert failure on bms_equal(child_joinrel->relids, child_joinrelids)

2024-07-28 Thread Richard Guo
On Fri, Jul 26, 2024 at 5:44 PM Richard Guo wrote: > I've worked a bit more on the comments and commit message, and I plan > to push the attached soon, barring any objections or comments. Pushed. Thanks Richard

Re: Simplify create_merge_append_path a bit for clarity

2024-07-28 Thread Richard Guo
On Fri, Jul 26, 2024 at 1:28 PM Paul A Jungwirth wrote: > Is there a reason you don't want to remove the required_outer > parameter altogether? I guess because it is such a common pattern to > pass it? I think it's best to keep this parameter unchanged to maintain consistency with other functions

Re: Reuse child_relids in try_partitionwise_join was Re: Assert failure on bms_equal(child_joinrel->relids, child_joinrelids)

2024-07-26 Thread Richard Guo
On Wed, Jul 24, 2024 at 3:30 PM Ashutosh Bapat wrote: > Done. Are you suggesting it for aesthetic purposes or there's > something else (read, less defragmentation, performance gain etc.)? I > am curious. I think it's just for readability. > PFA patches: > 0001 - same as previous one > 0002 - add

Re: Inconsistency between try_mergejoin_path and create_mergejoin_plan

2024-07-26 Thread Richard Guo
On Thu, Jul 25, 2024 at 12:07 AM Tom Lane wrote: > I took a brief look at this. I think the basic idea is sound, > but I have a couple of nits: Thank you for reviewing this patch! > * It's not entirely obvious that the checks preceding these additions > are sufficient to ensure that the clauses

Re: apply_scanjoin_target_to_paths and partitionwise join

2024-07-23 Thread Richard Guo
On Wed, May 22, 2024 at 3:57 PM Ashutosh Bapat wrote: > I will create patches for the back-branches once the patch for master is in a > committable state. AFAIU, this patch prevents apply_scanjoin_target_to_paths() from discarding old paths of partitioned joinrels. Therefore, we can retain non-

Re: Reuse child_relids in try_partitionwise_join was Re: Assert failure on bms_equal(child_joinrel->relids, child_joinrelids)

2024-07-23 Thread Richard Guo
On Wed, Jun 5, 2024 at 3:48 PM Ashutosh Bapat wrote: > This is different. But it needs a rebase. PFA rebased patch. The revised > commit message explains the change better. I looked through this patch and I think it is in a good shape. Some minor comments: * I think it's better to declare 'chi

Re: Redundant code in create_gather_merge_path

2024-07-22 Thread Richard Guo
On Thu, Jul 18, 2024 at 11:08 AM Richard Guo wrote: > On Thu, Jul 18, 2024 at 10:02 AM Richard Guo wrote: > > I noticed this while reviewing patch [1], thinking that it might be > > worth fixing. Any thoughts? > > Here is the patch. This patch is quite straightforward to

Re: Possible incorrect row estimation for Gather paths

2024-07-22 Thread Richard Guo
On Mon, Jul 22, 2024 at 5:55 PM Richard Guo wrote: > Otherwise I think the v3 patch looks good to me. > > Attached is an updated version of this patch with cosmetic changes and > comment updates. It also squishes the two patches together into one. > I'm planning to push

Re: Possible incorrect row estimation for Gather paths

2024-07-22 Thread Richard Guo
On Mon, Jul 22, 2024 at 4:47 PM Anthonin Bonnefoy wrote: > On Wed, Jul 17, 2024 at 3:59 AM Richard Guo wrote: > > I can reproduce this problem with the query below. > > > > explain (costs on) select * from tenk1 order by twenty; > >

Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

2024-07-21 Thread Richard Guo
On Fri, Jul 19, 2024 at 12:00 PM Alexander Lakhin wrote: > 18.07.2024 17:30, Richard Guo wrote: > > I have no idea why the underlying statistics changed, but it seems > > that this slight change is sufficent to result in a different plan. > > I think it could be caused by

Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

2024-07-18 Thread Richard Guo
On Thu, Jul 18, 2024 at 4:11 PM Richard Guo wrote: > On Thu, Jul 18, 2024 at 4:00 PM Alexander Lakhin wrote: > > Please look at a recent buildfarm failure [1], which shows some > > instability of that test addition: > > -- the joinrel is not parallel-safe due to the

Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

2024-07-18 Thread Richard Guo
On Thu, Jul 18, 2024 at 4:00 PM Alexander Lakhin wrote: > Please look at a recent buildfarm failure [1], which shows some > instability of that test addition: > -- the joinrel is not parallel-safe due to the OFFSET clause in the subquery > explain (costs off) > select * from tenk1 t1, (s

Re: Redundant code in create_gather_merge_path

2024-07-17 Thread Richard Guo
On Thu, Jul 18, 2024 at 10:02 AM Richard Guo wrote: > I noticed this while reviewing patch [1], thinking that it might be > worth fixing. Any thoughts? Here is the patch. Thanks Richard v1-0001-Remove-redundant-code-in-create_gather_merge_path.patch Description: Binary data

Redundant code in create_gather_merge_path

2024-07-17 Thread Richard Guo
In create_gather_merge_path, we should always guarantee that the subpath is adequately ordered, and we do not add a Sort node in createplan.c for a Gather Merge node. Therefore, the 'else' branch in the snippet from create_gather_merge_path is redundant. if (pathkeys_contained_in(pathkeys, su

Re: Wrong results with grouping sets

2024-07-17 Thread Richard Guo
On Thu, Jul 18, 2024 at 8:31 AM Richard Guo wrote: > I am confused. Does the SQL standard explicitly define or standardize > the behavior of grouping by volatile expressions? Does anyone know > about that? Just for the record, multiple instances of non-volatile grouping expressio

Re: Wrong results with grouping sets

2024-07-17 Thread Richard Guo
On Wed, Jul 17, 2024 at 8:50 AM Paul George wrote: > > Since a subquery is a volatile expression, each of its instances > should be evaluated separately. I don't think this conclusion is correct. Look at: select random(), random() from t group by random(); random | random ---

Re: Possible incorrect row estimation for Gather paths

2024-07-16 Thread Richard Guo
I can reproduce this problem with the query below. explain (costs on) select * from tenk1 order by twenty; QUERY PLAN - Gather Merge (cost=772.11..830.93 rows=5882 width=244) Wor

Re: Wrong results with grouping sets

2024-07-16 Thread Richard Guo
On Mon, Jul 15, 2024 at 4:38 PM Sutou Kouhei wrote: > I'm not familiar with related codes but here are my > comments: Thanks for reviewing this patchset! > +* Fields valid for a GROUP RTE (else NULL/zero): > > There is only one field and it's LIST. So how about using > the following? > >

Duplicate unique key values in inheritance tables

2024-07-15 Thread Richard Guo
I came across a query that returned incorrect results and I traced it down to being caused by duplicate unique key values in an inheritance table. As a simple example, consider create table p (a int primary key, b int); create table c () inherits (p); insert into p select 1, 1; insert into c sel

Re: Wrong results with grouping sets

2024-07-14 Thread Richard Guo
FWIW, in addition to fixing wrong result issues for queries with grouping sets, the changes in 0001 also improve performance for queries that have subqueries in the grouping expressions, because different instances of the same subquery would need to be executed only once. As a simple example, cons

Re: Check lateral references within PHVs for memoize cache keys

2024-07-14 Thread Richard Guo
On Thu, Jul 11, 2024 at 5:18 PM Richard Guo wrote: > Attached is an updated version of this patch. I've pushed this patch. Thanks for all the reviews. Thanks Richard

Re: Check lateral references within PHVs for memoize cache keys

2024-07-12 Thread Richard Guo
On Fri, Jul 12, 2024 at 11:18 AM Andrei Lepikhov wrote: > I'm not sure about stability of output format of AVG aggregate across > different platforms. Maybe better to return the result of comparison > between the AVG() and expected value? I don't think this is a problem. AFAIK we use AVG() a lot

Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

2024-07-11 Thread Richard Guo
On Sat, Jul 6, 2024 at 5:32 PM Richard Guo wrote: > Here is a new rebase. > > I'm planning to push it next week, barring any objections. Pushed. Thanks Richard

Re: Check lateral references within PHVs for memoize cache keys

2024-07-11 Thread Richard Guo
On Fri, Jun 28, 2024 at 10:14 PM Andrei Lepikhov wrote: > I have reviewed v7 of the patch. This improvement is good enough to be > applied, thought. Thank you for reviewing this patch! > Comment may be rewritten for clarity: > "Determine if the clauses in param_info and innerrel's lateral_vars"

Re: Eager aggregation, take 3

2024-07-10 Thread Richard Guo
On Sun, Jul 7, 2024 at 10:45 AM Paul George wrote: > Thanks for reviving this patch and for all of your work on it! Eager > aggregation pushdown will be beneficial for my work and I'm hoping to see it > land. Thanks for looking at this patch! > The output of both the original query and this on

Re: Wrong results with grouping sets

2024-07-09 Thread Richard Guo
On Sat, Jul 6, 2024 at 9:26 AM Richard Guo wrote: > On Thu, Jul 4, 2024 at 6:02 PM Ashutosh Bapat > wrote: > > I don't have any specific thoughts on backpatching, but I have started > > reviewing the patches. > Thanks for reviewing this patchset! Here is an update

Re: Wrong results with grouping sets

2024-07-09 Thread Richard Guo
On Sat, Jul 6, 2024 at 10:37 AM Tom Lane wrote: > Richard Guo writes: > > BTW, from catversion.h I read: > > > * Another common reason for a catversion update is a change in parsetree > > * external representation, since serialized parsetrees appear in stored > &

Re: report a typo in comments of ComputeXidHorizonsResult

2024-07-07 Thread Richard Guo
On Mon, Jul 8, 2024 at 7:49 AM Richard Guo wrote: > On Sun, Jul 7, 2024 at 5:43 PM Junwang Zhao wrote: > > I think the period here should be a typo. > > > > index 16b5803d388..af3b15e93df 100644 > > --- a/src/backend/storage/ipc/procarray.c > > +++ b/sr

Re: report a typo in comments of ComputeXidHorizonsResult

2024-07-07 Thread Richard Guo
On Sun, Jul 7, 2024 at 5:43 PM Junwang Zhao wrote: > I think the period here should be a typo. > > index 16b5803d388..af3b15e93df 100644 > --- a/src/backend/storage/ipc/procarray.c > +++ b/src/backend/storage/ipc/procarray.c > @@ -185,7 +185,7 @@ typedef struct ComputeXidHorizonsResult > F

Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

2024-07-06 Thread Richard Guo
On Wed, Jun 19, 2024 at 10:55 AM Tender Wang wrote: > Richard Guo 于2024年6月18日周二 17:24写道: >> I updated the patch to include a check in consider_parallel_nestloop >> ensuring that inner_cheapest_total is not parameterized by outerrel >> before materializing it. I also tweak

Re: Wrong results with grouping sets

2024-07-05 Thread Richard Guo
On Fri, Jul 5, 2024 at 5:51 AM Andres Freund wrote: > On 2024-07-01 16:29:16 +0800, Richard Guo wrote: > > Here is an updated version of this patchset. I've run pgindent for it, > > and also tweaked the commit messages a bit. > > > > In principle, 0001 can

Re: Wrong results with grouping sets

2024-07-05 Thread Richard Guo
On Thu, Jul 4, 2024 at 6:02 PM Ashutosh Bapat wrote: > On Mon, Jul 1, 2024 at 1:59 PM Richard Guo wrote: > > Here is an updated version of this patchset. I've run pgindent for it, > > and also tweaked the commit messages a bit. > > > > In principle, 0001 can

Re: Support "Right Semi Join" plan shapes

2024-07-04 Thread Richard Guo
On Thu, Jul 4, 2024 at 11:18 PM Japin Li wrote: > On Thu, 04 Jul 2024 at 17:17, Richard Guo wrote: > > Here is a new rebase. > > > > Barring objections, I'm planning to push it soon. Pushed. Thanks for all the reviews. > Thanks for updating the patch. It looks g

Re: Support "Right Semi Join" plan shapes

2024-07-04 Thread Richard Guo
On Fri, Jun 28, 2024 at 3:21 PM Richard Guo wrote: > On Fri, Jun 28, 2024 at 2:54 PM Richard Guo wrote: > > I've refined this test case further to make it more stable by using an > > additional filter 'a.tenthous < 5000'. Besides, I noticed a surplus > >

Re: Wrong results with grouping sets

2024-07-01 Thread Richard Guo
On Mon, Jun 10, 2024 at 5:05 PM Richard Guo wrote: > This patchset does not apply any more. Here is a new rebase. Here is an updated version of this patchset. I've run pgindent for it, and also tweaked the commit messages a bit. In principle, 0001 can be backpatched to all supported

Re: Support "Right Semi Join" plan shapes

2024-06-28 Thread Richard Guo
On Fri, Jun 28, 2024 at 2:54 PM Richard Guo wrote: > On Mon, Jun 24, 2024 at 5:59 PM Richard Guo wrote: > > I noticed that this patch changes the plan of a query in join.sql from > > a semi join to right semi join, compromising the original purpose of > > this query, whic

Re: Support "Right Semi Join" plan shapes

2024-06-27 Thread Richard Guo
On Mon, Jun 24, 2024 at 5:59 PM Richard Guo wrote: > I noticed that this patch changes the plan of a query in join.sql from > a semi join to right semi join, compromising the original purpose of > this query, which was to test the fix for neqjoinsel's behavior for > semijoins (se

Re: Support "Right Semi Join" plan shapes

2024-06-24 Thread Richard Guo
Thank you for reviewing. On Mon, Jun 24, 2024 at 1:27 PM Li Japin wrote: > + /* > +* For now we do not support RIGHT_SEMI join in mergejoin or nestloop > +* join. > +*/ > + if (jointype == JOIN_RIGHT_SEMI) > + return; > + > > How about adding some

Re: Inconsistency between try_mergejoin_path and create_mergejoin_plan

2024-06-23 Thread Richard Guo
On Wed, Jun 19, 2024 at 10:24 PM Alexander Pyhalov wrote: > Richard Guo писал(а) 2024-06-19 16:30: > > I think we can simply verify the validity of commutators for clauses in > > the form "inner op outer" when selecting mergejoin/hash clauses. If a > > claus

Re: Avoid incomplete copy string (src/backend/access/transam/xlog.c)

2024-06-23 Thread Richard Guo
On Mon, Jun 24, 2024 at 7:51 AM Ranier Vilela wrote: > In src/include/access/xlogbackup.h, the field *name* > has one byte extra to store null-termination. > > But, in the function *do_pg_backup_start*, > I think that is a mistake in the line (8736): > > memcpy(state->name, backupidstr, strlen(bac

Re: An inefficient query caused by unnecessary PlaceHolderVar

2024-06-20 Thread Richard Guo
On Mon, Jan 15, 2024 at 1:50 PM Richard Guo wrote: > Updated this patch over 29f114b6ff, which indicates that we should apply > the same rules for PHVs. Here is a new rebase of this patch, with some tweaks to comments. I've also updated the commit message to better explain the c

Re: Inconsistency between try_mergejoin_path and create_mergejoin_plan

2024-06-19 Thread Richard Guo
On Wed, Jun 19, 2024 at 12:49 PM Tom Lane wrote: > Richard Guo writes: > > It seems to me that the new operator is somewhat artificial, since it is > > designed to support a mergejoin but lacks a valid commutator. So before > > we proceed to discuss the fix, I'd like

Re: Inconsistency between try_mergejoin_path and create_mergejoin_plan

2024-06-18 Thread Richard Guo
On Mon, Jun 17, 2024 at 10:51 PM Alexander Pyhalov wrote: > There's the following inconsistency between try_mergejoin_path() and > create_mergejoin_plan(). > When clause operator has no commutator, we can end up with mergejoin > path. > Later create_mergejoin_plan() will call get_switched_clauses(

Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

2024-06-18 Thread Richard Guo
On Tue, Jun 4, 2024 at 6:51 PM Tender Wang wrote: > Yeah, Richard commented the v1 patch about JOIN_UNIQUE_INNER in [1] > > * I think we should not consider materializing the cheapest inner path > if we're doing JOIN_UNIQUE_INNER, because in this case we have to > unique-ify the inner path. > > We

Re: assertion failure at cost_memoize_rescan()

2024-06-17 Thread Richard Guo
On Tue, Jun 18, 2024 at 10:53 AM David Rowley wrote: > Out of the places I saw, it seems we do tend to code things so that we > don't assume the value has been clamped. E.g. > adjust_limit_rows_costs() does if (*rows < 1) *rows = 1; Agreed. In costsize.c I saw a few instances where we have

Re: Check lateral references within PHVs for memoize cache keys

2024-06-17 Thread Richard Guo
On Mon, Mar 18, 2024 at 4:36 PM Richard Guo wrote: > Here is another rebase over master so it applies again. I also added a > commit message to help review. Nothing else has changed. AFAIU currently we do not add Memoize nodes on top of join relation paths. This is because the ParamPat

Re: Wrong results with grouping sets

2024-06-10 Thread Richard Guo
On Wed, Jun 5, 2024 at 5:42 PM Richard Guo wrote: > Hence here is the v7 patchset. I've also added detailed commit messages > for the two patches. This patchset does not apply any more. Here is a new rebase. While at it, I added more checks for 'root->group_rtindex', a

Re: A wrong comment about search_indexed_tlist_for_var

2024-06-10 Thread Richard Guo
On Thu, May 16, 2024 at 8:42 PM Robert Haas wrote: > You don't need to wait for the next CommitFest to fix a comment (or a > bug). And, indeed, it's better if you do this before we branch. Patch pushed and the CF entry closed. Thank you for the suggestion. Thanks Richard

Re: Reordering DISTINCT keys to match input path's pathkeys

2024-06-07 Thread Richard Guo
On Mon, Feb 5, 2024 at 11:18 AM Richard Guo wrote: > cfbot reminds that this patch does not apply any more. So I've rebased > it on master, and also adjusted the test cases a bit. This patch does not apply any more, so here is a new rebase, with some tweaks to the comments. Thanks R

Re: Wrong results with grouping sets

2024-06-05 Thread Richard Guo
On Fri, May 24, 2024 at 9:08 PM Richard Guo wrote: > On the basis of the parser infrastructure fixup, 0002 patch adds the > nullingrel bit that references the grouping RTE to the grouping > expressions. I found a bug in the v6 patch. The following query would trigger the

Re: Wrong results with grouping sets

2024-05-24 Thread Richard Guo
On the basis of the parser infrastructure fixup, 0002 patch adds the nullingrel bit that references the grouping RTE to the grouping expressions. However, it seems to me that we have to manually remove this nullingrel bit from expressions in various cases where these expressions are logically belo

Re: Wrong results with grouping sets

2024-05-23 Thread Richard Guo
On Fri, May 17, 2024 at 5:41 PM Richard Guo wrote: > I've spent some more time on this patch, and now it passes all the > regression tests. But I had to hack explain.c and ruleutils.c to make > the varprefix stuff work as it did before, which is not great. > I've realized

  1   2   3   4   5   6   7   8   >