Re: shadow variables - pg15 edition

2022-08-25 Thread David Rowley
On Thu, 25 Aug 2022 at 14:08, Justin Pryzby wrote: > Here, I've included the rest of your list. OK, I've gone through v3-remove-var-declarations.txt, v4-reuse.txt v4-reuse-more.txt and committed most of what you had and removed a few that I thought should be renames instead. I also added some

Re: shadow variables - pg15 edition

2022-08-24 Thread David Rowley
On Wed, 24 Aug 2022 at 22:47, David Rowley wrote: > 5. "Refactor" (fix the code to make it better) > I have some ideas on how to fix the two #5s, so I'm going to go and do that > now. I've attached a patch which I think improves the code in gistRelocateBuildBuffersOnSpli

Re: Tracking last scan time

2022-08-24 Thread David Rowley
On Thu, 25 Aug 2022 at 03:03, Bruce Momjian wrote: > > On Wed, Aug 24, 2022 at 04:01:21PM +0100, Dave Page wrote: > > On Wed, 24 Aug 2022 at 15:18, Bruce Momjian wrote: > > Would it be simpler to allow the sequential and index scan columns to be > > cleared so you can look later to see

Re: shadow variables - pg15 edition

2022-08-24 Thread David Rowley
On Thu, 25 Aug 2022 at 02:00, Justin Pryzby wrote: > > On Wed, Aug 24, 2022 at 10:47:31PM +1200, David Rowley wrote: > > I was hoping we'd already caught all of the #1s in 421892a19, but I > > caught a few of those in some of your other patches. One you'd done > > another

Re: Change pfree to accept NULL argument

2022-08-24 Thread David Rowley
On Wed, 24 Aug 2022 at 23:07, David Rowley wrote: > One counter argument to that is for cases like list_free_deep(). > Right now if I'm not mistaken there's a bug (which I just noticed) in > list_free_private() that would trigger if you have a List of Lists and > one of the inner

Re: Fix typo in func.sgml

2022-08-24 Thread David Rowley
On Wed, 24 Aug 2022 at 22:44, Shinya Kato wrote: > I've found a duplicate "a a" in func.sgml and fixed it. > Patch is attached. Thanks. Pushed. David

Re: Change pfree to accept NULL argument

2022-08-24 Thread David Rowley
On Tue, 23 Aug 2022 at 13:17, David Rowley wrote: > I think making pfree() accept NULL is a bad idea. One counter argument to that is for cases like list_free_deep(). Right now if I'm not mistaken there's a bug (which I just noticed) in list_free_private() that would trigger if you have a L

Re: shadow variables - pg15 edition

2022-08-24 Thread David Rowley
On Wed, 24 Aug 2022 at 14:39, Justin Pryzby wrote: > Attached are half of the remainder of what I've written, ready for review. Thanks for the patches. I started to do some analysis of the remaining warnings and put them in the attached spreadsheet. I put each of the remaining warnings into a

Re: shadow variables - pg15 edition

2022-08-23 Thread David Rowley
On Tue, 23 Aug 2022 at 14:14, Justin Pryzby wrote: > Actually, they didn't sneak in - what I sent are the patches which are ready > to > be reviewed, excluding the set of "this" and "tmp" and other renames which you > disliked. In the branch (not the squished patch) the first ~15 patches were >

Re: Considering additional sort specialisation functions for PG16

2022-08-22 Thread David Rowley
On Tue, 23 Aug 2022 at 15:22, John Naylor wrote: > Did you happen to see > > https://www.postgresql.org/message-id/CAFBsxsFhq8VUSkUL5YO17cFXbCPwtbbxBu%2Bd9MFrrsssfDXm3Q%40mail.gmail.com I missed that. It looks like a much more promising idea than what I came up with. I've not looked at your

Considering additional sort specialisation functions for PG16

2022-08-22 Thread David Rowley
Hi, (Background: 697492434 added 3 new sort functions to remove the indirect function calls for the comparator function. This sped up sorting for various of our built-in data types.) There was a bit of unfinished discussion around exactly how far to take these specialisations for PG15. We

Re: shadow variables - pg15 edition

2022-08-22 Thread David Rowley
On Tue, 23 Aug 2022 at 13:17, Justin Pryzby wrote: > Attached is a squished version. I see there's some renaming ones snuck in there. e.g: - Relation rel; - HeapTuple tuple; + Relation pg_foreign_table; + HeapTuple foreigntuple; This one does not seem to be in the category I mentioned: @@

Re: Change pfree to accept NULL argument

2022-08-22 Thread David Rowley
On Tue, 23 Aug 2022 at 06:30, Tom Lane wrote: > > Peter Eisentraut writes: > > Per discussion in [0], here is a patch set that allows pfree() to accept > > a NULL argument, like free() does. > > So the question is, is this actually a good thing to do? I think making pfree() accept NULL is a bad

Re: shadow variables - pg15 edition

2022-08-20 Thread David Rowley
On Fri, 19 Aug 2022 at 16:28, Justin Pryzby wrote: > Let me know what I can do when it's time for round two. I pushed the modified 0001-0008 patches earlier today and also the one I wrote to fixup the 36 warnings about "expected" being shadowed. I looked through a bunch of your remaining

Re: shadow variables - pg15 edition

2022-08-18 Thread David Rowley
On Fri, 19 Aug 2022 at 11:21, Justin Pryzby wrote: > > On Thu, Aug 18, 2022 at 07:27:09PM +1200, David Rowley wrote: > > Any objections to pushing this to master only? > > I won't object, but some of your changes are what makes backpatching this less > reasonable (

Re: shadow variables - pg15 edition

2022-08-18 Thread David Rowley
On Thu, 18 Aug 2022 at 17:16, Justin Pryzby wrote: > > On Thu, Aug 18, 2022 at 03:17:33PM +1200, David Rowley wrote: > > I'm probably not the only committer to want to run a mile when they > > see someone posting 17 or 26 patches in an email. So maybe "bang for >

Re: shadow variables - pg15 edition

2022-08-17 Thread David Rowley
On Thu, 18 Aug 2022 at 02:54, Justin Pryzby wrote: > The first half of the patches fix shadow variables newly-introduced in v15 > (including one of my own patches), the rest are fixing the lowest hanging > fruit > of the "short list" from COPT=-Wshadow=compatible-local I wonder if it's better

Re: POC: GROUP BY optimization

2022-08-17 Thread David Rowley
On Thu, 18 Aug 2022 at 02:46, Tomas Vondra wrote: > So I don't think the current costing is wrong, but it certainly is more > complex. But the test does not test what it intended - I have two ideas > how to make it work: > > 1) increase the number of rows in the table > > 2) increase

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-08-17 Thread David Rowley
On Wed, 17 Aug 2022 at 13:57, Justin Pryzby wrote: > But in a few places, it removes the locally-computed group_pathkeys: > > - List *group_pathkeys = root->group_pathkeys; > I noticed because that creates a new shadow variable, which seems accidental. Thanks for the

Re: [PoC] Reducing planning time when tables have many partitions

2022-08-16 Thread David Rowley
esOn Tue, 9 Aug 2022 at 19:10, David Rowley wrote: > I've not had a chance to look at the 0003 patch yet. I've looked at the 0003 patch now. The performance numbers look quite impressive, however, there were a few things about the patch that I struggled to figure what they were done the way

Re: avoid negating LONG_MIN in cash_out()

2022-08-11 Thread David Rowley
On Fri, 12 Aug 2022 at 05:58, Zhihong Yu wrote: > Here is sample output with patch: > > # SELECT '-92233720368547758.085'::money; > ERROR: value "-92233720368547758.085" is out of range for type money > LINE 1: SELECT '-92233720368547758.085'::money; I'm struggling to follow along here. There

Re: Reducing the chunk header sizes on all memory context types

2022-08-10 Thread David Rowley
ontextMethods mcxt_methods[] = { ... > Mildly wondering whether we ought to use designated initializers instead, > given we're whacking it around already. Too easy to get the order wrong when > adding new members, and we might want to have optional callbacks too. I've adjusted how this array

Re: dropping datumSort field

2022-08-09 Thread David Rowley
On Wed, 10 Aug 2022 at 03:16, Zhihong Yu wrote: > On Tue, Aug 9, 2022 at 8:01 AM Robert Haas wrote: >> >> One problem with this patch is that, if I apply it, PostgreSQL does not >> compile: >> >> nodeSort.c:197:6: error: use of undeclared identifier 'tupDesc' >> if (tupDesc->natts == 1)

Re: Reducing the chunk header sizes on all memory context types

2022-08-09 Thread David Rowley
Thanks for giving this a look. On Wed, 10 Aug 2022 at 02:37, Robert Haas wrote: > # We also add a restriction that block sizes for all 3 of the memory > # allocators cannot be 1GB or larger. We would be unable to store the > # number of bytes that the block is offset from the chunk stored

Re: Reducing the chunk header sizes on all memory context types

2022-08-09 Thread David Rowley
On Wed, 10 Aug 2022 at 09:23, Tom Lane wrote: > > Andres Freund writes: > > On 2022-08-09 15:21:57 -0400, Tom Lane wrote: > >> Do we really need it to be that tight? I know we only have 3 methods > >> today, > >> but 8 doesn't seem that far away. If there were six bits reserved for > >> this

Re: Reducing the chunk header sizes on all memory context types

2022-08-09 Thread David Rowley
On Wed, 13 Jul 2022 at 17:20, David Rowley wrote: > I did consider that in all cases where the allocations are above > allocChunkLimit that the chunk is put on a dedicated block and in > fact, the blockoffset is always the same for those. I wondered if we > could use the

Re: [PoC] Reducing planning time when tables have many partitions

2022-08-09 Thread David Rowley
On Mon, 8 Aug 2022 at 23:28, Yuya Watari wrote: > If you have already applied David's patch, please start the 'git am' > command from 0002-Fix-bugs.patch. All regression tests passed with > this patch on my environment. Thanks for fixing those scope bugs. In regards to the 0002 patch, you have;

Re: ERREUR: cache lookup failed for function 0 with PostgreSQL 15 beta 2, no error with PostgreSQL 14.4

2022-08-04 Thread David Rowley
On Fri, 5 Aug 2022 at 01:20, Phil Florent wrote: > with fakedata as ( >select 'hello' word >union all >select 'world' word > ) > select * > from ( >select word, count(*) over (partition by word) nb from fakedata > ) t where nb = 1; >

Re: Speed up transaction completion faster after many relations are accessed in a transaction

2022-08-02 Thread David Rowley
On Wed, 3 Aug 2022 at 07:04, Jacob Champion wrote: > This entry has been waiting on author input for a while (our current > threshold is roughly two weeks), so I've marked it Returned with > Feedback. Thanks for taking care of this. You dealt with this correctly based on the fact that I'd failed

Re: Parallel Aggregates for string_agg and array_agg

2022-08-02 Thread David Rowley
On Wed, 3 Aug 2022 at 14:40, Zhihong Yu wrote: > For array_agg_combine(): > > + if (state1->alen < reqsize) > + { > + /* Use a power of 2 size rather than allocating just reqsize */ > + state1->alen = pg_nextpower2_32(reqsize); > ... > + state1->nelems =

Re: Parallel Aggregates for string_agg and array_agg

2022-08-02 Thread David Rowley
On Wed, 3 Aug 2022 at 12:18, Tom Lane wrote: > Now as against that, if the underlying relation scan is parallelized > then you already have unpredictable input ordering and thus unpredictable > aggregate results. So anyone who cares about that has already had > to either disable parallelism or

Re: Parallel Aggregates for string_agg and array_agg

2022-08-02 Thread David Rowley
On Wed, 20 Jun 2018 at 19:08, David Rowley wrote: > I'll submit it again when there more consensus that we want this. Waking up this old thread again. If you don't have a copy, the entire thread is in [1]. The remaining item that seemed to cause this patch to be rejected was raised in

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-08-02 Thread David Rowley
On Wed, 3 Aug 2022 at 07:31, Zhihong Yu wrote: > On Tue, Aug 2, 2022 at 11:02 AM Zhihong Yu wrote: >> I was looking at the final patch and noticed that setno field in >> agg_presorted_distinctcheck struct is never used. > Looks like setoff field is not used either. Thanks for the report. It

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-08-02 Thread David Rowley
On Wed, 3 Aug 2022 at 01:19, Tom Lane wrote: > > David Rowley writes: > > I chatted to Andres and Thomas about this last week and their view > > made me think it might not be quite as clear-cut as "just bump it up a > > bunch because it's ridiculously low" that

Why is DEFAULT_FDW_TUPLE_COST so insanely low?

2022-08-02 Thread David Rowley
Over on [1] I was complaining that I thought DEFAULT_FDW_TUPLE_COST, which is defined as 0.01 was unrealistically low. For comparison, cpu_tuple_cost, something we probably expect to be in a CPU cache is also 0.01. We've defined DEFAULT_PARALLEL_TUPLE_COST to be 0.1, which is 10x cpu_tuple_cost.

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-08-02 Thread David Rowley
On Mon, 1 Aug 2022 at 03:49, Tom Lane wrote: > Are you going to push the other patch (adjusting > select_outer_pathkeys_for_merge) first, so that we can see the residual > plan changes that this patch creates? I'm not entirely comfortable > with the regression test changes as posted. Yes, I

Re: POC: GROUP BY optimization

2022-08-02 Thread David Rowley
On Tue, 2 Aug 2022 at 22:21, Michael Paquier wrote: > > On Fri, Jul 15, 2022 at 09:46:51PM +0200, Tomas Vondra wrote: > > I agree this is a mistake in db0d67db2 that makes the test useless. > > Tomas, please note that this is an open item assigned to you. Are you > planning to do something with

Re: Is select_outer_pathkeys_for_merge() too strict now we have Incremental Sorts?

2022-08-01 Thread David Rowley
On Thu, 21 Jul 2022 at 14:23, Richard Guo wrote: > > On Thu, Jul 21, 2022 at 3:47 AM David Rowley wrote: >> Maybe your idea could be made to work in cases where >> bms_equal(joinrel->relids, root->all_baserels). In that case, we >> should not be processing an

Re: Skip partition tuple routing with constant partition key

2022-08-01 Thread David Rowley
On Thu, 28 Jul 2022 at 19:37, Amit Langote wrote: > > On Thu, Jul 28, 2022 at 11:59 AM David Rowley wrote: > > I'd quite like to push this patch early next week, so if anyone else > > is following along that might have any objections, could they do so > > before

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-07-28 Thread David Rowley
On Wed, 27 Jul 2022 at 15:16, Richard Guo wrote: > That makes sense. The patch looks in a good shape to me in this part. Thanks for giving it another look. I'm also quite happy with the patch now. The 2 plan changes are explained. I have a patch on another thread [1] for the change in the Merge

Re: Skip partition tuple routing with constant partition key

2022-07-27 Thread David Rowley
On Thu, 28 Jul 2022 at 00:50, Amit Langote wrote: > So, in a way the caching scheme works for > LIST partitioning only if the same value appears consecutively in the > input set, whereas it does not for *a set of* values belonging to the > same partition appearing consecutively. Maybe that's a

Re: [PoC] Reducing planning time when tables have many partitions

2022-07-27 Thread David Rowley
On Wed, 27 Jul 2022 at 18:07, Yuya Watari wrote: > I agree that introducing a Bitmapset field may solve this problem. I > will try this approach in addition to previous ones. I've attached a very half-done patch that might help you get started on this. There are still 2 failing regression tests

Re: [PoC] Reducing planning time when tables have many partitions

2022-07-26 Thread David Rowley
On Mon, 4 Jul 2022 at 09:28, Tom Lane wrote: > For the bms_equal class of lookups, I wonder if we could get anywhere > by adding an additional List field to every RelOptInfo that chains > all EquivalenceMembers that match that RelOptInfo's relids. > The trick here would be to figure out when to

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-07-26 Thread David Rowley
On Tue, 26 Jul 2022 at 19:39, Richard Guo wrote: > Also I'm wondering if it's possible to take into consideration the > ordering indicated by existing indexes when determining the pathkeys. So > that for the query below we can avoid the Incremental Sort node if we > consider that there is an

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-07-26 Thread David Rowley
On Tue, 26 Jul 2022 at 12:01, Zhihong Yu wrote: > sort order the the planner chooses is simply : there is duplicate `the` I think the first "the" should be "that" > + /* mark this aggregate is covered by 'currpathkeys' */ > > is covered by -> as covered by I think it was

Re: Skip partition tuple routing with constant partition key

2022-07-26 Thread David Rowley
Thank for looking at this. On Sat, 23 Jul 2022 at 01:23, Amit Langote wrote: > + /* > +* The Datum has changed. Zero the number of times we've > +* found last_found_datum_index in a row. > +*/ > +

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-07-25 Thread David Rowley
ached to make it properly include the case where currpathkeys are better. Thanks for the review. David From 1ebfac68080b10181086b4c9c9dcb3e3e1582cdb Mon Sep 17 00:00:00 2001 From: David Rowley Date: Mon, 3 May 2021 16:31:29 +1200 Subject: [PATCH v8] Add planner support for ORDER BY aggregates

Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size

2022-07-21 Thread David Rowley
On Fri, 22 Jul 2022 at 15:22, Amit Langote wrote: > BTW, the only way I found to *forcefully* exercise llvm_compile_expr() > is to add `set jit_above_cost to 0` at the top of the test file, or > are we missing a force_jit_mode, like there is force_parallel_mode? I don't think we'd need any

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-07-21 Thread David Rowley
On Thu, 4 Nov 2021 at 20:59, Ronan Dunklau wrote: > I took some time to toy with this again. > > At first I thought that charging a discount in foreign grouping paths for > Aggref targets (since they are computed remotely) would be a good idea, > similar to what is done for the grouping keys. > >

Re: Is select_outer_pathkeys_for_merge() too strict now we have Incremental Sorts?

2022-07-20 Thread David Rowley
On Wed, 20 Jul 2022 at 21:19, Richard Guo wrote: > So the idea is if the ECs used by the mergeclauses are prefix of the > query_pathkeys, we use this prefix as pathkeys for the mergejoin. Why > not relax this further that if the ECs in the mergeclauses and the > query_pathkeys have common prefix,

Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-07-19 Thread David Rowley
postgresql.org/message-id/caaphdvrtzu0phvfdpfm4yx3jnr2wuwosv+t2zqa7lrhhbr2...@mail.gmail.com From f586a426d1bb3b32aa3be8dec94b17944353600e Mon Sep 17 00:00:00 2001 From: David Rowley Date: Mon, 3 May 2021 16:31:29 +1200 Subject: [PATCH v7] Add planner support for ORDER BY aggregates ORDER BY aggreagte

Re: Remove fls(), use pg_bitutils.h facilities instead?

2022-07-19 Thread David Rowley
On Wed, 20 Jul 2022 at 16:21, Thomas Munro wrote: > Back in commit 4f658dc8 we gained src/port/fls.c. As anticipated by > its commit message, we later finished up with something better in > src/include/port/pg_bitutils.h. fls() ("find last set") is an > off-by-one cousin of

Re: Windows now has fdatasync()

2022-07-19 Thread David Rowley
On Wed, 20 Jul 2022 at 15:22, Thomas Munro wrote: > Ok, I've pushed the Windows patch. I'll watch the build farm to see > if I've broken any of the frankentoolchain Windows animals. Just to get in there before the farm does... I just got a boatload of redefinition of HAVE_FDATASYNC warnings. I

Re: Is select_outer_pathkeys_for_merge() too strict now we have Incremental Sorts?

2022-07-19 Thread David Rowley
On Wed, 20 Jul 2022 at 15:02, David Rowley wrote: > I've attached a patch for this and it changes the plan for the above query to: Looks like I based that patch on the wrong branch. Here's another version of the patch that's based on master. David diff --git a/src/backend/optimizer/p

Is select_outer_pathkeys_for_merge() too strict now we have Incremental Sorts?

2022-07-19 Thread David Rowley
Hackers, Currently, if we have a query such as: SELECT a,b, COUNT(*) FROM a INNER JOIN b on a.a = b.x GROUP BY a,b ORDER BY a DESC, b; With enable_hashagg = off, we get the following plan: QUERY PLAN --- GroupAggregate Group Key: a.a, a.b

Re: PG15 beta1 sort performance regression due to Generation context change

2022-07-15 Thread David Rowley
On Sat, 16 Jul 2022 at 10:40, Jonathan S. Katz wrote: > What I find interesting is the resistance to adding any documentation > around this feature to guide users in case they hit the regression. I > understand it can be difficult to provide guidance on issues related to > adjusting work_mem, but

Re: POC: GROUP BY optimization

2022-07-14 Thread David Rowley
On Thu, 31 Mar 2022 at 12:19, Tomas Vondra wrote: > Pushed, after going through the patch once more, running check-world > under valgrind, and updating the commit message. I'm still working in this area and I noticed that db0d67db2 updated some regression tests in partition_aggregate.out without

Re: [PATCH v1] remove redundant check of item pointer

2022-07-14 Thread David Rowley
On Fri, 15 Jul 2022 at 10:31, Bruce Momjian wrote: > for non-Assert builds, ItemPointerGetOffsetNumberNoCheck() and > ItemPointerGetOffsetNumber() are the same, so I don't see the point to > making this change. Frankly, I don't know why we even have two > functions for this. I am guessing

Re: Reducing the chunk header sizes on all memory context types

2022-07-12 Thread David Rowley
On Wed, 13 Jul 2022 at 05:42, Andres Freund wrote: > > There is at least one. It might be major; to reduce the AllocSet chunk > > header from 16 bytes down to 8 bytes I had to get rid of the freelist > > pointer that was reusing the "aset" field in the chunk header struct. > > This works now by

Re: Reducing the chunk header sizes on all memory context types

2022-07-12 Thread David Rowley
On Wed, 13 Jul 2022 at 05:44, Andres Freund wrote: > On 2022-07-12 20:22:57 +0300, Yura Sokolov wrote: > > I don't get, why "large chunk" needs additional fields for size and > > offset. > > Large allocation sizes are certainly rounded to page size. > > And allocations which doesn't fit 1GB we

Re: minor change for create_list_bounds()

2022-07-12 Thread David Rowley
On Thu, 30 Jun 2022 at 11:41, Nathan Bossart wrote: > > On Tue, Mar 08, 2022 at 11:05:10AM -0800, Zhihong Yu wrote: > > I was looking at commit db632fbca and noticed that, > > in create_list_bounds(), if index is added to boundinfo->interleaved_parts > > in the first if statement, there is no

Re: PG15 beta1 sort performance regression due to Generation context change

2022-07-12 Thread David Rowley
On Tue, 12 Jul 2022 at 17:15, David Rowley wrote: > So far only Robert has raised concerns with this regression for PG15 > (see [2]). Tom voted for leaving things as they are for PG15 in [3]. > John agrees, as quoted above. Does anyone else have any opinion? Let me handle this

Re: remove_useless_groupby_columns is too enthusiastic

2022-07-12 Thread David Rowley
On Wed, 13 Jul 2022 at 05:31, Tom Lane wrote: > I tried the attached quick-hack patch that just prevents > remove_useless_groupby_columns from removing anything that > appears in ORDER BY. That successfully fixes the complained-of > case, and it doesn't change any existing regression test

Re: POC: GROUP BY optimization

2022-07-12 Thread David Rowley
On Thu, 31 Mar 2022 at 12:19, Tomas Vondra wrote: > Pushed, after going through the patch once more, running check-world > under valgrind, and updating the commit message. I'm just in this general area of the code again today and wondered about the header comment for the preprocess_groupclause()

Re: Some clean-up work in get_cheapest_group_keys_order()

2022-07-12 Thread David Rowley
On Wed, 13 Jul 2022 at 13:12, Tom Lane wrote: > > David Rowley writes: > > On Wed, 13 Jul 2022 at 11:02, Tom Lane wrote: > >> Agreed, but I think there are other instances of that idiom that > >> should be cleaned up while you're at it. > > > Agreed. I

Re: Some clean-up work in get_cheapest_group_keys_order()

2022-07-12 Thread David Rowley
On Wed, 13 Jul 2022 at 11:02, Tom Lane wrote: > > David Rowley writes: > > * I think list_truncate(list_copy(list), n) is a pretty bad way to > > copy the first n elements of a list, especially when n is likely to be > > 0 most of the time. I think we should ju

Some clean-up work in get_cheapest_group_keys_order()

2022-07-12 Thread David Rowley
I was rebasing a patch which requires me to make some changes in get_cheapest_group_keys_order(). I noticed a few things in there that I think we could do a little better on: * The code uses pfree() on a list and it should be using list_free() * There's a manually coded for loop over a list

Re: Reducing Memory Consumption (aset and generation)

2022-07-11 Thread David Rowley
On Mon, 11 Jul 2022 at 20:48, Matthias van de Meent wrote: > > 2) v1-002-generation-reduces-memory-consumption.patch > > Reduces memory used by struct GenerationBlock, by minus 8 bits, > > That seems fairly straight-forward -- 8 bytes saved on each page isn't > a lot, but it's something. I think

Re: PG15 beta1 sort performance regression due to Generation context change

2022-07-11 Thread David Rowley
On Fri, 3 Jun 2022 at 16:03, John Naylor wrote: > > On Fri, Jun 3, 2022 at 10:14 AM David Rowley wrote: > > 4. As I just demonstrated in [1], if anyone is caught by this and has > > a problem, the work_mem size increase required seems very small to get > > performance bac

Re: tuplesort Generation memory contexts don't play nicely with index builds

2022-07-06 Thread David Rowley
On Thu, 7 Jul 2022 at 13:41, John Naylor wrote: > > On Thu, Jul 7, 2022 at 3:16 AM David Rowley wrote: > > > > Pushed. > > Hmm, the commit appeared on git.postgresql.org, but apparently not in > my email nor the list archives. Strange. I'd suspect a temporary hic

Re: tuplesort Generation memory contexts don't play nicely with index builds

2022-07-06 Thread David Rowley
On Wed, 6 Jul 2022 at 13:34, David Rowley wrote: > If there are no objections, I plan to push this in the next 24 hours. Pushed. David

Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size

2022-07-06 Thread David Rowley
Thanks for looking at this. On Wed, 6 Jul 2022 at 12:32, Andres Freund wrote: > > On 2022-06-29 11:40:45 +1200, David Rowley wrote: > > Another small thing which I considered doing was to put the > > hash_fcinfo_data field as the final field in > > ScalarArrayOpExprHa

Re: tuplesort Generation memory contexts don't play nicely with index builds

2022-07-05 Thread David Rowley
On Wed, 29 Jun 2022 at 12:59, David Rowley wrote: > I noticed while doing some memory context related work that since we > now use generation.c memory contexts for tuplesorts (40af10b57) that > tuplesort_putindextuplevalues() causes memory "leaks" in the >

Re: tuplesort Generation memory contexts don't play nicely with index builds

2022-06-29 Thread David Rowley
On Wed, 29 Jun 2022 at 12:59, David Rowley wrote: > I noticed while doing some memory context related work that since we > now use generation.c memory contexts for tuplesorts (40af10b57) that > tuplesort_putindextuplevalues() causes memory "leaks" in the >

Re: Can we do something to help stop users mistakenly using force_parallel_mode?

2022-06-29 Thread David Rowley
On Wed, 29 Jun 2022 at 18:57, Laurenz Albe wrote: > Perhaps some stronger wording in the documetation would be beneficial. > I have little sympathy with people who set unusual parameters without > even glancing at the documentation. My thoughts are that the documentation is ok as is. I have a

Re: Can we do something to help stop users mistakenly using force_parallel_mode?

2022-06-29 Thread David Rowley
On Thu, 30 Jun 2022 at 00:31, Justin Pryzby wrote: > Since the user in this recent thread is running v13.7, I'm *guessing* that > if that had been backpatched, they wouldn't have made this mistake. I wasn't aware of that change. Thanks for highlighting it. Maybe it's worth seeing if fewer

Can we do something to help stop users mistakenly using force_parallel_mode?

2022-06-28 Thread David Rowley
Over on [1] I noticed that the user had set force_parallel_mode to "on" in the hope that would trick the planner into making their query run more quickly. Of course, that's not what they want since that GUC is only there to inject some parallel nodes into the plan in order to verify the tuple

tuplesort Generation memory contexts don't play nicely with index builds

2022-06-28 Thread David Rowley
Hackers, I noticed while doing some memory context related work that since we now use generation.c memory contexts for tuplesorts (40af10b57) that tuplesort_putindextuplevalues() causes memory "leaks" in the generation context due to index_form_tuple() being called while we're switched into the

Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size

2022-06-28 Thread David Rowley
patch where you mistakenly cast to an OpExpr instead of ScalarArrayOpExpr when you were fetching the inputcollid) David From 7a1e71a6b622eff72537ae86187b312d01a5e11d Mon Sep 17 00:00:00 2001 From: David Rowley Date: Wed, 29 Jun 2022 10:43:41 +1200 Subject: [PATCH v3] Remove size increase in Exp

Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size

2022-06-28 Thread David Rowley
On Sat, 18 Jun 2022 at 05:21, Andres Freund wrote: > > On 2022-06-17 14:14:54 +1200, David Rowley wrote: > > So, there appears to be no performance regression due to the extra > > indirection. There's maybe even some gains due to the smaller step > > size. > > "

Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size

2022-06-16 Thread David Rowley
On Fri, 17 Jun 2022 at 15:33, Peter Geoghegan wrote: > Have you tried this with the insert benchmark [1]? I was mostly focusing on the performance of the hashed saop feature after having removed the additional fields that pushed ExprEvalStep over 64 bytes in 14. I agree it would be good to do

Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size

2022-06-16 Thread David Rowley
On Fri, 17 Jun 2022 at 11:31, Andres Freund wrote: > hashedscalararrayop/EEOP_HASHED_SCALARARRAYOP is 64 bytes, even though the > limit is 40 bytes. > commit 50e17ad281b8d1c1b410c9833955bc80fbad4078 > Author: David Rowley > Date: 2021-04-08 23:51:22 +1200 > > Spee

Re: Allow foreign keys to reference a superset of unique columns

2022-06-09 Thread David Rowley
On Fri, 10 Jun 2022 at 16:14, Peter Eisentraut wrote: > > On 10.06.22 05:47, David Rowley wrote: > >> I think this should be referring to constraint name, not an index name. > > Can you explain why you think that? > > If you wanted to specify this feature in the SQL st

Re: Allow foreign keys to reference a superset of unique columns

2022-06-09 Thread David Rowley
On Fri, 10 Jun 2022 at 15:08, Peter Eisentraut wrote: > > On 07.06.22 20:59, Kaiting Chen wrote: > > 2. The FOREIGN KEY constraint syntax gains a [ USING INDEX index_name ] > > clause > > optionally following the referenced column list. > > > > The index specified by this clause is used

Re: Sudden database error with COUNT(*) making Query Planner crashes: variable not found in subplan target list

2022-06-07 Thread David Rowley
On Wed, 8 Jun 2022 at 08:31, David Rowley wrote: > So it does appear that the location index is being chosen, at least > with the data that I inserted. Those gist indexes are costing quite a > bit cheaper than the cheapest btree index. This seems just to be because the gist indexes ar

Re: Sudden database error with COUNT(*) making Query Planner crashes: variable not found in subplan target list

2022-06-07 Thread David Rowley
On Wed, 8 Jun 2022 at 07:55, Tom Lane wrote: > > David Rowley writes: > > On Tue, 7 Jun 2022 at 19:58, Jean Landercy - BEEODIVERSITY > > wrote: > >> Here is the detail of the table (I have anonymized it on SO, this is its > >> real name): > >&

Re: Sudden database error with COUNT(*) making Query Planner crashes: variable not found in subplan target list

2022-06-07 Thread David Rowley
On Tue, 7 Jun 2022 at 19:58, Jean Landercy - BEEODIVERSITY wrote: > Here is the detail of the table (I have anonymized it on SO, this is its real > name): > "logistic_site_location_54ae0166_id" gist (location) I imagine this is due to the planner choosing an index-only scan on the above

Re: Reducing Memory Consumption (aset and generation)

2022-06-06 Thread David Rowley
On Mon, 6 Jun 2022 at 07:28, Ranier Vilela wrote: > 4) 004-generation-reduces-memory-consumption-BUG.patch > Same to the (2), but with BUG. > It only takes a few tweaks to completely remove the field block. > This fails with make check. > I couldn't figure out why it doesn't work with 16 bits

Re: pgcon unconference / impact of block size on performance

2022-06-06 Thread David Rowley
On Sun, 5 Jun 2022 at 11:23, Tomas Vondra wrote: > At on of the pgcon unconference sessions a couple days ago, I presented > a bunch of benchmark results comparing performance with different > data/WAL block size. Most of the OLTP results showed significant gains > (up to 50%) with smaller (4k)

Re: Sudden database error with COUNT(*) making Query Planner crashes: variable not found in subplan target list

2022-06-06 Thread David Rowley
On Mon, 6 Jun 2022 at 21:34, Jean Landercy - BEEODIVERSITY wrote: > SELECT COUNT(*) FROM items; > -- ERROR: variable not found in subplan target list > -- SQL state: XX000 Can you share some more details about what "items" is. psql's "\d items" output would be useful. From what you've reported

Re: PG15 beta1 sort performance regression due to Generation context change

2022-06-02 Thread David Rowley
On Wed, 1 Jun 2022 at 03:09, Tom Lane wrote: > Right now my vote would be to leave things as they stand for v15 --- > the performance loss that started this thread occurs in a narrow > enough set of circumstances that I don't feel too much angst about > it being the price of winning in most other

Re: PG15 beta1 sort performance regression due to Generation context change

2022-06-02 Thread David Rowley
On Thu, 2 Jun 2022 at 20:20, John Naylor wrote: > If anything, changing work_mem is an > easy to understand (although sometimes not practical) workaround. I had a quick look at that for the problem case and we're very close in terms of work_mem size to better performance. A work_mem of just

Re: PostgreSQL Limits: maximum number of columns in SELECT result

2022-05-31 Thread David Rowley
On Wed, 1 Jun 2022 at 12:42, Tom Lane wrote: > > David Rowley writes: > > I've adjusted the patch to use the wording proposed by Alvaro. See attached. > > Should we also change the adjacent item to "columns in a table", > for consistency of wording?

Re: PostgreSQL Limits: maximum number of columns in SELECT result

2022-05-31 Thread David Rowley
On Wed, 1 Jun 2022 at 07:08, Dave Cramer wrote: > > On Tue, 31 May 2022 at 14:51, Tom Lane wrote: >> >> Alvaro Herrera writes: >> > I think it's reasonable to have two adjacent rows in the table for these >> > two closely related things, but rather than "columns per tuple" I would >> > label

Re: PG15 beta1 sort performance regression due to Generation context change

2022-05-30 Thread David Rowley
On Tue, 31 May 2022 at 09:48, Peter Geoghegan wrote: > Shouldn't this be using the geometric mean rather than the arithmetic > mean? That's pretty standard practice when summarizing a set of > benchmark results that are expressed as ratios to some baseline. Maybe just comparing the SUM of the

Re: PG15 beta1 sort performance regression due to Generation context change

2022-05-25 Thread David Rowley
On Wed, 25 May 2022 at 15:09, David Rowley wrote: > I didn't test the performance of an aset.c context. I imagine it's > likely to be less overhead due to aset.c being generally slower from > having to jump through a few more hoops during palloc/pfree. I've attached the results f

Re: First draft of the PG 15 release notes

2022-05-24 Thread David Rowley
On Wed, 25 May 2022 at 15:01, Amit Langote wrote: > +Previously, a partitioned table with DEFAULT partition or a LIST > partition containing multiple values could not be used for ordered > partition scans. Now it can be used at least in the cases where such > partitions are pruned. I think this

Re: PG15 beta1 sort performance regression due to Generation context change

2022-05-24 Thread David Rowley
On Wed, 25 May 2022 at 04:02, Tom Lane wrote: > Here's a draft patch for the other way of doing it. I'd first tried > to make the side-effects completely local to generation.c, but that > fails in view of code like > > MemoryContextAlloc(GetMemoryChunkContext(x), ...) > > Thus we pretty

Re: PG15 beta1 sort performance regression due to Generation context change

2022-05-24 Thread David Rowley
On Tue, 24 May 2022 at 09:36, Tom Lane wrote: > > David Rowley writes: > > Handy wavy idea: It's probably too complex for now, and it also might > > be too much overhead, but having GenerationPointerGetChunk() do a > > binary search on a sorted-by-memory-address array of

Re: partition wise aggregate wrong rows cost

2022-05-23 Thread David Rowley
On Tue, 24 May 2022 at 15:38, bucoo wrote: > > Normal aggregate and partition wise aggregate have a big difference rows cost: > explain (verbose) > select count(1) from t1 group by id; > HashAggregate (cost=106.20..108.20 rows=200 width=12) --here rows is 200 > set

<    5   6   7   8   9   10   11   12   13   14   >