Re: Fixes a typo in tablecmd

2025-12-14 Thread David Rowley
On Mon, 15 Dec 2025 at 12:26, Chao Li wrote: > ``` > /* > * If the table at either end of the constraint is partitioned, we need to > * recurse and handle every unvalidate constraint that is a child of this > * constraint. > */ > ``` > So, I believe “unvalidate” is a typo, a “d” should be appende

Re: Fix incorrect comments in tuplesort.c

2025-12-09 Thread David Rowley
On Mon, 8 Dec 2025 at 15:08, cca5507 wrote: > > Using "However" here indicates some exception to what's just been > > said, but there is no longer an exception. To write about what the > > 1024 is for, we might need to reverse engineer what that's for. I > > assume it's something like "Clamp at 10

Re: Fix repetition in hash index documentation

2025-12-08 Thread David Rowley
On Tue, 9 Dec 2025 at 08:41, David Geier wrote: > Attached patch fixes a repetition in the user-facing hash index > documentation and the README. Thanks. I can take care of that. I'll backpatch to v15 due to the sgml docs. The README issue exists in v14, but likely isn't doing enough harm to warr

Re: Introduce Index Aggregate - new GROUP BY strategy

2025-12-08 Thread David Rowley
On Tue, 9 Dec 2025 at 04:37, Sergey Soloviev wrote: > I would like to introduce new GROUP BY strategy, called Index Aggregate. > In a nutshell, we build B+tree index where GROUP BY attributes are index > keys and if memory limit reached we will build index for each batch and > spill it to the dis

Re: Fix incorrect comments in tuplesort.c

2025-12-07 Thread David Rowley
On Mon, 8 Dec 2025 at 12:10, David G. Johnston wrote: > > On Sun, Dec 7, 2025 at 3:09 PM David Rowley wrote: >> >> The comment is effectively >> explaining that we don't want to make the array big enough so that a >> malloc will always be required. > >

Re: Fix incorrect comments in tuplesort.c

2025-12-07 Thread David Rowley
On Sun, 7 Dec 2025 at 21:34, cca5507 wrote: > I find that the initial size of memtuples array must be more than > ALLOCSET_SEPARATE_THRESHOLD > is not for lower overhead of allocation, but for a bug fixed in > 8ea3e7a75c0d22c41c57f59c8b367059b97d0b66. > > Attach a new patch. I find the current

Re: Support tid range scan in parallel?

2025-12-07 Thread David Rowley
On Sat, 6 Dec 2025 at 09:12, Cary Huang wrote: > > within each worker process. > > > > > > + > > + > > +In a parallel tid range scan, the range of > > blocks > > +will be subdivided into smaller ranges which are shared among the > > +cooper

Re: [Patch] Build the heap more efficient in tuplesort.c

2025-11-30 Thread David Rowley
On Sun, 30 Nov 2025 at 22:36, cca5507 wrote: > Now we build the heap by using tuplesort_heap_insert(), which has a sift-up > every call. > > To make it more efficient, I want to add tuplesort_heap_insert_unordered() > and tuplesort_heap_build() > just like binaryheap_add_unordered() and binaryhe

Re: Support tid range scan in parallel?

2025-11-27 Thread David Rowley
On Fri, 28 Nov 2025 at 02:31, David Rowley wrote: > The patch basically adds: > > if (!keep_startblock) > scan->rs_startblock = InvalidBlockNumber; I've pushed that patch. David

Re: Support tid range scan in parallel?

2025-11-27 Thread David Rowley
On Thu, 27 Nov 2025 at 18:48, David Rowley wrote: > This seems to have caused issues on skink [1] under Valgrind. The > problem seems to be that initscan() does not always initialise > rs_startblock. I'm now trying to figure out if there's some reason for > that, or if tha

Re: Support tid range scan in parallel?

2025-11-26 Thread David Rowley
On Thu, 27 Nov 2025 at 14:07, David Rowley wrote: > I went over this again today and only made a few whitespace > adjustments in the tests. I've now pushed the resulting patch. This seems to have caused issues on skink [1] under Valgrind. The problem seems to be that initscan() does

Re: Partial hash index is not used for implied qual.

2025-11-26 Thread David Rowley
On Thu, 27 Nov 2025 at 07:26, Tom Lane wrote: > I checked the costing calculations and it's basically that > genericcostestimate doesn't understand about hash buckets. > For the partial index, it correctly estimates that we'll visit > all 10 of the tuples in the index, so it supposes that that > m

Re: Support tid range scan in parallel?

2025-11-26 Thread David Rowley
On Tue, 18 Nov 2025 at 14:51, David Rowley wrote: > I've attached v12, which adds a mention in the docs about Parallel TID > Range scans being supported. It also does very minor adjustments to > the comments. Again, I've kept Cary's v10 and the changes I've made >

Re: Some optimizations for COALESCE expressions during constant folding

2025-11-26 Thread David Rowley
On Thu, 27 Nov 2025 at 00:55, Richard Guo wrote: > > On Tue, Nov 25, 2025 at 10:16 PM David Rowley wrote: > > uhh, of course it is. That's what I did in [1] for Consts. Doing it > > this way means we'll not need to modify the constant folding code (or > > which

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

2025-11-26 Thread David Rowley
On Wed, 26 Nov 2025 at 12:37, David Rowley wrote: > I'm not seeing any reason now not to go ahead with this now. Does > anyone else want to take a look at it before I start wielding the > sword of commitment on it? And pushed. Many thanks to Corey and Matheus for having a look at this. David

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

2025-11-25 Thread David Rowley
On Wed, 5 Nov 2025 at 13:16, David Rowley wrote: > I've attached a version with the NOT_USED part removed (and a bunch of > #includes I forgot to remove). The only other change was a minor > revision to some comments. This patch needed to be rebased due to the changes made in b14

Re: Some optimizations for COALESCE expressions during constant folding

2025-11-25 Thread David Rowley
On Wed, 26 Nov 2025 at 02:10, David Rowley wrote: > > On Tue, 25 Nov 2025 at 23:51, Richard Guo wrote: > > (I'm wondering whether it'd be better to consolidate the > > non-null check for Const, Var, and CoalesceExpr into one helper > > function to simplify

Re: Some optimizations for COALESCE expressions during constant folding

2025-11-25 Thread David Rowley
On Tue, 25 Nov 2025 at 23:51, Richard Guo wrote: > (I'm wondering whether it'd be better to consolidate the > non-null check for Const, Var, and CoalesceExpr into one helper > function to simplify the code in eval_const_expressions.) uhh, of course it is. That's what I did in [1] for Consts and e

Re: Partial hash index is not used for implied qual.

2025-11-25 Thread David Rowley
On Tue, 25 Nov 2025 at 15:01, Tom Lane wrote: > Actually, after thinking a bit longer, it'd be better to do something > like the attached so that we don't keep redundant quals unless they'd > *all* be excluded. I think your 1st patch was more along the correct lines. With the 2nd one, I think you

Re: Partial hash index is not used for implied qual.

2025-11-24 Thread David Rowley
On Mon, 24 Nov 2025 at 20:33, Sergei Glukhov wrote: > Partial hash index is not used if qual is an implied qual > since this qual is not added to indrestrictinfo and we cannot > get the keys needed to make hash index scan possible. > Suggested fix is to add implied qual for the indexes > which req

Re: Adjust comments for `IndexOptInfo` to accurately reflect indexcollations's length

2025-11-23 Thread David Rowley
On Mon, 24 Nov 2025 at 14:57, Junwang Zhao wrote: > > On Mon, Nov 24, 2025 at 6:52 AM David Rowley wrote: > > > > On Sun, 23 Nov 2025 at 01:44, Junwang Zhao wrote: > > Thanks. I agree with the patch. We should probably mention > > canreturn[] too. I can take ca

Re: Adjust comments for `IndexOptInfo` to accurately reflect indexcollations's length

2025-11-23 Thread David Rowley
On Sun, 23 Nov 2025 at 01:44, Junwang Zhao wrote: > Length of indexcollations[] in `IndexOptInfo` is nkeycolumns, but > struct comments fail to express this, it says: > > indexkeys[], indexcollations[] each have ncolumns entries. > > So we should polish this description. trivial patch attached. T

Re: another autovacuum scheduling thread

2025-11-23 Thread David Rowley
On Sun, 23 Nov 2025 at 07:35, Robert Haas wrote: > > On Sat, Nov 22, 2025 at 12:28 PM Sami Imseih wrote: > > What I have not been able to prove from my tests is that the processing > > order of tables by autovacuum will actually make things any better or any > > worse. My tests have been short 30

Re: 10% drop in code line count in PG 17

2025-11-20 Thread David Rowley
On Fri, 21 Nov 2025 at 10:26, Bruce Momjian wrote: > > On Fri, Nov 21, 2025 at 10:16:56AM +1300, David Rowley wrote: > > I think you need to keep the "top of the Git tree" comment as git > > ls-files is context-based. > > Uh, the current file has this comment: O

Re: another autovacuum scheduling thread

2025-11-20 Thread David Rowley
On Fri, 21 Nov 2025 at 10:16, Robert Haas wrote: > > On Thu, Nov 20, 2025 at 3:58 PM David Rowley wrote: > > before we need to make a decision. My vote is to use as much of that > > time as possible rather than using it to allow people to dream up > > hypothetical prob

Re: 10% drop in code line count in PG 17

2025-11-20 Thread David Rowley
On Fri, 21 Nov 2025 at 09:27, Bruce Momjian wrote: > # This script is used to compute the total number of "C" lines in the release > -# This should be run from the top of the Git tree after a 'make distclean' > -find . -name '*.[chyl]' | xargs cat| wc -l > +# This should be run from the top of the

Re: another autovacuum scheduling thread

2025-11-20 Thread David Rowley
On Fri, 21 Nov 2025 at 07:36, Robert Haas wrote: > If you want to take more manual control, you can use > the reloption, a choice that you can layer on top of the default > strategy or any of the alternate strategies just proposed. Of course, > making this all too complicated is a recipe for failu

Re: 10% drop in code line count in PG 17

2025-11-19 Thread David Rowley
On Thu, 20 Nov 2025 at 10:58, Bruce Momjian wrote: > I think you are right. Attached is the difference between the output > for 16 & 17. Let me do some more research and run all the versions > again and report back, thanks. Maybe you'd be better with git ls-files if you only want just what's in

Re: GUC thread-safety approaches

2025-11-18 Thread David Rowley
On Tue, 18 Nov 2025 at 21:50, Peter Eisentraut wrote: > where get_config_val_*() would be a thin wrapper around hash_search() > (a bit like the existing GetConfigOption() and find_option(), but > without all the error checking). > > Would that be too expensive? Why couldn't in-core GUCs be fields

Re: Support tid range scan in parallel?

2025-11-17 Thread David Rowley
On Fri, 7 Nov 2025 at 18:31, David Rowley wrote: > I still need to do a bit more testing on this, but in the meantime > thought I'd share what I've done with it so that other people can look > in parallel. I've been looking at the v11 patch again. I did some testing and

Re: Some efforts to get rid of "long" in our codebase

2025-11-16 Thread David Rowley
On Mon, 10 Nov 2025 at 13:53, David Rowley wrote: > v2-0001 wraps the format string as suggested by Heikki, v3-0001 uses > unsigned long long as suggested by Peter. > > v2-0002 is updated to use size_t instead of Size, per Heikki > > Any further opinions or votes on v2-0001 v

Re: obsolete autovacuum comment

2025-11-11 Thread David Rowley
On Wed, 12 Nov 2025 at 09:35, Nathan Bossart wrote: > > On Mon, Nov 10, 2025 at 02:03:27PM -0600, Nathan Bossart wrote: > > Any objections to removing this comment? > > If none materialize, I will proceed with removal in a couple of days. I've never been able to make sense of that comment. I trie

Re: another autovacuum scheduling thread

2025-11-11 Thread David Rowley
On Wed, 12 Nov 2025 at 09:25, Sami Imseih wrote: > The thing I’m hoping to address is something I’ve seen many times in practice. > Autovacuum workers can get stuck on specific large or slow tables, and when > that happens, users often end up running manual vacuums on those tables > just to keep t

Re: another autovacuum scheduling thread

2025-11-11 Thread David Rowley
On Wed, 12 Nov 2025 at 09:13, Nathan Bossart wrote: > > On Wed, Nov 12, 2025 at 09:03:54AM +1300, David Rowley wrote: > > /* when enough time has passed, refresh the list to ensure the > > scores aren't too out-of-date */ > > if (time is >

Re: another autovacuum scheduling thread

2025-11-11 Thread David Rowley
On Wed, 12 Nov 2025 at 05:36, Nathan Bossart wrote: > From skimming the latest discussion, I gather we might want to consider > re-sorting the list periodically. Is the idea that we'll re-sort the > remaining tables in the list, or that we'll basically restart > do_autovacuum()? If it's the latt

Re: another autovacuum scheduling thread

2025-11-10 Thread David Rowley
On Sat, 8 Nov 2025 at 08:23, Sami Imseih wrote: > > I'm confused at why we'd have set up our autovacuum trigger points as > > they are today because we think those are good times to do a > > vacuum/analyze, but then prioritise on something completely different. > > Surely if we think 20% dead tupl

Re: Some efforts to get rid of "long" in our codebase

2025-11-09 Thread David Rowley
r be more explicit about the width of the type by using uint64, but on the other hand, I don't know of any realistic modern hardware/compiler combination where unsigned long long isn't 64-bit. Certainly, the format string is easier to read with %llu. v2-0001 wraps the format string as sugges

Re: Support tid range scan in parallel?

2025-11-06 Thread David Rowley
On Sat, 30 Aug 2025 at 05:32, Cary Huang wrote: > > > The workers are ending their scan early because > > heap_getnextslot_tidrange() returns false on the first call from the > > parallel worker. > Yes, the previous v9 patch missed setting node->trss_mintid and > node->trss_maxtid, causing the

Re: another autovacuum scheduling thread

2025-11-06 Thread David Rowley
On Fri, 7 Nov 2025 at 11:21, Sami Imseih wrote: > Also, I am thinking about another sorting strategy based on average > autovacuum/autoanalyze time per table. The idea is to sort ascending by > the greater of the two averages, so workers process quicker tables first > instead of all workers potent

Re: Refactor StringInfo usage in subscriptioncmds.c

2025-11-06 Thread David Rowley
On Fri, 7 Nov 2025 at 01:47, Mats Kindahl wrote: > > I agree with Amit that there doesn't seem to be a need to free > > pubnames.data. We're already leaking publist, for instance. This is > > okay since we only call these functions during DDL, which in general is > > not sensitive to leaks. > >

Re: Some efforts to get rid of "long" in our codebase

2025-11-06 Thread David Rowley
On Fri, 7 Nov 2025 at 07:33, Peter Eisentraut wrote: > > On 06.11.25 12:46, David Rowley wrote: > > 0002: MemSet / MemSetAligned macros. It's probably about time someone > > made these disappear, but that's likely for another thread with more > > research th

Re: Refactor StringInfo usage in subscriptioncmds.c

2025-11-06 Thread David Rowley
On Fri, 7 Nov 2025 at 00:38, Mats Kindahl wrote: > As discussed in [1] the functions check_publications_origin_tables() and > check_publications_origin_sequences() are building error messages using > dynamically allocated StringInfo instances only to avoid duplicating a > call of ereport(). Looks

Some efforts to get rid of "long" in our codebase

2025-11-06 Thread David Rowley
I was playing around with the following imperfect regex to try and get an idea of how many "long" usages we have. git grep -E "^\s*(static)?\s*(unsigned)?\s*long\b" -- '*.c' '*.h' | wc -l REL_13_STABLE -> 486 REL_14_STABLE -> 482 REL_15_STABLE -> 476 REL_16_STABLE -> 485 REL_17_STABLE -> 449 REL_

Re: Use stack-allocated StringInfoData

2025-11-06 Thread David Rowley
On Fri, 7 Nov 2025 at 00:24, Álvaro Herrera wrote: > Yeah, I came across this a few weeks ago as well. I was thinking maybe > we can expand the ereport() macro instead of duplicating things. It > looks a bit ugly and I don't think we do it anywhere else ... I mean > something like > > errstart(

Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

2025-11-05 Thread David Rowley
On Wed, 5 Nov 2025 at 22:44, David Rowley wrote: > > On Wed, 5 Nov 2025 at 18:26, David Rowley wrote: > > I'm considering restricting that code Path to where the > > path->parent->reloptkind != RELOPT_UPPER_REL. > > I couldn't think of anything better,

Re: Use stack-allocated StringInfoData

2025-11-05 Thread David Rowley
On Thu, 6 Nov 2025 at 00:01, David Rowley wrote: > I can push this tomorrow morning UTC+13. I'll leave it til then in > case anyone else has any comments. I found a few more cases that could get the same treatment and also included the check_publications() one mentioned by Chao. I

Re: Use stack-allocated StringInfoData

2025-11-05 Thread David Rowley
On Tue, 4 Nov 2025 at 21:25, Mats Kindahl wrote: > Here is an updated patch that I checked manually for unnecessary > whitespace changes. Thanks for updating the patch. > There is one such case affected by this patch, and there the buffer > pointer is copied to another variable and then returned

Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

2025-11-05 Thread David Rowley
On Wed, 5 Nov 2025 at 18:26, David Rowley wrote: > I'm considering restricting that code Path to where the > path->parent->reloptkind != RELOPT_UPPER_REL. I couldn't think of anything better, so here's a patch to that effect. David v1-0001-Fix-UNION-planner-est

Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

2025-11-04 Thread David Rowley
On Wed, 5 Nov 2025 at 17:00, Alexander Lakhin wrote: > But while playing around, I've just discovered another interesting error: > > SET enable_seqscan = 'off'; > SELECT * FROM t EXCEPT SELECT * FROM t > UNION SELECT * FROM pt; > > leads to: > ERROR: XX000: no relation entry for relid 0 > LOCATIO

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

2025-11-04 Thread David Rowley
On Wed, 5 Nov 2025 at 08:51, Matheus Alcantara wrote: > > On Mon Nov 3, 2025 at 7:47 PM -03, David Rowley wrote: > > Are you sure you've not got something else in your branch? It applies > > ok here, and the CFbot isn't complaining either. CFBot's is based on

Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

2025-11-04 Thread David Rowley
On Tue, 4 Nov 2025 at 23:00, David Rowley wrote: > > On Tue, 4 Nov 2025 at 22:54, David Rowley wrote: > > The reason we end up with the same result_rel is that we're not > > passing all the relids in fetch_upper_rel(root, UPPERREL_SETOP, > > relids) due to having re

Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

2025-11-04 Thread David Rowley
On Tue, 4 Nov 2025 at 22:54, David Rowley wrote: > The reason we end up with the same result_rel is that we're not > passing all the relids in fetch_upper_rel(root, UPPERREL_SETOP, > relids) due to having removed dummy rels. I guess the fix might be > something like record the

Re: Teaching planner to short-circuit empty UNION/EXCEPT/INTERSECT inputs

2025-11-04 Thread David Rowley
On Tue, 4 Nov 2025 at 21:00, Alexander Lakhin wrote: > Please look at a new anomaly, introduced with 03d40e4b5: > CREATE TABLE t(i integer); > CREATE TABLE pt(i integer) PARTITION BY LIST(i); > > SET enable_seqscan = 'off'; > SELECT * FROM t UNION SELECT * FROM t > UNION ALL > SELECT * FROM pt; >

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

2025-11-03 Thread David Rowley
On Sat, 1 Nov 2025 at 14:19, Corey Huinker wrote: >> >> which might be a more realistic thing if the query without the WHERE >> clause was part of a VIEW. However, we don't currently have any >> infrastructure to detect when a column *is* NULL. There's only the >> opposite with expr_is_nonnullable

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

2025-11-03 Thread David Rowley
On Tue, 4 Nov 2025 at 09:38, Matheus Alcantara wrote: > I looked the code and it seems to be in a good shape, but I tried to > apply the v2 on top of e7ccb247b38 in master to run some tests and a > rebase is necessary. Are you sure you've not got something else in your branch? It applies ok here,

Re: Use stack-allocated StringInfoData

2025-11-03 Thread David Rowley
On Tue, 4 Nov 2025 at 09:20, Mats Kindahl wrote: > This is using Coccinelle to transform the text based on a semantic patch > (which is included in the patch). Unfortunately it mess with the whitespace a > bit so it is necessary to run pg_intent after. I'm surprised that there are > whitespace

Re: Use stack-allocated StringInfoData

2025-11-03 Thread David Rowley
On Mon, 3 Nov 2025 at 20:27, Mats Kindahl wrote: > We can use > >StringInfoData info; >initStringInfo(&info); >... >appendStringInfo(&info, ...); >... >return info.data; > > It was corrected in an earlier commit, but that seems to have been removed s

Re: Should HashSetOp go away

2025-11-02 Thread David Rowley
On Mon, 3 Nov 2025 at 05:26, Tom Lane wrote: > v3 attached. I reviewed the changes. Looks good to me. David

Re: Should HashSetOp go away

2025-11-02 Thread David Rowley
On Sat, 1 Nov 2025 at 08:22, Tom Lane wrote: > I wrote: > > Here's a pair of patches to try to do better. The first one > > is concerned with getting more realistic size estimates for > > TupleHashTables in the planner. The second is some mop-up > > that's been pending for a long time in the sam

Re: another autovacuum scheduling thread

2025-10-31 Thread David Rowley
On Sat, 1 Nov 2025 at 14:50, David Rowley wrote: > If we logged the score, we could do the "unpatched" test with the > patched code, just with commenting out the > list_sort(tables_to_process, TableToProcessComparator); It'd then be > interesting to zero the log_auto

Re: another autovacuum scheduling thread

2025-10-31 Thread David Rowley
On Sat, 1 Nov 2025 at 09:12, Nathan Bossart wrote: > > On Thu, Oct 30, 2025 at 07:38:15PM -0500, Sami Imseih wrote: > > The results above show what I expected: the batch tables receive higher > > priority, as seen from the averages of autovacuum and autoanalyze runs. > > This behavior is expected,

Re: Potential bug introduced in PG17 with query parallelization - plan flip

2025-10-30 Thread David Rowley
On Fri, 31 Oct 2025 at 10:40, Jon Jenkins wrote: > -> Index Scan using index_on_merge_requests_for_latest_diffs > on public.merge_requests (cost=0.57..1011.36 rows=773 width=4) > (actual time=0.015..10.901 rows=9271 loops=1) >Index Cond: (merge_requests.target_project_id

Re: another autovacuum scheduling thread

2025-10-30 Thread David Rowley
On Thu, 30 Oct 2025 at 19:48, wenhui qiu wrote: > I think there might be some misunderstanding — I’m only suggesting > changing > effective_xid_failsafe_age = Max(vacuum_failsafe_age, > autovacuum_freeze_max_age * 1.05); > to > effective_xid_failsafe_age = (v

Re: tuple radix sort

2025-10-29 Thread David Rowley
On Thu, 30 Oct 2025 at 16:46, Chao Li wrote: > > On Oct 30, 2025, at 11:40, John Naylor wrote: > > Are you by chance running with asserts on? It's happened before, so I > > have to make sure. That makes a big difference here because I disabled > > diversion thresholds in assert builds so that reg

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

2025-10-29 Thread David Rowley
On Thu, 30 Oct 2025 at 16:40, Corey Huinker wrote: > I'm in favor of the COUNT(NULL) optimization as well, as one of my favorite > programming tropes is "There is nothing faster than nothing". I think it would be much more interesting to do that if it could be detected in cases like: SELECT COU

Re: another autovacuum scheduling thread

2025-10-29 Thread David Rowley
On Thu, 30 Oct 2025 at 15:58, wenhui qiu wrote: > In fact, with the introduction of the vacuum_max_eager_freeze_failure_rate > feature, if a table’s age still exceeds more than 1.x times the > autovacuum_freeze_max_age, it suggests that the vacuum freeze process is not > functioning properly. O

Re: Fix bogus use of "long" in aset.c

2025-10-29 Thread David Rowley
On Thu, 30 Oct 2025 at 13:24, Tom Lane wrote: > > The reason I left it an unsigned type was that the check is doing: if > > (chsize + ALLOC_CHUNKHDRSZ != blk_used), so we'd still catch this with > > an unsigned type, even if it wrapped due to going negative due to a > > bogus freeptr. Changing to

Re: Use BumpContext contexts for TupleHashTables' tablecxt

2025-10-29 Thread David Rowley
On Thu, 30 Oct 2025 at 13:22, Tom Lane wrote: > > David Rowley writes: > > Is it worth mentioning there that we leave it up to the destruction of > > the hash table itself to when the ExecutorState context is reset? > > (Apparently execGrouping.c does

Re: Fix bogus use of "long" in aset.c

2025-10-29 Thread David Rowley
On Thu, 30 Oct 2025 at 12:55, Tom Lane wrote: > > David Rowley writes: > > I did also consider [u]intptr_t, but thought Size was better as that's > > what chsize is. > > Seems like it's important that the value be signed, so maybe ssize_t? > Or ptrdiff_t?

Fix bogus use of "long" in aset.c

2025-10-29 Thread David Rowley
While reviewing another patch, I was playing around with a test case to trigger a large memory allocation. I was doing this on Windows when I got a nasty looking WARNING in a MEMORY_CONTEXT_CHECKING build: create table t (a int); insert into t values(1); alter table t alter column a set (n_distinc

Re: Use BumpContext contexts for TupleHashTables' tablecxt

2025-10-29 Thread David Rowley
On Thu, 30 Oct 2025 at 08:51, Tom Lane wrote: > > I wrote: > > David Rowley writes: > >> I can't help wonder if we can improve the memory context parameter > >> names in BuildTupleHashTable(). Every time I see "tablecxt" I have to > >> remi

Re: another autovacuum scheduling thread

2025-10-27 Thread David Rowley
On Tue, 28 Oct 2025 at 11:35, Sami Imseih wrote: > We discuss the threshold calculations in the documentation, and users > can write scripts to monitor which tables are eligible. However, there > is nothing that indicates which table autovacuum will work on next (I > have been asked that question

Re: another autovacuum scheduling thread

2025-10-27 Thread David Rowley
The patch is starting to look good. Here's a review of v5: 1. I think the following code at the bottom of relation_needs_vacanalyze() can be deleted. You've added the check to ensure *doanalyze never gets set to true for pg_statistic. /* ANALYZE refuses to work with pg_statistic */ if (relid == S

Re: Use BumpContext contexts for TupleHashTables' tablecxt

2025-10-26 Thread David Rowley
On Mon, 27 Oct 2025 at 16:55, Tom Lane wrote: > Hmm, I wasn't really expecting any direct time saving; the point > was about cutting memory consumption. So Chao Li's nearby results > are in line with mine. It's for the same reason that Hash Join starts to run more slowly once the hash table is l

Re: Enhance statistics reset functions to return reset timestamp

2025-10-26 Thread David Rowley
On Sat, 9 Aug 2025 at 10:38, Andres Freund wrote: > On 2025-08-08 13:18:39 +0900, Shinya Kato wrote: > > I would like to propose a series of patches that enhance the behavior > > of various statistics reset functions by making them return the reset > > timestamp. This change improves usability and

Re: Use BumpContext contexts for TupleHashTables' tablecxt

2025-10-26 Thread David Rowley
On Mon, 27 Oct 2025 at 10:46, David Rowley wrote: > > On Mon, 27 Oct 2025 at 09:55, Tom Lane wrote: > > > > [ starting a new thread to keep this separate from the estimation > > issue ] > > > > I looked at the callers of BuildTupleHashTable, and realized that

Re: Use BumpContext contexts for TupleHashTables' tablecxt

2025-10-26 Thread David Rowley
On Mon, 27 Oct 2025 at 11:11, Tom Lane wrote: > Related to this, while I was chasing Jeff's complaint I realized that > the none-too-small simplehash table for this is getting made in the > query's ExecutorState. That's pretty awful from the standpoint of > being able to blame memory consumption

Re: Should HashSetOp go away

2025-10-26 Thread David Rowley
On Mon, 27 Oct 2025 at 07:00, Tom Lane wrote: > > Jeff Janes writes: > > I was thinking of ways to improve the memory usage (or at least its > > estimation) but decided maybe it would be better if HashSetOp went away > > entirely. As far as I can tell HashSetOp has nothing to recommend it other

Re: Use BumpContext contexts for TupleHashTables' tablecxt

2025-10-26 Thread David Rowley
On Mon, 27 Oct 2025 at 09:55, Tom Lane wrote: > > [ starting a new thread to keep this separate from the estimation > issue ] > > I looked at the callers of BuildTupleHashTable, and realized that > (a) every one of them can use a BumpContext for the "tablecxt", > and (b) Jeff Davis already noticed

Re: another autovacuum scheduling thread

2025-10-25 Thread David Rowley
On Sat, 25 Oct 2025 at 04:08, Nathan Bossart wrote: > Here is an updated patch based on the latest discussion. Thanks. I've just had a look at it. A few comments and questions. 1) The subtraction here looks back to front: + xid_age = TransactionIdIsNormal(relfrozenxid) ? relfrozenxid - recentXi

Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

2025-10-24 Thread David Rowley
Since e2debb643, we've had the ability to determine if a column is NULLable as early as during constant folding. This seems like a good time to consider converting COUNT(not_null_col) into COUNT(*), which is faster and may result in far fewer columns being deformed from the tuple. To make this wo

Re: another autovacuum scheduling thread

2025-10-24 Thread David Rowley
On Sat, 25 Oct 2025 at 10:14, Peter Geoghegan wrote: > > On Wed, Oct 22, 2025 at 3:35 PM David Rowley wrote: > > If we had the varying sleep time as I mentioned above, the > > failsafe code could even be removed as the > > "autovacuum_vacuum_cost_delay / " calc

Re: another autovacuum scheduling thread

2025-10-23 Thread David Rowley
On Fri, 24 Oct 2025 at 09:48, Sami Imseih wrote: > Yes, in my last reply, I did indicate that the sort will likely not be > the operation that will tip the performance over, but the > catalog scan itself that I have seen not scale well as the number of > relations grow ( in cases of thousands or h

Re: another autovacuum scheduling thread

2025-10-23 Thread David Rowley
On Fri, 24 Oct 2025 at 08:33, Sami Imseih wrote: > Yeah, you’re correct, the list already exists; sorry I missed that. My > main concern is > the additional overhead of the sort operation, especially if we have > many eligible > tables and an aggressive autovacuum_naptime. It is true that there a

Re: pgsql: Use CompactAttribute more often, when possible

2025-10-23 Thread David Rowley
On Fri, 24 Oct 2025 at 01:21, Robert Haas wrote: > David, I've noticed several times lately you've documented in commit > messages why you thought going further with some change would be > risky, and I really like that. I think sometimes we (as committers) > are sometimes reluctant to document cas

Re: Issue with query_is_distinct_for() and grouping sets

2025-10-22 Thread David Rowley
On Thu, 23 Oct 2025 at 16:43, Richard Guo wrote: > How about using the following wording in the commit message? > > " > The previous logic in query_is_distinct_for() was incomplete because > the check was insufficiently thorough and could return false when it > should have returned true. > " "it

Re: Issue with query_is_distinct_for() and grouping sets

2025-10-22 Thread David Rowley
On Thu, 23 Oct 2025 at 15:59, David Rowley wrote: > > No backpatch as this could result in plan changes. > > If this is broken then it'll need to be backpatched as if that > function returns true when it should return false, then you could have > LEFT JOINs being removed

Re: Issue with query_is_distinct_for() and grouping sets

2025-10-22 Thread David Rowley
On Thu, 23 Oct 2025 at 15:45, Richard Guo wrote: > > On Wed, Oct 22, 2025 at 6:25 PM Richard Guo wrote: > > Attached is a patch along the lines of option #2. The LCOV report > > indicates that there is currently no test coverage for the "else if > > (query->groupingSets)" branch in query_is_dist

Re: another autovacuum scheduling thread

2025-10-22 Thread David Rowley
On Thu, 23 Oct 2025 at 07:58, Nathan Bossart wrote: > > That's a good point. I wonder if we should try to make the wraparound > > score independent of the *_freeze_max_age parameters (once the table age > > surpasses said parameters). Else, different settings will greatly impact > > how aggressi

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

2025-10-21 Thread David Rowley
On Mon, 20 Oct 2025 at 16:59, Tatsuo Ishii wrote: > > > A very trivial commit: > > > > ``` > > + else > > + > > + /* > > + * For other cases we have no idea what position of row > > callers would > > + * fetch next time. Also for relpos < 0 case (we go >

Re: Use CompactAttribute more often, when possible

2025-10-21 Thread David Rowley
On Mon, 20 Oct 2025 at 21:43, David Rowley wrote: > > On Mon, 20 Oct 2025 at 21:15, Michael Paquier wrote: > > > > On Mon, Oct 20, 2025 at 05:46:33PM +1300, David Rowley wrote: > > @@ -5146,10 +5146,6 @@ ReorderBufferToastReplace(ReorderBuffer *rb, &

Re: another autovacuum scheduling thread

2025-10-21 Thread David Rowley
On Wed, 22 Oct 2025 at 03:38, Nathan Bossart wrote: > I did something similar to this in v3, although I used the *_freeze_max_age > parameters as the point to start scaling aggressively, and I simply raised > the score to the power of 10. > > I've yet to do any real testing with this stuff. I've

Re: BRIN: Prevent the heapblk overflow during index summarization on very large tables resulting in an infinite loop

2025-10-21 Thread David Rowley
On Tue, 21 Oct 2025 at 18:53, Michael Paquier wrote: > FWIW, I am on the same line as you. Your suggestions are better than > what the proposed patch does, as far as I've looked. Pushed with the agreed comment change and type change to pageno. David

Re: BRIN: Prevent the heapblk overflow during index summarization on very large tables resulting in an infinite loop

2025-10-20 Thread David Rowley
On Tue, 21 Oct 2025 at 15:55, Michael Paquier wrote: > Using signed or unsigned is not going to matter much at the end. We > would be far from the count even if the number is signed. I'd leave it as uint64. There's no reason to mixup the signedness between these two variables. > +* Sinc

Re: Use CompactAttribute more often, when possible

2025-10-20 Thread David Rowley
On Mon, 20 Oct 2025 at 22:33, Chao Li wrote: > I have removed those ones that you don’t want in v2 diff. Looks like it's just the rowtypes.c ones that you have that I didn't list. get_sql_insert() and FreeTupleDesc() were in my reject list. The extra ones you've found seem to match a similar pat

Re: Use CompactAttribute more often, when possible

2025-10-20 Thread David Rowley
On Mon, 20 Oct 2025 at 21:34, Chao Li wrote: > I search for “TupleDescAttr”, and found more occurrences to replace. See the > attached diff file. Thanks for looking. There was a bunch that I didn't do and tried to convey that in the commit message. The diff you sent seems to contain a mix of m

Re: Use CompactAttribute more often, when possible

2025-10-20 Thread David Rowley
On Mon, 20 Oct 2025 at 21:15, Michael Paquier wrote: > > On Mon, Oct 20, 2025 at 05:46:33PM +1300, David Rowley wrote: > > I don't think the attached is very interesting to look at, but l'll > > leave it here for a bit just in case anyone wants to look. Otherwise, >

Use CompactAttribute more often, when possible

2025-10-19 Thread David Rowley
5983a4cff added CompactAttribute to TupleDesc to allow commonly accessed fields from the FormData_pg_attribute array to be accessed more quickly by having to load fewer cachelines from memory. That commit also went around and changed many locations to use CompactAttribute, but not all locations. T

Re: BRIN: Prevent the heapblk overflow during index summarization on very large tables resulting in an infinite loop

2025-10-19 Thread David Rowley
On Sun, 19 Oct 2025 at 19:03, sunil s wrote: > Previously, heapBlk was defined as an unsigned 32-bit integer. When > incremented > by pagesPerRange on very large tables, it could wrap around, causing the > condition > heapBlk < nblocks to remain true indefinitely — resulting in an infinite loop.

Re: bug, ALTER TABLE call ATPostAlterTypeCleanup twice for the same relation

2025-10-18 Thread David Rowley
On Sat, 27 Sept 2025 at 21:54, jian he wrote: > if (pass == AT_PASS_ALTER_TYPE || pass == AT_PASS_SET_EXPRESSION) > - ATPostAlterTypeCleanup(wqueue, tab, lockmode); > + { > + if (!list_member_oid(relids, tab->relid)) > + { > + ATPostAlterTypeCleanup(wqueue, tab, lockmode); > + relids = lappend_oi

  1   2   3   4   5   6   7   8   9   10   >