Re: index prefetching

2025-08-28 Thread Thomas Munro
On Fri, Aug 29, 2025 at 11:52 AM Tomas Vondra wrote: > True. But one worker did show up in top, using a fair amount of CPU, so > why wouldn't the others (if they process the same stream)? It deliberately concentrates wakeups into the lowest numbered workers that are marked idle in a bitmap. * hi
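
The wakeup policy described above (prefer the lowest-numbered worker whose bit is set in an idle bitmap) can be sketched as follows. This is only an illustration under assumptions: the names `demo_choose_idle_worker` and `demo_clear_idle` are hypothetical, not PostgreSQL functions, and the 64-bit mask width is assumed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch: return the lowest-numbered worker whose bit is set
 * in idle_mask, or -1 if no worker is marked idle.
 */
static int
demo_choose_idle_worker(uint64_t idle_mask)
{
    for (int i = 0; i < 64; i++)
    {
        if (idle_mask & (UINT64_C(1) << i))
            return i;
    }
    return -1;
}

/* Hypothetical sketch: clear one worker's idle bit once it has been woken. */
static uint64_t
demo_clear_idle(uint64_t idle_mask, int worker)
{
    assert(worker >= 0 && worker < 64);
    return idle_mask & ~(UINT64_C(1) << worker);
}
```

With a policy like this, repeated wakeups keep landing on the same low-numbered workers whenever they are idle again, which would be consistent with a single worker dominating in top.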

Re: index prefetching

2025-08-28 Thread Thomas Munro
On Fri, Aug 29, 2025 at 7:52 AM Andres Freund wrote: > On 2025-08-28 19:08:40 +0200, Tomas Vondra wrote: > > From the 2x regression (compared to master) it might seem like that, but > > even with the increased distance it's still slower than master (by 25%). So > > maybe the "error" is to use AIO

Re: Non-reproducible AIO failure

2025-08-27 Thread Thomas Munro
On Thu, Aug 28, 2025 at 11:08 AM Andres Freund wrote: > On 2025-08-26 16:59:54 +0300, Konstantin Knizhnik wrote: > > Still it is not quite clear to me how bitfields can cause this issue. > > Same. Here's what I speculated after reading the generated asm[1]: "Could it be that the store buffer was

Re: Non-reproducible AIO failure

2025-08-25 Thread Thomas Munro
On Tue, Aug 26, 2025 at 12:37 PM Andres Freund wrote: > I'm a bit confused by this focus on bitfields - both Alexander and Konstantin > stated they could reproduce the issue without the bitfields. Konstantin's messages all seem to say it *did* fix it? But I do apologise for working through the sa

Re: Non-reproducible AIO failure

2025-08-25 Thread Thomas Munro
On Tue, Aug 26, 2025 at 12:45 PM Andres Freund wrote: > On 2025-08-25 10:43:21 +1200, Thomas Munro wrote: > > On Mon, Aug 25, 2025 at 6:11 AM Konstantin Knizhnik > > wrote: > > > In theory even replacing bitfield with in should not > > > avoid race condition,

Re: index prefetching

2025-08-25 Thread Thomas Munro
On Tue, Aug 26, 2025 at 2:18 AM Tomas Vondra wrote: > Of course, this can happen even with other hit ratios, there's nothing > special about 50%. Right, that's what this patch was attacking directly, basically only giving up when misses are so sparse we can't do anything about it for an ordered s

Re: Calling PGReserveSemaphores() from CreateOrAttachShmemStructs

2025-08-25 Thread Thomas Munro
On Mon, Aug 25, 2025 at 9:10 PM Ashutosh Bapat wrote: > Is this change correct? Was there any reason to leave it like that in > e25626677f8076eb3ce94586136c5464ee154381? Or was it just something > that didn't fit in that commit? We/I just missed that opportunity when ripping that stuff out. It s

Re: Non-reproducible AIO failure

2025-08-24 Thread Thomas Munro
On Mon, Aug 25, 2025 at 2:41 PM Thomas Munro wrote: > On Mon, Aug 25, 2025 at 1:52 PM Thomas Munro wrote: > > > struct { PgAioHandleState v:8; } state; > > > > This preserves type safety and compiles to strb, two properties we > > want, but it seems to waste

Re: Non-reproducible AIO failure

2025-08-24 Thread Thomas Munro
On Mon, Aug 25, 2025 at 1:52 PM Thomas Munro wrote: > > struct { PgAioHandleState v:8; } state; > > This preserves type safety and compiles to strb, two properties we > want, but it seems to waste space (look at the offsets for the > stores): > > a.out[0x105f8

Re: Non-reproducible AIO failure

2025-08-24 Thread Thomas Munro
On Mon, Aug 25, 2025 at 11:42 AM Nico Williams wrote: > I think the issue is that if the compiler decides to coalesce what we > think of as distinct (but neighboring) bitfields, then when you update > one of the bitfields you could be updating the other with stale data > from an earlier read where
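
The hazard described in the quoted text can be replayed deterministically in a single thread. The sketch below is an assumption-laden illustration, not the AIO code: it packs two hypothetical 8-bit fields, A and B, into one 32-bit storage unit and performs the load, modify, store sequence by hand, the way a compiler might for coalesced bitfields. All `demo_` names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: two 8-bit fields sharing one 32-bit storage unit. */
#define DEMO_A_SHIFT 0
#define DEMO_B_SHIFT 8
#define DEMO_FIELD_MASK UINT32_C(0xff)

/* Return word with the field at 'shift' replaced by 'value'. */
static uint32_t
demo_set_field(uint32_t word, unsigned shift, uint32_t value)
{
    assert(value <= DEMO_FIELD_MASK);
    return (word & ~(DEMO_FIELD_MASK << shift)) | (value << shift);
}

/* Extract the field at 'shift' from word. */
static uint32_t
demo_get_field(uint32_t word, unsigned shift)
{
    return (word >> shift) & DEMO_FIELD_MASK;
}

/*
 * Replay one unlucky interleaving of two writers and return the final
 * contents of the shared storage unit.
 */
static uint32_t
demo_lost_update(void)
{
    uint32_t shared = 0;
    uint32_t stale;

    shared = demo_set_field(shared, DEMO_A_SHIFT, 1);
    shared = demo_set_field(shared, DEMO_B_SHIFT, 1);

    stale = shared;                                     /* writer of A loads the whole unit */
    shared = demo_set_field(shared, DEMO_B_SHIFT, 2);   /* other writer updates B */
    shared = demo_set_field(stale, DEMO_A_SHIFT, 5);    /* writer of A stores the whole unit back */

    return shared;
}
```

After `demo_lost_update()`, field A holds the new value 5 but field B is back to 1: the concurrent update of B to 2 was overwritten by the stale copy. A store that touches only its own byte (such as the `strb` mentioned elsewhere in this thread) would not have this problem.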

Re: Non-reproducible AIO failure

2025-08-24 Thread Thomas Munro
On Mon, Aug 25, 2025 at 6:11 AM Konstantin Knizhnik wrote: > In theory even replacing bitfield with in should not > avoid race condition, because they are still shared the same cache line. I'm no expert in this stuff, but that's not my understanding of how it works. Plain stores to normal memory

Re: Non-reproducible AIO failure

2025-08-24 Thread Thomas Munro
On Sun, Aug 24, 2025 at 5:32 AM Konstantin Knizhnik wrote: > On 20/08/2025 9:00 PM, Alexander Lakhin wrote: > > for i in {1..10}; do np=$((20 + $RANDOM % 10)); echo "iteration $i: > > $np"; time parallel -j40 --linebuffer --tag /tmp/repro-AIO-Assert.sh > > {} ::: `seq $np` || break; sleep $(($RAN

Re: VM corruption on standby

2025-08-21 Thread Thomas Munro
On Fri, Aug 22, 2025 at 10:27 AM Alexander Korotkov wrote: > And let's retry it for v19. +1 I'm hoping we can fix PM death handling soon, and then I assume this can go straight back in without modification. CVs are an essential low level synchronisation component that really should work in lots

Re: Redesigning postmaster death handling

2025-08-21 Thread Thomas Munro
On Thu, Aug 21, 2025 at 5:45 PM Tom Lane wrote: > One other thought here: do we *really* want such a critical-and-hard- > to-test aspect of our behavior to be handled completely differently > on different platforms? I'd lean to ignoring the Linux/FreeBSD > facilities, because otherwise we're basi

Re: Redesigning postmaster death handling

2025-08-20 Thread Thomas Munro
On Thu, Aug 21, 2025 at 5:28 PM Tom Lane wrote: > Hmm. Up to now, we have not had an assumption that postmaster > children are aware of every other postmaster child. In particular, > not all postmaster children have PGPROC entries. How much does > this matter? What happens if the shared PGPROC

Re: VM corruption on standby

2025-08-20 Thread Thomas Munro
On Wed, Aug 20, 2025 at 3:47 PM Tom Lane wrote: > Having said that, we should in any case have a better story on > what WaitEventSetWait should do after detecting postmaster death. > So I'm all for trying to avoid the proc_exit path if we can > design a better answer. Yeah. I've posted a concept

Redesigning postmaster death handling

2025-08-20 Thread Thomas Munro
Hi, Here's an experimental patch to fix our shutdown strategy on postmaster death, as discussed in a nearby report[1]. Maybe it's possible to switch to _exit() without also switching to preemptive handling, but it seems fragile and painful for no gain. Following that line of thinking, we might a

Re: VM corruption on standby

2025-08-19 Thread Thomas Munro
On Wed, Aug 20, 2025 at 11:59 AM Thomas Munro wrote: > they can't all be > blocked in sig_wait() unless there is already a deadlock. s/sig_wait()/sem_wait()/

Re: VM corruption on standby

2025-08-19 Thread Thomas Munro
On Wed, Aug 20, 2025 at 7:50 AM Tom Lane wrote: > I'm inclined to think that we do want to prohibit WaitEventSetWait > inside a critical section --- it just seems like a bad idea all > around, even without considering this specific failure mode. FWIW aio/README.md describes a case where we'd need

Re: VM corruption on standby

2025-08-19 Thread Thomas Munro
On Wed, Aug 20, 2025 at 2:57 AM Andres Freund wrote: > On 2025-08-20 02:54:09 +1200, Thomas Munro wrote: > > > On linux - the primary OS with OOM killer troubles - I'm pretty sure'll > > > lwlock > > > waiters would get killed due to the postmaster death s

Re: VM corruption on standby

2025-08-19 Thread Thomas Munro
On Wed, Aug 20, 2025 at 1:56 AM Andres Freund wrote: > On 2025-08-19 02:13:43 -0400, Tom Lane wrote: > > > Then wouldn't backends blocked in LWLockAcquire(x) hang forever, after > > > someone who holds x calls _exit()? > > > > If someone who holds x is killed by (say) the OOM killer, how do > > we

Re: VM corruption on standby

2025-08-18 Thread Thomas Munro
On Tue, Aug 19, 2025 at 4:52 AM Tom Lane wrote: > But I'm of the opinion that proc_exit > is the wrong thing to use after seeing postmaster death, critical > section or no. We should assume that system integrity is already > compromised, and get out as fast as we can with as few side-effects > as

Re: Memory leak of SMgrRelation object on standby

2025-08-17 Thread Thomas Munro
On Sat, Aug 16, 2025 at 12:50 AM Jingtang Zhang wrote: > Back to v17, commit 21d9c3ee gave SMgrRelation a well-defined lifetime, and > smgrclose no longer removes SMgrRelation object from the hashtable, leaving > the work to smgrdestroyall. But I find a place that relies on the removing > behavior

Re: [PING] fallocate() causes btrfs to never compress postgresql files

2025-08-17 Thread Thomas Munro
On Fri, Aug 8, 2025 at 1:38 AM Magnus Hagander wrote: > On Tue, Aug 5, 2025 at 3:08 PM Thomas Munro wrote: >> We discussed that a bit earlier in the thread. Some problems about >> layering violations and general weirdness, I recall trying it even. >> On the flip side, i

Re: [PATCH] Let's get rid of the freelist and the buffer_strategy_lock

2025-08-16 Thread Thomas Munro
On Sun, Aug 17, 2025 at 4:34 PM Thomas Munro wrote: > Or if you don't like those odds, maybe it'd be OK to keep % but use it > rarely and without the CAS that can fail. ... or if we wanted to try harder to avoid %, could we relegate it to the unlikely CLOCK-went-all-the-way-aro

Re: [PATCH] Let's get rid of the freelist and the buffer_strategy_lock

2025-08-16 Thread Thomas Munro
On Sat, Aug 16, 2025 at 3:37 PM Thomas Munro wrote: > while (hand >= NBuffers) > { > /* Base value advanced by backend that overshoots by one tick. */ > if (hand == NBuffers) > pg_atomic_fetch_add_u64(&StrategyControl->ticks_base,

Re: [PATCH] Let's get rid of the freelist and the buffer_strategy_lock

2025-08-15 Thread Thomas Munro
On Wed, Aug 13, 2025 at 9:42 AM Greg Burd wrote: > Amazing, thank you. I'll try to replicate your tests tomorrow to see if > my optimized division and modulo functions do in fact help or not. I > realize that both you and Andres are (rightly) concerned that the > performance impact of IDIV on so

Re: `pg_ctl init` crashes when run concurrently; semget(2) suspected

2025-08-15 Thread Thomas Munro
On Sat, Aug 16, 2025 at 2:25 PM Thomas Munro wrote: > Supposing posix_sema.c checked that the maximum number of backends > didn't exceed SEM_VALUE_MAX and refused to start up if so (I suppose > today if you later exceed it in sem_post() you'll get either FATAL: > EOVERFLOW

Re: `pg_ctl init` crashes when run concurrently; semget(2) suspected

2025-08-15 Thread Thomas Munro
On Sat, Aug 16, 2025 at 5:58 AM Burd, Greg wrote: > > 1. They use CAS in sem_post() because they want to report EOVERFLOW if > > you exceed SEM_VALUE_MAX, but POSIX doesn't seem to require that, so I > > just used fetch-and-add. Is that bad? I noticed their alternative > > older version (remove

Re: Potential deadlock in pgaio_io_wait()

2025-08-14 Thread Thomas Munro
I discussed this off-list with Andres who provided the following review: * +1 for the analysis * +1 for the solution * his benchmark machine shows no regression under heavy IO submission workload * needs better comments I had expected that patch to be rejected as too slow. I was thinking tha

Re: index prefetching

2025-08-14 Thread Thomas Munro
On Fri, Aug 15, 2025 at 1:47 PM Thomas Munro wrote: > (rather than introducing a secondary reference > counting scheme in the WAL that I think you might be describing?), and s/WAL/read stream/

Re: index prefetching

2025-08-14 Thread Thomas Munro
On Fri, Aug 15, 2025 at 11:21 AM Tomas Vondra wrote: > I don't recall all the details, but IIRC my impression was it'd be best > to do this "caching" entirely in the read_stream.c (so the next_block > callbacks would probably not need to worry about lastBlock at all), > enabled when creating the s

Re: index prefetching

2025-08-13 Thread Thomas Munro
On Thu, Aug 14, 2025 at 9:19 AM Tomas Vondra wrote: > I did investigate this, and I don't think there's anything broken in > read_stream. It happens because ReadStream has a concept of "ungetting" > a block, which can happen after hitting some I/O limits. > > In that case we "remember" the last bl

Re: `pg_ctl init` crashes when run concurrently; semget(2) suspected

2025-08-13 Thread Thomas Munro
On Wed, Aug 13, 2025 at 7:38 PM Thomas Munro wrote: > Here's a new attempt at that. It picks futexes automatically, unless > you export MACOSX_DEPLOYMENT_TARGET=14.3 or lower, and then it picks > sysv as before. I compared my hand-rolled atomics logic against FreeBSD's user

Re: `pg_ctl init` crashes when run concurrently; semget(2) suspected

2025-08-13 Thread Thomas Munro
On Wed, Aug 13, 2025 at 12:45 PM Thomas Munro wrote: > Maybe I should have another go at that! Here's a new attempt at that. It picks futexes automatically, unless you export MACOSX_DEPLOYMENT_TARGET=14.3 or lower, and then it picks sysv as before. v1-0001-Use-futex-based-semap

Re: `pg_ctl init` crashes when run concurrently; semget(2) suspected

2025-08-12 Thread Thomas Munro
On Wed, Aug 13, 2025 at 11:29 AM Thomas Munro wrote: > FWIW in early prototype multithreading patches you can just use > sem_init() on all these systems since you don't need pshared=1. Oops, I misremembered that. It works on NetBSD and OpenBSD, but not macOS :-(. Hmm, but it looks

Re: `pg_ctl init` crashes when run concurrently; semget(2) suspected

2025-08-12 Thread Thomas Munro
On Wed, Aug 13, 2025 at 8:19 AM Tom Lane wrote: > In the meantime, though, we'll have to deal with the existing > behavior for years to come. So I'll go ahead with that patch. > I like having a loop limit there anyway --- I was never exactly > convinced that retrying indefinitely was a good idea.

Re: `pg_ctl init` crashes when run concurrently; semget(2) suspected

2025-08-12 Thread Thomas Munro
On Mon, Aug 11, 2025 at 10:56 AM Tom Lane wrote: > A bit more research later: OpenBSD behaves like NetBSD, while > FreeBSD behaves like Linux. So that's pretty inconclusive about > what the aboriginal behavior was. I also found that OpenIndiana > behaves like Linux. Out of curiosity, FreeBSD ch

Re: index prefetching

2025-08-12 Thread Thomas Munro
On Tue, Aug 12, 2025 at 11:22 PM Nazir Bilal Yavuz wrote: > Unfortunately this doesn't work. We need to handle backwards I/O > combining in the StartReadBuffersImpl() function too as buffer indexes > won't have correct blocknums. Also, I think buffer forwarding of split > backwards I/O should be h

Re: index prefetching

2025-08-11 Thread Thomas Munro
On Tue, Aug 12, 2025 at 11:42 AM Peter Geoghegan wrote: > On Mon, Aug 11, 2025 at 5:07 PM Tomas Vondra wrote: > > I can do some tests with forward vs. backwards scans. Of course, the > > trouble with finding these weird cases is that they may be fairly rare. > > So hitting them is a matter of luc

Re: [PATCH] OAuth: fix performance bug with stuck multiplexer events

2025-08-08 Thread Thomas Munro
On Fri, Aug 8, 2025 at 8:04 AM Jacob Champion wrote: > On Thu, Aug 7, 2025 at 11:11 AM Jacob Champion > wrote: > > Thank you so much for the reviews! > > Here is v4, with the feedback from both of you. 0001-0004 are planned > for backport; 0005 is slated for master only. Thanks again for the > re

Re: index prefetching

2025-08-06 Thread Thomas Munro
On Thu, Aug 7, 2025 at 8:41 AM Peter Geoghegan wrote: > On Tue, Aug 5, 2025 at 7:31 PM Thomas Munro wrote: > > There must be a similar opportunity for parallel index scans. It has > > that "seize the scan" concept where parallel workers do one-at-a-time > > locked

Re: [PATCH] OAuth: fix performance bug with stuck multiplexer events

2025-08-06 Thread Thomas Munro
On Thu, Aug 7, 2025 at 1:45 PM Thomas Munro wrote: > I like the C TAP test. PostgreSQL > needs more of this. I should add, I didn't look closely at that part since you said it's not in scope for back-patching. I'd like to, though, later. I wonder if you would be interest

Re: [PATCH] OAuth: fix performance bug with stuck multiplexer events

2025-08-06 Thread Thomas Munro
On Thu, Aug 7, 2025 at 11:55 AM Jacob Champion wrote: > On Wed, Aug 6, 2025 at 9:13 AM Jacob Champion > wrote: > > Maybe "drain" would no longer be the > > verb to use there. > > I keep describing this as "combing" the queue when I talk about it in > person, so v3-0001 renames this new operation

Re: [PATCH] OAuth: fix performance bug with stuck multiplexer events

2025-08-06 Thread Thomas Munro
On Tue, Aug 5, 2025 at 3:24 AM Jacob Champion wrote: > On Mon, Aug 4, 2025 at 7:53 AM Thomas Munro wrote: > > [FYI, I'm looking into this and planning to post a review in 1-2 days...] 0001: So, the problem is that poll(kqueue_fd) reports POLLIN if any events are queued, but l

Re: index prefetching

2025-08-05 Thread Thomas Munro
On Wed, Aug 6, 2025 at 9:35 AM Peter Geoghegan wrote: > On Tue, Aug 5, 2025 at 4:56 PM Tomas Vondra wrote: > > True, the complex patch could prefetch the leaf pages. There must be a similar opportunity for parallel index scans. It has that "seize the scan" concept where parallel workers do one-

Re: [PING] fallocate() causes btrfs to never compress postgresql files

2025-08-05 Thread Thomas Munro
On Tue, Jul 29, 2025 at 6:52 PM Magnus Hagander wrote: > Not just to throw a wrench in there, but... Should this perhaps be a > tablespace option? ISTM having different filesystems for them is a good > reason to use tablespaces in the first place, and then being able to pick > different options

Kernel AIO on FreeBSD, macOS and a couple of other Unixen

2025-08-05 Thread Thomas Munro
Hi, Here is a proof-of-concept patch for io_method=posix_aio. It works pretty well on FreeBSD, making good use of a couple of extensions. It's working better than all my previous attempts on macOS, but I guess it's more of a curiosity for developers, and scan performance is also affected by the

Re: [PATCH] OAuth: fix performance bug with stuck multiplexer events

2025-08-04 Thread Thomas Munro
On Tue, Jul 29, 2025 at 8:52 AM Jacob Champion wrote: > On Thu, Jun 26, 2025 at 4:33 PM Jacob Champion > wrote: > > My plan, if this code seems reasonable, is to backport 0001-0003, but > > keep the larger 0004 on HEAD only until it has proven to be stable. > > It's a big new suite and I want to

Re: Potential deadlock in pgaio_io_wait()

2025-08-03 Thread Thomas Munro
On Mon, Aug 4, 2025 at 5:54 PM Thomas Munro wrote: > I doubt it's very easy to reproduce with simple queries, but I assume > if you had a SQL function that acquires a central LWLock and you ran > concurrent queries SELECT COUNT(*) FROM t WHERE locking_function(x) Hmm, that's

Re: Automatically sizing the IO worker pool

2025-08-03 Thread Thomas Munro
On Wed, Jul 30, 2025 at 10:15 PM Dmitry Dolgov <9erthali...@gmail.com> wrote: > Thanks. I was experimenting with this approach, and realized there isn't > much metrics exposed about workers and the IO queue so far. Since the Hmm. You can almost infer the depth from the pg_aios view. All IOs in u

Re: Remove INT64_HEX_FORMAT and UINT64_HEX_FORMAT

2025-08-03 Thread Thomas Munro
On Sun, Aug 3, 2025 at 6:25 AM Nathan Bossart wrote: > On Sat, Aug 02, 2025 at 11:09:16AM +0200, Peter Eisentraut wrote: > > These were introduced (commit efdc7d74753) at the same time as we were > > moving to using the standard inttypes.h format macros (commit a0ed19e0a9e). > > It doesn't seem us

Re: Optimize LISTEN/NOTIFY

2025-07-22 Thread Thomas Munro
On Wed, Jul 23, 2025 at 1:39 PM Joel Jacobson wrote: > In their patch, in async.c's SignalBackends(), they do > SendInterrupt(INTERRUPT_ASYNC_NOTIFY, procno) instead of > SendProcSignal(pid, PROCSIG_NOTIFY_INTERRUPT, procnos[i]). They don't > seem to check if the backend is already signalled or not

Re: index prefetching

2025-07-22 Thread Thomas Munro
On Wed, Jul 23, 2025 at 1:55 AM Tomas Vondra wrote: > On 7/21/25 14:39, Thomas Munro wrote: > > Here also are some alternative experimental patches for preserving > > accumulated look-ahead distance better in cases like that. Needs more > > exploration... thoughts/ideas

Re: [PATCH] Optimize ProcSignal to avoid redundant SIGUSR1 signals

2025-07-22 Thread Thomas Munro
On Wed, Jul 23, 2025 at 8:08 AM Joel Jacobson wrote: > Previously, ProcSignal used an array of volatile sig_atomic_t flags, one > per signal reason. A sender would set a flag and then unconditionally > send a SIGUSR1 to the target process. This could result in a storm of > redundant signals if mul

Re: index prefetching

2025-07-21 Thread Thomas Munro
On Sun, Jul 20, 2025 at 1:07 AM Thomas Munro wrote: > On Sat, Jul 19, 2025 at 11:23 PM Tomas Vondra wrote: > > Thanks for the link. It seems I came up with almost the same patch, > > with three minor differences: > > > > 1) There's another p

Re: index prefetching

2025-07-20 Thread Thomas Munro
On Sun, Jul 20, 2025 at 1:07 AM Thomas Munro wrote: > On Sat, Jul 19, 2025 at 11:23 PM Tomas Vondra wrote: > > The thing that however concerns me is that what I observed was not the > > distance getting reset to 1, and then ramping up. Which should happen > > pretty q

Re: index prefetching

2025-07-19 Thread Thomas Munro
On Sat, Jul 19, 2025 at 11:23 PM Tomas Vondra wrote: > Thanks for the link. It seems I came up with almost the same patch, > with three minor differences: > > 1) There's another place that sets "distance = 0" in > read_stream_next_buffer, so maybe this should preserve the distance too? > > 2) I

Re: index prefetching

2025-07-18 Thread Thomas Munro
On Sat, Jul 19, 2025 at 6:31 AM Tomas Vondra wrote: > Perhaps the ReadStream should do something like this? Of course, the > simple patch resets the stream very often, likely much more often than > anything else in the code. But wouldn't it be beneficial for streams > reset because of a rescan? Po

Re: meson's in-tree libpq header search order vs -Dextra_include_dirs

2025-07-13 Thread Thomas Munro
ion tests: -pg_regress_inc = include_directories('.') +pg_regress_inc = [libpq_inc, include_directories('.')] [1] https://mesonbuild.com/Reference-manual_elementary_dict.html From 0ba525115e6c6b7b9cadd73b3db9e26c77b8c1a2 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Mon, 14 J

Re: Automatically sizing the IO worker pool

2025-07-11 Thread Thomas Munro
tting to it when I'll get back), and can > imagine practical considerations significantly impacting any potential > solution. Here's a rebase. From fa7aac1bc9c0a47fbdbd9459424f08fa61b71ce2 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Fri, 11 Apr 2025 21:17:26 +1200 Subject: [

Re: I/O worker and ConfigReload

2025-07-11 Thread Thomas Munro
On Sun, May 25, 2025 at 2:34 AM Dmitry Dolgov <9erthali...@gmail.com> wrote: > I see thanks. Indeed, there isn't much difference between what I had in > mind and the relevant bits in 0004, so probably it's the way to go. Done.

Re: Remaining dependency on setlocale()

2025-07-10 Thread Thomas Munro
On Fri, Jul 11, 2025 at 6:22 AM Jeff Davis wrote: > I don't have a great windows development environment, and it appears CI > and the buildfarm don't offer great coverage either. Can I ask for a > volunteer to do the windows side of this work? Me neither but I'm willing to help with that, and hav

Re: Remaining dependency on setlocale()

2025-07-10 Thread Thomas Munro
On Fri, Jul 11, 2025 at 6:33 AM Jeff Davis wrote: > On Thu, 2025-07-10 at 12:01 +1200, Thomas Munro wrote: > > I tried to make a portable PG_C_LOCALE mechanism like that, but it > > was > > reverted for reasons needing more investigation... see > > 8e993bff5326b

Re: [PING] fallocate() causes btrfs to never compress postgresql files

2025-07-10 Thread Thomas Munro
On Fri, Jul 11, 2025 at 5:39 AM Dimitrios Apostolou wrote: > > I applied the patch on PostgreSQL v17 and am testing it now. I chose > > ftruncate method and I see ftruncate in action using strace while doing > > pg_restore of a big database. Nothing unexpected has happened so far. I also > > verif

Re: Replace remaining getpwuid() calls with getpwuid_r()?

2025-07-09 Thread Thomas Munro
On Thu, Jul 10, 2025 at 5:10 PM Steve Lau wrote: > Hi hackers, when reading the source code, I noticed that Postgres is still > using getpwuid(), which is not thread-safe since it returns a pointer to the > static memory that can be overwritten by concurrent calls. Then I searched > "getpwuid"
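
The difference the quoted message points at is that `getpwuid_r()` writes into caller-supplied storage instead of a shared static buffer. A minimal sketch of the reentrant call follows; the helper name `demo_lookup_user_name` and the fixed 4096-byte scratch buffer are assumptions for illustration (a more careful version would retry with a larger buffer on `ERANGE`).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy the login name for 'uid' into 'name'.
 * Returns 0 on success, or an errno-style code on failure.
 */
static int
demo_lookup_user_name(uid_t uid, char *name, size_t name_len)
{
    struct passwd pwd;
    struct passwd *result = NULL;
    char buf[4096];             /* caller-owned scratch space, assumed large enough */
    int rc;

    assert(name != NULL && name_len > 0);

    rc = getpwuid_r(uid, &pwd, buf, sizeof(buf), &result);
    if (rc != 0)
        return rc;              /* lookup error, e.g. ERANGE if buf is too small */
    if (result == NULL)
        return ENOENT;          /* no matching passwd entry */
    if (strlen(pwd.pw_name) >= name_len)
        return ERANGE;          /* caller's buffer is too small for the name */

    strcpy(name, pwd.pw_name);
    return 0;
}
```

Because every pointer in the returned `struct passwd` refers into `buf`, two threads calling this helper at the same time cannot overwrite each other's results.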

Re: Windows question: when is LC_MESSAGES defined?

2025-07-09 Thread Thomas Munro
On Thu, Jul 10, 2025 at 3:45 PM Thomas Munro wrote: > [03:28:24.318] Program msgfmt found: NO Correction, that ^ is the reason it's not reached on the MSVC task, while -Dnls=disabled is the reason for the MinGW task. But the conclusion is the same: MinGW is the easiest way to test this

Re: Windows question: when is LC_MESSAGES defined?

2025-07-09 Thread Thomas Munro
On Thu, Jul 10, 2025 at 5:32 AM Jeff Davis wrote: > I was trying to exercise the function IsoLocaleName(), which is > surrounded by: > >#if defined(WIN32) && defined(LC_MESSAGES) > > but, at least in CI, that combination never seems to be true, which > surprised me. What platforms exercise thi

Re: Remaining dependency on setlocale()

2025-07-09 Thread Thomas Munro
On Tue, Jul 8, 2025 at 1:14 PM Jeff Davis wrote: > v4-0008 uses LC_C_LOCALE, and I'm not sure if that's portable, but if > the buildfarm complains then I'll fix it or revert it. (Catching up with this thread...) LC_C_LOCALE is definitely not portable: I've only seen it on macOS and NetBSD. It w

Re: Remaining dependency on setlocale()

2025-07-09 Thread Thomas Munro
On Thu, Jul 10, 2025 at 10:52 AM Jeff Davis wrote: > The first problem -- how to affect the encoding of strings returned by > strerror() on windows -- may be solvable as well. It looks like > LC_MESSAGES is not supported at all on windows, so the only thing to be > concerned about is the encoding,

Re: IO worker crash in test_aio/002_io_workers

2025-07-08 Thread Thomas Munro
On Wed, Jul 9, 2025 at 8:45 AM Andres Freund wrote: > /* Got one. Clear idle flag. */ > io_worker_control->idle_worker_mask &= ~(UINT64_C(1) > << MyIoWorkerId); > > /* See if we can wake up some peers. */ >

Re: Non-reproducible AIO failure

2025-06-18 Thread Thomas Munro
On Thu, Jun 19, 2025 at 4:08 AM Andres Freund wrote: > Konstantin, Alexander, can you share what commit you're testing and what > precise changes have been applied to the source? I've now tested this on a > significant number of apple machines for many many days without being able to > reproduce

Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward

2025-06-13 Thread Thomas Munro
On Wed, Jun 11, 2025 at 9:48 AM Nathan Bossart wrote: > So, fseeko() starts winning around 4096 bytes. On macOS, the differences > aren't quite as dramatic, but 4096 bytes is the break-even point there, > too. I imagine there's a buffer around that size somewhere... BTW you can call setvbuf(f,

Re: Cleaning up historical portability baggage

2025-06-10 Thread Thomas Munro
On Tue, Jun 10, 2025 at 10:59 PM Michael Banck wrote: > I don't have an opinion here, I think it would be ok to just define it > to 16 if it is undefined and if the Hurd people want something better at > some point, they should submit patches. Cool. I will go ahead and do that, as you proposed,

Re: Cleaning up historical portability baggage

2025-06-09 Thread Thomas Munro
On Tue, Jun 10, 2025 at 2:25 AM Andres Freund wrote: > On 2025-06-09 15:25:22 +0200, Michael Banck wrote: > > On Thu, Aug 11, 2022 at 10:02:29PM +1200, Thomas Munro wrote: > > > Remove configure probe for sys/uio.h. > > > > Removing the configure probe is fi

Re: [PING] fallocate() causes btrfs to never compress postgresql files

2025-06-09 Thread Thomas Munro
905fd90732 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Sat, 31 May 2025 22:50:22 +1200 Subject: [PATCH] Add file_extend_method setting. BTRFS's compression is reported to be disabled by posix_fallocate(), so offer a way to turn it off by setting it to either write or ftruncate instead. M

Re: Non-reproducible AIO failure

2025-06-08 Thread Thomas Munro
On Sat, Jun 7, 2025 at 6:47 AM Andres Freund wrote: > On 2025-06-06 14:03:12 +0300, Konstantin Knizhnik wrote: > > There is really essential difference in code generated by clang 15 (working) > > and 16 (not working). > > There also are code gen differences between upstream clang 17 and apple's >

Re: Update Windows CI Task Names: Server 2022 + VS 2022 Upgrade

2025-06-05 Thread Thomas Munro
On Thu, Jun 5, 2025 at 8:48 PM Nazir Bilal Yavuz wrote: > > Hmm, for the purposes of [0], I think it might be better to keep the > > image at VS 2019 for now. Unless there are specific reasons why VS 2022 > > would be of use now? > > Thomas was thinking of trying some new APIs which are available

Re: Custom Glibc collation version strings under LOCPATH

2025-06-04 Thread Thomas Munro
On Thu, Jun 5, 2025 at 3:44 AM Joe Conway wrote: > On 6/4/25 09:52, Joe Conway wrote: > > On 6/4/25 00:03, Thomas Munro wrote: > >> I'm interested in hearing about other concrete > >> examples of the locale-recompilation technique failing to be perfect, > >

Re: Non-reproducible AIO failure

2025-06-04 Thread Thomas Munro
On Thu, Jun 5, 2025 at 8:02 AM Andres Freund wrote: > On 2025-06-03 08:00:01 +0300, Alexander Lakhin wrote: > > 2025-06-03 00:19:09.282 EDT [25175:1] LOG: !!!pgaio_io_before_start| ioh: > > 0x104c3e1a0, ioh->op: 1, ioh->state: 1, ioh->result: 0, ioh->num_callbacks: > > 2, ioh->generation: 21694 >

Re: Custom Glibc collation version strings under LOCPATH

2025-06-04 Thread Thomas Munro
:00:00 2001 From: Thomas Munro Date: Wed, 4 Jun 2025 12:19:53 +1200 Subject: [PATCH v2] Load optional collation version from glibc LOCPATH. One technique for dealing with glibc locale definition changes across Linux distribution upgrades or migrations is to compile the locale definitions from

Custom Glibc collation version strings under LOCPATH

2025-06-03 Thread Thomas Munro
y. [1] https://www.mail-archive.com/austin-group-l@opengroup.org/msg12849.html From ab504665cc51814bbe0d8757d35e331fd9b6a41a Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Wed, 4 Jun 2025 12:19:53 +1200 Subject: [PATCH] Load optional collation version from glibc LOCPATH. One technique for dealing

Re: C11 / VS 2019

2025-06-03 Thread Thomas Munro
On Wed, Jun 4, 2025 at 2:02 AM Tom Lane wrote: > Yura Sokolov writes: > > Will it mean we can implement atomics in term of C11 atomics? > > Any such change would have to be supported by extensive performance > testing to verify that there's not a regression on any supported > platform. Yeah, it'

Re: [PING] fallocate() causes btrfs to never compress postgresql files

2025-06-02 Thread Thomas Munro
On Mon, Jun 2, 2025 at 10:14 PM Dimitrios Apostolou wrote: > On Sun, 1 Jun 2025, Thomas Munro wrote: > > Or for a completely different approach: I wonder if ftruncate() would > > be more efficient on COW systems anyway. The minimum thing we need is > > for the file syste

Re: [PING] fallocate() causes btrfs to never compress postgresql files

2025-05-31 Thread Thomas Munro
Or for a completely different approach: I wonder if ftruncate() would be more efficient on COW systems anyway. The minimum thing we need is for the file system to remember the new size, 'cause, erm, we don't. All the rest is probably a waste of cycles, since they reserve real space (or fail to) la
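
The idea above, extending a file only so that the file system records the new size, can be sketched with plain POSIX calls. This is an illustration under assumptions, not the proposed patch: the `demo_` helper names and the scratch-file path template are hypothetical, and whether `ftruncate()` reserves any space depends on the file system.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open an unlinked scratch file under /tmp. */
static int
demo_open_scratch_file(void)
{
    char path[] = "/tmp/demo_extend_XXXXXX";
    int fd = mkstemp(path);

    if (fd >= 0)
        unlink(path);           /* file disappears when fd is closed */
    return fd;
}

/* Return the current size of the file, or -1 on error. */
static off_t
demo_file_size(int fd)
{
    struct stat st;

    return fstat(fd, &st) == 0 ? st.st_size : (off_t) -1;
}

/*
 * Hypothetical sketch: grow the file to new_size with ftruncate() only.
 * Never shrinks; returns 0 on success and -1 on error.
 */
static int
demo_extend_with_ftruncate(int fd, off_t new_size)
{
    off_t old_size = demo_file_size(fd);

    assert(new_size >= 0);
    if (old_size < 0)
        return -1;
    if (new_size <= old_size)
        return 0;               /* already large enough */
    return ftruncate(fd, new_size);
}
```

After a successful call the file reports the new size, and reads of the extended range return zeros, which is the "remember the new size" property the message asks for.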

Re: [PING] fallocate() causes btrfs to never compress postgresql files

2025-05-31 Thread Thomas Munro
least allows experimentation. From 8607189eb19302c509eed78a7a2db55b9a2d70b3 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Sat, 31 May 2025 22:50:22 +1200 Subject: [PATCH 1/2] Add io_min_fallocate setting. BTRFS's compression is reported to be disabled by posix_fallocate(), so offer a way to turn it off. The previous c

Re: Non-reproducible AIO failure

2025-05-27 Thread Thomas Munro
On Mon, May 26, 2025 at 12:05 PM Tom Lane wrote: > Thomas Munro writes: > > Could you guys please share your exact repro steps? > > I've just been running 027_stream_regress.pl over and over. > It's not a recommendable answer though because the failure > pro

Re: Automatically sizing the IO worker pool

2025-05-26 Thread Thomas Munro
BTW I would like to push 0001 and 0002 to master/18. They are not behaviour changes, they just fix up a bunch of inconsistent (0001) and misleading (0002) variable naming and comments to reflect reality (in AIO v1 the postmaster used to assign those I/O worker numbers, now the postmaster has i

Re: Automatically sizing the IO worker pool

2025-05-26 Thread Thomas Munro
On Sun, May 25, 2025 at 7:20 AM Dmitry Dolgov <9erthali...@gmail.com> wrote: > > On Sun, Apr 13, 2025 at 04:59:54AM GMT, Thomas Munro wrote: > > It's hard to know how to set io_workers=3. If it's too small, > > io_method=worker's small submission queue over

Re: Non-reproducible AIO failure

2025-05-25 Thread Thomas Munro
On Sun, May 25, 2025 at 3:22 PM Tom Lane wrote: > Thomas Munro writes: > > Can you get a core and print *ioh in the debugger? > > So far, I've failed to get anything useful out of core files > from this failure. The trace goes back no further than > > (lldb) bt

Re: Non-reproducible AIO failure

2025-05-24 Thread Thomas Munro
On Sun, May 25, 2025 at 9:00 AM Alexander Lakhin wrote: > Hello Thomas, > 24.05.2025 14:42, Thomas Munro wrote: > > On Sat, May 24, 2025 at 3:17 PM Tom Lane wrote: > >> So it seems that "very low-probability issue in our Mac AIO code" is > >> the most p

Re: I/O worker and ConfigReload

2025-05-24 Thread Thomas Munro
On Sun, May 25, 2025 at 1:56 AM Dmitry Dolgov <9erthali...@gmail.com> wrote: > I've been rebasing the patch for online resizing of shared memory, and > noticed something strange about IoWorkerMain: although it sets the > handler SignalHandlerForConfigReload, it doesn't look like it acts upon > Conf

Re: Non-reproducible AIO failure

2025-05-24 Thread Thomas Munro
On Sat, May 24, 2025 at 3:17 PM Tom Lane wrote: > So it seems that "very low-probability issue in our Mac AIO code" is > the most probable description. There isn't any macOS-specific AIO code so my first guess would be that it might be due to aarch64 weak memory reordering (though Andres speculat
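
For readers unfamiliar with the weak-memory-ordering concern, the sketch below is a generic illustration and not the AIO code under discussion. Slot, slot_publish() and slot_try_consume() are hypothetical names. It shows the usual C11 release/acquire pairing. Without it, a weakly ordered CPU such as aarch64 may let a reader see the flag set before the payload store becomes visible.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared slot, not a PostgreSQL structure. */
typedef struct
{
    int         payload;        /* plain data written before publication */
    atomic_bool ready;          /* publication flag */
} Slot;

/* Writer: fill in the payload, then publish it with release semantics. */
static void
slot_publish(Slot *slot, int value)
{
    slot->payload = value;
    atomic_store_explicit(&slot->ready, true, memory_order_release);
}

/*
 * Reader: the acquire load pairs with the release store above, so once
 * the flag is seen as true the payload written before it is visible too.
 * Returns false if nothing has been published yet.
 */
static bool
slot_try_consume(Slot *slot, int *value)
{
    if (!atomic_load_explicit(&slot->ready, memory_order_acquire))
        return false;
    *value = slot->payload;
    return true;
}
```

If both operations used memory_order_relaxed instead, the code would still compile and would usually pass on x86, which is one reason bugs of this class tend to show up only on weakly ordered hardware.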

Re: Improve hash join's handling of tuples with null join keys

2025-05-11 Thread Thomas Munro
On Tue, May 6, 2025 at 12:12 PM Tomas Vondra wrote: > On 5/6/25 01:11, Tom Lane wrote: > > The attached patch is a response to the discussion at [1], where > > it emerged that lots of rows with null join keys can send a hash > > join into too-many-batches hell, if they are on the outer side > > of

Re: disabled SSL log_like tests

2025-05-06 Thread Thomas Munro
On Wed, May 7, 2025 at 4:34 PM Tom Lane wrote: > Thanks, I'll look into reporting it tomorrow. In the meantime, > I couldn't help noticing that the backtraces went through > lib/libssl/tls13_legacy.c, which doesn't give a warm feeling > about how supported they think our usage is (and perhaps als

Re: disabled SSL log_like tests

2025-05-06 Thread Thomas Munro
On Wed, May 7, 2025 at 1:18 PM Tom Lane wrote: > So it seems like this might be a simple oversight in > ssl_sigalg_pkey_ok(), which doesn't make any such correction: > > if (sigalg->key_type != EVP_PKEY_id(pkey)) > return 0; Nice detective work. > Anyone know anything abo

Re: disabled SSL log_like tests

2025-05-05 Thread Thomas Munro
If you run the not-yet-enabled-by-default OpenBSD CI task on master, ssl/001_ssltests fails in "intermediate client certificate is untrusted", recently uncommented by commit e0f373ee. I think it might be telling us that LibreSSL's x509_store_ctx_get_current_cert() is giving us the client certifica

Re: Adding pg_dump flag for parallel export to pipes

2025-04-25 Thread Thomas Munro
On Tue, Apr 8, 2025 at 7:48 AM Hannu Krosing wrote: > Just to bring this out separately : Does anybody have any idea why pipe > commands close inside tests ? > > Re: 003-pg_dump_basic_tests has a few basic validation tests for > correct flag combinations. We need to write more automated tests in

Re: Allow io_combine_limit up to 1MB

2025-04-25 Thread Thomas Munro
On Sat, Apr 26, 2025 at 5:43 AM Tom Lane wrote: > Andres Freund writes: > > It's kinda sad to not have any test that tests a larger > > io_combine_limit/io_max_combine_limit - as evidenced by this bug that'd be > > good. However, not all platforms have PG_IOV_MAX > 16, so it seems like it'd > > b

Re: AIX support

2025-04-24 Thread Thomas Munro
On Mon, Apr 7, 2025 at 10:04 PM Heikki Linnakangas wrote: > I'm surprised how big the difference is, because I actually expected the > compiler to detect the memory-zeroing loop and replace it with some > fancy vector instructions (does powerpc have any?). It certainly does, and we've played with
