Hi,
On 2025-09-03 15:33:30 -0400, Peter Geoghegan wrote:
> On Wed, Sep 3, 2025 at 2:47 PM Andres Freund wrote:
> > I still don't think I fully understand why the impact of this is so large.
> > The branch misses appear to be the only thing differentiating t
e.g. network attached storage), one might need to increase
io_max_concurrency to actually be able to saturate storage. An ~8% increase
in size isn't nothing when the baseline isn't small.
Greetings,
Andres Freund
Hi,
I spent a fair bit more time analyzing this issue.
On 2025-08-28 21:10:48 -0400, Andres Freund wrote:
> On 2025-08-28 19:57:17 -0400, Peter Geoghegan wrote:
> > On Thu, Aug 28, 2025 at 7:52 PM Tomas Vondra wrote:
> > I'm not sure that Thomas'/your patch to amel
ent, it's just a bit of
lock contention in an extreme workload...
Greetings,
Andres Freund
e doing > 30GB/s of repeated reads from the page cache
is a particularly useful thing to optimize. I see a lot of unrelated
contention, e.g. on the BufferMappingLock - unsurprising, it's a really
extreme workload...
If I instead just increase s_b, I get 2x the throughput...
Greetings,
Andres Freund
erState *record)
> Relation    reln = CreateFakeRelcacheEntry(rlocator);
>
> visibilitymap_pin(reln, blkno, &vmbuffer);
> - old_vmbits = visibilitymap_set_vmbyte(reln, blkno, vmbuffer, vmflags);
> + old_vmbits = visibilitymap_set(reln, blkno, vmbuffer, vmflags);
> /* Only set VM page LSN if we modified the page */
> if (old_vmbits != vmflags)
> PageSetLSN(BufferGetPage(vmbuffer), lsn);
> @@ -279,143 +275,6 @@ heap_xlog_prune_freeze(XLogReaderState *record)
> UnlockReleaseBuffer(vmbuffer);
> }
Why are we manually pinning the vm buffer here? Shouldn't the xlog machinery
have done so, as you noticed in one of the earlier patches?
Greetings,
Andres Freund
Hi,
On 2025-09-02 14:39:44 -0300, Ranier Vilela wrote:
> In (src/pl/plperl/plperl.c), if *PERL_SYS_INIT3* is defined and
> *MYMALLOC* is not, it is possible for the variable
> *perl_sys_init_done* to be used uninitialized.
Static variables are zero initialized by definition, no?
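A minimal standalone illustration (the variable name is borrowed from the report):

    #include <assert.h>

    /* objects with static storage duration are zero-initialized per the C standard */
    static int perl_sys_init_done;      /* no initializer, still guaranteed to be 0 */

    int
    main(void)
    {
        assert(perl_sys_init_done == 0);
        return 0;
    }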
Greetings,
Andres Freund
pensive the more detail you add to stats.
EXPLAIN ANALYZE spends a large chunk of its time diffing buffer
access stats, for example. We need to work towards doing less of that stuff,
not more.
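For context, the pattern in question is roughly this - a sketch using the
existing BufferUsage / BufferUsageAccumDiff machinery, with run_node() as a
placeholder:

    #include "executor/instrument.h"

    static BufferUsage
    measure_node(void (*run_node) (void))
    {
        BufferUsage before = pgBufferUsage; /* full struct copy as a snapshot */
        BufferUsage delta = {0};

        run_node();
        /* walks every field of BufferUsage - this is the diffing that gets
         * more expensive with every counter that's added */
        BufferUsageAccumDiff(&delta, &pgBufferUsage, &before);
        return delta;
    }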
Greetings,
Andres Freund
using the flag
and callback, as otherwise it's too easy to change the callsites to a
different callback, without removing the flag.
Greetings,
Andres Freund
Hi,
On 2025-08-29 15:23:48 -0700, Jeff Davis wrote:
> On Fri, 2025-08-29 at 12:32 -0400, Andres Freund wrote:
> > I don't really see an advantage of sync in those cases either.
>
> It seems a bit early to say that it's just there for debugging. But
> it's just
not use
> the global source root, which allows postgres to be built as a meson
> subproject.
That makes sense. However, I can't apply it just now; the PG 18 code is frozen
until the middle of next week due to the release of 18rc1, and I think this
should be backpatched.
Greetings,
Andres Freund
use mentioning
specific technologies tends to go out of date faster than more general phrasing
like "storage with high throughput" - I don't think we'll go back to SATA
devices with ~600MB/s of hard bus-limited throughput...
Greetings,
Andres Freund
count infrastructure in bufmgr.c gets a bit slower once
more buffers are pinned
2) signalling overhead to the worker - I think we are resetting the latch too
eagerly, leading to unnecessarily many signals being sent to the IO worker
(see the loop sketch below)
3) same issue with the resowner tracking
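To make the latch point in (2) concrete, a sketch of the worker loop shape
that keeps signals rare - have_queued_io() and process_one_io() are
placeholders, not existing functions:

    #include "postgres.h"
    #include "miscadmin.h"
    #include "storage/latch.h"

    static void
    io_worker_loop(void)
    {
        for (;;)
        {
            /* keep the latch set while busy: SetLatch() skips the signal when
             * the latch is already set, so submitters stay cheap */
            while (have_queued_io())
                process_one_io();

            ResetLatch(MyLatch);

            /* re-check after the reset, so a concurrent submission isn't lost */
            if (!have_queued_io())
                (void) WaitLatch(MyLatch, WL_LATCH_SET | WL_EXIT_ON_PM_DEATH,
                                 -1L, 0 /* wait_event_info */);
        }
    }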
But there's some additional difference in performance I don't yet
understand...
Greetings,
Andres Freund
> tried with 3 and 12 workers, and there's virtually no difference between
> those. IIRC when watching "top", I've never seen more than 1 or maybe 2
> workers active (using CPU).
That doesn't say much - if they are doing IO, they're not on the CPU...
Greetings,
Andres Freund
Hi,
On 2025-08-28 19:08:40 +0200, Tomas Vondra wrote:
> On 8/28/25 18:16, Andres Freund wrote:
> >> So I think the IPC overhead with "worker" can be quite significant,
> >> especially for cases with distance=1. I don't think it's a major issue
> >
s end can't have a higher distance?
Obviously you can construct cases with a low distance by having indexes point
to a lot of tiny tuples pointing to perfectly correlated pages, but in that
case IO can't be a significant factor.
Greetings,
Andres Freund
Hi,
On 2025-08-26 17:06:11 +0200, Tomas Vondra wrote:
> On 8/26/25 01:48, Andres Freund wrote:
> > Hi,
> >
> > On 2025-08-25 15:00:39 +0200, Tomas Vondra wrote:
> >> Thanks. Based on the testing so far, the patch seems to be a substantial
> >> improvemen
On 2025-08-27 19:08:20 -0400, Andres Freund wrote:
> I'll push the patch to remove the bitfields after adjusting the commit message
> somewhat.
And done.
Hi,
On 2025-08-26 16:59:54 +0300, Konstantin Knizhnik wrote:
> On 26/08/2025 3:37 AM, Andres Freund wrote:
> > Hi,
> >
> > I'm a bit confused by this focus on bitfields - both Alexander and
> > Konstantin
> > stated they could reproduce the issue without th
're evaluating
scan keys with the buffer lock held - with basically arbitrary expressions
being evaluated. That's an easy path to undetected deadlocks. You'd have to
redesign the relevant mechanism to filter outside of the lock...
Greetings,
Andres Freund
Hi,
On 2025-08-27 12:14:41 -0700, Noah Misch wrote:
> On Wed, Aug 27, 2025 at 12:18:27PM -0400, Andres Freund wrote:
> > One way to do that would be to maintain a back-pointer from the BufferDesc
> > to
> > the BufferLookupEnt, since the latter *already* contains the Buffe
Hi,
On 2025-08-26 17:14:49 -0700, Noah Misch wrote:
> On Fri, Aug 22, 2025 at 03:44:48PM -0400, Andres Freund wrote:
> > == Problem 2 - AIO writes vs exclusive locks ==
> >
> > Separate from the hint bit issue, there is a second issue that I didn't
> > ha
Hi,
On 2025-08-26 16:21:36 -0400, Robert Haas wrote:
> On Fri, Aug 22, 2025 at 3:45 PM Andres Freund wrote:
> > My conclusion from the above is that we ought to:
> >
> > A) Make Buffer Locks something separate from lwlocks
> > B) Merge BufferDesc.state and the con
retending to be the real server. The settings on the real
server don't take effect in that case.
Greetings,
Andres Freund
Hi,
On 2025-08-26 15:21:34 +1200, Thomas Munro wrote:
> On Tue, Aug 26, 2025 at 12:45 PM Andres Freund wrote:
> > On 2025-08-25 10:43:21 +1200, Thomas Munro wrote:
> > > On Mon, Aug 25, 2025 at 6:11 AM Konstantin Knizhnik
> > > wrote:
> > > > In theory
e mentioned ([1]) and that I have in mind to look at.
I mean that we increment the counters less frequently.
pgstat_count_heap_getnext() is called for every tuple on a page, which is
obviously much more frequent than once per page like for IO.
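As a sketch of what "less frequently" could look like, in plain C -
report_getnext_count() stands in for a batched variant of
pgstat_count_heap_getnext() and is an assumption, not an existing API:

    #include <stdint.h>

    typedef struct ScanStats { uint64_t tuples_returned; } ScanStats;

    /* batched stand-in for the per-tuple counting call */
    static void
    report_getnext_count(ScanStats *stats, uint64_t n)
    {
        stats->tuples_returned += n;
    }

    static void
    scan_one_page(ScanStats *stats, int ntuples_on_page)
    {
        uint64_t    local = 0;

        for (int i = 0; i < ntuples_on_page; i++)
            local++;            /* the per-tuple cost stays a local increment */

        report_getnext_count(stats, local); /* stats updated once per page */
    }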
Greetings,
Andres Freund
cacheline). The reason that that is somewhat
OK from a coherency perspective is that this is done only for pure writes, not
read-modify-write operations. As the write overwrites the prior contents of
the memory, it is "ok" to do the write without waiting for cacheline ownership
ahead of time.
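For illustration, x86 non-temporal stores are one explicit form of such pure
writes - they go through write-combining buffers instead of first acquiring
ownership of the destination cacheline. A sketch, not code from any patch:

    #include <emmintrin.h>      /* SSE2 intrinsics */

    /* copy 64 bytes with streaming stores; dst must be 16-byte aligned and is
     * fully overwritten, so no read-for-ownership of its cachelines is needed */
    static void
    copy64_streaming(void *dst, const void *src)
    {
        const __m128i *s = (const __m128i *) src;
        __m128i    *d = (__m128i *) dst;

        _mm_stream_si128(&d[0], _mm_loadu_si128(&s[0]));
        _mm_stream_si128(&d[1], _mm_loadu_si128(&s[1]));
        _mm_stream_si128(&d[2], _mm_loadu_si128(&s[2]));
        _mm_stream_si128(&d[3], _mm_loadu_si128(&s[3]));
        _mm_sfence();           /* order the streamed stores before later ones */
    }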
Greetings,
Andres Freund
at one must never have adjacent distinct (for some value of
> "distinct") bitfields for anything that requires atomics.
I think the barriers in place should prevent that.
Greetings,
Andres Freund
From 3839e3a0a0e6717dcf6ee3d0547b3268c4a3fa3a Mon Sep 17 00:00:00 2001
From: Andre
was done wrongly for the per-backend IO stats too. I've
seen the increased overhead in profiles - and IO related counters aren't
incremented remotely as often as the scan related counters are.
Greetings,
Andres Freund
inherent io_method=worker overhead?
I think what you're observing might be the inherent IPC / latency
overhead of the worker based approach. This is particularly pronounced if the
workers are idle (and the CPU they get scheduled on is clocked down). The
latency impact of that is small, but if you never actually get to do much
readahead it can be visible.
Greetings,
Andres Freund
Hi,
On 2025-08-24 09:08:16 -0700, Noah Misch wrote:
> On Sun, Aug 24, 2025 at 11:50:01AM -0400, Tom Lane wrote:
> > Andres Freund writes:
> > > I wonder if it's worth adding support to CI to perform the cross-version
> > > upgrade test. It'd be pretty
Hi,
On 2025-08-23 12:11:51 -0400, Tom Lane wrote:
> Andres Freund writes:
> > FWIW, I find the autoconf/make test run experience completely unusable. It
> > literally is what made me embark on getting away from it. I don't understand how
> > people stand it.
>
> Inter
that bet.)
I wonder if it's worth adding support to CI to perform the cross-version
upgrade test. It'd be pretty easy to install all pgdg apt postgres packages into
the debian image, which then could be used as the source version...
Greetings,
Andres Freund
://postgr.es/m/20240610181212.auytluwmbfl7lb5n%40awork3.anarazel.de
I don't know what the right solution is, but it's really not good that
something as rarely used as gss encryption causes crashes and performance
issues for everyone.
Greetings,
Andres Freund
Hi,
On 2025-08-23 10:17:59 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2025-08-23 11:57:37 +0500, Andrey Borodin wrote:
> >> What is the downside of the approach where meson uses t/*.pl wildcard?
>
> > In meson you can't do wildcards at "co
is needed. You could do it by
running prove to run the tap tests, like make does, but that would
considerably slow down the tests, as prove has either no parallelism or
parallelism independent of make's (or ninja's).
Greetings,
Andres Freund
Hi,
On 2025-08-18 11:38:02 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2025-08-18 08:57:13 +0900, Michael Paquier wrote:
> >> The following command fails, because btree_gist is not installed in
> >> the context of the isolation tests:
> >> make -C s
lock and require all buffer
modifications to at least hold share exclusive lock
6) Wait for AIO when acquiring an exclusive content lock
(some of these will likely have parts of their own, but that's details)
Sane?
DOES ANYBODY HAVE A BETTER NAME THAN SHARE-EXCLUSIVE???!?
Greetings,
Hi,
On 2025-08-21 20:14:14 +0200, Antonin Houska wrote:
> ok, installing a copy of the same executable with a different name seems more
> reliable. At least that's how the postmaster->postgres link used to be
> handled, if I read Makefile correctly. Thanks.
I have not followed this thread, but I
2] ../src/bin/scripts/vacuumdb.c(197): warning C4034: sizeof
> > returns 0
> >
> > The real problem here seems to be the empty long_options_repack array.
> > I removed it and started a new run to see what happens. Running now:
> > https://cirrus-ci.com/build/4961902171783168
>
> The symlink issue occurred at "Windows - Server 2019, MinGW64 - Meson", where
> the code compiled well. The compilation failure mentioned above comes from
> "Windows - Server 2019, VS 2019 - Meson & ninja". I think it's still possible
> that the symlink issue will occur there once the compilation is fixed.
FWIW, I don't think it's particularly wise to rely on symlinks on windows -
IIRC they will often not be enabled outside of development environments.
Greetings,
Andres Freund
L)
3) With some ways of doing AIO the IO is offloaded to other processes, and
thus waiting for the IO to complete always requires waiting for another
process
How could we avoid the need to wait for another process in critical sections
given these points?
Greetings,
Andres Freund
Hi,
On 2025-08-19 16:34:56 -0500, Nathan Bossart wrote:
> From 68b81e3bf70d5da0a0e2d0a0087218df7fde1101 Mon Sep 17 00:00:00 2001
> From: Nathan Bossart
> Date: Tue, 19 Aug 2025 16:27:33 -0500
> Subject: [PATCH v1 1/1] Fix comment for MAX_SIMUL_LWLOCKS.
>
> This comment mentions that pg_buffercac
Hi,
On 2025-08-19 13:31:35 -0500, Nathan Bossart wrote:
> On Tue, Aug 19, 2025 at 02:06:50PM -0400, Andres Freund wrote:
> > Possibly stupid question - is it really worth having a dynamic structure
> > here?
> > The number of tranches is strictly bound, it seems like it
ld be pretty easy
> to switch to something like a "dslist" in the future.
Possibly stupid question - is it really worth having a dynamic structure here?
The number of tranches is strictly bounded; it seems like it'd be simpler to
have an array of tranche names in shared memory.
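A minimal sketch of that alternative, assuming a compile-time bound (the
constants are made up):

    #define MAX_TRANCHE_NAMES   256
    #define TRANCHE_NAME_MAXLEN 64

    /* lives in shared memory, sized once at startup; no dynamic structure needed */
    typedef struct TrancheNameArray
    {
        char        names[MAX_TRANCHE_NAMES][TRANCHE_NAME_MAXLEN];
    } TrancheNameArray;

    /* lookup is then just: array->names[tranche_id] */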
Greetings,
Andres Freund
Hi,
On 2025-08-20 03:19:38 +1200, Thomas Munro wrote:
> On Wed, Aug 20, 2025 at 2:57 AM Andres Freund wrote:
> > On 2025-08-20 02:54:09 +1200, Thomas Munro wrote:
> > > > On linux - the primary OS with OOM killer troubles - I'm pretty sure'll
> > > >
trying to stay up longer just makes everything more fragile. Waiting for the
logger is *exactly* what we should *not* do - what if the logger also crashed?
There's no postmaster around to start it.
Greetings,
Andres Freund
e adaptive spinning to lwlocks -
which is also what we need to make it more feasible to replace some of the
remaining spinlocks...
Greetings,
Andres Freund
It's so slow that it has measurable impact for single-threaded read-only
pgbench. Which does friggin btree lookups.
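For reference, something like the following reproduces that kind of workload
(exact flags are my assumption; -S is pgbench's built-in select-only script):

    pgbench -n -S -c 1 -T 30 testdb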
Greetings,
Andres Freund
page's LSN using the
> returned XLOG location. For instance,
>
> XLogBeginInsert();
> XLogRegisterBuffer(...)
> XLogRegisterData(...)
> recptr = XLogInsert(rmgr_id, info);
>
> PageSetLSN(dp, recptr);
>
> 6. END_CRIT_SECTION()
>
> 7. Unlock and unpin the buffer(s).
Greetings,
Andres Freund
may not be possible to reach any real deadlock with existing AIO
> users, but that situation could change. There's also no reason the
> waiter shouldn't begin to wait via the IO method as soon as possible
> even without a deadlock.
>
> Picked up by testing a proposed IO method that has ->wait_one(), like
> io_method=io_uring, and code review.
LGTM.
Greetings,
Andres Freund
.
> and there's the matter of the test correctness.
What are you trying to say here?
Greetings,
Andres Freund
r test
earlier, then dependency tests, and then other compiler tests. Doing anything
in the order we do it in autoconf is an anti-argument, because the ordering we
use in autoconf is completely unintelligible.
Greetings,
Andres Freund
Hi,
On 2025-08-15 12:57:52 -0500, Nathan Bossart wrote:
> On Fri, Aug 15, 2025 at 01:39:52PM -0400, Andres Freund wrote:
> > On 2025-08-14 11:29:08 +0200, Álvaro Herrera wrote:
> >> However, changing that spinlock to an lwlock doesn't look easy, because of
> >> th
Hi,
On 2025-08-15 15:42:10 -0400, Peter Geoghegan wrote:
> On Fri, Aug 15, 2025 at 3:38 PM Andres Freund wrote:
> > I see absolutely no effect of the patch with shared_buffers=1GB and a
> > read-only scale 200 pgbench at 40 clients. What data sizes, shared buffers
> >
Hi,
On 2025-08-15 15:31:47 -0400, Peter Geoghegan wrote:
> On Fri, Aug 15, 2025 at 3:28 PM Andres Freund wrote:
> > >I'm not worried about it. Andres' "not waiting for already-in-progress
> > >IO" patch was clearly just a prototype. Just thought it wa
Hi,
On August 15, 2025 3:25:50 PM EDT, Peter Geoghegan wrote:
>On Thu, Aug 14, 2025 at 10:12 PM Peter Geoghegan wrote:
>> As far as I know, we only have the following unambiguous performance
>> regressions (that clearly need to be fixed):
>>
>> 1. This issue.
>>
>> 2. There's about a 3% loss of
ere. With spinlocks we can just reinit the spinlock each time, but
> that doesn't work with lwlocks. We have no easy way to associate then
> disassociate each entry from a specific lwlock.
I'm not following? The lwlock can just be inside the struct, just like the
spinlock is? "Association" is just LWLockInitialize() and deassociation is not
needed.
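A sketch of what that looks like (my_tranche_id is assumed to be registered
elsewhere):

    #include "postgres.h"
    #include "storage/lwlock.h"

    typedef struct HashEntry
    {
        uint32      key;
        LWLock      lock;       /* embedded, just like the slock_t was */
    } HashEntry;

    static void
    entry_init(HashEntry *entry, uint32 key, int my_tranche_id)
    {
        entry->key = key;
        LWLockInitialize(&entry->lock, my_tranche_id);  /* the whole "association" */
    }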
Greetings,
Andres Freund
seeing "I/O Timings" > 0 even if we do perfect readahead.
Most of the cost is in the kernel, primarily looking up block locations and
setting up the actual I/O.
Greetings,
Andres Freund
buffer that we already started IO for.
Greetings,
Andres Freund
Hi,
On 2025-08-14 19:36:49 -0400, Andres Freund wrote:
> On 2025-08-14 17:55:53 -0400, Peter Geoghegan wrote:
> > On Thu, Aug 14, 2025 at 5:06 PM Peter Geoghegan wrote:
> > > > We can optimize that by deferring the StartBufferIO() if we're
> > > >
nk it has quite as large an effect for that as it
has here, because the different scans basically desynchronize whenever it
happens due to the synchronous waits slowing down the waiting backend a lot),
limiting the impact somewhat.
Greetings,
Andres Freund
certainly explains the performance difference...
We can optimize that by deferring the StartBufferIO() if we're encountering a
buffer that is undergoing IO, at the cost of some complexity. I'm not sure
real-world queries will often encounter the pattern of the same block being
read in by a read stream multiple times in close proximity sufficiently often
to make that worth it.
Greetings,
Andres Freund
Hi,
On 2025-08-14 15:15:02 -0400, Peter Geoghegan wrote:
> On Thu, Aug 14, 2025 at 2:53 PM Andres Freund wrote:
> > I think this is just an indicator of being IO bound.
>
> Then why does the exact same pair of runs show "I/O Timings: shared
> read=194.629" for th
Hi,
On 2025-08-14 15:30:16 -0400, Peter Geoghegan wrote:
> On Thu Aug 14, 2025 at 3:15 PM EDT, Peter Geoghegan wrote:
> > On Thu, Aug 14, 2025 at 2:53 PM Andres Freund wrote:
> >> I think this is just an indicator of being IO bound.
> >
> > Then why does the exa
r io_combine_limit, the OS also
can do combining.
I'd see what changes if you temporarily reduce
/sys/block/nvme6n1/queue/max_sectors_kb to a smaller size.
Could you show iostat for both cases?
Greetings,
Andres Freund
--rw read:-8k --buffered 0 2>&1|grep READ
READ: bw=70.6MiB/s (74.0MB/s), 70.6MiB/s-70.6MiB/s (74.0MB/s-74.0MB/s), io=1024MiB (1074MB), run=14513-14513msec
So on this WD Red SN700 there's a rather substantial performance difference.
On a Samsung 970 PRO I don't see much of a difference. Nor on a ADATA
SX8200PNP.
Greetings,
Andres Freund
Hi,
On 2025-08-14 00:23:49 +0200, Tomas Vondra wrote:
> On 8/13/25 23:37, Andres Freund wrote:
> > On 2025-08-13 23:07:07 +0200, Tomas Vondra wrote:
> >> On 8/13/25 16:44, Andres Freund wrote:
> >>> On 2025-08-13 14:15:37 +0200, Tomas Vondra wrote:
> >
Hi,
On 2025-08-13 23:07:07 +0200, Tomas Vondra wrote:
> On 8/13/25 16:44, Andres Freund wrote:
> > On 2025-08-13 14:15:37 +0200, Tomas Vondra wrote:
> >> In fact, I believe this is about io_method. I initially didn't see the
> >> difference you described, and then
Hi,
On 2025-08-13 10:24:07 -0400, Andres Freund wrote:
> > 2- Using '/DEBUG:FULL' instead of '/DEBUG:FASTLINK' in the Windows CI
> > task but this causes more memory to be used. It seems that the error
> > appears only when the '/DEBUG:FASTLINK' i
Hi,
On 2024-10-30 12:45:27 -0400, Andres Freund wrote:
> On 2024-10-30 13:29:01 +0200, Heikki Linnakangas wrote:
> > On 30/10/2024 04:21, Andres Freund wrote:
> > > Attached is a, unfortunately long, series of patches implementing what I
> > > described upthr
't add the attribute to the declaration, just to the
function. The caller doesn't need to know that it's unused; whether the
parameter is unused is purely a property of the specific implementation.
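Concretely, the shape being suggested is (a sketch using pg_attribute_unused();
the names are invented):

    /* header: callers see a normal declaration, no attribute */
    extern void process_item(int item);

    /* definition: only here is the parameter marked unused */
    void
    process_item(int item pg_attribute_unused())
    {
        /* the current implementation happens not to use 'item' */
    }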
Greetings,
Andres Freund
the CI
task when it was using windows containers, as we'd run out of memory
occasionally. But since we aren't using those anymore, I think the best way to
make CI work again is to simply stop using /DEBUG:FASTLINK.
Separately I think we should report this as a bug to meson. Could you perhaps
create a minimal reproducer of the issue and report it?
Greetings,
Andres Freund
we could make some of this into tests somehow. It's pretty easy
to break this kind of thing and not notice, as everything just continues to
work, just a tad slower.
Greetings,
Andres Freund
    if (persistence == RELPERSISTENCE_TEMP)
        pgBufferUsage.local_blks_hit += 1;
    else
        pgBufferUsage.shared_blks_hit += 1;
    ...
Greetings,
Andres Freund
ficient memory level prefetching is.
OS level readahead is visible in some form in iostat - you get bigger reads or
multiple in-flight IOs.
Greetings,
Andres Freund
[iostat output omitted - the columns were garbled in extraction]
Note the different read sizes...
> I did look into pg_aios, but there's only 8kB requests in both cases. I
> didn't have time to look closer yet.
That's what we'd expect, right? There's nothing on master that'd perform read
combining for index scans...
Greetings,
Andres Freund
e failure, or at least the cfbot is showing a red
> column at the moment.
See
https://postgr.es/m/CAN55FZ1RuBhJmPWs3Oi%3D9UoezDfrtO-VaU67db5%2B0_uy19uF%2BA%40mail.gmail.com
Greetings,
Andres Freund
Hi,
On 2025-08-11 16:30:30 -0700, Jacob Champion wrote:
> On Mon, Aug 11, 2025 at 3:52 PM Andres Freund wrote:
> > And the warning is right. Not sure why a new compiler is needed, IIRC this
> > warning is present in other cases with older compilers too.
>
> Probably
>
reate
> PGPROC partitions only for those)? I suppose that requires literally
> walking all the nodes.
I didn't think of numa_node_of_cpu().
As long as numa_node_of_cpu() returns *something* I think it may be good
enough. Nobody uses an RPi for high-throughput postgres workloads with a lot
of memory. Slightly sub-optimal mappings should really not matter.
I'm kinda wondering if we should deal with such fake numa systems by detecting
them and disabling our numa support.
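For reference, the mapping via numa_node_of_cpu() is about this simple
(standalone sketch, link with -lnuma):

    #include <numa.h>
    #include <stdio.h>

    int
    main(void)
    {
        if (numa_available() < 0)
            return 1;           /* no NUMA support on this system */

        for (int cpu = 0; cpu < numa_num_configured_cpus(); cpu++)
            printf("cpu %d -> node %d\n", cpu, numa_node_of_cpu(cpu));
        return 0;
    }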
Greetings,
Andres Freund
[psql table output truncated]
(3 rows)
Greetings,
Andres Freund
iler is needed, IIRC this
warning is present in other cases with older compilers too.
The most obvious fix is to slap on a PG_USED_FOR_ASSERTS_ONLY. However, we so
far don't seem to have used it for function parameters... But I don't see a
problem with starting to do so.
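What starting to do so would look like (a sketch; the function and parameter
names are invented):

    static void
    check_lock_count(int nheld PG_USED_FOR_ASSERTS_ONLY)
    {
        Assert(nheld >= 0);     /* the only use of the parameter */
    }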
Greetings,
Andres Freund
Hi,
On 2025-07-11 11:22:36 +0900, Amit Langote wrote:
> On Fri, Jul 11, 2025 at 5:55 AM Andres Freund wrote:
> > On 2025-07-10 17:28:50 +0900, Amit Langote wrote:
> > > On Thu, Jul 10, 2025 at 8:34 AM Andres Freund wrote:
> > > > The performance gain unsurprisingl
Hi,
On 2025-08-11 14:40:40 +0300, Nazir Bilal Yavuz wrote:
> Thank you for working on this!
Thanks for the review - pushed.
Greetings,
Andres Freund
yc too, if "L3 LLC as
NUMA" is enabled.
> I'm not sure what to do about this (or how getcpu() or libnuma handle this).
I don't immediately see any libnuma functions that would care?
I also am somewhat curious about what getcpu() returns for the current node...
Greetings,
Andres Freund
l is somewhat expensive.
Greetings,
Andres Freund
From d845c0d56a0357730a7ec398cd77c6a1ada392fa Mon Sep 17 00:00:00 2001
From: Andres Freund
Date: Fri, 8 Aug 2025 19:49:23 -0400
Subject: [PATCH v2] meson: add and use stamp files for generated headers
Without using stamp files, meson lists the g
"strange" combinations of parameters, looking for
> weird behaviors like that.
I'm just catching up: Isn't it a bit early to focus this much on testing? ISTM
that the patchsets for both approaches currently have some known architectural
issues and that addressing them seems likely to change their performance
characteristics.
Greetings,
Andres Freund
It's possible to do this by globbing for files at configure time, but that
wouldn't detect adding new headers (which would need to trigger a
re-configure). I'm a bit on the fence about whether that's an issue worth
caring about.
Greetings,
Andres Freund
istake to introduce support for granular resets, we
shouldn't bury ourselves deeper. If anything we should rip out everything
other than 1) a global reset and 2) a per-database reset.
Leaving that aside, I just don't see a convincing use case for returning the
timestamp here.
Greetings,
Andres Freund
Hi,
On 2025-08-08 18:28:09 -0400, Andres Freund wrote:
> > From 6574ac9267fe9938f59ed67c8f0282716d8c28f3 Mon Sep 17 00:00:00 2001
> > From: Thomas Munro
> > Date: Sun, 3 Aug 2025 00:15:01 +1200
> > Subject: [PATCH v1 3/4] aio: Support I/O methods without true vectore
_completion_queue() to give up
> + * early since this backend can process its own queue promptly and efficiently.
> + */
> +static void
> +pgaio_posix_aio_ipc_acquire_own_completion_lock(PgAioPosixAioContext *context)
> +{
> + Assert(context == pgaio_my_posix_aio_context);
> + Assert(!LWLockHeldByMe(&context->completion_lock));
> +
> + if (!LWLockConditionalAcquire(&context->completion_lock, LW_EXCLUSIVE))
> + {
> + ProcNumber procno;
> +
> + procno = pg_atomic_exchange_u32(&context->ipc_procno, MyProcNumber);
> + if (procno != INVALID_PROC_NUMBER)
> + SetLatch(&GetPGProcByNumber(procno)->procLatch);
> +
> + LWLockAcquire(&context->completion_lock, LW_EXCLUSIVE);
> + pg_atomic_write_u32(&context->ipc_procno, INVALID_PROC_NUMBER);
> + }
> +}
Is the "command pgaio_posix_aio_ipc_drain_completion_queue() to give up" path
frequent enough to be worth the complexity? I somewhat doubt it.
Greetings,
Andres Freund
README should do the trick, I'll go
> investigate that.
FWIW, you can trigger manual tasks in the cirrus-ci web-interface.
Greetings,
Andres Freund
A large portion of the cases I've seen where toast ID assignments were a
problem were when the global OID counter wrapped around due to activity on
*other* tables (and/or temporary table creation). If you instead had a
per-toast-table sequence for assigning chunk IDs, that problem would largely
vanish.
With 64bit toast IDs we shouldn't need to search the index for a
non-conflicting toast ID; there can't be wraparounds (we'd hit wraparound of
LSNs well before that and that's not practically reachable).
Greetings,
Andres Freund
On 2025-07-28 08:18:01 +0900, Michael Paquier wrote:
> I have used that and applied it down to v18, closing the open item.
Thanks!
LTRUE(entry->key))
> + else if (!LTG_ISALLTRUE(entry->key.value))
This should be DatumGet*(), no?
> diff --git a/contrib/sepgsql/label.c b/contrib/sepgsql/label.c
> index 996ce174454..5d57563ecb7 100644
> --- a/contrib/sepgsql/label.c
> +++ b/contrib/sepgsql/label.c
> @@ -330,7 +330,7 @@ sepgsql_fmgr_hook(FmgrHookEventType event,
> stack = palloc(sizeof(*stack));
> stack->old_label = NULL;
> stack->new_label =
> sepgsql_avc_trusted_proc(flinfo->fn_oid);
> - stack->next_private = 0;
> + stack->next_private.value = 0;
>
> MemoryContextSwitchTo(oldcxt);
Probably should use DummyDatum.
Greetings,
Andres Freund
Hi,
On 2025-08-05 19:20:20 +0200, Peter Eisentraut wrote:
> On 31.07.25 19:17, Tom Lane wrote:
> > Also I see a "// XXX" in pg_get_aios, which I guess is a note
> > to confirm the data type to use for ioh_id?
>
> Yes, the stuff returned from pgaio_io_get_id() should be int, but some code
> uses u
of days before
> getting down to it.
I don't really get the point of designing that mechanism before we
have a use case. If we need it, we can expand it at that time.
Greetings,
Andres Freund
cific changes, so I
guess "... all good" covers it...
Greetings,
Andres Freund
or
VERBOSE and once without. That's not exactly a free lunch...
Greetings,
Andres Freund
Hi,
On 2025-07-18 13:24:32 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2025-07-17 20:09:57 -0400, Tom Lane wrote:
> >> I made it just as a proof-of-concept that this can work. It compiled
> >> cleanly and passed check-world for me on a 32-bit FreeBSD im
te and IndexScanInstrumentation seems to be
predestined for that information. But it seems a bit too much memory to
just keep a BufferUsage around even when analyze isn't used.
Greetings,
Andres Freund
PS: Another thing that I think we ought to track is the number of fetches from
the table