Re: Use streaming read API in ANALYZE

2024-04-07 Thread Thomas Munro
is not obvious that this code matches Knuth's Algorithm S ..." and realised I'm not sure I have time to develop a good opinion about this today. So I'll leave the 0002 change out for now, as it's a tidy-up that can easily be applied in the next cycle. From c3b8df8e4720d8b0dfb4c892c0aa3ddae

Re: Use streaming read API in ANALYZE

2024-04-07 Thread Thomas Munro
On Mon, Apr 8, 2024 at 10:26 AM Melanie Plageman wrote: > On Sun, Apr 07, 2024 at 03:00:00PM -0700, Andres Freund wrote: > > > src/backend/commands/analyze.c | 89 ++ > > > 1 file changed, 26 insertions(+), 63 deletions(-) > > > > That's a very nice demonstration

pgsql: Use streaming I/O in ANALYZE.

2024-04-07 Thread Thomas Munro
Plageman Reviewed-by: Andres Freund Reviewed-by: Jakub Wartak Reviewed-by: Heikki Linnakangas Reviewed-by: Thomas Munro Discussion: https://postgr.es/m/flat/CAN55FZ0UhXqk9v3y-zW_fp4-WCp43V8y0A72xPmLkOM%2B6M%2BmJg%40mail.gmail.com Branch -- master Details --- https://git.postgresql.org/pg

Re: Streaming read-ready sequential scan code

2024-04-07 Thread Thomas Munro
On Sun, Apr 7, 2024 at 1:34 PM Melanie Plageman wrote: > Attached v13 0001 is your fix and 0002 is a new version of the > sequential scan streaming read user. Off-list Andres mentioned that I > really ought to separate the parallel and serial sequential scan users > into two different callbacks.

pgsql: Use streaming I/O in sequential scans.

2024-04-07 Thread Thomas Munro
Use streaming I/O in sequential scans. Instead of calling ReadBuffer() for each block, heap sequential scans and TID range scans now use the streaming API introduced in b5a9b18cd0. Author: Melanie Plageman Reviewed-by: Andres Freund Reviewed-by: Thomas Munro Discussion: https://postgr.es/m

pgsql: Fix if/while thinko in read_stream.c edge case.

2024-04-06 Thread Thomas Munro
Fix if/while thinko in read_stream.c edge case. When we determine that a wanted block can't be combined with the current pending read, it's time to start that read to get it out of the way. An "if" in that code path should have been a "while", because it might take more than one go in case of

Re: Streaming read-ready sequential scan code

2024-04-06 Thread Thomas Munro
fix for that, longer explanation in commit message. From 43cef2d58141ba048e9349b0027afff148be5553 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Sun, 7 Apr 2024 12:36:44 +1200 Subject: [PATCH] Fix bug in read_stream.c. When we determine that a wanted block can't be combined with the current pending read

Re: Extension Enhancement: Buffer Invalidation in pg_buffercache

2024-04-06 Thread Thomas Munro
2001 From: Thomas Munro Date: Sun, 7 Apr 2024 09:13:17 +1200 Subject: [PATCH v6] Add pg_buffercache_invalidate() function for testing. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When testing buffer pool logic, it is useful to be able to evict arbitr

pgsql: Allow BufferAccessStrategy to limit pin count.

2024-04-06 Thread Thomas Munro
Allow BufferAccessStrategy to limit pin count. While pinning extra buffers to look ahead, users of strategies are in danger of using too many buffers. For some strategies, that means "escaping" from the ring, and in others it means forcing dirty data to disk very frequently with associated WAL

pgsql: Increase default vacuum_buffer_usage_limit to 2MB.

2024-04-06 Thread Thomas Munro
Increase default vacuum_buffer_usage_limit to 2MB. The BAS_VACUUM ring size has been 256kB since commit d526575f introduced the mechanism 17 years ago. Commit 1cbbee03 recently made it configurable but retained the traditional default. The correct default size has been debated for years, but

pgsql: Improve read_stream.c's fast path.

2024-04-05 Thread Thomas Munro
Improve read_stream.c's fast path. The "fast path" for well cached scans that don't do any I/O was accidentally coded in a way that could only be triggered by pg_prewarm's usage pattern, which starts out with a higher distance because of the flags it passes in. We want it to work for streaming

Re: LogwrtResult contended spinlock

2024-04-05 Thread Thomas Munro
On Sat, Apr 6, 2024 at 6:55 AM Alvaro Herrera wrote: > Pushed 0001. Could that be related to the 3 failures on parula that look like this? TRAP: failed Assert("node->next == 0 && node->prev == 0"), File: "../../../../src/include/storage/proclist.h", Line: 63, PID: 29119 2024-04-05 16:16:26.812

Re: Streaming read-ready sequential scan code

2024-04-05 Thread Thomas Munro
On Sat, Apr 6, 2024 at 6:55 AM Melanie Plageman wrote: > On Fri, Apr 5, 2024 at 12:15 AM Thomas Munro wrote: > > The interesting column is hot. The 200ms->211ms regression is due to > > the extra bookkeeping in the slow path. The rejiggered fastpath code > > fixes it fo

Re: broken JIT support on Fedora 40

2024-04-05 Thread Thomas Munro
On Sun, Mar 31, 2024 at 12:49 PM Thomas Munro wrote: > https://github.com/llvm/llvm-project/pull/87093 Oh, with those clues, I think I might see... It is a bit strange that we copy attributes from AttributeTemplate(), a function that returns Datum, to our void deform function. It works (I m

Re: Streaming read-ready sequential scan code

2024-04-04 Thread Thomas Munro
, going to find my Mac... [1] https://www.postgresql.org/message-id/flat/CAApHDvpTRx7hqFZGiZJ%3Dd9JN4h1tzJ2%3Dxt7bM-9XRmpVj63psQ%40mail.gmail.com From 74b8cde45a8babcec7b52b06bdb8ea046a0a966f Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Fri, 5 Apr 2024 13:32:14 +1300 Subject: [PATCH v10 1/4]

Re: Built-in CTYPE provider

2024-04-04 Thread Thomas Munro
Hi, +command_ok( + [ + 'initdb', '--no-sync', + '--locale-provider=builtin', '-E UTF-8', + '--builtin-locale=C.UTF-8', "$tempdir/data8" + ], + 'locale provider builtin with -E UTF-8 --builtin-locale=C.UTF-8'); This Sun animal recently

Re: Streaming read-ready sequential scan code

2024-04-04 Thread Thomas Munro
On Fri, Apr 5, 2024 at 4:20 AM Melanie Plageman wrote: > So, sequential scan does not have per-buffer data. I did some logging > and the reason most fully-in-SB sequential scans don't use the fast > path is because read_stream->pending_read_nblocks is always 0. Hnghghghgh... right, sorry I

Re: Streaming read-ready sequential scan code

2024-04-04 Thread Thomas Munro
On Thu, Apr 4, 2024 at 8:02 PM David Rowley wrote: > 3a4a3537a > latency average = 34.497 ms > latency average = 34.538 ms > > 3a4a3537a + read_stream_for_seqscans.patch > latency average = 40.923 ms > latency average = 41.415 ms > > i.e. no meaningful change from the refactor, but a regression

Re: Streaming read-ready sequential scan code

2024-04-04 Thread Thomas Munro
On Thu, Apr 4, 2024 at 10:31 PM Thomas Munro wrote: > Alright what about this? Forgot to git add a change, new version. From 6dea2983abf8d608c34e02351d70694de99f25f2 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Thu, 4 Apr 2024 20:31:26 +1300 Subject: [PATCH v2 1/2] Al

Re: Streaming read-ready sequential scan code

2024-04-04 Thread Thomas Munro
t about this? From 6dea2983abf8d608c34e02351d70694de99f25f2 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Thu, 4 Apr 2024 20:31:26 +1300 Subject: [PATCH 1/2] Allow BufferAccessStrategy to limit pin count. When pinning extra buffers to look ahead, users of a strategy are in danger of pinning a lot of the b

WIP: Vectored writeback

2024-04-04 Thread Thomas Munro
d them over relevant recent commits, so I could leave them in working state in case anyone is interested in this file I/O-level stuff... From c6d227678c586387a49c30c4f9a61f62c9b04b1c Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Wed, 13 Mar 2024 17:02:42 +1300 Subject: [PATCH v5 1/3] Prov

Re: Extension Enhancement: Buffer Invalidation in pg_buffercache

2024-04-03 Thread Thomas Munro
On Fri, Mar 8, 2024 at 6:20 AM Maxim Orlov wrote: > Quite an interesting patch, in my opinion. I've decided to work on it a bit, > did some refactoring (sorry) and add > basic tests. Also, I try to take into account as much as possible notes on > the patch, mentioned by Cédric Villemain.

Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2024-04-03 Thread Thomas Munro
On Thu, Apr 4, 2024 at 11:51 AM Peter Eisentraut wrote: > On 30.03.24 22:27, Thomas Munro wrote: > > Hmm, OK so it doesn't have 3 available in parallel from base repos. > > But it's also about to reach end of "full support" in 2 months[1], so > > if we ap

Re: could not open file "global/pg_filenode.map": Operation not permitted

2024-04-03 Thread Thomas Munro
On Sat, Mar 23, 2024 at 3:01 AM Nick Renders wrote: > We now have a second machine with this issue: it is an Intel Mac mini running > macOS Sonoma (14.4) and PostgreSQL 16.2. > This one only has a single Data directory, so there are no multiple instances > running. BTW if you're running

Re: could not open file "global/pg_filenode.map": Operation not permitted

2024-04-03 Thread Thomas Munro
On Thu, Apr 4, 2024 at 3:11 AM Nick Renders wrote: > In the macOS Finder, when you show the Info (command+i) for an external drive > (or any partition that is not the boot drive), there is a checkbox "Ignore > ownership on this volume" in the Permissions section. I think it is by > default

Re: Streaming read-ready sequential scan code

2024-04-03 Thread Thomas Munro
On Thu, Apr 4, 2024 at 6:03 AM Melanie Plageman wrote: > On Tue, Apr 2, 2024 at 1:10 PM Heikki Linnakangas wrote: > > On 01/04/2024 22:58, Melanie Plageman wrote: > > > Attached v7 has version 14 of the streaming read API as well as a few > > > small tweaks to comments and code. > > > > I saw

Re: Streaming I/O, vectored I/O (WIP)

2024-04-02 Thread Thomas Munro
On Tue, Apr 2, 2024 at 9:39 PM Thomas Munro wrote: > So this is the version I'm going to commit shortly, barring objections. And done, after fixing a small snafu with smgr-only reads coming from CreateAndCopyRelationData() (BM_PERMANENT would be incorrectly/unnecessarily set for unlogged tab

Re: Building with musl in CI and the build farm

2024-04-02 Thread Thomas Munro
On Wed, Mar 27, 2024 at 11:27 AM Wolfgang Walther wrote: > The animal runs in a docker container via GitHub Actions in [2]. Great idea :-)

Re: pgsql: Implement pg_wal_replay_wait() stored procedure

2024-04-02 Thread Thomas Munro
On Wed, Apr 3, 2024 at 9:42 AM Alexander Korotkov wrote: > On Tue, Apr 2, 2024 at 10:58 PM Alexander Korotkov > wrote: > > Implement pg_wal_replay_wait() stored procedure > > I'm trying to figure out if this failure could be related to this commit... >

pgsql: Provide vectored variant of ReadBuffer().

2024-04-02 Thread Thomas Munro
advice and leaving WaitReadBuffers() to do the work synchronously. Author: Thomas Munro Author: Andres Freund (some optimization tweaks) Reviewed-by: Melanie Plageman Reviewed-by: Heikki Linnakangas Reviewed-by: Nazir Bilal Yavuz Reviewed-by: Dilip Kumar Reviewed-by: Andres Freund Tested

pgsql: Provide API for streaming relation data.

2024-04-02 Thread Thomas Munro
patterns involving predictable access to a single fork of a single relation. Several patches using this API are proposed separately. This stream concept is loosely based on ideas from Andres Freund on how we should pave the way for later work on asynchronous I/O. Author: Thomas Munro Author

pgsql: Use streaming I/O in pg_prewarm.

2024-04-02 Thread Thomas Munro
Use streaming I/O in pg_prewarm. Instead of calling ReadBuffer() repeatedly, use the new streaming interface. This commit provides a very simple example of such a transformation. Discussion: https://postgr.es/m/CA+hUKGJkOiOCa+mag4BF+zHo7qo=o9CFheB8=g6ut5tum2g...@mail.gmail.com Branch --

Re: Streaming I/O, vectored I/O (WIP)

2024-04-02 Thread Thomas Munro
a054f Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Tue, 2 Apr 2024 14:40:40 +1300 Subject: [PATCH v15 1/4] Provide vectored variant of ReadBuffer(). Break ReadBuffer() up into two steps: StartReadBuffers() and WaitReadBuffers(). This has two main advantages: 1. Multiple consecutive block

Re: pg_combinebackup --copy-file-range

2024-04-01 Thread Thomas Munro
On Tue, Apr 2, 2024 at 8:43 AM Tomas Vondra wrote: > And I think he's right, and my tests confirm this. I did a trivial patch > to align the blocks to 8K boundary, by forcing the header to be a > multiple of 8K (I think 4K alignment would be enough). See the 0001 > patch that does this. > > And

Re: pg_combinebackup --copy-file-range

2024-03-30 Thread Thomas Munro
On Sun, Mar 31, 2024 at 5:33 PM Tomas Vondra wrote: > I'm on 2.2.2 (on Linux). But there's something wrong, because the > pg_combinebackup that took ~150s on xfs/btrfs, takes ~900s on ZFS. > > I'm not sure it's a ZFS config issue, though, because it's not CPU or > I/O bound, and I see this on

Re: pg_combinebackup --copy-file-range

2024-03-30 Thread Thomas Munro
+wb = copy_file_range(s->fd, [i], wfd, NULL, BLCKSZ, 0); Can you collect adjacent blocks in one multi-block call? And then I think the contract is that you need to loop if it returns short.
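
The two suggestions above (batch adjacent blocks into one call, and loop when the call returns short) can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the patch: the helper names adjacent_run_length() and copy_range_fully() are hypothetical, BLCKSZ is assumed to be 8192, and the copy_file_range() prototype is the Linux/glibc one.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define BLCKSZ 8192				/* assumed block size */

/*
 * Hypothetical helper: starting at index 'start' in a sorted array of block
 * numbers, return how many entries form a run of consecutive blocks.  A
 * caller could then copy run_length * BLCKSZ bytes with a single request.
 */
size_t
adjacent_run_length(const unsigned *blocks, size_t nblocks, size_t start)
{
	size_t		n = 1;

	assert(start < nblocks);
	while (start + n < nblocks &&
		   blocks[start + n] == blocks[start] + (unsigned) n)
		n++;
	return n;
}

/*
 * Hypothetical helper: copy 'len' bytes from src_fd, starting at src_off,
 * to dst_fd's current file position.  copy_file_range() may copy fewer
 * bytes than requested, so keep calling it until everything is copied.
 * Returns 0 on success, -1 on error or if the source ends early.
 */
int
copy_range_fully(int src_fd, off64_t src_off, int dst_fd, size_t len)
{
	while (len > 0)
	{
		ssize_t		nbytes;

		nbytes = copy_file_range(src_fd, &src_off, dst_fd, NULL, len, 0);
		if (nbytes < 0)
		{
			if (errno == EINTR)
				continue;		/* interrupted, retry */
			return -1;
		}
		if (nbytes == 0)
			return -1;			/* source ended before 'len' bytes */

		/* The kernel already advanced src_off by nbytes. */
		len -= (size_t) nbytes;
	}
	return 0;
}
```

A caller would walk the block array, ask adjacent_run_length() for the size of each run, and pass run * BLCKSZ as the length to copy_range_fully(), so that a run of N adjacent blocks costs one system call in the common case instead of N.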

Re: pg_combinebackup --copy-file-range

2024-03-30 Thread Thomas Munro
On Sun, Mar 31, 2024 at 1:37 PM Tomas Vondra wrote: > So I decided to take a stab at Thomas' idea, i.e. reading the data to > ... > I'll see how this works on EXT4/ZFS next ... Wow, very cool! A couple of very quick thoughts/notes: ZFS: the open source version only gained per-file block

Re: broken JIT support on Fedora 40

2024-03-30 Thread Thomas Munro
On Sun, Mar 31, 2024 at 5:59 AM Dmitry Dolgov <9erthali...@gmail.com> wrote: > Yeah, sorry, I'm a bit baffled about this situation myself. Yesterday > I've opened a one-line PR fix that should address the issue, maybe this > would help. In the meantime I've attached what did work for me as a >

Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2024-03-30 Thread Thomas Munro
On Sun, Mar 31, 2024 at 9:59 AM Tom Lane wrote: > Thomas Munro writes: > > I was reminded of this thread by ambient security paranoia. As it > > stands, we require 1.0.2 (but we very much hope that package > > maintainers and others in control of builds don't decide to u

Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2024-03-30 Thread Thomas Munro
On Thu, Sep 7, 2023 at 11:44 PM Daniel Gustafsson wrote: > > On 7 Sep 2023, at 13:30, Thomas Munro wrote: > > I don't like the idea that our *next* release's library version > > horizon is controlled by Red Hat's "ELS" phase. > > Agreed. If we instead

Re: broken JIT support on Fedora 40

2024-03-29 Thread Thomas Munro
On Fri, Mar 22, 2024 at 7:15 AM Dmitry Dolgov <9erthali...@gmail.com> wrote: > > For verification, I've modified the deform.outblock to call LLVMBuildRet > > instead of LLVMBuildRetVoid and this seems to help -- inline and deform > > stages are still performed as before, but nothing crashes. But

Re: BitmapHeapScan streaming read user and prelim refactoring

2024-03-29 Thread Thomas Munro
On Sat, Mar 30, 2024 at 12:34 PM Tomas Vondra wrote: > Hmmm. I admit I didn't think about the "always prefetch" flag too much, > but I did imagine it'd only affect some places (e.g. BHS, but not for > sequential scans). If it could be done by lowering the combine limit, > that could work too - in

Re: BitmapHeapScan streaming read user and prelim refactoring

2024-03-29 Thread Thomas Munro
On Sat, Mar 30, 2024 at 12:40 PM Tomas Vondra wrote: > Sorry, I meant the prefetch (readahead) built into ZFS. I may be wrong > but I don't think the regular RA (in linux kernel) works for ZFS, right? Right, it has a separate page cache ("ARC") and prefetch settings:

Re: LLVM 18

2024-03-29 Thread Thomas Munro
On Sat, Mar 30, 2024 at 7:07 AM Christoph Berg wrote: > Ubuntu in their infinite wisdom have switched to LLVM 18 as default > for their upcoming 24.04 "noble" LTS release while Debian is still > defaulting to 16. I'm now seeing LLVM crashes on the 4 architectures > we support on noble. > > Should

Re: Security lessons from liblzma

2024-03-29 Thread Thomas Munro
On Sat, Mar 30, 2024 at 11:37 AM Bruce Momjian wrote: > You might have seen reports today about a very complex exploit added to > recent versions of liblzma. Fortunately, it was only enabled two months > ago and has not been pushed to most stable operating systems like Debian > and Ubuntu. The

Re: BitmapHeapScan streaming read user and prelim refactoring

2024-03-29 Thread Thomas Munro
On Sat, Mar 30, 2024 at 10:39 AM Thomas Munro wrote: > On Sat, Mar 30, 2024 at 4:53 AM Tomas Vondra > wrote: > > ... Maybe there should be some flag to force > > issuing fadvise even for sequential patterns, perhaps at the tablespace > > level? ... > > Yeah, I'v

Re: BitmapHeapScan streaming read user and prelim refactoring

2024-03-29 Thread Thomas Munro
On Sat, Mar 30, 2024 at 4:53 AM Tomas Vondra wrote: > Two observations: > > * The combine limit seems to have negligible impact. There's no visible > difference between combine_limit=8kB and 128kB. > > * Parallel queries seem to work about the same as master (especially for > optimal cases, but

Re: BitmapHeapScan streaming read user and prelim refactoring

2024-03-29 Thread Thomas Munro
On Sat, Mar 30, 2024 at 12:17 AM Thomas Munro wrote: > eic unpatched patched > 0 4172 9572 > 1 30846 10376 > 2 18435 5562 > 4 18980 3503 > 8 18980 2680 > 16 18976 3233 ... but the patched version gets down to a low number

Re: BitmapHeapScan streaming read user and prelim refactoring

2024-03-29 Thread Thomas Munro
I spent a bit of time today testing Melanie's v11, except with read_stream.c v13, on Linux, ext4, and 3000 IOPS cloud storage. I think I now know roughly what's going on. Here are some numbers, using your random table from above and a simple SELECT * FROM t WHERE a < 100 OR a = 123456. I'll

Re: Streaming I/O, vectored I/O (WIP)

2024-03-29 Thread Thomas Munro
On Fri, Mar 29, 2024 at 9:45 AM Heikki Linnakangas wrote: > master (213c959a29):8.0 s > streaming-api v13: 9.5 s Hmm, that's not great, and I think I know one factor that has confounded my investigation and the conflicting reports I have received from a couple of people:

Re: could not open file "global/pg_filenode.map": Operation not permitted

2024-03-28 Thread Thomas Munro
On Fri, Mar 29, 2024 at 4:47 AM Nick Renders wrote: > Looking at the 2 machines that are having this issue (and the others that > don't), I think it is somehow related to the following setup: > - macOS Sonoma (14.4 and 14.4.1) > - data directory on an external drive > > That external drive (a

Re: AIX support

2024-03-28 Thread Thomas Munro
On Fri, Mar 29, 2024 at 4:00 PM Thomas Munro wrote: > On Fri, Mar 29, 2024 at 3:48 PM Noah Misch wrote: > > The thread Alvaro and Tom cited contains an analysis. It's a compiler bug. > > You can get past the compiler bug by upgrading your compiler; both ibm-clang > > 17

Re: AIX support

2024-03-28 Thread Thomas Munro
On Fri, Mar 29, 2024 at 3:48 PM Noah Misch wrote: > On Thu, Mar 28, 2024 at 11:09:43AM +, Sriram RK wrote: > > We are setting up the build environment and trying to build the source and > > also trying to analyze the assert from the Aix point of view. > > The thread Alvaro and Tom cited

Re: BitmapHeapScan streaming read user and prelim refactoring

2024-03-28 Thread Thomas Munro
On Fri, Mar 29, 2024 at 10:43 AM Tomas Vondra wrote: > I think there's some sort of bug, triggering this assert in heapam > > Assert(BufferGetBlockNumber(hscan->rs_cbuf) == tbmres->blockno); Thanks for the repro. I can't seem to reproduce it (still trying) but I assume this is with Melanie's

Re: BitmapHeapScan streaming read user and prelim refactoring

2024-03-28 Thread Thomas Munro
On Fri, Mar 29, 2024 at 7:01 AM Tomas Vondra wrote: > On 3/28/24 06:20, Thomas Munro wrote: > > With the unexplained but apparently somewhat systematic regression > > patterns on certain tests and settings, I wonder if they might be due > > to read_stream.c trying to form

Re: Vectored I/O in bulk_write.c

2024-03-28 Thread Thomas Munro
Then I would make the trivial change to respect the new io_combine_limit GUC that I'm gearing up to commit in another thread. As attached. From 7993cede8939cad9172867ccc690a44ea25d1ad6 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Fri, 29 Mar 2024 00:22:53 +1300 Subject: [PATCH] fixup

Re: Streaming I/O, vectored I/O (WIP)

2024-03-28 Thread Thomas Munro
On Fri, Mar 29, 2024 at 12:06 AM Thomas Munro wrote: > Small bug fix: the condition in the final test at the end of > read_stream_look_ahead() wasn't quite right. In general when looking > ahead, we don't need to start a read just because the pending read > would bring us up to stre

Re: Streaming I/O, vectored I/O (WIP)

2024-03-28 Thread Thomas Munro
Small bug fix: the condition in the final test at the end of read_stream_look_ahead() wasn't quite right. In general when looking ahead, we don't need to start a read just because the pending read would bring us up to stream->distance if submitted now (we'd prefer to build it all the way up to

Re: [HACKERS] make async slave to wait for lsn to be replayed

2024-03-28 Thread Thomas Munro
> v12 Hi all, I didn't review the patch but one thing jumped out: I don't think it's OK to hold a spinlock while (1) looping over an array of backends and (2) making system calls (SetLatch()).

Re: BitmapHeapScan streaming read user and prelim refactoring

2024-03-27 Thread Thomas Munro
With the unexplained but apparently somewhat systematic regression patterns on certain tests and settings, I wonder if they might be due to read_stream.c trying to form larger reads, making it a bit lazier. It tries to see what the next block will be before issuing the fadvise. I think that means

Re: Streaming I/O, vectored I/O (WIP)

2024-03-27 Thread Thomas Munro
to adjust that as we learn more about more interesting users of _reset(). From 6b66a6412c90c8f696a8b5890596ba1ab7477191 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Mon, 26 Feb 2024 23:48:31 +1300 Subject: [PATCH v11 1/4] Provide vectored variant of ReadBuffer(). Break ReadBuffer() up into two

Re: Streaming I/O, vectored I/O (WIP)

2024-03-27 Thread Thomas Munro
On Thu, Mar 28, 2024 at 2:02 PM Thomas Munro wrote: > ... In practice on a non-toy system, that's always going to be > io_combine_limit. ... And to be more explicit about that: you're right that we initialise max_pinned_buffers such that it's usually at least io_combine_limit, but then

Re: Streaming I/O, vectored I/O (WIP)

2024-03-27 Thread Thomas Munro
On Mon, Mar 25, 2024 at 2:02 AM Thomas Munro wrote: > On Wed, Mar 20, 2024 at 4:04 AM Heikki Linnakangas wrote: > > > /* > > >* Skip the initial ramp-up phase if the caller says we're going to > > > be > > >* reading the whole

Re: Streaming I/O, vectored I/O (WIP)

2024-03-27 Thread Thomas Munro
On Thu, Mar 28, 2024 at 10:52 AM Thomas Munro wrote: > I think 1 is good, as a rescan is even more likely to find the pages > in cache, and if that turns out to be wrong it'll very soon adjust. Hmm, no I take that back, it probably won't be due to the strategy/ring... I see your poi

Re: Streaming I/O, vectored I/O (WIP)

2024-03-27 Thread Thomas Munro
On Thu, Mar 28, 2024 at 9:43 AM Melanie Plageman wrote: > For sequential scan, I added a little reset function to the streaming > read API (read_stream_reset()) that just releases all the buffers. > Previously, it set finished to true before releasing the buffers (to > indicate it was done) and

Re: Streaming I/O, vectored I/O (WIP)

2024-03-27 Thread Thomas Munro
On Wed, Mar 27, 2024 at 1:40 AM Heikki Linnakangas wrote: > Is int16 enough though? It seems so, because: > > max_pinned_buffers = Max(max_ios * 4, buffer_io_size); > > and max_ios is constrained by the GUC's maximum MAX_IO_CONCURRENCY, and > buffer_io_size is constrained by

Re: Large block sizes support in Linux

2024-03-25 Thread Thomas Munro
On Tue, Mar 26, 2024 at 3:34 AM Pankaj Raghav wrote: > One question: Does ZFS do something like FUA request to force the device > to clear the cache before it can update the node to point to the new page? > > If it doesn't do it, there is no guarantee from device to update the data > atomically

Re: Streaming I/O, vectored I/O (WIP)

2024-03-24 Thread Thomas Munro
tions _begin(), _next(), _end() to be next to each other after the static helper functions. Working on perf regression/tuning reports today, more soon... From edd3d078cf8d4b0c2f08df82295825f7107ec62b Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Mon, 26 Feb 2024 23:48:31 +1300 Subject: [

Re: Streaming I/O, vectored I/O (WIP)

2024-03-24 Thread Thomas Munro
On Wed, Mar 20, 2024 at 4:04 AM Heikki Linnakangas wrote: > On 12/03/2024 15:02, Thomas Munro wrote: > > src/backend/storage/aio/streaming_read.c > > src/include/storage/streaming_read.h > > Standard file header comments missing. Fixed. > It would be nice to h

Re: Large block sizes support in Linux

2024-03-22 Thread Thomas Munro
On Fri, Mar 22, 2024 at 10:56 PM Pankaj Raghav (Samsung) wrote: > My team and I have been working on adding Large block size(LBS) > support to XFS in Linux[1]. Once this feature lands upstream, we will be > able to create XFS with FS block size > page size of the system on Linux. > We also gave a

Re: Potential stack overflow in incremental base backup

2024-03-22 Thread Thomas Munro
On Fri, Mar 8, 2024 at 6:53 AM Robert Haas wrote: > But I think that's really only necessary if we're actually going to > get rid of the idea of segmented relations altogether, which I don't > think is happening at least for v17, and maybe not ever. Yeah, I consider the feedback on ext4's size

Re: Cannot find a working 64-bit integer type on Illumos

2024-03-22 Thread Thomas Munro
On Sat, Mar 23, 2024 at 6:26 AM Tom Lane wrote: > conftest.c:139:5: error: no previous prototype for 'does_int64_work' > [-Werror=missing-prototypes] > 139 | int does_int64_work() > | ^~~ > cc1: all warnings being treated as errors > configure:17003: $? = 1 > configure:

Re: pg_upgrade --copy-file-range

2024-03-22 Thread Thomas Munro
Hmm, this discussion seems to assume that we only use copy_file_range() to copy/clone whole segment files, right? That's great and may even get most of the available benefit given typical databases with many segments of old data that never changes, but... I think copy_file_range() allows us to

Re: Vectored I/O in bulk_write.c

2024-03-19 Thread Thomas Munro
On Sun, Mar 17, 2024 at 8:10 AM Andres Freund wrote: > I don't think zeroextend on the one hand and on the other hand a normal > write or extend are really the same operation. In the former case the content > is hard-coded; in the latter it's caller provided. Sure, we can deal with that > by

Re: Built-in CTYPE provider

2024-03-18 Thread Thomas Munro
On Tue, Mar 19, 2024 at 11:55 AM Tom Lane wrote: > Jeff Davis writes: > > On Mon, 2024-03-18 at 18:04 -0400, Tom Lane wrote: > >> This is causing all CI jobs to fail the "compiler warnings" check. > > > I did run CI before checkin, and it passed: > > https://cirrus-ci.com/build/5382423490330624

Re: Confine vacuum skip logic to lazy_scan_skip

2024-03-17 Thread Thomas Munro
On Tue, Mar 12, 2024 at 10:03 AM Melanie Plageman wrote: > I've rebased the attached v10 over top of the changes to > lazy_scan_heap() Heikki just committed and over the v6 streaming read > patch set. I started testing them and see that you are right, we no > longer pin too many buffers. However,

Re: Streaming I/O, vectored I/O (WIP)

2024-03-15 Thread Thomas Munro
I am planning to push the bufmgr.c patch soon. At that point the new API won't have any direct callers yet, but the traditional ReadBuffer() family of functions will internally reach StartReadBuffers(nblocks=1) followed by WaitReadBuffers(), ZeroBuffer() or nothing as appropriate. Any more

Re: Vectored I/O in bulk_write.c

2024-03-15 Thread Thomas Munro
I canvassed Andres off-list since smgrzeroextend() is his invention, and he wondered if it was a good idea to blur the distinction between the different zero-extension strategies like that. Good question. My take is that it's fine: mdzeroextend() already uses fallocate() only for nblocks > 8,

Re: Weird test mixup

2024-03-15 Thread Thomas Munro
On Sat, Mar 16, 2024 at 7:27 AM Tom Lane wrote: > Are there limits on the runtime of CI or cfbot jobs? Maybe > somebody should go check those systems. Those get killed at a higher level after 60 minutes (configurable but we didn't change it AFAIK): https://cirrus-ci.org/faq/#instance-timed-out

Re: broken JIT support on Fedora 40

2024-03-14 Thread Thomas Munro
For me it seems that the LLVMRunPasses() call, new in commit 76200e5ee469e4a9db5f9514b9d0c6a31b496bff Author: Thomas Munro Date: Wed Oct 18 22:15:54 2023 +1300 jit: Changes for LLVM 17. is reaching code that segfaults inside libLLVM, specifically in llvm::InlineFunction(llvm::CallBase

Re: broken JIT support on Fedora 40

2024-03-14 Thread Thomas Munro
be a > combination we have covered in the buildfarm. Yeah, 18.1 (note they switched to 1-based minor numbers, there was no 18.0) just came out a week or so ago. Despite testing their 18 branch just before their "RC1" tag, as recently as commit d282e88e50521a457fa1b36e55f43bac02a3167f

Re: Weird test mixup

2024-03-14 Thread Thomas Munro
On Fri, Mar 15, 2024 at 11:19 AM Tom Lane wrote: > Heikki Linnakangas writes: > > Somehow the 'gin-leave-leaf-split-incomplete' injection point was active > > in the 'intarray' test. That makes no sense. That injection point is > > only used by the test in src/test/modules/gin/. Perhaps that ran

Re: BitmapHeapScan streaming read user and prelim refactoring

2024-03-14 Thread Thomas Munro
On Fri, Mar 15, 2024 at 3:18 AM Tomas Vondra wrote: > So, IIUC this means (1) the patched code is more aggressive wrt > prefetching (because we prefetch more data overall, because master would > prefetch N pages and patched prefetches N ranges, each of which may be > multiple pages. And (2) it's

Re: Recent 027_streaming_regress.pl hangs

2024-03-14 Thread Thomas Munro
On Fri, Mar 15, 2024 at 7:00 AM Alexander Lakhin wrote: > Could it be that the timeout (360 sec?) is just not enough for the test > under the current (changed due to switch to meson) conditions? Hmm, well it looks like he switched over to meson around 42 days ago 2024-02-01, looking at

Re: Recent 027_streaming_regress.pl hangs

2024-03-13 Thread Thomas Munro
On Thu, Mar 14, 2024 at 3:27 PM Michael Paquier wrote: > Hmm. Perhaps 8af25652489? That looks like the closest thing in the > list that could have played with the way WAL is generated, hence > potentially impacting the records that are replayed. Yeah, I was wondering if its checkpoint delaying

Re: Recent 027_streaming_regress.pl hangs

2024-03-13 Thread Thomas Munro
On Wed, Mar 13, 2024 at 10:53 AM Thomas Munro wrote: > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-02-23%2015%3A44%3A35 Assuming it is due to a commit in master, and given the failure frequency, I think it is very likely to be a change from this 3 day window of comm

ERROR: error triggered for injection point gin-leave-leaf-split-incomplete

2024-03-13 Thread Thomas Munro
Hi, I noticed 3 regression test failures like $SUBJECT in cfbot runs for unrelated patches that probably shouldn't affect GIN, so I guess this is probably a problem in master. All three happened on FreeBSD, but I doubt that's relevant, it's just that the FreeBSD CI task was randomly selected to

Re: Volatile write caches on macOS and Windows, redux

2024-03-13 Thread Thomas Munro
Short sales pitch for these patches: * the default settings eat data on Macs and Windows * nobody understands what wal_sync_method=fsync_writethrough means anyway * it's a weird kludge that it affects not only WAL, let's clean that up

Re: Vectored I/O in bulk_write.c

2024-03-13 Thread Thomas Munro
like this. From 61b351b60d22060e5fc082645cdfc19188ac4841 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Sat, 9 Mar 2024 16:04:21 +1300 Subject: [PATCH v5 1/3] Use smgrwritev() for both overwriting and extending. Since mdwrite() and mdextend() were basically the same and both need vectored variants, merge the

Re: BitmapHeapScan streaming read user and prelim refactoring

2024-03-13 Thread Thomas Munro
On Sun, Mar 3, 2024 at 11:41 AM Tomas Vondra wrote: > On 3/2/24 23:28, Melanie Plageman wrote: > > On Sat, Mar 2, 2024 at 10:05 AM Tomas Vondra > > wrote: > >> With the current "master" code, eic=1 means we'll issue a prefetch for B > >> and then read+process A. And then issue prefetch for C and

Re: Vectored I/O in bulk_write.c

2024-03-13 Thread Thomas Munro
of the smgr implementation, it's part of the "contract" for the API. From 0a57274e29369e61712941e379c24f7db1dec068 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Sat, 9 Mar 2024 16:04:21 +1300 Subject: [PATCH v4 1/3] Merge smgrzeroextend() and smgrextend() with smgrwritev(). Sin

Re: Vectored I/O in bulk_write.c

2024-03-13 Thread Thomas Munro
On Wed, Mar 13, 2024 at 9:57 PM Heikki Linnakangas wrote: > Let's bite the bullet and merge the smgrwrite and smgrextend functions > at the smgr level too. I propose the following signature: > > #define SWF_SKIP_FSYNC 0x01 > #define SWF_EXTEND 0x02 > #define SWF_ZERO

Re: CI speed improvements for FreeBSD

2024-03-12 Thread Thomas Munro
On Wed, Mar 13, 2024 at 4:50 AM Maxim Orlov wrote: > I looked at the changes and I liked them. Here are my thoughts: Thanks for looking! Pushed.

pgsql: ci: Use a RAM disk and more CPUs on FreeBSD.

2024-03-12 Thread Thomas Munro
ci: Use a RAM disk and more CPUs on FreeBSD. Run the tests in a RAM disk. It's still a UFS file system and is backed by 20GB of disk, but this avoids a lot of I/O. Even though we disable fsync, our tests do a lot of directory manipulations, some of which force file system meta-data to disk and

Recent 027_streaming_regress.pl hangs

2024-03-12 Thread Thomas Munro
Hi, Several animals are timing out while waiting for catchup, sporadically. I don't know why. The oldest example I have found so far by clicking around is: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-02-23%2015%3A44%3A35 So perhaps something was committed ~3 weeks ago

Re: Vectored I/O in bulk_write.c

2024-03-12 Thread Thomas Munro
One more observation while I'm thinking about bulk_write.c... hmm, it writes the data out and asks the checkpointer to fsync it, but doesn't call smgrwriteback(). I assume that means that on Linux the physical writeback sometimes won't happen until the checkpointer eventually calls fsync()

Re: Streaming I/O, vectored I/O (WIP)

2024-03-12 Thread Thomas Munro
On Tue, Mar 12, 2024 at 7:40 PM Thomas Munro wrote: > possible. So in the current patch you say "hey please read these 16 > blocks" and it returns saying "only read 1", you call again with 15 Oops, typo worth correcting: s/15/16/. Point being that the caller is inter

Re: Streaming I/O, vectored I/O (WIP)

2024-03-12 Thread Thomas Munro
On Tue, Mar 12, 2024 at 7:15 PM Dilip Kumar wrote: > I am planning to review this patch set, so started going through 0001, > I have a question related to how we are issuing smgrprefetch in > StartReadBuffers() Thanks! > + /* > + * In theory we should only do this if PrepareReadBuffers() had to

Re: [PROPOSAL] Skip test citext_utf8 on Windows

2024-03-11 Thread Thomas Munro
On Tue, Mar 12, 2024 at 2:56 PM Andrew Dunstan wrote: > On 2024-03-11 Mo 04:21, Oleg Tselebrovskiy wrote: > > Greetings, everyone! > > > > While running "installchecks" on databases with UTF-8 encoding the test > > citext_utf8 fails because of Turkish dotted I like this: > > > > SELECT

Re: Confine vacuum skip logic to lazy_scan_skip

2024-03-10 Thread Thomas Munro
On Mon, Mar 11, 2024 at 5:31 AM Melanie Plageman wrote: > On Wed, Mar 6, 2024 at 6:47 PM Melanie Plageman > wrote: > > Performance results: > > > > The TL;DR of my performance results is that streaming read vacuum is > > faster. However there is an issue with the interaction of the streaming > >
