On Sun, Apr 7, 2024 at 1:34 PM Melanie Plageman <melanieplage...@gmail.com> wrote: > Attached v13 0001 is your fix and 0002 is a new version of the > sequential scan streaming read user. Off-list Andres mentioned that I > really ought to separate the parallel and serial sequential scan users > into two different callbacks. I've done that in the attached. It > actually makes the code used by the callbacks nicer and more readable > anyway (putting aside performance). I was able to measure a small > performance difference as well.
Thanks. I changed a couple of very trivial things before pushing. + BlockNumber (*cb) (ReadStream *pgsr, void *private_data, + void *per_buffer_data); This type has a friendly name: ReadStreamBlockNumberCB. + scan->rs_read_stream = read_stream_begin_relation(READ_STREAM_SEQUENTIAL, I've been on the fence about that flag for sequential scan... Some days I want to consider changing to READ_STREAM_DEFAULT and relying on our "anti-heuristics" to suppress advice, which would work out the same in most cases but might occasionally win big. It might also hurt, I dunno, so I suspect we'd have to make it better first, which my patch in [1] was a first swing at, but I haven't researched that enough. So, kept this way! - * Read the next block of the scan relation into a buffer and pin that buffer - * before saving it in the scan descriptor. + * Read the next block of the scan relation from the read stream and pin that + * buffer before saving it in the scan descriptor. Changed to: * Read the next block of the scan relation from the read stream and save it * in the scan descriptor. It is already pinned. +static BlockNumber +heap_scan_stream_read_next_parallel(ReadStream *pgsr, void *private_data, + void *per_buffer_data) Changed argument names to match the function pointer type definition, "stream" and "callback_private_data". BTW looking at the branching in read-stream user patches that have an initialisation step like yours, I wonder if it might every make sense to be able to change the callback on the fly from inside the callback, so that you finish up with a branchless one doing most of the work. I have no idea if it's worth it... [1] https://www.postgresql.org/message-id/CA%2BhUKGLLFvou5rx5FDhm-Pc9r4STQTFFmrx6SUV%2Bvk8fwMbreA%40mail.gmail.com