On 19/11/2022 13:00, Peter Eisentraut wrote:
On 18.10.21 14:15, Heikki Linnakangas wrote:
On 05/10/2021 20:24, John Naylor wrote:
I've had a chance to review and test out the v5 patches.

Thanks! I fixed the stray reference to PostgreSQL 14 that Zhihong
mentioned, and pushed.

AFAICT, this thread updated the API of LogicalTapeSetCreate() in PG15,
but did not adequately update the function header comment.  The comment
still mentions the "shared" argument, which has been removed.  There is
a new "preallocate" argument that is not mentioned at all.  Also, it's
not easy to match the actual callers to the call variants described in
the comment.  Could someone who remembers this work perhaps look this
over and update the comment?

Is the attached more readable?

I'm not 100% sure of the "preallocate" argument. If I understand correctly, you should pass true if you are writing multiple tapes at the same time, and false otherwise. HashAgg passed true, tuplesort passes false. However, it's not clear to me why we couldn't just always do the preallocation. It seems pretty harmless even if it's not helpful. Or do it when there are multiple writer tapes, and not otherwise. The parameter was added in commit 0758964963 so presumably there was a reason, but at a quick glance at the thread that led to that commit, I couldn't see what it was.

- Heikki
From 1c5d4e28a21f6052d22667e79b5bd443aec3a5b5 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakan...@iki.fi>
Date: Mon, 21 Nov 2022 01:55:05 +0200
Subject: [PATCH 1/1] Fix and clarify function comment on LogicalTapeSetCreate.

Commit c4649cce39 removed the "shared" and "ntapes" arguments, but the
comment still talked about "shared". It also talked about "a shared
file handle", which was technically correct because it even before
commit c4649cce39, the "shared file handle" referred to the "fileset"
argument, not "shared". But it was very confusing. Improve the
comment.

Also add a comment on what the "preallocate" argument does.

Discussion: https://www.postgresql.org/message-id/af989685-91d5-aad4-8f60-1d066b5ec...@enterprisedb.com
---
 src/backend/utils/sort/logtape.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/backend/utils/sort/logtape.c b/src/backend/utils/sort/logtape.c
index c384f98e13..b518a4b9c5 100644
--- a/src/backend/utils/sort/logtape.c
+++ b/src/backend/utils/sort/logtape.c
@@ -544,14 +544,20 @@ ltsInitReadBuffer(LogicalTape *lt)
  * The tape set is initially empty. Use LogicalTapeCreate() to create
  * tapes in it.
  *
- * Serial callers pass NULL argument for shared, and -1 for worker.  Parallel
- * worker callers pass a shared file handle and their own worker number.
+ * In a single-process sort, pass NULL argument for fileset, and -1 for
+ * worker.
  *
- * Leader callers pass a shared file handle and -1 for worker. After creating
- * the tape set, use LogicalTapeImport() to import the worker tapes into it.
+ * In a parallel sort, parallel workers pass the shared fileset handle and
+ * their own worker number.  After the workers have finished, create the
+ * tape set in the leader, passing the shared fileset handle and -1 for
+ * worker, and use LogicalTapeImport() to import the worker tapes into it.
  *
  * Currently, the leader will only import worker tapes into the set, it does
  * not create tapes of its own, although in principle that should work.
+ *
+ * If preallocate is true, blocks for each individual tape are allocated in
+ * batches.  This avoids fragmentation when writing multiple tapes at the
+ * same time.
  */
 LogicalTapeSet *
 LogicalTapeSetCreate(bool preallocate, SharedFileSet *fileset, int worker)
-- 
2.30.2

Reply via email to