On Tue, Nov 7, 2017 at 1:01 PM, Andres Freund <and...@anarazel.de> wrote:
> +/*
> + * Build the name for a given segment of a given BufFile.
> + */
> +static void
> +MakeSharedSegmentName(char *name, const char *buffile_name, int segment)
> +{
> +	snprintf(name, MAXPGPATH, "%s.%d", buffile_name, segment);
> +}
>
> Not a fan of this name - you're not "making" a filename here (as in
> allocating or such). I think I'd just remove the Make prefix.

+1

Can we document the theory behind file naming here, if that isn't in
the latest version? This is a routine private to parallel hash join (or
shared tuplestore), not BufFile. Maybe BufFile should have some opinion
on this, though. Just as a matter of style.

> +/*
> + * Delete a BufFile that was created by BufFileCreateShared in the given
> + * SharedFileSet using the given name.
> + *
> + * It is not necessary to delete files explicitly with this function. It is
> + * provided only as a way to delete files proactively, rather than waiting for
> + * the SharedFileSet to be cleaned up.
> + *
> + * Only one backend should attempt to delete a given name, and should know
> + * that it exists and has been exported or closed.
> + */

This part is new to me. We now want one backend to delete a given
filename. What changed? Please provide a Message-Id reference if that
will help me to understand.

For now, I'm going to guess that this development had something to do
with the need to deal with virtual FDs that do a close() on an FD to
keep under backend limits. Do I have that right?

> +	/*
> +	 * We don't set FD_DELETE_AT_CLOSE for files opened this way, but we still
> +	 * want to make sure they get closed at end of xact.
> +	 */
> +	ResourceOwnerEnlargeFiles(CurrentResourceOwner);
> +	ResourceOwnerRememberFile(CurrentResourceOwner, file);
> +	VfdCache[file].resowner = CurrentResourceOwner;
>
> So maybe I'm being pedantic here, but wouldn't the right order be to do
> ResourceOwnerEnlargeFiles() *before* creating the file? It's a memory
> allocating operation, so it can fail, which'd leak the file.

I remember going to pains to get this right with my own unifiable
BufFile concept. I'm going to wait for an answer to my question about
file deletion + resowners before commenting further, though.
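
(For the record, the ordering Andres describes would look something
like the following. This is only a sketch -- CreateSharedSegmentFile()
is a made-up stand-in for whatever call actually creates the segment
file and assigns it a virtual FD, not the patch's real interface.)

File		file;

/*
 * ResourceOwnerEnlargeFiles() can allocate memory, and so can fail; do
 * it while there is nothing to leak yet.
 */
ResourceOwnerEnlargeFiles(CurrentResourceOwner);

/* Only now create the file */
file = CreateSharedSegmentFile(name);	/* hypothetical stand-in */

/* Neither of these steps can fail once the slot is reserved above */
ResourceOwnerRememberFile(CurrentResourceOwner, file);
VfdCache[file].resowner = CurrentResourceOwner;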
> +	if (vfdP->fdstate & FD_TEMP_FILE_LIMIT)
> +	{
> +		/* Subtract its size from current usage (do first in case of error) */
> +		temporary_files_size -= vfdP->fileSize;
> +		vfdP->fileSize = 0;
> +	}
>
> So, is it right to do so unconditionally and without regard for errors?
> If the file isn't deleted, it shouldn't be subtracted from fileSize. I
> guess you're managing that through the flag, but that's not entirely
> obvious.

I think that the problem here is that the accounting is expected to
always work. It's not like there is a resowner style error path in
which temporary_files_size gets reset to 0.

> Is that entirely unproblematic? Are there any DSM callbacks that rely on
> locks still being held? Please split this part into a separate commit
> with such analysis.

I was always confused about the proper use of DSM callbacks myself.
They seemed generally underdocumented.

> +	/* Create the output buffer. */
> +	if (accessor->write_chunk != NULL)
> +		pfree(accessor->write_chunk);
> +	accessor->write_chunk = (SharedTuplestoreChunk *)
> +		palloc0(accessor->write_pages * BLCKSZ);
>
> Are we guaranteed to be in a long-lived memory context here?

I imagine that Thomas looked to tuplestore_begin_heap() + interXact as
a kind of precedent here. See comments above that function. (A rough
sketch of how the lifetime could be made explicit instead appears
below, after my sig.)

--
Peter Geoghegan
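
PS: Here is the sketch I mentioned. One way to take the caller's
CurrentMemoryContext out of the picture would be to allocate the chunk
in a context tied to the accessor itself. The accessor->context field
is made up for illustration; I'm not claiming the patch has or needs
one:

MemoryContext	oldcontext;

/* Create the output buffer in a context with the accessor's lifetime */
oldcontext = MemoryContextSwitchTo(accessor->context);	/* hypothetical */
if (accessor->write_chunk != NULL)
	pfree(accessor->write_chunk);
accessor->write_chunk = (SharedTuplestoreChunk *)
	palloc0(accessor->write_pages * BLCKSZ);
MemoryContextSwitchTo(oldcontext);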