Parallell hashjoin sometimes ignores temp_tablespaces

2020-06-29 Thread Magnus Hagander
If a database (a) has a default tablespace set,

Reproduction:

CREATE TABLESPACE t LOCATION '/tmp/t';
CREATE DATABASE dumb TABLESPACE t;
\c dumb
SET temp_tablespaces=t;

At this point if you run a query with a parallel hash join in it, the
tempfiles go in base/pgsql_tmp instead of the temporary tablespace. For
example:

create table foo(bar int);
insert into foo select * from generate_series(1,100);
set parallel_tuple_cost =0;
set parallel_setup_cost =0;
set log_temp_files=0;
set client_min_messages ='log';
explain analyze select foo.bar,count(*) from foo inner join foo foo2 on
foo.bar=foo2.bar group by foo.bar;

Will trigger some temp files in the 't' tablespace and some in the
'pg_default' one.

I think the fix is the attached one (tested on version 11 which is what
$customer is using).  To me it looks like this may have been a copy/paste
error all the way back in 98e8b480532 which added default_tablespace back
in 2004. (And is in itself entirely unrelated to parallel hashjoin, but
that's where it got exposed at least in my case)

Thoughts?

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index 946777f48b..3105efe040 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -1354,7 +1354,7 @@ PrepareTempTablespaces(void)
 		 */
 		if (curoid == MyDatabaseTableSpace)
 		{
-			tblSpcs[numSpcs++] = InvalidOid;
+			tblSpcs[numSpcs++] = curoid;
 			continue;
 		}
 


Re: Parallell hashjoin sometimes ignores temp_tablespaces

2020-06-29 Thread Daniel Gustafsson
> On 29 Jun 2020, at 17:02, Magnus Hagander  wrote:

> I think the fix is the attached one (tested on version 11 which is what 
> $customer is using).  To me it looks like this may have been a copy/paste 
> error all the way back in 98e8b480532 which added default_tablespace back in 
> 2004. (And is in itself entirely unrelated to parallel hashjoin, but that's 
> where it got exposed at least in my case)

Running through the repro and patch on HEAD I confirm that the attached fixes
the issue. +1 for the patch and a backpatch of it.

It would be nice to have a test covering test_tablespaces, but it seems a tad
cumbersome to create a stable one.

cheers ./daniel



Re: Parallell hashjoin sometimes ignores temp_tablespaces

2020-07-03 Thread Magnus Hagander
On Mon, Jun 29, 2020 at 10:26 PM Daniel Gustafsson  wrote:

> > On 29 Jun 2020, at 17:02, Magnus Hagander  wrote:
>
> > I think the fix is the attached one (tested on version 11 which is what
> $customer is using).  To me it looks like this may have been a copy/paste
> error all the way back in 98e8b480532 which added default_tablespace back
> in 2004. (And is in itself entirely unrelated to parallel hashjoin, but
> that's where it got exposed at least in my case)
>
> Running through the repro and patch on HEAD I confirm that the attached
> fixes
> the issue. +1 for the patch and a backpatch of it.
>
> It would be nice to have a test covering test_tablespaces, but it seems a
> tad
> cumbersome to create a stable one.
>

Thanks. pushed!

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Parallell hashjoin sometimes ignores temp_tablespaces

2020-07-03 Thread Tom Lane
Magnus Hagander  writes:
> Thanks. pushed!

Sorry for not having paid more attention earlier, but this patch is
quite broken.  If it weren't misguided it'd still be wrong, because
this isn't the only spot in PrepareTempTablespaces that inserts
InvalidOid into the output list.

But, in fact, it's intentional that we represent the DB's default
tablespace by InvalidOid in that list.  Some callers of
GetNextTempTableSpace need that to be the case, either for
permissions-checking reasons or because they're going to store the
result into a temp table's pg_class.reltablespace, where that
representation is *required*.

I see that this is perhaps underdocumented, since while
GetNextTempTableSpace's comment mentions the behavior, there's
no comment about it with the data structure proper.

It looks to me like the actual bug here is that whoever added
GetTempTablespaces() and made sharedfileset.c depend on it
did not get the memo about what to do with InvalidOid.
It's possible that we could safely make GetTempTablespaces()
do the substitution, but that would be making fd.c assume more
about the usage of GetTempTablespaces() than I think it ought to.
I feel like we oughta fix sharedfileset.c, instead.

regards, tom lane




Re: Parallell hashjoin sometimes ignores temp_tablespaces

2020-07-03 Thread Magnus Hagander
On Fri, Jul 3, 2020 at 4:16 PM Tom Lane  wrote:

> Magnus Hagander  writes:
> > Thanks. pushed!
>
> Sorry for not having paid more attention earlier, but this patch is
> quite broken.  If it weren't misguided it'd still be wrong, because
> this isn't the only spot in PrepareTempTablespaces that inserts
> InvalidOid into the output list.
>
> But, in fact, it's intentional that we represent the DB's default
> tablespace by InvalidOid in that list.  Some callers of
> GetNextTempTableSpace need that to be the case, either for
> permissions-checking reasons or because they're going to store the
> result into a temp table's pg_class.reltablespace, where that
> representation is *required*.
>

Hmm. I guess I must've been careless in checking other callers.


I see that this is perhaps underdocumented, since while
> GetNextTempTableSpace's comment mentions the behavior, there's
> no comment about it with the data structure proper.
>

Yeah, it could definitely do with that. It was too many steps of
indirections away to me to pick that up.


It looks to me like the actual bug here is that whoever added
> GetTempTablespaces() and made sharedfileset.c depend on it
> did not get the memo about what to do with InvalidOid.
> It's possible that we could safely make GetTempTablespaces()
> do the substitution, but that would be making fd.c assume more
> about the usage of GetTempTablespaces() than I think it ought to.
> I feel like we oughta fix sharedfileset.c, instead.
>

This seems to be in dc6c4c9dc2a -- adding Thomas.

A quick look -- to do things right, we will need to know the database
default tablespace in this case right? Which I guess isn't there because
the shared fileset isn't tied to a database. But perhaps it's as easy as
something like the attached, just overwriting the oid?

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index 946777f48b..6d44d0480d 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -1152,6 +1152,10 @@ GetDefaultTablespace(char relpersistence)
 typedef struct
 {
 	int			numSpcs;
+	/*
+	 * InvalidOid stored in the tblSpcs member means use the database default
+	 * tablespace.
+	 */
 	Oid			tblSpcs[FLEXIBLE_ARRAY_MEMBER];
 } temp_tablespaces_extra;
 
@@ -1226,6 +1230,7 @@ check_temp_tablespaces(char **newval, void **extra, GucSource source)
 			 */
 			if (curoid == MyDatabaseTableSpace)
 			{
+/* The default tablespace is indicated by the value InvalidOid */
 tblSpcs[numSpcs++] = InvalidOid;
 continue;
 			}
diff --git a/src/backend/storage/file/sharedfileset.c b/src/backend/storage/file/sharedfileset.c
index d41b39a177..9c412959d8 100644
--- a/src/backend/storage/file/sharedfileset.c
+++ b/src/backend/storage/file/sharedfileset.c
@@ -64,6 +64,20 @@ SharedFileSetInit(SharedFileSet *fileset, dsm_segment *seg)
 		fileset->tablespaces[0] = DEFAULTTABLESPACE_OID;
 		fileset->ntablespaces = 1;
 	}
+	else
+	{
+		int i;
+
+		/*
+		 * Tablespace set to InvalidOid means use the default tablespace for
+		 * the database.
+		 */
+		for (i = 0; i < fileset->ntablespaces; i++)
+		{
+			if (fileset->tablespaces[i] == InvalidOid)
+fileset->tablespaces[i] = MyDatabaseTableSpace;
+		}
+	}
 
 	/* Register our cleanup callback. */
 	on_dsm_detach(seg, SharedFileSetOnDetach, PointerGetDatum(fileset));


Re: Parallell hashjoin sometimes ignores temp_tablespaces

2020-07-03 Thread Tom Lane
Magnus Hagander  writes:
> A quick look -- to do things right, we will need to know the database
> default tablespace in this case right? Which I guess isn't there because
> the shared fileset isn't tied to a database. But perhaps it's as easy as
> something like the attached, just overwriting the oid?

Yeah, we just have to pick an appropriate place for making the
substitution.  I have no objection to doing it in SharedFileSetInit, as
long as we're sure it will only be consulted for placing temp files and
not relations.

The lack of documentation seems to be my fault, so I'm willing to pick
this up unless somebody else wants it.

regards, tom lane




Re: Parallell hashjoin sometimes ignores temp_tablespaces

2020-07-03 Thread Magnus Hagander
On Fri, Jul 3, 2020 at 6:12 PM Tom Lane  wrote:

> Magnus Hagander  writes:
> > A quick look -- to do things right, we will need to know the database
> > default tablespace in this case right? Which I guess isn't there because
> > the shared fileset isn't tied to a database. But perhaps it's as easy as
> > something like the attached, just overwriting the oid?
>
> Yeah, we just have to pick an appropriate place for making the
> substitution.  I have no objection to doing it in SharedFileSetInit, as
> long as we're sure it will only be consulted for placing temp files and
> not relations.
>

It doesn't *now*, and I'm pretty sure it can't be in the future the way it
is now (a parallel worker can't be creating relations). But it is probably
a good idea to add a comment indicating this as well...


>
> The lack of documentation seems to be my fault, so I'm willing to pick
> this up unless somebody else wants it.
>

If the comments I included in that patch are enough, I can just commit
those along with it. Otherwise, please do :)

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Parallell hashjoin sometimes ignores temp_tablespaces

2020-07-03 Thread Tom Lane
Magnus Hagander  writes:
> On Fri, Jul 3, 2020 at 6:12 PM Tom Lane  wrote:
>> The lack of documentation seems to be my fault, so I'm willing to pick
>> this up unless somebody else wants it.

> If the comments I included in that patch are enough, I can just commit
> those along with it. Otherwise, please do :)

Being once burned, I had something more like the attached in mind.

BTW, looking at this, I'm kind of wondering about the other code path
in SharedFileSetInit:

if (fileset->ntablespaces == 0)
{
fileset->tablespaces[0] = DEFAULTTABLESPACE_OID;
fileset->ntablespaces = 1;
}

Shouldn't that be inserting MyDatabaseTableSpace?  I see no other places
anywhere that are forcing temp stuff into pg_default like this.

regards, tom lane

diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index f887ee9857..2c3b9050b2 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -1183,6 +1183,7 @@ GetDefaultTablespace(char relpersistence, bool partitioned)
 
 typedef struct
 {
+	/* Array of OIDs to be passed to SetTempTablespaces() */
 	int			numSpcs;
 	Oid			tblSpcs[FLEXIBLE_ARRAY_MEMBER];
 } temp_tablespaces_extra;
@@ -1232,6 +1233,7 @@ check_temp_tablespaces(char **newval, void **extra, GucSource source)
 			/* Allow an empty string (signifying database default) */
 			if (curname[0] == '\0')
 			{
+/* InvalidOid signifies database's default tablespace */
 tblSpcs[numSpcs++] = InvalidOid;
 continue;
 			}
@@ -1258,6 +1260,7 @@ check_temp_tablespaces(char **newval, void **extra, GucSource source)
 			 */
 			if (curoid == MyDatabaseTableSpace)
 			{
+/* InvalidOid signifies database's default tablespace */
 tblSpcs[numSpcs++] = InvalidOid;
 continue;
 			}
@@ -1368,6 +1371,7 @@ PrepareTempTablespaces(void)
 		/* Allow an empty string (signifying database default) */
 		if (curname[0] == '\0')
 		{
+			/* InvalidOid signifies database's default tablespace */
 			tblSpcs[numSpcs++] = InvalidOid;
 			continue;
 		}
@@ -1386,7 +1390,8 @@ PrepareTempTablespaces(void)
 		 */
 		if (curoid == MyDatabaseTableSpace)
 		{
-			tblSpcs[numSpcs++] = curoid;
+			/* InvalidOid signifies database's default tablespace */
+			tblSpcs[numSpcs++] = InvalidOid;
 			continue;
 		}
 
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index 7dc6dd2f15..5f6420efb2 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -264,8 +264,10 @@ static int	numExternalFDs = 0;
 static long tempFileCounter = 0;
 
 /*
- * Array of OIDs of temp tablespaces.  When numTempTableSpaces is -1,
- * this has not been set in the current transaction.
+ * Array of OIDs of temp tablespaces.  (Some entries may be InvalidOid,
+ * indicating that the current database's default tablespace should be used.)
+ * When numTempTableSpaces is -1, this has not been set in the current
+ * transaction.
  */
 static Oid *tempTableSpaces = NULL;
 static int	numTempTableSpaces = -1;
@@ -2779,6 +2781,9 @@ closeAllVfds(void)
  * unless this function is called again before then.  It is caller's
  * responsibility that the passed-in array has adequate lifespan (typically
  * it'd be allocated in TopTransactionContext).
+ *
+ * Some entries of the array may be InvalidOid, indicating that the current
+ * database's default tablespace should be used.
  */
 void
 SetTempTablespaces(Oid *tableSpaces, int numSpaces)
@@ -2818,7 +2823,10 @@ TempTablespacesAreSet(void)
  * GetTempTablespaces
  *
  * Populate an array with the OIDs of the tablespaces that should be used for
- * temporary files.  Return the number that were copied into the output array.
+ * temporary files.  (Some entries may be InvalidOid, indicating that the
+ * current database's default tablespace should be used.)  At most numSpaces
+ * entries will be filled.
+ * Returns the number of OIDs that were copied into the output array.
  */
 int
 GetTempTablespaces(Oid *tableSpaces, int numSpaces)
diff --git a/src/backend/storage/file/sharedfileset.c b/src/backend/storage/file/sharedfileset.c
index f7206c9175..0f65210476 100644
--- a/src/backend/storage/file/sharedfileset.c
+++ b/src/backend/storage/file/sharedfileset.c
@@ -66,6 +66,21 @@ SharedFileSetInit(SharedFileSet *fileset, dsm_segment *seg)
 		fileset->tablespaces[0] = DEFAULTTABLESPACE_OID;
 		fileset->ntablespaces = 1;
 	}
+	else
+	{
+		int			i;
+
+		/*
+		 * An entry of InvalidOid means use the default tablespace for the
+		 * current database.  Replace that now, to be sure that all users of
+		 * the SharedFileSet agree on what to do.
+		 */
+		for (i = 0; i < fileset->ntablespaces; i++)
+		{
+			if (fileset->tablespaces[i] == InvalidOid)
+fileset->tablespaces[i] = MyDatabaseTableSpace;
+		}
+	}
 
 	/* Register our cleanup callback. */
 	on_dsm_detach(seg, SharedFileSetOnDetach, PointerGetDatum(fileset));


Re: Parallell hashjoin sometimes ignores temp_tablespaces

2020-07-03 Thread Magnus Hagander
On Fri, Jul 3, 2020 at 7:06 PM Tom Lane  wrote:

> Magnus Hagander  writes:
> > On Fri, Jul 3, 2020 at 6:12 PM Tom Lane  wrote:
> >> The lack of documentation seems to be my fault, so I'm willing to pick
> >> this up unless somebody else wants it.
>
> > If the comments I included in that patch are enough, I can just commit
> > those along with it. Otherwise, please do :)
>
> Being once burned, I had something more like the attached in mind.
>

That's a bit more elaborate and yes, I agree, better.


BTW, looking at this, I'm kind of wondering about the other code path
> in SharedFileSetInit:
>
> if (fileset->ntablespaces == 0)
> {
> fileset->tablespaces[0] = DEFAULTTABLESPACE_OID;
> fileset->ntablespaces = 1;
> }
>
> Shouldn't that be inserting MyDatabaseTableSpace?  I see no other places
> anywhere that are forcing temp stuff into pg_default like this.
>

Yeah, looking at it again, I think it should. I can't see any reason why it
should enforce pg_default.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Parallell hashjoin sometimes ignores temp_tablespaces

2020-07-03 Thread Tom Lane
Magnus Hagander  writes:
> On Fri, Jul 3, 2020 at 7:06 PM Tom Lane  wrote:
>> Shouldn't that be inserting MyDatabaseTableSpace?  I see no other places
>> anywhere that are forcing temp stuff into pg_default like this.

> Yeah, looking at it again, I think it should. I can't see any reason why it
> should enforce pg_default.

OK, pushed with that additional correction.

regards, tom lane




Re: Parallell hashjoin sometimes ignores temp_tablespaces

2020-07-05 Thread Magnus Hagander
On Fri, Jul 3, 2020 at 11:02 PM Tom Lane  wrote:

> Magnus Hagander  writes:
> > On Fri, Jul 3, 2020 at 7:06 PM Tom Lane  wrote:
> >> Shouldn't that be inserting MyDatabaseTableSpace?  I see no other places
> >> anywhere that are forcing temp stuff into pg_default like this.
>
> > Yeah, looking at it again, I think it should. I can't see any reason why
> it
> > should enforce pg_default.
>
> OK, pushed with that additional correction.
>

LGTM, thanks!

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/