On Sat, Apr 02, 2022 at 07:21:11PM +0200, Alvaro Herrera wrote: > Small things here.
> 1. in VACUUM FULL we only process partitions that are owned by the > invoking user. We don't have this test in the new code. I'm not sure > why do we do that there; is it worth doing the same here? That dates to a556549d7 (see also cbe24a6dd8 for an earlier commit in CLUSTER itself). The reason was to avoid blocking if an unprivileged user runs VACUUM FULL which would try to lock things (including shared catalogs) before checking if they have permission to vacuum them. That commit also initially checks the owner of the partitioned table, and then re-checking owner of partitions later on. A similar issue exists here. But 1) catalog tables are not partitioned, and, 2) ownership of a partitioned table is checked immediately. So the problem can only occur if a user who owns a partitioned table but doesn't own all its partitions tries to cluster it, and it blocks behind another session. Fixing this is probably a good idea, but seems improbable that it would avoid a DOS. > 2. We should silently skip a partition that's a foreign table, I > suppose. I think it's not needed, since the loop over index children doesn't see a child index on the foreign table. ? > 3. We do mark the index on the partitions as indisclustered AFAICS (we > claim that the partitioned table's index is not marked, which is > accurate). So users doing unadorned CLUSTER afterwards will get the > partitions clustered too, once they cluster the partitioned table. If > they don't want this, they would have to ALTER TABLE to remove the > marking. How likely is that this will be a problem? Maybe documenting > this point is enough. It seems at least as likely that someone would *want* the partitions to be marked clustered as that someone would want them to be unchanged. The cluster mark accurately reflects having been clustered. It seems unlikely that a user would want something else to be clustered later by "cluster;". Since clustering on a partitioned table wasn't supported before, nothing weird will happen to someone who upgrades to v15 unless they elect to use the new feature. As this seems to be POLA, it doesn't even need to be documented. ?
>From 19c02209dfbcbf73494e1e6d3ca0db50f64dc5fd Mon Sep 17 00:00:00 2001 From: Justin Pryzby <pryz...@telsasoft.com> Date: Fri, 24 Sep 2021 14:18:52 -0500 Subject: [PATCH v13 1/2] Recheck arg is not needed since 2011 It was added at: a4dde3bff36dac1ac0b699becad6f103d861a874 And not used since: 7e2f906201c8bb95f7fb17e56b8740c38bda5441 --- src/backend/commands/cluster.c | 6 +++--- src/backend/commands/tablecmds.c | 2 +- src/include/commands/cluster.h | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c index 322d6bb2f18..0f0a6e9f018 100644 --- a/src/backend/commands/cluster.c +++ b/src/backend/commands/cluster.c @@ -232,7 +232,7 @@ cluster(ParseState *pstate, ClusterStmt *stmt, bool isTopLevel) if (rel != NULL) { Assert(rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE); - check_index_is_clusterable(rel, indexOid, true, AccessShareLock); + check_index_is_clusterable(rel, indexOid, AccessShareLock); rtcs = get_tables_to_cluster_partitioned(cluster_context, indexOid); /* close relation, releasing lock on parent table */ @@ -434,7 +434,7 @@ cluster_rel(Oid tableOid, Oid indexOid, ClusterParams *params) /* Check heap and index are valid to cluster on */ if (OidIsValid(indexOid)) - check_index_is_clusterable(OldHeap, indexOid, recheck, AccessExclusiveLock); + check_index_is_clusterable(OldHeap, indexOid, AccessExclusiveLock); /* * Quietly ignore the request if this is a materialized view which has not @@ -480,7 +480,7 @@ cluster_rel(Oid tableOid, Oid indexOid, ClusterParams *params) * protection here. */ void -check_index_is_clusterable(Relation OldHeap, Oid indexOid, bool recheck, LOCKMODE lockmode) +check_index_is_clusterable(Relation OldHeap, Oid indexOid, LOCKMODE lockmode) { Relation OldIndex; diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 90edd0bb97d..1d7db41d172 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -14055,7 +14055,7 @@ ATExecClusterOn(Relation rel, const char *indexName, LOCKMODE lockmode) indexName, RelationGetRelationName(rel)))); /* Check index is valid to cluster on */ - check_index_is_clusterable(rel, indexOid, false, lockmode); + check_index_is_clusterable(rel, indexOid, lockmode); /* And do the work */ mark_index_clustered(rel, indexOid, false); diff --git a/src/include/commands/cluster.h b/src/include/commands/cluster.h index 3c279f6210a..df8e73af409 100644 --- a/src/include/commands/cluster.h +++ b/src/include/commands/cluster.h @@ -34,7 +34,7 @@ typedef struct ClusterParams extern void cluster(ParseState *pstate, ClusterStmt *stmt, bool isTopLevel); extern void cluster_rel(Oid tableOid, Oid indexOid, ClusterParams *params); extern void check_index_is_clusterable(Relation OldHeap, Oid indexOid, - bool recheck, LOCKMODE lockmode); + LOCKMODE lockmode); extern void mark_index_clustered(Relation rel, Oid indexOid, bool is_internal); extern Oid make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, Oid NewAccessMethod, -- 2.17.1
>From 4fc03463d51cc55de676813a7a8ac7321ecc7be7 Mon Sep 17 00:00:00 2001 From: Justin Pryzby <pryz...@telsasoft.com> Date: Wed, 6 Apr 2022 07:23:39 -0500 Subject: [PATCH v13 2/2] cluster: early ownership check of partitions Similar to a556549d7, check the ownership of partitions before queueing them to be clustered. Otherwise, we may later wait on an exclusive lock on a partition only for it to fails ownership check anyway, causing other queries to wait behind cluster. --- src/backend/commands/cluster.c | 19 +++++++++++++++---- src/test/regress/expected/cluster.out | 21 +++++++++++++++++++++ src/test/regress/sql/cluster.sql | 15 +++++++++++++++ 3 files changed, 51 insertions(+), 4 deletions(-) diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c index 0f0a6e9f018..671ea3f45a2 100644 --- a/src/backend/commands/cluster.c +++ b/src/backend/commands/cluster.c @@ -1663,9 +1663,6 @@ get_tables_to_cluster_partitioned(MemoryContext cluster_context, Oid indexOid) /* Do not lock the children until they're processed */ inhoids = find_all_inheritors(indexOid, NoLock, NULL); - /* Use a permanent memory context for the result list */ - old_context = MemoryContextSwitchTo(cluster_context); - foreach(lc, inhoids) { Oid indexrelid = lfirst_oid(lc); @@ -1676,12 +1673,26 @@ get_tables_to_cluster_partitioned(MemoryContext cluster_context, Oid indexOid) if (get_rel_relkind(indexrelid) != RELKIND_INDEX) continue; + /* + * Silently skip partitions which the user has no access to. They'll + * be skipped later anyway, but this avoids blocking before the + * permission check if another session is reading the table. + */ + if (!pg_class_ownercheck(relid, GetUserId()) && + (!pg_database_ownercheck(MyDatabaseId, GetUserId()) || + IsSharedRelation(relid))) + continue; + + /* Use a permanent memory context for the result list */ + old_context = MemoryContextSwitchTo(cluster_context); + rtc = (RelToCluster *) palloc(sizeof(RelToCluster)); rtc->tableOid = relid; rtc->indexOid = indexrelid; rtcs = lappend(rtcs, rtc); + + MemoryContextSwitchTo(old_context); } - MemoryContextSwitchTo(old_context); return rtcs; } diff --git a/src/test/regress/expected/cluster.out b/src/test/regress/expected/cluster.out index 953818c74e1..08e10d89dcf 100644 --- a/src/test/regress/expected/cluster.out +++ b/src/test/regress/expected/cluster.out @@ -493,6 +493,27 @@ ERROR: cannot mark index clustered in partitioned table ALTER TABLE clstrpart CLUSTER ON clstrpart_idx; ERROR: cannot mark index clustered in partitioned table DROP TABLE clstrpart; +-- Ownership of partitions is checked +CREATE TABLE ptnowner(i int unique) PARTITION BY LIST (i); +CREATE INDEX ptnowner_i_idx ON ptnowner(i); +CREATE TABLE ptnowner1 PARTITION OF ptnowner FOR VALUES IN (1); +CREATE ROLE ptnowner; +CREATE TABLE ptnowner2 PARTITION OF ptnowner FOR VALUES IN (2); +ALTER TABLE ptnowner1 OWNER TO ptnowner; +ALTER TABLE ptnowner OWNER TO ptnowner; +SET SESSION AUTHORIZATION ptnowner; +CREATE TEMP TABLE ptnowner_oldnodes AS SELECT oid, relname, relfilenode FROM pg_partition_tree('ptnowner') AS tree JOIN pg_class AS c ON c.oid=tree.relid; +CLUSTER ptnowner USING ptnowner_i_idx; +SELECT a.relname, a.relfilenode=b.relfilenode FROM pg_class a JOIN ptnowner_oldnodes b USING (oid) ORDER BY a.relname COLLATE "C"; + relname | ?column? +-----------+---------- + ptnowner | t + ptnowner1 | f + ptnowner2 | t +(3 rows) + +RESET SESSION AUTHORIZATION; +DROP TABLE ptnowner; -- Test CLUSTER with external tuplesorting create table clstr_4 as select * from tenk1; create index cluster_sort on clstr_4 (hundred, thousand, tenthous); diff --git a/src/test/regress/sql/cluster.sql b/src/test/regress/sql/cluster.sql index 5601684ee3f..631eec1b739 100644 --- a/src/test/regress/sql/cluster.sql +++ b/src/test/regress/sql/cluster.sql @@ -229,6 +229,21 @@ ALTER TABLE clstrpart SET WITHOUT CLUSTER; ALTER TABLE clstrpart CLUSTER ON clstrpart_idx; DROP TABLE clstrpart; +-- Ownership of partitions is checked +CREATE TABLE ptnowner(i int unique) PARTITION BY LIST (i); +CREATE INDEX ptnowner_i_idx ON ptnowner(i); +CREATE TABLE ptnowner1 PARTITION OF ptnowner FOR VALUES IN (1); +CREATE ROLE ptnowner; +CREATE TABLE ptnowner2 PARTITION OF ptnowner FOR VALUES IN (2); +ALTER TABLE ptnowner1 OWNER TO ptnowner; +ALTER TABLE ptnowner OWNER TO ptnowner; +SET SESSION AUTHORIZATION ptnowner; +CREATE TEMP TABLE ptnowner_oldnodes AS SELECT oid, relname, relfilenode FROM pg_partition_tree('ptnowner') AS tree JOIN pg_class AS c ON c.oid=tree.relid; +CLUSTER ptnowner USING ptnowner_i_idx; +SELECT a.relname, a.relfilenode=b.relfilenode FROM pg_class a JOIN ptnowner_oldnodes b USING (oid) ORDER BY a.relname COLLATE "C"; +RESET SESSION AUTHORIZATION; +DROP TABLE ptnowner; + -- Test CLUSTER with external tuplesorting create table clstr_4 as select * from tenk1; -- 2.17.1