Re: A reloption for partitioned tables - parallel_workers
hi Amit, Thanks so much for doing this. I had created https://commitfest.postgresql.org/32/2987/ and it looks like it now shows your patch as the one to use. Let me know if I should do anything else. Best, Seamus On Fri, Feb 19, 2021, at 2:30 AM, Amit Langote wrote: > On Tue, Feb 16, 2021 at 3:05 PM Amit Langote wrote: > > On Tue, Feb 16, 2021 at 1:35 AM Seamus Abshere wrote: > > > Here we go, my first patch... solves > > > https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520e...@www.fastmail.com > > > > Thanks for sending the patch here. > > > > It seems you haven't made enough changes for reloptions code to > > recognize parallel_workers as valid for partitioned tables, because > > even with the patch applied, I get this: > > > > create table rp (a int) partition by range (a); > > create table rp1 partition of rp for values from (minvalue) to (0); > > create table rp2 partition of rp for values from (0) to (maxvalue); > > alter table rp set (parallel_workers = 1); > > ERROR: unrecognized parameter "parallel_workers" > > > > You need this: > > > > diff --git a/src/backend/access/common/reloptions.c > > b/src/backend/access/common/reloptions.c > > index 029a73325e..9eb8a0c10d 100644 > > --- a/src/backend/access/common/reloptions.c > > +++ b/src/backend/access/common/reloptions.c > > @@ -377,7 +377,7 @@ static relopt_int intRelOpts[] = > > { > > "parallel_workers", > > "Number of parallel processes that can be used per > > executor node for this relation.", > > - RELOPT_KIND_HEAP, > > + RELOPT_KIND_HEAP | RELOPT_KIND_PARTITIONED, > > ShareUpdateExclusiveLock > > }, > > -1, 0, 1024 > > > > which tells reloptions parsing code that parallel_workers is > > acceptable for both heap and partitioned relations. > > > > Other comments on the patch: > > > > * This function calling style, where the first argument is not placed > > on the same line as the function itself, is not very common in > > Postgres: > > > > + /* First see if there is a root-level setting for parallel_workers > > */ > > + parallel_workers = compute_parallel_worker( > > + rel, > > + -1, > > + -1, > > + max_parallel_workers_per_gather > > + > > > > This makes the new code look very different from the rest of the > > codebase. Better to stick to existing styles. > > > > 2. It might be a good idea to use this opportunity to add a function, > > say compute_append_parallel_workers(), for the code that does what the > > function name says. Then the patch will simply add the new > > compute_parallel_worker() call at the top of that function. > > > > 3. I think we should consider the Append parent relation's > > parallel_workers ONLY if it is a partitioned relation, because it > > doesn't make a lot of sense for other types of parent relations. So > > the new code should look like this: > > > > if (IS_PARTITIONED_REL(rel)) > > parallel_workers = compute_parallel_worker(rel, -1, -1, > > max_parallel_workers_per_gather); > > > > 4. Maybe it also doesn't make sense to consider the parent relation's > > parallel_workers if Parallel Append is disabled > > (enable_parallel_append = off). That's because with a simple > > (non-parallel) Append running under Gather, all launched parallel > > workers process the same partition before moving to the next one. > > OTOH, one's intention of setting parallel_workers on the parent > > partitioned table would most likely be to distribute workers across > > partitions, which is only possible with parallel Append > > (enable_parallel_append = on). So, the new code should look like > > this: > > > > if (IS_PARTITIONED_REL(rel) && enable_parallel_append) > > parallel_workers = compute_parallel_worker(rel, -1, -1, > > max_parallel_workers_per_gather); > > Here is an updated version of the Seamus' patch that takes into > account these and other comments received on this thread so far. > Maybe warrants adding some tests too but I haven't. > > Seamus, please register this patch in the next commit-fest: > https://commitfest.postgresql.org/32/ > > If you haven't already, you will need to create a community account to > use that site. > > -- > Amit Langote > EDB: http://www.enterprisedb.com > > Attachments: > * v2-0001-Allow-setting-parallel_workers-on-partitioned-tab.patch
Re: A reloption for partitioned tables - parallel_workers
hi, Here we go, my first patch... solves https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520e...@www.fastmail.com Best, Seamus On Mon, Feb 15, 2021, at 3:53 AM, Amit Langote wrote: > Hi Seamus, > > On Mon, Feb 15, 2021 at 5:28 PM Seamus Abshere wrote: > > It turns out parallel_workers may be a useful reloption for certain uses of > > partitioned tables, at least if they're made up of fancy column store > > partitions (see > > https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55%40www.fastmail.com). > > > > Would somebody tell me what I'm doing wrong? I would love to submit a patch > > but I'm stuck: > > > > diff --git a/src/backend/optimizer/path/allpaths.c > > b/src/backend/optimizer/path/allpaths.c > > index cd3fdd259c..f1ade035ac 100644 > > --- a/src/backend/optimizer/path/allpaths.c > > +++ b/src/backend/optimizer/path/allpaths.c > > @@ -3751,6 +3751,7 @@ compute_parallel_worker(RelOptInfo *rel, double > > heap_pages, double index_pages, > > * If the user has set the parallel_workers reloption, use that; > > otherwise > > * select a default number of workers. > > */ > > + // I want to affect this > > if (rel->rel_parallel_workers != -1) > > parallel_workers = rel->rel_parallel_workers; > > else > > > > so I do this > > > > diff --git a/src/backend/access/common/reloptions.c > > b/src/backend/access/common/reloptions.c > > index c687d3ee9e..597b209bfb 100644 > > --- a/src/backend/access/common/reloptions.c > > +++ b/src/backend/access/common/reloptions.c > > @@ -1961,13 +1961,15 @@ build_local_reloptions(local_relopts *relopts, > > Datum options, bool validate) > > bytea * > > partitioned_table_reloptions(Datum reloptions, bool validate) > > { > > - /* > > -* There are no options for partitioned tables yet, but this is > > able to do > > -* some validation. > > -*/ > > + static const relopt_parse_elt tab[] = { > > + {"parallel_workers", RELOPT_TYPE_INT, > > + offsetof(StdRdOptions, parallel_workers)}, > > + }; > > + > > return (bytea *) build_reloptions(reloptions, validate, > > > > RELOPT_KIND_PARTITIONED, > > - > > 0, NULL, 0); > > + > > sizeof(StdRdOptions), > > + > > tab, lengthof(tab)); > > } > > > > That "works": > > > > postgres=# alter table test_3pd_cstore_partitioned set (parallel_workers = > > 33); > > ALTER TABLE > > postgres=# select relname, relkind, reloptions from pg_class where relname > > = 'test_3pd_cstore_partitioned'; > >relname | relkind | reloptions > > -+-+--- > > test_3pd_cstore_partitioned | p | {parallel_workers=33} > > (1 row) > > > > But it seems to be ignored: > > > > diff --git a/src/backend/optimizer/path/allpaths.c > > b/src/backend/optimizer/path/allpaths.c > > index cd3fdd259c..c68835ce38 100644 > > --- a/src/backend/optimizer/path/allpaths.c > > +++ b/src/backend/optimizer/path/allpaths.c > > @@ -3751,6 +3751,8 @@ compute_parallel_worker(RelOptInfo *rel, double > > heap_pages, double index_pages, > > * If the user has set the parallel_workers reloption, use that; > > otherwise > > * select a default number of workers. > > */ > > + // I want to affect this, but this assertion always passes > > + Assert(rel->rel_parallel_workers == -1) > > if (rel->rel_parallel_workers != -1) > > parallel_workers = rel->rel_parallel_workers; > > else > > You may see by inspecting the callers of compute_parallel_worker() > that it never gets called on a partitioned table, only its leaf > partitions. Maybe you could try calling compute_parallel_worker() > somewhere in add_paths_to_append_rel(), which has this code to figure > out parallel_workers to use for a parallel Append path for a given > partitioned table: > > /* Find the highest number of workers requested for any subpath. */ > foreach(lc, partial_subpaths) >
A reloption for partitioned tables - parallel_workers
hi, It turns out parallel_workers may be a useful reloption for certain uses of partitioned tables, at least if they're made up of fancy column store partitions (see https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55%40www.fastmail.com). Would somebody tell me what I'm doing wrong? I would love to submit a patch but I'm stuck: diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index cd3fdd259c..f1ade035ac 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -3751,6 +3751,7 @@ compute_parallel_worker(RelOptInfo *rel, double heap_pages, double index_pages, * If the user has set the parallel_workers reloption, use that; otherwise * select a default number of workers. */ + // I want to affect this if (rel->rel_parallel_workers != -1) parallel_workers = rel->rel_parallel_workers; else so I do this diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c index c687d3ee9e..597b209bfb 100644 --- a/src/backend/access/common/reloptions.c +++ b/src/backend/access/common/reloptions.c @@ -1961,13 +1961,15 @@ build_local_reloptions(local_relopts *relopts, Datum options, bool validate) bytea * partitioned_table_reloptions(Datum reloptions, bool validate) { - /* -* There are no options for partitioned tables yet, but this is able to do -* some validation. -*/ + static const relopt_parse_elt tab[] = { + {"parallel_workers", RELOPT_TYPE_INT, + offsetof(StdRdOptions, parallel_workers)}, + }; + return (bytea *) build_reloptions(reloptions, validate, RELOPT_KIND_PARTITIONED, - 0, NULL, 0); + sizeof(StdRdOptions), + tab, lengthof(tab)); } That "works": postgres=# alter table test_3pd_cstore_partitioned set (parallel_workers = 33); ALTER TABLE postgres=# select relname, relkind, reloptions from pg_class where relname = 'test_3pd_cstore_partitioned'; relname | relkind | reloptions -+-+--- test_3pd_cstore_partitioned | p | {parallel_workers=33} (1 row) But it seems to be ignored: diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index cd3fdd259c..c68835ce38 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -3751,6 +3751,8 @@ compute_parallel_worker(RelOptInfo *rel, double heap_pages, double index_pages, * If the user has set the parallel_workers reloption, use that; otherwise * select a default number of workers. */ + // I want to affect this, but this assertion always passes + Assert(rel->rel_parallel_workers == -1) if (rel->rel_parallel_workers != -1) parallel_workers = rel->rel_parallel_workers; else Thanks and please forgive my code pasting etiquette as this is my first post to pgsql-hackers and I'm not quite sure what the right format is. Thank you, Seamus