Re: Parallel Inserts in CREATE TABLE AS
On Thu, Dec 24, 2020 at 1:07 PM Bharath Rupireddy wrote: > > On Thu, Dec 24, 2020 at 10:25 AM vignesh C wrote: > > You could change intoclause_len = strlen(intoclausestr) to > > strlen(intoclausestr) + 1 and use intoclause_len in the remaining > > places. We can avoid the +1 in the other places. > > + /* Estimate space for into clause for CTAS. */ > > + if (IS_CTAS(intoclause) && OidIsValid(objectid)) > > + { > > + intoclausestr = nodeToString(intoclause); > > + intoclause_len = strlen(intoclausestr); > > + shm_toc_estimate_chunk(&pcxt->estimator, intoclause_len + > > 1); > > + shm_toc_estimate_keys(&pcxt->estimator, 1); > > + } > > Done. > > > Can we use node->nworkers_launched == 0 in place of > > node->need_to_scan_locally, that way the setting and resetting of > > node->need_to_scan_locally can be removed. Unless need_to_scan_locally > > is needed in any of the functions that gets called. > > + /* Enable leader to insert in case no parallel workers were > > launched. */ > > + if (node->nworkers_launched == 0) > > + node->need_to_scan_locally = true; > > + > > + /* > > +* By now, for parallel workers (if launched any), would have > > started their > > +* work i.e. insertion to target table. In case the leader is > > chosen to > > +* participate for parallel inserts in CTAS, then finish its > > share before > > +* going to wait for the parallel workers to finish. > > +*/ > > + if (node->need_to_scan_locally) > > + { > > need_to_scan_locally is being set in ExecGather() even if > nworkers_launched > 0 it can still be true, so I think we can not > remove need_to_scan_locally in ExecParallelInsertInCTAS. > > Attaching v15 patch set for further review. Note that the change is > only in 0001 patch, other patches remain unchanged from v14. 
+-- parallel inserts must occur
+select explain_pictas(
+'create table parallel_write as select length(stringu1) from tenk1;');
+select count(*) from parallel_write;
+drop table parallel_write;

We can change the comment "parallel inserts must occur" to "parallel insert must be selected for CTAS on normal table".

+-- parallel inserts must occur
+select explain_pictas(
+'create unlogged table parallel_write as select length(stringu1) from tenk1;');
+select count(*) from parallel_write;
+drop table parallel_write;

We can change the comment "parallel inserts must occur" to "parallel insert must be selected for CTAS on unlogged table". Similar comments need to be handled in other places also.

+create function explain_pictas(text) returns setof text
+language plpgsql as
+$$
+declare
+    ln text;
+begin
+    for ln in
+        execute format('explain (analyze, costs off, summary off, timing off) %s',
+                       $1)
+    loop
+        ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
+        ln := regexp_replace(ln, 'actual rows=\d+ loops=\d+', 'actual rows=N loops=N');
+        ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
+        return next ln;
+    end loop;
+end;
+$$;

The above function is the same as the explain_parallel_append function present in partition_prune.sql:

create function explain_parallel_append(text) returns setof text
language plpgsql as
$$
declare
    ln text;
begin
    for ln in
        execute format('explain (analyze, costs off, summary off, timing off) %s',
                       $1)
    loop
        ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
        ln := regexp_replace(ln, 'actual rows=\d+ loops=\d+', 'actual rows=N loops=N');
        ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
        return next ln;
    end loop;
end;
$$;

If possible, make a common function for both and use it in both places.

+ if (intoclausestr && OidIsValid(objectid))
+     fpes->objectid = objectid;
+ else
+     fpes->objectid = InvalidOid;

Here the OidIsValid(objectid) check is not required: intoclausestr will be set only if the OID is valid.
Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
Re: Parallel Inserts in CREATE TABLE AS
On Thu, Dec 24, 2020 at 1:07 PM Bharath Rupireddy wrote: > > On Thu, Dec 24, 2020 at 10:25 AM vignesh C wrote: > > You could change intoclause_len = strlen(intoclausestr) to > > strlen(intoclausestr) + 1 and use intoclause_len in the remaining > > places. We can avoid the +1 in the other places. > > + /* Estimate space for into clause for CTAS. */ > > + if (IS_CTAS(intoclause) && OidIsValid(objectid)) > > + { > > + intoclausestr = nodeToString(intoclause); > > + intoclause_len = strlen(intoclausestr); > > + shm_toc_estimate_chunk(&pcxt->estimator, intoclause_len + > > 1); > > + shm_toc_estimate_keys(&pcxt->estimator, 1); > > + } > > Done. > > > Can we use node->nworkers_launched == 0 in place of > > node->need_to_scan_locally, that way the setting and resetting of > > node->need_to_scan_locally can be removed. Unless need_to_scan_locally > > is needed in any of the functions that gets called. > > + /* Enable leader to insert in case no parallel workers were > > launched. */ > > + if (node->nworkers_launched == 0) > > + node->need_to_scan_locally = true; > > + > > + /* > > +* By now, for parallel workers (if launched any), would have > > started their > > +* work i.e. insertion to target table. In case the leader is > > chosen to > > +* participate for parallel inserts in CTAS, then finish its > > share before > > +* going to wait for the parallel workers to finish. > > +*/ > > + if (node->need_to_scan_locally) > > + { > > need_to_scan_locally is being set in ExecGather() even if > nworkers_launched > 0 it can still be true, so I think we can not > remove need_to_scan_locally in ExecParallelInsertInCTAS. > > Attaching v15 patch set for further review. Note that the change is > only in 0001 patch, other patches remain unchanged from v14. I have reviewed part of v15-0001 patch, I have a few comments, I will continue to review this. 1. 
@@ -763,18 +763,34 @@ GetCurrentCommandId(bool used)
 /* this is global to a transaction, not subtransaction-local */
 if (used)
 {
-    /*
-     * Forbid setting currentCommandIdUsed in a parallel worker, because
-     * we have no provision for communicating this back to the leader. We
-     * could relax this restriction when currentCommandIdUsed was already
-     * true at the start of the parallel operation.
-     */
-    Assert(!IsParallelWorker());
+    /*
+     * This is a temporary hack for all common parallel insert cases i.e.
+     * insert into, ctas, copy from. To be changed later. In a parallel
+     * worker, set currentCommandIdUsed to true only if it was not set to
+     * true at the start of the parallel operation (by way of
+     * SetCurrentCommandIdUsedForWorker()). We have to do this because
+     * GetCurrentCommandId(true) may be called from anywhere, especially
+     * for parallel inserts, within parallel worker.
+     */
+    Assert(!(IsParallelWorker() && !currentCommandIdUsed));

Why is this a temporary hack, and what is the plan for removing it?

2.

+/*
+ * ChooseParallelInsertsInCTAS --- determine whether or not parallel
+ * insertion is possible, if yes set the parallel insert state i.e. push down
+ * the dest receiver to the Gather nodes.
+ */
+void ChooseParallelInsertsInCTAS(IntoClause *into, QueryDesc *queryDesc)
+{
+    if (!IS_CTAS(into))
+        return;

When will this be hit? The function name suggests that it is for CTAS, but now you have a check that returns if it is not CTAS. Can you add a comment explaining when you expect this case? Also, the function name should start on a new line, i.e.

void
ChooseParallelInsertsInCTAS(IntoClause *into, QueryDesc *queryDesc)

3.

+/*
+ * ChooseParallelInsertsInCTAS --- determine whether or not parallel
+ * insertion is possible, if yes set the parallel insert state i.e. push down
+ * the dest receiver to the Gather nodes.
+ */

Push down to the Gather nodes? I think the right statement is "push down below the Gather node".

4.
intorel_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
{
    DR_intorel *myState = (DR_intorel *) self;

+   if (myState->is_parallel_worker)
+   {
+       /* In the worker */
+       SetCurrentCommandIdUsedForWorker();
+       myState->output_cid = GetCurrentCommandId(false);
+   }
+   else
    {
        non-parallel worker code
    }
}

I think instead of moving all the code related to the non-parallel worker into the else branch, we can do better and avoid the unnecessary code movement. Something like:

intorel_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
{
    DR_intorel *myState = (DR_intorel *) self;

    -- comment: in a parallel worker we don't need to create the dest receiver, etc.
+   if (myState->is_parallel_worker)
    {
        --parallel worker handling--
        return;
Re: Parallel Inserts in CREATE TABLE AS
On Fri, Dec 25, 2020 at 10:04 AM Amit Kapila wrote: > > On Fri, Dec 25, 2020 at 9:54 AM Bharath Rupireddy > wrote: > > > > On Fri, Dec 25, 2020 at 7:12 AM vignesh C wrote: > > > On Thu, Dec 24, 2020 at 11:29 AM Amit Kapila > > > wrote: > > > > > > > > On Thu, Dec 24, 2020 at 10:25 AM vignesh C wrote: > > > > > > > > > > On Tue, Dec 22, 2020 at 2:16 PM Bharath Rupireddy > > > > > wrote: > > > > > > > > > > > > On Tue, Dec 22, 2020 at 12:32 PM Bharath Rupireddy > > > > > > Attaching v14 patch set that has above changes. Please consider this > > > > > > for further review. > > > > > > > > > > > > > > > > Few comments: > > > > > In the below case, should create be above Gather? > > > > > postgres=# explain create table t7 as select * from t6; > > > > > QUERY PLAN > > > > > --- > > > > > Gather (cost=0.00..9.17 rows=0 width=4) > > > > >Workers Planned: 2 > > > > > -> Create t7 > > > > >-> Parallel Seq Scan on t6 (cost=0.00..9.17 rows=417 width=4) > > > > > (4 rows) > > > > > > > > > > Can we change it to something like: > > > > > --- > > > > > Create t7 > > > > > -> Gather (cost=0.00..9.17 rows=0 width=4) > > > > > Workers Planned: 2 > > > > > -> Parallel Seq Scan on t6 (cost=0.00..9.17 rows=417 width=4) > > > > > (4 rows) > > > > > > > > > > > > > I think it is better to have it in a way as in the current patch > > > > because that reflects that we are performing insert/create below > > > > Gather which is the purpose of this patch. I think this is similar to > > > > what the Parallel Insert patch [1] has for a similar plan. > > > > > > > > > > > > [1] - https://commitfest.postgresql.org/31/2844/ > > > > > > > > > > Also another thing that I felt was that actually the Gather nodes will > > > actually do the insert operation, the Create table will be done earlier > > > itself. 
Should we change Create table to Insert table something like > > > below: > > > QUERY PLAN > > > --- > > > Gather (cost=0.00..9.17 rows=0 width=4) > > >Workers Planned: 2 > > > -> Insert table2 (instead of Create table2) > > >-> Parallel Seq Scan on table1 (cost=0.00..9.17 rows=417 width=4) > > > > IMO, showing Insert under Gather makes sense if the query is INSERT > > INTO SELECT as it's in the other patch [1]. Since here it is a CTAS > > query, so having Create under Gather looks fine to me. This way we can > > also distinguish the EXPLAINs of parallel inserts in INSERT INTO > > SELECT and CTAS. > > > > Right, IIRC, we have done the way it is in the patch for convenience > and to move forward with it and come back to it later once all other > parts of the patch are good. > > > And also, some might wonder that Create under Gather means that each > > parallel worker is creating the table, it's actually not the creation > > of the table that's parallelized but it's insertion. If required, we > > can clarify it in CTAS docs with a sample EXPLAIN. I have not yet > > added docs related to allowing parallel inserts in CTAS. Shall I add a > > para saying when parallel inserts can be picked and how the sample > > EXPLAIN looks? Thoughts? > > > > Yeah, I don't see any problem with it, and maybe we can move Explain > related code to a separate patch. The reason is we don't display DDL > part without parallelism and this might need a separate discussion. > This makes sense to me. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Parallel Inserts in CREATE TABLE AS
On Fri, Dec 25, 2020 at 9:54 AM Bharath Rupireddy wrote: > > On Fri, Dec 25, 2020 at 7:12 AM vignesh C wrote: > > On Thu, Dec 24, 2020 at 11:29 AM Amit Kapila > > wrote: > > > > > > On Thu, Dec 24, 2020 at 10:25 AM vignesh C wrote: > > > > > > > > On Tue, Dec 22, 2020 at 2:16 PM Bharath Rupireddy > > > > wrote: > > > > > > > > > > On Tue, Dec 22, 2020 at 12:32 PM Bharath Rupireddy > > > > > Attaching v14 patch set that has above changes. Please consider this > > > > > for further review. > > > > > > > > > > > > > Few comments: > > > > In the below case, should create be above Gather? > > > > postgres=# explain create table t7 as select * from t6; > > > > QUERY PLAN > > > > --- > > > > Gather (cost=0.00..9.17 rows=0 width=4) > > > >Workers Planned: 2 > > > > -> Create t7 > > > >-> Parallel Seq Scan on t6 (cost=0.00..9.17 rows=417 width=4) > > > > (4 rows) > > > > > > > > Can we change it to something like: > > > > --- > > > > Create t7 > > > > -> Gather (cost=0.00..9.17 rows=0 width=4) > > > > Workers Planned: 2 > > > > -> Parallel Seq Scan on t6 (cost=0.00..9.17 rows=417 width=4) > > > > (4 rows) > > > > > > > > > > I think it is better to have it in a way as in the current patch > > > because that reflects that we are performing insert/create below > > > Gather which is the purpose of this patch. I think this is similar to > > > what the Parallel Insert patch [1] has for a similar plan. > > > > > > > > > [1] - https://commitfest.postgresql.org/31/2844/ > > > > > > > Also another thing that I felt was that actually the Gather nodes will > > actually do the insert operation, the Create table will be done earlier > > itself. 
Should we change Create table to Insert table something like below: > > QUERY PLAN > > --- > > Gather (cost=0.00..9.17 rows=0 width=4) > >Workers Planned: 2 > > -> Insert table2 (instead of Create table2) > >-> Parallel Seq Scan on table1 (cost=0.00..9.17 rows=417 width=4) > > IMO, showing Insert under Gather makes sense if the query is INSERT > INTO SELECT as it's in the other patch [1]. Since here it is a CTAS > query, so having Create under Gather looks fine to me. This way we can > also distinguish the EXPLAINs of parallel inserts in INSERT INTO > SELECT and CTAS. > Right, IIRC, we have done the way it is in the patch for convenience and to move forward with it and come back to it later once all other parts of the patch are good. > And also, some might wonder that Create under Gather means that each > parallel worker is creating the table, it's actually not the creation > of the table that's parallelized but it's insertion. If required, we > can clarify it in CTAS docs with a sample EXPLAIN. I have not yet > added docs related to allowing parallel inserts in CTAS. Shall I add a > para saying when parallel inserts can be picked and how the sample > EXPLAIN looks? Thoughts? > Yeah, I don't see any problem with it, and maybe we can move Explain related code to a separate patch. The reason is we don't display DDL part without parallelism and this might need a separate discussion. -- With Regards, Amit Kapila.
Re: Parallel Inserts in CREATE TABLE AS
On Fri, Dec 25, 2020 at 9:54 AM Bharath Rupireddy wrote: > > On Fri, Dec 25, 2020 at 7:12 AM vignesh C wrote: > > On Thu, Dec 24, 2020 at 11:29 AM Amit Kapila > > wrote: > > > > > > On Thu, Dec 24, 2020 at 10:25 AM vignesh C wrote: > > > > > > > > On Tue, Dec 22, 2020 at 2:16 PM Bharath Rupireddy > > > > wrote: > > > > > > > > > > On Tue, Dec 22, 2020 at 12:32 PM Bharath Rupireddy > > > > > Attaching v14 patch set that has above changes. Please consider this > > > > > for further review. > > > > > > > > > > > > > Few comments: > > > > In the below case, should create be above Gather? > > > > postgres=# explain create table t7 as select * from t6; > > > > QUERY PLAN > > > > --- > > > > Gather (cost=0.00..9.17 rows=0 width=4) > > > >Workers Planned: 2 > > > > -> Create t7 > > > >-> Parallel Seq Scan on t6 (cost=0.00..9.17 rows=417 width=4) > > > > (4 rows) > > > > > > > > Can we change it to something like: > > > > --- > > > > Create t7 > > > > -> Gather (cost=0.00..9.17 rows=0 width=4) > > > > Workers Planned: 2 > > > > -> Parallel Seq Scan on t6 (cost=0.00..9.17 rows=417 width=4) > > > > (4 rows) > > > > > > > > > > I think it is better to have it in a way as in the current patch > > > because that reflects that we are performing insert/create below > > > Gather which is the purpose of this patch. I think this is similar to > > > what the Parallel Insert patch [1] has for a similar plan. > > > > > > > > > [1] - https://commitfest.postgresql.org/31/2844/ > > > > > > > Also another thing that I felt was that actually the Gather nodes will > > actually do the insert operation, the Create table will be done earlier > > itself. 
Should we change Create table to Insert table something like below: > > QUERY PLAN > > --- > > Gather (cost=0.00..9.17 rows=0 width=4) > >Workers Planned: 2 > > -> Insert table2 (instead of Create table2) > >-> Parallel Seq Scan on table1 (cost=0.00..9.17 rows=417 width=4) > > IMO, showing Insert under Gather makes sense if the query is INSERT > INTO SELECT as it's in the other patch [1]. Since here it is a CTAS > query, so having Create under Gather looks fine to me. This way we can > also distinguish the EXPLAINs of parallel inserts in INSERT INTO > SELECT and CTAS. I don't think that is a problem because now also if we EXPLAIN CTAS it will appear like we are executing the select query because that is what we are planning for only the select part. So now if we are including the INSERT in the planning and pushing the insert under the gather then it will make more sense to show INSERT instead of showing CREATE. Let's see what others think. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Parallel Inserts in CREATE TABLE AS
On Fri, Dec 25, 2020 at 7:12 AM vignesh C wrote: > On Thu, Dec 24, 2020 at 11:29 AM Amit Kapila wrote: > > > > On Thu, Dec 24, 2020 at 10:25 AM vignesh C wrote: > > > > > > On Tue, Dec 22, 2020 at 2:16 PM Bharath Rupireddy > > > wrote: > > > > > > > > On Tue, Dec 22, 2020 at 12:32 PM Bharath Rupireddy > > > > Attaching v14 patch set that has above changes. Please consider this > > > > for further review. > > > > > > > > > > Few comments: > > > In the below case, should create be above Gather? > > > postgres=# explain create table t7 as select * from t6; > > > QUERY PLAN > > > --- > > > Gather (cost=0.00..9.17 rows=0 width=4) > > >Workers Planned: 2 > > > -> Create t7 > > >-> Parallel Seq Scan on t6 (cost=0.00..9.17 rows=417 width=4) > > > (4 rows) > > > > > > Can we change it to something like: > > > --- > > > Create t7 > > > -> Gather (cost=0.00..9.17 rows=0 width=4) > > > Workers Planned: 2 > > > -> Parallel Seq Scan on t6 (cost=0.00..9.17 rows=417 width=4) > > > (4 rows) > > > > > > > I think it is better to have it in a way as in the current patch > > because that reflects that we are performing insert/create below > > Gather which is the purpose of this patch. I think this is similar to > > what the Parallel Insert patch [1] has for a similar plan. > > > > > > [1] - https://commitfest.postgresql.org/31/2844/ > > > > Also another thing that I felt was that actually the Gather nodes will > actually do the insert operation, the Create table will be done earlier > itself. Should we change Create table to Insert table something like below: > QUERY PLAN > --- > Gather (cost=0.00..9.17 rows=0 width=4) >Workers Planned: 2 > -> Insert table2 (instead of Create table2) >-> Parallel Seq Scan on table1 (cost=0.00..9.17 rows=417 width=4) IMO, showing Insert under Gather makes sense if the query is INSERT INTO SELECT as it's in the other patch [1]. Since here it is a CTAS query, so having Create under Gather looks fine to me. 
This way we can also distinguish the EXPLAINs of parallel inserts in INSERT INTO SELECT and CTAS. And also, some might wonder that Create under Gather means that each parallel worker is creating the table, it's actually not the creation of the table that's parallelized but it's insertion. If required, we can clarify it in CTAS docs with a sample EXPLAIN. I have not yet added docs related to allowing parallel inserts in CTAS. Shall I add a para saying when parallel inserts can be picked and how the sample EXPLAIN looks? Thoughts? [1] - https://commitfest.postgresql.org/31/2844/ With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
Re: Parallel Inserts in CREATE TABLE AS
On Thu, Dec 24, 2020 at 11:29 AM Amit Kapila wrote:
> On Thu, Dec 24, 2020 at 10:25 AM vignesh C wrote:
> >
> > On Tue, Dec 22, 2020 at 2:16 PM Bharath Rupireddy wrote:
> > >
> > > On Tue, Dec 22, 2020 at 12:32 PM Bharath Rupireddy
> > > Attaching v14 patch set that has above changes. Please consider this
> > > for further review.
> > >
> >
> > Few comments:
> > In the below case, should create be above Gather?
> > postgres=# explain create table t7 as select * from t6;
> >                            QUERY PLAN
> > ---
> > Gather (cost=0.00..9.17 rows=0 width=4)
> >    Workers Planned: 2
> >    -> Create t7
> >       -> Parallel Seq Scan on t6 (cost=0.00..9.17 rows=417 width=4)
> > (4 rows)
> >
> > Can we change it to something like:
> > ---
> > Create t7
> > -> Gather (cost=0.00..9.17 rows=0 width=4)
> >    Workers Planned: 2
> >    -> Parallel Seq Scan on t6 (cost=0.00..9.17 rows=417 width=4)
> > (4 rows)
> >
>
> I think it is better to have it in a way as in the current patch
> because that reflects that we are performing insert/create below
> Gather which is the purpose of this patch. I think this is similar to
> what the Parallel Insert patch [1] has for a similar plan.
>
> [1] - https://commitfest.postgresql.org/31/2844/

Another thing I felt is that the Gather node will actually do the insert operation; the Create table will be done earlier itself. Should we change Create table to Insert table, something like below:

QUERY PLAN
---
Gather (cost=0.00..9.17 rows=0 width=4)
   Workers Planned: 2
   -> Insert table2 (instead of Create table2)
      -> Parallel Seq Scan on table1 (cost=0.00..9.17 rows=417 width=4)

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
Re: Parallel Inserts in CREATE TABLE AS
On Thu, Dec 24, 2020 at 10:25 AM vignesh C wrote: > > On Tue, Dec 22, 2020 at 2:16 PM Bharath Rupireddy > wrote: > > > > On Tue, Dec 22, 2020 at 12:32 PM Bharath Rupireddy > > Attaching v14 patch set that has above changes. Please consider this > > for further review. > > > > Few comments: > In the below case, should create be above Gather? > postgres=# explain create table t7 as select * from t6; > QUERY PLAN > --- > Gather (cost=0.00..9.17 rows=0 width=4) >Workers Planned: 2 > -> Create t7 >-> Parallel Seq Scan on t6 (cost=0.00..9.17 rows=417 width=4) > (4 rows) > > Can we change it to something like: > --- > Create t7 > -> Gather (cost=0.00..9.17 rows=0 width=4) > Workers Planned: 2 > -> Parallel Seq Scan on t6 (cost=0.00..9.17 rows=417 width=4) > (4 rows) > I think it is better to have it in a way as in the current patch because that reflects that we are performing insert/create below Gather which is the purpose of this patch. I think this is similar to what the Parallel Insert patch [1] has for a similar plan. [1] - https://commitfest.postgresql.org/31/2844/ -- With Regards, Amit Kapila.
Re: Parallel Inserts in CREATE TABLE AS
On Tue, Dec 22, 2020 at 2:16 PM Bharath Rupireddy wrote:
>
> On Tue, Dec 22, 2020 at 12:32 PM Bharath Rupireddy
> Attaching v14 patch set that has above changes. Please consider this
> for further review.
>

Few comments:
In the below case, should create be above Gather?

postgres=# explain create table t7 as select * from t6;
                           QUERY PLAN
---
Gather (cost=0.00..9.17 rows=0 width=4)
   Workers Planned: 2
   -> Create t7
      -> Parallel Seq Scan on t6 (cost=0.00..9.17 rows=417 width=4)
(4 rows)

Can we change it to something like:
---
Create t7
-> Gather (cost=0.00..9.17 rows=0 width=4)
   Workers Planned: 2
   -> Parallel Seq Scan on t6 (cost=0.00..9.17 rows=417 width=4)
(4 rows)

You could change intoclause_len = strlen(intoclausestr) to
strlen(intoclausestr) + 1 and use intoclause_len in the remaining
places. We can avoid the +1 in the other places.

+ /* Estimate space for into clause for CTAS. */
+ if (IS_CTAS(intoclause) && OidIsValid(objectid))
+ {
+     intoclausestr = nodeToString(intoclause);
+     intoclause_len = strlen(intoclausestr);
+     shm_toc_estimate_chunk(&pcxt->estimator, intoclause_len + 1);
+     shm_toc_estimate_keys(&pcxt->estimator, 1);
+ }

Can we use node->nworkers_launched == 0 in place of
node->need_to_scan_locally? That way the setting and resetting of
node->need_to_scan_locally can be removed, unless need_to_scan_locally
is needed in any of the functions that get called.

+ /* Enable leader to insert in case no parallel workers were launched. */
+ if (node->nworkers_launched == 0)
+     node->need_to_scan_locally = true;
+
+ /*
+  * By now, parallel workers (if any were launched) would have started
+  * their work i.e. insertion into the target table. In case the leader is
+  * chosen to participate in parallel inserts for CTAS, it finishes its
+  * share before going to wait for the parallel workers to finish.
+  */
+ if (node->need_to_scan_locally)
+ {

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
Re: Parallel Inserts in CREATE TABLE AS
On Mon, Dec 21, 2020 at 8:16 AM Hou, Zhijie wrote:
> The cfbot seems to complain about the testcase:
>
> Command exited with code 1
> perl dumpregr.pl
> === $path ===
> diff -w -U3 C:/projects/postgresql/src/test/regress/expected/write_parallel.out C:/projects/postgresql/src/test/regress/results/write_parallel.out
> --- C:/projects/postgresql/src/test/regress/expected/write_parallel.out 2020-12-21 01:41:17.745091500 +
> +++ C:/projects/postgresql/src/test/regress/results/write_parallel.out 2020-12-21 01:47:20.375514800 +
> @@ -1204,7 +1204,7 @@
>  -> Gather (actual rows=2 loops=1)
>     Workers Planned: 3
>     Workers Launched: 3
> -   -> Parallel Seq Scan on temp2 (actual rows=0 loops=4)
> +   -> Parallel Seq Scan on temp2 (actual rows=1 loops=4)
>     Filter: (col2 < 3)
>     Rows Removed by Filter: 1
> (14 rows)
> @@ -1233,7 +1233,7 @@
>  -> Gather (actual rows=2 loops=1)
>     Workers Planned: 3
>     Workers Launched: 3
> -   -> Parallel Seq Scan on temp2 (actual rows=0 loops=4)
> +   -> Parallel Seq Scan on temp2 (actual rows=1 loops=4)
>     Filter: (col2 < 3)
>     Rows Removed by Filter: 1
> (14 rows)

Thanks! It looks like the explain analyze test case outputs can be unstable because we may not always get the requested number of workers. The comment before the explain_parallel_append function in partition_prune.sql explains this well. The solution is to have a similar function, say explain_parallel_inserts, in write_parallel.sql and use it for all explain analyze cases. This will make the results consistent. Thoughts? If okay, I will update the test cases and post new patches.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
RE: Parallel Inserts in CREATE TABLE AS
Hi

The cfbot seems to complain about the testcase:

Command exited with code 1
perl dumpregr.pl
=== $path ===
diff -w -U3 C:/projects/postgresql/src/test/regress/expected/write_parallel.out C:/projects/postgresql/src/test/regress/results/write_parallel.out
--- C:/projects/postgresql/src/test/regress/expected/write_parallel.out 2020-12-21 01:41:17.745091500 +
+++ C:/projects/postgresql/src/test/regress/results/write_parallel.out 2020-12-21 01:47:20.375514800 +
@@ -1204,7 +1204,7 @@
 -> Gather (actual rows=2 loops=1)
    Workers Planned: 3
    Workers Launched: 3
-   -> Parallel Seq Scan on temp2 (actual rows=0 loops=4)
+   -> Parallel Seq Scan on temp2 (actual rows=1 loops=4)
    Filter: (col2 < 3)
    Rows Removed by Filter: 1
(14 rows)
@@ -1233,7 +1233,7 @@
 -> Gather (actual rows=2 loops=1)
    Workers Planned: 3
    Workers Launched: 3
-   -> Parallel Seq Scan on temp2 (actual rows=0 loops=4)
+   -> Parallel Seq Scan on temp2 (actual rows=1 loops=4)
    Filter: (col2 < 3)
    Rows Removed by Filter: 1
(14 rows)

Best regards,
houzj
Re: Parallel Inserts in CREATE TABLE AS
On Wed, Dec 16, 2020 at 12:06 PM Bharath Rupireddy wrote: > > On Tue, Dec 15, 2020 at 5:53 PM Bharath Rupireddy > wrote: > > I'm merging them to the original patch set and adding the test cases > > to cover these cases. I will post the updated patch set soon. > > Attaching v12 patch set. > > 0001 - parallel inserts without tuple cost enforcement. > 0002 - enforce planner for parallel tuple cost > 0003 - test cases > > Please review it further. > I think it will be clean to implement the parallel CTAS when a top-level node is the gather node. Basically, the idea is that whenever we get the gather on the top which doesn't have any projection then we can push down the dest receiver directly to the worker. I agree that append is an exception that doesn't do any extra processing other than appending the results, So IMHO it would be better that in the first part we parallelize the plan where gather node on top. I see that we have already worked on the patch where the append node is on top so I would suggest that we can keep that part in a separate patch. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Parallel Inserts in CREATE TABLE AS
On Tue, Dec 15, 2020 at 5:48 PM Hou, Zhijie wrote:
> > A little explanation about how to push down the ctas info in append.
> >
> > 1. About how to ignore tuple cost in this case.
> > IMO, it creates the gather path under append like the following:
> > query_planner
> > -make_one_rel
> > --set_base_rel_sizes
> > ---set_rel_size
> > set_append_rel_size (*)
> > -set_rel_size
> > --set_subquery_pathlist
> > ---subquery_planner
> > grouping_planner
> > -apply_scanjoin_target_to_paths
> > --generate_useful_gather_paths
> >
> > set_append_rel_size seems the right place where we can check and set a flag
> > to ignore tuple cost later.
> > We can set the flag for two cases when no parent path will be
> > created (such as: limit, sort, distinct...):
> > i) query_level is 1
> > ii) query_level > 1 and we have set the flag in the parent_root.
> >
> > The case ii) is to check append under append:
> > Append
> >   ->Append
> >   ->Gather
> >   ->Other plan
> >
> > 2. About how to push the ctas info down.
> >
> > We traverse the whole plan tree, and we only care about Append and Gather nodes.
> > Gather: It sets the ctas dest info and returns true at once if the gather node
> > does not have a projection.
> > Append: It recursively traverses the subplans of the Append node and
> > returns true if one of the subplans can be parallel.
> >
> > +PushDownCTASParallelInsertState(DestReceiver *dest, PlanState *ps)
> > +{
> > +    bool parallel = false;
> > +
> > +    if (ps == NULL)
> > +        return parallel;
> > +
> > +    if (IsA(ps, AppendState))
> > +    {
> > +        AppendState *aps = (AppendState *) ps;
> > +        for (int i = 0; i < aps->as_nplans; i++)
> > +        {
> > +            parallel |= PushDownCTASParallelInsertState(dest, aps->appendplans[i]);
> > +        }
> > +    }
> > +    else if (IsA(ps, GatherState) && !ps->ps_ProjInfo)
> > +    {
> > +        GatherState *gstate = (GatherState *) ps;
> > +        parallel = true;
> > +
> > +        ((DR_intorel *) dest)->is_parallel = true;
> > +        gstate->dest = dest;
> > +        ps->plan->plan_rows = 0;
> > +    }
> > +
> > +    return parallel;
> > +}
>
> So sorry for my mistake, my last patch had some errors.
> Attaching the new one.

Thanks for the append patches. Basically your changes look good to me. I'm merging them into the original patch set and adding test cases to cover these cases. I will post the updated patch set soon.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
RE: Parallel Inserts in CREATE TABLE AS
> From: Hou, Zhijie [mailto:houzj.f...@cn.fujitsu.com]
> Sent: Tuesday, December 15, 2020 7:30 PM
> To: Bharath Rupireddy
> Cc: Amit Kapila ; Luc Vlaming ; PostgreSQL-development ; Zhihong Yu ; Dilip Kumar
> Subject: RE: Parallel Inserts in CREATE TABLE AS
>
> > Thanks for the append use case.
> >
> > Here's my analysis on pushing parallel inserts down even in case the
> > top node is Append.
> >
> > For union cases which need to remove duplicate tuples, we can't push
> > the inserts or CTAS dest receiver down. If I'm not wrong, the Append node
> > is not doing duplicate removal(??); I saw that it's the HashAggregate
> > node (which is the top node) that removes the duplicate tuples. And
> > also for except/except all/intersect/intersect all cases we receive
> > HashSetOp nodes on top of Append. So for both cases, our check for
> > Gather or Append at the top node is enough to detect this and not allow
> > parallel inserts.
> >
> > For union all:
> > case 1: We can push the CTAS dest receiver to each Gather node
> > Append
> >   ->Gather
> >     ->Parallel Seq Scan
> >   ->Gather
> >     ->Parallel Seq Scan
> >   ->Gather
> >     ->Parallel Seq Scan
> >
> > case 2: We can still push the CTAS dest receiver to each Gather node.
> > Non-Gather nodes will do inserts as they do now i.e. by sending tuples
> > to Append and from there to the CTAS dest receiver.
> > Append
> >   ->Gather
> >     ->Parallel Seq Scan
> >   ->Seq Scan / Join / any other non-Gather node
> >   ->Gather
> >     ->Parallel Seq Scan
> >   ->Seq Scan / Join / any other non-Gather node
> >
> > case 3: We can push the CTAS dest receiver to Gather
> > Gather
> >   ->Parallel Append
> >     ->Parallel Seq Scan
> >     ->Parallel Seq Scan
> >
> > case 4: We can push the CTAS dest receiver to Gather
> > Gather
> >   ->Parallel Append
> >     ->Parallel Seq Scan
> >     ->Parallel Seq Scan
> >     ->Seq Scan / Join / any other non-Gather node
> >
> > Please let me know if I'm missing any other possible use case.
> >
> > Thoughts?
>
> Yes, the analysis looks right to me.
> > > > As suggested by Amit earlier, I kept the 0001 patch(so far) such that > > it doesn't have the code to influence the planner to consider parallel > > tuple cost as 0. It works on the plan whatever gets generated and > > decides to allow parallel inserts or not. And in the 0002 patch, I > > added the code for influencing the planner to consider parallel tuple > > cost as 0. Maybe we can have a 0003 patch for tests alone. > > > > Once we are okay with the above analysis and use cases, we can > > incorporate the Append changes to respective patches. > > > > Hope that's okay. > > A little explanation about how to push down the ctas info in append. > > 1. about how to ignore tuple cost in this case. > IMO, it create gather path under append like the following: > query_planner > -make_one_rel > --set_base_rel_sizes > ---set_rel_size > set_append_rel_size (*) > -set_rel_size > --set_subquery_pathlist > ---subquery_planner > grouping_planner > -apply_scanjoin_target_to_paths > --generate_useful_gather_paths > > set_append_rel_size seems the right place where we can check and set a flag > to ignore tuple cost later. > We can set the flag for two cases when there is no parent path will be > created(such as : limit,sort,distinct...): > i) query_level is 1 > ii) query_level > 1 and we have set the flag in the parent_root. > > The case ii) is to check append under append: > Append >->Append >->Gather >->Other plan > > 2.about how to push ctas info down. > > We traversing the whole plans tree, and we only care Append and Gather type. > Gather: It set the ctas dest info and returned true at once if the gathernode > does not have projection. > Append: It will recursively traversing the subplan of Appendnode and will > reture true if one of the subplan can be parallel. 
> > +PushDownCTASParallelInsertState(DestReceiver *dest, PlanState *ps) { > + bool parallel = false; > + > + if(ps == NULL) > + return parallel; > + > + if(IsA(ps, AppendState)) > + { > + AppendState *aps = (AppendState *) ps; > + for(int i = 0; i < aps->as_nplans; i++) > + { > +
RE: Parallel Inserts in CREATE TABLE AS
> Thanks for the append use case. > > Here's my analysis on pushing parallel inserts down even in case the top > node is Append. > > For union cases which need to remove duplicate tuples, we can't push the > inserts or CTAS dest receiver down. If I'm not wrong, Append node is not > doing duplicate removal(??), I saw that it's the HashAggregate node (which > is the top node that removes the duplicate tuples). And also for > except/except all/intersect/intersect all cases we receive HashSetOp nodes > on top of Append. So for both cases, our check for Gather or Append at the > top node is enough to detect this to not allow parallel inserts. > > For union all: > case 1: We can push the CTAS dest receiver to each Gather node Append > ->Gather > ->Parallel Seq Scan > ->Gather > ->Parallel Seq Scan > ->Gather > ->Parallel Seq Scan > > case 2: We can still push the CTAS dest receiver to each Gather node. > Non-Gather nodes will do inserts as they do now i.e. by sending tuples to > Append and from there to CTAS dest receiver. > Append > ->Gather > ->Parallel Seq Scan > ->Seq Scan / Join / any other non-Gather node > ->Gather > ->Parallel Seq Scan > ->Seq Scan / Join / any other non-Gather node > > case 3: We can push the CTAS dest receiver to Gather Gather > ->Parallel Append > ->Parallel Seq Scan > ->Parallel Seq Scan > > case 4: We can push the CTAS dest receiver to Gather Gather > ->Parallel Append > ->Parallel Seq Scan > ->Parallel Seq Scan > ->Seq Scan / Join / any other non-Gather node > > Please let me know if I'm missing any other possible use case. > > Thoughts? Yes, The analysis looks right to me. > As suggested by Amit earlier, I kept the 0001 patch(so far) such that it > doesn't have the code to influence the planner to consider parallel tuple > cost as 0. It works on the plan whatever gets generated and decides to allow > parallel inserts or not. And in the 0002 patch, I added the code for > influencing the planner to consider parallel tuple cost as 0. 
Maybe we can > have a 0003 patch for tests alone. > > Once we are okay with the above analysis and use cases, we can incorporate > the Append changes to respective patches. > > Hope that's okay.

A little explanation about how to push the ctas info down in the Append case.

1. About how to ignore tuple cost in this case.

IMO, it creates gather paths under Append like the following:

query_planner
-make_one_rel
--set_base_rel_sizes
---set_rel_size
set_append_rel_size (*)
-set_rel_size
--set_subquery_pathlist
---subquery_planner
grouping_planner
-apply_scanjoin_target_to_paths
--generate_useful_gather_paths

set_append_rel_size seems the right place where we can check and set a flag to ignore tuple cost later. We can set the flag in two cases where no parent path will be created (such as limit, sort, distinct...):
i) query_level is 1
ii) query_level > 1 and we have set the flag in the parent_root.

Case ii) handles Append under Append:

Append
->Append
->Gather
->Other plan

2. About how to push the ctas info down.

We traverse the whole plan tree, and we only care about the Append and Gather node types.
Gather: it sets the ctas dest info and returns true at once if the Gather node does not have a projection.
Append: it recursively traverses the subplans of the Append node and returns true if one of the subplans can be parallel.
+PushDownCTASParallelInsertState(DestReceiver *dest, PlanState *ps)
+{
+	bool parallel = false;
+
+	if(ps == NULL)
+		return parallel;
+
+	if(IsA(ps, AppendState))
+	{
+		AppendState *aps = (AppendState *) ps;
+		for(int i = 0; i < aps->as_nplans; i++)
+		{
+			parallel |= PushDownCTASParallelInsertState(dest, aps->appendplans[i]);
+		}
+	}
+	else if(IsA(ps, GatherState) && !ps->ps_ProjInfo)
+	{
+		GatherState *gstate = (GatherState *) ps;
+		parallel = true;
+
+		((DR_intorel *) dest)->is_parallel = true;
+		gstate->dest = dest;
+		ps->plan->plan_rows = 0;
+	}
+
+	return parallel;
+}

Best regards,
houzj

0001-support-pctas-in-append-parallel-inserts.patch
Description: 0001-support-pctas-in-append-parallel-inserts.patch

0002-support-pctas-in-append-tuple-cost-adjustment.patch
Description: 0002-support-pctas-in-append-tuple-cost-adjustment.patch
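To make the recursion above concrete, here is a minimal standalone sketch of the same push-down shape. ToyPlanState and push_down_ctas are hypothetical stand-ins for illustration only; the real patch operates on PostgreSQL's PlanState and DestReceiver structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for PostgreSQL's plan-state nodes (hypothetical). */
typedef enum { T_ToyAppend, T_ToyGather, T_ToyOther } ToyNodeTag;

typedef struct ToyPlanState
{
    ToyNodeTag tag;
    bool       has_projection;      /* Gather only */
    bool       is_parallel_dest;    /* set when the dest is pushed down */
    int        nplans;              /* Append only */
    struct ToyPlanState **subplans; /* Append only */
} ToyPlanState;

/*
 * Walk the tree; mark every projection-free Gather as a parallel-insert
 * destination, recursing through Append.  Returns true if any Gather
 * accepted it, mirroring the shape of PushDownCTASParallelInsertState.
 */
static bool
push_down_ctas(ToyPlanState *ps)
{
    bool parallel = false;

    if (ps == NULL)
        return false;

    if (ps->tag == T_ToyAppend)
    {
        for (int i = 0; i < ps->nplans; i++)
            parallel |= push_down_ctas(ps->subplans[i]);
    }
    else if (ps->tag == T_ToyGather && !ps->has_projection)
    {
        ps->is_parallel_dest = true;
        parallel = true;
    }

    return parallel;
}
```

The key property mirrored from the patch is that a Gather with a projection is skipped, while every projection-free Gather under (possibly nested) Append nodes receives the dest receiver.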
Re: Parallel Inserts in CREATE TABLE AS
On Tue, Dec 15, 2020 at 2:06 PM Bharath Rupireddy wrote: > > On Mon, Dec 14, 2020 at 6:08 PM Hou, Zhijie wrote: > > Currently with the patch, we can allow parallel CTAS when topnode is Gather. > > When top-node is Append and Gather is the sub-node of Append, I think we > > can still enable > > Parallel CTAS by pushing Parallel CTAS down to the sub-node Gather, such as: > > > > Append > > -->Gather > > ->Create table > > ->Seqscan > > -->Gather > > ->create table > > ->Seqscan > > > > And the use case seems common to me, such as: > > select * from A where xxx union all select * from B where xxx; > > Thanks for the append use case. > > Here's my analysis on pushing parallel inserts down even in case the > top node is Append. > > For union cases which need to remove duplicate tuples, we can't push > the inserts or CTAS dest receiver down. If I'm not wrong, Append node > is not doing duplicate removal(??), I saw that it's the HashAggregate > node (which is the top node that removes the duplicate tuples). And > also for except/except all/intersect/intersect all cases we receive > HashSetOp nodes on top of Append. So for both cases, our check for > Gather or Append at the top node is enough to detect this to not allow > parallel inserts. > > For union all: > case 1: We can push the CTAS dest receiver to each Gather node > Append > ->Gather > ->Parallel Seq Scan > ->Gather > ->Parallel Seq Scan > ->Gather > ->Parallel Seq Scan > > case 2: We can still push the CTAS dest receiver to each Gather node. > Non-Gather nodes will do inserts as they do now i.e. by sending tuples > to Append and from there to CTAS dest receiver. 
> Append > ->Gather > ->Parallel Seq Scan > ->Seq Scan / Join / any other non-Gather node > ->Gather > ->Parallel Seq Scan > ->Seq Scan / Join / any other non-Gather node > > case 3: We can push the CTAS dest receiver to Gather > Gather > ->Parallel Append > ->Parallel Seq Scan > ->Parallel Seq Scan > > case 4: We can push the CTAS dest receiver to Gather > Gather > ->Parallel Append > ->Parallel Seq Scan > ->Parallel Seq Scan > ->Seq Scan / Join / any other non-Gather node > > Please let me know if I'm missing any other possible use case. > > Thoughts? Your analysis looks right to me. > > I attach a WIP patch which just show the possibility of this feature. > > The patch is based on the latest v11-patch. > > > > What do you think? > > As suggested by Amit earlier, I kept the 0001 patch(so far) such that > it doesn't have the code to influence the planner to consider parallel > tuple cost as 0. It works on the plan whatever gets generated and > decides to allow parallel inserts or not. And in the 0002 patch, I > added the code for influencing the planner to consider parallel tuple > cost as 0. Maybe we can have a 0003 patch for tests alone. Yeah, that makes sense and it will be easy to review. > Once we are okay with the above analysis and use cases, we can > incorporate the Append changes to respective patches. > > Hope that's okay. Makes sense to me. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Parallel Inserts in CREATE TABLE AS
On Mon, Dec 14, 2020 at 6:08 PM Hou, Zhijie wrote: > Currently with the patch, we can allow parallel CTAS when topnode is Gather. > When top-node is Append and Gather is the sub-node of Append, I think we can > still enable > Parallel CTAS by pushing Parallel CTAS down to the sub-node Gather, such as: > > Append > -->Gather > ->Create table > ->Seqscan > -->Gather > ->create table > ->Seqscan > > And the use case seems common to me, such as: > select * from A where xxx union all select * from B where xxx; Thanks for the append use case. Here's my analysis on pushing parallel inserts down even in case the top node is Append. For union cases which need to remove duplicate tuples, we can't push the inserts or CTAS dest receiver down. If I'm not wrong, Append node is not doing duplicate removal(??), I saw that it's the HashAggregate node (which is the top node that removes the duplicate tuples). And also for except/except all/intersect/intersect all cases we receive HashSetOp nodes on top of Append. So for both cases, our check for Gather or Append at the top node is enough to detect this to not allow parallel inserts. For union all: case 1: We can push the CTAS dest receiver to each Gather node Append ->Gather ->Parallel Seq Scan ->Gather ->Parallel Seq Scan ->Gather ->Parallel Seq Scan case 2: We can still push the CTAS dest receiver to each Gather node. Non-Gather nodes will do inserts as they do now i.e. by sending tuples to Append and from there to CTAS dest receiver. Append ->Gather ->Parallel Seq Scan ->Seq Scan / Join / any other non-Gather node ->Gather ->Parallel Seq Scan ->Seq Scan / Join / any other non-Gather node case 3: We can push the CTAS dest receiver to Gather Gather ->Parallel Append ->Parallel Seq Scan ->Parallel Seq Scan case 4: We can push the CTAS dest receiver to Gather Gather ->Parallel Append ->Parallel Seq Scan ->Parallel Seq Scan ->Seq Scan / Join / any other non-Gather node Please let me know if I'm missing any other possible use case. 
Thoughts? > I attach a WIP patch which just show the possibility of this feature. > The patch is based on the latest v11-patch. > > What do you think? As suggested by Amit earlier, I kept the 0001 patch(so far) such that it doesn't have the code to influence the planner to consider parallel tuple cost as 0. It works on the plan whatever gets generated and decides to allow parallel inserts or not. And in the 0002 patch, I added the code for influencing the planner to consider parallel tuple cost as 0. Maybe we can have a 0003 patch for tests alone. Once we are okay with the above analysis and use cases, we can incorporate the Append changes to respective patches. Hope that's okay. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
Re: Parallel Inserts in CREATE TABLE AS
For set_append_rel_size(), it seems this is the difference between query_level != 1 and query_level == 1: + (root->parent_root->parse->CTASParallelInsInfo & CTAS_PARALLEL_INS_IGN_TUP_COST_APPEND) && Maybe extract the common conditions into their own expression / variable so that the code is easier to read. Cheers On Mon, Dec 14, 2020 at 4:50 AM Hou, Zhijie wrote: > Hi > > > Attaching v11 patch set. Please review it further. > > Currently with the patch, we can allow parallel CTAS when topnode is > Gather. > When top-node is Append and Gather is the sub-node of Append, I think we > can still enable > Parallel CTAS by pushing Parallel CTAS down to the sub-node Gather, such > as: > > Append > -->Gather > ->Create table > ->Seqscan > -->Gather > ->create table > ->Seqscan > > And the use case seems common to me, such as: > select * from A where xxx union all select * from B where xxx; > > I attatch a WIP patch which just show the possibility of this feature. > The patch is based on the latest v11-patch. > > What do you think? > > Best regards, > houzj
Re: Parallel Inserts in CREATE TABLE AS
On Mon, Dec 14, 2020 at 4:06 PM Bharath Rupireddy wrote: > > On Thu, Dec 10, 2020 at 7:20 PM Bharath Rupireddy > wrote: > > On Thu, Dec 10, 2020 at 5:19 PM Dilip Kumar wrote: > > > > > > + allow = ps && IsA(ps, GatherState) && > > > > > > !ps->ps_ProjInfo && > > > > > > + plannedstmt->parallelModeNeeded && > > > > > > + plannedstmt->planTree && > > > > > > + IsA(plannedstmt->planTree, Gather) > > > > > > && > > > > > > + plannedstmt->planTree->lefttree && > > > > > > + > > > > > > plannedstmt->planTree->lefttree->parallel_aware && > > > > > > + > > > > > > plannedstmt->planTree->lefttree->parallel_safe; > > > > > > > > > > > > I noticed it check both IsA(ps, GatherState) and > > > > > > IsA(plannedstmt->planTree, Gather). > > > > > > Does it mean it is possible that IsA(ps, GatherState) is true but > > > > > > IsA(plannedstmt->planTree, Gather) is false ? > > > > > > > > > > > > I did some test but did not find a case like that. > > > > > > > > > > > This seems like an extra check. Apart from that if we combine 0001 > > > > > and 0002 there should be an additional protection so that it should > > > > > not happen that in cost_gather we have ignored the parallel tuple cost > > > > > and now we are rejecting the parallel insert. Probably we should add > > > > > an assert. > > > > > > > > Yeah it's an extra check. I don't think we need that extra check > > > > IsA(plannedstmt->planTree, Gather). GatherState check is enough. I > > > > verified it as follows: the gatherstate will be allocated and > > > > initialized with the plan tree in ExecInitGather which are the ones we > > > > are checking here. So, there is no chance that the plan state is > > > > GatherState and the plan tree will not be Gather. I will remove > > > > IsA(plannedstmt->planTree, Gather) check in the next version of the > > > > patch set. 
> > > > > > > > Breakpoint 4, ExecInitGather (node=0x5647f98ae994 > > > > , estate=0x1ca8, eflags=730035099) at > > > > nodeGather.c:61 > > > > (gdb) p gatherstate > > > > $10 = (GatherState *) 0x5647fac83850 > > > > (gdb) p gatherstate->ps.plan > > > > $11 = (Plan *) 0x5647fac918a0 > > > > > > > > Breakpoint 1, IsParallelInsertInCTASAllowed (into=0x5647fac97580, > > > > queryDesc=0x5647fac835e0) at createas.c:663 > > > > 663 { > > > > (gdb) p ps > > > > $13 = (PlanState *) 0x5647fac83850 > > > > (gdb) p ps->plan > > > > $14 = (Plan *) 0x5647fac918a0 > > > > > > > Hope you did not miss the second part of my comment > > > " > > > > Apart from that if we combine 0001 > > > > and 0002 there should be additional protection so that it should > > > > not happen that in cost_gather we have ignored the parallel tuple cost > > > > and now we are rejecting the parallel insert. Probably we should add > > > > an assert. > > > " > > > > IIUC, we need to set a flag in cost_gather(in 0002 patch) whenever we > > ignore the parallel tuple cost and while checking to allow or disallow > > parallel inserts in IsParallelInsertInCTASAllowed(), we need to add an > > assert something like Assert(cost_ignored_in_cost_gather && allow) > > before return allow; > > > > This assertion fails 1) either if we have not ignored the cost but > > allowing parallel inserts 2) or we ignored the cost but not allowing > > parallel inserts. > > > > 1) seems to be fine, we can go ahead and perform parallel inserts. 2) > > is the concern that the planner would have wrongly chosen the parallel > > plan, but in this case also isn't it better to go ahead with the > > parallel plan instead of failing the query? > > > > +/* > > + * We allow parallel inserts by the workers only if the Gather > > node has > > + * no projections to perform and if the upper node is Gather. 
In > > case, > > + * the Gather node has projections, which is possible if there are > > any > > + * subplans in the query, the workers can not do those > > projections. And > > + * when the upper node is GatherMerge, then the leader has to > > perform > > + * the final phase i.e. merge the results by workers. > > + */ > > +allow = ps && IsA(ps, GatherState) && !ps->ps_ProjInfo && > > +plannedstmt->parallelModeNeeded && > > +plannedstmt->planTree && > > +plannedstmt->planTree->lefttree && > > +plannedstmt->planTree->lefttree->parallel_aware && > > +plannedstmt->planTree->lefttree->parallel_safe; > > + > > +return allow; > > +} > > I added the assertion into the 0002 patch so that it fails when the > planner ignores parallel tuple cost and may choose parallel plan but > later we don't allow parallel inserts. make check and make check-world > passeses without any assertion failures. > > Attaching v11 patch set
RE: Parallel Inserts in CREATE TABLE AS
Hi

> Attaching v11 patch set. Please review it further.

Currently with the patch, we can allow parallel CTAS when the top node is Gather. When the top node is Append and Gather is the sub-node of Append, I think we can still enable parallel CTAS by pushing parallel CTAS down to the sub-node Gather, such as:

Append
-->Gather
->Create table
->Seqscan
-->Gather
->create table
->Seqscan

And the use case seems common to me, such as:
select * from A where xxx union all select * from B where xxx;

I attach a WIP patch which just shows the possibility of this feature. The patch is based on the latest v11 patch.

What do you think?

Best regards,
houzj

0001-patch-for-pctas-in-append.patch
Description: 0001-patch-for-pctas-in-append.patch
Re: Parallel Inserts in CREATE TABLE AS
On Thu, Dec 10, 2020 at 7:20 PM Bharath Rupireddy wrote: > On Thu, Dec 10, 2020 at 5:19 PM Dilip Kumar wrote: > > > > > + allow = ps && IsA(ps, GatherState) && > > > > > !ps->ps_ProjInfo && > > > > > + plannedstmt->parallelModeNeeded && > > > > > + plannedstmt->planTree && > > > > > + IsA(plannedstmt->planTree, Gather) && > > > > > + plannedstmt->planTree->lefttree && > > > > > + > > > > > plannedstmt->planTree->lefttree->parallel_aware && > > > > > + > > > > > plannedstmt->planTree->lefttree->parallel_safe; > > > > > > > > > > I noticed it check both IsA(ps, GatherState) and > > > > > IsA(plannedstmt->planTree, Gather). > > > > > Does it mean it is possible that IsA(ps, GatherState) is true but > > > > > IsA(plannedstmt->planTree, Gather) is false ? > > > > > > > > > > I did some test but did not find a case like that. > > > > > > > > > This seems like an extra check. Apart from that if we combine 0001 > > > > and 0002 there should be an additional protection so that it should > > > > not happen that in cost_gather we have ignored the parallel tuple cost > > > > and now we are rejecting the parallel insert. Probably we should add > > > > an assert. > > > > > > Yeah it's an extra check. I don't think we need that extra check > > > IsA(plannedstmt->planTree, Gather). GatherState check is enough. I > > > verified it as follows: the gatherstate will be allocated and initialized > > > with the plan tree in ExecInitGather which are the ones we are checking > > > here. So, there is no chance that the plan state is GatherState and the > > > plan tree will not be Gather. I will remove IsA(plannedstmt->planTree, > > > Gather) check in the next version of the patch set. 
> > > > > > Breakpoint 4, ExecInitGather (node=0x5647f98ae994 > > > , estate=0x1ca8, eflags=730035099) at > > > nodeGather.c:61 > > > (gdb) p gatherstate > > > $10 = (GatherState *) 0x5647fac83850 > > > (gdb) p gatherstate->ps.plan > > > $11 = (Plan *) 0x5647fac918a0 > > > > > > Breakpoint 1, IsParallelInsertInCTASAllowed (into=0x5647fac97580, > > > queryDesc=0x5647fac835e0) at createas.c:663 > > > 663 { > > > (gdb) p ps > > > $13 = (PlanState *) 0x5647fac83850 > > > (gdb) p ps->plan > > > $14 = (Plan *) 0x5647fac918a0 > > > > > Hope you did not miss the second part of my comment > > " > > > Apart from that if we combine 0001 > > > and 0002 there should be additional protection so that it should > > > not happen that in cost_gather we have ignored the parallel tuple cost > > > and now we are rejecting the parallel insert. Probably we should add > > > an assert. > > " > > IIUC, we need to set a flag in cost_gather(in 0002 patch) whenever we > ignore the parallel tuple cost and while checking to allow or disallow > parallel inserts in IsParallelInsertInCTASAllowed(), we need to add an > assert something like Assert(cost_ignored_in_cost_gather && allow) > before return allow; > > This assertion fails 1) either if we have not ignored the cost but > allowing parallel inserts 2) or we ignored the cost but not allowing > parallel inserts. > > 1) seems to be fine, we can go ahead and perform parallel inserts. 2) > is the concern that the planner would have wrongly chosen the parallel > plan, but in this case also isn't it better to go ahead with the > parallel plan instead of failing the query? > > +/* > + * We allow parallel inserts by the workers only if the Gather node > has > + * no projections to perform and if the upper node is Gather. In > case, > + * the Gather node has projections, which is possible if there are > any > + * subplans in the query, the workers can not do those projections. 
> And
> + * when the upper node is GatherMerge, then the leader has to perform
> + * the final phase i.e. merge the results by workers.
> + */
> +allow = ps && IsA(ps, GatherState) && !ps->ps_ProjInfo &&
> +plannedstmt->parallelModeNeeded &&
> +plannedstmt->planTree &&
> +plannedstmt->planTree->lefttree &&
> +plannedstmt->planTree->lefttree->parallel_aware &&
> +plannedstmt->planTree->lefttree->parallel_safe;
> +
> +return allow;
> +}

I added the assertion into the 0002 patch so that it fails when the planner ignores parallel tuple cost and may choose a parallel plan but later we don't allow parallel inserts. make check and make check-world pass without any assertion failures.

Attaching v11 patch set. Please review it further.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

v11-0002-Tuple-Cost-Adjustment-for-Parallel-Inserts-in-CTAS.patch
Description: Binary data

v11-0001-Parallel-Inserts-in-CREATE-TABLE-AS.patch
Description: Binary data
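The planner-flag-plus-assert idea can be sketched standalone. Note that the form Assert(cost_ignored_in_cost_gather && allow) proposed earlier would also fire in the harmless case where the cost was not ignored, so the sketch below uses the implication form: if the tuple cost was ignored, the parallel insert must be allowed. All names here (toy_cost_gather, toy_is_parallel_insert_allowed, tuple_cost_ignored) are hypothetical stand-ins, not the real planner/executor code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy planner-side flag, standing in for state recorded when
 * cost_gather() ignores the parallel tuple cost (hypothetical). */
static bool tuple_cost_ignored = false;

static void
toy_cost_gather(bool ignore_tuple_cost)
{
    if (ignore_tuple_cost)
        tuple_cost_ignored = true;
    /* ... normal gather cost computation elided ... */
}

/* Toy executor-side check, standing in for IsParallelInsertInCTASAllowed(). */
static bool
toy_is_parallel_insert_allowed(bool gather_on_top, bool has_projection)
{
    bool allow = gather_on_top && !has_projection;

    /*
     * If the planner ignored the tuple cost, the executor must not
     * reject the parallel insert here: !ignored || allow.
     */
    assert(!tuple_cost_ignored || allow);
    return allow;
}
```

Ignoring the cost without allowing the insert is the problematic case the assertion catches; not ignoring the cost while still allowing the insert is fine and passes.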
Re: Parallel Inserts in CREATE TABLE AS
On Thu, Dec 10, 2020 at 5:19 PM Dilip Kumar wrote: > > > > + allow = ps && IsA(ps, GatherState) && !ps->ps_ProjInfo > > > > && > > > > + plannedstmt->parallelModeNeeded && > > > > + plannedstmt->planTree && > > > > + IsA(plannedstmt->planTree, Gather) && > > > > + plannedstmt->planTree->lefttree && > > > > + > > > > plannedstmt->planTree->lefttree->parallel_aware && > > > > + > > > > plannedstmt->planTree->lefttree->parallel_safe; > > > > > > > > I noticed it check both IsA(ps, GatherState) and > > > > IsA(plannedstmt->planTree, Gather). > > > > Does it mean it is possible that IsA(ps, GatherState) is true but > > > > IsA(plannedstmt->planTree, Gather) is false ? > > > > > > > > I did some test but did not find a case like that. > > > > > > > This seems like an extra check. Apart from that if we combine 0001 > > > and 0002 there should be an additional protection so that it should > > > not happen that in cost_gather we have ignored the parallel tuple cost > > > and now we are rejecting the parallel insert. Probably we should add > > > an assert. > > > > Yeah it's an extra check. I don't think we need that extra check > > IsA(plannedstmt->planTree, Gather). GatherState check is enough. I verified > > it as follows: the gatherstate will be allocated and initialized with the > > plan tree in ExecInitGather which are the ones we are checking here. So, > > there is no chance that the plan state is GatherState and the plan tree > > will not be Gather. I will remove IsA(plannedstmt->planTree, Gather) check > > in the next version of the patch set. 
> > > > Breakpoint 4, ExecInitGather (node=0x5647f98ae994 , > > estate=0x1ca8, eflags=730035099) at nodeGather.c:61 > > (gdb) p gatherstate > > $10 = (GatherState *) 0x5647fac83850 > > (gdb) p gatherstate->ps.plan > > $11 = (Plan *) 0x5647fac918a0 > > > > Breakpoint 1, IsParallelInsertInCTASAllowed (into=0x5647fac97580, > > queryDesc=0x5647fac835e0) at createas.c:663 > > 663 { > > (gdb) p ps > > $13 = (PlanState *) 0x5647fac83850 > > (gdb) p ps->plan > > $14 = (Plan *) 0x5647fac918a0 > > > Hope you did not miss the second part of my comment > " > > Apart from that if we combine 0001 > > and 0002 there should be additional protection so that it should > > not happen that in cost_gather we have ignored the parallel tuple cost > > and now we are rejecting the parallel insert. Probably we should add > > an assert. > " IIUC, we need to set a flag in cost_gather(in 0002 patch) whenever we ignore the parallel tuple cost and while checking to allow or disallow parallel inserts in IsParallelInsertInCTASAllowed(), we need to add an assert something like Assert(cost_ignored_in_cost_gather && allow) before return allow; This assertion fails 1) either if we have not ignored the cost but allowing parallel inserts 2) or we ignored the cost but not allowing parallel inserts. 1) seems to be fine, we can go ahead and perform parallel inserts. 2) is the concern that the planner would have wrongly chosen the parallel plan, but in this case also isn't it better to go ahead with the parallel plan instead of failing the query? +/* + * We allow parallel inserts by the workers only if the Gather node has + * no projections to perform and if the upper node is Gather. In case, + * the Gather node has projections, which is possible if there are any + * subplans in the query, the workers can not do those projections. And + * when the upper node is GatherMerge, then the leader has to perform + * the final phase i.e. merge the results by workers. 
+ */
+allow = ps && IsA(ps, GatherState) && !ps->ps_ProjInfo &&
+plannedstmt->parallelModeNeeded &&
+plannedstmt->planTree &&
+plannedstmt->planTree->lefttree &&
+plannedstmt->planTree->lefttree->parallel_aware &&
+plannedstmt->planTree->lefttree->parallel_safe;
+
+return allow;
+}

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
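For illustration, the allow condition cited above can be exercised in isolation with simplified stand-in structs (hypothetical names; the real check uses IsA() on PostgreSQL's PlanState and PlannedStmt nodes):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the planner/executor structs. */
typedef struct ToyPlan
{
    bool parallel_aware;
    bool parallel_safe;
    struct ToyPlan *lefttree;
} ToyPlan;

typedef struct ToyGatherState
{
    bool is_gather;       /* stands in for IsA(ps, GatherState) */
    bool has_projection;  /* stands in for ps->ps_ProjInfo != NULL */
} ToyGatherState;

typedef struct ToyPlannedStmt
{
    bool parallelModeNeeded;
    ToyPlan *planTree;
} ToyPlannedStmt;

/*
 * Mirrors the shape of the patch's check: parallel inserts are allowed
 * only for a projection-free Gather whose child is parallel aware/safe.
 */
static bool
is_parallel_ctas_allowed(const ToyGatherState *ps, const ToyPlannedStmt *stmt)
{
    return ps != NULL && ps->is_gather && !ps->has_projection &&
           stmt->parallelModeNeeded &&
           stmt->planTree != NULL &&
           stmt->planTree->lefttree != NULL &&
           stmt->planTree->lefttree->parallel_aware &&
           stmt->planTree->lefttree->parallel_safe;
}
```

Each clause rejects one disqualifying shape: a projecting Gather, a non-parallel plan mode, or a child that is not parallel aware/safe.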
Re: Parallel Inserts in CREATE TABLE AS
On Thu, Dec 10, 2020 at 5:00 PM Bharath Rupireddy wrote: > > On Thu, Dec 10, 2020 at 4:49 PM Dilip Kumar wrote: > > > + allow = ps && IsA(ps, GatherState) && !ps->ps_ProjInfo && > > > + plannedstmt->parallelModeNeeded && > > > + plannedstmt->planTree && > > > + IsA(plannedstmt->planTree, Gather) && > > > + plannedstmt->planTree->lefttree && > > > + > > > plannedstmt->planTree->lefttree->parallel_aware && > > > + > > > plannedstmt->planTree->lefttree->parallel_safe; > > > > > > I noticed it check both IsA(ps, GatherState) and > > > IsA(plannedstmt->planTree, Gather). > > > Does it mean it is possible that IsA(ps, GatherState) is true but > > > IsA(plannedstmt->planTree, Gather) is false ? > > > > > > I did some test but did not find a case like that. > > > > > > > This seems like an extra check. Apart from that if we combine 0001 > > and 0002 there should be an additional protection so that it should > > not happen that in cost_gather we have ignored the parallel tuple cost > > and now we are rejecting the parallel insert. Probably we should add > > an assert. > > Yeah it's an extra check. I don't think we need that extra check > IsA(plannedstmt->planTree, Gather). GatherState check is enough. I verified > it as follows: the gatherstate will be allocated and initialized with the > plan tree in ExecInitGather which are the ones we are checking here. So, > there is no chance that the plan state is GatherState and the plan tree will > not be Gather. I will remove IsA(plannedstmt->planTree, Gather) check in the > next version of the patch set. 
> > Breakpoint 4, ExecInitGather (node=0x5647f98ae994 , > estate=0x1ca8, eflags=730035099) at nodeGather.c:61 > (gdb) p gatherstate > $10 = (GatherState *) 0x5647fac83850 > (gdb) p gatherstate->ps.plan > $11 = (Plan *) 0x5647fac918a0 > > Breakpoint 1, IsParallelInsertInCTASAllowed (into=0x5647fac97580, > queryDesc=0x5647fac835e0) at createas.c:663 > 663 { > (gdb) p ps > $13 = (PlanState *) 0x5647fac83850 > (gdb) p ps->plan > $14 = (Plan *) 0x5647fac918a0 > Hope you did not miss the second part of my comment " > Apart from that if we combine 0001 > and 0002 there should be additional protection so that it should > not happen that in cost_gather we have ignored the parallel tuple cost > and now we are rejecting the parallel insert. Probably we should add > an assert. " -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Parallel Inserts in CREATE TABLE AS
On Thu, Dec 10, 2020 at 4:49 PM Dilip Kumar wrote: > > + allow = ps && IsA(ps, GatherState) && !ps->ps_ProjInfo && > > + plannedstmt->parallelModeNeeded && > > + plannedstmt->planTree && > > + IsA(plannedstmt->planTree, Gather) && > > + plannedstmt->planTree->lefttree && > > + plannedstmt->planTree->lefttree->parallel_aware && > > + plannedstmt->planTree->lefttree->parallel_safe; > > > > I noticed it check both IsA(ps, GatherState) and IsA(plannedstmt->planTree, Gather). > > Does it mean it is possible that IsA(ps, GatherState) is true but IsA(plannedstmt->planTree, Gather) is false ? > > > > I did some test but did not find a case like that. > > > > This seems like an extra check. Apart from that if we combine 0001 > and 0002 there should be an additional protection so that it should > not happen that in cost_gather we have ignored the parallel tuple cost > and now we are rejecting the parallel insert. Probably we should add > an assert. Yeah it's an extra check. I don't think we need that extra check IsA(plannedstmt->planTree, Gather). GatherState check is enough. I verified it as follows: the gatherstate will be allocated and initialized with the plan tree in ExecInitGather which are the ones we are checking here. So, there is no chance that the plan state is GatherState and the plan tree will not be Gather. I will remove IsA(plannedstmt->planTree, Gather) check in the next version of the patch set.

Breakpoint 4, ExecInitGather (node=0x5647f98ae994 , estate=0x1ca8, eflags=730035099) at nodeGather.c:61
(gdb) p gatherstate
$10 = (GatherState *) 0x5647fac83850
(gdb) p gatherstate->ps.plan
$11 = (Plan *) 0x5647fac918a0

Breakpoint 1, IsParallelInsertInCTASAllowed (into=0x5647fac97580, queryDesc=0x5647fac835e0) at createas.c:663
663 {
(gdb) p ps
$13 = (PlanState *) 0x5647fac83850
(gdb) p ps->plan
$14 = (Plan *) 0x5647fac918a0

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
Re: Parallel Inserts in CREATE TABLE AS
On Thu, Dec 10, 2020 at 3:59 PM Hou, Zhijie wrote: > > Hi > > + allow = ps && IsA(ps, GatherState) && !ps->ps_ProjInfo && > + plannedstmt->parallelModeNeeded && > + plannedstmt->planTree && > + IsA(plannedstmt->planTree, Gather) && > + plannedstmt->planTree->lefttree && > + > plannedstmt->planTree->lefttree->parallel_aware && > + > plannedstmt->planTree->lefttree->parallel_safe; > > I noticed it check both IsA(ps, GatherState) and IsA(plannedstmt->planTree, > Gather). > Does it mean it is possible that IsA(ps, GatherState) is true but > IsA(plannedstmt->planTree, Gather) is false ? > > I did some test but did not find a case like that. > This seems like an extra check. Apart from that if we combine 0001 and 0002 there should be an additional protection so that it should not happen that in cost_gather we have ignored the parallel tuple cost and now we are rejecting the parallel insert. Probably we should add an assert. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
RE: Parallel Inserts in CREATE TABLE AS
Hi + allow = ps && IsA(ps, GatherState) && !ps->ps_ProjInfo && + plannedstmt->parallelModeNeeded && + plannedstmt->planTree && + IsA(plannedstmt->planTree, Gather) && + plannedstmt->planTree->lefttree && + plannedstmt->planTree->lefttree->parallel_aware && + plannedstmt->planTree->lefttree->parallel_safe; I noticed it check both IsA(ps, GatherState) and IsA(plannedstmt->planTree, Gather). Does it mean it is possible that IsA(ps, GatherState) is true but IsA(plannedstmt->planTree, Gather) is false ? I did some test but did not find a case like that. Best regards, houzj
Re: Parallel Inserts in CREATE TABLE AS
On Thu, Dec 10, 2020 at 7:48 AM Zhihong Yu wrote: > + if (!OidIsValid(col->collOid) && > + type_is_collatable(col->typeName->typeOid)) > + ereport(ERROR, > ... > + attrList = lappend(attrList, col); > > Should attrList be freed when ereport is called ? > I think that's not necessary since we are going to throw an error anyways. And also that this is not a new code added as part of this feature, it is an existing code adjusted for parallel inserts. On looking further in the code base there are many places where we don't free up the lists before throwing errors. errmsg("column privileges are only valid for relations"))); errmsg("check constraint \"%s\" already exists", errmsg("name or argument lists may not contain nulls"))); elog(ERROR, "no tlist entry for key %d", keyresno); > + query->CTASParallelInsInfo &= CTAS_PARALLEL_INS_UNDEF; > > Since CTAS_PARALLEL_INS_UNDEF is 0, isn't the above equivalent to assigning > the value of 0 ? > Yeah both are equivalent. For now I will keep it that way, I will change it in the next version of the patch. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
Re: Parallel Inserts in CREATE TABLE AS
Hi, + if (!OidIsValid(col->collOid) && + type_is_collatable(col->typeName->typeOid)) + ereport(ERROR, ... + attrList = lappend(attrList, col); Should attrList be freed when ereport is called ? + query->CTASParallelInsInfo &= CTAS_PARALLEL_INS_UNDEF; Since CTAS_PARALLEL_INS_UNDEF is 0, isn't the above equivalent to assigning the value of 0 ? Cheers On Wed, Dec 9, 2020 at 5:43 PM Bharath Rupireddy < bharath.rupireddyforpostg...@gmail.com> wrote: > On Wed, Dec 9, 2020 at 10:16 AM Dilip Kumar wrote: > > > > On Tue, Dec 8, 2020 at 6:24 PM Hou, Zhijie > wrote: > > > > > > > > I'm not quite sure how to address this. Can we not allow the > planner > > > > > to consider that the select is for CTAS and check only after the > > > > > planning is done for the Gather node and other checks? > > > > > > > > > > > > > IIUC, you are saying that we should not influence the cost of gather > node > > > > even when the insertion would be done by workers? I think that > should be > > > > our fallback option anyway but that might miss some paths to be > considered > > > > parallel where the cost becomes more due to parallel_tuple_cost (aka > tuple > > > > transfer cost). I think the idea is we can avoid the tuple transfer > cost > > > > only when Gather is the top node because only at that time we can > push > > > > insertion down, right? How about if we have some way to detect the > same > > > > before calling generate_useful_gather_paths()? I think when we are > calling > > > > apply_scanjoin_target_to_paths() in grouping_planner(), if the > > > > query_level is 1, it is for CTAS, and it doesn't have a chance to > create > > > > UPPER_REL (doesn't have grouping, order, limit, etc clause) then we > can > > > > probably assume that the Gather will be top_node. I am not sure > about this > > > > but I think it is worth exploring. > > > > > > > > > > I took a look at the parallel insert patch and have the same idea. 
> > > https://commitfest.postgresql.org/31/2844/ > > > > > > * Consider generating Gather or Gather Merge paths. We must > only do this > > > * if the relation is parallel safe, and we don't do it for > child rels to > > > * avoid creating multiple Gather nodes within the same plan. > We must do > > > * this after all paths have been generated and before > set_cheapest, since > > > * one of the generated paths may turn out to be the cheapest > one. > > > */ > > > if (rel->consider_parallel && !IS_OTHER_REL(rel)) > > > generate_useful_gather_paths(root, rel, false); > > > > > > IMO Gatherpath created here seems the right one which can possible > ignore parallel cost if in CTAS. > > > But We need check the following parse option which will create path to > be the parent of Gatherpath here. > > > > > > if (root->parse->rowMarks) > > > if (limit_needed(root->parse)) > > > if (root->parse->sortClause) > > > if (root->parse->distinctClause) > > > if (root->parse->hasWindowFuncs) > > > if (root->parse->groupClause || root->parse->groupingSets || > root->parse->hasAggs || root->root->hasHavingQual) > > > > > > > Yeah, and as I pointed earlier, along with this we also need to > > consider that the RelOptInfo must be the final target(top level rel). > > > > Attaching v10 patch set that includes the change suggested above for > ignoring parallel tuple cost and also few more test cases. I split the > patch as per Amit's suggestion. v10-0001 contains parallel inserts > code without planner tuple cost changes and test cases. v10-0002 has > required changes for ignoring planner tuple cost calculations. > > Please review it further. > > After the review and addressing all the comments, I plan to make some > code common so that it can be used for Parallel Inserts in REFRESH > MATERIALIZED VIEW. Thoughts? > > With Regards, > Bharath Rupireddy. > EnterpriseDB: http://www.enterprisedb.com >
Re: Parallel Inserts in CREATE TABLE AS
On Wed, Dec 9, 2020 at 10:16 AM Dilip Kumar wrote: > > On Tue, Dec 8, 2020 at 6:24 PM Hou, Zhijie wrote: > > > > > > I'm not quite sure how to address this. Can we not allow the planner > > > > to consider that the select is for CTAS and check only after the > > > > planning is done for the Gather node and other checks? > > > > > > > > > > IIUC, you are saying that we should not influence the cost of gather node > > > even when the insertion would be done by workers? I think that should be > > > our fallback option anyway but that might miss some paths to be considered > > > parallel where the cost becomes more due to parallel_tuple_cost (aka tuple > > > transfer cost). I think the idea is we can avoid the tuple transfer cost > > > only when Gather is the top node because only at that time we can push > > > insertion down, right? How about if we have some way to detect the same > > > before calling generate_useful_gather_paths()? I think when we are calling > > > apply_scanjoin_target_to_paths() in grouping_planner(), if the > > > query_level is 1, it is for CTAS, and it doesn't have a chance to create > > > UPPER_REL (doesn't have grouping, order, limit, etc clause) then we can > > > probably assume that the Gather will be top_node. I am not sure about this > > > but I think it is worth exploring. > > > > > > > I took a look at the parallel insert patch and have the same idea. > > https://commitfest.postgresql.org/31/2844/ > > > > * Consider generating Gather or Gather Merge paths. We must only > > do this > > * if the relation is parallel safe, and we don't do it for child > > rels to > > * avoid creating multiple Gather nodes within the same plan. We > > must do > > * this after all paths have been generated and before > > set_cheapest, since > > * one of the generated paths may turn out to be the cheapest one. 
> > */ > > if (rel->consider_parallel && !IS_OTHER_REL(rel)) > > generate_useful_gather_paths(root, rel, false); > > > > IMO Gatherpath created here seems the right one which can possible ignore > > parallel cost if in CTAS. > > But We need check the following parse option which will create path to be > > the parent of Gatherpath here. > > > > if (root->parse->rowMarks) > > if (limit_needed(root->parse)) > > if (root->parse->sortClause) > > if (root->parse->distinctClause) > > if (root->parse->hasWindowFuncs) > > if (root->parse->groupClause || root->parse->groupingSets || > > root->parse->hasAggs || root->root->hasHavingQual) > > > > Yeah, and as I pointed earlier, along with this we also need to > consider that the RelOptInfo must be the final target(top level rel). >

Attaching the v10 patch set that includes the change suggested above for ignoring the parallel tuple cost, and also a few more test cases. I split the patch as per Amit's suggestion: v10-0001 contains the parallel inserts code without the planner tuple cost changes, plus test cases; v10-0002 has the changes required for ignoring the planner tuple cost calculations. Please review it further.

After the review and addressing all the comments, I plan to make some code common so that it can be used for Parallel Inserts in REFRESH MATERIALIZED VIEW. Thoughts?

With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com

From 2939b2c51bff3548ea15c9054f9e2ab6619b Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy
Date: Thu, 10 Dec 2020 05:38:35 +0530
Subject: [PATCH v10] Parallel Inserts in CREATE TABLE AS

The idea of this patch is to allow the leader and each worker to insert the tuples in parallel if the SELECT part of the CTAS is parallelizable.

The design: Let the planner know that the SELECT is from CTAS in createas.c so that it can set the number of tuples transferred from the workers to the Gather node to 0. With this change, there are chances that the planner may choose the parallel plan. After the planning, check if the upper plan node is Gather in createas.c and mark a parallelism flag in the CTAS dest receiver. Pass the into clause, object id, and command id from the leader to the workers, so that each worker can create its own CTAS dest receiver. The leader inserts its share of tuples if instructed to do so, and so do the workers. Each worker atomically writes its number of inserted tuples into a shared memory variable; the leader combines this with its own number of inserted tuples and reports the total to the client.
---
 src/backend/access/heap/heapam.c       | 11 -
 src/backend/access/transam/xact.c      | 30 +-
 src/backend/commands/createas.c        | 332
 src/backend/commands/explain.c         | 32 ++
 src/backend/executor/execParallel.c    | 70 ++-
 src/backend/executor/nodeGather.c      | 113 -
 src/backend/executor/nodeGatherMerge.c | 4 +-
 src/include/access/xact.h              | 1 +
 src/include/commands/createas.h        | 28 ++
 src/include/executor/execParallel.h    | 6 +-
 src/inc
Re: Parallel Inserts in CREATE TABLE AS
On Tue, Dec 8, 2020 at 6:24 PM Hou, Zhijie wrote: > > > > I'm not quite sure how to address this. Can we not allow the planner > > > to consider that the select is for CTAS and check only after the > > > planning is done for the Gather node and other checks? > > > > > > > IIUC, you are saying that we should not influence the cost of gather node > > even when the insertion would be done by workers? I think that should be > > our fallback option anyway but that might miss some paths to be considered > > parallel where the cost becomes more due to parallel_tuple_cost (aka tuple > > transfer cost). I think the idea is we can avoid the tuple transfer cost > > only when Gather is the top node because only at that time we can push > > insertion down, right? How about if we have some way to detect the same > > before calling generate_useful_gather_paths()? I think when we are calling > > apply_scanjoin_target_to_paths() in grouping_planner(), if the > > query_level is 1, it is for CTAS, and it doesn't have a chance to create > > UPPER_REL (doesn't have grouping, order, limit, etc clause) then we can > > probably assume that the Gather will be top_node. I am not sure about this > > but I think it is worth exploring. > > > > I took a look at the parallel insert patch and have the same idea. > https://commitfest.postgresql.org/31/2844/ > > * Consider generating Gather or Gather Merge paths. We must only do > this > * if the relation is parallel safe, and we don't do it for child > rels to > * avoid creating multiple Gather nodes within the same plan. We must > do > * this after all paths have been generated and before set_cheapest, > since > * one of the generated paths may turn out to be the cheapest one. > */ > if (rel->consider_parallel && !IS_OTHER_REL(rel)) > generate_useful_gather_paths(root, rel, false); > > IMO Gatherpath created here seems the right one which can possible ignore > parallel cost if in CTAS. 
> But We need check the following parse option which will create path to be the > parent of Gatherpath here. > > if (root->parse->rowMarks) > if (limit_needed(root->parse)) > if (root->parse->sortClause) > if (root->parse->distinctClause) > if (root->parse->hasWindowFuncs) > if (root->parse->groupClause || root->parse->groupingSets || > root->parse->hasAggs || root->root->hasHavingQual) > Yeah, and as I pointed earlier, along with this we also need to consider that the RelOptInfo must be the final target(top level rel). -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Parallel Inserts in CREATE TABLE AS
On Tue, Dec 8, 2020 at 6:36 PM Bharath Rupireddy wrote: > > On Tue, Dec 8, 2020 at 6:24 PM Hou, Zhijie wrote: > > > > > > I'm not quite sure how to address this. Can we not allow the planner > > > > to consider that the select is for CTAS and check only after the > > > > planning is done for the Gather node and other checks? > > > > > > > > > > IIUC, you are saying that we should not influence the cost of gather node > > > even when the insertion would be done by workers? I think that should be > > > our fallback option anyway but that might miss some paths to be considered > > > parallel where the cost becomes more due to parallel_tuple_cost (aka tuple > > > transfer cost). I think the idea is we can avoid the tuple transfer cost > > > only when Gather is the top node because only at that time we can push > > > insertion down, right? How about if we have some way to detect the same > > > before calling generate_useful_gather_paths()? I think when we are calling > > > apply_scanjoin_target_to_paths() in grouping_planner(), if the > > > query_level is 1, it is for CTAS, and it doesn't have a chance to create > > > UPPER_REL (doesn't have grouping, order, limit, etc clause) then we can > > > probably assume that the Gather will be top_node. I am not sure about this > > > but I think it is worth exploring. > > > > > > > I took a look at the parallel insert patch and have the same idea. > > https://commitfest.postgresql.org/31/2844/ > > > > * Consider generating Gather or Gather Merge paths. We must only > > do this > > * if the relation is parallel safe, and we don't do it for child > > rels to > > * avoid creating multiple Gather nodes within the same plan. We > > must do > > * this after all paths have been generated and before > > set_cheapest, since > > * one of the generated paths may turn out to be the cheapest one. 
> > */ > > if (rel->consider_parallel && !IS_OTHER_REL(rel)) > > generate_useful_gather_paths(root, rel, false); > > > > IMO Gatherpath created here seems the right one which can possible ignore > > parallel cost if in CTAS. > > But We need check the following parse option which will create path to be > > the parent of Gatherpath here. > > > > if (root->parse->rowMarks) > > if (limit_needed(root->parse)) > > if (root->parse->sortClause) > > if (root->parse->distinctClause) > > if (root->parse->hasWindowFuncs) > > if (root->parse->groupClause || root->parse->groupingSets || > > root->parse->hasAggs || root->root->hasHavingQual) > > > > Thanks Amit and Hou. I will look into these areas and get back soon. > It might be better to split the patch for this such that in the base patch, we won't consider anything special for gather costing w.r.t CTAS and in the next patch, we consider all the checks discussed above. -- With Regards, Amit Kapila.
Re: Parallel Inserts in CREATE TABLE AS
On Tue, Dec 8, 2020 at 6:24 PM Hou, Zhijie wrote: > > > > I'm not quite sure how to address this. Can we not allow the planner > > > to consider that the select is for CTAS and check only after the > > > planning is done for the Gather node and other checks? > > > > > > > IIUC, you are saying that we should not influence the cost of gather node > > even when the insertion would be done by workers? I think that should be > > our fallback option anyway but that might miss some paths to be considered > > parallel where the cost becomes more due to parallel_tuple_cost (aka tuple > > transfer cost). I think the idea is we can avoid the tuple transfer cost > > only when Gather is the top node because only at that time we can push > > insertion down, right? How about if we have some way to detect the same > > before calling generate_useful_gather_paths()? I think when we are calling > > apply_scanjoin_target_to_paths() in grouping_planner(), if the > > query_level is 1, it is for CTAS, and it doesn't have a chance to create > > UPPER_REL (doesn't have grouping, order, limit, etc clause) then we can > > probably assume that the Gather will be top_node. I am not sure about this > > but I think it is worth exploring. > > > > I took a look at the parallel insert patch and have the same idea. > https://commitfest.postgresql.org/31/2844/ > > * Consider generating Gather or Gather Merge paths. We must only do > this > * if the relation is parallel safe, and we don't do it for child > rels to > * avoid creating multiple Gather nodes within the same plan. We must > do > * this after all paths have been generated and before set_cheapest, > since > * one of the generated paths may turn out to be the cheapest one. > */ > if (rel->consider_parallel && !IS_OTHER_REL(rel)) > generate_useful_gather_paths(root, rel, false); > > IMO Gatherpath created here seems the right one which can possible ignore > parallel cost if in CTAS. 
> But We need check the following parse option which will create path to be the > parent of Gatherpath here. > > if (root->parse->rowMarks) > if (limit_needed(root->parse)) > if (root->parse->sortClause) > if (root->parse->distinctClause) > if (root->parse->hasWindowFuncs) > if (root->parse->groupClause || root->parse->groupingSets || > root->parse->hasAggs || root->root->hasHavingQual) > Thanks Amit and Hou. I will look into these areas and get back soon. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
RE: Parallel Inserts in CREATE TABLE AS
> > I'm not quite sure how to address this. Can we not allow the planner > > to consider that the select is for CTAS and check only after the > > planning is done for the Gather node and other checks? > > > > IIUC, you are saying that we should not influence the cost of gather node > even when the insertion would be done by workers? I think that should be > our fallback option anyway but that might miss some paths to be considered > parallel where the cost becomes more due to parallel_tuple_cost (aka tuple > transfer cost). I think the idea is we can avoid the tuple transfer cost > only when Gather is the top node because only at that time we can push > insertion down, right? How about if we have some way to detect the same > before calling generate_useful_gather_paths()? I think when we are calling > apply_scanjoin_target_to_paths() in grouping_planner(), if the > query_level is 1, it is for CTAS, and it doesn't have a chance to create > UPPER_REL (doesn't have grouping, order, limit, etc clause) then we can > probably assume that the Gather will be top_node. I am not sure about this > but I think it is worth exploring. > I took a look at the parallel insert patch and have the same idea. https://commitfest.postgresql.org/31/2844/ * Consider generating Gather or Gather Merge paths. We must only do this * if the relation is parallel safe, and we don't do it for child rels to * avoid creating multiple Gather nodes within the same plan. We must do * this after all paths have been generated and before set_cheapest, since * one of the generated paths may turn out to be the cheapest one. */ if (rel->consider_parallel && !IS_OTHER_REL(rel)) generate_useful_gather_paths(root, rel, false); IMO Gatherpath created here seems the right one which can possible ignore parallel cost if in CTAS. But We need check the following parse option which will create path to be the parent of Gatherpath here. 
if (root->parse->rowMarks)
if (limit_needed(root->parse))
if (root->parse->sortClause)
if (root->parse->distinctClause)
if (root->parse->hasWindowFuncs)
if (root->parse->groupClause || root->parse->groupingSets || root->parse->hasAggs || root->root->hasHavingQual)

Best regards, houzj
Re: Parallel Inserts in CREATE TABLE AS
On Mon, Dec 7, 2020 at 7:04 PM Bharath Rupireddy wrote: > > I'm not quite sure how to address this. Can we not allow the planner > to consider that the select is for CTAS and check only after the > planning is done for the Gather node and other checks? > IIUC, you are saying that we should not influence the cost of gather node even when the insertion would be done by workers? I think that should be our fallback option anyway but that might miss some paths to be considered parallel where the cost becomes more due to parallel_tuple_cost (aka tuple transfer cost). I think the idea is we can avoid the tuple transfer cost only when Gather is the top node because only at that time we can push insertion down, right? How about if we have some way to detect the same before calling generate_useful_gather_paths()? I think when we are calling apply_scanjoin_target_to_paths() in grouping_planner(), if the query_level is 1, it is for CTAS, and it doesn't have a chance to create UPPER_REL (doesn't have grouping, order, limit, etc clause) then we can probably assume that the Gather will be top_node. I am not sure about this but I think it is worth exploring. -- With Regards, Amit Kapila.
Re: Parallel Inserts in CREATE TABLE AS
On Mon, Dec 7, 2020 at 7:04 PM Bharath Rupireddy wrote: > > On Mon, Dec 7, 2020 at 5:25 PM Amit Kapila wrote: > > > > On Mon, Dec 7, 2020 at 4:20 PM Bharath Rupireddy > > wrote: > > > > > > On Mon, Dec 7, 2020 at 4:04 PM Amit Kapila > > > wrote: > > > > > > > > What is the need of checking query_level when 'isForCTAS' is set only > > > > when Gather is a top-node? > > > > > > > > > > isForCTAS is getting set before pg_plan_query() which is being used in > > > cost_gather(). We will not have a Gather node by then and hence will > > > not pass queryDesc to IsParallelInsertInCTASAllowed(into, NULL) while > > > setting isForCTAS to true. > > > > > > > IsParallelInsertInCTASAllowed() seems to be returning false if > > queryDesc is NULL, so won't isForCTAS be always set to false? I think > > I am missing something here. > > > > My bad. I utterly missed this, sorry for the confusion. > > My intention to have IsParallelInsertInCTASAllowed() is for two > purposes. 1. when called before planning without queryDesc, it should > return true if IS_CTAS(into) is true and is not a temporary table. 2. > when called after planning with a non-null queryDesc, along with 1) > checks, it should also perform the Gather State checks and return > accordingly. > > I have corrected it in v9 patch. Please have a look. > > > > > > > The isForCTAS will be true because [create table as], the > > > > query_level is always 1 because there is no subquery. > > > > So even if gather is not the top node, parallel cost will still be > > > > ignored. > > > > > > > > Is that works as expected ? > > > > > > > > > > I don't think that is expected and is not the case without this patch. > > > The cost shouldn't be changed for existing cases where the write is > > > not pushed to workers. > > > > > > > Thanks for pointing that out. Yes it should not change for the cases > > where parallel inserts will not be picked later. 
> > > > Any better suggestions on how to make the planner consider that the > > CTAS might choose parallel inserts later at the same time avoiding the > > above issue in case it doesn't? > > > > I'm not quite sure how to address this. Can we not allow the planner > to consider that the select is for CTAS and check only after the > planning is done for the Gather node and other checks? This is simple > to do, but we might miss some parallel plans for the SELECTs because > the planner would have already considered the tuple transfer cost from > workers to Gather wrongly because of which that parallel plan would > have become costlier compared to non parallel plans. IMO, we can do > this since it also keeps the existing behaviour of the planner i.e. > when the planner is planning for SELECTs it doesn't know that it is > doing it for CTAS. Thoughts? > I have done some initial review and I have a few comments. @@ -328,6 +316,15 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt, query = linitial_node(Query, rewritten); Assert(query->commandType == CMD_SELECT); + /* +* Flag to let the planner know that the SELECT query is for CTAS. This +* is used to calculate the tuple transfer cost from workers to gather +* node(in case parallelism kicks in for the SELECT part of the CTAS), +* to zero as each worker will insert its share of tuples in parallel. +*/ + if (IsParallelInsertInCTASAllowed(into, NULL)) + query->isForCTAS = true; + /* plan the query */ plan = pg_plan_query(query, pstate->p_sourcetext, CURSOR_OPT_PARALLEL_OK, params); @@ -350,6 +347,15 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt, /* call ExecutorStart to prepare the plan for execution */ ExecutorStart(queryDesc, GetIntoRelEFlags(into)); + /* +* If SELECT part of the CTAS is parallelizable, then make each +* parallel worker insert the tuples that are resulted in its execution +* into the target table. 
We need plan state to be initialized by the +* executor to decide whether to allow parallel inserts or not. + */ + if (IsParallelInsertInCTASAllowed(into, queryDesc)) + SetCTASParallelInsertState(queryDesc);

Once we have called IsParallelInsertInCTASAllowed() and set the query->isForCTAS flag, why are we calling this again?

---
+*/
+ if (!(root->parse->isForCTAS &&
+ root->query_level == 1))
+ run_cost += parallel_tuple_cost * path->path.rows;

From this check, it appears that lower-level Gather nodes will also be influenced by this. Consider this plan shape:

-> NLJ
   -> Gather
      -> Parallel Seq Scan
   -> Index Scan

This condition only checks that it is a top-level query and that it is under CTAS, so it will impact all the Gather nodes, as in the above example.

-- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Parallel Inserts in CREATE TABLE AS
On Mon, Dec 7, 2020 at 5:25 PM Amit Kapila wrote: > > On Mon, Dec 7, 2020 at 4:20 PM Bharath Rupireddy > wrote: > > > > On Mon, Dec 7, 2020 at 4:04 PM Amit Kapila wrote: > > > > > > What is the need of checking query_level when 'isForCTAS' is set only > > > when Gather is a top-node? > > > > > > > isForCTAS is getting set before pg_plan_query() which is being used in > > cost_gather(). We will not have a Gather node by then and hence will > > not pass queryDesc to IsParallelInsertInCTASAllowed(into, NULL) while > > setting isForCTAS to true. > > > > IsParallelInsertInCTASAllowed() seems to be returning false if > queryDesc is NULL, so won't isForCTAS be always set to false? I think > I am missing something here. > My bad. I utterly missed this, sorry for the confusion. My intention to have IsParallelInsertInCTASAllowed() is for two purposes. 1. when called before planning without queryDesc, it should return true if IS_CTAS(into) is true and is not a temporary table. 2. when called after planning with a non-null queryDesc, along with 1) checks, it should also perform the Gather State checks and return accordingly. I have corrected it in v9 patch. Please have a look. > > > > The isForCTAS will be true because [create table as], the > > > query_level is always 1 because there is no subquery. > > > So even if gather is not the top node, parallel cost will still be > > > ignored. > > > > > > Is that works as expected ? > > > > > > > I don't think that is expected and is not the case without this patch. > > The cost shouldn't be changed for existing cases where the write is > > not pushed to workers. > > > > Thanks for pointing that out. Yes it should not change for the cases > where parallel inserts will not be picked later. > > Any better suggestions on how to make the planner consider that the > CTAS might choose parallel inserts later at the same time avoiding the > above issue in case it doesn't? > I'm not quite sure how to address this. 
Can we not allow the planner to consider that the select is for CTAS and check only after the planning is done for the Gather node and other checks? This is simple to do, but we might miss some parallel plans for the SELECTs, because the planner would already have charged the tuple transfer cost from the workers to Gather, wrongly, making that parallel plan costlier than the non-parallel plans. IMO, we can do this since it also keeps the existing behaviour of the planner, i.e. when the planner is planning for SELECTs it doesn't know that it is doing it for CTAS. Thoughts?

With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com

Attachment: v9-0001-Parallel-Inserts-in-CREATE-TABLE-AS.patch
Re: Parallel Inserts in CREATE TABLE AS
On Mon, Dec 7, 2020 at 4:20 PM Bharath Rupireddy wrote: > > On Mon, Dec 7, 2020 at 4:04 PM Amit Kapila wrote: > > > > What is the need of checking query_level when 'isForCTAS' is set only > > when Gather is a top-node? > > > > isForCTAS is getting set before pg_plan_query() which is being used in > cost_gather(). We will not have a Gather node by then and hence will > not pass queryDesc to IsParallelInsertInCTASAllowed(into, NULL) while > setting isForCTAS to true. > IsParallelInsertInCTASAllowed() seems to be returning false if queryDesc is NULL, so won't isForCTAS be always set to false? I think I am missing something here. -- With Regards, Amit Kapila.
Re: Parallel Inserts in CREATE TABLE AS
On Mon, Dec 7, 2020 at 4:04 PM Amit Kapila wrote: > > What is the need of checking query_level when 'isForCTAS' is set only > when Gather is a top-node? > isForCTAS is set before pg_plan_query() and is then used in cost_gather(). We will not have a Gather node by then, and hence we pass a NULL queryDesc to IsParallelInsertInCTASAllowed(into, NULL) while setting isForCTAS to true. The intention of checking query_level == 1 in cost_gather() is to consider only the top-level query, not the subqueries. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
Re: Parallel Inserts in CREATE TABLE AS
On Mon, Dec 7, 2020 at 3:44 PM Bharath Rupireddy wrote: > > On Mon, Dec 7, 2020 at 2:55 PM Amit Kapila wrote: > > > > On Mon, Dec 7, 2020 at 11:32 AM Hou, Zhijie > > wrote: > > > > > > + if (!(root->parse->isForCTAS && > > > + root->query_level == 1)) > > > + run_cost += parallel_tuple_cost * path->path.rows; > > > > > > I noticed that the parallel_tuple_cost will still be ignored, > > > When Gather is not the top node. > > > > > > Example: > > > Create table test(i int); > > > insert into test values(generate_series(1,1000,1)); > > > explain create table ntest3 as select * from test where i < 200 > > > limit 1; > > > > > > QUERY PLAN > > > --- > > > Limit (cost=1000.00..97331.33 rows=1000 width=4) > > >-> Gather (cost=1000.00..97331.33 rows=1000 width=4) > > > Workers Planned: 2 > > > -> Parallel Seq Scan on test (cost=0.00..96331.33 rows=417 > > > width=4) > > >Filter: (i < 200) > > > > > > > > > The isForCTAS will be true because [create table as], the > > > query_level is always 1 because there is no subquery. > > > So even if gather is not the top node, parallel cost will still be > > > ignored. > > > > > > Is that works as expected ? > > > > > > > I don't think that is expected and is not the case without this patch. > > The cost shouldn't be changed for existing cases where the write is > > not pushed to workers. > > > > Thanks for pointing that out. Yes it should not change for the cases > where parallel inserts will not be picked later. > > Any better suggestions on how to make the planner consider that the > CTAS might choose parallel inserts later at the same time avoiding the > above issue in case it doesn't? > What is the need of checking query_level when 'isForCTAS' is set only when Gather is a top-node? -- With Regards, Amit Kapila.
Re: Parallel Inserts in CREATE TABLE AS
On Mon, Dec 7, 2020 at 2:55 PM Amit Kapila wrote: > > On Mon, Dec 7, 2020 at 11:32 AM Hou, Zhijie wrote: > > > > Hi > > > > + /* > > +* Flag to let the planner know that the SELECT query is for CTAS. > > This is > > +* used to calculate the tuple transfer cost from workers to gather > > node(in > > +* case parallelism kicks in for the SELECT part of the CTAS), to > > zero as > > +* each worker will insert its share of tuples in parallel. > > +*/ > > + if (IsParallelInsertInCTASAllowed(into, NULL)) > > + query->isForCTAS = true; > > > > > > + /* > > +* We do not compute the parallel_tuple_cost for CTAS because the > > number of > > +* tuples that are transferred from workers to the gather node is > > zero as > > +* each worker, in parallel, inserts the tuples that are resulted > > from its > > +* chunk of plan execution. This change may make the parallel plan > > cheap > > +* among all other plans, and influence the planner to consider this > > +* parallel plan. > > +*/ > > + if (!(root->parse->isForCTAS && > > + root->query_level == 1)) > > + run_cost += parallel_tuple_cost * path->path.rows; > > > > I noticed that the parallel_tuple_cost will still be ignored, > > When Gather is not the top node. > > > > Example: > > Create table test(i int); > > insert into test values(generate_series(1,1000,1)); > > explain create table ntest3 as select * from test where i < 200 > > limit 1; > > > > QUERY PLAN > > --- > > Limit (cost=1000.00..97331.33 rows=1000 width=4) > >-> Gather (cost=1000.00..97331.33 rows=1000 width=4) > > Workers Planned: 2 > > -> Parallel Seq Scan on test (cost=0.00..96331.33 rows=417 > > width=4) > >Filter: (i < 200) > > > > > > The isForCTAS will be true because [create table as], the > > query_level is always 1 because there is no subquery. > > So even if gather is not the top node, parallel cost will still be ignored. > > > > Is that works as expected ? > > > > I don't think that is expected and is not the case without this patch. 
> The cost shouldn't be changed for existing cases where the write is
> not pushed to workers.

Thanks for pointing that out. Yes, it should not change for the cases where parallel inserts will not be picked later. Any better suggestions on how to make the planner consider that the CTAS might choose parallel inserts later, while at the same time avoiding the above issue in case it doesn't?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
Re: Parallel Inserts in CREATE TABLE AS
On Mon, Dec 7, 2020 at 11:32 AM Hou, Zhijie wrote: > > Hi > > + /* > +* Flag to let the planner know that the SELECT query is for CTAS. > This is > +* used to calculate the tuple transfer cost from workers to gather > node(in > +* case parallelism kicks in for the SELECT part of the CTAS), to > zero as > +* each worker will insert its share of tuples in parallel. > +*/ > + if (IsParallelInsertInCTASAllowed(into, NULL)) > + query->isForCTAS = true; > > > + /* > +* We do not compute the parallel_tuple_cost for CTAS because the > number of > +* tuples that are transferred from workers to the gather node is > zero as > +* each worker, in parallel, inserts the tuples that are resulted > from its > +* chunk of plan execution. This change may make the parallel plan > cheap > +* among all other plans, and influence the planner to consider this > +* parallel plan. > +*/ > + if (!(root->parse->isForCTAS && > + root->query_level == 1)) > + run_cost += parallel_tuple_cost * path->path.rows; > > I noticed that the parallel_tuple_cost will still be ignored, > When Gather is not the top node. > > Example: > Create table test(i int); > insert into test values(generate_series(1,1000,1)); > explain create table ntest3 as select * from test where i < 200 limit > 1; > > QUERY PLAN > --- > Limit (cost=1000.00..97331.33 rows=1000 width=4) >-> Gather (cost=1000.00..97331.33 rows=1000 width=4) > Workers Planned: 2 > -> Parallel Seq Scan on test (cost=0.00..96331.33 rows=417 width=4) >Filter: (i < 200) > > > The isForCTAS will be true because [create table as], the > query_level is always 1 because there is no subquery. > So even if gather is not the top node, parallel cost will still be ignored. > > Is that works as expected ? > I don't think that is expected and is not the case without this patch. The cost shouldn't be changed for existing cases where the write is not pushed to workers. -- With Regards, Amit Kapila.
RE: Parallel Inserts in CREATE TABLE AS
Hi

+ /*
+ * Flag to let the planner know that the SELECT query is for CTAS. This is
+ * used to calculate the tuple transfer cost from workers to gather node (in
+ * case parallelism kicks in for the SELECT part of the CTAS), to zero as
+ * each worker will insert its share of tuples in parallel.
+ */
+ if (IsParallelInsertInCTASAllowed(into, NULL))
+ query->isForCTAS = true;

+ /*
+ * We do not compute the parallel_tuple_cost for CTAS because the number of
+ * tuples that are transferred from workers to the gather node is zero as
+ * each worker, in parallel, inserts the tuples that are resulted from its
+ * chunk of plan execution. This change may make the parallel plan cheap
+ * among all other plans, and influence the planner to consider this
+ * parallel plan.
+ */
+ if (!(root->parse->isForCTAS &&
+ root->query_level == 1))
+ run_cost += parallel_tuple_cost * path->path.rows;

I noticed that the parallel_tuple_cost will still be ignored when Gather is not the top node.

Example:
Create table test(i int);
insert into test values(generate_series(1,1000,1));
explain create table ntest3 as select * from test where i < 200 limit 1;

                              QUERY PLAN
----------------------------------------------------------------------
 Limit (cost=1000.00..97331.33 rows=1000 width=4)
   -> Gather (cost=1000.00..97331.33 rows=1000 width=4)
         Workers Planned: 2
         -> Parallel Seq Scan on test (cost=0.00..96331.33 rows=417 width=4)
               Filter: (i < 200)

isForCTAS will be true because of [create table as], and query_level is always 1 because there is no subquery. So even if Gather is not the top node, the parallel tuple cost will still be ignored.

Does that work as expected?

Best regards,
houzj
Re: Parallel Inserts in CREATE TABLE AS
Thanks for the comments. On Mon, Dec 7, 2020 at 8:56 AM Zhihong Yu wrote: > > > + (void) SetCurrentCommandIdUsedForWorker(); > + myState->output_cid = GetCurrentCommandId(false); > > SetCurrentCommandIdUsedForWorker already has void as return type. The > '(void)' is not needed. > Removed. > > +* rd_createSubid is marked invalid, otherwise, the table is > +* not allowed to extend by the workers. > > nit: to extend by the workers -> to be extended by the workers > Changed. > > For IsParallelInsertInCTASAllowed, logic is inside 'if (IS_CTAS(into))' block. > You can return false when (!IS_CTAS(into)) - this would save some indentation > for the body. > Done. > > + if (rel && rel->relpersistence != RELPERSISTENCE_TEMP) > + allowed = true; > > Similarly, when the above condition doesn't hold, you can return false > directly - reducing the next if condition to 'if (queryDesc)'. > Done. > > The composite condition is negated. Maybe you can write without negation: > Done. > > +* Write out the number of tuples this worker has inserted. Leader will > use > +* it to inform to the end client. > > 'inform to the end client' -> 'inform the end client' (without to) > Changed. Attaching v8 patch. Consider this for further review. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com v8-0001-Parallel-Inserts-in-CREATE-TABLE-AS.patch Description: Binary data
Re: Parallel Inserts in CREATE TABLE AS
Hi, Bharath : + (void) SetCurrentCommandIdUsedForWorker(); + myState->output_cid = GetCurrentCommandId(false); SetCurrentCommandIdUsedForWorker already has void as return type. The '(void)' is not needed. +* rd_createSubid is marked invalid, otherwise, the table is +* not allowed to extend by the workers. nit: to extend by the workers -> to be extended by the workers For IsParallelInsertInCTASAllowed, logic is inside 'if (IS_CTAS(into))' block. You can return false when (!IS_CTAS(into)) - this would save some indentation for the body. + if (rel && rel->relpersistence != RELPERSISTENCE_TEMP) + allowed = true; Similarly, when the above condition doesn't hold, you can return false directly - reducing the next if condition to 'if (queryDesc)'. + if (!(ps && IsA(ps, GatherState) && !ps->ps_ProjInfo && + plannedstmt->parallelModeNeeded && + plannedstmt->planTree && + IsA(plannedstmt->planTree, Gather) && + plannedstmt->planTree->lefttree && + plannedstmt->planTree->lefttree->parallel_aware && + plannedstmt->planTree->lefttree->parallel_safe)) The composite condition is negated. Maybe you can write without negation: + return (ps && IsA(ps, GatherState) && !ps->ps_ProjInfo && + plannedstmt->parallelModeNeeded && + plannedstmt->planTree && + IsA(plannedstmt->planTree, Gather) && + plannedstmt->planTree->lefttree && + plannedstmt->planTree->lefttree->parallel_aware && + plannedstmt->planTree->lefttree->parallel_safe) +* Write out the number of tuples this worker has inserted. Leader will use +* it to inform to the end client. 'inform to the end client' -> 'inform the end client' (without to) Cheers On Sun, Dec 6, 2020 at 4:37 PM Bharath Rupireddy < bharath.rupireddyforpostg...@gmail.com> wrote: > Thanks Amit for the review comments. > > On Sat, Dec 5, 2020 at 4:27 PM Amit Kapila > wrote: > > > > > If I'm not wrong, I think currently we have no exec nodes for DDLs. > > > I'm not sure whether we would like to introduce one for this. 
> > > > Yeah, I am also not in favor of having an executor node for CTAS but > > OTOH, I also don't like the way you have jammed the relevant > > information in generic PlanState. How about keeping it in GatherState > > and initializing it in ExecCreateTableAs() after the executor start. > > You are already doing special treatment for the Gather node in > > ExecCreateTableAs (via IsParallelInsertInCTASAllowed) so we can as > > well initialize the required information in GatherState in > > ExecCreateTableAs. I think that might help in reducing the special > > treatment for intoclause at different places. > > > > Done. Added required info to GatherState node. While this reduced the > changes at many other places, but had to pass the into clause and > object id to ExecInitParallelPlan() as we do not send GatherState node > to it. Hope that's okay. > > > > > Few other assorted comments: > > = > > 1. > > This looks a bit odd. The function name > > 'IsParallelInsertInCTASAllowed' suggests that it just checks whether > > parallelism is allowed but it is internally changing the plan_rows. It > > might be better to do this separately if the parallelism is allowed. > > > > Changed. > > > > > 2. > > static void ExecShutdownGatherWorkers(GatherState *node); > > - > > +static void ExecParallelInsertInCTAS(GatherState *node); > > > > Spurious line removal. > > > > Corrected. > > > > > 3. > > The comment and code appear a bit misleading as the function seems to > > shutdown the workers rather than waiting for them to finish. How about > > using something like below: > > > > /* > > * Next, accumulate buffer and WAL usage. (This must wait for the workers > > * to finish, or we might get incomplete data.) 
> > */ > > if (nworkers > 0) > > { > > int i; > > > > /* Wait for all vacuum workers to finish */ > > WaitForParallelWorkersToFinish(lps->pcxt); > > > > for (i = 0; i < lps->pcxt->nworkers_launched; i++) > > InstrAccumParallelQuery(&lps->buffer_usage[i], &lps->wal_usage[i]); > > } > > > > This is how it works for parallel vacuum. > > > > Done. > > > > > 4. > > The above comment doesn't seem to convey what it intends to convey. > > How about changing it slightly as: "We don't compute the > > parallel_tuple_cost for CTAS because the number of tuples that are > > transferred from workers to the gather node is zero as each worker > > parallelly inserts the tuples that are resulted from its chunk of plan > > execution. This change may make the parallel plan cheap among all > > other plans, and influence the planner to consider this parallel > > plan." > > > > Changed. > > > > > Then, we can also have an Assert for path->path.rows to zero for the > CTAS case. > > > > We can not have Assert(path->path.rows == 0), because we are
Re: Parallel Inserts in CREATE TABLE AS
Thanks Amit for the review comments.

On Sat, Dec 5, 2020 at 4:27 PM Amit Kapila wrote:
> > If I'm not wrong, I think currently we have no exec nodes for DDLs.
> > I'm not sure whether we would like to introduce one for this.
>
> Yeah, I am also not in favor of having an executor node for CTAS but
> OTOH, I also don't like the way you have jammed the relevant
> information in generic PlanState. How about keeping it in GatherState
> and initializing it in ExecCreateTableAs() after the executor start.
> You are already doing special treatment for the Gather node in
> ExecCreateTableAs (via IsParallelInsertInCTASAllowed) so we can as
> well initialize the required information in GatherState in
> ExecCreateTableAs. I think that might help in reducing the special
> treatment for intoclause at different places.

Done. Added the required info to the GatherState node. While this reduced the changes at many other places, I had to pass the into clause and object id to ExecInitParallelPlan() as we do not send the GatherState node to it. Hope that's okay.

> Few other assorted comments:
> =
> 1.
> This looks a bit odd. The function name
> 'IsParallelInsertInCTASAllowed' suggests that it just checks whether
> parallelism is allowed but it is internally changing the plan_rows. It
> might be better to do this separately if the parallelism is allowed.

Changed.

> 2.
> static void ExecShutdownGatherWorkers(GatherState *node);
> -
> +static void ExecParallelInsertInCTAS(GatherState *node);
>
> Spurious line removal.

Corrected.

> 3.
> The comment and code appear a bit misleading as the function seems to
> shutdown the workers rather than waiting for them to finish. How about
> using something like below:
>
> /*
> * Next, accumulate buffer and WAL usage. (This must wait for the workers
> * to finish, or we might get incomplete data.)
> */
> if (nworkers > 0)
> {
> int i;
>
> /* Wait for all vacuum workers to finish */
> WaitForParallelWorkersToFinish(lps->pcxt);
>
> for (i = 0; i < lps->pcxt->nworkers_launched; i++)
> InstrAccumParallelQuery(&lps->buffer_usage[i], &lps->wal_usage[i]);
> }
>
> This is how it works for parallel vacuum.

Done.

> 4.
> The above comment doesn't seem to convey what it intends to convey.
> How about changing it slightly as: "We don't compute the
> parallel_tuple_cost for CTAS because the number of tuples that are
> transferred from workers to the gather node is zero as each worker
> parallelly inserts the tuples that are resulted from its chunk of plan
> execution. This change may make the parallel plan cheap among all
> other plans, and influence the planner to consider this parallel
> plan."

Changed.

> Then, we can also have an Assert for path->path.rows to zero for the CTAS case.

We can not have Assert(path->path.rows == 0), because we are not changing this parameter upstream in or before the planning phase; we are just not taking it into account for CTAS. We would have to do extra checks in several places if we had to make the planner set path->path.rows to 0 for CTAS. IMHO, that's not necessary; we can just skip taking this value in cost_gather. Thoughts?

> 5.
> + /* Prallel inserts in CTAS related info is specified below. */
> + IntoClause *intoclause;
> + Oid objectid;
> + DestReceiver *dest;
> } PlanState;
>
> Typo. /Prallel/Parallel

Corrected.

> 6.
> Currently, it seems the plan look like: > Gather (actual time=970.524..972.913 rows=0 loops=1) >-> Create t1_test > Workers Planned: 2 > Workers Launched: 2 > -> Parallel Seq Scan on t1 (actual time=0.028..86.623 rows=33 > loops=3) > > I would prefer it to be: > Gather (actual time=970.524..972.913 rows=0 loops=1) > Workers Planned: 2 > Workers Launched: 2 > -> Create t1_test > -> Parallel Seq Scan on t1 (actual time=0.028..86.623 rows=33 > loops=3) > > This way it looks like the writing part is done below the Gather node > and also it will match the Parallel Insert patch of Greg. > Done. Attaching v7 patch. Please review it further. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com From 7eff47d971c0ac5722ddc7de5a4c7e7550bb71aa Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Sun, 6 Dec 2020 08:21:42 +0530 Subject: [PATCH v7] Parallel Inserts in CREATE TABLE AS The idea of this patch is to allow the leader and each worker insert the tuples in parallel if the SELECT part of the CTAS is parallelizable. The design: Let the planner know that the SELECT is from CTAS in createas.c so that it can set the number of tuples transferred from the workers to Gather node to 0. With this change, there are chances that the planner may choose the parallel plan. After the planning, check if the upper plan node is Gather in createas.c and mark a parallelism flag in the CTAS dest receiver. Pass the into clause, object id, command id from the leader to workers, so that each worker can create its own CTAS dest r
Re: Parallel Inserts in CREATE TABLE AS
On Mon, Nov 30, 2020 at 10:43 AM Bharath Rupireddy wrote: > > On Fri, Nov 27, 2020 at 1:07 PM Luc Vlaming wrote: > > > > Disclaimer: I have by no means throughly reviewed all the involved parts > > and am probably missing quite a bit of context so if I understood parts > > wrong or they have been discussed before then I'm sorry. Most notably > > the whole situation about the command-id is still elusive for me and I > > can really not judge yet anything related to that. > > > > IMHO The patch makes that we now have the gather do most of the CTAS > > work, which seems unwanted. For the non-ctas insert/update case it seems > > that a modifytable node exists to actually do the work. What I'm > > wondering is if it is maybe not better to introduce a CreateTable node > > as well? > > This would have several merits: > > - the rowcount of that node would be 0 for the parallel case, and > > non-zero for the serial case. Then the gather ndoe and the Query struct > > don't have to know about CTAS for the most part, removing e.g. the case > > distinctions in cost_gather. > > - the inserted rows can now be accounted in this new node instead of the > > parallel executor state, and this node can also do its own DSM > > intializations > > - the generation of a partial variants of the CreateTable node can now > > be done in the optimizer instead of the ExecCreateTableAs which IMHO is > > a more logical place to make these kind of decisions. which then also > > makes it potentially play nicer with costs and the like. > > - the explain code can now be in its own place instead of part of the > > gather node > > - IIUC it would allow the removal of the code to only launch parallel > > workers if its not CTAS, which IMHO would be quite a big benefit. > > > > Thoughts? > > > > If I'm not wrong, I think currently we have no exec nodes for DDLs. > I'm not sure whether we would like to introduce one for this. 
> Yeah, I am also not in favor of having an executor node for CTAS but OTOH, I also don't like the way you have jammed the relevant information in generic PlanState. How about keeping it in GatherState and initializing it in ExecCreateTableAs() after the executor start. You are already doing special treatment for the Gather node in ExecCreateTableAs (via IsParallelInsertInCTASAllowed) so we can as well initialize the required information in GatherState in ExecCreateTableAs. I think that might help in reducing the special treatment for intoclause at different places. Few other assorted comments: = 1. +/* + * IsParallelInsertInCTASAllowed --- determine whether or not parallel + * insertion is possible. + */ +bool IsParallelInsertInCTASAllowed(IntoClause *into, QueryDesc *queryDesc) +{ .. .. if (ps && IsA(ps, GatherState) && !ps->ps_ProjInfo && + plannedstmt->parallelModeNeeded && + plannedstmt->planTree && + IsA(plannedstmt->planTree, Gather) && + plannedstmt->planTree->lefttree && + plannedstmt->planTree->lefttree->parallel_aware && + plannedstmt->planTree->lefttree->parallel_safe) + { + /* + * Since there are no rows that are transferred from workers to + * Gather node, so we set it to 0 to be visible in explain + * plans. Note that we would have already accounted this for + * cost calculations in cost_gather(). + */ + plannedstmt->planTree->plan_rows = 0; This looks a bit odd. The function name 'IsParallelInsertInCTASAllowed' suggests that it just checks whether parallelism is allowed but it is internally changing the plan_rows. It might be better to do this separately if the parallelism is allowed. 2. static void ExecShutdownGatherWorkers(GatherState *node); - +static void ExecParallelInsertInCTAS(GatherState *node); Spurious line removal. 3. /* Wait for the parallel workers to finish. 
*/ + if (node->nworkers_launched > 0) + { + ExecShutdownGatherWorkers(node); + + /* + * Add up the total tuples inserted by all workers, to the tuples + * inserted by the leader(if any). This will be shared to client. + */ + node->ps.state->es_processed += pg_atomic_read_u64(node->pei->processed); + } The comment and code appear a bit misleading as the function seems to shutdown the workers rather than waiting for them to finish. How about using something like below: /* * Next, accumulate buffer and WAL usage. (This must wait for the workers * to finish, or we might get incomplete data.) */ if (nworkers > 0) { int i; /* Wait for all vacuum workers to finish */ WaitForParallelWorkersToFinish(lps->pcxt); for (i = 0; i < lps->pcxt->nworkers_launched; i++) InstrAccumParallelQuery(&lps->buffer_usage[i], &lps->wal_usage[i]); } This is how it works for parallel vacuum. 4. + + /* + * Make the number of tuples that are transferred from workers to gather + * node zero as each worker parallelly insert the tuples that are resulted + * from its chunk of plan execution. This change may make the parallel + * plan cheap among all other plans, and influence the planner to consider + * this parallel plan. + */ + if (!(roo
Re: Parallel Inserts in CREATE TABLE AS
On Fri, Nov 27, 2020 at 1:07 PM Luc Vlaming wrote: > > Disclaimer: I have by no means throughly reviewed all the involved parts > and am probably missing quite a bit of context so if I understood parts > wrong or they have been discussed before then I'm sorry. Most notably > the whole situation about the command-id is still elusive for me and I > can really not judge yet anything related to that. > > IMHO The patch makes that we now have the gather do most of the CTAS > work, which seems unwanted. For the non-ctas insert/update case it seems > that a modifytable node exists to actually do the work. What I'm > wondering is if it is maybe not better to introduce a CreateTable node > as well? > This would have several merits: > - the rowcount of that node would be 0 for the parallel case, and > non-zero for the serial case. Then the gather ndoe and the Query struct > don't have to know about CTAS for the most part, removing e.g. the case > distinctions in cost_gather. > - the inserted rows can now be accounted in this new node instead of the > parallel executor state, and this node can also do its own DSM > intializations > - the generation of a partial variants of the CreateTable node can now > be done in the optimizer instead of the ExecCreateTableAs which IMHO is > a more logical place to make these kind of decisions. which then also > makes it potentially play nicer with costs and the like. > - the explain code can now be in its own place instead of part of the > gather node > - IIUC it would allow the removal of the code to only launch parallel > workers if its not CTAS, which IMHO would be quite a big benefit. > > Thoughts? > If I'm not wrong, I think currently we have no exec nodes for DDLs. I'm not sure whether we would like to introduce one for this. And also note that, both CTAS and CREATE MATERIALIZED VIEW(CMV) are handled with the same code, so if we have CreateTable as the new node, then do we also want to have another node or a generic node name? 
The main design idea of the patch proposed in this thread is that pushing the dest receiver down to the workers if the SELECT part of the CTAS or CMV is parallelizable. And also, for CTAS or CMV we do not do any planning as such, but the planner is just influenced to take into consideration that there are no tuples to transfer from the workers to Gather node which may make the planner choose parallelism for SELECT part. So, the planner work for CTAS or CMV is very minimal. I also have the idea of extending this design (if accepted) to REFRESH MATERIALIZED VIEW after some analysis. I may be wrong above, other hackers may have better opinions. > > Some small things I noticed while going through the patch: > - Typo for the comment about "inintorel_startup" which should be > intorel_startup > Corrected. > > - if (node->nworkers_launched == 0 && !node->need_to_scan_locally) > >can be changed into >if (node->nworkers_launched == 0 >because either way it'll be true. > Yes, !node->need_to_scan_locally is not necessary, we need to set it to true if there are no workers launched. I removed !node->need_to_scan_locally check from the if clause. > On Fri, Nov 27, 2020 at 11:57 AM Hou, Zhijie > wrote: > > > > > Thanks a lot for the use case. Yes with the current patch table will lose > > > data related to the subplan. On analyzing further, I think we can not > > > allow > > > parallel inserts in the cases when the Gather node has some projections > > > to do. Because the workers can not perform that projection. So, having > > > ps_ProjInfo in the Gather node is an indication for us to disable parallel > > > inserts and only the leader can do the insertions after the Gather node > > > does the required projections. > > > > > > Thoughts? > > > > Agreed. > > Thanks! I will add/modify IsParallelInsertInCTASAllowed() to return > false in this case. > Modified. Attaching v6 patch that has the above review comments addressed. Please review it further. With Regards, Bharath Rupireddy. 
EnterpriseDB: http://www.enterprisedb.com From b5f111c7dcc7efa336aa5da59b6bdbd3689a70e5 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Mon, 30 Nov 2020 09:36:03 +0530 Subject: [PATCH v6] Parallel Inserts in CREATE TABLE AS The idea of this patch is to allow the leader and each worker insert the tuples in parallel if the SELECT part of the CTAS is parallelizable. The design: Let the planner know that the SELECT is from CTAS in createas.c so that it can set the number of tuples transferred from the workers to Gather node to 0. With this change, there are chances that the planner may choose the parallel plan. After the planning, check if the upper plan node is Gather in createas.c and mark a parallelism flag in the CTAS dest receiver. Pass the into clause, object id, command id from the leader to workers, so that each worker can create its own CTAS dest receiver. Leader inserts it's share of tuples if instructed to do, and so are workers. Each worker writes
Re: Parallel Inserts in CREATE TABLE AS
On 25-11-2020 03:40, Bharath Rupireddy wrote:
> On Tue, Nov 24, 2020 at 4:43 PM Hou, Zhijie wrote:
>> I'm very interested in this feature, and I'm looking at the patch, here are some comments.
>
> Thanks for the review.
>
>> How about the following style:
>> if (TupIsNull(outerTupleSlot))
>> break;
>> (void) node->ps.dest->receiveSlot(outerTupleSlot, node->ps.dest);
>> node->ps.state->es_processed++;
>> Which looks cleaner.
>
> Done.
>
>> The check can be replaced by ISCTAS(into).
>
> Done.
>
>> 'inerst' looks like a typo (insert).
>
> Corrected.
>
>> The code here calls strlen(intoclausestr) two times. After checking the existing code in ExecInitParallelPlan, it used to store the strlen in a variable. So how about the following style:
>> intoclause_len = strlen(intoclausestr);
>> ...
>> /* Store serialized intoclause. */
>> intoclause_space = shm_toc_allocate(pcxt->toc, intoclause_len + 1);
>> memcpy(shmptr, intoclausestr, intoclause_len + 1);
>> shm_toc_insert(pcxt->toc, PARALLEL_KEY_INTO_CLAUSE, intoclause_space);
>
> Done.
>
>> The two checks about intoclausestr seem like they can be combined:
>> if (intoclausestr != NULL)
>> {
>> ...
>> }
>> else
>> {
>> ...
>> }
>
> Done.
>
> Attaching v5 patch. Please consider it for further review.
>
> With Regards,
> Bharath Rupireddy.
> EnterpriseDB: http://www.enterprisedb.com

Disclaimer: I have by no means thoroughly reviewed all the involved parts and am probably missing quite a bit of context, so if I understood parts wrong or they have been discussed before then I'm sorry. Most notably the whole situation about the command-id is still elusive for me and I can really not judge yet anything related to that.

IMHO the patch makes the gather node do most of the CTAS work, which seems unwanted. For the non-CTAS insert/update case it seems that a modifytable node exists to actually do the work. What I'm wondering is if it is maybe not better to introduce a CreateTable node as well? This would have several merits:
- the rowcount of that node would be 0 for the parallel case, and non-zero for the serial case. Then the gather node and the Query struct don't have to know about CTAS for the most part, removing e.g. the case distinctions in cost_gather.
- the inserted rows can now be accounted in this new node instead of the parallel executor state, and this node can also do its own DSM initializations
- the generation of partial variants of the CreateTable node can now be done in the optimizer instead of ExecCreateTableAs, which IMHO is a more logical place to make these kinds of decisions, and which then also makes it potentially play nicer with costs and the like.
- the explain code can now be in its own place instead of part of the gather node
- IIUC it would allow the removal of the code to only launch parallel workers if it's not CTAS, which IMHO would be quite a big benefit.

Thoughts?

Some small things I noticed while going through the patch:
- Typo for the comment about "inintorel_startup" which should be intorel_startup
- if (node->nworkers_launched == 0 && !node->need_to_scan_locally)
  can be changed into
  if (node->nworkers_launched == 0)
  because either way it'll be true.

Regards,
Luc
Swarm64
Re: Parallel Inserts in CREATE TABLE AS
On Fri, Nov 27, 2020 at 11:57 AM Hou, Zhijie wrote:
> > Thanks a lot for the use case. Yes with the current patch table will lose
> > data related to the subplan. On analyzing further, I think we can not allow
> > parallel inserts in the cases when the Gather node has some projections
> > to do. Because the workers can not perform that projection. So, having
> > ps_ProjInfo in the Gather node is an indication for us to disable parallel
> > inserts and only the leader can do the insertions after the Gather node
> > does the required projections.
> >
> > Thoughts?
>
> Agreed.

Thanks! I will add/modify IsParallelInsertInCTASAllowed() to return false in this case.

> 2.
> @@ -166,6 +228,16 @@ ExecGather(PlanState *pstate)
> {
> ParallelContext *pcxt;
>
> + /*
> + * Take the necessary information to be passed to workers for
> + * parallel inserts in CTAS.
> + */
> + if (ISCTAS(node->ps.intoclause))
> + {
> + node->ps.lefttree->intoclause = node->ps.intoclause;
> + node->ps.lefttree->objectid = node->ps.objectid;
> + }
> +
> /* Initialize, or re-initialize, shared state needed by workers. */
> if (!node->pei)
> node->pei = ExecInitParallelPlan(node->ps.lefttree,
>
> I found the code passes intoclause and objectid to the Gather node's lefttree.
> Is it necessary? It seems only the Gather node will use the information.

I am passing the required information down to this point through the PlanState structure. Since the Gather node's lefttree is also a PlanState variable, I just assign the fields there to pass that information to ExecInitParallelPlan().

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
RE: Parallel Inserts in CREATE TABLE AS
Hi, > > I took a deep look at the projection logic. > > In most cases, you are right that Gather node does not need projection. > > > > In some rare cases, such as Subplan (or initplan I guess). > > The projection will happen in Gather node. > > > > The example: > > > > Create table test(i int); > > Create table test2(a int, b int); > > insert into test values(generate_series(1,1000,1)); > > insert into test2 values(generate_series(1,1000,1), > > generate_series(1,1000,1)); > > > > postgres=# explain(verbose, costs off) select test.i,(select i from > (select * from test2) as tt limit 1) from test where test.i < 2000; > >QUERY PLAN > > > > Gather > >Output: test.i, (SubPlan 1) > >Workers Planned: 2 > >-> Parallel Seq Scan on public.test > > Output: test.i > > Filter: (test.i < 2000) > >SubPlan 1 > > -> Limit > >Output: (test.i) > >-> Seq Scan on public.test2 > > Output: test.i > > > > In this case, projection is necessary, because the subplan will be > > executed in projection. > > > > If skipped, the table created will lose some data. > > > > Thanks a lot for the use case. Yes with the current patch table will lose > data related to the subplan. On analyzing further, I think we can not allow > parallel inserts in the cases when the Gather node has some projections > to do. Because the workers can not perform that projection. So, having > ps_ProjInfo in the Gather node is an indication for us to disable parallel > inserts and only the leader can do the insertions after the Gather node > does the required projections. > > Thoughts? > Agreed. 2. @@ -166,6 +228,16 @@ ExecGather(PlanState *pstate) { ParallelContext *pcxt; + /* +* Take the necessary information to be passed to workers for +* parallel inserts in CTAS. +*/ + if (ISCTAS(node->ps.intoclause)) + { + node->ps.lefttree->intoclause = node->ps.intoclause; + node->ps.lefttree->objectid = node->ps.objectid; + } + /* Initialize, or re-initialize, shared state needed by workers. 
*/ if (!node->pei) node->pei = ExecInitParallelPlan(node->ps.lefttree, I found the code passes intoclause and objectid to the Gather node's lefttree. Is it necessary? It seems only the Gather node will use the information. Best regards, houzj
Re: Parallel Inserts in CREATE TABLE AS
On Thu, Nov 26, 2020 at 12:15 PM Hou, Zhijie wrote: > > I took a deep look at the projection logic. > In most cases, you are right that Gather node does not need projection. > > In some rare cases, such as Subplan (or initplan I guess). > The projection will happen in Gather node. > > The example: > > Create table test(i int); > Create table test2(a int, b int); > insert into test values(generate_series(1,1000,1)); > insert into test2 values(generate_series(1,1000,1), > generate_series(1,1000,1)); > > postgres=# explain(verbose, costs off) select test.i,(select i from (select * > from test2) as tt limit 1) from test where test.i < 2000; >QUERY PLAN > > Gather >Output: test.i, (SubPlan 1) >Workers Planned: 2 >-> Parallel Seq Scan on public.test > Output: test.i > Filter: (test.i < 2000) >SubPlan 1 > -> Limit >Output: (test.i) >-> Seq Scan on public.test2 > Output: test.i > > In this case, projection is necessary, > because the subplan will be executed in projection. > > If skipped, the table created will lose some data. > Thanks a lot for the use case. Yes with the current patch the table will lose data related to the subplan. On analyzing further, I think we can not allow parallel inserts in the cases when the Gather node has some projections to do. Because the workers can not perform that projection. So, having ps_ProjInfo in the Gather node is an indication for us to disable parallel inserts and only the leader can do the insertions after the Gather node does the required projections. Thoughts? With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
RE: Parallel Inserts in CREATE TABLE AS
Hi, > On Thu, Nov 26, 2020 at 7:47 AM Hou, Zhijie > wrote: > > > > Hi, > > > > I have an issue about the following code: > > > > econtext = node->ps.ps_ExprContext; > > ResetExprContext(econtext); > > > > + if (ISCTAS(node->ps.intoclause)) > > + { > > + ExecParallelInsertInCTAS(node); > > + return NULL; > > + } > > > > /* If no projection is required, we're done. */ > > if (node->ps.ps_ProjInfo == NULL) > > return slot; > > > > /* > > * Form the result tuple using ExecProject(), and return it. > > */ > > econtext->ecxt_outertuple = slot; > > return ExecProject(node->ps.ps_ProjInfo); > > > > It seems the projection will be skipped. > > Is this because projection is not required in this case? > > (I'm not very familiar with where the projection will be.) > > > > For parallel inserts in CTAS, I don't think we need to project the tuples > being returned from the underlying plan nodes, and also we have nothing > to project from the Gather node further up. The required projection will > happen while the tuples are being returned from the underlying nodes and > the projected tuples are being directly fed to CTAS's dest receiver > intorel_receive(), from there into the created table. We don't need > ExecProject again in ExecParallelInsertInCTAS(). > > For instance, projection will always be done when the tuple is being returned > from an underlying sequential scan node (see ExecScan() --> > ExecProject()), and this is true for both leader > and workers. In both leader and workers, we are just calling CTAS's dest receiver intorel_receive(). > > Thoughts? I took a deep look at the projection logic. In most cases, you are right that the Gather node does not need projection. In some rare cases, such as a SubPlan (or an initplan, I guess), the projection will happen in the Gather node. 
The example:

Create table test(i int);
Create table test2(a int, b int);
insert into test values(generate_series(1,1000,1));
insert into test2 values(generate_series(1,1000,1), generate_series(1,1000,1));

postgres=# explain(verbose, costs off) select test.i,(select i from (select * from test2) as tt limit 1) from test where test.i < 2000;
                 QUERY PLAN
---------------------------------------------
 Gather
   Output: test.i, (SubPlan 1)
   Workers Planned: 2
   ->  Parallel Seq Scan on public.test
         Output: test.i
         Filter: (test.i < 2000)
   SubPlan 1
     ->  Limit
           Output: (test.i)
           ->  Seq Scan on public.test2
                 Output: test.i

In this case, projection is necessary, because the subplan will be executed in the projection. If skipped, the table created will lose some data.

Best regards, houzj
Re: Parallel Inserts in CREATE TABLE AS
On Thu, Nov 26, 2020 at 7:47 AM Hou, Zhijie wrote: > > Hi, > > I have an issue about the following code: > > econtext = node->ps.ps_ExprContext; > ResetExprContext(econtext); > > + if (ISCTAS(node->ps.intoclause)) > + { > + ExecParallelInsertInCTAS(node); > + return NULL; > + } > > /* If no projection is required, we're done. */ > if (node->ps.ps_ProjInfo == NULL) > return slot; > > /* > * Form the result tuple using ExecProject(), and return it. > */ > econtext->ecxt_outertuple = slot; > return ExecProject(node->ps.ps_ProjInfo); > > It seems the projection will be skipped. > Is this because projection is not required in this case? > (I'm not very familiar with where the projection will be.) > For parallel inserts in CTAS, I don't think we need to project the tuples being returned from the underlying plan nodes, and also we have nothing to project from the Gather node further up. The required projection will happen while the tuples are being returned from the underlying nodes and the projected tuples are being directly fed to CTAS's dest receiver intorel_receive(), from there into the created table. We don't need ExecProject again in ExecParallelInsertInCTAS(). For instance, projection will always be done when the tuple is being returned from an underlying sequential scan node (see ExecScan() --> ExecProject()), and this is true for both leader and workers. In both leader and workers, we are just calling CTAS's dest receiver intorel_receive(). Thoughts? > > If projection is not required here, shall we add some comments here? > If the above point looks okay, I can add a comment. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
RE: Parallel Inserts in CREATE TABLE AS
Hi, I have an issue about the following code: econtext = node->ps.ps_ExprContext; ResetExprContext(econtext); + if (ISCTAS(node->ps.intoclause)) + { + ExecParallelInsertInCTAS(node); + return NULL; + } /* If no projection is required, we're done. */ if (node->ps.ps_ProjInfo == NULL) return slot; /* * Form the result tuple using ExecProject(), and return it. */ econtext->ecxt_outertuple = slot; return ExecProject(node->ps.ps_ProjInfo); It seems the projection will be skipped. Is this because projection is not required in this case? (I'm not very familiar with where the projection will be.) If projection is not required here, shall we add some comments here? Best regards, houzj
Re: Parallel Inserts in CREATE TABLE AS
On Tue, Nov 24, 2020 at 4:43 PM Hou, Zhijie wrote: > > I'm very interested in this feature, > and I'm looking at the patch, here are some comments. > Thanks for the review. > > How about the following style: > > if(TupIsNull(outerTupleSlot)) > Break; > > (void) node->ps.dest->receiveSlot(outerTupleSlot, > node->ps.dest); > node->ps.state->es_processed++; > > Which looks cleaner. > Done. > > The check can be replaced by ISCTAS(into). > Done. > > 'inerst' looks like a typo (insert). > Corrected. > > The code here call strlen(intoclausestr) for two times, > After checking the existing code in ExecInitParallelPlan, > It used to store the strlen in a variable. > > So how about the following style: > > intoclause_len = strlen(intoclausestr); > ... > /* Store serialized intoclause. */ > intoclause_space = shm_toc_allocate(pcxt->toc, intoclause_len + 1); > memcpy(shmptr, intoclausestr, intoclause_len + 1); > shm_toc_insert(pcxt->toc, PARALLEL_KEY_INTO_CLAUSE, intoclause_space); > Done. > > The two check about intoclausestr seems can be combined like: > > if (intoclausestr != NULL) > { > ... > } > else > { > ... > } > Done. Attaching v5 patch. Please consider it for further review. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com From 21fbbb8d4297e6daa7a7ec696a36327f592089bd Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Wed, 25 Nov 2020 04:44:29 +0530 Subject: [PATCH v5] Parallel Inserts in CREATE TABLE AS The idea of this patch is to allow the leader and each worker insert the tuples in parallel if the SELECT part of the CTAS is parallelizable. The design: Let the planner know that the SELECT is from CTAS in createas.c so that it can set the number of tuples transferred from the workers to Gather node to 0. With this change, there are chances that the planner may choose the parallel plan. After the planning, check if the upper plan node is Gather in createas.c and mark a parallelism flag in the CTAS dest receiver. 
Pass the into clause, object id, command id from the leader to workers, so that each worker can create its own CTAS dest receiver. Leader inserts it's share of tuples if instructed to do, and so are workers. Each worker writes atomically it's number of inserted tuples into a shared memory variable, the leader combines this with it's own number of inserted tuples and shares to the client. --- src/backend/access/heap/heapam.c | 11 - src/backend/access/transam/xact.c| 30 +- src/backend/commands/createas.c | 313 --- src/backend/commands/explain.c | 36 +++ src/backend/executor/execMain.c | 18 ++ src/backend/executor/execParallel.c | 68 +++- src/backend/executor/nodeGather.c| 99 +- src/backend/optimizer/path/costsize.c| 12 +- src/include/access/xact.h| 1 + src/include/commands/createas.h | 22 ++ src/include/executor/execParallel.h | 2 + src/include/nodes/execnodes.h| 5 + src/include/nodes/parsenodes.h | 1 + src/test/regress/expected/write_parallel.out | 143 + src/test/regress/sql/write_parallel.sql | 65 15 files changed, 688 insertions(+), 138 deletions(-) diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 1b2f70499e..3045c0f046 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -2043,17 +2043,6 @@ static HeapTuple heap_prepare_insert(Relation relation, HeapTuple tup, TransactionId xid, CommandId cid, int options) { - /* - * To allow parallel inserts, we need to ensure that they are safe to be - * performed in workers. We have the infrastructure to allow parallel - * inserts in general except for the cases where inserts generate a new - * CommandId (eg. inserts into a table having a foreign key column). 
- */ - if (IsParallelWorker()) - ereport(ERROR, -(errcode(ERRCODE_INVALID_TRANSACTION_STATE), - errmsg("cannot insert tuples in a parallel worker"))); - tup->t_data->t_infomask &= ~(HEAP_XACT_MASK); tup->t_data->t_infomask2 &= ~(HEAP2_XACT_MASK); tup->t_data->t_infomask |= HEAP_XMAX_INVALID; diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c index 9cd0b7c11b..db6eedd635 100644 --- a/src/backend/access/transam/xact.c +++ b/src/backend/access/transam/xact.c @@ -763,18 +763,34 @@ GetCurrentCommandId(bool used) /* this is global to a transaction, not subtransaction-local */ if (used) { - /* - * Forbid setting currentCommandIdUsed in a parallel worker, because - * we have no provision for communicating this back to the leader. We - * could relax this restriction when currentCommandIdUsed was already - * true at the start of the parallel operation. - */ - Assert(!Is
RE: Parallel Inserts in CREATE TABLE AS
Hi, I'm very interested in this feature, and I'm looking at the patch; here are some comments. 1. + if (!TupIsNull(outerTupleSlot)) + { + (void) node->ps.dest->receiveSlot(outerTupleSlot, node->ps.dest); + node->ps.state->es_processed++; + } + + if(TupIsNull(outerTupleSlot)) + break; + } How about the following style: if (TupIsNull(outerTupleSlot)) break; (void) node->ps.dest->receiveSlot(outerTupleSlot, node->ps.dest); node->ps.state->es_processed++; Which looks cleaner. 2. + + if (into != NULL && + IsA(into, IntoClause)) + { The check can be replaced by ISCTAS(into). 3. + /* +* For parallelizing inserts in CTAS i.e. making each +* parallel worker inerst it's tuples, we must send +* information such as intoclause(for each worker 'inerst' looks like a typo (insert). 4. + /* Estimate space for into clause for CTAS. */ + if (ISCTAS(planstate->intoclause)) + { + intoclausestr = nodeToString(planstate->intoclause); + shm_toc_estimate_chunk(&pcxt->estimator, strlen(intoclausestr) + 1); + shm_toc_estimate_keys(&pcxt->estimator, 1); + } ... + if (intoclausestr != NULL) + { + char *shmptr = (char *)shm_toc_allocate(pcxt->toc, + strlen(intoclausestr) + 1); + strcpy(shmptr, intoclausestr); + shm_toc_insert(pcxt->toc, PARALLEL_KEY_INTO_CLAUSE, shmptr); + } The code here calls strlen(intoclausestr) two times. The existing code in ExecInitParallelPlan stores the strlen in a variable instead. So how about the following style: intoclause_len = strlen(intoclausestr); ... /* Store serialized intoclause. */ intoclause_space = shm_toc_allocate(pcxt->toc, intoclause_len + 1); memcpy(intoclause_space, intoclausestr, intoclause_len + 1); shm_toc_insert(pcxt->toc, PARALLEL_KEY_INTO_CLAUSE, intoclause_space); like the code in ExecInitParallelPlan. 5. 
+ if (intoclausestr != NULL) + { + char *shmptr = (char *)shm_toc_allocate(pcxt->toc, + strlen(intoclausestr) + 1); + strcpy(shmptr, intoclausestr); + shm_toc_insert(pcxt->toc, PARALLEL_KEY_INTO_CLAUSE, shmptr); + } + /* Set up the tuple queues that the workers will write into. */ - pei->tqueue = ExecParallelSetupTupleQueues(pcxt, false); + if (intoclausestr == NULL) + pei->tqueue = ExecParallelSetupTupleQueues(pcxt, false); The two checks on intoclausestr can be combined like: if (intoclausestr != NULL) { ... } else { ... } Best regards, houzj
Re: Parallel Inserts in CREATE TABLE AS
On Mon, Oct 19, 2020 at 10:47 PM Bharath Rupireddy wrote: > > Attaching v3 patch herewith. > > I'm done with all the open points in my list. Please review the v3 patch and > provide comments. > Attaching v4 patch, rebased on the latest master 68b1a4877e. Also, added this feature to commitfest - https://commitfest.postgresql.org/31/2841/ With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com From 127309ffab82e372b338914ccc900463d8c63157 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Mon, 23 Nov 2020 16:27:45 +0530 Subject: [PATCH v4] Parallel Inserts in CREATE TABLE AS The idea of this patch is to allow the leader and each worker insert the tuples in parallel if the SELECT part of the CTAS is parallelizable. The design: Let the planner know that the SELECT is from CTAS in createas.c so that it can set the number of tuples transferred from the workers to Gather node to 0. With this change, there are chances that the planner may choose the parallel plan. After the planning, check if the upper plan node is Gather in createas.c and mark a parallelism flag in the CTAS dest receiver. Pass the into clause, object id, command id from the leader to workers, so that each worker can create its own CTAS dest receiver. The leader inserts its share of tuples if instructed to do so, and so do the workers. Each worker atomically writes its number of inserted tuples into a shared memory variable; the leader combines this with its own number of inserted tuples and shares the total with the client. 
--- src/backend/access/heap/heapam.c | 11 - src/backend/access/transam/xact.c| 30 +- src/backend/commands/createas.c | 314 --- src/backend/commands/explain.c | 36 +++ src/backend/executor/execMain.c | 19 ++ src/backend/executor/execParallel.c | 60 +++- src/backend/executor/nodeGather.c| 101 +- src/backend/optimizer/path/costsize.c| 12 +- src/include/access/xact.h| 1 + src/include/commands/createas.h | 20 ++ src/include/executor/execParallel.h | 1 + src/include/nodes/execnodes.h| 5 + src/include/nodes/parsenodes.h | 1 + src/test/regress/expected/write_parallel.out | 143 + src/test/regress/sql/write_parallel.sql | 65 15 files changed, 682 insertions(+), 137 deletions(-) diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 1b2f70499e..3045c0f046 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -2043,17 +2043,6 @@ static HeapTuple heap_prepare_insert(Relation relation, HeapTuple tup, TransactionId xid, CommandId cid, int options) { - /* - * To allow parallel inserts, we need to ensure that they are safe to be - * performed in workers. We have the infrastructure to allow parallel - * inserts in general except for the cases where inserts generate a new - * CommandId (eg. inserts into a table having a foreign key column). 
- */ - if (IsParallelWorker()) - ereport(ERROR, -(errcode(ERRCODE_INVALID_TRANSACTION_STATE), - errmsg("cannot insert tuples in a parallel worker"))); - tup->t_data->t_infomask &= ~(HEAP_XACT_MASK); tup->t_data->t_infomask2 &= ~(HEAP2_XACT_MASK); tup->t_data->t_infomask |= HEAP_XMAX_INVALID; diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c index 03c553e7ea..c85a9906f1 100644 --- a/src/backend/access/transam/xact.c +++ b/src/backend/access/transam/xact.c @@ -763,18 +763,34 @@ GetCurrentCommandId(bool used) /* this is global to a transaction, not subtransaction-local */ if (used) { - /* - * Forbid setting currentCommandIdUsed in a parallel worker, because - * we have no provision for communicating this back to the leader. We - * could relax this restriction when currentCommandIdUsed was already - * true at the start of the parallel operation. - */ - Assert(!IsParallelWorker()); + /* + * This is a temporary hack for all common parallel insert cases i.e. + * insert into, ctas, copy from. To be changed later. In a parallel + * worker, set currentCommandIdUsed to true only if it was not set to + * true at the start of the parallel operation (by way of + * SetCurrentCommandIdUsedForWorker()). We have to do this because + * GetCurrentCommandId(true) may be called from anywhere, especially + * for parallel inserts, within parallel worker. + */ + Assert(!(IsParallelWorker() && !currentCommandIdUsed)); currentCommandIdUsed = true; } return currentCommandId; } +/* + * SetCurrentCommandIdUsedForWorker + * + * For a parallel worker, record that the currentCommandId has been used. + * This must only be called at the start of a parallel operation. +*/ +void +SetCurrentCommandIdUsedForWorker(void) +{ + Assert(IsParallelWorker() && !currentCommandIdUsed && currentCommandId != InvalidCommandId); + current
Re: Parallel Inserts in CREATE TABLE AS
On Thu, Oct 15, 2020 at 3:18 PM Amit Kapila wrote: > > > > > 1. How to represent the parallel insert for CTAS in explain plans? The > > > > explain CTAS shows the plan for only the SELECT part. How about having > > > > some textual info along with the Gather node? I'm not quite sure on > > > > this point, any suggestions are welcome. > > > > > > I am also not sure about this point because we don't display anything > > > for the DDL part in explain. Can you propose by showing some example > > > of what you have in mind? > > > > I thought we could have something like this. > > - > > Gather (cost=1000.00..108738.90 rows=0 width=8) > > Workers Planned: 2 Parallel Insert on t_test1 > > -> Parallel Seq Scan on t_test (cost=0.00..106748.00 rows=4954 width=8) > > Filter: (many < 1) > > - > > maybe something like below: > Gather (cost=1000.00..108738.90 rows=0 width=8) >-> Create t_test1 >-> Parallel Seq Scan on t_test > > I don't know what is the best thing to do here. I think for the > temporary purpose you can keep something like above then once the > patch is matured then we can take a separate opinion for this. > Agreed. Here's a snapshot of explain with the change suggested.

postgres=# EXPLAIN (ANALYZE, COSTS OFF) CREATE TABLE t1_test AS SELECT * FROM t1;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Gather (actual time=970.524..972.913 rows=0 loops=1)
 *  ->  Create t1_test*
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on t1 (actual time=0.028..86.623 rows=33 loops=3)
 Planning Time: 0.049 ms
 Execution Time: 973.733 ms

> > I think there is no reason why one can't use ORDER BY in the > statements we are talking about here. But, I think we can't enable > parallelism for GatherMerge is because for that node we always need to > fetch the data in the leader backend to perform the final merge phase. > So, I was expecting a small comment saying something on those lines. > Added comments. > > 2. Addition of new test cases. > Added new test cases. 
> > Analysis on the 2 mismatches in write_parallel.sql regression test. > Done. It needed a small code change in costsize.c. Now, both make check and make check-world pass. Apart from above, a couple of other things I have finished with the v3 patch. 1. Both make check and make check-world with force_parallel_mode = regress pass. 2. I enabled parallel inserts in case of materialized views. Hope that's fine. Attaching v3 patch herewith. I'm done with all the open points in my list. Please review the v3 patch and provide comments. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com From b80460c0390317cceecd66ce9780feafd55bd5b2 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Mon, 19 Oct 2020 22:06:41 +0530 Subject: [PATCH v3] Parallel Inserts in CREATE TABLE AS The idea of this patch is to allow the leader and each worker insert the tuples in parallel if the SELECT part of the CTAS is parallelizable. The design: Let the planner know that the SELECT is from CTAS in createas.c so that it can set the number of tuples transferred from the workers to Gather node to 0. With this change, there are chances that the planner may choose the parallel plan. After the planning, check if the upper plan node is Gather in createas.c and mark a parallelism flag in the CTAS dest receiver. Pass the into clause, object id, command id from the leader to workers, so that each worker can create its own CTAS dest receiver. The leader inserts its share of tuples if instructed to do so, and so do the workers. Each worker atomically writes its number of inserted tuples into a shared memory variable; the leader combines this with its own number of inserted tuples and shares the total with the client. 
--- src/backend/access/heap/heapam.c | 11 - src/backend/access/transam/xact.c| 30 +- src/backend/commands/createas.c | 341 --- src/backend/commands/explain.c | 36 ++ src/backend/executor/execMain.c | 19 ++ src/backend/executor/execParallel.c | 60 +++- src/backend/executor/nodeGather.c| 101 +- src/backend/optimizer/path/costsize.c| 12 +- src/include/access/xact.h| 1 + src/include/commands/createas.h | 20 ++ src/include/executor/execParallel.h | 1 + src/include/nodes/execnodes.h| 5 + src/include/nodes/parsenodes.h | 1 + src/test/regress/expected/write_parallel.out | 143 src/test/regress/sql/write_parallel.sql | 65 15 files changed, 694 insertions(+), 152 deletions(-) diff --git a/src/backend/access/heap/heapam.c b/src/backend/ac
Re: Parallel Inserts in CREATE TABLE AS
On 16.10.20 08:23, Bharath Rupireddy wrote: On Fri, Oct 16, 2020 at 11:33 AM Luc Vlaming wrote: Really looking forward to this ending up in postgres as I think it's a very nice improvement. Whilst reviewing your patch I was wondering: is there a reason you did not introduce a batch insert in the destreceiver for the CTAS? For me this makes a huge difference in ingest speed as otherwise the inserts do not really scale so well as lock contention start to be a big problem. If you like I can make a patch to introduce this on top? Thanks for your interest. You are right, we can get maximum improvement if we have multi inserts in destreceiver for the CTAS on the similar lines to COPY FROM command. I specified this point in my first mail [1]. You may want to take a look at an already existing patch [2] for multi inserts, I think there are some review comments to be addressed in that patch. I would love to see the multi insert patch getting revived. [1] - https://www.postgresql.org/message-id/CALj2ACWFq6Z4_jd9RPByURB8-Y8wccQWzLf%2B0-Jg%2BKYT7ZO-Ug%40mail.gmail.com [2] - https://www.postgresql.org/message-id/CAEET0ZG31mD5SWjTYsAt0JTLReOejPvusJorZ3kGZ1%3DN1AC-Fw%40mail.gmail.com With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com Sorry had not seen that pointer in your first email. I'll first finish some other patches I'm working on and then I'll try to revive that patch. Thanks for the pointers. Kind regards, Luc Swarm64
Re: Parallel Inserts in CREATE TABLE AS
On Fri, Oct 16, 2020 at 11:33 AM Luc Vlaming wrote: > > Really looking forward to this ending up in postgres as I think it's a > very nice improvement. > > Whilst reviewing your patch I was wondering: is there a reason you did > not introduce a batch insert in the destreceiver for the CTAS? For me > this makes a huge difference in ingest speed as otherwise the inserts do > not really scale so well as lock contention start to be a big problem. > If you like I can make a patch to introduce this on top? > Thanks for your interest. You are right, we can get maximum improvement if we have multi inserts in destreceiver for the CTAS on the similar lines to COPY FROM command. I specified this point in my first mail [1]. You may want to take a look at an already existing patch [2] for multi inserts, I think there are some review comments to be addressed in that patch. I would love to see the multi insert patch getting revived. [1] - https://www.postgresql.org/message-id/CALj2ACWFq6Z4_jd9RPByURB8-Y8wccQWzLf%2B0-Jg%2BKYT7ZO-Ug%40mail.gmail.com [2] - https://www.postgresql.org/message-id/CAEET0ZG31mD5SWjTYsAt0JTLReOejPvusJorZ3kGZ1%3DN1AC-Fw%40mail.gmail.com With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
Re: Parallel Inserts in CREATE TABLE AS
On 14.10.20 11:16, Bharath Rupireddy wrote: On Tue, Oct 6, 2020 at 10:58 AM Amit Kapila wrote: Yes we do a bunch of catalog changes related to the created new table. We will have both the txn id and command id assigned when catalogue changes are being made. But, right after the table is created in the leader, the command id is incremented (CommandCounterIncrement() is called from create_ctas_internal()) whereas the txn id remains the same. The new command id is marked as GetCurrentCommandId(true); in intorel_startup, then the parallel mode is entered. The txn id and command id are serialized into parallel DSM, they are then available to all parallel workers. This is discussed in [1]. Few changes I have to make in the parallel worker code: set currentCommandIdUsed = true;, may be via a common API SetCurrentCommandIdUsedForWorker() proposed in [1] and remove the extra command id sharing from the leader to workers. I will add a few comments in the upcoming patch related to the above info. Yes, that would be good. Added comments. But how does that work for SELECT INTO? Are you prohibiting that? ... In case of SELECT INTO, a new table gets created and I'm not prohibiting the parallel inserts and I think we don't need to. So, in this case, also do we ensure that table is created before we launch the workers. If so, I think you can explain in comments about it and what you need to do that to ensure the same. For SELECT INTO, the table gets created by the leader in create_ctas_internal(), then ExecInitParallelPlan() gets called which launches the workers and then the leader(if asked to do so) and the workers insert the rows. So, we don't need to do any extra work to ensure the table gets created before the workers start inserting tuples. While skimming through the patch, a small thing I noticed: + /* + * SELECT part of the CTAS is parallelizable, so we can make + * each parallel worker insert the tuples that are resulted + * in it's execution into the target table. 
+ */ + if (!is_matview && + IsA(plan->planTree, Gather)) + ((DR_intorel *) dest)->is_parallel = true; + I am not sure at this stage if this is the best way to make CTAS as parallel but if so, then probably you can expand the comments a bit to say why you consider only Gather node (and that too when it is the top-most node) and why not another parallel node like GatherMerge? If somebody expects to preserve the order of the tuples that are coming from GatherMerge node of the select part in CTAS or SELECT INTO while inserting, now if parallelism is allowed, that may not be the case i.e. the order of insertion of tuples may vary. I'm not quite sure, if someone wants to use order by in the select parts of CTAS or SELECT INTO in a real world use case. Thoughts? Right, for now, I think you can simply remove that check from the code instead of just commenting it. We will see if there is a better check/Assert we can add there. Done. I also worked on some of the open points I listed earlier in my mail. 3. Need to restrict parallel inserts, if CTAS tries to create temp/global tables as the workers will not have access to those tables. Done. Need to analyze whether to allow parallelism if CTAS has prepared statements or with no data. For prepared statements, the parallelism will not be picked and so is parallel insertion. For CTAS with no data option case the select part is not even planned, and so the parallelism will also not be picked. 4. Need to stop unnecessary parallel shared state such as tuple queue being created and shared to workers. Done. I'm listing the things that are still pending. 1. How to represent the parallel insert for CTAS in explain plans? The explain CTAS shows the plan for only the SELECT part. How about having some textual info along with the Gather node? I'm not quite sure on this point, any suggestions are welcome. 2. Addition of new test cases. Testing with more scenarios and different data sets, sizes, tablespaces, select into. 
Analysis on the 2 mismatches in write_parallel.sql regression test. Attaching v2 patch, thoughts and comments are welcome. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com Hi, Really looking forward to this ending up in postgres as I think it's a very nice improvement. Whilst reviewing your patch I was wondering: is there a reason you did not introduce a batch insert in the destreceiver for the CTAS? For me this makes a huge difference in ingest speed as otherwise the inserts do not really scale so well as lock contention start to be a big problem. If you like I can make a patch to introduce this on top? Kind regards, Luc Swarm64
Re: Parallel Inserts in CREATE TABLE AS
On Thu, Oct 15, 2020 at 9:14 AM Bharath Rupireddy wrote: > > On Wed, Oct 14, 2020 at 6:16 PM Amit Kapila wrote: > > > > > For prepared statements, the parallelism will not be picked and so is > > > parallel insertion. > > > > Hmm, I am not sure what makes you say this statement. The parallelism > > is enabled for prepared statements since commit 57a6a72b6b. > > > > Thanks for letting me know this. I misunderstood the parallelism for prepared > statements. Now, I verified with a proper use case(see below), where I had a > prepared statement, CTAS having EXECUTE, in this case too parallelism is > picked and parallel insertion happened with the patch proposed in this > thread. Do we have any problems if we allow parallel insertion for these > cases? > > PREPARE myselect AS SELECT * FROM t1; > EXPLAIN ANALYZE CREATE TABLE t1_test AS EXECUTE myselect; > > I think the commit 57a6a72b6b has not added any test cases, isn't it good to > add one in prepare.sql or select_parallel.sql? > I am not sure if it is worth as this is not functionality which is too complex or there are many chances of getting it broken. > > > > > 1. How to represent the parallel insert for CTAS in explain plans? The > > > explain CTAS shows the plan for only the SELECT part. How about having > > > some textual info along with the Gather node? I'm not quite sure on > > > this point, any suggestions are welcome. > > > > I am also not sure about this point because we don't display anything > > for the DDL part in explain. Can you propose by showing some example > > of what you have in mind? > > > > I thought we could have something like this. 
> -
> Gather (cost=1000.00..108738.90 rows=0 width=8)
>   Workers Planned: 2
>   Parallel Insert on t_test1
>   ->  Parallel Seq Scan on t_test (cost=0.00..106748.00 rows=4954 width=8)
>         Filter: (many < 1)
> -
>

maybe something like below:

Gather (cost=1000.00..108738.90 rows=0 width=8)
  ->  Create t_test1
        ->  Parallel Seq Scan on t_test

I don't know what is the best thing to do here. I think, for the time being, you can keep something like the above; once the patch has matured, we can take a separate opinion on this.

--
With Regards,
Amit Kapila.
Re: Parallel Inserts in CREATE TABLE AS
On Wed, Oct 14, 2020 at 6:16 PM Amit Kapila wrote: > > > If somebody expects to preserve the order of the tuples that are > > coming from GatherMerge node of the select part in CTAS or SELECT INTO > > while inserting, now if parallelism is allowed, that may not be the > > case i.e. the order of insertion of tuples may vary. I'm not quite > > sure, if someone wants to use order by in the select parts of CTAS or > > SELECT INTO in a real world use case. Thoughts? > > > > I think there is no reason why one can't use ORDER BY in the > statements we are talking about here. But, I think we can't enable > parallelism for GatherMerge is because for that node we always need to > fetch the data in the leader backend to perform the final merge phase. > So, I was expecting a small comment saying something on those lines. > Sure, I will add comments in the upcoming patch. > > > For prepared statements, the parallelism will not be picked and so is > > parallel insertion. > > Hmm, I am not sure what makes you say this statement. The parallelism > is enabled for prepared statements since commit 57a6a72b6b. > Thanks for letting me know this. I misunderstood the parallelism for prepared statements. Now, I verified with a proper use case(see below), where I had a prepared statement, CTAS having EXECUTE, in this case too parallelism is picked and parallel insertion happened with the patch proposed in this thread. Do we have any problems if we allow parallel insertion for these cases? PREPARE myselect AS SELECT * FROM t1; EXPLAIN ANALYZE CREATE TABLE t1_test AS EXECUTE myselect; I think the commit 57a6a72b6b has not added any test cases, isn't it good to add one in prepare.sql or select_parallel.sql? > > > 1. How to represent the parallel insert for CTAS in explain plans? The > > explain CTAS shows the plan for only the SELECT part. How about having > > some textual info along with the Gather node? I'm not quite sure on > > this point, any suggestions are welcome. 
>
> I am also not sure about this point because we don't display anything
> for the DDL part in explain. Can you propose by showing some example
> of what you have in mind?
>

I thought we could have something like this.

-
Gather (cost=1000.00..108738.90 rows=0 width=8)
  Workers Planned: 2
  *Parallel Insert on t_test1*
  ->  Parallel Seq Scan on t_test (cost=0.00..106748.00 rows=4954 width=8)
        Filter: (many < 1)
-

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
Re: Parallel Inserts in CREATE TABLE AS
On Wed, Oct 14, 2020 at 2:46 PM Bharath Rupireddy wrote: > > On Tue, Oct 6, 2020 at 10:58 AM Amit Kapila wrote: > > > > > > While skimming through the patch, a small thing I noticed: > > + /* > > + * SELECT part of the CTAS is parallelizable, so we can make > > + * each parallel worker insert the tuples that are resulted > > + * in it's execution into the target table. > > + */ > > + if (!is_matview && > > + IsA(plan->planTree, Gather)) > > + ((DR_intorel *) dest)->is_parallel = true; > > + > > > > I am not sure at this stage if this is the best way to make CTAS as > > parallel but if so, then probably you can expand the comments a bit to > > say why you consider only Gather node (and that too when it is the > > top-most node) and why not another parallel node like GatherMerge? > > > > If somebody expects to preserve the order of the tuples that are > coming from GatherMerge node of the select part in CTAS or SELECT INTO > while inserting, now if parallelism is allowed, that may not be the > case i.e. the order of insertion of tuples may vary. I'm not quite > sure, if someone wants to use order by in the select parts of CTAS or > SELECT INTO in a real world use case. Thoughts? > I think there is no reason why one can't use ORDER BY in the statements we are talking about here. But, I think we can't enable parallelism for GatherMerge is because for that node we always need to fetch the data in the leader backend to perform the final merge phase. So, I was expecting a small comment saying something on those lines. > > > > > Need to analyze whether to allow parallelism if CTAS has prepared > > statements or with no data. > > > > For prepared statements, the parallelism will not be picked and so is > parallel insertion. > Hmm, I am not sure what makes you say this statement. The parallelism is enabled for prepared statements since commit 57a6a72b6b. > > I'm listing the things that are still pending. > > 1. How to represent the parallel insert for CTAS in explain plans? 
The > explain CTAS shows the plan for only the SELECT part. How about having > some textual info along with the Gather node? I'm not quite sure on > this point, any suggestions are welcome. > I am also not sure about this point because we don't display anything for the DDL part in explain. Can you propose by showing some example of what you have in mind? -- With Regards, Amit Kapila.
Re: Parallel Inserts in CREATE TABLE AS
On Tue, Oct 6, 2020 at 10:58 AM Amit Kapila wrote: > > > Yes we do a bunch of catalog changes related to the created new table. > > We will have both the txn id and command id assigned when catalogue > > changes are being made. But, right after the table is created in the > > leader, the command id is incremented (CommandCounterIncrement() is > > called from create_ctas_internal()) whereas the txn id remains the > > same. The new command id is marked as GetCurrentCommandId(true); in > > intorel_startup, then the parallel mode is entered. The txn id and > > command id are serialized into parallel DSM, they are then available > > to all parallel workers. This is discussed in [1]. > > > > Few changes I have to make in the parallel worker code: set > > currentCommandIdUsed = true;, may be via a common API > > SetCurrentCommandIdUsedForWorker() proposed in [1] and remove the > > extra command id sharing from the leader to workers. > > > > I will add a few comments in the upcoming patch related to the above info. > > > > Yes, that would be good. > Added comments. > > > > But how does that work for SELECT INTO? Are you prohibiting > > > that? ... > > > > > > > In case of SELECT INTO, a new table gets created and I'm not > > prohibiting the parallel inserts and I think we don't need to. > > > > So, in this case, also do we ensure that table is created before we > launch the workers. If so, I think you can explain in comments about > it and what you need to do that to ensure the same. > For SELECT INTO, the table gets created by the leader in create_ctas_internal(), then ExecInitParallelPlan() gets called which launches the workers and then the leader(if asked to do so) and the workers insert the rows. So, we don't need to do any extra work to ensure the table gets created before the workers start inserting tuples. 
> > While skimming through the patch, a small thing I noticed: > + /* > + * SELECT part of the CTAS is parallelizable, so we can make > + * each parallel worker insert the tuples that are resulted > + * in it's execution into the target table. > + */ > + if (!is_matview && > + IsA(plan->planTree, Gather)) > + ((DR_intorel *) dest)->is_parallel = true; > + > > I am not sure at this stage if this is the best way to make CTAS as > parallel but if so, then probably you can expand the comments a bit to > say why you consider only Gather node (and that too when it is the > top-most node) and why not another parallel node like GatherMerge? > If somebody expects to preserve the order of the tuples that are coming from GatherMerge node of the select part in CTAS or SELECT INTO while inserting, now if parallelism is allowed, that may not be the case i.e. the order of insertion of tuples may vary. I'm not quite sure, if someone wants to use order by in the select parts of CTAS or SELECT INTO in a real world use case. Thoughts? > > Right, for now, I think you can simply remove that check from the code > instead of just commenting it. We will see if there is a better > check/Assert we can add there. > Done. I also worked on some of the open points I listed earlier in my mail. > > 3. Need to restrict parallel inserts, if CTAS tries to create temp/global > tables as the workers will not have access to those tables. > Done. > > Need to analyze whether to allow parallelism if CTAS has prepared statements > or with no data. > For prepared statements, the parallelism will not be picked and so is parallel insertion. For CTAS with no data option case the select part is not even planned, and so the parallelism will also not be picked. > > 4. Need to stop unnecessary parallel shared state such as tuple queue being > created and shared to workers. > Done. I'm listing the things that are still pending. 1. How to represent the parallel insert for CTAS in explain plans? 
The explain CTAS shows the plan for only the SELECT part. How about having some textual info along with the Gather node? I'm not quite sure on this point, any suggestions are welcome. 2. Addition of new test cases. Testing with more scenarios and different data sets, sizes, tablespaces, select into. Analysis on the 2 mismatches in write_parallel.sql regression test. Attaching v2 patch, thoughts and comments are welcome. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com v2-0001-Parallel-Inserts-in-CREATE-TABLE-AS.patch Description: Binary data
Re: Parallel Inserts in CREATE TABLE AS
On Mon, Sep 28, 2020 at 3:58 PM Bharath Rupireddy wrote: > > Thanks Andres for the comments. > > On Thu, Sep 24, 2020 at 8:11 AM Andres Freund wrote: > > > > > The design: > > > > I think it'd be good if you could explain a bit more why you think this > > safe to do in the way you have done it. > > > > E.g. from a quick scroll through the patch, there's not even a comment > > explaining that the only reason there doesn't need to be code dealing > > with xid assignment because we already did the catalog changes to create > > the table. > > > > Yes we do a bunch of catalog changes related to the created new table. > We will have both the txn id and command id assigned when catalogue > changes are being made. But, right after the table is created in the > leader, the command id is incremented (CommandCounterIncrement() is > called from create_ctas_internal()) whereas the txn id remains the > same. The new command id is marked as GetCurrentCommandId(true); in > intorel_startup, then the parallel mode is entered. The txn id and > command id are serialized into parallel DSM, they are then available > to all parallel workers. This is discussed in [1]. > > Few changes I have to make in the parallel worker code: set > currentCommandIdUsed = true;, may be via a common API > SetCurrentCommandIdUsedForWorker() proposed in [1] and remove the > extra command id sharing from the leader to workers. > > I will add a few comments in the upcoming patch related to the above info. > Yes, that would be good. > > > > But how does that work for SELECT INTO? Are you prohibiting > > that? ... > > > > In case of SELECT INTO, a new table gets created and I'm not > prohibiting the parallel inserts and I think we don't need to. > So, in this case, also do we ensure that table is created before we launch the workers. If so, I think you can explain in comments about it and what you need to do that to ensure the same. 
While skimming through the patch, a small thing I noticed:

+ /*
+  * SELECT part of the CTAS is parallelizable, so we can make
+  * each parallel worker insert the tuples that are resulted
+  * in it's execution into the target table.
+  */
+ if (!is_matview &&
+     IsA(plan->planTree, Gather))
+     ((DR_intorel *) dest)->is_parallel = true;
+

I am not sure at this stage if this is the best way to make CTAS parallel, but if so, then probably you can expand the comments a bit to say why you consider only the Gather node (and that too only when it is the top-most node) and why not another parallel node like GatherMerge?

> Thoughts?
>
> > > Below things are still pending. Thoughts are most welcome:
> > > 1. How better we can lift the "cannot insert tuples in a parallel worker"
> > > from heap_prepare_insert() for only CTAS cases or for that matter parallel
> > > copy? How about having a variable in any of the worker global contexts and
> > > use that? Of course, we can remove this restriction entirely in case we
> > > fully allow parallelism for INSERT INTO SELECT, CTAS, and COPY.
> >
> > And for the purpose of your question, we could then have a
> > table_insert_allow_parallel(TableInsertScan *);
> > or an additional arg to table_begin_insert().
>
> Removing "cannot insert tuples in a parallel worker" restriction from
> heap_prepare_insert() is a common problem for parallel inserts in
> general, i.e. parallel inserts in CTAS, parallel INSERT INTO
> SELECTs[1] and parallel copy[2]. It will be good if a common solution
> is agreed.
>

Right, for now, I think you can simply remove that check from the code instead of just commenting it out. We will see if there is a better check/Assert we can add there.

--
With Regards,
Amit Kapila.
Re: Parallel Inserts in CREATE TABLE AS
Thanks Andres for the comments. On Thu, Sep 24, 2020 at 8:11 AM Andres Freund wrote: > > > The design: > > I think it'd be good if you could explain a bit more why you think this > safe to do in the way you have done it. > > E.g. from a quick scroll through the patch, there's not even a comment > explaining that the only reason there doesn't need to be code dealing > with xid assignment because we already did the catalog changes to create > the table. > Yes we do a bunch of catalog changes related to the created new table. We will have both the txn id and command id assigned when catalogue changes are being made. But, right after the table is created in the leader, the command id is incremented (CommandCounterIncrement() is called from create_ctas_internal()) whereas the txn id remains the same. The new command id is marked as GetCurrentCommandId(true); in intorel_startup, then the parallel mode is entered. The txn id and command id are serialized into parallel DSM, they are then available to all parallel workers. This is discussed in [1]. Few changes I have to make in the parallel worker code: set currentCommandIdUsed = true;, may be via a common API SetCurrentCommandIdUsedForWorker() proposed in [1] and remove the extra command id sharing from the leader to workers. I will add a few comments in the upcoming patch related to the above info. > > But how does that work for SELECT INTO? Are you prohibiting > that? ... > In case of SELECT INTO, a new table gets created and I'm not prohibiting the parallel inserts and I think we don't need to. Thoughts? > > > Below things are still pending. Thoughts are most welcome: > > 1. How better we can lift the "cannot insert tuples in a parallel worker" > > from heap_prepare_insert() for only CTAS cases or for that matter parallel > > copy? How about having a variable in any of the worker global contexts and > > use that? 
> > Of course, we can remove this restriction entirely in case we
> > fully allow parallelism for INSERT INTO SELECT, CTAS, and COPY.
>
> And for the purpose of your question, we could then have a
> table_insert_allow_parallel(TableInsertScan *);
> or an additional arg to table_begin_insert().
>

Removing the "cannot insert tuples in a parallel worker" restriction from heap_prepare_insert() is a common problem for parallel inserts in general, i.e. parallel inserts in CTAS, parallel INSERT INTO SELECTs[1] and parallel copy[2]. It will be good if a common solution is agreed upon.

> > 3. Need to restrict parallel inserts, if CTAS tries to create temp/global
> > tables as the workers will not have access to those tables. Need to analyze
> > whether to allow parallelism if CTAS has prepared statements or with no
> > data.
>
> In which case does CTAS not create a table?
>

AFAICS, the table gets created in all the cases, but the insertion of the data gets skipped if the user specifies the "with no data" option, in which case the select part is not even planned, and so the parallelism will also not be picked.

> You definitely need to
> ensure that the table is created before your workers are started, and
> there needs to be in a different CommandId.
>

Yeah, this is already being done. The table gets created in the leader (in intorel_startup(), which gets called via dest->rStartup() in standard_ExecutorRun()) before entering the parallel mode.

[1] https://www.postgresql.org/message-id/CAJcOf-fn1nhEtaU91NvRuA3EbvbJGACMd4_c%2BUu3XU5VMv37Aw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1%2BkpddvvLxWm4BuG_AhVvYz8mKAEa7osxp_X0d4ZEiV%3Dg%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
Re: Parallel Inserts in CREATE TABLE AS
Hi,

On 2020-09-23 17:20:20 +0530, Bharath Rupireddy wrote:
> The idea of this patch is to allow the leader and each worker insert the
> tuples in parallel if the SELECT part of the CTAS is parallelizable.

Cool!

> The design:

I think it'd be good if you could explain a bit more why you think this is safe to do in the way you have done it.

E.g. from a quick scroll through the patch, there's not even a comment explaining that the only reason there doesn't need to be code dealing with xid assignment is that we already did the catalog changes to create the table.

But how does that work for SELECT INTO? Are you prohibiting that? ...

> Pass the into clause, object id, command id from the leader to
> workers, so that each worker can create its own CTAS dest
> receiver. The leader inserts its share of tuples if instructed to do so, and
> so do the workers. Each worker atomically writes its number of inserted
> tuples into a shared memory variable; the leader combines this with
> its own number of inserted tuples and reports the total to the client.
>
> Below things are still pending. Thoughts are most welcome:
> 1. How better we can lift the "cannot insert tuples in a parallel worker"
> from heap_prepare_insert() for only CTAS cases or for that matter parallel
> copy? How about having a variable in any of the worker global contexts and
> use that? Of course, we can remove this restriction entirely in case we
> fully allow parallelism for INSERT INTO SELECT, CTAS, and COPY.

I have mentioned before that I think it'd be good if we changed the insert APIs to have a more 'scan' like structure. I am thinking of something like

TableInsertScan* table_begin_insert(Relation);
table_tuple_insert(TableInsertScan *is, other, args);
table_multi_insert(TableInsertScan *is, other, args);
table_end_insert(TableInsertScan *);

that'd then replace the BulkInsertStateData logic we have right now.
But more importantly it'd allow an AM to optimize operations across multiple inserts, which is important for column stores.

And for the purpose of your question, we could then have a

table_insert_allow_parallel(TableInsertScan *);

or an additional arg to table_begin_insert().

> 3. Need to restrict parallel inserts, if CTAS tries to create temp/global
> tables as the workers will not have access to those tables. Need to analyze
> whether to allow parallelism if CTAS has prepared statements or with no
> data.

In which case does CTAS not create a table? You definitely need to ensure that the table is created before your workers are started, and it needs to be in a different CommandId.

Greetings,

Andres Freund
Re: Parallel Inserts in CREATE TABLE AS
>
> [1] - For table_multi_insert() in CTAS, I used an in-progress patch available at
> https://www.postgresql.org/message-id/CAEET0ZG31mD5SWjTYsAt0JTLReOejPvusJorZ3kGZ1%3DN1AC-Fw%40mail.gmail.com
> [2] - Table with 2 integer columns, 100 million tuples, with leader
> participation, with default postgresql.conf file. All readings are of triplet
> form - (workers, exec time in sec, improvement).
> case 1: no multi inserts -
> (0,120,1X),(1,91,1.32X),(2,75,1.6X),(3,67,1.79X),(4,72,1.66X),(5,77,1.56X),(6,83,1.44X)
> case 2: with multi inserts -
> (0,59,1X),(1,32,1.84X),(2,28,2.1X),(3,25,2.36X),(4,23,2.56X),(5,22,2.68X),(6,22,2.68X)
> case 3: same table but unlogged with multi inserts -
> (0,50,1X),(1,28,1.78X),(2,25,2X),(3,22,2.27X),(4,21,2.38X),(5,21,2.38X),(6,20,2.5X)
>

I feel this enhancement could give a good improvement, +1 for this.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com