On 25-11-2020 03:40, Bharath Rupireddy wrote:
On Tue, Nov 24, 2020 at 4:43 PM Hou, Zhijie <houzj.f...@cn.fujitsu.com> wrote:

I'm very interested in this feature,
and I'm looking at the patch, here are some comments.


Thanks for the review.


How about the following style:

                 if(TupIsNull(outerTupleSlot))
                         Break;

                 (void) node->ps.dest->receiveSlot(outerTupleSlot, 
node->ps.dest);
                 node->ps.state->es_processed++;

Which looks cleaner.


Done.


The check can be replaced by ISCTAS(into).


Done.


'inerst' looks like a typo (insert).


Corrected.


The code here call strlen(intoclausestr) for two times,
After checking the existing code in ExecInitParallelPlan,
It used to store the strlen in a variable.

So how about the following style:

         intoclause_len = strlen(intoclausestr);
         ...
         /* Store serialized intoclause. */
         intoclause_space = shm_toc_allocate(pcxt->toc, intoclause_len + 1);
         memcpy(shmptr, intoclausestr, intoclause_len + 1);
         shm_toc_insert(pcxt->toc, PARALLEL_KEY_INTO_CLAUSE, intoclause_space);


Done.


The two check about intoclausestr seems can be combined like:

if (intoclausestr != NULL)
{
...
}
else
{
...
}


Done.

Attaching v5 patch. Please consider it for further review.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


Disclaimer: I have by no means throughly reviewed all the involved parts and am probably missing quite a bit of context so if I understood parts wrong or they have been discussed before then I'm sorry. Most notably the whole situation about the command-id is still elusive for me and I can really not judge yet anything related to that.

IMHO The patch makes that we now have the gather do most of the CTAS work, which seems unwanted. For the non-ctas insert/update case it seems that a modifytable node exists to actually do the work. What I'm wondering is if it is maybe not better to introduce a CreateTable node as well?
This would have several merits:
- the rowcount of that node would be 0 for the parallel case, and non-zero for the serial case. Then the gather ndoe and the Query struct don't have to know about CTAS for the most part, removing e.g. the case distinctions in cost_gather. - the inserted rows can now be accounted in this new node instead of the parallel executor state, and this node can also do its own DSM intializations - the generation of a partial variants of the CreateTable node can now be done in the optimizer instead of the ExecCreateTableAs which IMHO is a more logical place to make these kind of decisions. which then also makes it potentially play nicer with costs and the like. - the explain code can now be in its own place instead of part of the gather node - IIUC it would allow the removal of the code to only launch parallel workers if its not CTAS, which IMHO would be quite a big benefit.

Thoughts?

Some small things I noticed while going through the patch:
- Typo for the comment about "inintorel_startup" which should be intorel_startup - if (node->nworkers_launched == 0 && !node->need_to_scan_locally)
  can be changed into
  if (node->nworkers_launched == 0
  because either way it'll be true.

Regards,
Luc
Swarm64


Reply via email to