Hello everyone in this thread!

On 29-11-2017 8:01, Michael Paquier wrote:
Moved to next CF for extra reviews.

Amit, I would like to ask some questions about your patch (and can you please rebase it on the top of the master?):

1)
+ path->total_cost -= (target->cost.per_tuple - oldcost.per_tuple) * path->rows;

Here we undo the changes that we make in this function earlier:

path->total_cost += target->cost.startup - oldcost.startup +
        (target->cost.per_tuple - oldcost.per_tuple) * path->rows;

Perhaps we should not start these "reversible" changes in this case from the very beginning?

2)
                gpath->subpath = (Path *)
                        create_projection_path(root,
                                                                   
gpath->subpath->parent,
                                                                   
gpath->subpath,
                                                                   target);
...
+               if (((ProjectionPath *) gpath->subpath)->dummypp)
+               {
...
+ path->total_cost += (target->cost.per_tuple - oldcost.per_tuple) * gpath->subpath->rows;
+               }
+               else
+               {
...
+ path->total_cost += (cpu_tuple_cost + target->cost.per_tuple) * gpath->subpath->rows;
+               }

As I understand it, here in the if-else block we change the run costs of gpath in the same way as they were changed for its subpath in the function create_projection_path earlier. But for the startup costs we always subtract the cost of the old target:

path->startup_cost += target->cost.startup - oldcost.startup;
path->total_cost += target->cost.startup - oldcost.startup +
        (target->cost.per_tuple - oldcost.per_tuple) * path->rows;

Should we change the startup costs of gpath in this way if ((ProjectionPath *) gpath->subpath)->dummypp is false?

3)
        simple_gather_path = (Path *)
                create_gather_path(root, rel, cheapest_partial_path, 
rel->reltarget,
                                                   NULL, NULL);
+       /* Add projection step if needed */
+       if (target && simple_gather_path->pathtarget != target)
+ simple_gather_path = apply_projection_to_path(root, rel, simple_gather_path, target);
...
+ path = (Path *) create_gather_merge_path(root, rel, subpath, rel->reltarget,
+                                                                                  
              subpath->pathkeys, NULL, NULL);
+               /* Add projection step if needed */
+               if (target && path->pathtarget != target)
+                       path = apply_projection_to_path(root, rel, path, 
target);
...
@@ -2422,16 +2422,27 @@ apply_projection_to_path(PlannerInfo *root,
...
                gpath->subpath = (Path *)
                        create_projection_path(root,
                                                                   
gpath->subpath->parent,
                                                                   
gpath->subpath,
                                                                   target);

The target is changing so we change it for the gather(merge) node and for its subpath. Do not we have to do this work (replace the subpath by calling the function create_projection_path if the target is different) in the functions create_gather(_merge)_path too? I suppose that the target of the subpath affects its costs => the costs of the gather(merge) node in the functions cost_gather(_merge) (=> the costs of the gather(merge) node in the function apply_projection_to_path).

4)
+                * Consider ways to implement parallel paths.  We always skip
+                * generating parallel path for top level scan/join nodes till 
the
+                * pathtarget is computed.  This is to ensure that we can 
account for
+                * the fact that most of the target evaluation work will be 
performed
+                * in workers.
+                */
+               generate_gather_paths(root, current_rel, scanjoin_target);
+
+               /* Set or update cheapest_total_path and related fields */
+               set_cheapest(current_rel);

As I understand it (if correctly, thank the comments of Robert Haas and Tom Lane :), after that we cannot use the function apply_projection_to_path for paths in the current_rel->pathlist without risking that the cheapest path will change. And we have several calls to the function adjust_paths_for_srfs (which uses apply_projection_to_path for paths in the current_rel->pathlist) in grouping_planner after generating the gather paths:

                /* Now fix things up if scan/join target contains SRFs */
                if (parse->hasTargetSRFs)
                        adjust_paths_for_srfs(root, current_rel,
                                                                  
scanjoin_targets,
                                                                  
scanjoin_targets_contain_srfs);
...
                        /* Fix things up if grouping_target contains SRFs */
                        if (parse->hasTargetSRFs)
                                adjust_paths_for_srfs(root, current_rel,
                                                                          
grouping_targets,
                                                                          
grouping_targets_contain_srfs);
...
                        /* Fix things up if sort_input_target contains SRFs */
                        if (parse->hasTargetSRFs)
                                adjust_paths_for_srfs(root, current_rel,
                                                                          
sort_input_targets,
                                                                          
sort_input_targets_contain_srfs);
...
                /* Fix things up if final_target contains SRFs */
                if (parse->hasTargetSRFs)
                        adjust_paths_for_srfs(root, current_rel,
                                                                  final_targets,
                                                                  
final_targets_contain_srfs);

Maybe we should add the appropriate call to the function set_cheapest() after this if parse->hasTargetSRFs is true?

5)
+ * If this is a baserel and not the top-level rel, consider gathering any
+        * partial paths we may have created for it.  (If we tried to gather
+        * inheritance children, we could end up with a very large number of
+ * gather nodes, each trying to grab its own pool of workers, so don't do + * this for otherrels. Instead, we'll consider gathering partial paths
+        * for the parent appendrel.).  We can check for joins by counting the
+ * membership of all_baserels (note that this correctly counts inheritance + * trees as single rels). The gather path for top-level rel is generated
+        * once path target is available.  See grouping_planner.
         */
-       if (rel->reloptkind == RELOPT_BASEREL)
-               generate_gather_paths(root, rel);
+       if (rel->reloptkind == RELOPT_BASEREL &&
+               bms_membership(root->all_baserels) != BMS_SINGLETON)
+               generate_gather_paths(root, rel, NULL);

Do I understand correctly that here there's an assumption about root->query_level (which we don't need to check)?

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Reply via email to