On Wed, Feb 14, 2018 at 5:37 AM, Amit Kapila <amit.kapil...@gmail.com> wrote: > Your concern is valid, but isn't the same problem exists in another > approach as well, because in that also we can call > adjust_paths_for_srfs after generating gather path which means that we > might lose some opportunity to reduce the relative cost of parallel > paths due to tlists having SRFs. Also, a similar problem can happen > in create_order_paths for the cases as described in the example > above.
You're right. I think cleaning all of this up for v11 is too much to consider, but please tell me your opinion of the attached patch set. Here, instead of the ripping the problematic logic out of apply_projection_to_path, what I've done is just removed several of the callers to apply_projection_to_path. I think that the work of removing other callers to that function could be postponed to a future release, but we'd still get some benefit now, and this shows the direction I have in mind. I'm going to explain what the patches do one by one, but out of order, because I backed into the need for the earlier patches as a result of troubleshooting the later ones in the series. Note that the patches need to be applied in order even though I'm explaining them out of order. 0003 introduces a new upper relation to represent the result of applying the scan/join target to the topmost scan/join relation. I'll explain further down why this seems to be needed. Since each path must belong to only one relation, this involves using create_projection_path() for the non-partial pathlist as we already do for the partial pathlist, rather than apply_projection_to_path(). This is probably marginally more expensive but I'm hoping it doesn't matter. (However, I have not tested.) Each non-partial path in the topmost scan/join rel produces a corresponding path in the new upper rel. The same is done for partial paths if the scan/join target is parallel-safe; otherwise we can't. 0004 causes the main part of the planner to skip calling generate_gather_paths() from the topmost scan/join rel. This logic is mostly stolen from your patch. If the scan/join target is NOT parallel-safe, we perform generate_gather_paths() on the topmost scan/join rel. If the scan/join target IS parallel-safe, we do it on the post-projection rel introduced by 0003 instead. This is the patch that actually fixes Jeff's original complaint. 0005 goes a bit further and removes a bunch of logic from create_ordered_paths(). The removed logic tries to satisfy the query's ordering requirement via cheapest_partial_path + Sort + Gather Merge. Instead, it adds an additional path to the new upper rel added in 0003. This again avoids a call to apply_projection_to_path() which could cause projection to be pushed down after costing has already been fixed. Therefore, it gains the same advantages as 0004 for queries that would this sort of plan. Currently, this loses the ability to set limit_tuples for the Sort path; that doesn't look too hard to fix but I haven't done it. If we decide to proceed with this overall approach I'll see about getting it sorted out. Unfortunately, when I initially tried this approach, a number of things broke due to the fact that create_projection_path() was not exactly equivalent to apply_projection_to_path(). This initially surprised me, because create_projection_plan() contains logic to eliminate the Result node that is very similar to the logic in apply_projection_to_path(). If apply_projection_path() sees that the subordinate node is projection-capable, it applies the revised target list to the child; if not, it adds a Result node. create_projection_plan() does the same thing. However, create_projection_plan() can lose the physical tlist optimization for the subordinate node; it forces an exact projection even if the parent node doesn't require this. 0001 fixes this, so that we get the same plans with create_projection_path() that we would with apply_projection_to_path(). I think this patch is a good idea regardless of what we decide to do about the rest of this, because it can potentially avoid losing the physical-tlist optimization in any place where create_projection_path() is used. It turns out that 0001 doesn't manage to preserve the physical-tlist optimization when the projection path is attached to an upper relation. 0002 fixes this. If we decide to go forward with this approach, it may makes sense to merge some of these when actually committing, but I found it useful to separate them for development and testing purposes, and for clarity about what was getting changed at each stage. Please let me know your thoughts. Thanks, -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
0005-Remove-explicit-path-construction-logic-in-create_or.patch
Description: Binary data
0004-Postpone-generate_gather_paths-for-topmost-scan-join.patch
Description: Binary data
0003-Add-new-upper-rel-to-represent-projecting-toplevel-s.patch
Description: Binary data
0002-Adjust-use_physical_tlist-to-work-on-upper-rels.patch
Description: Binary data
0001-Teach-create_projection_plan-to-omit-projection-wher.patch
Description: Binary data