On Mon, Jun 13, 2016 at 4:46 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Robert Haas <robertmh...@gmail.com> writes:
>> In practice, we don't yet have the ability for
>> parallel-safe paths from subqueries to affect planning at higher query
>> levels, but that's because the pathification stuff landed too late in
>> the cycle for me to fully react to it.
>
> I wonder if that's not just from confusion between subplans and
> subqueries.  I don't believe any of the claims made in the comment
> adjusted in the patch below (other than "Subplans currently aren't passed
> to workers", which is true but irrelevant).  Nested Gather nodes is
> something that you would need, and already have, a defense for anyway.

I think you may be correct.

One problem that I've realized that is related to this is that the way
that the consider_parallel flag is being set for upper rels is almost
totally bogus, which I believe accounts for your complaint at PGCon
that force_parallel_mode was not doing as much as it ought to do.
When I originally wrote a lot of this logic, there were no upper rels,
which led to this code:

        if (force_parallel_mode == FORCE_PARALLEL_OFF || !glob->parallelModeOK)
        {
                glob->parallelModeNeeded = false;
                glob->wholePlanParallelSafe = false;    /* either
false or don't care */
        }
        else
        {
                glob->parallelModeNeeded = true;
                glob->wholePlanParallelSafe =
                        !has_parallel_hazard((Node *) parse, false);
        }

The main way that wholePlanParallelSafe is supposed to be cleared is
in create_plan:

        /* Update parallel safety information if needed. */
        if (!best_path->parallel_safe)
                root->glob->wholePlanParallelSafe = false;

However, at the time that code was written, it was impossible to rely
entirely on the latter mechanism.  Since there were no upper rels and
no paths for upper plan nodes, you could have the case where every
path was parallel-safe but the whole plan was node.  Therefore the
code shown above was needed to scan the whole darned plan for
parallel-unsafe things.  Post-pathification, this whole thing is
pretty bogus: upper rels mostly don't get consider_parallel set at
all, which means plans involving upper rels get marked parallel-unsafe
even if they are not, which means the wholePlanParallelSafe flag gets
cleared in a bunch of cases where it shouldn't.  And on the flip side
I think that the first chunk of code quoted above would be totally
unnecessary if we were actually setting consider_parallel correctly on
the upper rels.

I started working on a patch to fix all this, which I'm attaching here
so you can see what I'm thinking.  I am not sure it's correct, but it
does cause force_parallel_mode to do something interesting in many
more cases.

Anyway, the reason this is related to the issue at hand is that you
might have something like A LEFT JOIN (SELECT x, sum(q) FROM B GROUP
BY 1) ON A.x = B.x.   Right now, I think that the paths for the
aggregated subquery will always be marked as not parallel-safe, but
that's only because the consider_parallel flag on the grouped rel
isn't being set properly.  If it were, you could theoretically do a
parallel seq scan on A and then have each worker evaluate the join for
the rows that pop out of the subquery.  Maybe that's a stupid example,
but the point is that I bet there are cases where failing to mark the
upper rels with consider_parallel where appropriate causes good
parallel plans not to get generated.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b1554cb..8232a64 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -107,9 +107,6 @@ static double get_number_of_groups(PlannerInfo *root,
 					 double path_rows,
 					 List *rollup_lists,
 					 List *rollup_groupclauses);
-static void set_grouped_rel_consider_parallel(PlannerInfo *root,
-					 RelOptInfo *grouped_rel,
-					 PathTarget *target);
 static Size estimate_hashagg_tablesize(Path *path, AggClauseCosts *agg_costs,
 					 double dNumGroups);
 static RelOptInfo *create_grouping_paths(PlannerInfo *root,
@@ -272,8 +269,7 @@ standard_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
 	else
 	{
 		glob->parallelModeNeeded = true;
-		glob->wholePlanParallelSafe =
-			!has_parallel_hazard((Node *) parse, false);
+		glob->wholePlanParallelSafe = true;
 	}
 
 	/* Determine what fraction of the plan is likely to be scanned */
@@ -1878,6 +1874,17 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 	 */
 	final_rel = fetch_upper_rel(root, UPPERREL_FINAL, NULL);
 
+	/*
+	 * We can just inherit the current_rel's consider_parallel flag; nothing
+	 * that happens here can switch us from being parallel-safe path to
+	 * being parallel-restricted.  (If there's something parallel-unsafe in
+	 * the query, all of these flags will be false anyway.)
+	 */
+	final_rel->consider_parallel = current_rel->consider_parallel;
+
+	/*
+	 * Generate paths for the final rel.
+	 */
 	foreach(lc, current_rel->pathlist)
 	{
 		Path	   *path = (Path *) lfirst(lc);
@@ -3157,56 +3164,6 @@ get_number_of_groups(PlannerInfo *root,
 }
 
 /*
- * set_grouped_rel_consider_parallel
- *	  Determine if it's safe to generate partial paths for grouping.
- */
-static void
-set_grouped_rel_consider_parallel(PlannerInfo *root, RelOptInfo *grouped_rel,
-								  PathTarget *target)
-{
-	Query	   *parse = root->parse;
-
-	Assert(grouped_rel->reloptkind == RELOPT_UPPER_REL);
-
-	/*
-	 * If there are no aggregates or GROUP BY clause, then no parallel
-	 * aggregation is possible.  At present, it doesn't matter whether
-	 * consider_parallel gets set in this case, because none of the upper rels
-	 * on top of this one try to set the flag or examine it, so we just bail
-	 * out as quickly as possible.  We might need to be more clever here in
-	 * the future.
-	 */
-	if (!parse->hasAggs && parse->groupClause == NIL)
-		return;
-
-	/*
-	 * Similarly, bail out quickly if GROUPING SETS are present; we can't
-	 * support those at present.
-	 */
-	if (parse->groupingSets)
-		return;
-
-	/*
-	 * If parallel-restricted functions are present in the target list or the
-	 * HAVING clause, we cannot safely go parallel.
-	 */
-	if (has_parallel_hazard((Node *) target->exprs, false) ||
-		has_parallel_hazard((Node *) parse->havingQual, false))
-		return;
-
-	/*
-	 * All that's left to check now is to make sure all aggregate functions
-	 * support partial mode. If there's no aggregates then we can skip checking
-	 * that.
-	 */
-	if (!parse->hasAggs)
-		grouped_rel->consider_parallel = true;
-	else if (aggregates_allow_partial((Node *) target->exprs) == PAT_ANY &&
-			 aggregates_allow_partial(root->parse->havingQual) == PAT_ANY)
-		grouped_rel->consider_parallel = true;
-}
-
-/*
  * estimate_hashagg_tablesize
  *	  estimate the number of bytes that a hash aggregate hashtable will
  *	  require based on the agg_costs, path width and dNumGroups.
@@ -3267,6 +3224,7 @@ create_grouping_paths(PlannerInfo *root,
 	double		dNumPartialGroups = 0;
 	bool		can_hash;
 	bool		can_sort;
+	bool		try_parallel_path;
 
 	ListCell   *lc;
 
@@ -3274,6 +3232,16 @@ create_grouping_paths(PlannerInfo *root,
 	grouped_rel = fetch_upper_rel(root, UPPERREL_GROUP_AGG, NULL);
 
 	/*
+	 * If the input relation is not parallel-safe, then the grouped relation
+	 * can't be parallel-safe, either.  Otherwise, it's parallel-safe if the
+	 * target list and HAVING quals are parallel-safe.
+	 */
+	if (input_rel->consider_parallel &&
+		!has_parallel_hazard((Node *) target->exprs, false) &&
+		!has_parallel_hazard((Node *) parse->havingQual, false))
+		grouped_rel->consider_parallel = true;
+
+	/*
 	 * Check for degenerate grouping.
 	 */
 	if ((root->hasHavingQual || parse->groupingSets) &&
@@ -3361,14 +3329,6 @@ create_grouping_paths(PlannerInfo *root,
 									  rollup_groupclauses);
 
 	/*
-	 * Partial paths in the input rel could allow us to perform aggregation in
-	 * parallel. set_grouped_rel_consider_parallel() will determine if it's
-	 * going to be safe to do so.
-	 */
-	if (input_rel->partial_pathlist != NIL)
-		set_grouped_rel_consider_parallel(root, grouped_rel, target);
-
-	/*
 	 * Determine whether it's possible to perform sort-based implementations
 	 * of grouping.  (Note that if groupClause is empty, grouping_is_sortable()
 	 * is trivially true, and all the pathkeys_contained_in() tests will
@@ -3399,6 +3359,52 @@ create_grouping_paths(PlannerInfo *root,
 				grouping_is_hashable(parse->groupClause));
 
 	/*
+	 * If grouped_rel->consider_parallel is true, then paths that we generate
+	 * for this grouping relation could be run inside of a worker, but that
+	 * doesn't mean we can actually use the PartialAggregate/FinalizeAggregate
+	 * execution strategy.  Figure that out.
+	 */
+	if (!grouped_rel->consider_parallel)
+	{
+		/* Not even parallel-safe. */
+		try_parallel_path = false;
+	}
+	else if (input_rel->partial_pathlist == NIL)
+	{
+		/* Nothing to use as input for partial aggregate. */
+		try_parallel_path = false;
+	}
+	else if (!parse->hasAggs && parse->groupClause == NIL)
+	{
+		/*
+		 * We don't know how to do parallel aggregation unless we have
+		 * either some aggregates or a grouping clause.
+		 */
+		try_parallel_path = false;
+	}
+	else if (parse->groupingSets)
+	{
+		/* We don't know how to do grouping sets in parallel. */
+		try_parallel_path = false;
+	}
+	else if (!parse->hasAggs)
+	{
+		/* If no aggregates, we need not check partial mode support. */
+		try_parallel_path = true;
+	}
+	else if (aggregates_allow_partial((Node *) target->exprs) != PAT_ANY ||
+			 aggregates_allow_partial(root->parse->havingQual) != PAT_ANY)
+	{
+		/* Insufficient support for partial mode. */
+		try_parallel_path = false;
+	}
+	else
+	{
+		/* Everything looks good. */
+		try_parallel_path = true;
+	}
+
+	/*
 	 * Before generating paths for grouped_rel, we first generate any possible
 	 * partial paths; that way, later code can easily consider both parallel
 	 * and non-parallel approaches to grouping.  Note that the partial paths
@@ -3406,7 +3412,7 @@ create_grouping_paths(PlannerInfo *root,
 	 * Gather node on top is insufficient to create a final path, as would be
 	 * the case for a scan/join rel.
 	 */
-	if (grouped_rel->consider_parallel)
+	if (try_parallel_path)
 	{
 		Path   *cheapest_partial_path = linitial(input_rel->partial_pathlist);
 
@@ -3445,7 +3451,7 @@ create_grouping_paths(PlannerInfo *root,
 
 		if (can_sort)
 		{
-			/* Checked in set_grouped_rel_consider_parallel() */
+			/* This was checked before setting try_parallel_path */
 			Assert(parse->hasAggs || parse->groupClause);
 
 			/*
@@ -3791,6 +3797,15 @@ create_window_paths(PlannerInfo *root,
 	window_rel = fetch_upper_rel(root, UPPERREL_WINDOW, NULL);
 
 	/*
+	 * If the input relation is not parallel-safe, then the window relation
+	 * can't be parallel-safe, either.  Otherwise, it's parallel-safe if the
+	 * output target list is parallel-safe.
+	 */
+	if (input_rel->consider_parallel &&
+		!has_parallel_hazard((Node *) output_target->exprs, false))
+		window_rel->consider_parallel = true;
+
+	/*
 	 * Consider computing window functions starting from the existing
 	 * cheapest-total path (which will likely require a sort) as well as any
 	 * existing paths that satisfy root->window_pathkeys (which won't).
@@ -3945,6 +3960,15 @@ create_distinct_paths(PlannerInfo *root,
 	/* For now, do all work in the (DISTINCT, NULL) upperrel */
 	distinct_rel = fetch_upper_rel(root, UPPERREL_DISTINCT, NULL);
 
+	/*
+	 * If the input relation is not parallel-safe, then the window relation
+	 * can't be parallel-safe, either.  Otherwise, it is.  (We assume that
+	 * the relevant hashing or ordering operators will always be
+	 * parallel-safe.)
+	 */
+	if (input_rel->consider_parallel)
+		distinct_rel->consider_parallel = true;
+
 	/* Estimate number of distinct rows there will be */
 	if (parse->groupClause || parse->groupingSets || parse->hasAggs ||
 		root->hasHavingQual)
@@ -4130,6 +4154,15 @@ create_ordered_paths(PlannerInfo *root,
 	/* For now, do all work in the (ORDERED, NULL) upperrel */
 	ordered_rel = fetch_upper_rel(root, UPPERREL_ORDERED, NULL);
 
+	/*
+	 * If the input relation is not parallel-safe, then the ordered relation
+	 * can't be parallel-safe, either.  Otherwise, it's parallel-safe if the
+	 * target list is parallel-safe.
+	 */
+	if (input_rel->consider_parallel &&
+		!has_parallel_hazard((Node *) target->exprs, false))
+		ordered_rel->consider_parallel = true;
+
 	foreach(lc, input_rel->pathlist)
 	{
 		Path	   *path = (Path *) lfirst(lc);
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 6182b36..4e8a3b3 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -2271,6 +2271,12 @@ apply_projection_to_path(PlannerInfo *root,
 								   gpath->subpath,
 								   target);
 	}
+	else if (!IsA(path, GatherPath) &&
+			has_parallel_hazard((Node *) target->exprs, false))
+	{
+		/* Uggh, having to do this always kind of bites. */
+		path->parallel_safe = false;
+	}
 
 	return path;
 }
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to