On Fri, Dec 31, 2021 at 7:56 AM Amit Langote <[email protected]> wrote:
>
> On Tue, Dec 28, 2021 at 22:12 Ashutosh Bapat <[email protected]>
> wrote:
>>
>> On Sat, Dec 25, 2021 at 9:06 AM Amit Langote <[email protected]> wrote:
>> >
>> > Executing generic plans involving partitions is known to become slower
>> > as partition count grows due to a number of bottlenecks, with
>> > AcquireExecutorLocks() showing at the top in profiles.
>> >
>> > Previous attempt at solving that problem was by David Rowley [1],
>> > where he proposed delaying locking of *all* partitions appearing under
>> > an Append/MergeAppend until "initial" pruning is done during the
>> > executor initialization phase. A problem with that approach that he
>> > has described in [2] is that leaving partitions unlocked can lead to
>> > race conditions where the Plan node belonging to a partition can be
>> > invalidated when a concurrent session successfully alters the
>> > partition between AcquireExecutorLocks() saying the plan is okay to
>> > execute and then actually executing it.
>> >
>> > However, using an idea that Robert suggested to me off-list a little
>> > while back, it seems possible to determine the set of partitions that
>> > we can safely skip locking. The idea is to look at the "initial" or
>> > "pre-execution" pruning instructions contained in a given Append or
>> > MergeAppend node when AcquireExecutorLocks() is collecting the
>> > relations to lock and consider relations from only those sub-nodes
>> > that survive performing those instructions. I've attempted
>> > implementing that idea in the attached patch.
>> >
>>
>> In which cases, we will have "pre-execution" pruning instructions that
>> can be used to skip locking partitions? Can you please give a few
>> examples where this approach will be useful?
>
>
> This is mainly to be useful for prepared queries, so something like:
>
> prepare q as select * from partitioned_table where key = $1;
>
> And that too only when execute q(…) uses a generic plan. Generic plans are
> problematic because they must contain nodes for all partitions (no plan-time
> pruning is possible), which means CheckCachedPlan() has to spend time
> proportional to the number of partitions to determine that the plan is still
> usable / has not been invalidated; most of that time goes into
> AcquireExecutorLocks().
>
> Other bottlenecks, not addressed in this patch, pertain to some executor
> startup/shutdown subroutines that process the range table of a PlannedStmt in
> its entirety, whose length is also proportional to the number of partitions
> when the plan is generic.
>
>> The benchmark is showing good results, indeed.
>
Indeed.
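
For anyone wanting to reproduce the scenario being optimized, a generic plan
can be forced like this (a minimal sketch; the table and partition names here
are made up for illustration):

```sql
CREATE TABLE partitioned_table (key int) PARTITION BY LIST (key);
CREATE TABLE part1 PARTITION OF partitioned_table FOR VALUES IN (1);
CREATE TABLE part2 PARTITION OF partitioned_table FOR VALUES IN (2);

SET plan_cache_mode = force_generic_plan;
PREPARE q AS SELECT * FROM partitioned_table WHERE key = $1;

-- The generic plan contains subnodes for every partition; with the patch,
-- "initial" pruning should let AcquireExecutorLocks() skip locking the
-- partitions whose subnodes are pruned for the given parameter value.
EXECUTE q(1);
```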
Here are a few comments for the v1 patch:
+ /* Caller error if we get here without contains_init_steps */
+ Assert(pruneinfo->contains_init_steps);
- prunedata = prunestate->partprunedata[i];
- pprune = &prunedata->partrelprunedata[0];
- /* Perform pruning without using PARAM_EXEC Params */
- find_matching_subplans_recurse(prunedata, pprune, true, &result);
+ if (parentrelids)
+ *parentrelids = NULL;
There are two blank lines after the Assert.
--
+ /* Set up EState if not in the executor proper. */
+ if (estate == NULL)
+ {
+ estate = CreateExecutorState();
+ estate->es_param_list_info = params;
+ free_estate = true;
}
... [Skip]
+ if (free_estate)
+ {
+ FreeExecutorState(estate);
+ estate = NULL;
}
I think this work should be left to the caller.
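
That is, something along these lines on the caller's side (a rough sketch
only; the surrounding function and variable names are placeholders, not
taken from the patch):

```c
/* Caller sets up and tears down the EState when not in the executor. */
EState *estate = CreateExecutorState();

estate->es_param_list_info = params;

/* ... perform the initial pruning using this estate ... */

FreeExecutorState(estate);
```

That keeps the pruning function itself free of ownership decisions about
the EState.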
--
/*
* Stuff that follows matches exactly what ExecCreatePartitionPruneState()
* does, except we don't need a PartitionPruneState here, so don't call
* that function.
*
* XXX some refactoring might be good.
*/
+1. While doing that, it would be nice to use foreach_current_index()
instead of maintaining the i and j counters in the respective foreach()
blocks, IMO.
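
To illustrate, a hedged sketch using the foreach_current_index() macro from
pg_list.h (the list and variable names here are illustrative, not the
patch's actual ones):

```c
ListCell   *lc;

foreach(lc, pruneinfo->prune_infos)
{
    /* Replaces separate "int i = 0; ... i++;" bookkeeping. */
    int     i = foreach_current_index(lc);
    List   *partrelpruneinfos = lfirst_node(List, lc);

    /* ... use i as the index of the current list element ... */
}
```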
--
+ while ((i = bms_next_member(validsubplans, i)) >= 0)
+ {
+ Plan *subplan = list_nth(subplans, i);
+
+ context->relations =
+ bms_add_members(context->relations,
+ get_plan_scanrelids(subplan));
+ }
I think GetLockableRelations_worker() can be used here instead of
get_plan_scanrelids(); if so, there is no need to add the
get_plan_scanrelids() function at all.
--
/* Nodes containing prunable subnodes. */
+ case T_MergeAppend:
+ {
+ PlannedStmt *plannedstmt = context->plannedstmt;
+ List *rtable = plannedstmt->rtable;
+ ParamListInfo params = context->params;
+ PartitionPruneInfo *pruneinfo;
+ Bitmapset *validsubplans;
+ Bitmapset *parentrelids;
...
if (pruneinfo && pruneinfo->contains_init_steps)
{
int i;
...
return false;
}
}
break;
Most of the declarations need to be moved inside the if-block.

Also, initially I was a bit concerned about this code block inside
GetLockableRelations_worker(): what happens if (pruneinfo &&
pruneinfo->contains_init_steps) evaluates to false? After debugging I
realized that plan_tree_walker() does the needful; a brief comment there
would have helped.
--
+ case T_CustomScan:
+ foreach(lc, ((CustomScan *) plan)->custom_plans)
+ {
+ if (walker((Plan *) lfirst(lc), context))
+ return true;
+ }
+ break;
Why not use a plan_walk_members() call here, as is done for the other node
types?
Regards,
Amul