Hi Robert,

I've been reviewing the v17-0003 patch and learned a lot along the
way.  The design is well thought out and the implementation looks
solid.  I only have a few minor comments and questions.

While reading the patch, I noticed several places where we infer
planner intent from currently observed plan shapes (e.g., Gather
projection, Agg used for semijoin uniqueness, Append elision, etc.),
which I quote below:

In pgpa_decompose_join():
> /*
>  * Can we see a Result node here, to project above a Gather? So far I've
>  * found no example that behaves that way; rather, the Gather or Gather
>  * Merge is made to project. Hence, don't test is_result_node_with_child()
>  * at this point.
>  */

In pgpa_descend_any_unique():
> else if (IsA(*plan, Agg))
> {
> /*
> * If this is a simple Agg node, then assume it's here to implement
> * semijoin uniqueness. Otherwise, assume it's completing an eager
> * aggregation or partitionwise aggregation operation that began at a
> * higher level of the plan tree.
> *
> * (Note that when we're using an Agg node for uniqueness, there's no
> * need for any case other than AGGSPLIT_SIMPLE, because there's no
> * aggregated column being * computed. However, the fact that
> * AGGSPLIT_SIMPLE is in use doesn't prove that this Agg is here for
> * the semijoin uniqueness. Maybe we should adjust an Agg node to
> * carry a "purpose" field so that code like this can be more certain
> * of its analysis.)
> */
> descend = true;
> sjunique = (((Agg *) *plan)->aggsplit == AGGSPLIT_SIMPLE);
> }

In pgpa_walk_recursively():
>  * Exception: We disregard any single_copy Gather nodes. These are created
>  * by debug_parallel_query, and having them affect the plan advice is
>  * counterproductive, as the result will be to advise the use of a real
>  * Gather node, rather than a single copy one.
>  */
> if (IsA(plan, Gather) && !((Gather *) plan)->single_copy)
> {
> active_query_features =
> lappend(list_copy(active_query_features),
> pgpa_add_feature(walker, PGPAQF_GATHER, plan));
> beneath_any_gather = true;
> }

In pgpa_build_scan():
> /*
>  * If setrefs processing elided an Append or MergeAppend node that had
>  * only one surviving child, it might be a partitionwise operation,
>  * but then this is either a setop over subqueries, or a partitionwise
>  * operation (which might be a scan or a join in reality, but here we
>  * don't care about the distinction and consider it simply a scan).
>  *
>  * A setop over subqueries, or a trivial SubQueryScan that was elided,
>  * is an "ordinary" scan i.e. one for which we need to generate advice
>  * because the planner has not made any meaningful choice.
>  */
> if ((nodetype == T_Append || nodetype == T_MergeAppend) &&
> unique_nonjoin_rtekind(relids,
>   walker->pstmt->rtable) == RTE_RELATION)
> strategy = PGPA_SCAN_PARTITIONWISE;
> else
> strategy = PGPA_SCAN_ORDINARY;

All of the interpretations above look reasonable, but they made me
wonder how modules like this detect planner-behavior drift over time
if we add new plan node types, add or change plan shapes, or even a
new debug GUC.

I realize this isn't specific to this patch, but I'm curious whether
something like monitoring a possible set of parent-child plan node
patterns has ever been considered, to make structural changes easier
to notice when those invariants stop holding. I don't think this is
actionable for this patch, but I'm just recording the thought here
while reading.

---

I ran the pg_plan_advice module tests as well as test_plan_advice, and
everything passed locally.

I also noticed that several tests in pg_plan_advice are not listed in
the Makefile. Specifically, they are semijoin.sql, syntax.sql,
prepared.sql, and local_collector.sql. I ran them manually, and they
all passed.

---

In pgpa_walk_recursively():
> /*
>  * Check the future_query_features list to see whether this was previously
>  * identified as a plan node that needs to be treated as a query feature.
>  * We must do this before handling elided nodes, because if there's an
>  * elided node associated with a future query feature, the RTIs associated
>  * with the elided node should be the only ones attributed to the query
>  * feature.
>  */
> foreach_ptr(pgpa_query_feature, qf, walker->future_query_features)
> {
> if (qf->plan == plan)
> {
> active_query_features = list_copy(active_query_features);
> active_query_features = lappend(active_query_features, qf);
> walker->future_query_features =
> list_delete_ptr(walker->future_query_features, plan);
> break;
> }
> }

Should this be deleting "qf" instead of "plan"? Right now the list is
not emptied after the plan tree walk as intended.

---

In pgpa_decompose_join():
> if (found_any_outer_gather &&
> ...
> if (found_any_inner_gather &&

"found_any_{outer,inner}_gather" should be dereferenced.

---

In pgpa_trim_shared_advice():
> /* Don't leave stale pointers around. */
> memset(&chunk_array[remaining_chunk_count], 0,
>   sizeof(pgpa_shared_advice_chunk *)
>   * (total_chunk_count - remaining_chunk_count));

Should the element size be "sizeof(dsa_pointer)", consistent with the
preceding memmove()?

---

In pg_plan_advice_advice_check_hook():
> if (error != NULL)
> {
> GUC_check_errdetail("Could not parse advice: %s", error); //
> return false;
> }
>
> MemoryContextSwitchTo(oldcontext);
> MemoryContextDelete(tmpcontext);
>
> return true;

If an error occurs, do we leak the memory context? Should we also
switch and delete memory context before returning false?

---

In pg_get_collected_local_advice(), we skip rows when:

> if (!member_can_set_role(userid, ca->userid))
> continue;

Do we need to do the same check for pg_get_collected_shared_advice()?

---

In pgpa_planner_append_feedback(), "StringInfoData buf;" is unused.

---

The "ExplainState *explain_state" field and the "MemoryContext
trove_cxt" field in "struct pgpa_planner_state" are both unused.

Best,
Alex

-- 
Alexandra Wang
EDB: https://www.enterprisedb.com

Reply via email to