Hi Junwang, Thanks for the quick turnaround on v2. Walked through it against the prior notes -- everything reads well; one small thing on 0002 to fix before the next spin, inline below.
> 1. A long chain pattern that survives pruning but used to enumerate > > the full N^K product (the g5 chain from your benchmark is a > > natural fit). Not currently covered in > > src/test/regress/sql/graph_table.sql -- worth adding. > > Added as v2-0002. > The g5 chain is a good fit. One nit: the comment above the SELECT has a U+2192 arrow in both the .sql and the .out -- unless a regress case is specifically exercising encoding, these stay ASCII-only, so please replace with a plain `->` in both files. > I added the following paragraph to the commit message, feel free to > recommend wording. > > The cyclic case where a closing edge has both endpoints already > in the prefix is already exercised by the existing same-variable > loop patterns in graph_table.sql (e.g. (a)-[b]->(a)-[b]->(a)), > because repeated vertex names are merged into a single path factor > before DFS. Likewise, the "all edge candidates pruned" path into > generate_query_for_empty_path_pattern() is already hit by the > MATCH (o IS orders)-[IS customer_orders]->(c IS customers) case, > where no catalog edge matches the declared direction; pruning just > makes those branches easier to reach. Neither case needs a > dedicated new test beyond what is already there. > Good. > > > > ## Nits (all stylistic -- feel free to skip) > > > > 1. In graph_path_prefix_is_feasible_with_new_element(), the > > `if (!IS_EDGE_PATTERN(prev_pe->path_factor->kind)) return true;` > > branch is unreachable -- gram.y enforces V-E alternation, and > > the path-factor linkage code in > > generate_queries_for_path_pattern() re-asserts it. Tightening > > to `Assert(IS_EDGE_PATTERN(...))` would match the rest of the > > file. > > Yeah, for a pattern such as (a)-[e1]-(b)-[e2]-(a)-[e3]-(c), the > path_factors > is V(a), E(e1), V(b), E(e2), E(e3), V(c), though not totally V-E > alternation, > for a Vertex, we can sure that the prev_pe is always Edge, so changed > to Assert. > Thanks -- I hadn't realized path_factors isn't strictly V-E alternating; your E(e2), E(e3) example pins down the actual invariant the Assert relies on. > > > > 2. `normal_ok` / `reverse_ok` reads slightly off from this file's > > diction. Consider `feasible` / `rev_feasible` -- the asymmetric > > `rev_` prefix matches the existing `edge_qual` / `rev_edge_qual` > > in generate_query_for_graph_path(), and echoes the function name > > itself. > > Changed. > > > > > 3. A one-line comment above the V-E alternation check (whether > > kept as `if` or tightened per nit 1) citing parse analysis would > > save the next reader a trip through gram.y. > > I added the following, feel free to suggest wording. > > /* > * Merged duplicate vertices only drop redundant factors from path_factors, > * not from the DFS path; preceding slot is always an edge for a vertex. > */ > +1, and actually a better framing than the gram.y pointer I originally suggested -- this is the invariant the Assert relies on. > > > > > 4. In the prune branch of generate_queries_for_path_pattern_recurse(), > > a one-liner noting that an exhausted level returns an empty list, > > which the caller routes to generate_query_for_empty_path_pattern(), > > would help -- pre-existing behavior, but far more reachable after > > the patch. > > Added, feel free to suggest wording. > Good. > > > > > ## On the trailing edge_qual guard > > > > The `if (edge_qual == NULL) return NULL;` in > > generate_query_for_graph_path() becomes unreachable after the patch, > > but worth keeping as a defensive backstop -- it is the literal dual > > of the helper's check, and removing it would only tighten coupling > > for no measurable gain. A short comment marking it as such would help. > > Added. > Thanks. With the ASCII fix in 0002, no further comments from my side. Regards, Henson
