Hi Ishii-san, Glad the review was useful!
2026년 1월 2일 (금) AM 11:16, Tatsuo Ishii <[email protected]>님이 작성: > Hi Henson, > > Thank you for the review! I will take care the issues (it will take > sometime). > > If you have other patches that need pre-analysis (coverage, style, valgrind, etc.), feel free to ask anytime. I enjoy this kind of work. > Best regards, > -- > Tatsuo Ishii > SRA OSS K.K. > English: http://www.sraoss.co.jp/index_en/ > Japanese:http://www.sraoss.co.jp > > > SQL/RPR Patch Review Report > > ============================ > > > > Patch: Row Pattern Recognition (SQL:2016) > > Commitfest: https://commitfest.postgresql.org/patch/4460 > > > > Review Methodology: > > This review focused on quality assessment, not line-by-line code audit. > > Key code paths and quality issues were examined with surrounding > context > > when concerns arose. Documentation files were reviewed with AI-assisted > > grammar and typo checking. Code coverage was measured using gcov and > > custom analysis tools. > > > > Limitations: > > - Not a security audit or formal verification > > - Parser and executor internals reviewed at module level, not > exhaustively > > - Performance characteristics not benchmarked > > > > > > TABLE OF CONTENTS > > ----------------- > > > > 1. Executive Summary > > 2. Issues Found > > 2.1 Critical / Major > > 2.2 Minor Issues (Review Needed) > > 2.3 Minor Issues (Style) > > 2.4 Suggestions for Discussion > > 3. Test Coverage > > 3.1 Covered Areas > > 3.2 Untested Items > > 3.3 Unimplemented Features (No Test Needed) > > 4. Code Coverage Analysis > > 4.1 Overall Coverage > > 4.2 Coverage by File > > 4.3 Uncovered Code Risk Assessment > > 5. Commit Analysis > > 6. Recommendations > > 7. Conclusion > > > > > > 1. EXECUTIVE SUMMARY > > -------------------- > > > > Overall assessment: GOOD > > > > The SQL/RPR patch demonstrates solid implementation quality within its > > defined scope. Code follows PostgreSQL coding standards (with minor style > > issues), test coverage is comprehensive at 96.4%, and documentation is > > thorough with only minor typos. > > > > Rating by Area: > > - Code Quality: Good (PostgreSQL style compliant, 26 static style > fixes > > recommended) > > - Test Coverage: Good (96.4% line coverage, 28 test cases) > > - Documentation: Good (Complete, 1 minor typo) > > - Build/Regress: Pass (make check-world, 753 tests passed) > > > > > > 2. ISSUES FOUND > > --------------- > > > > 2.1 Critical / Major > > > > #1 [Code] Greedy pattern combinatorial explosion > > Pattern: PATTERN (A+ B+ C+) with DEFINE A AS TRUE, B AS TRUE, C AS > TRUE > > Impact: Memory exhaustion or infinite wait > > Recommendation: Add work_mem-based memory limit (error on exceed) > > > > Evidence - No memory limit in current code: > > - nodeWindowAgg.c:5718-5722 string_set_init(): palloc() unconditional > > - nodeWindowAgg.c:5741-5750 string_set_add(): set_size *= 2; > repalloc() > > unlimited > > - nodeWindowAgg.c:5095-5174 generate_patterns(): Triple loop, no limit > > > > Only work_mem usage in RPR (nodeWindowAgg.c:1297): > > winstate->buffer = tuplestore_begin_heap(false, false, work_mem); > > -> For tuple buffer, unrelated to pattern combination memory > (StringSet) > > > > 2.2 Minor Issues (Review Needed) > > > > #1 [Code] nodeWindowAgg.c:5849,5909,5912 > > pos > NUM_ALPHABETS check - intent unclear, causes error at 28 PATTERN > > elements > > > > Reproduction: > > - PATTERN (A B C ... Z A) (27 elements) -> OK > > - PATTERN (A B C ... Z A B) (28 elements) -> ERROR "initial is not > valid > > char: b" > > > > 2.3 Minor Issues (Style) > > > > #1 [Code] nodeWindowAgg.c (25 blocks) > > #ifdef RPR_DEBUG -> recommend elog(DEBUG2, ...) or remove > > > > #2 [Code] src/backend/executor/nodeWindowAgg.c > > static keyword on separate line (26 functions) > > > > #3 [Code] src/backend/utils/adt/ruleutils.c > > Whitespace typo: "regexp !=NULL" -> "regexp != NULL" > > > > #4 [Code] nodeWindowAgg.c:4619 > > Error message case: "Unrecognized" -> "unrecognized" (PostgreSQL > style) > > > > #5 [Doc] doc/src/sgml/ref/select.sgml:1128 > > Typo: "maximu" -> "maximum" > > > > 2.4 Suggestions for Discussion > > > > #1 Incremental matching with streaming NFA redesign > > Benefits: > > - O(n) time complexity per row (current: exponential in worst case) > > - Bounded memory via state merging and context absorption > > - Natural extension for OR patterns, {n,m} quantifiers, nested groups > > - Enables MEASURES clause with incremental aggregates > > - Proven approach in CEP engines (Flink, Esper) > > > > > > 3. TEST COVERAGE > > ---------------- > > > > 3.1 Covered Areas > > > > - PATTERN clause: +, *, ? quantifiers (line 41, 71, 86) > > - DEFINE clause: Variable conditions, auto-TRUE for missing (line > 120-133) > > - PREV/NEXT functions: Single argument (line 44, 173) > > - AFTER MATCH SKIP: TO NEXT ROW (line 182), PAST LAST ROW (line 198) > > - Aggregate integration: sum, avg, count, max, min (line 258-277) > > - Window function integration: first_value, last_value, nth_value (line > > 34-35) > > - JOIN/CTE: JOIN (line 313), WITH (line 324) > > - View: VIEW creation, pg_get_viewdef (line 390-406) > > - Error cases: 7 error scenarios (line 409-532) > > - Large partition: 5000 row smoke test (line 360-387) > > - ROWS BETWEEN offset: CURRENT ROW AND 2 FOLLOWING (line 244) > > > > 3.2 Untested Items > > > > Feature Priority Recommendation > > > ------------------------------------------------------------------------------- > > 26 variable limit Medium Test 26 variables success, 27th > > variable error > > NULL value handling Low Test PREV(col) where col or previous > > row is NULL > > > > Sample test for 26 variable limit: > > > > -- Should fail with 27th variable (parser error, no table needed) > > SELECT * FROM (SELECT 1 AS x) t > > WINDOW w AS ( > > ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING > > PATTERN (a b c d e f g h i j k l m n o p q r s t u v w x y z aa) > > DEFINE a AS TRUE, b AS TRUE, c AS TRUE, d AS TRUE, e AS TRUE, f AS > > TRUE, > > g AS TRUE, h AS TRUE, i AS TRUE, j AS TRUE, k AS TRUE, l AS > > TRUE, > > m AS TRUE, n AS TRUE, o AS TRUE, p AS TRUE, q AS TRUE, r AS > > TRUE, > > s AS TRUE, t AS TRUE, u AS TRUE, v AS TRUE, w AS TRUE, x AS > > TRUE, > > y AS TRUE, z AS TRUE, aa AS TRUE > > ); > > -- ERROR: number of row pattern definition variable names exceeds 26 > > > > Sample test for NULL handling: > > > > CREATE TEMP TABLE stock_null (company TEXT, tdate DATE, price > INTEGER); > > INSERT INTO stock_null VALUES ('c1', '2023-07-01', 100); > > INSERT INTO stock_null VALUES ('c1', '2023-07-02', NULL); -- NULL in > > middle > > INSERT INTO stock_null VALUES ('c1', '2023-07-03', 200); > > INSERT INTO stock_null VALUES ('c1', '2023-07-04', 150); > > > > SELECT company, tdate, price, count(*) OVER w AS match_count > > FROM stock_null > > WINDOW w AS ( > > PARTITION BY company > > ORDER BY tdate > > ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING > > PATTERN (START UP DOWN) > > DEFINE START AS TRUE, UP AS price > PREV(price), DOWN AS price < > > PREV(price) > > ); > > -- Result: all rows show match_count = 0 (NULL breaks pattern > matching) > > > > 3.3 Unimplemented Features (No Test Needed) > > > > Per documentation (select.sgml:1133-1136): > > - MEASURES clause: Not implemented > > - SUBSET clause: Not implemented > > - AFTER MATCH variants: Only SKIP TO NEXT ROW, PAST LAST ROW supported > > > > Not in documentation, verified by testing: > > - {n,m} quantifier: Parser error ("syntax error at or near {") > > - (A|B) OR pattern: Not supported (parsed as single variable name "a|b") > > - (A B)+ compound repetition: Parser accepts, but execution returns 0 > > matches > > > > > > 4. CODE COVERAGE ANALYSIS > > ------------------------- > > > > 4.1 Overall Coverage > > > > 96.4% (732 / 759 lines) > > > > 4.2 Coverage by File (major RPR-modified files) > > > > nodeWindowAgg.c: 96.6% (560/580) - Pattern matching execution engine > > parse_clause.c: 96.7% (88/91) - PATTERN/DEFINE analysis > > ruleutils.c: 93.1% (54/58) - pg_get_viewdef output > > > > 4.3 Uncovered Code Risk Assessment > > > > Total: 27 lines uncovered > > > > Medium Risk (2 items) - Test addition recommended (see section 3.2): > > - parse_clause.c:4093: transformDefineClause - 27th variable error > > - nodeWindowAgg.c:5623: get_slots - null_slot for PREV at partition first > > row > > > > Low Risk (25 items) - Defensive code / Unreachable: > > - nodeWindowAgg.c:3074: attno_map_walker - Parser validates arg count > > - nodeWindowAgg.c:3137-3138: rpr_navigation_walker - switch default > > - nodeWindowAgg.c:3566: window_gettupleslot - Before mark position > > - nodeWindowAgg.c:4289: WinGetFuncArgInFrame - isout flag > > - nodeWindowAgg.c:4393: WinGetSlotInFrame - Boundary check > > - nodeWindowAgg.c:4618-4619: row_is_in_reduced_frame - switch default > > - nodeWindowAgg.c:4697: register_reduced_frame_map - pos < 0 check > > - nodeWindowAgg.c:5007: search_str_set - NULL return continue > > - nodeWindowAgg.c:5405: do_pattern_match - Regex compile error > > - nodeWindowAgg.c:5435,5437-5438: do_pattern_match - Regex exec error > > - nodeWindowAgg.c:5700: pattern_initial - Variable not found > > - nodeWindowAgg.c:5776: string_set_get - Index range check > > - nodeWindowAgg.c:5850: variable_pos_register - a-z range check > > - nodeWindowAgg.c:5910: variable_pos_fetch - a-z range check > > - nodeWindowAgg.c:5913: variable_pos_fetch - Index range check > > - parse_clause.c:3989: transformDefineClause - A_Expr type check > > - parse_clause.c:4145: transformPatternClause - A_Expr type check > > - ruleutils.c:6904-6908: get_rule_windowspec - SKIP TO NEXT ROW output > > > > Conclusion: Most uncovered code consists of defensive error handling or > > code unreachable due to parser pre-validation. No security or functional > > risk. > > > > > > 5. COMMIT ANALYSIS > > ------------------ > > > > 8 sequential commits: > > > > Commit Area Files +/- Key Content > > > ------------------------------------------------------------------------------- > > 1 Raw Parser 4 +174/-16 gram.y grammar > (PATTERN/DEFINE) > > 2 Parse/Analysis 4 +277/-1 parse_clause.c analysis logic > > 3 Rewriter 1 +109/-0 pg_get_viewdef extension > > 4 Planner 5 +73/-3 WindowAgg plan node extension > > 5 Executor 4 +1,942/-11 CORE: Pattern matching engine > > (+1,850) > > 6 Docs 3 +192/-7 advanced.sgml, > func-window.sgml > > 7 Tests 3 +1,585/-1 rpr.sql (532), rpr.out > (1,052) > > 8 typedefs 1 +6/-0 pgindent support > > > > Code Change Statistics: > > - Total files: 25 > > - Lines added: 4,358 > > - Lines deleted: 39 > > - Net increase: +4,319 lines > > > > > > 6. RECOMMENDATIONS > > ------------------ > > > > 6.1 Combinatorial Explosion Prevention (Major, Required) > > > > Add work_mem-based memory limit for StringSet allocation. > > Location: string_set_add() in nodeWindowAgg.c:5741-5750 > > Consistent with existing PostgreSQL approach (Hash Join, Sort, etc.) > > > > 6.2 Code Review Required (Minor, Decision Needed) > > > > Location: nodeWindowAgg.c:5849,5909,5912 > > Issue: pos > NUM_ALPHABETS check intent unclear > > Current: PATTERN with 28 elements causes error > > Question: Is 27 element limit intentional? > > > > 6.3 Code Style Fixes (Minor) > > > > - #ifdef RPR_DEBUG: Use elog(DEBUG2, ...) or remove (25 blocks) > > - static keyword on separate line: 26 functions to fix > > - Whitespace: "regexp !=NULL" -> "regexp != NULL" > > - Error message case: "Unrecognized" -> "unrecognized" > > > > 6.4 Documentation Fixes (Minor) > > > > - select.sgml: "maximu" -> "maximum" > > > > 6.5 Test Additions (Optional) > > > > Black-box Tests (Functional): > > > > Feature Test Case > Priority > > > ------------------------------------------------------------------------------- > > Variable limit 26 variables success, 27 error Medium > > NULL boundary PREV at partition first row Medium > > > > White-box Tests (Coverage) - covered by above black-box tests: > > > > File:Line Target Branch > > > ------------------------------------------------------------------------------- > > parse_clause.c:4093 Limit error branch (Variable limit > test) > > nodeWindowAgg.c:5623 null_slot usage (NULL boundary test) > > > > > > 7. CONCLUSION > > ------------- > > > > Test Quality: GOOD > > > > Core functionality is thoroughly tested with comprehensive error case > > coverage. > > > > The patch is well-implemented within its defined scope. Identified issues > > include > > one major concern (combinatorial explosion) and minor style/documentation > > items. > > > > Recommended actions before commit: > > 1. Add work_mem-based memory limit for pattern combinations (required) > > 2. Clarify pos > NUM_ALPHABETS check intent (review needed) > > 3. Fix code style issues (optional but recommended) > > 4. Fix documentation typo (trivial) > > 5. Add tests for variable limit and NULL handling (optional) > > > > Points for discussion (optional): > > 6. Incremental matching with streaming NFA redesign > > > > > > Attachment: > > - coverage.tgz (gcov HTML report, RPR-modified code only) > > > > > > --- > > End of Report > > > > 2025년 9월 24일 (수) PM 7:36, Tatsuo Ishii <[email protected]>님이 작성: > > > >> Attached are the v33 patches for Row pattern recognition. The > >> difference from v32 is that the raw parse tree printing patch is not > >> included in v33. This is because now that we have > >> debug_print_raw_parse. > >> > >> Best regards, > >> -- > >> Tatsuo Ishii > >> SRA OSS K.K. > >> English: http://www.sraoss.co.jp/index_en/ > >> Japanese:http://www.sraoss.co.jp > >> > Best regards, Henson
