Subject: [RPR] Context Absorption Design Documentation - Two-Flag Approach
Hi Hackers,
I'm writing to document the detailed design of context absorption
optimization in Row Pattern Recognition (RPR). This is mainly for
future reference - honestly, I'm afraid I'll forget these details
myself in a few months, and I'd rather have them written down
somewhere public than buried in my notes. I don't want to repeat the
mistake of famous mathematical theorems
where brilliant ideas were lost because "the margin was too narrow
to contain the proof" (yes, I'm looking at you, Fermat!). I certainly
don't want future PostgreSQL hackers to spend the next 358 years
trying to re-implement context absorption because the design rationale
was lost. Fortunately, the PostgreSQL community has provided us with
sufficient digital margins. :)
========================================================================
ROW PATTERN RECOGNITION: CONTEXT ABSORPTION DESIGN (DRAFT)
Two-Flag Approach for O(n²) → O(n) Optimization
========================================================================
The context absorption feature achieves O(n²) → O(n) performance
improvement by eliminating redundant match searches. The core idea
is simple: a newer context (started at a later row) can never
produce a longer match than an older context when certain conditions
are met. However, the implementation details are subtle enough that
I think they deserve careful documentation.
========================================================================
TWO-FLAG DESIGN
========================================================================
The optimization uses two element flags:
1. RPR_ELEM_ABSORBABLE - "Absorption judgment point"
Marks WHERE to compare contexts for absorption.
- For simple unbounded VAR (e.g., A in "A+"): the VAR itself
- For unbounded GROUP (e.g., "(A B)+"): the END element only
2. RPR_ELEM_ABSORBABLE_BRANCH - "Absorbable region marker"
Marks ALL elements in the absorbable region.
- Used for efficient region tracking at runtime
- All elements at the same depth as the unbounded start
========================================================================
WHY TWO FLAGS?
========================================================================
The key insight: for pattern "(A B)+", we cannot compare contexts
when they're at different positions (one at A, another at B). They
must synchronize at the END element.
Consider "(A B)+" matching data: A B A B A B...
Row 0 (A): Ctx1 starts, matches A
Row 1 (B): Ctx1 matches B → END (count=1)
Row 2 (A): Ctx1 loops to A (count=1), Ctx2 starts, matches A (count=0)
Row 3 (B): Ctx1 matches B → END (count=2)
Ctx2 matches B → END (count=1)
↑ Both at END with same elemIdx → comparable!
→ Ctx1.count(2) >= Ctx2.count(1) → Absorb Ctx2!
Every group-length rows (2 in this case), the contexts synchronize
at END, making absorption possible. This is why we need:
- ABSORBABLE flag: to mark END as the judgment point
- ABSORBABLE_BRANCH flag: to keep isAbsorbable=true while
traversing A→B→END
========================================================================
PATTERN-SPECIFIC FLAG SETTINGS
========================================================================
Pattern: A+ B
-------------
Element 0: A (max=INF)
flags: ABSORBABLE | ABSORBABLE_BRANCH
Element 1: B
flags: (none)
A+ is absorbable. Judgment point is A itself. Every row at A,
contexts can be compared and absorbed. When contexts move to B,
B has no ABSORBABLE_BRANCH flag, so isAbsorbable turns false.
Pattern: (A B)+ C
-----------------
Element 0: A (depth=1)
flags: ABSORBABLE_BRANCH
Element 1: B (depth=1)
flags: ABSORBABLE_BRANCH
Element 2: END (depth=1, max=INF)
flags: ABSORBABLE | ABSORBABLE_BRANCH
Element 3: C (depth=0)
flags: (none)
A, B, END are in the absorbable region at depth 1. Only END is the
judgment point where contexts synchronize every 2 rows (group length).
When contexts move to C (depth 0), they leave the absorbable region
and isAbsorbable turns false.
Pattern: (A+ B+)+ C
-------------------
Element 0: A (depth=2, max=INF)
flags: ABSORBABLE | ABSORBABLE_BRANCH
Element 1: B (depth=2, max=INF)
flags: (none)
Element 2: END (inner, depth=2)
flags: (none)
Element 3: END (outer, depth=1, max=INF)
flags: (none)
Element 4: C (depth=0)
flags: (none)
Only the FIRST A+ on the FIRST iteration gets flags. Why?
Ctx1: A+ 10 times → B+ 10 times → outer loop → C
Ctx2: (1 row later) A+ 9 times → absorbed by Ctx1 at A
Ctx3: (1 row later) A+ 8 times → absorbed by Ctx1 at A
...
Ctx10: (1 row later) A+ 1 time → absorbed by Ctx1 at A
Ctx11: (1 row later) B input → pattern starts at A+, no match
Judgment point is A. Absorption works there. When contexts move to B+,
it has no ABSORBABLE_BRANCH flag, so isAbsorbable turns false.
Pattern: A B C
--------------
Element 0: A
flags: (none)
Element 1: B
flags: (none)
Element 2: C
flags: (none)
No unbounded element. Pattern is not absorbable.
Contexts start with isAbsorbable = false from the beginning.
Pattern: A+ C | B+
------------------
Element 0: ALT
flags: ABSORBABLE_BRANCH
Branch 1:
Element: A (max=INF)
flags: ABSORBABLE | ABSORBABLE_BRANCH
Element: C
flags: (none)
Branch 2:
Element: B (max=INF)
flags: ABSORBABLE | ABSORBABLE_BRANCH
A+ branch: judgment point is A. When contexts move to C, C has no
flags, so isAbsorbable turns false.
B+ branch: judgment point is B, fully absorbable.
ALT gets ABSORBABLE_BRANCH for efficient branch-level flag management.
Pattern: A+ B | (C D)+
----------------------
Element 0: ALT
flags: ABSORBABLE_BRANCH
Branch 1:
Element: A (max=INF)
flags: ABSORBABLE | ABSORBABLE_BRANCH
Element: B
flags: (none)
Branch 2:
Element: C (depth=1)
flags: ABSORBABLE_BRANCH
Element: D (depth=1)
flags: ABSORBABLE_BRANCH
Element: END (depth=1, max=INF)
flags: ABSORBABLE | ABSORBABLE_BRANCH
A+ branch: judgment point is A. When contexts move to B, B has no
flags, so isAbsorbable turns false.
(C D)+ branch: judgment point is END, fully absorbable.
Contexts on different branches cannot be compared (different elemIdx).
ALT gets ABSORBABLE_BRANCH for efficient branch-level flag management.
Pattern: A+ | B C
-----------------
Element 0: ALT
flags: ABSORBABLE_BRANCH
Branch 1:
Element: A (max=INF)
flags: ABSORBABLE | ABSORBABLE_BRANCH
Branch 2:
Element: B
flags: (none)
Element: C
flags: (none)
A+ branch: judgment point is A, fully absorbable.
B C branch: no unbounded element, not absorbable. When contexts enter
B C branch, B has no ABSORBABLE_BRANCH flag, so isAbsorbable turns false.
ALT gets ABSORBABLE_BRANCH because at least one branch (A+) is absorbable.
Pattern: A+ | B C+
------------------
Element 0: ALT
flags: ABSORBABLE_BRANCH
Branch 1:
Element: A (max=INF)
flags: ABSORBABLE | ABSORBABLE_BRANCH
Branch 2:
Element: B
flags: (none)
Element: C (max=INF)
flags: (none)
A+ branch: judgment point is A, fully absorbable.
B C+ branch: B is not absorbable, C+ has no flags (not first unbounded).
Pattern: A B | C D
------------------
Element 0: ALT
flags: (none)
Branch 1:
Element: A
flags: (none)
Element: B
flags: (none)
Branch 2:
Element: C
flags: (none)
Element: D
flags: (none)
Neither branch has unbounded elements. Pattern is not absorbable.
Contexts start with isAbsorbable = false from the beginning.
ALT has no flags because no branch is absorbable.
========================================================================
ABSORBABLE REGION BOUNDARY: When Contexts Leave Absorbable Region
========================================================================
This section illustrates when contexts transition from absorbable to
non-absorbable state by showing how hasAbsorbableState flag changes
as contexts progress through pattern elements.
Pattern: A+ B+
--------------
Element 0: A (max=INF)
flags: ABSORBABLE | ABSORBABLE_BRANCH
Element 1: B (max=INF)
flags: (none)
Context execution:
Row 0: Ctx starts at A
→ state at A (count=1) → state.isAbsorbable = true
→ Ctx.hasAbsorbableState = true (at least one state is absorbable)
→ Ctx.allStatesAbsorbable = true (all states are absorbable)
Row 1: Ctx at A
→ state at A (count=2) → state.isAbsorbable = true
→ Ctx.hasAbsorbableState = true
→ Ctx.allStatesAbsorbable = true
Row 2: First B input, Ctx progresses from A to B
→ state transitions to B (count=1) → state.isAbsorbable = false
→ B has no ABSORBABLE_BRANCH flag
→ Ctx.hasAbsorbableState = false (no absorbable states remain)
→ Ctx.allStatesAbsorbable = false
→ Monotonic property: hasAbsorbableState stays false forever
Row 3+: Ctx continues at B
→ Ctx.hasAbsorbableState = false (permanent)
→ This context can never absorb others, and cannot be absorbed
Key point: Once contexts leave A+ region and enter B+, absorption
completely stops. The A+ portion achieved O(n) absorption while it
lasted, which is the design goal.
Pattern: ((A B)+ C)+
--------------------
Outer loop iteration 1:
Inner (A B)+: Elements A, B, inner END (depth=1)
flags on A, B: ABSORBABLE_BRANCH
flags on inner END: ABSORBABLE | ABSORBABLE_BRANCH
Element C (depth=0):
flags: (none)
Outer END (depth=0):
flags: (none)
Context execution (first outer iteration):
First inner iteration of (A B)+:
Row 0: Ctx starts at A
→ state at A (count=0, depth=1) → state.isAbsorbable = true
→ Ctx.hasAbsorbableState = true (at least one state is absorbable)
→ Ctx.allStatesAbsorbable = true (all states are absorbable)
Row 1: Ctx at B, then epsilon transition to inner END
→ state at B (count=0, depth=1) → state.isAbsorbable = true
→ B→END is epsilon transition (same row processing)
→ state at inner END (count=1, depth=1) → state.isAbsorbable = true
→ Ctx.hasAbsorbableState = true
→ Ctx.allStatesAbsorbable = true
→ Inner END loops back to A
Second inner iteration of (A B)+:
Row 2: Ctx at A again
→ state at A (count=1, depth=1) → state.isAbsorbable = true
→ Ctx.hasAbsorbableState = true
→ Ctx.allStatesAbsorbable = true
Row 3: Ctx at B, then epsilon transition to inner END → exit to C
→ state at B (count=1, depth=1) → state.isAbsorbable = true
→ state at inner END (count=2, depth=1) → state.isAbsorbable = true
→ Inner END can loop back or exit (assume (A B)+ min satisfied, exit
to C)
→ Ctx.hasAbsorbableState = true (still in absorbable region)
→ Ctx.allStatesAbsorbable = true
Row 4: Ctx at C, then epsilon transition to outer END
→ state transitions to C (depth=0) → state.isAbsorbable = false
→ C has no ABSORBABLE_BRANCH flag (depth=0, outside inner group)
→ Ctx.hasAbsorbableState = false (no absorbable states remain)
→ Ctx.allStatesAbsorbable = false (C state is not absorbable)
→ Monotonic property: hasAbsorbableState = false is PERMANENT for this
context
→ C→outer END is epsilon transition (same row processing)
→ state at outer END (count=1, depth=0) → state.isAbsorbable = false
→ Ctx.hasAbsorbableState = false (still false)
→ Ctx.allStatesAbsorbable = false (still false)
→ Outer END loops back to start
Row 5+: Ctx continues in second outer iteration (same context)
→ Outer END loops back to A (depth=1)
→ New A state created via transition from outer END state
→ state.isAbsorbable = false (inherited from outer END state,
monotonic!)
→ A element has ABSORBABLE_BRANCH flag, but cannot override transition
inheritance
→ Ctx.hasAbsorbableState = false (PERMANENT, monotonic property!)
→ Ctx.allStatesAbsorbable = false (PERMANENT, A state is not absorbable)
→ This context can no longer absorb others (hasAbsorbableState = false)
→ This context cannot be absorbed (allStatesAbsorbable = false)
→ Context continues matching but absorption is disabled forever
Note: Element flags only apply at initial state creation (context start).
During state transitions, state.isAbsorbable is inherited from previous
state.
Once false, it remains false through all subsequent transitions.
Key points:
- Within first outer iteration's (A B)+ region, absorption works
- When progressing to C (Row 4), hasAbsorbableState becomes false
permanently
- Same context continuing to second iteration: absorption stays disabled
- NEW contexts starting at later rows: fresh start with hasAbsorbableState
= true
- Example: If a new context (Ctx2) starts at Row 5, Ctx2.hasAbsorbableState
= true
- Pattern achieves O(n) absorption within first outer iteration's (A B)+
portion
========================================================================
DETAILED EXECUTION EXAMPLE: A+ | B C+
========================================================================
This section provides a detailed walkthrough of the "A+ | B C+" pattern
execution to illustrate absorption eligibility restoration, state
transition design, and the two-flag absorption mechanism.
Pattern Structure:
------------------
Element 0: ALT
flags: ABSORBABLE_BRANCH
Branch 1:
Element: A (max=INF)
flags: ABSORBABLE | ABSORBABLE_BRANCH
Branch 2:
Element: B
flags: (none)
Element: C (max=INF)
flags: (none)
Execution with input A=TRUE, B=TRUE, C=FALSE repeated:
Note: A+ keeps A state at the last matched position (not FIN state).
FIN transition is checked and recorded in matchedState, but FIN state
itself is not added to ctx->states. This keeps ctx->states clean with
only in-progress states for absorption comparison.
Row 1: Ctx1 starts → ALT creates A state and B state simultaneously
→ A state: A=TRUE matches (count=1) [state.isAbsorbable=true]
* Can transition to FIN → Ctx1.matchedState recorded (Row 1)
* A state remains in ctx->states (for potential continuation)
→ B state: B=TRUE matches (count=1) [state.isAbsorbable=false]
* B is bounded (max=1), but remains at B state for next row check
* B state remains in ctx->states (will check repetition on next
row)
→ Row 1 end: Ctx1.states = {A state (isAbsorbable=true), B state
(isAbsorbable=false)}
→ Ctx1.hasAbsorbableState = true (A state is absorbable)
→ Ctx1.allStatesAbsorbable = false (B state is not absorbable)
→ Ctx1.matchedState = A branch match ending at Row 1
Row 2: Ctx2 starts (same as Ctx1 at Row 1)
→ Ctx1: A state matches A=TRUE (count=2) [isAbsorbable=true]
* Can transition to FIN → Ctx1.matchedState updated (Row 2, longer
match)
* A state remains in ctx->states (count=2)
B state: count=1 already, max=1, cannot repeat → transitions to C
state
C state checks C=FALSE → mismatch, dies
* Note: Even if C+ completed, FIN check unnecessary
(A+ matchedState has better altPriority, would win anyway)
→ Ctx1 Row 2 end: Ctx1.states = {A state (isAbsorbable=true, count=2)}
→ Ctx1.hasAbsorbableState = true (A state is absorbable)
→ Ctx1.allStatesAbsorbable = true (only A state, which is absorbable)
→ Ctx2: A state (count=1) [isAbsorbable=true]
* Can transition to FIN → Ctx2.matchedState recorded (Row 2)
* A state remains in ctx->states
B state: B=TRUE matches (count=1) [isAbsorbable=false]
* B state remains (will transition to C on next row)
→ Ctx2 Row 2 end: Ctx2.states = {A state (isAbsorbable=true), B state
(isAbsorbable=false)}
→ Ctx2.hasAbsorbableState = true (A state is absorbable)
→ Ctx2.allStatesAbsorbable = false (B state is not absorbable)
→ Absorption check: Ctx1.hasAbsorbableState && Ctx2.allStatesAbsorbable?
→ false (Ctx2.allStatesAbsorbable = false) → No absorption
Row 3: Ctx3 starts
→ Ctx1: A state matches A=TRUE (count=3) [isAbsorbable=true]
* Can transition to FIN → Ctx1.matchedState updated (Row 3)
* A state remains (count=3)
→ Ctx1 Row 3 end: Ctx1.states = {A state (isAbsorbable=true, count=3)}
→ Ctx1.hasAbsorbableState = true
→ Ctx1.allStatesAbsorbable = true
→ Ctx2: A state matches A=TRUE (count=2) [isAbsorbable=true]
* Can transition to FIN → Ctx2.matchedState updated (Row 3)
* A state remains (count=2)
B state: count=1 already, max=1, cannot repeat → transitions to C
state
C state checks C=FALSE → mismatch, dies
* Note: Even if C+ completed, FIN check unnecessary (lexical order)
→ Ctx2 Row 3 end: Ctx2.states = {A state (isAbsorbable=true, count=2)}
→ Ctx2.hasAbsorbableState = true (A state is absorbable)
→ Ctx2.allStatesAbsorbable = true (only A state, which is absorbable)
→ Ctx3: A state (count=1) [isAbsorbable=true]
B state: B=TRUE matches (count=1) [isAbsorbable=false]
→ Ctx3 Row 3 end: Ctx3.states = {A state (isAbsorbable=true), B state
(isAbsorbable=false)}
→ Ctx3.hasAbsorbableState = true (A state is absorbable)
→ Ctx3.allStatesAbsorbable = false (B state is not absorbable)
→ Absorption check: Ctx1 vs Ctx3 (checked first, from back)
* Pre-check: Ctx1.hasAbsorbableState && Ctx3.allStatesAbsorbable?
* false (Ctx3.allStatesAbsorbable = false)
* No absorption attempted (fast filter rejected)
→ Absorption check: Ctx1 vs Ctx2
* Pre-check: Ctx1.hasAbsorbableState && Ctx2.allStatesAbsorbable?
* true (both flags are true)
* Ctx1.states = {A state (count=3)}
* Ctx2.states = {A state (count=2)}
* All Ctx2 states covered by Ctx1 (A: 3 >= 2) → Ctx2 absorbed!
* Note: Even if Ctx1 had additional states (absorbable or not),
Ctx2 would still be absorbed as long as all Ctx2 states are
covered
This example demonstrates three key design aspects:
1. Two-flag absorption design: Ctx2 becomes eligible for absorption
when its B state disappears due to mismatch. Two distinct flags track:
- hasAbsorbableState: remains true throughout (A state always absorbable)
- allStatesAbsorbable: changes from false (Row 2, has B) to true (Row 3,
only A)
This separation enables fast pre-check without state iteration.
2. State transition design: A+ keeps A state in ctx->states at the
last matched position for absorption comparison. FIN transition is
only checked and recorded in matchedState, not added to ctx->states.
3. Asymmetric absorption logic: Ctx1 can absorb Ctx2 even if Ctx1
has additional states, as long as Ctx2.allStatesAbsorbable=true and
all Ctx2 states are covered by matching Ctx1 states with higher counts.
4. Pruning opportunity (future optimization): With eager pruning,
the B state could be eliminated at Row 1 (since max=1, count=1, no
further progression possible), making Ctx1.allStatesAbsorbable=true
immediately. This would enable absorption at Row 2 instead of Row 3.
Alternatively, when B state dies during Row 2 processing, we could
immediately update allStatesAbsorbable and trigger absorption without
waiting for the next row. However, this optimization is not
implemented yet - implementation is left for future work.
========================================================================
STATE TRANSITION DESIGN FOR ABSORPTION
========================================================================
NFA State Management Philosophy
--------------------------------
The absorption optimization requires clean separation between:
- In-progress states (ctx->states): active states being evaluated each row
- Completion markers (ctx->matchedState): records best match found so far
This design decision has profound implications for how state transitions
work, especially for unbounded quantifiers like A+.
The FIN State Problem
---------------------
Traditional NFA implementation for A+ would:
1. Match A=TRUE → increment count
2. Create TWO states:
- Loop state: A (count++) → add to ctx->states for next row
- Escape state: FIN (pattern complete) → add to ctx->states
Problem: If FIN state is added to ctx->states, absorption becomes complex:
- Ctx1.states = {A state (count=3), FIN state}
- Ctx2.states = {A state (count=2), FIN state}
- Both have FIN states, but should we compare them?
- FIN states don't have meaningful counts to compare
- FIN is not in ABSORBABLE region (pattern already finished)
The Solution: Deferred FIN Transition
--------------------------------------
Instead of creating FIN state in ctx->states, we:
1. Keep A state at last matched position in ctx->states
2. Check if FIN transition is possible
3. Record completion in ctx->matchedState (separate from ctx->states)
4. FIN state is NOT added to ctx->states
Example for A+ with continuous A=TRUE input:
Row 1: Ctx1.states = {A state (count=1)}
Ctx1.matchedState = {FIN, ending at Row 1}
Row 2: Ctx1.states = {A state (count=2)} ← A state updated, not
duplicated
Ctx1.matchedState = {FIN, ending at Row 2} ← updated to longer
match
Row 3: Ctx1.states = {A state (count=3)}
Ctx1.matchedState = {FIN, ending at Row 3}
Absorption comparison (Row 3):
- Ctx1.states = {A state (count=3)}
- Ctx2.states = {A state (count=2)}
- Compare: count=3 >= count=2, same elemIdx → absorb Ctx2
- Clean! No FIN states to complicate the comparison.
Benefits of This Design
------------------------
1. Clean absorption logic:
- ctx->states contains ONLY in-progress states
- All states in ctx->states are comparable (same depth, count meaningful)
- FIN states don't interfere with absorption judgment
2. Efficient state management:
- No duplicate FIN states created each row
- matchedState updated in-place (not accumulated)
- Reduced memory footprint
3. Correct greedy semantics:
- A+ continues matching as long as A=TRUE
- matchedState always records the longest match so far
- When A finally mismatches, we use the last matchedState
4. Simple state tracking:
- ctx->states reflects "where we are in the pattern"
- matchedState reflects "best completion found so far"
- No confusion between "still matching" vs "already finished"
State Absorbability Inheritance
--------------------------------
Element flags (ABSORBABLE, ABSORBABLE_BRANCH) only apply at initial state
creation when a context starts. During state transitions, state.isAbsorbable
is inherited from the previous state, not from element flags.
This inheritance mechanism has critical implications:
1. Initial state creation (context start at Row 0):
- state.isAbsorbable = (elem->flags & RPR_ELEM_ABSORBABLE_BRANCH) != 0
- Element flags determine initial absorbability
2. State transitions (A→B, B→END, END→A, etc.):
- New state inherits state.isAbsorbable from previous state
- Element flags on the new element do NOT override this inheritance
- Once state.isAbsorbable = false, it remains false through all
transitions
3. Monotonic property:
- state.isAbsorbable can change from true → false (leaving absorbable
region)
- state.isAbsorbable CANNOT change from false → true (cannot re-enter)
- This ensures contexts permanently leave absorbable region once they
exit
Example: Pattern ((A B)+ C)+
First outer iteration (Row 0-4):
Row 0: Context starts → A state created
→ state.isAbsorbable = true (from A element's ABSORBABLE_BRANCH flag)
Row 1: A→B transition
→ B state.isAbsorbable = true (inherited from A state, not from B flag)
Row 1: B→END transition (epsilon, same row)
→ END state.isAbsorbable = true (inherited from B state)
Row 4: C created via transition from inner END
→ C state.isAbsorbable = false (C element has no ABSORBABLE_BRANCH flag)
Row 4: C→outer END transition (epsilon, same row)
→ outer END state.isAbsorbable = false (inherited from C state)
Second outer iteration (Row 5+, same context continuing):
Row 5+: outer END loops back to A
→ New A state created via transition from outer END state
→ state.isAbsorbable = false (INHERITED from outer END state)
→ A element has ABSORBABLE_BRANCH flag, but this does NOT apply
→ Element flags only apply at context start, not during transitions
→ Monotonic property: once false, stays false forever
This is why hasAbsorbableState becomes permanently false when contexts
leave the first unbounded region - all subsequent states inherit
isAbsorbable=false regardless of element flags they transition to.
========================================================================
PLANNING STAGE: computeAbsorbability()
========================================================================
Located in src/backend/optimizer/plan/rpr.c.
Core Concept: Find First Unbounded Portion
-------------------------------------------
The algorithm identifies the FIRST unbounded portion of the pattern starting
from element 0. Only this first portion gets absorption flags. Elements
after
this portion do not get flags, even if they are unbounded.
Examples:
- A+ B+: First unbounded portion is A+. B+ gets no flags.
- ((A B)+ C)+: First unbounded portion is (A B)+. C and outer END get no
flags.
- (A B)+ C: First unbounded portion is (A B)+. C gets no flags.
Key constraints for the first unbounded portion:
- Must be unbounded VAR (max=INF) or unbounded GROUP (END has max=INF)
- For unbounded GROUP:
* All elements inside at the same depth must be simple {1,1} VARs
* No nested GROUPs inside (depth > startDepth)
* No unbounded VARs inside (all VARs must be {1,1})
Why these constraints:
- Simple VARs ensure predictable synchronization at END element
- No nested GROUPs keeps depth-based flag assignment simple
- No unbounded VARs inside prevents complex state combinations
Examples of valid first unbounded portions:
- A+ (simple unbounded VAR)
- (A B)+ (unbounded GROUP with simple VARs A and B at depth=1)
- (A B C)+ (unbounded GROUP with simple VARs A, B, C at depth=1)
Examples where outer pattern structure is complex, but absorption still
works:
- ((A B)+ C)+
* Analysis STOPS at first unbounded portion: (A B)+
* First portion (A B)+ is valid: A and B are simple {1,1} VARs at depth=1
* C and outer END are beyond first portion (ignored, get no flags)
* Result: Absorption works for the (A B)+ portion
Examples that fail constraints (cannot set absorption flags):
- (A+ B+)+ (inner A+ and B+ are unbounded VARs, violates constraint for
first portion)
Algorithm:
1. Check if pattern is too short (< 2 elements) → not absorbable
2. If pattern starts with ALT:
- Iterate through each branch
- For each branch, check isUnboundedStart() on first element of branch
- If true, call setAbsorbabilityFlagsForBranch() for that branch
- If any branch is absorbable, set ABSORBABLE_BRANCH on ALT element
3. If pattern does not start with ALT:
- Check isUnboundedStart() on first element (element 0)
- If true, call setAbsorbabilityFlagsForBranch(pattern, 0)
Helper function isUnboundedStart(pattern, idx):
Scans from idx forward to determine if this starts an unbounded portion.
Case 1: Simple unbounded VAR
- Element at idx has max=INF
- Return true immediately
Case 2: Unbounded GROUP
- Scan elements from idx forward while depth >= startDepth
- Find END element with:
* depth == startDepth
* max == INF (unbounded loop)
* jump == idx (loops back to start)
- Check all elements at startDepth are simple {1,1} VARs
- Check no nested structures (depth > startDepth)
- Check for outer nesting: if another END exists at lower depth before
FIN
→ nested structure, return false
- Return true if all checks pass
Helper function setAbsorbabilityFlagsForBranch(pattern, branchIdx):
Performs two passes:
1. First pass: detect if this is an unbounded GROUP pattern
- Scan elements from branchIdx to find END with unbounded max that
jumps back
- Record endIdx if found
2. Second pass: set flags on all elements at unbounded depth
- ABSORBABLE_BRANCH: all elements where depth == startDepth
- ABSORBABLE: judgment point only
* Unbounded GROUP → only END element gets ABSORBABLE
* Simple unbounded VAR → only first element gets ABSORBABLE
Runtime conditions (in buildRPRPattern):
- Check runtime conditions first to avoid unnecessary pattern analysis:
* rpSkipTo must be ST_PAST_LAST_ROW (not NEXT ROW)
* Frame must be unlimited (UNBOUNDED FOLLOWING or not ROWS mode)
- If conditions met: call computeAbsorbability() for structural check
- If conditions not met: set pattern->isAbsorbable = false directly
- This eliminates runtime condition checks in executor
========================================================================
RUNTIME OPERATION
========================================================================
Located in src/backend/executor/nodeWindowAgg.c.
State-Level and Context-Level Absorption Flags:
Each NFA state tracks its own absorption status (state->isAbsorbable).
Context tracks TWO distinct absorption flags for different purposes:
State Initialization:
Two different mechanisms depending on how state is created:
1. Initial state creation (in nfa_start_context):
- state->isAbsorbable = (elem->flags & RPR_ELEM_ABSORBABLE_BRANCH) != 0
- Element flags determine initial absorbability when context starts
2. State transitions (in nfa_state_clone):
- state->isAbsorbable = previousState->isAbsorbable
- Inherited from previous state, NOT from element flags
- Element flags are NOT checked during transitions
- Monotonic: once false, stays false through all transitions
Context Initialization (in nfa_start_context):
- ctx->hasAbsorbableState = pattern->isAbsorbable
- ctx->allStatesAbsorbable = pattern->isAbsorbable
- Initial state's isAbsorbable also set based on element 0's flag
Flag Management (after each row in nfa_step()):
Two flags track absorption status:
1. hasAbsorbableState (can this context absorb others?):
- true if at least one state has isAbsorbable=true
- false if all states have isAbsorbable=false
- Indicates context is in absorbable region
- Monotonic: once false, stays false forever
2. allStatesAbsorbable (can this context be absorbed?):
- true if ALL states have isAbsorbable=true
- false if any state has isAbsorbable=false
- Indicates context is eligible for absorption
- Can change: false → true (non-absorbable states die)
Optimization: Once hasAbsorbableState becomes false, no need to
recalculate:
- All absorbable states have died
- New states only come from transitions (can't create new absorbable
states)
- Both flags remain false permanently
- Skip all state iteration for flag updates
Recalculation logic (only if hasAbsorbableState still true):
- Iterate through all states
- Check each state's isAbsorbable flag
- Update both flags based on current states
Absorption Check (in nfa_absorb_contexts()):
Runtime conditions (SKIP TO PAST LAST ROW, unlimited frame) are
pre-validated at planning time. Executor simply checks
pattern->isAbsorbable.
Absorption logic: Can Ctx1 (older) absorb Ctx2 (newer)?
Pre-check (fast filter using context flags):
1. Ctx1.hasAbsorbableState == true (Ctx1 has ability to absorb)
2. Ctx2.allStatesAbsorbable == true (Ctx2 is eligible to be absorbed)
- If either check fails → skip absorption entirely
- No need to iterate through states for these checks
Actual absorption check (if pre-check passes):
3. For each state in Ctx2, find matching state in Ctx1 (same elemIdx)
4. Each Ctx2 state must be covered: Ctx1 count >= Ctx2 count
Key insight: asymmetric check with two-flag design
- Ctx2 (newer): allStatesAbsorbable must be true (checked via flag)
- Ctx1 (older): hasAbsorbableState must be true (may have extra
non-absorbable states)
- We check if all Ctx2 states are covered by matching Ctx1 states
Example:
- Ctx1.states = {A state (absorbable, count=3), X state (non-absorbable)}
- Ctx1.hasAbsorbableState = true, Ctx1.allStatesAbsorbable = false
- Ctx2.states = {A state (absorbable, count=2)}
- Ctx2.hasAbsorbableState = true, Ctx2.allStatesAbsorbable = true
- Pre-check: Ctx1.hasAbsorbableState && Ctx2.allStatesAbsorbable? Yes
- Check: A state in Ctx1 covers A state in Ctx2? Yes (3 >= 2)
- Result: Ctx2 absorbed
- Ctx1's X state doesn't prevent absorption
This design ensures:
- No repeated element flag lookups (cached in state)
- No runtime condition checks (validated once at planning)
- Fast context-level filtering (two pre-computed flags, no state
iteration)
- Correct synchronization (only at designated judgment points)
- Simple depth-independent logic (flag-based, not depth comparisons)
- Asymmetric check: newer context must be fully absorbable, older just
needs one
========================================================================
PERFORMANCE IMPACT
========================================================================
Test results for "(A B)+" pattern with 60 rows:
- Absorbed length: 2/2/2 (= group length, correct synchronization)
- Peak states: 5
- Peak contexts: 4
- Absorbed count: 29
The two-flag design ensures contexts synchronize at END elements,
preventing premature absorption attempts at intermediate positions
(A or B). This guarantees correctness while maintaining efficiency.
For absorbable patterns like "A+" or "(A B)+", practically ALL
eligible contexts get absorbed because:
- Contexts separated by group-length rows always synchronize at
judgment points
- Older contexts always have count >= newer contexts at the same
judgment point
- This is exactly what achieves O(n²) → O(n) reduction
========================================================================
SUMMARY
========================================================================
The context absorption optimization achieves O(n²) → O(n) performance
by eliminating redundant match searches. The design uses multiple
coordinated mechanisms:
1. Two-Flag Element Design (Planning Time):
- RPR_ELEM_ABSORBABLE: marks judgment points (WHERE to compare)
- RPR_ELEM_ABSORBABLE_BRANCH: marks absorbable regions (tracks
boundaries)
- For simple unbounded VAR (A+): judgment point is the VAR itself
- For unbounded GROUP ((A B)+): judgment point is END element only
- Ensures contexts synchronize before comparison
2. First Unbounded Portion Strategy (Planning Time):
- Algorithm identifies FIRST unbounded portion starting from element 0
- Only this portion gets absorption flags
- Examples: A+ B+ → only A+ flagged; ((A B)+ C)+ → only (A B)+ flagged
- Achieves O(n) absorption within first portion (sufficient for most
cases)
3. State Inheritance Mechanism (Runtime):
- Element flags apply ONLY at context start (nfa_start_context)
- State transitions inherit state.isAbsorbable from previous state
- Element flags do NOT override inheritance during transitions
- Monotonic: once state.isAbsorbable = false, stays false forever
- This ensures contexts permanently leave absorbable region after first
portion
4. Two-Flag Context Design (Runtime):
- hasAbsorbableState: can this context absorb others? (≥1 absorbable
state)
* Monotonic: true→false only, cannot recover once false
- allStatesAbsorbable: can this context be absorbed? (ALL states
absorbable)
* Dynamic: can change false→true (when non-absorbable states die)
- Enables fast pre-check without state iteration
- Optimization: once hasAbsorbableState = false, skip all recalculation
forever
5. FIN State Separation (Runtime):
- ctx->states: in-progress states only (for absorption comparison)
- ctx->matchedState: pattern completion marker (best match so far)
- FIN state NOT added to ctx->states (keeps comparison clean)
- A+ pattern: A state remains at last position, FIN only in matchedState
- Prevents FIN states from complicating absorption logic
6. Asymmetric Absorption Logic (Runtime):
- Pre-check: Ctx1.hasAbsorbableState && Ctx2.allStatesAbsorbable
- Older context (Ctx1): needs ≥1 absorbable state (may have extra states)
- Newer context (Ctx2): ALL states must be absorbable
- Check: all Ctx2 states covered by matching Ctx1 states (count ≥)
- Asymmetry enables absorption even when Ctx1 has non-absorbable states
7. Synchronization at Judgment Points (Runtime):
- Contexts only compared when at same elemIdx (same pattern position)
- For (A B)+: contexts synchronize at END every group-length rows
- For A+: contexts synchronize at A every row
- Prevents premature comparison at intermediate positions
- Guarantees correctness while maintaining O(n) efficiency
Key Benefits:
- Correct synchronization: contexts compared only at designated judgment
points
- Efficient runtime: no repeated element flag lookups, fast context-level
filtering
- Simple state management: clean separation between in-progress vs
completed states
- Depth-independent logic: flag-based design works for nested patterns
- Monotonic optimization: permanent false flags skip unnecessary
recalculation
- No runtime condition checks: pre-validated at planning time
========================================================================
FUTURE WORK
========================================================================
Element Flags for Next Element Type
------------------------------------
To optimize state transition decisions, planning-time flags indicate
next element type (defined in src/include/optimizer/rpr.h):
RPR_ELEM_NEXT_IS_END: next element is END (epsilon transition, loop back)
RPR_ELEM_NEXT_IS_FIN: next element is FIN (epsilon transition, pattern
end)
These flags enable faster runtime decisions without accessing the next
element:
- Faster state transition decisions (single flag check vs array lookup)
- Better cache locality (element data already in cache)
- Clear distinction between END and FIN transitions
- Simplified runtime logic (flag-based branching)
Flag definitions added to header file. Setter logic in planning code (rpr.c)
would scan elements and set flags based on next element type during pattern
compilation.
Reluctant Quantifier Support (Mid-term)
---------------------------------------
Reluctant (non-greedy) quantifiers (A+?, B+?) are defined in SQL:2016
standard
but not yet supported by the PostgreSQL parser. Pattern syntax containing
'?'
after quantifiers currently results in parser error.
The RPR_ELEM_RELUCTANT flag (0x01) is defined in the header file for future
use, preserving the grammar structure for when parser support is added.
Once parser support is implemented, absorption optimization considerations:
- Reluctant quantifiers may create different synchronization patterns
- Count comparison logic remains valid (Ctx1 count >= Ctx2 count)
- May benefit from specialized absorption rules
This is left as a mid-term consideration pending parser implementation and
subsequent analysis of interaction between reluctant semantics and
absorption.
Lexical Order Optimization for ALT Patterns (Long-term)
--------------------------------------------------------
In ALT patterns (e.g., "A+ | B C+"), once a higher-priority branch creates
a matchedState, lower-priority branches cannot win due to lexical order.
Optimization opportunity:
- Once A+ branch creates matchedState, B C+ branch cannot win
- B C+ branch can skip FIN transition checks entirely
- Early pruning: if ctx->matchedState exists with better altPriority,
skip completion checks for lower-priority branches
Current implementation processes all branches fully, which is correct but
does redundant work for lower-priority branches. This optimization is left
as a long-term consideration due to implementation complexity versus
marginal performance gains.
I hope this documentation helps anyone (including future me) who
needs to understand or modify this code. The design took several
iterations to get right, and I wanted to record the reasoning
while it's still fresh in my mind.
Comments welcome!
Best regards,
Henson
2026년 1월 18일 (일) PM 9:09, Tatsuo Ishii <[email protected]>님이 작성:
> Hi Henson,
>
> > Attached is the patch with the following changes:
> >
> > 1. Removed 26 variables limit and fixed count overflow:
> > - Deleted defineInitial field from WindowClause (parsenodes.h),
> > WindowAgg (plannodes.h), and related code in createplan.c and
> > parse_rpr.c
> > - Now uses RPR_VARID_MAX (252) constant for the limit check
> > - Fixed NFA count overflow issue:
> > * Changed RPRNFAState.counts from int16 to int32 (execnodes.h)
> > * Added RPR_COUNT_MAX (INT32_MAX) constant (rpr.h)
> > * Handles pattern matching failures correctly at >32,767 repetitions
> > * Added INT32_MAX boundary handling to prevent overflow
> > - Updated test cases in rpr.sql:
> > * Updated variable limit tests from 26 to 252
> > * Added 100K row large partition test to verify int32 overflow
> handling
> >
> > 2. Simplified context absorption logic (under review):
> > - Key insight: Most absorption opportunities provide no practical
> benefit
> > because subsequent contexts are discarded immediately after a match
> > succeeds. Only absorption at unbounded pattern start shows
> measurable
> > impact.
> > - Changes:
> > * Planner (rpr.c): Added isUnboundedStart() to identify patterns
> where
> > leading unbounded quantifiers enable safe absorption
> > * Executor (nodeWindowAgg.c): Simplified to only absorb at pattern
> > start,
> > separated cleanup logic into nfa_cleanup_dead_contexts()
> > * Added pattern->isAbsorbable flag to enable absorption only when
> safe
> > - Impact: Reduces context growth from quadratic to linear.
> > Example from rpr_explain.sql Test 3.1 (PATTERN A+ B, 50 rows):
> > * With absorption: 51 total contexts, 3 peak, 30 absorbed
> > * Without absorption: Would create O(n²) contexts (each row starts a
> > context, creating 1+2+3+4 contexts per 5-row segment)
> > * Absorption maintains linear O(n) growth instead of quadratic
> > explosion
> > - Note: Changes in EXPLAIN ANALYZE output suggest there may be issues
> in
> > the matching process, particularly in absorption and NFA state
> > transitions.
> > I need to review these areas more thoroughly before I can be
> confident
> > in
> > their correctness.
> >
> > 3. Added NFA statistics to EXPLAIN ANALYZE output (under review):
> > - Extended explain.c to show NFA construction and matching statistics
> > - Added comprehensive statistics tracking (execnodes.h):
> > * NFALengthStats structure for min/max/avg length measurements
> > * State statistics: peak, total created, merged (deduplicated)
> > * Context statistics: peak, total created, absorbed, skipped, pruned
> > * Match statistics: succeeded/failed counts with length
> distributions
> > - Implemented statistics collection in nodeWindowAgg.c
> > - New test suite: rpr_explain.sql for testing NFA statistics output
> > - Note: rpr_explain is not included in parallel_schedule (runs
> > separately)
> > - Statistics reveal pattern complexity impact:
> > Example from Test 2.7 (PATTERN ((A | B)+)+, 1000 rows):
> > * NFA States: 2,664 peak, 892,447 total created
> > * Peak grows linearly with rows (~2.6x), likely due to state merging
> > * High total shows significant state churn despite controlled peak
> > * This demonstrates complex pattern behavior that users should
> > understand
> > * Question: Should we add a GUC parameter to limit NFA state peak
> count
> > (e.g., max_rpr_nfa_states) to provide administrators with safety
> > controls?
> > - This feature is still under review and may be refined in future
> patches
> >
> > 4. Major refactoring of pattern compilation and optimization (rpr.c):
> > - Split large monolithic functions into smaller, semantically
> meaningful
> > single-purpose functions for better readability
> > - Pattern compilation refactored:
> > * Added collectDefineVariables() to populate variable names from
> DEFINE
> > * Split scanRPRPattern() into recursive worker and wrapper
> > * Extracted allocateRPRPattern() for memory allocation
> > * Split fillRPRPattern() by node type: fillRPRPatternVar(),
> > fillRPRPatternGroup(), fillRPRPatternAlt()
> > - Pattern optimization converted to dispatcher pattern:
> > * optimizeSeqPattern() - flattens nested SEQ, merges consecutive
> > vars/groups
> > * optimizeAltPattern() - flattens nested ALT, removes duplicates
> > * optimizeGroupPattern() - multiplies quantifiers, unwraps single
> > children
> > - Each optimization step broken into separate helper functions:
> > * flattenSeqChildren(), mergeConsecutiveVars(),
> > mergeConsecutiveGroups()
> > * flattenAltChildren(), removeDuplicateAlternatives()
> > * tryMultiplyQuantifiers(), tryUnwrapGroup()
> > - New optimization: mergeGroupPrefixSuffix()
> > * A B (A B)+ -> (A B){2,}
> > * (A B)+ A B -> (A B){2,}
> > * A B (A B)+ A B -> (A B){3,}
> > - Code structure reorganized with categorized forward declarations
> > - Added test cases in rpr.sql to verify new optimizations:
> > * Consecutive GROUP merge tests
> > * Quantifier multiplication tests (including INF blocking)
> > * Group prefix/suffix merge tests
> >
> > 5. Fixed quantifier validation:
> > - Corrected bounds checking in gram.y (e.g., {,m} now requires m >= 1)
> > - Improved error messages for invalid quantifier ranges
> >
> > Additional notes:
> > - The NFA statistics output format in EXPLAIN ANALYZE was quickly
> > implemented for debugging purposes. I would appreciate feedback from
> > experienced PostgreSQL developers on:
> > * Whether the output format and structure are appropriate
> > * Whether these statistics would be valuable for query tuning in
> > production,
> > or if they are only useful for development/debugging
> > * If production-worthy, what simplifications might make them more
> > accessible
> > to users unfamiliar with NFA internals
> > - Reluctant quantifier handling needs further clarification. If I cannot
> > make confident fixes before the next review cycle, I may temporarily
> > raise errors for reluctant cases until the logic is properly sorted
> out.
> >
> > Next steps:
> > - The executor implementation (nodeWindowAgg.c) is the only major
> component
> > remaining for detailed review.
> > - I will focus on thoroughly reviewing the absorption logic and NFA state
> > transition mechanisms to address the concerns noted in section 2.
> > - After making necessary improvements, I will send the next patch with
> > more confidence in the executor correctness.
>
> Ok, I have merged your patches with my patches to create v40
> patches.
>
> My patches are to fix issue with RPR + IGNORE NULLS case. I modified
> ignorenulls_getfuncarginframe() which is called from window functions
> when IGNORE NULLS option is specified. To cope with RPR, I added
> row_is_in_reduced_frame() call to it. Also test cases were added.
>
> Attached are the v40 patches.
>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS K.K.
> English: http://www.sraoss.co.jp/index_en/
> Japanese:http://www.sraoss.co.jp
>