Hi Tatsuo,

However, this raises interesting questions: should we optimize patterns
> by removing {0} quantifiers or simplifying them? And if so, how should
> we handle patterns that become empty after such optimization?
>
> For example:
> - PATTERN (A{0})          → empty pattern
> - PATTERN (A{0} B{0})     → empty pattern
> - PATTERN (A{0} B)        → PATTERN (B) after optimization
>
> Empty patterns would result in zero-length matches, which our current
> implementation explicitly treats as invalid (see initialAdvance flag
> logic in nodeWindowAgg.c).
>
> More importantly, I recall that zero-length matches caused serious
> issues during development, which is why we added logic to explicitly
> avoid them.
>
> The reason I cannot immediately provide a concrete plan for A{0}
> support is that I need to deeply understand the semantic meaning of
> zero-length matches in the SQL standard first. Without this
> understanding, any implementation approach could be fundamentally
> flawed.
>
> Specifically, I need to investigate:
> - What zero-length matches mean semantically in RPR
> - How to handle empty patterns according to the standard
> - The correct behavior when a pattern optimizes to nothing
>
> After the current code review phase is complete, I'm also considering
> setting up an Oracle test environment to observe how it handles these
> edge cases. This could provide valuable insights into the expected
> behavior, especially for zero-length matches and empty patterns.
>

Our current implementation cannot support A{0} due to a structural
limitation.

The reduced_frame_map uses row-based representation (reduced_frame_map[pos]
= val),
which can only express matches consuming at least one row. It cannot
represent
zero-length matches that occur between rows without consuming any row
position.

Patterns like A{0}, A*, or A? can produce zero-length matches with no row
to mark
as RF_FRAME_HEAD and no position to register in the frame map.

We currently prevent this using the initialAdvance flag (nodeWindowAgg.c),
which skips FIN recording during initial epsilon transitions.

Supporting A{0} requires either restructuring reduced_frame_map to handle
virtual positions, or separate handling for zero-length matches. Before
choosing an approach, we need clarity on what the SQL standard expects for
zero-length match semantics (output generation, aggregate behavior, etc.).

Given this structural limitation, I'd like to ask: should we keep the
current
initialAdvance mechanism (which prevents zero-length matches) and handle
A{0}
separately?

Best regards,
Henson

Reply via email to