Hi hackers

I converted PostgreSQL RPR regression test queries to Oracle
MATCH_RECOGNIZE syntax and executed them on both systems to perform
cross-validation.

The tests were based on the following PostgreSQL regression test files:

  rpr_base.sql
  rpr_nfa.sql


PostgreSQL bug: zero-min reluctant quantifier

During cross-validation a PostgreSQL bug was discovered involving
reluctant quantifiers whose minimum repetition is 0.

Example pattern:

  PATTERN (A*?)
  DEFINE A AS val > 0

Result comparison:

pattern   PostgreSQL (cnt)   Oracle (cnt)
A*?       1,1,1              0,0,0
A??       1,1,1              0,0,0
A+?       1,1,1              1,1,1

For reluctant quantifiers with min=0 (such as *? and ??), PostgreSQL
always consumes at least one row, while Oracle allows a zero-length
match. When min>=1 (e.g., A+?), both systems behave the same.

This behavior was consistently observed across the converted tests.


Design difference: unused DEFINE variables

Example:

  PATTERN (A+)
  DEFINE A AS id > 0, B AS id > 5

PostgreSQL executes the query successfully and ignores the unused
variable B.

Oracle raises:

  ORA-62503: illegal variable definition


Oracle limitations observed


Bounded quantifier limit

A{200}  -> works
A{201}  -> ORA-62518

Oracle appears to limit the upper bound of bounded quantifiers to 200,
while PostgreSQL does not impose this restriction.


Nested nullable quantifiers

Examples:

  (A*)*
  (A*)+
  (((A)*)*)*

  (A?|B){1,2}
  ((A?){2,3}){2,3}
  (A?){n,m}
  (A? B?){2,3}

Oracle raises:

  ORA-62513

when a nullable subpattern is wrapped by an outer quantifier, while
PostgreSQL executes these patterns successfully.


These results come from running the converted PostgreSQL RPR regression
tests on Oracle for comparison.

Best regards,
SugJun

Reply via email to