Hi Henson,
> Hi Tatsuo,
>
> Thank you for the thorough testing and performance profiling!
>
> BTW, I have tested RPR with large partition/reduced frame (up to 100k
>> rows). Following query took over 3 minutes on my laptop.
>>
>
> The profiling results are very insightful and help us understand where
> the time is spent in extreme cases.
Glad to hear that the data is useful.
> I agree with your assessment. Let me clarify:
>
> The pattern itself (START UP+) is realistic - it's a common pattern for
> detecting upward trends. However, 100k+ consecutive matches without
> interruption is not realistic. Your UP{,3} example (94ms for 100k partition)
> demonstrates that the current implementation performs very well for
> realistic
> use cases.
Yes, I agree.
> I agree that we should prioritize realistic use cases. That said, there are
> potential optimizations that could help:
>
> 1. Anchored pattern absorption (see my earlier message:
>
> https://www.postgresql.org/message-id/CAAAe_zAEg7sVM%3DWDwXMyE-odGmQyXSVi5ZzWgye6SupSjdMKpg%40mail.gmail.com
> )
>
> 2. Alt-pruning: In patterns like "A* | B*", once the higher-priority A
> branch
> has a confirmed match, the B branch can be pruned immediately. Even if B
> could continue extending to a longer match, it can never be selected due
> to
> lexical order semantics―A will always win. This proactive pruning
> respects
> SQL standard semantics while reducing unnecessary state expansion.
>
> However, given the complexity of NFA internals, I believe we should take a
> step-by-step approach:
>
> 1. First, stabilize the current RPR patch and prepare it for review
> 2. Then, consider optimizations as separate follow-up patches
> 3. Each optimization should be well-tested and reviewed independently
>
> This approach reduces risk and makes review more manageable. The fact that
> the current implementation handles even unrealistic 100k-row cases without
> crashing (just slowly) shows it's already robust.
>
> What do you think about this phased approach?
I totally agree.
> Best regards,
> Henson
>
> P.S. I discovered a crash bug that was introduced in the latest patch
> refactoring. The issue occurs with nested alternation patterns like
> (A+ | (A | B)+)*, where infinite recursion happens in nfa_advance_alt when
> the inner BEGIN(+)'s skip jump is followed as an ALT branch pointer. I will
> include the fix in the next patch update.
Thanks for fixing the issue.
Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp