Hi Henson, In my understanding RANGE/GROUPS are not allowed when RPR is defined. (See ISO/IEC 19075-5 section 6.10.2 "ROWS BETWEEN CURRENT ROW AND"). So proper fix would be to error out at the parse/analyze phase if RANGE/GROUPS are used when RPR is defined.
Vik, Jaconb, Thoughts? Best regards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp > Hi Ishii-san, > > During code review of nodeWindowAgg.c, I discovered that > update_reduced_frame() > can receive the same pos value repeatedly for RANGE/GROUPS frames. > > > ---------------------------------------------------------------------- > PATTERN DISCOVERED > ---------------------------------------------------------------------- > > For RANGE/GROUPS frames with peer rows (rows having the same ORDER BY > value): > > Data: id=1,2,3 all have val=10 (peer rows) > id=4,5 both have val=20 (peer rows) > > RANGE frame processing: > - id=1 processing: pos=0 (frameheadpos points to first peer) > - id=2 processing: pos=0 (same frameheadpos) > - id=3 processing: pos=0 (same frameheadpos) > - id=4 processing: pos=3 > - id=5 processing: pos=3 > > ROWS frames always have sequential pos values (0,1,2,3,4...), but > RANGE/GROUPS > can have repeated pos values (0,0,0,3,3...). > > > ---------------------------------------------------------------------- > CURRENT IMPLEMENTATION ISSUE > ---------------------------------------------------------------------- > > The current implementation appears to assume sequential processing only, > which > is valid for ROWS frames. Two specific issues found: > > 1. In update_reduced_frame(): > - The pos <= nfaLastProcessedRow check assumes "already processed pos > doesn't > need reprocessing", but RANGE/GROUPS require processing the same pos > multiple times. > > 2. In nfa_process_row(): > - Frame boundary calculation: ctxFrameEnd = matchStartRow + frameOffset > + 1 > - This adds frameOffset directly to row position, which works for ROWS > but > may be incorrect for RANGE (value-based) and GROUPS (group-based) > frames. > > > ---------------------------------------------------------------------- > TEST COVERAGE LIMITATION > ---------------------------------------------------------------------- > > The existing RANGE/GROUPS test cases in rpr_base.sql only cover simple > scenarios > that don't reach the frame end, which is why this issue wasn't detected. > > Most tests target ROWS frames, with only a small number testing RANGE/GROUPS > frames. > > > ---------------------------------------------------------------------- > PROPOSAL > ---------------------------------------------------------------------- > > Rather than directly modifying these functions, we should consider an > alternative approach for non-UNBOUNDED frames: > - Strictly respect the frame head position > - Minimize lower-level optimizations (absorption/pruning) > - Avoid creating subsequent contexts for limited reuse benefit > - For bounded frames, the benefit of context reuse across rows may be > limited > compared to UNBOUNDED frames where contexts can be reused extensively > - This would simplify RANGE/GROUPS handling at the cost of some optimization > > To implement this properly, we need to broaden our understanding of the > execution structure through code review of Window clauses and Aggregate > functions. > > Therefore, for this round of nodeWindowAgg.c review: > 1. Perform code simplification/refactoring for ROWS frames only > 2. Add FIXME comments to RANGE/GROUPS test cases > 3. Defer RANGE/GROUPS handling to a future phase > > There may be other issues beyond RANGE/GROUPS. > After this code review round is complete, I would like to regroup and > discuss > the direction forward. > > Best regards, > Henson
