On 7/28/23 09:09, Tatsuo Ishii wrote:
We already recalculate a frame each time a row is processed even
without RPR. See ExecWindowAgg.

Yes, after each row.  Not for each function.

Ok, I understand now. Looking closer at the code, I realized that each
window function calls update_frameheadpos, which computes the frame
head position. But it first checks winstate->framehead_valid, and if
that is already true (probably set by another window function), it
does nothing.
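
To make the caching behavior described above concrete, here is a minimal
sketch. It is a toy model, not the executor code: ToyWinState,
toy_update_frameheadpos, toy_advance_row and the recomputations counter
are hypothetical names, and the head computation assumes a frame that
starts at CURRENT ROW.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for the window agg state. */
typedef struct ToyWinState
{
    long currentpos;      /* position of the current row */
    long frameheadpos;    /* cached frame head position */
    bool framehead_valid; /* is frameheadpos valid for currentpos? */
    int  recomputations;  /* illustration only: counts real computations */
} ToyWinState;

/* Compute the frame head unless it is already cached for this row. */
static void
toy_update_frameheadpos(ToyWinState *ws)
{
    if (ws->framehead_valid)
        return;                        /* someone already did the work */

    /* Assumption: the frame starts at CURRENT ROW. */
    ws->frameheadpos = ws->currentpos;
    ws->framehead_valid = true;
    ws->recomputations++;
}

/* Moving to the next row invalidates the cached head. */
static void
toy_advance_row(ToyWinState *ws)
{
    ws->currentpos++;
    ws->framehead_valid = false;
}
```

In this model, two window functions sharing the window can both call
toy_update_frameheadpos for the same row, but the head is computed only
once; it is computed again only after toy_advance_row.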

Also, RPR always requires the frame option ROWS BETWEEN CURRENT ROW,
which means the frame head changes each time the current row position
changes.

Off topic for now: I wonder why this restriction is in place and
whether we should respect or ignore it.  That is a discussion for
another time, though.

My guess is that anything other than ROWS BETWEEN CURRENT ROW has
little or no meaning. Consider the following example:

Yes, that makes sense.

I strongly disagree with this.  Window functions do not need to know
how the frame is defined, and indeed they should not.
We already break the rule by defining *support functions. See
windowfuncs.c.
The support functions don't know anything about the frame, they just
know when a window function is monotonically increasing and execution
can either stop or be "passed through".
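
As a rough illustration of the "execution can stop" case, here is a
minimal sketch. It assumes a filter of the form row_number() <= limit;
toy_rows_scanned is a hypothetical name, and the loop is a toy model,
not the planner or executor logic.

```c
#include <assert.h>

/*
 * Return how many rows a scan has to look at when filtering on
 * "row_number() <= limit".  Because row_number() is monotonically
 * increasing, the first row that fails the filter proves that no later
 * row can pass, so the scan stops there.
 */
static long
toy_rows_scanned(long nrows, long limit)
{
    long scanned = 0;

    for (long rn = 1; rn <= nrows; rn++)
    {
        scanned++;
        if (!(rn <= limit))
            break;              /* monotonic: nothing later can qualify */
    }
    return scanned;
}
```

Note that nothing in this sketch looks at the frame: the only property
it relies on is that the function's result never decreases.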

I see the following code in window_row_number_support:

                /*
                 * The frame options can always become "ROWS BETWEEN UNBOUNDED
                 * PRECEDING AND CURRENT ROW".  row_number() always just increments by
                 * 1 with each row in the partition.  Using ROWS instead of RANGE
                 * saves effort checking peer rows during execution.
                 */
                req->frameOptions = (FRAMEOPTION_NONDEFAULT |
                                     FRAMEOPTION_ROWS |
                                     FRAMEOPTION_START_UNBOUNDED_PRECEDING |
                                     FRAMEOPTION_END_CURRENT_ROW);

I think it not only knows about the frame, it even changes the frame
options. This seems far from "don't know anything about the frame", no?

That's the planner support function. The row_number() function itself is not even allowed to *have* a frame, per spec. We allow it, but as you can see from that support function, we completely replace it.
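
A minimal sketch of what "completely replace" means here. The TOY_*
flags and toy_row_number_support are hypothetical; the bit values are
made up for illustration and are not the real FRAMEOPTION_* definitions.

```c
#include <assert.h>

/* Hypothetical flag bits; NOT the real FRAMEOPTION_* values. */
enum
{
    TOY_FRAMEOPTION_NONDEFAULT                = 1 << 0,
    TOY_FRAMEOPTION_RANGE                     = 1 << 1,
    TOY_FRAMEOPTION_ROWS                      = 1 << 2,
    TOY_FRAMEOPTION_START_UNBOUNDED_PRECEDING = 1 << 3,
    TOY_FRAMEOPTION_END_CURRENT_ROW           = 1 << 4,
    TOY_FRAMEOPTION_END_UNBOUNDED_FOLLOWING   = 1 << 5
};

/*
 * Toy planner-time hook for a row_number()-like function: whatever
 * frame the user wrote is ignored and a fixed frame is returned.
 */
static int
toy_row_number_support(int user_frame_options)
{
    (void) user_frame_options;  /* deliberately unused: it is replaced */

    return TOY_FRAMEOPTION_NONDEFAULT |
           TOY_FRAMEOPTION_ROWS |
           TOY_FRAMEOPTION_START_UNBOUNDED_PRECEDING |
           TOY_FRAMEOPTION_END_CURRENT_ROW;
}
```

The result does not depend on the input at all, which is the point: the
frame is decided at plan time, and the function that runs per row never
sees or needs the frame the user specified.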

So all of the partition-level window functions are not affected by RPR anyway.

I have two comments about this:

It isn't just for convenience, it is for correctness.  The window
functions do not need to know which rows they are *not* operating on.

There is no such thing as a "full" or "reduced" frame.  The standard
uses those terms to explain the difference between before and after
RPR is applied, but window functions do not get to choose which frame
they apply over.  They only ever apply over the reduced window frame.

I agree that "full window frame" and "reduced window frame" do not
exist at the same time, and in the end (after computation of the
reduced frame), only the "reduced" frame is visible to window
functions/aggregates. But I still think that "full window frame" and
"reduced window frame" are important concepts for explaining and
understanding how RPR works.

If we are just using those terms for documentation, then okay.
--
Vik Fearing


