On Sun, Feb 4, 2018 at 6:10 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:

> Oliver Ford <ojf...@gmail.com> writes:
> > [ 0001-window-frame-v13.patch ]
>
> I've been hacking on this all week (with breaks for release notes) and
> have gotten it into a state that I think is close to committable.
>
> There was quite a lot I didn't like about the patch initially, notably
> that the interaction with operator classes/families was done all wrong.
> The idea is to add one support function per opclass, not jam them all
> into one opclass that breaks every rule for B-tree opclasses.  For one
> reason, with this approach there's no chance of dealing with non-default
> sort orders imposed by non-default opclasses.  (As a concrete example,
> suppose that we have two btree opclasses for complex numbers, one that
> sorts by real part and one that sorts by imaginary part.  You can write
> a well-defined in_range function for each of these, but one of them has
> to increment the real part and the other the imaginary part.)  I whacked
> that around and also wrote the missing documentation for the API spec
> for in_range functions.  The path of least resistance was to dump it
> into the nbtree/README text file, which I'm not that satisfied with;
> probably it should go in the main SGML docs, but I did not find a good
> place to put it.
>
> I also really didn't like the implementation you'd chosen in
> nodeWindowAgg.c to scan the entire partition and build an array of peer
> group lengths.  That risks running OOM with a large partition.  Even if
> the array doesn't lead to OOM, the tuplestore will spill to disk with
> nasty performance consequences.  We should try to ensure that the
> tuplestore needn't get larger than the frame, so that well-written queries
> with narrow frames can execute without spilling to disk.  So I rewrote
> that using an idea that had been speculated about in the original
> comments, but nobody had gotten to yet: add some more read pointers to
> track the frame boundaries, and advance them as needed.  I'm not really
> sure if this ends up as more or few row comparisons than the other way,
> but in any case it uses a fixed amount of memory, which is good.
>
> Also, the last patch's reimplementation of WinGetFuncArgInFrame isn't
> right: AFAICS, it results in any "relpos" that would point to a row
> in the exclusion range returning the row just after/before that range,
> which is already wrong if the exclusion range is more than one row,
> plus it doesn't renumber the rows beyond the exclusion.  The behavior
> we want is that the frame rows surviving after exclusion should appear
> consecutively numbered.  (This could be exposed with some tests using
> nth_value.)  I think the attached rewrite gets this right.  Also, punting
> entirely on the set-mark problem for SEEK_TAIL cases doesn't satisfy me,
> for the same reason as above that we don't want the tuplestore to bloat.
> What I did below is to set the mark at the frame start, which at least
> gives an opportunity for efficient queries.
>
> I hacked around on various other things too, for instance the behavior
> for null values in RANGE mode didn't seem to be per spec.
>
> I'm reasonably happy with all the code now, though surely it could use
> another look by someone else.  I've not yet reviewed the docs (other than
> the implementor-oriented details I added), nor have I really looked at the
> test cases.  I do have a couple suggestions on the test cases: for one,
> rather than duplicating the same window definition N times in each query,
> use one WINDOW clause and reference it with "OVER windowname".  Also,
> adding a bunch of columns of different types to a single table seems like
> a messy and not easily extensible way of testing different data types.
> I'd suggest leaving the existing table alone and adding a new test table
> per additional data type you want to test, so that there's an easy
> template for testing future additions of more in_range support.
>
> BTW, something I've not done here but am strongly tempted to do is
> run around and change all the uses of "RANGE value PRECEDING/FOLLOWING"
> terminology to, say, "RANGE offset PRECEDING/FOLLOWING".  "value" is
> just way too generic a term for this situation, making documentation
> confusing, plus you end up contorting sentences to avoid constructions
> like "value of the value".  I'm not wedded to "offset" if somebody's got a
> better word, but let's try to pick something more specific than "value".
> (In the ROWS and GROUPS cases, maybe write "count"?  Not entirely sure
> what to do for text that's trying to address all three cases, though.)
>
>
What about "extent_size" or just "size"? I see the SQL spec refers to
"preceding or following size" in an error message: ("data exception —
invalid preceding or following size in window function" )

Best regards
Pantelis Theodosiou

Reply via email to