Re: [HACKERS] Documentation, window functions
Robert Haas wrote: > On Thu, Sep 23, 2010 at 11:34 PM, Dennis Bj?rklund wrote: > >> On Wed, Sep 22, 2010 at 6:03 AM, Dennis Bj?rklund > >> wrote: > >> But I confess that I'm sort of murky on how ORDER affects the window > >> frame, or how to rephrase this more sensibly. > > > > The rows included in the calculation of the window function are per default > > > > RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW > > > > where CURRENT ROW include all the rows that are equal to the row you are > > at according to the ordering. So if you say order by name then all the > > rows up to your name and all rows with the same name are included, not > > later rows. > > > > If you don't have any ordering, then all rows are "equal" and all rows are > > included in the computation. That's why your example behaved like it did. > > > > At least that's my understanding of how these things work. I've not used > > window functions very much myself. > > > > This is fairly difficult stuff and it probably don't belong in a tutorial > > but the current wording suggest that you can add any ordering and it won't > > affect the result. That is also a bad since it might teach people the > > wrong thing. > > Hmm... it is true that average will produce the same results on any > ordering of the same set of input values, though. Perhaps the word > "partition" emcompass that, though then again maybe not. > > I'd be happy to fix this if I understand what to fix it to. I clarified the window function ORDER BY wording to avoid mentioning avg(). Applied patch attached. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + diff --git a/doc/src/sgml/advanced.sgml b/doc/src/sgml/advanced.sgml index 11859b4..218988e 100644 *** a/doc/src/sgml/advanced.sgml --- b/doc/src/sgml/advanced.sgml *** SELECT depname, empno, salary, avg(salar *** 383,392 ! Although avg will produce the same result no matter ! what order it processes the partition's rows in, this is not true of all ! window functions. When needed, you can control that order using ! ORDER BY within OVER. Here is an example: SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary DESC) FROM empsalary; --- 383,392 ! You can also control the order in which rows are processed by ! window functions using ORDER BY within OVER. ! (The window ORDER BY does not even have to match the ! order in which the rows are output.) Here is an example: SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary DESC) FROM empsalary; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Documentation, window functions
Robert Haas wrote: > Hmm... it is true that average will produce the same results on any > ordering of the same set of input values, though. Not exactly. For floating point approximations you get a more accurate sum (and therefore a more accurate average) if you add the values in ascending order of absolute value. Depending on the data, it can make quite a difference. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Documentation, window functions
> Hmm... it is true that average will produce the same results on any > ordering of the same set of input values, though. Perhaps the word > "partition" emcompass that, though then again maybe not. > > I'd be happy to fix this if I understand what to fix it to. I wish I knew how to rephrase it so it is both as simple as now and also totally clear. We can't complicate it, it's a tutorial. /Dennis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Documentation, window functions
On Thu, Sep 23, 2010 at 11:34 PM, Dennis Björklund wrote: >> On Wed, Sep 22, 2010 at 6:03 AM, Dennis Björklund wrote: >> But I confess that I'm sort of murky on how ORDER affects the window >> frame, or how to rephrase this more sensibly. > > The rows included in the calculation of the window function are per default > > RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW > > where CURRENT ROW include all the rows that are equal to the row you are > at according to the ordering. So if you say order by name then all the > rows up to your name and all rows with the same name are included, not > later rows. > > If you don't have any ordering, then all rows are "equal" and all rows are > included in the computation. That's why your example behaved like it did. > > At least that's my understanding of how these things work. I've not used > window functions very much myself. > > This is fairly difficult stuff and it probably don't belong in a tutorial > but the current wording suggest that you can add any ordering and it won't > affect the result. That is also a bad since it might teach people the > wrong thing. Hmm... it is true that average will produce the same results on any ordering of the same set of input values, though. Perhaps the word "partition" emcompass that, though then again maybe not. I'd be happy to fix this if I understand what to fix it to. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Documentation, window functions
> On Wed, Sep 22, 2010 at 6:03 AM, Dennis Björklund wrote: > But I confess that I'm sort of murky on how ORDER affects the window > frame, or how to rephrase this more sensibly. The rows included in the calculation of the window function are per default RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW where CURRENT ROW include all the rows that are equal to the row you are at according to the ordering. So if you say order by name then all the rows up to your name and all rows with the same name are included, not later rows. If you don't have any ordering, then all rows are "equal" and all rows are included in the computation. That's why your example behaved like it did. At least that's my understanding of how these things work. I've not used window functions very much myself. This is fairly difficult stuff and it probably don't belong in a tutorial but the current wording suggest that you can add any ordering and it won't affect the result. That is also a bad since it might teach people the wrong thing. /Dennis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Documentation, window functions
Excerpts from Hitoshi Harada's message of mié sep 22 12:54:45 -0400 2010: > > Maybe we can find some better wording of the above? > > Your point is true, but I believe it's still ok because the section is > a tutorial for novices. If you start to explain everything here, > readers don't want to read it to the end. We had this exact question in the spanish list two weeks ago. I also suggest that we need to explain this a bit more explicitely. Actually the spanish question involved a PARTITION BY / ORDER BY clause, and the difference showed up not because of the existance of ORDER BY alone, but by whether the ORDER BY had the same columns than PARTITION BY or more. Here it is: http://archives.postgresql.org/message-id/of7db68f53.5da80480-on8625779f.0058b555-8625779f.00591...@correogs.com.mx (The archive is a bit confusing because the results appear as images at the bottom of the email, instead of being interspersed with the text.) -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Documentation, window functions
2010/9/22 Dennis Björklund : > In > > http://www.postgresql.org/docs/9.0/static/tutorial-window.html > > it say > > "Although avg will produce the same result no matter what order it > processes the partition's rows in, this is not true of all window > functions. When needed, you can control that order using ORDER BY within > OVER." > > While it's true that avg() produce the same result no matter what order. A > ORDER BY clause will affect what rows are included in the computation and > thus change the result (the default window frame is > RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW). So one can not in > general add an ORDER BY to the example in the tutorial and get the same > result as without an ORDER BY. > > Maybe we can find some better wording of the above? Your point is true, but I believe it's still ok because the section is a tutorial for novices. If you start to explain everything here, readers don't want to read it to the end. Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Documentation, window functions
On Wed, Sep 22, 2010 at 6:03 AM, Dennis Björklund wrote: > In > > http://www.postgresql.org/docs/9.0/static/tutorial-window.html > > it say > > "Although avg will produce the same result no matter what order it > processes the partition's rows in, this is not true of all window > functions. When needed, you can control that order using ORDER BY within > OVER." > > While it's true that avg() produce the same result no matter what order. A > ORDER BY clause will affect what rows are included in the computation and > thus change the result (the default window frame is > RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW). So one can not in > general add an ORDER BY to the example in the tutorial and get the same > result as without an ORDER BY. > > Maybe we can find some better wording of the above? Yeah, that doesn't seem right. rhaas=# create table foo (a integer); CREATE TABLE rhaas=# insert into foo values (1); INSERT 0 1 rhaas=# insert into foo values (2); INSERT 0 1 rhaas=# insert into foo values (3); INSERT 0 1 rhaas=# select a, avg(a) over () from foo; a |avg ---+ 1 | 2. 2 | 2. 3 | 2. (3 rows) rhaas=# select a, avg(a) over (order by a) from foo; a | avg ---+ 1 | 1. 2 | 1.5000 3 | 2. (3 rows) But I confess that I'm sort of murky on how ORDER affects the window frame, or how to rephrase this more sensibly. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Documentation, window functions
In http://www.postgresql.org/docs/9.0/static/tutorial-window.html it say "Although avg will produce the same result no matter what order it processes the partition's rows in, this is not true of all window functions. When needed, you can control that order using ORDER BY within OVER." While it's true that avg() produce the same result no matter what order. A ORDER BY clause will affect what rows are included in the computation and thus change the result (the default window frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW). So one can not in general add an ORDER BY to the example in the tutorial and get the same result as without an ORDER BY. Maybe we can find some better wording of the above? /Dennis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers