Two Window aggregate node for logically same over clause

2023-10-05 Thread "Anitha S"
Hi team,

      We have observed that for logically same over clause two different window 
aggregate nodes are created in plan.  
The below query contains two window functions. Both Over clause contain the 
same partition & order clause in it. But in one over clause ordering option is 
mentioned as ascending but not in another over clause which represents the 
default option "ascending". 

 
Sample Query:
 







 CREATE TEMPORARY TABLE empsalary (

 depname varchar,

 empno bigint,

 salary int,

 enroll_date date

 );



 INSERT INTO empsalary VALUES ('develop', 10, 5200, '2007-08-01'), ('sales', 1, 
5000, '2006-10-01'),

 ('personnel', 5, 3500, '2007-12-10'),('sales', 4, 4800, 
'2007-08-08'),('personnel', 2, 3900, '2006-12-23'),

 ('develop', 7, 4200, '2008-01-01'),('develop', 9, 4500, 
'2008-01-01'),('sales', 3, 4800, '2007-08-01'),

 ('develop', 8, 6000, '2006-10-01'),('develop', 11, 5200, '2007-08-15');



 explain verbose select rank() over (partition by depname order by salary),  
percent_rank() over(partition by depname order by salary asc) from empsalary;

  QUERY PLAN

 
--

 WindowAgg  (cost=174.54..1000114.66 rows=1070 width=52)

 Output: (rank() OVER (?)), percent_rank() OVER (?), salary, depname

  ->  WindowAgg  (cost=174.54..195.94 rows=1070 width=44)

  Output: salary, depname, rank() OVER (?)

->  Sort  (cost=174.54..177.21 rows=1070 width=36)

Output: salary, depname

Sort Key: empsalary.depname, empsalary.salary

   ->  Seq Scan on pg_temp_7.empsalary  (cost=0.00..20.70 
rows=1070 width=36)

   Output: salary, depname 





Ordering option of Sort is represented by enum SortByDir (parsenodes.h). 

List of SortBy is present in WindowDef structure which has info of order by 
clause in Over clause







 /* Sort ordering options for ORDER BY and CREATE INDEX */ 

 typedef enum SortByDir

 {

  SORTBY_DEFAULT,

  SORTBY_ASC,

  SORTBY_DESC,

  SORTBY_USING /* not allowed in CREATE INDEX ... */

 } SortByDir;  





 typedef struct SortBy

 {

  NodeTag type;

  Node *node; /* expression to sort on */

  SortByDir sortby_dir; /* ASC/DESC/USING/default */

  SortByNulls sortby_nulls; /* NULLS FIRST/LAST */

  List *useOp; /* name of op to use, if SORTBY_USING */

  int location; /* operator location, or -1 if none/unknown */

 } SortBy;  





In transformWindowFuncCall API, Equality check of order clause in window 
definition failed while comparing SortByDir enum of both over clause i.e 
SORT_DEFAULT  is not equal to SORT_ASC. Hence two window clause are created in 
parse tree resulting in the creation of two different window aggregate node.

This check can be modified to form a single window aggregate node for the above 
results in faster query execution. Is there any reason for creating two 
different window aggregate node?

Thanks 
Anitha S

 

Re: Two Window aggregate node for logically same over clause

2023-10-06 Thread Ashutosh Bapat
On Thu, Oct 5, 2023 at 8:53 PM "Anitha S"  wrote:
>
>
>
> Hi team,
>
>   We have observed that for logically same over clause two different 
> window aggregate nodes are created in plan.
> The below query contains two window functions. Both Over clause contain the 
> same partition & order clause in it. But in one over clause ordering option 
> is mentioned as ascending but not in another over clause which represents the 
> default option "ascending".
>
>
> Sample Query:
>
> CREATE TEMPORARY TABLE empsalary (
> depname varchar,
> empno bigint,
> salary int,
> enroll_date date
> );
>
> INSERT INTO empsalary VALUES ('develop', 10, 5200, '2007-08-01'), ('sales', 
> 1, 5000, '2006-10-01'),
> ('personnel', 5, 3500, '2007-12-10'),('sales', 4, 4800, 
> '2007-08-08'),('personnel', 2, 3900, '2006-12-23'),
> ('develop', 7, 4200, '2008-01-01'),('develop', 9, 4500, 
> '2008-01-01'),('sales', 3, 4800, '2007-08-01'),
> ('develop', 8, 6000, '2006-10-01'),('develop', 11, 5200, '2007-08-15');
>
> explain verbose select rank() over (partition by depname order by salary), 
> percent_rank() over(partition by depname order by salary asc) from empsalary;
>   QUERY PLAN
> --
>  WindowAgg (cost=174.54..1000114.66 rows=1070 width=52)
>  Output: (rank() OVER (?)), percent_rank() OVER (?), salary, depname
>   -> WindowAgg (cost=174.54..195.94 rows=1070 width=44)
>   Output: salary, depname, rank() OVER (?)
> -> Sort (cost=174.54..177.21 rows=1070 width=36)
> Output: salary, depname
> Sort Key: empsalary.depname, empsalary.salary
>-> Seq Scan on pg_temp_7.empsalary (cost=0.00..20.70 
> rows=1070 width=36)
>Output: salary, depname
>
>
> Ordering option of Sort is represented by enum SortByDir (parsenodes.h).
>
> List of SortBy is present in WindowDef structure which has info of order by 
> clause in Over clause
>
> /* Sort ordering options for ORDER BY and CREATE INDEX */
> typedef enum SortByDir
> {
> SORTBY_DEFAULT,
> SORTBY_ASC,
> SORTBY_DESC,
> SORTBY_USING /* not allowed in CREATE INDEX ... */
> } SortByDir;
> typedef struct SortBy
> {
> NodeTag type;
> Node *node; /* expression to sort on */
> SortByDir sortby_dir; /* ASC/DESC/USING/default */
> SortByNulls sortby_nulls; /* NULLS FIRST/LAST */
> List *useOp; /* name of op to use, if SORTBY_USING */
> int location; /* operator location, or -1 if none/unknown */
> } SortBy;
>
>
> In transformWindowFuncCall API, Equality check of order clause in window 
> definition failed while comparing SortByDir enum of both over clause i.e 
> SORT_DEFAULT  is not equal to SORT_ASC. Hence two window clause are created 
> in parse tree resulting in the creation of two different window aggregate 
> node.
>
> This check can be modified to form a single window aggregate node for the 
> above results in faster query execution. Is there any reason for creating two 
> different window aggregate node?

I don't see any. https://www.postgresql.org/docs/16/sql-select.html
description of ORDER BY clause clearly says that ASC is assumed when
no direction is mentioned. The only place in code which is used to
create the node treats DEFAULT and ASC as same. May be we want to
allow default to be ASC or DESC based on some setting (read GUC) in
some future.

Another angle is to ask: Why would the query add ASC to one window
specification and not the other?

-- 
Best Wishes,
Ashutosh Bapat




Re: Two Window aggregate node for logically same over clause

2023-10-06 Thread Tom Lane
Ashutosh Bapat  writes:
> On Thu, Oct 5, 2023 at 8:53 PM "Anitha S"  wrote:
>> We have observed that for logically same over clause two different window 
>> aggregate nodes are created in plan.
>> The below query contains two window functions. Both Over clause contain the 
>> same partition & order clause in it. But in one over clause ordering option 
>> is mentioned as ascending but not in another over clause which represents 
>> the default option "ascending".

> Another angle is to ask: Why would the query add ASC to one window
> specification and not the other?

Yeah.  I can't get excited about doing anything about this.  We
promise to merge identical window clauses, but these aren't identical.
If you say "let's merge semantically equivalent clauses", that's
opening a fairly large can of worms --- for example, ought we to
recognize that "x + 1.0" and "x + 1.00" are equivalent?  Or even
"x" and "x + 0"?  (I'm pretty sure I've seen query hacks recommended
that depend on our *not* detecting that.)

Also, it would be an extremely bad idea IMO to change the way
equal() deals with this, which means that transformWindowFuncCall
would have to use bespoke code not equal() to check for matches.
That'd be ugly and a permanent maintenance gotcha.

regards, tom lane




Re: Two Window aggregate node for logically same over clause

2023-10-09 Thread "Anitha S"
https://www.postgresql.org/docs/16/sql-select.html#SQL-ORDERBY:~:text=Optionally%20one%20can%20add%20the%20key%20word%20ASC%20(ascending)%20or%20DESC%20(descending)%20after%20any%20expression%20in%20the%20ORDER%20BY%20clause.%20If%20not%20specified%2C%20ASC%20is%20assumed%20by%20default.%20Alternatively

If order by directions is not mentioned it is assumed as ASC. It is mention 
ORDER by section. Currently PG don't have any GUC to set default order 
direction. Is there any idea to set default ordering direction via GUC ?

Another angle is to ask: Why would the query add ASC to one window 

specification and not the other?

Had seen a query from user where one over clause contains order direction & 
others not.  The reason for this thread is to get a solution to handle such 
cases also.







 On Fri, 06 Oct 2023 18:06:10 +0530 Ashutosh Bapat 
 wrote ---



On Thu, Oct 5, 2023 at 8:53 PM "Anitha S"  
wrote: 
> 
> 
> 
> Hi team, 
> 
>   We have observed that for logically same over clause two different 
> window aggregate nodes are created in plan. 
> The below query contains two window functions. Both Over clause contain the 
> same partition & order clause in it. But in one over clause ordering option 
> is mentioned as ascending but not in another over clause which represents the 
> default option "ascending". 
> 
> 
> Sample Query: 
> 
> CREATE TEMPORARY TABLE empsalary ( 
> depname varchar, 
> empno bigint, 
> salary int, 
> enroll_date date 
> ); 
> 
> INSERT INTO empsalary VALUES ('develop', 10, 5200, '2007-08-01'), ('sales', 
> 1, 5000, '2006-10-01'), 
> ('personnel', 5, 3500, '2007-12-10'),('sales', 4, 4800, 
> '2007-08-08'),('personnel', 2, 3900, '2006-12-23'), 
> ('develop', 7, 4200, '2008-01-01'),('develop', 9, 4500, 
> '2008-01-01'),('sales', 3, 4800, '2007-08-01'), 
> ('develop', 8, 6000, '2006-10-01'),('develop', 11, 5200, '2007-08-15'); 
> 
> explain verbose select rank() over (partition by depname order by salary), 
> percent_rank() over(partition by depname order by salary asc) from empsalary; 
>   QUERY PLAN 
> --
>  
>  WindowAgg (cost=174.54..1000114.66 rows=1070 width=52) 
>  Output: (rank() OVER (?)), percent_rank() OVER (?), salary, depname 
>   -> WindowAgg (cost=174.54..195.94 rows=1070 width=44) 
>   Output: salary, depname, rank() OVER (?) 
> -> Sort (cost=174.54..177.21 rows=1070 width=36) 
> Output: salary, depname 
> Sort Key: empsalary.depname, empsalary.salary 
>-> Seq Scan on pg_temp_7.empsalary (cost=0.00..20.70 
> rows=1070 width=36) 
>Output: salary, depname 
> 
> 
> Ordering option of Sort is represented by enum SortByDir (parsenodes.h). 
> 
> List of SortBy is present in WindowDef structure which has info of order by 
> clause in Over clause 
> 
> /* Sort ordering options for ORDER BY and CREATE INDEX */ 
> typedef enum SortByDir 
> { 
> SORTBY_DEFAULT, 
> SORTBY_ASC, 
> SORTBY_DESC, 
> SORTBY_USING /* not allowed in CREATE INDEX ... */ 
> } SortByDir; 
> typedef struct SortBy 
> { 
> NodeTag type; 
> Node *node; /* expression to sort on */ 
> SortByDir sortby_dir; /* ASC/DESC/USING/default */ 
> SortByNulls sortby_nulls; /* NULLS FIRST/LAST */ 
> List *useOp; /* name of op to use, if SORTBY_USING */ 
> int location; /* operator location, or -1 if none/unknown */ 
> } SortBy; 
> 
> 
> In transformWindowFuncCall API, Equality check of order clause in window 
> definition failed while comparing SortByDir enum of both over clause i.e 
> SORT_DEFAULT  is not equal to SORT_ASC. Hence two window clause are created 
> in parse tree resulting in the creation of two different window aggregate 
> node. 
> 
> This check can be modified to form a single window aggregate node for the 
> above results in faster query execution. Is there any reason for creating two 
> different window aggregate node? 
 
I don't see any. https://www.postgresql.org/docs/16/sql-select.html 
description of ORDER BY clause clearly says that ASC is assumed when 
no direction is mentioned. The only place in code which is used to 
create the node treats DEFAULT and ASC as same. May be we want to 
allow default to be ASC or DESC based on some setting (read GUC) in 
some future. 
 
Another angle is to ask: Why would the query add ASC to one window 
specification and not the other? 
 
-- 
Best Wishes, 
Ashutosh Bapat