ozankabak commented on code in PR #14271:
URL: https://github.com/apache/datafusion/pull/14271#discussion_r1936681167


##########
datafusion/sqllogictest/test_files/window.slt:
##########
@@ -5452,3 +5452,89 @@ order by c1, c2, rank1, rank2;
 
 statement ok
 drop table t1;
+
+
+# Set-Monotonic Window Aggregate functions can output results in order
+statement ok
+CREATE EXTERNAL TABLE aggregate_test_100_ordered (
+  c1  VARCHAR NOT NULL,
+  c2  TINYINT NOT NULL,
+  c3  SMALLINT NOT NULL,
+  c4  SMALLINT,
+  c5  INT,
+  c6  BIGINT NOT NULL,
+  c7  SMALLINT NOT NULL,
+  c8  INT NOT NULL,
+  c9  INT UNSIGNED NOT NULL,
+  c10 BIGINT UNSIGNED NOT NULL,
+  c11 FLOAT NOT NULL,
+  c12 DOUBLE NOT NULL,
+  c13 VARCHAR NOT NULL
+)
+STORED AS CSV
+LOCATION '../../testing/data/csv/aggregate_test_100.csv'
+WITH ORDER (c1)
+OPTIONS ('format.has_header' 'true');
+
+statement ok
+set datafusion.optimizer.prefer_existing_sort = true;
+
+query TT
+EXPLAIN SELECT c1, SUM(c9) OVER(PARTITION BY c1) as sum_c9 FROM aggregate_test_100_ordered ORDER BY c1, sum_c9;

Review Comment:
   You are right -- the query outputs the same `sum_c9` value for every row 
within a `c1` group, regardless of the particular window/aggregate function, 
because the `OVER` clause has no `ORDER BY` and the frame is therefore the 
whole partition. So we should be able to do this optimization irrespective of 
set monotonicity. However, we don't just yet (using `AVG` instead of `SUM` 
reveals this).
   
   We will fix this with a follow-on PR early next week and move these tests 
elsewhere with that PR.
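
   To illustrate the reasoning above, here is a minimal Python sketch (not 
DataFusion code). The toy rows and the helper `window_over_partition` are 
hypothetical; they only emulate `agg(c9) OVER (PARTITION BY c1)` on input that 
is already ordered by `c1`, and check that the output is then already ordered 
by `(c1, agg)` for several aggregates, not only `SUM`.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy rows (c1, c9), already sorted by c1, mimicking WITH ORDER (c1).
rows = [("a", 5), ("a", 1), ("b", 7), ("b", 2), ("b", 9), ("c", 4)]


def window_over_partition(rows, agg):
    """Emulate agg(c9) OVER (PARTITION BY c1).

    With no ORDER BY in the window, the frame is the whole partition,
    so every row of a partition receives the same aggregate value.
    """
    groups = defaultdict(list)
    for c1, c9 in rows:
        groups[c1].append(c9)
    per_partition = {c1: agg(values) for c1, values in groups.items()}
    # Preserve the input row order; attach the partition-wide value to each row.
    return [(c1, per_partition[c1]) for c1, _ in rows]


for agg in (sum, mean, min, max):
    out = window_over_partition(rows, agg)
    # Ordering by (c1, aggregate) is already satisfied by the order on c1 alone.
    assert out == sorted(out), agg.__name__
```

   The check passes for every aggregate in the loop because the second sort key 
is constant within each partition, which is why the property should not depend 
on set monotonicity.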



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

