2010YOUY01 commented on code in PR #19278:
URL: https://github.com/apache/datafusion/pull/19278#discussion_r2610101314
##########
datafusion/functions-aggregate/src/median.rs:
##########
@@ -289,14 +289,32 @@ impl<T: ArrowNumericType> Accumulator for
MedianAccumulator<T> {
}
fn evaluate(&mut self) -> Result<ScalarValue> {
- let d = std::mem::take(&mut self.all_values);
+ let d = self.all_values.clone();
Review Comment:
I think so, we can't check if it's accumulating for window function or
regular aggregate function.
To do that we have to extend the `Accumulator` interface perhaps.
##########
datafusion/functions-aggregate/src/median.rs:
##########
@@ -289,14 +289,32 @@ impl<T: ArrowNumericType> Accumulator for
MedianAccumulator<T> {
}
fn evaluate(&mut self) -> Result<ScalarValue> {
- let d = std::mem::take(&mut self.all_values);
+ let d = self.all_values.clone();
Review Comment:
It would be great we can measure the performance after this extra deep copy.
We have queries with `median()` in extended-clickbench, and h2o-groupby
benchmarks. @alamb could you trigger the benchmark? Thank you!
##########
datafusion/sqllogictest/test_files/aggregate.slt:
##########
@@ -991,6 +991,54 @@ SELECT approx_median(col_f64_nan) FROM median_table
----
NaN
+# median_sliding_window
+statement ok
+CREATE TABLE median_window_test (
+ timestamp INT,
+ tags VARCHAR,
+ value DOUBLE
+);
+
+statement ok
+INSERT INTO median_window_test (timestamp, tags, value) VALUES
+(1, 'tag1', 10.0),
+(2, 'tag1', 20.0),
+(3, 'tag1', 30.0),
+(4, 'tag1', 40.0),
+(5, 'tag1', 50.0),
+(1, 'tag2', 60.0),
+(2, 'tag2', 70.0),
+(3, 'tag2', 80.0),
+(4, 'tag2', 90.0),
+(5, 'tag2', 100.0);
+
+query ITRR
+SELECT
+ timestamp,
+ tags,
+ value,
+ median(value) OVER (
+ PARTITION BY tags
+ ORDER BY timestamp
+ ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
Review Comment:
I recommend to test different window frames like `UNBOUNDED
PRECEDING/FOLLOWING`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]