Jefffrey commented on code in PR #19941:
URL: https://github.com/apache/datafusion/pull/19941#discussion_r2730970248


##########
datafusion/sqllogictest/test_files/aggregate.slt:
##########
@@ -2029,11 +2029,12 @@ statement ok
 INSERT INTO t1 VALUES (TRUE);
 
 # ISSUE: https://github.com/apache/datafusion/issues/12716
-# This test verifies that approx_percentile_cont_with_weight does not panic 
when given 'NaN' and returns 'inf'
+# This test verifies that approx_percentile_cont_with_weight does not panic 
when given 'NaN'
+# With weight=0, the data point does not contribute, so result is NULL
 query R
 SELECT approx_percentile_cont_with_weight(0, 0) WITHIN GROUP (ORDER BY 
'NaN'::DOUBLE) FROM t1 WHERE t1.v1;
 ----
-Infinity
+NULL

Review Comment:
   Does this conform with other implementations? What does clickhouse do in 
this case?



##########
datafusion/sqllogictest/test_files/aggregate.slt:
##########
@@ -2352,21 +2353,21 @@ e 115
 query TI
 SELECT c1, approx_percentile_cont_with_weight(c2, 0.95) WITHIN GROUP (ORDER BY 
c3) AS c3_p95 FROM aggregate_test_100 GROUP BY 1 ORDER BY 1
 ----
-a 74
+a 65
 b 68
-c 123
-d 124
-e 115
+c 122
+d 123
+e 110

Review Comment:
   Verified against clickhouse, looks good
   
   ```sql
   :) select c1, quantileTDigestWeighted(0.95)(c3, c2::UInt64) from 
file('/Users/jeffrey/Code/datafusion/testing/data/csv/aggregate_test_100_with_dates.csv',
 CSV) group by c1 order by c1;
   
   SELECT
       c1,
       quantileTDigestWeighted(0.95)(c3, CAST(c2, 'UInt64'))
   FROM 
file('/Users/jeffrey/Code/datafusion/testing/data/csv/aggregate_test_100_with_dates.csv',
 CSV)
   GROUP BY c1
   ORDER BY c1 ASC
   
   Query id: f2aa8f65-4db9-4e16-ae53-c5d22d17da7f
   
      ┌─c1─┬─quantileTDig⋯ 'UInt64'))─┐
   1. │ a  │                       65 │
   2. │ b  │                       68 │
   3. │ c  │               122.111115 │
   4. │ d  │                    123.4 │
   5. │ e  │                110.26666 │
      └────┴──────────────────────────┘
   
   5 rows in set. Elapsed: 0.010 sec.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to