realno commented on a change in pull request #1525:
URL: https://github.com/apache/arrow-datafusion/pull/1525#discussion_r780589214
##########
File path: datafusion/src/scalar.rs
##########
@@ -526,6 +526,282 @@ macro_rules! eq_array_primitive {
}
impl ScalarValue {
+ /// Return true if the value is numeric
+ pub fn is_numeric(&self) -> bool {
+ matches!(self,
+ ScalarValue::Float32(_)
+ | ScalarValue::Float64(_)
+ | ScalarValue::Decimal128(_, _, _)
+ | ScalarValue::Int8(_)
+ | ScalarValue::Int16(_)
+ | ScalarValue::Int32(_)
+ | ScalarValue::Int64(_)
+ | ScalarValue::UInt8(_)
+ | ScalarValue::UInt16(_)
+ | ScalarValue::UInt32(_)
+ | ScalarValue::UInt64(_)
+ )
+ }
+
+ /// Add two numeric ScalarValues
+ pub fn add(lhs: &ScalarValue, rhs: &ScalarValue) -> Result<ScalarValue> {
Review comment:
The values differ from Postgres and Google because there are two versions of
stddev: 1. population and 2. sample. The one in this PR is population; it looks
like the default in Postgres and Google is sample. The two calculations differ
only in the divisor (n for population, n - 1 for sample), so I can include the
sample version in this PR as well. Good catch!
So my proposal is to add two functions, stddev_pop and stddev_samp
(following the Postgres naming), and have stddev default to stddev_samp. Does
this look reasonable? @alamb
I will also look into the inconsistency between query runs. Thanks!
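
For reference, here is a minimal standalone sketch of the difference between the two variants. It is not the code in this PR: it works on a plain `&[f64]` rather than on `ScalarValue` or Arrow arrays, and the choice to return `None` for too few inputs is an assumption made for illustration. Only the names `stddev_pop` and `stddev_samp` come from the proposal above.

```rust
/// Sum of squared deviations from the mean (illustrative helper, not part of the PR).
fn sum_squared_deviations(values: &[f64]) -> f64 {
    let mean = values.iter().sum::<f64>() / values.len() as f64;
    values.iter().map(|v| (v - mean).powi(2)).sum()
}

/// Population standard deviation: divide by n.
/// Returns None for empty input (an assumption of this sketch).
fn stddev_pop(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let n = values.len() as f64;
    Some((sum_squared_deviations(values) / n).sqrt())
}

/// Sample standard deviation: divide by n - 1.
/// Returns None for fewer than two values, since the divisor would be zero
/// (an assumption of this sketch).
fn stddev_samp(values: &[f64]) -> Option<f64> {
    if values.len() < 2 {
        return None;
    }
    let n = values.len() as f64;
    Some((sum_squared_deviations(values) / (n - 1.0)).sqrt())
}
```

For the input `[2, 4, 4, 4, 5, 5, 7, 9]` the mean is 5 and the sum of squared deviations is 32, so the population version gives sqrt(32 / 8) = 2 while the sample version gives sqrt(32 / 7), which is slightly larger. The gap shrinks as the number of rows grows.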