Jefffrey commented on code in PR #18519:
URL: https://github.com/apache/datafusion/pull/18519#discussion_r2499831489
##########
datafusion/sqllogictest/test_files/decimal.slt:
##########
@@ -858,10 +858,11 @@ select log(2,
100000000000000000000000000000000000::decimal(38,0));
116
# log(10^35) for decimal128 with another base
+# TODO: this should be 116.267483321058, error with native decimal log impl
query R
select log(2.0, 100000000000000000000000000000000000::decimal(38,0));
----
-116.267483321058
+116
Review Comment:
This is interesting; we weren't actually making use of the native decimal
implementation for log here apparently. The decimal value was actually being
casted to float. Fixing the signature to have decimals take precedence (so they
don't get casted to float) reveals a bug in our decimal log implementation.
Note to self: raise separate issue for this
##########
datafusion/sqllogictest/test_files/order.slt:
##########
@@ -675,11 +675,11 @@ query TT
----
logical_plan
01)Sort: log_c11_base_c12 ASC NULLS LAST
-02)--Projection: log(aggregate_test_100.c12, CAST(aggregate_test_100.c11 AS
Float64)) AS log_c11_base_c12
+02)--Projection: log(aggregate_test_100.c12, aggregate_test_100.c11) AS
log_c11_base_c12
Review Comment:
`c12` is f64, `c11` is f32; now `log()` will handle casting them to the
right types internally for it's implementation
##########
datafusion/functions/src/utils.rs:
##########
@@ -140,6 +140,7 @@ where
F: Fn(L::Native, R::Native) -> Result<O::Native, ArrowError>,
R::Native: TryFrom<ScalarValue>,
{
+ let right = right.cast_to(&R::DATA_TYPE, None)?;
Review Comment:
I found it odd that only arrays get casted and not scalars; fix so both of
them get casted now.
(My fix of keeping the base as scalar in log highlighted this issue in an
existing test)
##########
datafusion/functions/src/math/log.rs:
##########
@@ -192,76 +184,68 @@ impl ScalarUDFImpl for LogFunc {
// Support overloaded log(base, x) and log(x) which defaults to log(10, x)
fn invoke_with_args(&self, args: ScalarFunctionArgs) ->
Result<ColumnarValue> {
- let args = ColumnarValue::values_to_arrays(&args.args)?;
+ if args.arg_fields.iter().any(|a| a.data_type().is_null()) {
+ return ColumnarValue::Scalar(ScalarValue::Null)
+ .cast_to(args.return_type(), None);
+ }
- let (base, value) = if args.len() == 2 {
- // note in f64::log params order is different than in sql. e.g in
sql log(base, x) == f64::log(x, base)
- (ColumnarValue::Array(Arc::clone(&args[0])), &args[1])
+ let (base, value) = if args.args.len() == 2 {
+ (args.args[0].clone(), &args.args[1])
} else {
- // log(num) - assume base is 10
- let ret_type = if args[0].data_type().is_null() {
- &DataType::Float64
- } else {
- args[0].data_type()
- };
+ // no base specified, default to 10
(
- ColumnarValue::Array(
-
ScalarValue::new_ten(ret_type)?.to_array_of_size(args[0].len())?,
- ),
- &args[0],
+
ColumnarValue::Scalar(ScalarValue::new_ten(args.return_type())?),
+ &args.args[0],
)
};
+ let value = value.to_array(args.number_rows)?;
- // All log functors have format 'log(value, base)'
- // Therefore, for `calculate_binary_math` the first type means a type
of main array
- // The second type is the type of the base array (even if derived from
main)
- let arr: ArrayRef = match value.data_type() {
- DataType::Float32 => calculate_binary_math::<
- Float32Type,
- Float32Type,
- Float32Type,
- _,
- >(value, &base, |x, b| Ok(f32::log(x, b)))?,
- DataType::Float64 => calculate_binary_math::<
- Float64Type,
- Float64Type,
- Float64Type,
- _,
- >(value, &base, |x, b| Ok(f64::log(x, b)))?,
- DataType::Int32 => {
Review Comment:
I removed arms for Int32 & Int64 since they should now be casted to Float64
by the new signature; this is functionally equivalent to what the
implementations for Int32 & Int64 did anyway
##########
datafusion/functions/src/math/log.rs:
##########
@@ -72,37 +72,28 @@ impl Default for LogFunc {
impl LogFunc {
pub fn new() -> Self {
+ // Converts decimals & integers to float64, accepting other floats as
is
Review Comment:
Main fix here for signature; to me this is more readable than the previous
signature, and importantly it accepts any decimals regardless of
precision/scale.
Note to self: add tests for decimals of different precision/scale if they
don't exist
##########
datafusion/functions/src/math/log.rs:
##########
@@ -192,76 +184,68 @@ impl ScalarUDFImpl for LogFunc {
// Support overloaded log(base, x) and log(x) which defaults to log(10, x)
fn invoke_with_args(&self, args: ScalarFunctionArgs) ->
Result<ColumnarValue> {
- let args = ColumnarValue::values_to_arrays(&args.args)?;
Review Comment:
This was converting all the args to array, losing optimization opportunity
if the base was a scalar. Fix so we maintain the scalar throughout. the `value`
will need to be an array because `calculate_binary_math` requires its left
argument (`value` for us) to be an array anyway:
https://github.com/apache/datafusion/blob/a5eb9121ccf802dda547897155403b08a4fbf774/datafusion/functions/src/utils.rs#L131-L135
##########
datafusion/functions/src/math/log.rs:
##########
@@ -277,17 +261,28 @@ impl ScalarUDFImpl for LogFunc {
mut args: Vec<Expr>,
info: &dyn SimplifyInfo,
) -> Result<ExprSimplifyResult> {
+ let mut arg_types = args
+ .iter()
+ .map(|arg| info.get_data_type(arg))
+ .collect::<Result<Vec<_>>>()?;
+ let return_type = self.return_type(&arg_types)?;
+
+ // Null propagation
+ if arg_types.iter().any(|dt| dt.is_null()) {
+ return Ok(ExprSimplifyResult::Simplified(lit(
+ ScalarValue::Null.cast_to(&return_type)?
+ )));
+ }
Review Comment:
So any `log(null, x)`, `log(x, null)` or `log(null)` should be immediately
simplified to `null`.
I'll add a test case to show this actually has an effect.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]