[GitHub] [arrow-datafusion] jonahgao commented on a diff in pull request #7515: Fix some simplification rules for floating-point arithmetic operations

via GitHub Mon, 11 Sep 2023 20:27:33 -0700


jonahgao commented on code in PR #7515:
URL: https://github.com/apache/arrow-datafusion/pull/7515#discussion_r1322313872



##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -734,19 +744,33 @@ impl<'a, S: SimplifyInfo> TreeNodeRewriter for Simplifier<'a, S> {
                 op: Modulo,
                 right: _,
             }) if is_null(&left) => *left,
-            // A % 1 --> 0
+            // A % 1 --> 0 (if A is not nullable and not floating, since NAN % 1 --> NAN)
             Expr::BinaryExpr(BinaryExpr {
                 left,
                 op: Modulo,
                 right,
-            }) if !info.nullable(&left)? && is_one(&right) => lit(0),
-            // A % 0 --> DivideByZero Error
+            }) if !info.nullable(&left)?
+                && !info.get_data_type(&left)?.is_floating()
+                && is_one(&right) =>
+            {
+                lit(0)
+            }
+            // A % 0 --> DivideByZero Error (if A is not floating and not null)
+            // A % 0 --> NAN (if A is floating and not null)
             Expr::BinaryExpr(BinaryExpr {
                 left,
                 op: Modulo,
                 right,
             }) if !info.nullable(&left)? && is_zero(&right) => {
-                return Err(DataFusionError::ArrowError(ArrowError::DivideByZero));
+                match info.get_data_type(&left)? {

Review Comment:
   @alamb 
   DataFusion uses the `rem()` function from arrow-rs to perform the `Modulo` operation.
   
   The change here makes the simplifier behave the same as `rem()` in arrow-rs, i.e., `float % 0.` --> `NAN`:
   
https://github.com/apache/arrow-rs/blob/77455d48cd6609045a4728ba908123de9d0b62fd/arrow-arith/src/numeric.rs#L71-L77
   
   The IEEE 754-2008 standard specifies the same result:
   > 7.2 Invalid operation
   > For operations producing results in floating-point format, the default result of an operation that signals the invalid operation exception shall be a quiet NaN...
   > These operations are:
   > ...
   > f) remainder: remainder(x, y), when y is zero or x is infinite...
   > ...
   
   Ref: https://en.wikipedia.org/wiki/NaN#Operations_generating_NaN
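
   The behavior described above can be checked with a minimal sketch in plain Rust. This uses only the standard library, not the arrow-rs `rem()` kernel, and the helper names `float_rem` and `int_rem` are hypothetical. Note that Rust's `%` on floats is a truncating remainder rather than the IEEE 754 `remainder` operation, but both yield NaN when the divisor is zero.

```rust
/// Floating-point remainder using Rust's built-in `%` operator.
/// (Hypothetical helper for illustration; not part of DataFusion or arrow-rs.)
fn float_rem(a: f64, b: f64) -> f64 {
    a % b
}

/// Integer remainder that returns `None` on division by zero (or overflow)
/// instead of panicking. (Hypothetical helper for illustration.)
fn int_rem(a: i64, b: i64) -> Option<i64> {
    a.checked_rem(b)
}

fn main() {
    // float % 0. --> NaN, so the simplifier must not raise DivideByZero here.
    assert!(float_rem(5.0, 0.0).is_nan());

    // NaN % 1 --> NaN, so `A % 1 --> 0` is not valid for floating-point A.
    assert!(float_rem(f64::NAN, 1.0).is_nan());

    // A fractional float leaves a nonzero remainder modulo 1 as well.
    assert_eq!(float_rem(5.5, 1.0), 0.5);

    // Integers: a zero divisor is an error, and A % 1 is always 0.
    assert_eq!(int_rem(5, 0), None);
    assert_eq!(int_rem(5, 1), Some(0));

    println!("ok");
}
```

   The integer case is why the `DivideByZero` error is still appropriate when the left operand is not a floating-point type.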






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
