MohamedAbdeen21 commented on code in PR #11585:
URL: https://github.com/apache/datafusion/pull/11585#discussion_r1686180731


##########
datafusion/physical-expr/src/expressions/binary.rs:
##########
@@ -289,6 +289,14 @@ impl PhysicalExpr for BinaryExpr {
             return apply_cmp_for_nested(self.op, &lhs, &rhs);
         }
 
+        if left_data_type.is_floating() {
+            lhs = normalize_floating_zeros(lhs, &left_data_type)?;
+        };
+
+        if right_data_type.is_floating() {
+            rhs = normalize_floating_zeros(rhs, &right_data_type)?;
+        }

Review Comment:
   Initial Criterion results on 1,000,000 rows:
   
   ```
   evaluate with normalization
                           time:   [20.411 ms 20.570 ms 20.747 ms]
   Found 3 outliers among 100 measurements (3.00%)
     2 (2.00%) high mild
     1 (1.00%) high severe
   
   evaluate without normalization
                           time:   [820.15 µs 825.46 µs 831.99 µs]
   Found 11 outliers among 100 measurements (11.00%)
     4 (4.00%) high mild
     7 (7.00%) high severe
   ```
   
   That's almost 25x slower (20.570 ms vs. 825.46 µs).
   
   I'll try a couple of changes first; if the performance is still horrible, I'll
close the PR.
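
For readers without the PR open, here is a minimal sketch of what a zero-normalization pass might look like. It assumes `normalize_floating_zeros` rewrites `-0.0` to `+0.0`, which the name suggests but the hunk above does not show. `normalize_zeros` and `same_bits` are hypothetical names used only for illustration, and the sketch works on a plain `f64` slice rather than on Arrow arrays.

```rust
/// Hypothetical helper (not the PR's `normalize_floating_zeros`):
/// rewrites every negative zero in the slice to positive zero.
fn normalize_zeros(values: &mut [f64]) {
    for v in values.iter_mut() {
        // IEEE 754 comparison treats -0.0 and +0.0 as equal, so this
        // condition matches both zeros; storing 0.0 clears the sign bit.
        // NaN never compares equal to 0.0, so NaN values are left alone.
        if *v == 0.0 {
            *v = 0.0;
        }
    }
}

/// Bitwise equality, standing in for any code path that inspects the
/// raw representation instead of using IEEE 754 comparison.
fn same_bits(a: f64, b: f64) -> bool {
    a.to_bits() == b.to_bits()
}
```

Under that assumption, the two zeros differ bitwise before the pass and match after it. In the hunk above the normalization runs on each floating-point operand before the comparison, so any such pass is extra work on top of the comparison itself.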



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

