findepi commented on code in PR #16781:
URL: https://github.com/apache/datafusion/pull/16781#discussion_r2207269417


##########
datafusion-examples/examples/function_factory.rs:
##########
@@ -153,6 +154,38 @@ impl ScalarUDFImpl for ScalarFunctionWrapper {
     fn output_ordering(&self, _input: &[ExprProperties]) -> 
Result<SortProperties> {
         Ok(SortProperties::Unordered)
     }
+
+    fn equals(&self, other: &dyn ScalarUDFImpl) -> bool {
+        let Some(other) = other.as_any().downcast_ref::<Self>() else {
+            return false;
+        };
+        let Self {
+            name,
+            expr,
+            signature,
+            return_type,
+        } = self;
+        name == &other.name
+            && expr == &other.expr
+            && signature == &other.signature
+            && return_type == &other.return_type
+    }
+
+    fn hash_value(&self) -> u64 {
+        let Self {
+            name,
+            expr,
+            signature,
+            return_type,
+        } = self;
+        let mut hasher = DefaultHasher::new();
+        std::any::type_name::<Self>().hash(&mut hasher);
+        name.hash(&mut hasher);
+        expr.hash(&mut hasher);
+        signature.hash(&mut hasher);
+        return_type.hash(&mut hasher);
+        hasher.finish()
+    }

Review Comment:
   I am not super worried from the mere fact there are similar lines of code. 
While we could hide some of the code behind macros, the macro itself would 
become more complex -- not all fields hash/compare the same way.
   
   i am more concerned about other aspects:
   -  unrelated pieces of code that needs to be changed in tandem -- not a case 
here, the implementations don't need to be similar, they just happen to be for 
consistency reasons
   - pieces of code that can be updated erroneously, or erroneously not 
updated, like equals/hash_value implementation need to be changed when new 
field is added -- this is mitigated by use of `Self {...} = self` 
destructuring, which enforces a code update
   - pieces of code that need to be added when something else changes, like 
equals/hash_value need to be implemented when an UDF gains new fields -- 
https://github.com/apache/datafusion/issues/16677 remains open. I don't have a 
solution for that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to