kosiew commented on code in PR #16781:
URL: https://github.com/apache/datafusion/pull/16781#discussion_r2207167050


##########
datafusion-examples/examples/function_factory.rs:
##########
@@ -153,6 +154,38 @@ impl ScalarUDFImpl for ScalarFunctionWrapper {
     fn output_ordering(&self, _input: &[ExprProperties]) -> 
Result<SortProperties> {
         Ok(SortProperties::Unordered)
     }
+
+    fn equals(&self, other: &dyn ScalarUDFImpl) -> bool {
+        let Some(other) = other.as_any().downcast_ref::<Self>() else {
+            return false;
+        };
+        let Self {
+            name,
+            expr,
+            signature,
+            return_type,
+        } = self;
+        name == &other.name
+            && expr == &other.expr
+            && signature == &other.signature
+            && return_type == &other.return_type
+    }
+
+    fn hash_value(&self) -> u64 {
+        let Self {
+            name,
+            expr,
+            signature,
+            return_type,
+        } = self;
+        let mut hasher = DefaultHasher::new();
+        std::any::type_name::<Self>().hash(&mut hasher);
+        name.hash(&mut hasher);
+        expr.hash(&mut hasher);
+        signature.hash(&mut hasher);
+        return_type.hash(&mut hasher);
+        hasher.finish()
+    }

Review Comment:
   There are identical equals and hash_value methods in many places in this PR.
   
   Do you think creating a helper macro would help to reduce the risk of typos 
and ease future maintenance?



##########
datafusion/core/tests/user_defined/user_defined_aggregates.rs:
##########
@@ -957,6 +957,33 @@ impl AggregateUDFImpl for MetadataBasedAggregateUdf {
             curr_sum: 0,
         }))
     }
+
+    fn equals(&self, other: &dyn AggregateUDFImpl) -> bool {
+        let Some(other) = other.as_any().downcast_ref::<Self>() else {
+            return false;
+        };
+        let Self {
+            name,
+            signature,
+            metadata,
+        } = self;
+        name == &other.name
+            && signature == &other.signature
+            && metadata == &other.metadata
+    }
+
+    fn hash_value(&self) -> u64 {
+        let Self {
+            name,
+            signature,
+            metadata: _, // unhashable
+        } = self;
+        let mut hasher = DefaultHasher::new();
+        std::any::type_name::<Self>().hash(&mut hasher);
+        name.hash(&mut hasher);
+        signature.hash(&mut hasher);
+        hasher.finish()

Review Comment:
   In UDFs where you comment “// unhashable” and omit metadata from the hash 
(e.g. MetadataBasedAggregateUdf in here), you still include metadata in equals. 
Doesn't this violates the rule that a == b ⇒ hash(a) == hash(b).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to