kosiew commented on code in PR #16781: URL: https://github.com/apache/datafusion/pull/16781#discussion_r2207167050
########## datafusion-examples/examples/function_factory.rs: ########## @@ -153,6 +154,38 @@ impl ScalarUDFImpl for ScalarFunctionWrapper { fn output_ordering(&self, _input: &[ExprProperties]) -> Result<SortProperties> { Ok(SortProperties::Unordered) } + + fn equals(&self, other: &dyn ScalarUDFImpl) -> bool { + let Some(other) = other.as_any().downcast_ref::<Self>() else { + return false; + }; + let Self { + name, + expr, + signature, + return_type, + } = self; + name == &other.name + && expr == &other.expr + && signature == &other.signature + && return_type == &other.return_type + } + + fn hash_value(&self) -> u64 { + let Self { + name, + expr, + signature, + return_type, + } = self; + let mut hasher = DefaultHasher::new(); + std::any::type_name::<Self>().hash(&mut hasher); + name.hash(&mut hasher); + expr.hash(&mut hasher); + signature.hash(&mut hasher); + return_type.hash(&mut hasher); + hasher.finish() + } Review Comment: There are identical equals and hash_value methods in many places in this PR. Do you think creating a helper macro would help to reduce the risk of typos and ease future maintenance? ########## datafusion/core/tests/user_defined/user_defined_aggregates.rs: ########## @@ -957,6 +957,33 @@ impl AggregateUDFImpl for MetadataBasedAggregateUdf { curr_sum: 0, })) } + + fn equals(&self, other: &dyn AggregateUDFImpl) -> bool { + let Some(other) = other.as_any().downcast_ref::<Self>() else { + return false; + }; + let Self { + name, + signature, + metadata, + } = self; + name == &other.name + && signature == &other.signature + && metadata == &other.metadata + } + + fn hash_value(&self) -> u64 { + let Self { + name, + signature, + metadata: _, // unhashable + } = self; + let mut hasher = DefaultHasher::new(); + std::any::type_name::<Self>().hash(&mut hasher); + name.hash(&mut hasher); + signature.hash(&mut hasher); + hasher.finish() Review Comment: In UDFs where you comment “// unhashable” and omit metadata from the hash (e.g. MetadataBasedAggregateUdf in here), you still include metadata in equals. Doesn't this violates the rule that a == b ⇒ hash(a) == hash(b). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org