alamb commented on code in PR #16977:
URL: https://github.com/apache/datafusion/pull/16977#discussion_r2243732590


##########
datafusion/expr/src/udf.rs:
##########
@@ -720,13 +721,15 @@ pub trait ScalarUDFImpl: Debug + Send + Sync {
     /// Similarly to [`Hash`] and [`Eq`], if [`Self::equals`] returns true for two UDFs,
     /// their `hash_value`s must be the same.
     ///
-    /// By default, it is consistent with default implementation of [`Self::equals`].
+    /// By default, it hashes the type, [`Self::name`], and [`Self::aliases`]. [`Self::signature`]
+    /// is not hashed, as usually the signature is implied by the UDF type. Recall that UDFs with
+    /// state (and thus possibly changing signature) must override [`Self::equals`] and
+    /// [`Self::hash_value`].
     fn hash_value(&self) -> u64 {
-        let hasher = &mut DefaultHasher::new();
+        let hasher = &mut AHasher::default();
         self.as_any().type_id().hash(hasher);
         self.name().hash(hasher);
         self.aliases().hash(hasher);

Review Comment:
   I think we could avoid hashing `name` and `aliases` by default as well:
hashing strings is relatively expensive, and I think collisions are unlikely
when hashing the `type_id` alone.
   
   The same applies to the other default implementations (`WindowUDFImpl`, etc.).
   
   We could do that in a follow-on PR as well.
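
To make the suggestion concrete, here is a minimal sketch of a `type_id`-only default. It assumes a trimmed-down, hypothetical trait (`UdfLike`) standing in for `ScalarUDFImpl`, and it uses the standard library's `DefaultHasher` rather than `AHasher` so that it compiles without extra crates. The structs `Abs` and `Renamable` are made up for illustration.

```rust
use std::any::Any;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical, trimmed-down stand-in for `ScalarUDFImpl`; not the real trait.
trait UdfLike {
    fn as_any(&self) -> &dyn Any;
    fn name(&self) -> &str;

    // Default under discussion: hash only the concrete type, skipping strings.
    fn hash_value(&self) -> u64 {
        let mut hasher = DefaultHasher::new();
        self.as_any().type_id().hash(&mut hasher);
        hasher.finish()
    }

    // Equality still compares the name, so UDFs that are equal always share
    // a hash; the reverse is not required by the Hash/Eq contract.
    fn equals(&self, other: &dyn UdfLike) -> bool {
        self.as_any().type_id() == other.as_any().type_id() && self.name() == other.name()
    }
}

// Stateless UDF: the type alone identifies it.
struct Abs;

impl UdfLike for Abs {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn name(&self) -> &str {
        "abs"
    }
}

// UDF whose name is instance state: two instances share a type_id.
struct Renamable {
    name: String,
}

impl UdfLike for Renamable {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn name(&self) -> &str {
        &self.name
    }
}

fn main() {
    let f = Renamable { name: "f".to_string() };
    let g = Renamable { name: "g".to_string() };

    // Same concrete type, so the type_id-only hash is identical...
    assert_eq!(f.hash_value(), g.hash_value());
    // ...while `equals` still tells the two instances apart.
    assert!(!f.equals(&g));
    // Distinct types are expected to hash differently.
    assert_ne!(Abs.hash_value(), f.hash_value());
}
```

The trade-off this shows: hashing less never breaks the `equals`/`hash_value` contract, but instances of one type that differ only by name land in the same hash bucket and are separated only by `equals`.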



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

