alamb commented on code in PR #8937:
URL: https://github.com/apache/arrow-rs/pull/8937#discussion_r2624619195
##########
arrow-schema/src/fields.rs:
##########
@@ -345,6 +345,21 @@ impl std::ops::Index<usize> for UnionFields {
}
}
+impl PartialEq for UnionFields {
+ fn eq(&self, other: &Self) -> bool {
+ self.len() == other.len() && self.iter().all(|a| other.iter().any(|b|
a == b))
Review Comment:
Can you please add a comment that says this is (and should remain)
consistent with the definition in DataType::Union?
##########
arrow-schema/src/fields.rs:
##########
@@ -345,6 +345,21 @@ impl std::ops::Index<usize> for UnionFields {
}
}
+impl PartialEq for UnionFields {
+ fn eq(&self, other: &Self) -> bool {
+ self.len() == other.len() && self.iter().all(|a| other.iter().any(|b|
a == b))
+ }
+}
+
+impl Hash for UnionFields {
+ fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
+ let mut v = self.0.iter().collect::<Vec<_>>();
Review Comment:
Another alternative might be to always sort the UnionFields on construction
🤔
##########
arrow-schema/src/fields.rs:
##########
@@ -345,6 +345,21 @@ impl std::ops::Index<usize> for UnionFields {
}
}
+impl PartialEq for UnionFields {
+ fn eq(&self, other: &Self) -> bool {
+ self.len() == other.len() && self.iter().all(|a| other.iter().any(|b|
a == b))
+ }
+}
+
+impl Hash for UnionFields {
+ fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
+ let mut v = self.0.iter().collect::<Vec<_>>();
Review Comment:
Allocating (the Vec) during a hash function is probably a pretty poor choice
-- can we please try and find some different implementation that doesn't
require a hash? For example a constant or hash the data types of the lowest and
highest type_id?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]