Re: [PR] fix: ignore `HoodieFileIndex#fileStatusCache` when compare `HoodieFileIndex` [hudi]

via GitHub Thu, 23 Oct 2025 20:26:15 -0700


TheR1sing3un commented on code in PR #14147:
URL: https://github.com/apache/hudi/pull/14147#discussion_r2458622299



##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieFileIndex.scala:
##########
@@ -103,6 +103,25 @@ case class HoodieFileIndex(spark: SparkSession,
     startCompletionTime = options.get(DataSourceReadOptions.START_COMMIT.key),
     endCompletionTime = options.get(DataSourceReadOptions.END_COMMIT.key)) 
with FileIndex {
 
+  // ignore fileStatusCache in equals/hashCode
+  override def equals(obj: Any): Boolean = {

Review Comment:
   > I see this class got two sub-classes: `HoodieCDCFileIndex` and 
`HoodieIncrementalFileIndex`, should we also impl `#equals` for these two?
   
   By the way, `HoodieCDCFileIndex` and `HoodieIncrementalFileIndex` are regular 
classes that extend the `HoodieFileIndex` case class, so even if we don't 
override `equals`/`hashCode` in them, they correctly inherit the 
`HoodieFileIndex` comparison. However, we still need to override both methods 
in these two indices to compare their additional member variables, such as 
`RangeType`.
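
   A minimal sketch of the behavior described above, using hypothetical stand-in classes (`Cache`, `BaseIndex`, `NaiveRangedIndex`, `RangedIndex`) rather than the actual Hudi types. It assumes the base case class overrides `equals`/`hashCode` to skip a cache field, and shows that a plain subclass inherits that comparison but ignores its own extra member unless it overrides the methods as well:

```scala
// Hypothetical stand-ins for illustration only; these are not the Hudi classes.

// No equals override, so two Cache instances only compare equal by identity.
class Cache

case class BaseIndex(path: String, cache: Cache) {
  // Explicit equals/hashCode replace the compiler-generated ones,
  // so `cache` no longer takes part in the comparison.
  override def equals(obj: Any): Boolean = obj match {
    case that: BaseIndex => that.canEqual(this) && path == that.path
    case _               => false
  }
  override def hashCode(): Int = path.hashCode
}

// A regular class extending the case class: it inherits BaseIndex.equals,
// which knows nothing about `rangeType`.
class NaiveRangedIndex(p: String, c: Cache, val rangeType: String)
  extends BaseIndex(p, c)

// Same shape, but the extra member is included in equals/hashCode.
class RangedIndex(p: String, c: Cache, val rangeType: String)
  extends BaseIndex(p, c) {
  // Narrow canEqual so a plain BaseIndex never equals a RangedIndex.
  override def canEqual(other: Any): Boolean = other.isInstanceOf[RangedIndex]
  override def equals(obj: Any): Boolean = obj match {
    case that: RangedIndex => super.equals(that) && rangeType == that.rangeType
    case _                 => false
  }
  override def hashCode(): Int = 31 * super.hashCode() + rangeType.hashCode
}
```

   With these definitions, two `BaseIndex` values that share a `path` but hold different `Cache` instances compare equal. Two `NaiveRangedIndex` values with different `rangeType` also compare equal, which is the gap the subclass override in `RangedIndex` closes.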



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
