TheR1sing3un commented on code in PR #14147:
URL: https://github.com/apache/hudi/pull/14147#discussion_r2458622299
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieFileIndex.scala:
##########
@@ -103,6 +103,25 @@ case class HoodieFileIndex(spark: SparkSession,
startCompletionTime = options.get(DataSourceReadOptions.START_COMMIT.key),
endCompletionTime = options.get(DataSourceReadOptions.END_COMMIT.key))
with FileIndex {
+ // ignore fileStatusCache in equals/hashCode
+ override def equals(obj: Any): Boolean = {
Review Comment:
> I see this class got two sub-classes: `HoodieCDCFileIndex` and
> `HoodieIncrementalFileIndex`, should we also impl `#equals` for these two?

By the way, `HoodieCDCFileIndex` and `HoodieIncrementalFileIndex` are plain
classes that extend the case class `HoodieFileIndex`, so even if we don't
override `equals`/`hashCode` in them, they inherit and correctly use the
`HoodieFileIndex` comparison. However, we still need to override both methods
in these two subclasses so that their additional member variables, such as
`RangeType`, are also compared.
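
To make the point concrete, here is a minimal sketch of the pattern. It uses hypothetical, simplified stand-ins (`BaseIndex`, `IncrementalIndex`, `basePath`, `cache`, and a `String`-typed `rangeType`), not the real Hudi classes or their constructor signatures, so treat it as an illustration of the idea rather than the actual change:

```scala
// Hypothetical stand-in for HoodieFileIndex: a case class whose equals/hashCode
// deliberately ignore the `cache` field (assumed analogue of fileStatusCache).
case class BaseIndex(basePath: String, cache: AnyRef) {
  override def equals(obj: Any): Boolean = obj match {
    case that: BaseIndex => that.canEqual(this) && basePath == that.basePath
    case _               => false
  }
  override def hashCode(): Int = basePath.hashCode
}

// Hypothetical stand-in for a subclass such as HoodieIncrementalFileIndex.
// Without the overrides below it would inherit BaseIndex.equals and silently
// ignore `rangeType`.
class IncrementalIndex(p: String, c: AnyRef, val rangeType: String)
    extends BaseIndex(p, c) {

  // Narrow canEqual so a BaseIndex never equals an IncrementalIndex (keeps equals symmetric).
  override def canEqual(other: Any): Boolean = other.isInstanceOf[IncrementalIndex]

  override def equals(obj: Any): Boolean = obj match {
    case that: IncrementalIndex =>
      that.canEqual(this) && super.equals(that) && rangeType == that.rangeType
    case _ => false
  }

  // Keep hashCode consistent with equals by mixing in the extra field.
  override def hashCode(): Int = 31 * super.hashCode() + rangeType.hashCode
}
```

With these overrides, two `IncrementalIndex` instances that share a `basePath` but differ in `rangeType` compare as unequal, while the `cache` field is still ignored as in the parent.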