Davis-Zhang-Onehouse commented on code in PR #13489:
URL: https://github.com/apache/hudi/pull/13489#discussion_r2219934854
##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/index/HoodieSparkIndexClient.java:
##########
@@ -134,6 +140,8 @@ public void
createOrUpdateColumnStatsIndexDefinition(HoodieTableMetaClient metaC
.withIndexType(PARTITION_NAME_COLUMN_STATS)
.withIndexFunction(PARTITION_NAME_COLUMN_STATS)
.withSourceFields(columnsToIndex)
+ // Use the existing version if exists, otherwise fall back to the default version.
Review Comment:
> I also would like to audit and ensure we don't excessively read the index
defs from storage to do these checks

The metaclient caches the index definition files, so they are not read from
disk every time.

> if indexDef exists ....

The function itself is named `createOrUpdateColumnStatsIndexDefinition`,
which means that at this point we don't know whether this is a create or an
update. So we use `existingIndexVersionOrDefault`, which compensates for the
missing context.
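
To make the two points above concrete, here is a rough sketch of the pattern: the definitions are loaded from storage once and then served from memory, and the version lookup falls back to a default when no definition exists yet. All names here (`IndexVersionSketch`, `CachedIndexDefs`, `DEFAULT_VERSION`, `existingVersionOrDefault`) are hypothetical and only for illustration; this is not the actual Hudi metaclient code or the real `existingIndexVersionOrDefault` signature.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

public class IndexVersionSketch {
    // Hypothetical default version; not the actual Hudi constant.
    static final int DEFAULT_VERSION = 2;

    /** Holds index definitions in memory so storage is read at most once. */
    static final class CachedIndexDefs {
        private final Supplier<Map<String, Integer>> loader; // stands in for a storage read
        private Map<String, Integer> cached;                 // null until first access
        int storageReads = 0;                                // counts loader invocations

        CachedIndexDefs(Supplier<Map<String, Integer>> loader) {
            this.loader = loader;
        }

        Optional<Integer> versionOf(String indexName) {
            if (cached == null) {
                cached = loader.get();
                storageReads++;
            }
            return Optional.ofNullable(cached.get(indexName));
        }
    }

    /** Works for both create and update: reuse the existing version, else the default. */
    static int existingVersionOrDefault(CachedIndexDefs defs, String indexName) {
        return defs.versionOf(indexName).orElse(DEFAULT_VERSION);
    }

    public static void main(String[] args) {
        Map<String, Integer> stored = new HashMap<>();
        stored.put("column_stats", 1); // an index definition that already exists

        CachedIndexDefs defs = new CachedIndexDefs(() -> stored);
        System.out.println(existingVersionOrDefault(defs, "column_stats"));    // update path
        System.out.println(existingVersionOrDefault(defs, "partition_stats")); // create path
        System.out.println(defs.storageReads); // both lookups share a single load
    }
}
```

The caller does not need to branch on create vs. update; the fallback inside the lookup covers both cases.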