yihua commented on code in PR #5746: URL: https://github.com/apache/hudi/pull/5746#discussion_r928972329
##########
hudi-common/src/main/java/org/apache/hudi/BaseHoodieTableFileIndex.java:
##########
@@ -166,6 +167,11 @@ public Map<String, List<FileSlice>> listFileSlices() {
         .collect(Collectors.toMap(e -> e.getKey().path, Map.Entry::getValue));
   }
+  public int getFileSlicesCount() {
+    return cachedAllInputFileSlices.values().stream()
+        .reduce(0, (count, fileSlices) -> count + fileSlices.size(), Integer::sum);

Review Comment:
   This is not quite obvious. Could you do something like `cachedAllInputFileSlices.values().stream().mapToInt(List::size).sum()` instead?

##########
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetadataConfig.java:
##########
@@ -187,6 +187,18 @@ public final class HoodieMetadataConfig extends HoodieConfig {
       .sinceVersion("0.11.0")
       .withDocumentation("Comma-separated list of columns for which column stats index will be built. If not set, all columns will be indexed");
+  public static final String COLUMN_STATS_INDEX_PROCESSING_MODE_IN_MEMORY = "in-memory";
+  public static final String COLUMN_STATS_INDEX_PROCESSING_MODE_SPARK = "spark";

Review Comment:
   nit: these could be an enum.

##########
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetadataConfig.java:
##########
@@ -187,6 +187,18 @@ public final class HoodieMetadataConfig extends HoodieConfig {
       .sinceVersion("0.11.0")
       .withDocumentation("Comma-separated list of columns for which column stats index will be built. If not set, all columns will be indexed");
+  public static final String COLUMN_STATS_INDEX_PROCESSING_MODE_IN_MEMORY = "in-memory";
+  public static final String COLUMN_STATS_INDEX_PROCESSING_MODE_SPARK = "spark";

Review Comment:
   `COLUMN_STATS_INDEX_PROCESSING_MODE_SPARK`: based on the logic, this mode is more about leveraging the engine context, which could be Spark or Flink (plain Java), right?
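
For reference, a rough, self-contained sketch of the first two suggestions. It is not the PR's code: the class `FileSliceCountSketch`, the enum `ProcessingMode`, its `fromValue` helper, and the `String` placeholders standing in for `FileSlice` are all hypothetical names chosen for illustration. Only the two stream expressions and the two config values ("in-memory", "spark") come from the diff.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the code under review; not Hudi's actual classes.
class FileSliceCountSketch {

  // Hypothetical enum that could replace the two String constants in the diff.
  enum ProcessingMode {
    IN_MEMORY("in-memory"),
    SPARK("spark");

    final String value;

    ProcessingMode(String value) {
      this.value = value;
    }

    // Maps a raw config value back to the enum; rejects unknown values.
    static ProcessingMode fromValue(String value) {
      for (ProcessingMode mode : values()) {
        if (mode.value.equals(value)) {
          return mode;
        }
      }
      throw new IllegalArgumentException("Unknown processing mode: " + value);
    }
  }

  // Form used in the diff: three-argument reduce with an accumulator and a combiner.
  static <T> int countWithReduce(Map<String, List<T>> slicesByPartition) {
    return slicesByPartition.values().stream()
        .reduce(0, (count, slices) -> count + slices.size(), Integer::sum);
  }

  // Form suggested in the review: map each list to its size, then sum.
  static <T> int countWithMapToInt(Map<String, List<T>> slicesByPartition) {
    return slicesByPartition.values().stream().mapToInt(List::size).sum();
  }

  public static void main(String[] args) {
    // Strings stand in for FileSlice objects here.
    Map<String, List<String>> slices = new LinkedHashMap<>();
    slices.put("2022/07/25", Arrays.asList("slice-a", "slice-b"));
    slices.put("2022/07/26", Arrays.asList("slice-c"));

    System.out.println(countWithReduce(slices));
    System.out.println(countWithMapToInt(slices));
    System.out.println(ProcessingMode.fromValue("in-memory"));
  }
}
```

Both methods return the same total for any input; the `mapToInt(...).sum()` form simply states the intent (sum of list sizes) more directly and avoids boxing the running count.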