majian1998 commented on code in PR #10493:
URL: https://github.com/apache/hudi/pull/10493#discussion_r1452969248


##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/ColumnStatsIndexSupport.scala:
##########
@@ -106,14 +107,23 @@ class ColumnStatsIndexSupport(spark: SparkSession,
    *
    * Please check out scala-doc of the [[transpose]] method explaining this view in more details
    */
-  def loadTransposed[T](targetColumns: Seq[String], shouldReadInMemory: Boolean)(block: DataFrame => T): T = {
+  def loadTransposed[T](targetColumns: Seq[String], shouldReadInMemory: Boolean, prunedPartitionFileNames: Set[String] = Set.empty)(block: DataFrame => T): T = {
     cachedColumnStatsIndexViews.get(targetColumns) match {
       case Some(cachedDF) =>
         block(cachedDF)
       case None =>
-        val colStatsRecords: HoodieData[HoodieMetadataColumnStats] =
+        val colStatsRecords: HoodieData[HoodieMetadataColumnStats] = if (prunedPartitionFileNames.isEmpty) {
+          // NOTE: In order to ensure that testing and unexpected logic are normal, judgment logic is added.
           loadColumnStatsIndexRecords(targetColumns, shouldReadInMemory)
+        } else {
+          val filterFunction = new SerializableFunction[HoodieMetadataColumnStats, java.lang.Boolean] {

Review Comment:
   I did not add new tests because the existing data skipping tests already cover the code paths changed here. I think making sure the existing tests still pass should be sufficient.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org