LuciferYang commented on a change in pull request #33748: URL: https://github.com/apache/spark/pull/33748#discussion_r690882533
########## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/orc/OrcPartitionReaderFactory.scala ##########
```
@@ -60,6 +60,7 @@ case class OrcPartitionReaderFactory(
   private val capacity = sqlConf.orcVectorizedReaderBatchSize
   private val orcFilterPushDown = sqlConf.orcFilterPushDown
   private val ignoreCorruptFiles = sqlConf.ignoreCorruptFiles
+  private val metaCacheEnabled = sqlConf.fileMetaCacheEnabled
```

Review comment:
   > BTW, @viirya 's suggestion about the config is a list config like spark.sql.sources.useV1SourceList.

   @dongjoon-hyun @viirya If a `useFileMetaCacheList` config is used, then without changing the V2 API it seems it can only be hard-coded, e.g.:
   ```
   val metaCacheEnabled = useFileMetaCacheList.toLowerCase(Locale.ROOT)
     .split(",").map(_.trim).contains("orc")
   ```
   The format name (`"ORC"`) is defined in `OrcTable`; we can't get it through the API here at present.

   On the other hand, unlike `spark.sql.sources.useV1SourceList`, this config would not have corresponding implementations for all built-in data formats. Leaving new data formats aside, it may only work for `Parquet` and `Orc`, so are we sure we need a list config?

-- 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
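The list-config membership check discussed above can be sketched as a small standalone helper. This is a hypothetical illustration, not Spark API: `formatInList` and the `useFileMetaCacheList` value are assumed names, and the parsing mirrors how `spark.sql.sources.useV1SourceList`-style configs are typically handled (comma-separated, trimmed, case-insensitive format names).

```scala
import java.util.Locale

// Hypothetical helper (not part of Spark): checks whether a format name
// appears in a comma-separated list config, case-insensitively.
def formatInList(listConf: String, formatName: String): Boolean = {
  listConf.toLowerCase(Locale.ROOT)
    .split(",")
    .map(_.trim)
    .contains(formatName.toLowerCase(Locale.ROOT))
}

// Example with an assumed config value:
val useFileMetaCacheList = "parquet, orc"
val metaCacheEnabled = formatInList(useFileMetaCacheList, "ORC")
println(metaCacheEnabled)
```

Note that both sides of the comparison are lowercased; lowering only the config string and then checking `contains("ORC")` would never match, which is why the commenter's sketch needs a consistent case on both sides.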