nuno-faria commented on code in PR #16971: URL: https://github.com/apache/datafusion/pull/16971#discussion_r2246052908
##########
datafusion/execution/src/cache/cache_manager.rs:
##########
@@ -86,6 +114,10 @@ pub struct CacheManagerConfig {
     /// location.
     /// Default is disable.
     pub list_files_cache: Option<ListFilesCache>,
+    /// Cache of file-embedded metadata, used to avoid reading it multiple times when processing a
+    /// data file (e.g., Parquet footer and page metadata).
+    /// If not provided, the [`CacheManager`] will create a [`DefaultFilesMetadataCache`].
+    pub file_metadata_cache: Option<FileMetadataCache>,

Review Comment:
   My initial idea here was to make it easy to enable the metadata cache without having to provide a custom `FileMetadataCache` when setting up the runtime (i.e., it works with the default setup). This way, the user can simply call `set datafusion.execution.parquet.cache_metadata = true;`, or enable it for a specific file through `ParquetReadOptions`. But I don't know if there is a better approach (maybe removing the `Option` from `file_metadata_cache` altogether?).
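
   To make the two options concrete, here is a minimal sketch using stand-in types. `MetadataCache`, `DefaultMetadataCache`, `ConfigWithOption`, `ConfigWithoutOption`, and `Manager` are hypothetical names chosen for illustration and are not the DataFusion API; the real `FileMetadataCache` and `CacheManager` may look quite different. The sketch only shows where the default cache would be created in each design.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the cache abstraction (not the DataFusion trait).
trait MetadataCache: Send + Sync {
    fn get(&self, path: &str) -> Option<Vec<u8>>;
    fn put(&self, path: &str, metadata: Vec<u8>);
}

/// Hypothetical default implementation backed by an in-memory map.
#[derive(Default)]
struct DefaultMetadataCache {
    entries: Mutex<HashMap<String, Vec<u8>>>,
}

impl MetadataCache for DefaultMetadataCache {
    fn get(&self, path: &str) -> Option<Vec<u8>> {
        self.entries.lock().unwrap().get(path).cloned()
    }

    fn put(&self, path: &str, metadata: Vec<u8>) {
        self.entries.lock().unwrap().insert(path.to_string(), metadata);
    }
}

/// Option A: the field stays optional and the manager fills in a default.
#[derive(Default)]
struct ConfigWithOption {
    file_metadata_cache: Option<Arc<dyn MetadataCache>>,
}

/// Option B: the field is always populated; the default lives in the config.
struct ConfigWithoutOption {
    file_metadata_cache: Arc<dyn MetadataCache>,
}

impl Default for ConfigWithoutOption {
    fn default() -> Self {
        Self {
            file_metadata_cache: Arc::new(DefaultMetadataCache::default()),
        }
    }
}

/// Hypothetical manager that always ends up holding a usable cache.
struct Manager {
    file_metadata_cache: Arc<dyn MetadataCache>,
}

impl Manager {
    /// Option A: fall back to the default cache when none was provided.
    fn from_optional(config: ConfigWithOption) -> Self {
        let cache: Arc<dyn MetadataCache> = match config.file_metadata_cache {
            Some(cache) => cache,
            None => Arc::new(DefaultMetadataCache::default()),
        };
        Self { file_metadata_cache: cache }
    }

    /// Option B: no fallback is needed, the config already carries a cache.
    fn from_required(config: ConfigWithoutOption) -> Self {
        Self { file_metadata_cache: config.file_metadata_cache }
    }
}
```

   In both variants the manager holds a cache regardless of whether the user supplied one, so a session-level setting could toggle its use without any extra runtime setup. The difference is only where the fallback lives: in the manager constructor (option A) or in the config's `Default` impl (option B).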