alamb commented on code in PR #3885:
URL: https://github.com/apache/arrow-datafusion/pull/3885#discussion_r1032668603

##########
datafusion/core/src/config.rs:
##########
@@ -237,6 +247,29 @@ impl BuiltInConfigs {
                  to reduce the number of rows decoded.",
                 false,
             ),
+            ConfigDefinition::new_bool(
+                OPT_PARQUET_ENABLE_PRUNING,
+                "If true, the parquet reader attempts to skip entire row groups based \
+                 on the predicate in the query and the metadata (min/max values) stored in \
+                 the parquet file.",
+                true,
+            ),
+            ConfigDefinition::new_bool(
+                OPT_PARQUET_SKIP_METADATA,
+                "If true, the parquet reader skip the optional embedded metadata that may be in \
+                 the file Schema. This setting can help avoid schema conflicts when querying \
+                 multiple parquet files with schemas containing compatible types but different metadata.",
+                true,
+            ),
+            ConfigDefinition::new(
+                OPT_PARQUET_METADATA_SIZE_HINT,
+                "If specified, the parquet reader will try and fetch the last `size_hint` \
+                 bytes of the parquet file optimistically. If not specified, two read are required: \
+                 One read to fetch the 8-byte parquet footer and \
+                 another to fetch the metadata length encoded in the footer.",
+                DataType::UInt64,
+                ScalarValue::UInt64(None),

Review Comment:
   64K seems reasonable to me
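
For context on why a size hint such as 64K can save a round trip, here is a minimal sketch of the read planning that the `OPT_PARQUET_METADATA_SIZE_HINT` description implies. It is not DataFusion's or the parquet crate's actual code: the helper names `first_read_range`, `metadata_len`, and `second_read_range` are hypothetical. The sketch assumes the standard unencrypted parquet footer layout, where the last 8 bytes of the file are a 4-byte little-endian metadata length followed by the magic `PAR1`.

```rust
use std::ops::Range;

/// Fixed parquet footer: 4-byte little-endian metadata length + 4-byte magic "PAR1".
const FOOTER_SIZE: u64 = 8;

/// Hypothetical helper: byte range of the first read.
/// Without a hint only the 8-byte footer is fetched; with a hint the last
/// `size_hint` bytes (at least the footer) are fetched optimistically.
fn first_read_range(file_len: u64, size_hint: Option<u64>) -> Range<u64> {
    let want = size_hint.map_or(FOOTER_SIZE, |h| h.max(FOOTER_SIZE));
    // Clamp to the start of the file for files smaller than the hint.
    file_len.saturating_sub(want)..file_len
}

/// Hypothetical helper: decode the metadata length from the 8-byte footer.
/// Returns None if the magic bytes are not "PAR1".
fn metadata_len(footer: &[u8; 8]) -> Option<u64> {
    if &footer[4..] != b"PAR1" {
        return None;
    }
    Some(u32::from_le_bytes([footer[0], footer[1], footer[2], footer[3]]) as u64)
}

/// Hypothetical helper: byte range of a second read, or None when the first
/// read already covers the whole metadata block.
fn second_read_range(
    file_len: u64,
    first: &Range<u64>,
    metadata_len: u64,
) -> Option<Range<u64>> {
    let metadata_start = file_len.saturating_sub(FOOTER_SIZE + metadata_len);
    if metadata_start >= first.start {
        None // metadata was already fetched by the first read
    } else {
        Some(metadata_start..first.start) // fetch only the missing prefix
    }
}
```

Under these assumptions, a 1 MiB file read with a 65536-byte hint needs no second read as long as the metadata is at most 65528 bytes. With no hint, a second read is always needed for non-empty metadata.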