[ https://issues.apache.org/jira/browse/PARQUET-2373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jiashen Zhang updated PARQUET-2373:
-----------------------------------
    Description:
The spec change in *PARQUET-2257* added bloom_filter_length so that a reader can load the bloom filter in a single shot.
This implementation uses the bloom_filter_length field to read the bloom filter (header + bitset) to facilitate I/O scheduling.

  was:
The spec had only added bloom_filter_offset to locate the bloom filter. The reader cannot load the bloom filter in a single shot until it parses the bloom filter header to get the total size.
This issue proposes to add an optional bloom_filter_length field to track the size of the bloom filter to facilitate I/O scheduling.

> Improve performance with bloom_filter_length
> --------------------------------------------
>
>                 Key: PARQUET-2373
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2373
>             Project: Parquet
>          Issue Type: Improvement
>            Reporter: Jiashen Zhang
>            Priority: Minor
>
> The spec change in *PARQUET-2257* added bloom_filter_length so that a reader
> can load the bloom filter in a single shot.
> This implementation uses the bloom_filter_length field to read the bloom
> filter (header + bitset) to facilitate I/O scheduling.
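
To illustrate the difference the description refers to, here is a minimal sketch in Python of the two read paths. It is not the parquet-mr implementation. The names CountingFile, read_at, read_bloom_filter and HEADER_SIZE are hypothetical, and the 4-byte length prefix is an assumed stand-in for the real Thrift-encoded bloom filter header, whose size is not fixed.

```python
import io
import struct

# Assumption: a 4-byte little-endian integer stands in for the real
# Thrift-encoded bloom filter header and holds the bitset size in bytes.
HEADER_SIZE = 4


class CountingFile:
    """Wraps a byte buffer and counts positioned reads (one per I/O request)."""

    def __init__(self, data: bytes):
        self._buf = io.BytesIO(data)
        self.reads = 0

    def read_at(self, offset: int, length: int) -> bytes:
        self.reads += 1
        self._buf.seek(offset)
        return self._buf.read(length)


def read_bloom_filter(f, offset, length=None):
    """Return the bitset bytes of the bloom filter stored at `offset`."""
    if length is not None:
        # bloom_filter_length is known: fetch header + bitset in one request.
        blob = f.read_at(offset, length)
        (num_bytes,) = struct.unpack_from("<i", blob, 0)
        return blob[HEADER_SIZE:HEADER_SIZE + num_bytes]
    # Only bloom_filter_offset is known: read the header first to learn the
    # bitset size, then issue a second request for the bitset itself.
    header = f.read_at(offset, HEADER_SIZE)
    (num_bytes,) = struct.unpack("<i", header)
    return f.read_at(offset + HEADER_SIZE, num_bytes)


# Build a fake file: 10 bytes of padding, the header, a 32-byte bitset, a tail.
bitset = bytes(range(32))
data = b"\x00" * 10 + struct.pack("<i", len(bitset)) + bitset + b"\xff" * 5

without_length = CountingFile(data)
read_bloom_filter(without_length, 10)

with_length = CountingFile(data)
read_bloom_filter(with_length, 10, HEADER_SIZE + len(bitset))

print(without_length.reads, with_length.reads)
```

In this sketch the offset-only path issues two reads and the path that knows the length issues one. Knowing the full byte range up front is what lets a reader schedule the bloom filter read together with its other I/O.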