alamb opened a new issue, #8844: URL: https://github.com/apache/arrow-rs/issues/8844
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

After the great work from @hhhizzz in https://github.com/apache/arrow-rs/pull/8733, we will (finally) have the ability to use a Bitmask filter representation when applying filters *during* Parquet decode.

https://github.com/apache/arrow-rs/pull/8733 automatically converts an existing [`RowSelection`](https://docs.rs/parquet/latest/parquet/arrow/arrow_reader/struct.RowSelection.html) (aka a `Vec<RowSelector>` of ranges) into a bitmask for evaluation.

However, at the moment, when a filter is initially evaluated, its result is *always* converted from Bitmask --> [`RowSelection`](https://docs.rs/parquet/latest/parquet/arrow/arrow_reader/struct.RowSelection.html) here: https://github.com/apache/arrow-rs/blob/911331aafa13f5e230440cf5d02feb245985c64e/parquet/src/arrow/arrow_reader/read_plan.rs#L168-L167

This is inefficient when a Bitmask is converted to a `RowSelection` only to be turned back into a Bitmask for evaluation.

**Describe the solution you'd like**

Add a way to avoid converting the result of evaluating predicates from a Mask --> Selection.

I think the tricky bit will be to quickly look at a Mask and determine whether it should be turned back into a Selection (probably we can use the same heuristics that @hhhizzz added in https://github.com/apache/arrow-rs/pull/8733 for going the other way).

**Describe alternatives you've considered**

**Additional context**
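To make the "quickly look at a Mask" idea concrete, here is a minimal sketch of one possible check. It is not the code from the PR: `Run`, `Filtered`, `count_runs`, `to_runs`, `choose_representation`, and the `min_avg_run_len` threshold are all hypothetical names, a plain `Vec<bool>` stands in for the real mask type, and the average-run-length rule is only an assumed heuristic. The idea is to count runs in one pass and convert to a run-length selection only when the runs are long on average.

```rust
/// Hypothetical stand-in for a run-length selector (not the parquet crate's type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Run {
    row_count: usize,
    skip: bool,
}

/// Hypothetical output of the strategy choice: keep the mask or use runs.
#[derive(Debug, PartialEq, Eq)]
enum Filtered {
    Mask(Vec<bool>),
    Selection(Vec<Run>),
}

/// Count maximal runs of equal values in a single pass over the mask.
fn count_runs(mask: &[bool]) -> usize {
    if mask.is_empty() {
        return 0;
    }
    // Every position where neighbours differ starts a new run.
    1 + mask.windows(2).filter(|w| w[0] != w[1]).count()
}

/// Run-length encode the mask; `true` means "select", so `skip = !keep`.
fn to_runs(mask: &[bool]) -> Vec<Run> {
    let mut runs: Vec<Run> = Vec::new();
    for &keep in mask {
        if let Some(last) = runs.last_mut() {
            if last.skip == !keep {
                last.row_count += 1;
                continue;
            }
        }
        runs.push(Run { row_count: 1, skip: !keep });
    }
    runs
}

/// Assumed heuristic: convert to a selection only when the average run
/// length is at least `min_avg_run_len`; otherwise keep the mask as-is.
fn choose_representation(mask: Vec<bool>, min_avg_run_len: usize) -> Filtered {
    let runs = count_runs(&mask);
    if runs == 0 || mask.len() / runs >= min_avg_run_len {
        Filtered::Selection(to_runs(&mask))
    } else {
        Filtered::Mask(mask)
    }
}
```

With this shape, a highly fragmented predicate result stays a mask and skips the round trip, while a result made of a few long runs still becomes a compact selection. The threshold value would need to be chosen by benchmarking.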
