alamb commented on PR #16711: URL: https://github.com/apache/datafusion/pull/16711#issuecomment-3052758619
> The adaptive selection will help Q30 and Q31 from previous PR result: > > [apache/arrow-rs#7454 (comment)](https://github.com/apache/arrow-rs/pull/7454#issuecomment-2840770864) @zhuqi-lucas my largest concern about all the previous adaptive row selection work was the software engineering / keeping the complexity under control. I have been dreaming about this -- I am thinking it might be time to create an internal RowSelection representation like ```rust enum InternalRowSelector { /// Skip the next n rows Skip(usize), /// Decode and output the next n rows Decode(usize), /// Decode the next filter.len() rows and apply the filter before outputting the rows DecodeAndFilter(Arc<BooleanArray>), } ``` Then the actual RecordReader would take a Vec<InternalRowSelection>` or something To stage / keep things of reasonable complexity we could make one PR that refactors to ```rust enum InternalRowSelector { /// Skip the next n rows Skip(usize), /// Decode and output the next n rows Decode(usize), // NO DecodeAndFilter variant } ``` And then add the DecodeAndFilter as a follow on PR Maybe I'll try and make a PR -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org