adriangb commented on issue #19972: URL: https://github.com/apache/datafusion/issues/19972#issuecomment-3817659629
I think you could try a canned dataset, such as one or more large Parquet files with millions of sequential integer rows:

```sql
copy (select i as k from generate_series(1, 10000000) t(i)) to 'test.parquet';
create external table t stored as parquet location 'test.parquet';
```

And then try various mask forms:

```
-- select nothing
select count(*) from t where k < 0;
-- select every other row
select count(*) from t where k % 2 = 0;
-- select everything
select count(*) from t where k > 0;
```
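To sanity-check the results, the expected `count(*)` for each mask can be computed independently of DataFusion. This is only a minimal sketch, assuming the table holds exactly the integers 1..n in column `k`, as `generate_series(1, n)` produces. The names `MASKS` and `expected_counts` are hypothetical and not part of any library.

```python
# Reference counts for the three mask forms, computed without DataFusion.
# Assumption: column k holds exactly the integers 1..n.

MASKS = {
    "k < 0": lambda k: k < 0,           # selects nothing
    "k % 2 = 0": lambda k: k % 2 == 0,  # selects every other row
    "k > 0": lambda k: k > 0,           # selects everything
}


def expected_counts(n: int) -> dict[str, int]:
    """Return the expected count(*) for each predicate over k in 1..n."""
    return {
        name: sum(1 for k in range(1, n + 1) if pred(k))
        for name, pred in MASKS.items()
    }


if __name__ == "__main__":
    # 10 rows is enough to see the pattern; pass 10_000_000 to match the table.
    for name, count in expected_counts(10).items():
        print(f"{name}: {count}")
```

For the 10,000,000-row table the three queries should return 0, 5,000,000, and 10,000,000 respectively, so any other result points to a correctness problem rather than a performance one.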
