andygrove commented on issue #863:
URL: https://github.com/apache/arrow-datafusion/issues/863#issuecomment-897273083
If I comment out the code in our Parquet reader that filters out row groups, then I see the expected results.

```
> SELECT COUNT(*) FROM customer WHERE c_mktsegment = 'BUILDING';
+-----------------+
| COUNT(UInt8(1)) |
+-----------------+
| 29998146        |
+-----------------+
1 row in set. Query took 0.874 seconds.
```

My conclusion is that we have a bug in our Parquet writer where we are somehow writing incorrect statistics.
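To illustrate why bad statistics would produce a wrong count only when row group filtering is enabled, here is a minimal toy model in Python. It is not DataFusion's reader code: `RowGroup`, `may_contain`, and `count_equal` are hypothetical names, and the data is made up. It assumes the reader skips a row group whenever the recorded min/max range excludes the value in the predicate.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RowGroup:
    """Toy stand-in for a Parquet row group (hypothetical, for illustration)."""
    values: List[str]  # the column values actually stored
    stat_min: str      # min recorded in the metadata (may be wrong)
    stat_max: str      # max recorded in the metadata (may be wrong)


def may_contain(rg: RowGroup, target: str) -> bool:
    # Pruning check: trust the recorded statistics, not the data itself.
    return rg.stat_min <= target <= rg.stat_max


def count_equal(row_groups: List[RowGroup], target: str, use_pruning: bool) -> int:
    total = 0
    for rg in row_groups:
        if use_pruning and not may_contain(rg, target):
            continue  # row group skipped based on statistics alone
        total += sum(1 for v in rg.values if v == target)
    return total


row_groups = [
    # Statistics match the data.
    RowGroup(["AUTOMOBILE", "BUILDING", "BUILDING"], "AUTOMOBILE", "BUILDING"),
    # Assumed writer bug: recorded min is "FURNITURE" although "BUILDING" is present.
    RowGroup(["BUILDING", "MACHINERY"], "FURNITURE", "MACHINERY"),
]

count_with_pruning = count_equal(row_groups, "BUILDING", use_pruning=True)
count_without_pruning = count_equal(row_groups, "BUILDING", use_pruning=False)
print(count_with_pruning, count_without_pruning)
```

In this sketch the second row group is skipped when pruning is on, so the pruned count is lower than the full scan. That matches the symptom above: the reader's filtering logic can be correct and still return too few rows if the statistics it relies on were written incorrectly.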
