alamb opened a new issue, #7039: URL: https://github.com/apache/arrow-datafusion/issues/7039
### Describe the bug

When running the following query (from ClickBench) on the partitioned dataset (100 parquet files)

```sql
SELECT "MobilePhoneModel", COUNT(DISTINCT "UserID") AS u
FROM hits_partitioned
WHERE "MobilePhoneModel" <> ''
GROUP BY "MobilePhoneModel"
ORDER BY u DESC
LIMIT 10;
```

I get the following error:

```
Error during planning: Cannot infer common argument type for comparison operation Binary != Utf8
```

### To Reproduce

Get the data using `bench.sh` (after https://github.com/apache/arrow-datafusion/pull/7005 is merged)

```shell
bench.sh data clickbench_1
bench.sh data clickbench_multi
```

```sql
CREATE EXTERNAL TABLE hits_partitioned
STORED AS PARQUET
LOCATION 'hits_partitioned';

SELECT "MobilePhoneModel", COUNT(DISTINCT "UserID") AS u
FROM hits_partitioned
WHERE "MobilePhoneModel" <> ''
GROUP BY "MobilePhoneModel"
ORDER BY u DESC
LIMIT 10;
```

### Expected behavior

The query works fine with the single file dataset. I expect the same result on the partitioned dataset.

```sql
-- Single file parquet
CREATE EXTERNAL TABLE hits_single
STORED AS PARQUET
LOCATION 'hits.parquet';

-- Single file works great
SELECT "MobilePhoneModel", COUNT(DISTINCT "UserID") AS u
FROM hits_single
WHERE "MobilePhoneModel" <> ''
GROUP BY "MobilePhoneModel"
ORDER BY u DESC
LIMIT 10;
```

```
+------------------------------+---------+
| hits_single.MobilePhoneModel | u       |
+------------------------------+---------+
| iPad                         | 1090347 |
| iPhone                       | 45758   |
| A500                         | 16046   |
| N8-00                        | 5565    |
| iPho                         | 3300    |
| ONE TOUCH 6030A              | 2759    |
| GT-P7300B                    | 1907    |
| 3110000                      | 1871    |
| GT-I9500                     | 1598    |
| eagle75                      | 1492    |
+------------------------------+---------+
```

### Additional context

I found this while working on some benchmark results for #6988
