dutyu opened a new issue, #21960: URL: https://github.com/apache/doris/issues/21960
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.

### Version

2.0-beta

### What's Wrong?

SQL of Query 1:

```
select
  count(*),
  count(distinct user_no) as "no_reuse_distinct_user_total"
from (
  SELECT
    `id` AS `id`,
    `intf_type` AS `intf_type`,
    `actual_intf_type` AS `actual_intf_type`,
    ...
    `user_no` AS `user_no`,
    `reuse_flag` AS `reuse_flag`,
    ...
    `partitions` AS `partitions`
  FROM (
    select
      `crs_query_cr_query_info_partition`.`id`,
      `crs_query_cr_query_info_partition`.`actual_intf_type`,
      ...
      `crs_query_cr_query_info_partition`.`user_no`,
      `crs_query_cr_query_info_partition`.`reuse_flag`,
      ...
      `crs_query_cr_query_info_partition`.`partitions`
    from `ods_safe`.`crs_query_cr_query_info_partition`
  ) `crs_query_cr_query_info_partition`
) t
WHERE
  t.partitions in (
    DATE_FORMAT(DATE_SUB(NOW(), INTERVAL 1 DAY), 'yyyy-MM-dd')
  )
  and t.actual_intf_type = 'FuZhouPbocScore'
  and (
    t.reuse_flag is null
    or t.reuse_flag <> 'Y'
  );
```

Result of Query 1:

```
+----------+------------------------------+
| count(*) | no_reuse_distinct_user_total |
+----------+------------------------------+
|       23 |                           23 |
+----------+------------------------------+
1 row in set (1.89 sec)
```

SQL of Query 2:

```
select
  count(*),
  count(distinct user_no) as "no_reuse_distinct_user_total"
from (
  SELECT
    `id` AS `id`,
    `intf_type` AS `intf_type`,
    `actual_intf_type` AS `actual_intf_type`,
    ...
    `user_no` AS `user_no`,
    `reuse_flag` AS `reuse_flag`,
    ...
    `partitions` AS `partitions`
  FROM (
    select
      `crs_query_cr_query_info_partition`.`id`,
      `crs_query_cr_query_info_partition`.`actual_intf_type`,
      ...
      `crs_query_cr_query_info_partition`.`user_no`,
      `crs_query_cr_query_info_partition`.`reuse_flag`,
      ...
      `crs_query_cr_query_info_partition`.`partitions`
    from `ods_safe`.`crs_query_cr_query_info_partition`
  ) `crs_query_cr_query_info_partition`
) t
WHERE
  t.partitions in (
    DATE_FORMAT(DATE_SUB(NOW(), INTERVAL 1 DAY), 'yyyy-MM-dd')
  )
  and t.actual_intf_type = 'FuZhouPbocScore'
  and t.reuse_flag is null;
```

Result of Query 2:

```
+----------+------------------------------+
| count(*) | no_reuse_distinct_user_total |
+----------+------------------------------+
|   123084 |                       119912 |
+----------+------------------------------+
1 row in set (2.03 sec)
```

Table DDL:

```
CREATE TABLE `crs_query_cr_query_info_partition`(
  `id` bigint COMMENT 'physical primary key',
  `actual_intf_type` string COMMENT '',
  `user_no` string COMMENT '',
  ...
  `reuse_flag` string COMMENT '',
  ...
)
PARTITIONED BY (`partitions` string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS
  INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION 'hdfs://xxx/ods_safe.db/crs_query_cr_query_info_partition'
TBLPROPERTIES (
  'last_modified_time' = '1688024841',
  'spark.sql.sources.schema.numPartCols' = '1',
  'spark.sql.sources.schema.part.0' = '...',
  'spark.sql.sources.schema.partCol.0' = 'partitions',
  'transient_lastDdlTime' = '1688025415',
  'bucketing_version' = '2',
  'last_modified_by' = 'hive',
  'spark.sql.sources.schema.numParts' = '1',
  'spark.sql.create.version' = '2.2 or prior'
);
```

### What You Expected?

The count of Query 1 should be equal to or greater than the count of Query 2. Query 1's filter (`reuse_flag is null or reuse_flag <> 'Y'`) matches every row that Query 2's filter (`reuse_flag is null`) matches, yet Query 1 returns 23 rows while Query 2 returns 123084.

### How to Reproduce?

_No response_

### Anything Else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
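
The two queries differ only in their `reuse_flag` predicate, and Query 1's predicate (`is null or <> 'Y'`) is a superset of Query 2's (`is null`), so Query 1 should never return fewer rows. The sketch below illustrates that invariant. It is not a reproduction of the Doris bug: it uses SQLite from Python's standard library as a stand-in engine, and the table `t`, the sample rows, and the helper names are hypothetical.

```python
import sqlite3

# Hypothetical sample rows (user_no, reuse_flag); not taken from the reporter's table.
ROWS = [
    ("u1", None),
    ("u2", None),
    ("u2", None),  # same user twice, so count(*) and count(distinct) differ
    ("u3", "N"),
    ("u4", "Y"),
    ("u5", ""),
]


def count_rows(conn, predicate):
    """Return (count(*), count(distinct user_no)) for rows matching predicate."""
    sql = f"SELECT count(*), count(DISTINCT user_no) FROM t WHERE {predicate}"
    return conn.execute(sql).fetchone()


def run_check():
    """Evaluate both reuse_flag predicates on the same in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (user_no TEXT, reuse_flag TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", ROWS)
    q1 = count_rows(conn, "reuse_flag IS NULL OR reuse_flag <> 'Y'")  # Query 1 shape
    q2 = count_rows(conn, "reuse_flag IS NULL")                       # Query 2 shape
    conn.close()
    return q1, q2


if __name__ == "__main__":
    q1, q2 = run_check()
    print("query 1 shape:", q1)
    print("query 2 shape:", q2)
    # The superset predicate can never match fewer rows or fewer distinct users.
    assert q1[0] >= q2[0] and q1[1] >= q2[1]
```

On any engine that evaluates the predicates correctly the final assertion holds regardless of the data, which is why the 23 versus 123084 result in this report points at the query engine rather than at the table contents.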
