benmayersohn opened a new issue, #51622:
URL: https://github.com/apache/doris/issues/51622

   ### Search before asking
   
   - [x] I had searched in the 
[issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Version
   
   doris-3.0.4-rc02-39f9074cec
   
   ### What's Wrong?
   
   I'm getting inconsistent query results. Sometimes a query yields all the 
expected results, other times it yields none of them. My guess is it's related 
to filtering on char columns, maybe while trying to utilize the inverted 
index(?). The reason I am guessing this can be seen in my query, where leaving 
out the filter on the `country` column always yields consistent results.
   
   Here's a modified version of my create table statement. The table is 
partitioned by month:
   
   ```
   CREATE TABLE `the_table` ( `date` date NOT NULL, `second_id` tinyint NOT 
NULL,
   `country` char(2) NOT NULL, `first_id` bigint NOT NULL, `third_id` tinyint 
NOT NULL,
   INDEX idx_second_id (`second_id`) USING INVERTED, INDEX idx_third_id 
(`third_id`) USING INVERTED,
   INDEX idx_first_id (`first_id`) USING INVERTED, INDEX idx_country 
(`country`) USING INVERTED) 
   ENGINE=OLAP UNIQUE KEY(`date`, `second_id`, `country`, `first_id`, 
`third_id`)
   AUTO PARTITION BY RANGE (date_trunc(`date`, 'month'))
   (PARTITION p20170101000000 VALUES [('2017-01-01'), ('2017-02-01')),
   PARTITION p20170201000000 VALUES [('2017-02-01'), ('2017-03-01')), ...
   PARTITION p20250601000000 VALUES [('2025-06-01'), ('2025-07-01')))
   DISTRIBUTED BY HASH(`first_id`) BUCKETS 10 PROPERTIES (
   "replication_allocation" = "tag.location.default: 3", "min_load_replica_num" 
= "-1",
   "is_being_synced" = "false", "colocate_with" = "first_group", 
"storage_medium" = "hdd",
   "storage_format" = "V2", "inverted_index_storage_format" = "V1",
   "compression" = "LZ4", "enable_unique_key_merge_on_write" = "true",
   "light_schema_change" = "true", "disable_auto_compaction" = "false",
   "enable_single_replica_compaction" = "false", "group_commit_interval_ms" = 
"10000", 
   "group_commit_data_bytes" = "134217728", "enable_mow_light_delete" = "false" 
);
   ```
   
   When I try
   ```
   select `date`, count(*) from the_table where first_id=xxxxxxx and 
country='US'
   and `date` >= '2025-04-01' and `date` <= '2025-04-30' group by
   `date` order by `date` asc
   ```
   
   I expect to get one entry per date. But sometimes I get nothing, and other 
times I get the right answer.
   
   But when I omit the country component of the `where` clause, I always get 
the expected result (the masked `first_id` value above only exists when 
`country='US'`, so the query results should be identical).
   
   ### What You Expected?
   
   
   date | count(*)
   -- | --
   2025-04-01 | 1
   2025-04-02 | 1
   2025-04-03 | 1
   2025-04-04 | 1
   ...|...
   2025-04-30 | 1
   
   ### How to Reproduce?
   
   Hard to say because it's a large table, but if you create one like the one 
above with indices on string columns, you should see this when filtering on 
those string columns.
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org

Reply via email to