zhimin wu created KYLIN-5828:
--------------------------------

             Summary: During multi-jobs concurrent building, the flat table may use inconsistent global dictionaries, resulting in incorrect count distinct query results.
                 Key: KYLIN-5828
                 URL: https://issues.apache.org/jira/browse/KYLIN-5828
             Project: Kylin
          Issue Type: Bug
          Components: Storage - Parquet
            Reporter: zhimin wu
            Assignee: zhimin wu

*Root Cause*

When multiple jobs build concurrently and share the same global dictionary, nothing guarantees that the flat table encoding step uses one consistent version of that dictionary. If another job expands the dictionary while encoding is in progress, some flat table partitions mistakenly read the dictionary partition files of the new version. Because the data distribution of those flat table partitions no longer matches the dictionary partition files they read, the correct dictionary entries cannot be found. The encoded flat table column is therefore written as 0, which ultimately produces an incorrect count distinct value.
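
The failure mode described in the root cause can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Kylin's actual code: the function names (build_version, encode_flat_table), the modulo-based bucketing, the bucket counts (2 before expansion, 3 after) and the sample values are all hypothetical and were chosen so the effect is easy to follow by hand.

```python
# Toy model of a bucketed global dictionary (all names are hypothetical).
# An id of 0 stands for "value not found in the dictionary bucket".

def build_version(values, bucket_count):
    """Build one dictionary version as {bucket_id: {value: dict_id}}."""
    buckets = {b: {} for b in range(bucket_count)}
    for dict_id, value in enumerate(sorted(values), start=1):
        buckets[value % bucket_count][value] = dict_id
    return buckets

def encode_flat_table(values, versions, routing_version, version_for_partition):
    """Encode values that were distributed by `routing_version`'s bucket count.

    `version_for_partition(b)` returns the dictionary version whose bucket
    file partition `b` actually reads while encoding.
    """
    routing_count = len(versions[routing_version])
    encoded = []
    for value in values:
        b = value % routing_count                      # flat table partition
        bucket_file = versions[version_for_partition(b)].get(b, {})
        encoded.append(bucket_file.get(value, 0))      # 0 when lookup fails
    return encoded

versions = {
    1: build_version(range(1, 7), bucket_count=2),     # before expansion
    2: build_version(range(1, 9), bucket_count=3),     # expanded by another job
}
rows = [1, 2, 3, 4, 5, 6]

# Every partition reads the version the flat table was distributed by.
consistent = encode_flat_table(rows, versions, 1, lambda b: 1)

# Partition 1 picks up the new version after the concurrent expansion.
mixed = encode_flat_table(rows, versions, 1, lambda b: 1 if b == 0 else 2)

print(consistent, len(set(consistent)))
print(mixed, len(set(mixed) - {0}))
```

In this model the consistent run encodes all six values to distinct non-zero ids, while the mixed run encodes the values 3 and 5 as 0 because they live in a different bucket in the expanded version. A count distinct computed over the encoded column then no longer equals 6. Under the same assumptions, resolving the dictionary version once per job and passing it to every partition avoids the mismatch.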