[ 
https://issues.apache.org/jira/browse/KYLIN-5828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

pengfei.zhan resolved KYLIN-5828.
---------------------------------
    Fix Version/s: 5.0.0
       Resolution: Fixed

> During multi-jobs concurrent building, the flat table may use inconsistent 
> global dictionaries, resulting in incorrect count distinct query results.
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-5828
>                 URL: https://issues.apache.org/jira/browse/KYLIN-5828
>             Project: Kylin
>          Issue Type: Bug
>          Components: Storage - Parquet
>            Reporter: Zhimin Wu
>            Assignee: Zhimin Wu
>            Priority: Major
>             Fix For: 5.0.0
>
>
> *Root Cause*
> When multiple jobs build concurrently and share the same global 
> dictionary, nothing guarantees that flat table encoding uses one consistent 
> dictionary version. If another job expands the dictionary while encoding is 
> in progress, some flat table partitions mistakenly read the new version of 
> the dictionary partition files. Because the data distribution no longer 
> matches between the flat table partitions and those dictionary partition 
> files, the lookup cannot find the correct dictionary entry. The encoded 
> flat table column is then 0, which ultimately produces an incorrect count 
> distinct value.
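
The failure mode described under Root Cause can be reproduced in a toy model. The sketch below is not Kylin code: the function names, the modulo bucketing rule, and the sample values are all assumptions made for illustration. It only shows how reading a dictionary version with a different bucket layout yields code 0 and a wrong distinct count, compared with a run where every partition reads the same version.

```python
# Toy model only: names and the modulo bucketing rule are illustrative
# assumptions, not Kylin's actual classes or hashing scheme.

def build_version(values, num_buckets, previous=None):
    """Build a dictionary version as a list of buckets (value -> code).

    Codes start at 1 and are append-only: values already present in
    `previous` keep their codes, new values get the next free code.
    """
    ids = {}
    for bucket in (previous or []):
        ids.update(bucket)
    for value in values:
        if value not in ids:
            ids[value] = len(ids) + 1
    buckets = [dict() for _ in range(num_buckets)]
    for value, code in ids.items():
        buckets[value % num_buckets][value] = code
    return buckets


def encode_flat_table(partitions, version_for_partition):
    """Encode each flat table partition using only the dictionary bucket
    with the same index. A value missing from that bucket encodes to 0."""
    encoded = []
    for index, rows in sorted(partitions.items()):
        bucket = version_for_partition(index)[index]
        encoded.extend(bucket.get(value, 0) for value in rows)
    return encoded


# Version 1 has 2 buckets; the flat table is partitioned to match it.
v1 = build_version([2, 3, 4, 5, 6, 7], num_buckets=2)
partitions = {0: [2, 4, 6], 1: [3, 5, 7]}

# A concurrent job expands the dictionary into 3 buckets.
v2 = build_version([8, 9], num_buckets=3, previous=v1)

# Every partition reads the same version: all six values get a code.
pinned = encode_flat_table(partitions, lambda index: v1)

# Partition 1 picks up the new version mid-build: values 3 and 5 now live
# in other buckets, so they encode to 0.
mixed = encode_flat_table(partitions, lambda index: v1 if index == 0 else v2)

print(pinned, len(set(pinned)))  # [1, 3, 5, 2, 4, 6] 6
print(mixed, len(set(mixed)))    # [1, 3, 5, 0, 0, 6] 5
```

In this model the distinct count drops from 6 to 5 because two different values collapse onto code 0. Resolving the dictionary version once per build and passing it to every partition, as the `pinned` case does, avoids the mismatch; whether the actual fix for this issue works that way is not stated here.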



