Zhiting Guo created KYLIN-5650:
----------------------------------
Summary: In a cloud environment, the dictionary metadata file may
intermittently fail to be read during a build job, resulting in incorrect
query results.
Key: KYLIN-5650
URL: https://issues.apache.org/jira/browse/KYLIN-5650
Project: Kylin
Issue Type: Bug
Components: Tools, Build and Test
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
Fix For: 5.0-alpha
Attachments: In the cloud environment, there is a probability that the
dictionary metadata file will be read abnormally during building job, resulting
in incorrect query results..pdf
I checked the dictionary and found no duplicate values. I checked the execution
plan of the dictionary-building step and found no problem. I then checked the
steps that build the flat table and found that the problem is in the step that
encodes the flat table with the dictionary.
The cause of the error is that the flat table is not repartitioned by the
dictionary column before it is encoded. As shown in the figure, the plan
contains no repartition, yet the encoded column still appears in it.
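
To illustrate why the missing repartition matters, here is a simplified model. It is not Kylin's code. It assumes the global dictionary is split into hash buckets and that a row must sit in the partition matching its value's bucket before it is looked up; the report does not spell this out. The class and method names (BucketedDictSketch, bucketOf, and so on) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model only; names are hypothetical and not taken from Kylin.
final class BucketedDictSketch {

    /** Bucket that owns a value: non-negative hash modulo the bucket count. */
    static int bucketOf(String value, int bucketSize) {
        return Math.floorMod(value.hashCode(), bucketSize);
    }

    /** Builds one value-to-ID map per bucket; IDs are unique across all buckets. */
    static List<Map<String, Long>> buildDict(List<String> distinctValues, int bucketSize) {
        List<Map<String, Long>> buckets = new ArrayList<>();
        for (int i = 0; i < bucketSize; i++) {
            buckets.add(new HashMap<>());
        }
        long nextId = 1;
        for (String value : distinctValues) {
            Map<String, Long> bucket = buckets.get(bucketOf(value, bucketSize));
            if (!bucket.containsKey(value)) {
                bucket.put(value, nextId++);
            }
        }
        return buckets;
    }

    /** Intended path: route the value to its own bucket first, then look it up. */
    static Long encodeAfterRepartition(List<Map<String, Long>> dict, String value) {
        return dict.get(bucketOf(value, dict.size())).get(value);
    }

    /**
     * Faulty path: look the value up in the bucket of whatever partition the
     * row currently sits in. Returns null when that bucket does not own the value.
     */
    static Long encodeWithoutRepartition(List<Map<String, Long>> dict, String value,
                                         int currentPartition) {
        return dict.get(Math.floorMod(currentPartition, dict.size())).get(value);
    }
}
```

In this model, a value looked up in a partition that does not match its bucket gets no ID at all, which is one way an encoded column can end up wrong even though the dictionary itself has no duplicates.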
There are also the following logs:
{code:java}
2023-03-26T20:26:30,868 INFO [logger-thread-0] dict.NGlobalDictHDFSStore : Commit from s3a://datalake-kc-s3-prd-bj/kylin/kcprodYcHG_kylin/datalake_kylin/dict/global_dict/GDT.GDT_CMPLYA_FCT_DIST_RESLT/IS_STAT/working to s3a://datalake-kc-s3-prd-bj/kylin/kcprodYcHG_kylin/datalake_kylin/dict/global_dict/GDT.GDT_CMPLYA_FCT_DIST_RESLT/IS_STAT/version_1679862387539
2023-03-26T20:31:14,501 INFO [logger-thread-0] dict.NGlobalDictionaryV2 : getMetaInfo versions.length is 12
2023-03-26T20:31:14,547 INFO [logger-thread-0] dict.NGlobalDictHDFSStore : because metaFiles.length is 0, metaInfo is null
2023-03-26T20:31:14,547 INFO [logger-thread-0] dict.NGlobalDictionaryV2 : getMetaInfo metadata is null : [true]
{code}
This happened on S3: after the dictionary directory was renamed, no metadata
file was found in it. Even so, when the metadata cannot be obtained, the code
reports no error and encodes directly without repartitioning, which is not
reasonable. In short, the dictionary column on the flat table is not encoded
correctly.
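
One possible direction, sketched below under stated assumptions, is to fail the step when the metadata is missing instead of encoding without a repartition. The bounded re-read is an addition of this sketch, not something the report proposes. DictMetaSketch, DictMetaGuard, loadOrFail and the bucketSize field are hypothetical names and do not come from Kylin.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical stand-in for the dictionary metadata; not Kylin's actual class.
final class DictMetaSketch {
    final int bucketSize;

    DictMetaSketch(int bucketSize) {
        this.bucketSize = bucketSize;
    }
}

// Hypothetical guard: re-read the metadata a bounded number of times, then fail loudly.
final class DictMetaGuard {

    static DictMetaSketch loadOrFail(Supplier<Optional<DictMetaSketch>> reader,
                                     int maxAttempts, long backoffMillis, String column) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<DictMetaSketch> meta = reader.get();
            if (meta.isPresent()) {
                return meta.get();
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(backoffMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop waiting.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(
                            "Interrupted while reading dictionary metadata for " + column, e);
                }
            }
        }
        // Never fall through to encoding: surface the missing metadata as an error.
        throw new IllegalStateException(
                "Dictionary metadata not found for column " + column + " after "
                        + maxAttempts + " attempt(s); refusing to encode without repartition");
    }
}
```

With a guard like this, the build job would fail visibly rather than produce a flat table whose dictionary column is wrongly encoded. The right retry count and delay, if any, would depend on the storage layer.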
--
This message was sent by Atlassian Jira
(v8.20.10#820010)