loukey_j created HUDI-4133:
------------------------------

             Summary: Spark snapshot query on MOR table loses data
                 Key: HUDI-4133
                 URL: https://issues.apache.org/jira/browse/HUDI-4133
             Project: Apache Hudi
          Issue Type: Bug
          Components: core
            Reporter: loukey_j

Suppose two batches of data with no records in common are written in turn by Flink to a new, non-partitioned Hudi MOR table. The Hoodie timeline and log files are as follows:

hdfs dfs -ls hdfs://xxx/mor_test/.hoodie
       0 2022-05-21 16:41 hdfs://xxx/mor_test/.hoodie/.aux
       0 2022-05-21 16:41 hdfs://xxx/mor_test/.hoodie/.schema
       0 2022-05-21 16:42 hdfs://xxx/mor_test/.hoodie/.temp
    5291 2022-05-21 16:42 hdfs://xxx/mor_test/.hoodie/20220521164201245.deltacommit
       0 2022-05-21 16:42 hdfs://xxx/mor_test/.hoodie/20220521164201245.deltacommit.inflight
       0 2022-05-21 16:42 hdfs://xxx/mor_test/.hoodie/20220521164201245.deltacommit.requested
    5291 2022-05-21 16:42 hdfs://xxx/mor_test/.hoodie/20220521164214473.deltacommit
       0 2022-05-21 16:42 hdfs://xxx/mor_test/.hoodie/20220521164214473.deltacommit.inflight
       0 2022-05-21 16:42 hdfs://xxx/mor_test/.hoodie/20220521164214473.deltacommit.requested
       0 2022-05-21 16:41 hdfs://xxx/mor_test/.hoodie/archived
     798 2022-05-21 16:41 hdfs://xxx/mor_test/.hoodie/hoodie.properties

hdfs dfs -ls hdfs://xxx/mor_test/
   13316 2022-05-21 16:42 hdfs://xxx/mor_test/.00000000-1dd6-4395-9c90-53f8a6c6eed3_20220521164201245.log.1_0-2-0
   28395 2022-05-21 16:42 hdfs://xxx/mor_test/.00000000-1dd6-4395-9c90-53f8a6c6eed3_20220521164214473.log.1_0-2-0
       0 2022-05-21 16:42 hdfs://xxx/mor_test/.hoodie
     100 2022-05-21 16:42 hdfs://xxx/mor_test/.hoodie_partition_metadata

Run the following SQL as a Spark snapshot query:

    select distinct _hoodie_commit_time from mor_test_rt

Expected result: both 20220521164201245 and 20220521164214473.
Actual result: only 20220521164214473.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
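
As an aid to reading the listing above, here is a minimal Python sketch that groups the two log files by file ID and collects the instant embedded in each name. It assumes the Hudi log-file naming layout `.<fileId>_<baseInstant>.log.<version>_<writeToken>`; the regex and the helper `group_log_files` are illustrative and not part of Hudi. Whether this grouping is related to the missing rows is not established by this report.

```python
import re
from collections import defaultdict

# Assumed naming layout: .<fileId>_<baseInstant>.log.<version>_<writeToken>
LOG_FILE_RE = re.compile(
    r"^\.(?P<file_id>.+)_(?P<base_instant>\d+)"
    r"\.log\.(?P<version>\d+)_(?P<write_token>\d+-\d+-\d+)$"
)


def group_log_files(names):
    """Hypothetical helper: map each file ID to the sorted instants in its log-file names."""
    groups = defaultdict(set)
    for name in names:
        match = LOG_FILE_RE.match(name)
        if match:  # skip anything that is not a log file, e.g. .hoodie
            groups[match.group("file_id")].add(match.group("base_instant"))
    return {file_id: sorted(instants) for file_id, instants in groups.items()}


# File names copied from the `hdfs dfs -ls` output in this report.
log_files = [
    ".00000000-1dd6-4395-9c90-53f8a6c6eed3_20220521164201245.log.1_0-2-0",
    ".00000000-1dd6-4395-9c90-53f8a6c6eed3_20220521164214473.log.1_0-2-0",
]

slices = group_log_files(log_files)
for file_id, instants in slices.items():
    print(file_id, instants)
```

Both log files carry the same file ID but different instants in their names, matching the two deltacommits on the timeline.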