[jira] [Updated] (HUDI-1502) Restore on MOR table leaves metadata table out-of-sync from data table
[ https://issues.apache.org/jira/browse/HUDI-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1502:
---------------------------------
    Status: Closed  (was: Patch Available)

> Restore on MOR table leaves metadata table out-of-sync from data table
> ----------------------------------------------------------------------
>
>                 Key: HUDI-1502
>                 URL: https://issues.apache.org/jira/browse/HUDI-1502
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Writer Core
>            Reporter: Vinoth Chandar
>            Assignee: Vinoth Chandar
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.7.0
>
>         Attachments: image-2021-01-03-22-48-54-646.png
>
>
> Below is the trace from running `TestHoodieBackedMetadata#testSync` on
> MOR tables. This seems like a more fundamental issue with deleting instant
> files during restore.
> What happens is that the restore rolls back a delta commit that has not
> been synced yet (20210103224054 in the example). That delta commit
> introduced a new log file, which has not been added to the metadata table,
> but the restore effectively deletes 20210103224054.deltacommit.
> {code}
> Commit 20210103224042 added HoodieKey { recordKey=2016/03/15 partitionPath=files} HoodieMetadataPayload {key=2016/03/15, type=2, creations=[6b8f2187-5505-40ae-845e-a71a2163d064-0_4-2-6_20210103224041.parquet], deletions=[], }
> HoodieKey { recordKey=2015/03/16 partitionPath=files} HoodieMetadataPayload {key=2015/03/16, type=2, creations=[028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_1-9-19_20210103224042.parquet, 25c9a174-4c07-43a1-a1a2-40454a3f0310-0_0-9-18_20210103224042.parquet], deletions=[], }
> HoodieKey { recordKey=2015/03/17 partitionPath=files} HoodieMetadataPayload {key=2015/03/17, type=2, creations=[2ab899de-4745-43c5-9fa4-d09721d3aa91-0_1-2-3_20210103224041.parquet, 4733dbda-7824-4411-a708-4b2d978f887b-0_4-9-22_20210103224042.parquet, 532a6f9b-ca89-4b96-84b7-0e3b13068b4b-0_3-9-21_20210103224042.parquet, 6842e596-46b3-4546-9faa-8a7f8c674a17-0_0-2-2_20210103224041.parquet, 7f0635d7-126e-40b6-9677-7fd8a123d5b9-0_3-2-5_20210103224041.parquet, d1906fdc-66ca-48a4-86b6-687c865d939d-0_2-9-20_20210103224042.parquet, fd446460-a662-434a-a6ab-1cd498af94ca-0_2-2-4_20210103224041.parquet], deletions=[], }
> HoodieKey { recordKey=__all_partitions__ partitionPath=files} HoodieMetadataPayload {key=__all_partitions__, type=1, creations=[2015/03/16, 2015/03/17, 2016/03/15], deletions=[], }
> Syncing [20210103224045__deltacommit__COMPLETED] to metadata table.
> Commit 20210103224045 added HoodieKey { recordKey=2016/03/15 partitionPath=files} HoodieMetadataPayload {key=2016/03/15, type=2, creations=[6b8f2187-5505-40ae-845e-a71a2163d064-0_0-31-52_20210103224045.parquet], deletions=[], }
> HoodieKey { recordKey=2015/03/16 partitionPath=files} HoodieMetadataPayload {key=2015/03/16, type=2, creations=[25c9a174-4c07-43a1-a1a2-40454a3f0310-0_1-31-53_20210103224045.parquet], deletions=[], }
> HoodieKey { recordKey=2015/03/17 partitionPath=files} HoodieMetadataPayload {key=2015/03/17, type=2, creations=[2ab899de-4745-43c5-9fa4-d09721d3aa91-0_2-31-54_20210103224045.parquet], deletions=[], }
> HoodieKey { recordKey=__all_partitions__ partitionPath=files} HoodieMetadataPayload {key=__all_partitions__, type=1, creations=[2015/03/16, 2015/03/17, 2016/03/15], deletions=[], }
>
> >>> (after compaction) State at 20210103224051 files
> .028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_20210103224042.log.1_0-100-148
> 028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_1-9-19_20210103224042.parquet
> 028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_3-110-170_20210103224051.parquet
> 25c9a174-4c07-43a1-a1a2-40454a3f0310-0_0-9-18_20210103224042.parquet
> 25c9a174-4c07-43a1-a1a2-40454a3f0310-0_1-31-53_20210103224045.parquet
>
> >>> (after delete) State at 20210103224052 files
> .028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_20210103224042.log.1_0-100-148
> 028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_1-9-19_20210103224042.parquet
> 028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_3-110-170_20210103224051.parquet
> 25c9a174-4c07-43a1-a1a2-40454a3f0310-0_0-9-18_20210103224042.parquet
> 25c9a174-4c07-43a1-a1a2-40454a3f0310-0_1-31-53_20210103224045.parquet
>
> >>> (after clean) State at 20210103224053 files
> 028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_3-110-170_20210103224051.parquet
> 25c9a174-4c07-43a1-a1a2-40454a3f0310-0_1-31-53_20210103224045.parquet
>
> >>> (after update) State at 20210103224054 files
> .028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_20210103224051.log.1_1-160-262
>
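The failure mode described in the issue can be illustrated with a small Python sketch. It is a toy model only: `ToyTable`, its methods, and the file names are hypothetical and are not Hudi APIs. The sketch assumes that sync applies only instants still present on the timeline, and that restore removes the instant while the log file it wrote stays on disk.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Hypothetical stand-in for a data table plus its file-listing metadata table."""
    timeline: dict = field(default_factory=dict)        # instant -> files it added
    data_files: set = field(default_factory=set)        # files physically present
    synced_instants: set = field(default_factory=set)   # instants applied to metadata
    metadata_files: set = field(default_factory=set)    # files the metadata table lists

    def commit(self, instant, files):
        """Record a completed (delta) commit and the files it wrote."""
        self.timeline[instant] = set(files)
        self.data_files |= set(files)

    def sync(self):
        """Apply every instant on the timeline that metadata has not seen yet."""
        for instant in sorted(self.timeline):
            if instant not in self.synced_instants:
                self.metadata_files |= self.timeline[instant]
                self.synced_instants.add(instant)

    def restore(self, instant):
        """Assumed behaviour: the instant file is deleted, its log file remains."""
        del self.timeline[instant]

    def unlisted_files(self):
        """Files present in the data table that the metadata table does not list."""
        return self.data_files - self.metadata_files


def demo():
    table = ToyTable()
    table.commit("20210103224045", {"fg1_base.parquet"})
    table.sync()                                   # metadata is up to date here
    table.commit("20210103224054", {".fg1.log.1"})  # not synced yet
    table.restore("20210103224054")                # instant disappears from timeline
    table.sync()                                   # nothing left to sync
    return table.unlisted_files()


print(demo())
```

In this model the second `sync()` has nothing to apply, so the log file written by the rolled-back delta commit is never listed in the metadata table.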
[jira] [Updated] (HUDI-1502) Restore on MOR table leaves metadata table out-of-sync from data table
[ https://issues.apache.org/jira/browse/HUDI-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1502:
---------------------------------
    Status: Patch Available  (was: In Progress)
[jira] [Updated] (HUDI-1502) Restore on MOR table leaves metadata table out-of-sync from data table
[ https://issues.apache.org/jira/browse/HUDI-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-1502:
---------------------------------
    Labels: pull-request-available  (was: )
[jira] [Updated] (HUDI-1502) Restore on MOR table leaves metadata table out-of-sync from data table
[ https://issues.apache.org/jira/browse/HUDI-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1502:
---------------------------------
    Status: Open  (was: New)
[jira] [Updated] (HUDI-1502) Restore on MOR table leaves metadata table out-of-sync from data table
[ https://issues.apache.org/jira/browse/HUDI-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1502:
---------------------------------
        Parent: HUDI-1292
    Issue Type: Sub-task  (was: Bug)
[jira] [Updated] (HUDI-1502) Restore on MOR table leaves metadata table out-of-sync from data table
[ https://issues.apache.org/jira/browse/HUDI-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1502:
---------------------------------
    Status: In Progress  (was: Open)
[jira] [Updated] (HUDI-1502) Restore on MOR table leaves metadata table out-of-sync from data table
[ https://issues.apache.org/jira/browse/HUDI-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-1502:
---------------------------------
    Description:
Below is the trace from running `TestHoodieBackedMetadata#testSync` on MOR tables. This seems like a more fundamental issue with deleting instant files during restore.

What happens is that the restore rolls back a delta commit that has not been synced yet (20210103224054 in the example). That delta commit introduced a new log file, which has not been added to the metadata table, but the restore effectively deletes 20210103224054.deltacommit.

{code}
Commit 20210103224042 added HoodieKey { recordKey=2016/03/15 partitionPath=files} HoodieMetadataPayload {key=2016/03/15, type=2, creations=[6b8f2187-5505-40ae-845e-a71a2163d064-0_4-2-6_20210103224041.parquet], deletions=[], }
HoodieKey { recordKey=2015/03/16 partitionPath=files} HoodieMetadataPayload {key=2015/03/16, type=2, creations=[028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_1-9-19_20210103224042.parquet, 25c9a174-4c07-43a1-a1a2-40454a3f0310-0_0-9-18_20210103224042.parquet], deletions=[], }
HoodieKey { recordKey=2015/03/17 partitionPath=files} HoodieMetadataPayload {key=2015/03/17, type=2, creations=[2ab899de-4745-43c5-9fa4-d09721d3aa91-0_1-2-3_20210103224041.parquet, 4733dbda-7824-4411-a708-4b2d978f887b-0_4-9-22_20210103224042.parquet, 532a6f9b-ca89-4b96-84b7-0e3b13068b4b-0_3-9-21_20210103224042.parquet, 6842e596-46b3-4546-9faa-8a7f8c674a17-0_0-2-2_20210103224041.parquet, 7f0635d7-126e-40b6-9677-7fd8a123d5b9-0_3-2-5_20210103224041.parquet, d1906fdc-66ca-48a4-86b6-687c865d939d-0_2-9-20_20210103224042.parquet, fd446460-a662-434a-a6ab-1cd498af94ca-0_2-2-4_20210103224041.parquet], deletions=[], }
HoodieKey { recordKey=__all_partitions__ partitionPath=files} HoodieMetadataPayload {key=__all_partitions__, type=1, creations=[2015/03/16, 2015/03/17, 2016/03/15], deletions=[], }
Syncing [20210103224045__deltacommit__COMPLETED] to metadata table.
Commit 20210103224045 added HoodieKey { recordKey=2016/03/15 partitionPath=files} HoodieMetadataPayload {key=2016/03/15, type=2, creations=[6b8f2187-5505-40ae-845e-a71a2163d064-0_0-31-52_20210103224045.parquet], deletions=[], }
HoodieKey { recordKey=2015/03/16 partitionPath=files} HoodieMetadataPayload {key=2015/03/16, type=2, creations=[25c9a174-4c07-43a1-a1a2-40454a3f0310-0_1-31-53_20210103224045.parquet], deletions=[], }
HoodieKey { recordKey=2015/03/17 partitionPath=files} HoodieMetadataPayload {key=2015/03/17, type=2, creations=[2ab899de-4745-43c5-9fa4-d09721d3aa91-0_2-31-54_20210103224045.parquet], deletions=[], }
HoodieKey { recordKey=__all_partitions__ partitionPath=files} HoodieMetadataPayload {key=__all_partitions__, type=1, creations=[2015/03/16, 2015/03/17, 2016/03/15], deletions=[], }

>>> (after compaction) State at 20210103224051 files
.028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_20210103224042.log.1_0-100-148
028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_1-9-19_20210103224042.parquet
028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_3-110-170_20210103224051.parquet
25c9a174-4c07-43a1-a1a2-40454a3f0310-0_0-9-18_20210103224042.parquet
25c9a174-4c07-43a1-a1a2-40454a3f0310-0_1-31-53_20210103224045.parquet

>>> (after delete) State at 20210103224052 files
.028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_20210103224042.log.1_0-100-148
028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_1-9-19_20210103224042.parquet
028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_3-110-170_20210103224051.parquet
25c9a174-4c07-43a1-a1a2-40454a3f0310-0_0-9-18_20210103224042.parquet
25c9a174-4c07-43a1-a1a2-40454a3f0310-0_1-31-53_20210103224045.parquet

>>> (after clean) State at 20210103224053 files
028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_3-110-170_20210103224051.parquet
25c9a174-4c07-43a1-a1a2-40454a3f0310-0_1-31-53_20210103224045.parquet

>>> (after update) State at 20210103224054 files
.028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_20210103224051.log.1_1-160-262
.25c9a174-4c07-43a1-a1a2-40454a3f0310-0_20210103224045.log.1_2-160-263
028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_3-110-170_20210103224051.parquet
25c9a174-4c07-43a1-a1a2-40454a3f0310-0_1-31-53_20210103224045.parquet

>>> (after restore) State after restore files
.028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_20210103224051.log.1_1-160-262
.028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_20210103224051.log.2_1-0-1
.25c9a174-4c07-43a1-a1a2-40454a3f0310-0_20210103224045.log.1_2-160-263
.25c9a174-4c07-43a1-a1a2-40454a3f0310-0_20210103224045.log.2_1-0-1
028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_3-110-170_20210103224051.parquet
25c9a174-4c07-43a1-a1a2-40454a3f0310-0_1-31-53_20210103224045.parquet

Syncing
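The difference between the "after update" and "after restore" listings is easier to see programmatically. The Python sketch below diffs the two states copied from the trace and parses the added names. The regular expression assumes a log file layout of `.<fileId>_<baseInstant>.log.<version>_<writeToken>`, which is inferred from the names in the listing rather than taken from Hudi's code; `added_log_files` is a hypothetical helper.

```python
import re

# Assumed layout, inferred from the listing: .<fileId>_<baseInstant>.log.<version>_<writeToken>
LOG_FILE = re.compile(
    r"^\.(?P<file_id>.+)_(?P<base_instant>\d+)\.log\.(?P<version>\d+)_(?P<write_token>[\d-]+)$"
)

AFTER_UPDATE = [
    ".028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_20210103224051.log.1_1-160-262",
    ".25c9a174-4c07-43a1-a1a2-40454a3f0310-0_20210103224045.log.1_2-160-263",
    "028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_3-110-170_20210103224051.parquet",
    "25c9a174-4c07-43a1-a1a2-40454a3f0310-0_1-31-53_20210103224045.parquet",
]

AFTER_RESTORE = [
    ".028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_20210103224051.log.1_1-160-262",
    ".028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_20210103224051.log.2_1-0-1",
    ".25c9a174-4c07-43a1-a1a2-40454a3f0310-0_20210103224045.log.1_2-160-263",
    ".25c9a174-4c07-43a1-a1a2-40454a3f0310-0_20210103224045.log.2_1-0-1",
    "028cc15e-85ef-4b6f-b6f1-a1aa01131dbc-0_3-110-170_20210103224051.parquet",
    "25c9a174-4c07-43a1-a1a2-40454a3f0310-0_1-31-53_20210103224045.parquet",
]


def added_log_files(before, after):
    """Return (file_id, base_instant, version) for log files in `after` but not `before`."""
    parsed = []
    for name in sorted(set(after) - set(before)):
        match = LOG_FILE.match(name)
        if match:  # base (.parquet) files do not match and are skipped
            parsed.append(
                (match["file_id"], match["base_instant"], int(match["version"]))
            )
    return parsed


for file_id, base_instant, version in added_log_files(AFTER_UPDATE, AFTER_RESTORE):
    print(f"{file_id} base={base_instant} log version={version}")
```

Under these assumptions, the diff shows that the restore added one version-2 log file to each of the two file groups while the version-1 log files written at 20210103224054 remain on disk.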