[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17399085#comment-17399085 ]

ASF GitHub Bot commented on HUDI-2119:
--------------------------------------

vinothchandar merged pull request #3210:
URL: https://github.com/apache/hudi/pull/3210

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Syncing of rollbacks to metadata table does not work in all cases
> ------------------------------------------------------------------
>
>                 Key: HUDI-2119
>                 URL: https://issues.apache.org/jira/browse/HUDI-2119
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Prashant Wason
>            Assignee: Prashant Wason
>            Priority: Blocker
>              Labels: pull-request-available, release-blocker
>             Fix For: 0.9.0
>
>
> This is an issue with inline automatic rollbacks.
> The metadata table assumes that a rollback is to be applied if the
> instant being rolled back has a timestamp less than the last deltacommit
> time on the metadata timeline. We do not explicitly check whether the
> instant being rolled back was actually written to the metadata table.
> A rollback adds a record to the metadata table which "deletes" files from
> a failed/earlier commit. If the files being deleted were never actually
> committed to the metadata table earlier, the deletes cannot be consolidated
> during metadata table reads. This leads to a HoodieMetadataException, as we
> cannot differentiate this from a bug where we might have missed committing
> a commit to the metadata table.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
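The condition described in the issue can be illustrated with a small sketch. It is not the change made in pull request #3210 and does not use Hudi's API: the class `RollbackSyncSketch`, both method names, and the sample instant times are hypothetical, and instant times are assumed to be fixed-width strings that order lexicographically.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names come from Hudi.
public class RollbackSyncSketch {

    // Timestamp-only rule from the issue description: apply the rollback
    // whenever the rolled-back instant is older than the last deltacommit
    // on the metadata timeline.
    public static boolean shouldApplyByTimestamp(String rolledBackInstant, String lastDeltaCommit) {
        return rolledBackInstant.compareTo(lastDeltaCommit) < 0;
    }

    // Explicit rule: apply the rollback only if the rolled-back instant
    // was actually written to the metadata table.
    public static boolean shouldApplyIfSynced(String rolledBackInstant, Set<String> syncedInstants) {
        return syncedInstants.contains(rolledBackInstant);
    }

    public static void main(String[] args) {
        // Instants 001 and 003 reached the metadata table; 002 failed and never did.
        Set<String> synced = new TreeSet<>(Arrays.asList("001", "003"));
        String lastDeltaCommit = "003";
        String failedInstant = "002";

        System.out.println("timestamp rule applies rollback: "
                + shouldApplyByTimestamp(failedInstant, lastDeltaCommit));
        System.out.println("explicit rule applies rollback: "
                + shouldApplyIfSynced(failedInstant, synced));
    }
}
```

Under these assumptions the two rules disagree for instant 002: the timestamp rule would record "deletes" for files the metadata table never saw, which is the case the issue says cannot be consolidated at read time.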
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17399076#comment-17399076 ]

ASF GitHub Bot commented on HUDI-2119:
--------------------------------------

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 349cd1db6edcf13dc5012242a286dea262ba7276 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1732)

Bot commands
@hudi-bot supports the following commands:

- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17399069#comment-17399069 ]

ASF GitHub Bot commented on HUDI-2119:
--------------------------------------

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 7bad968f01fbae614e5aa973e12287afcc0e1e5e Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1730)
* 349cd1db6edcf13dc5012242a286dea262ba7276 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1732)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17399067#comment-17399067 ]

ASF GitHub Bot commented on HUDI-2119:
--------------------------------------

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 7bad968f01fbae614e5aa973e12287afcc0e1e5e Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1730)
* 349cd1db6edcf13dc5012242a286dea262ba7276 UNKNOWN
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17399066#comment-17399066 ]

ASF GitHub Bot commented on HUDI-2119:
--------------------------------------

vinothchandar commented on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-898802063

@prashantwason I reverted the change that added a `sync()` to `HoodieMetadataFileSystemView`. It breaks the timeline-server-based fetch, since reuse is not correctly configured during re-creation in sync. It's flaky because the issue only happens if the test runs long enough to go beyond a sync interval. So if you introduce a sleep between the time the writeClient is created and when the upserts happen, you'll observe this.
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17399061#comment-17399061 ]

ASF GitHub Bot commented on HUDI-2119:
--------------------------------------

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 7bad968f01fbae614e5aa973e12287afcc0e1e5e Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1730)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17399005#comment-17399005 ]

ASF GitHub Bot commented on HUDI-2119:
--------------------------------------

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 6f885e8e020bf746dd9d0e85c3684bee0091f47e Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1728)
* 7bad968f01fbae614e5aa973e12287afcc0e1e5e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1730)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17399004#comment-17399004 ]

ASF GitHub Bot commented on HUDI-2119:
--------------------------------------

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* fb3435e082d9cee0841763dcdaf39135da0f4858 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1724)
* 6f885e8e020bf746dd9d0e85c3684bee0091f47e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1728)
* 7bad968f01fbae614e5aa973e12287afcc0e1e5e UNKNOWN
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398990#comment-17398990 ]

ASF GitHub Bot commented on HUDI-2119:
--------------------------------------

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* fb3435e082d9cee0841763dcdaf39135da0f4858 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1724)
* 6f885e8e020bf746dd9d0e85c3684bee0091f47e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1728)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398988#comment-17398988 ]

ASF GitHub Bot commented on HUDI-2119:
--------------------------------------

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* fb3435e082d9cee0841763dcdaf39135da0f4858 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1724)
* 6f885e8e020bf746dd9d0e85c3684bee0091f47e UNKNOWN
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398976#comment-17398976 ]

ASF GitHub Bot commented on HUDI-2119:
--------------------------------------

vinothchandar commented on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-898757125

@prashantwason https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_apis/build/builds/1724/logs/29 has the logs. My fix is not working; still looking.
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398962#comment-17398962 ]

ASF GitHub Bot commented on HUDI-2119:
--------------------------------------

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* fb3435e082d9cee0841763dcdaf39135da0f4858 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1724)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398949#comment-17398949 ]

ASF GitHub Bot commented on HUDI-2119:
--------------------------------------

vinothchandar commented on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-898740933

> But if tableMetadata is ever null then baseTableMetadata would also be null and that would be another NPE.

It's initialized in the constructor. The issue is more about re-creation on sync. The NPE I saw comes from accessing a closed reader. `sync()` is already called with a lock on the view object.
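The failure mode discussed in this comment, a reader being closed and re-created during `sync()` while something else still holds the old one, can be sketched as follows. This is an assumption-laden illustration, not Hudi code: `SyncRecreationSketch`, `Reader`, `View`, and `leakReader()` are hypothetical names, and the stale access here surfaces as an `IllegalStateException` rather than the NPE seen in the CI logs.

```java
// Hypothetical illustration only; none of these names come from Hudi.
public class SyncRecreationSketch {

    // Stand-in for a metadata reader that cannot be used after close().
    public static class Reader implements AutoCloseable {
        private boolean closed;

        public String read() {
            if (closed) {
                throw new IllegalStateException("reader is closed");
            }
            return "ok";
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // Stand-in for a view that re-creates its reader on every sync().
    public static class View {
        private Reader reader = new Reader();

        // Safe path: the lookup and the read happen under the same lock as sync().
        public synchronized String read() {
            return reader.read();
        }

        // Unsafe path: hands out a reference that a later sync() will close.
        public synchronized Reader leakReader() {
            return reader;
        }

        // Closes the old reader and swaps in a fresh one while holding the lock.
        public synchronized void sync() {
            reader.close();
            reader = new Reader();
        }
    }

    public static void main(String[] args) {
        View view = new View();
        Reader stale = view.leakReader();
        view.sync();
        System.out.println(view.read()); // reads through the new reader
        try {
            stale.read();
        } catch (IllegalStateException e) {
            System.out.println("stale reference failed: " + e.getMessage());
        }
    }
}
```

Locking `sync()` on the view protects callers that go through the view, but not a component that cached the old reader before the swap, which is consistent with the point that the lock alone does not explain the failure.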
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398950#comment-17398950 ]

ASF GitHub Bot commented on HUDI-2119:
--------------------------------------

vinothchandar edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-898740933

> But if tableMetadata is ever null then baseTableMetadata would also be null and that would be another NPE.

It's initialized in the constructor. The issue is more about re-creation on sync. The NPE I saw comes from accessing a closed reader. `sync()` is already called with a lock on the view object.
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398946#comment-17398946 ]

ASF GitHub Bot commented on HUDI-2119:
--------------------------------------

prashantwason commented on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-898738969

No side effects. But if tableMetadata is ever null then baseTableMetadata would also be null and that would be another NPE.
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398947#comment-17398947 ]
ASF GitHub Bot commented on HUDI-2119:
prashantwason commented on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-898739309
Do we need to synchronize the sync()? Maybe two threads are calling it at the same time? I couldn't find which exact unit test is failing.
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398941#comment-17398941 ]
ASF GitHub Bot commented on HUDI-2119:
vinothchandar commented on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-898735915
@prashantwason I think the issue is that the table metadata is not getting recreated when the timeline service calls sync. I removed the null check you had in HoodieMetadataFileSystemView.
![image](https://user-images.githubusercontent.com/1179324/129421396-9e45c7e4-5f69-4972-a733-75edcce36f29.png)
Let me know if you see side effects.
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398936#comment-17398936 ]
ASF GitHub Bot commented on HUDI-2119:
hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566
## CI report:
* 96c73ba8ed99f51f31efdac0ab60d6b95bc782ed Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1699)
* fb3435e082d9cee0841763dcdaf39135da0f4858 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1724)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398934#comment-17398934 ]
ASF GitHub Bot commented on HUDI-2119:
hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566
## CI report:
* 96c73ba8ed99f51f31efdac0ab60d6b95bc782ed Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1699)
* fb3435e082d9cee0841763dcdaf39135da0f4858 UNKNOWN
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398886#comment-17398886 ]
ASF GitHub Bot commented on HUDI-2119:
vinothchandar commented on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-898683191
My latest theory is that the changes in this PR for syncing the filesystem view could be surfacing some concurrency issue. It seems like we are reusing the same baseFileReader even after close has been called.
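The hazard described in this comment (a reader being reused after close has been called) can be illustrated with a small sketch. This is not Hudi code: `ReopeningReaderHolder` and its `opener` are hypothetical names, and the sketch only assumes that a reader can be re-created on demand. It serializes `get()` and `close()` on one lock and reopens lazily, so a caller asking for the reader after a close never receives the closed instance.

```java
import java.util.function.Supplier;

// Hypothetical illustration only; not part of Hudi's API.
final class ReopeningReaderHolder<R extends AutoCloseable> {
    private final Supplier<R> opener; // assumed factory that opens a fresh reader
    private R reader;                 // guarded by "this"; null means "not open"

    ReopeningReaderHolder(Supplier<R> opener) {
        this.opener = opener;
    }

    // Returns the current reader, opening a new one if none is open.
    synchronized R get() {
        if (reader == null) {
            reader = opener.get();
        }
        return reader;
    }

    // Closes the current reader (if any) and clears the reference so that
    // the next get() reopens instead of handing out a closed reader.
    synchronized void close() {
        if (reader == null) {
            return;
        }
        R toClose = reader;
        reader = null;
        try {
            toClose.close();
        } catch (Exception e) {
            throw new IllegalStateException("failed to close reader", e);
        }
    }
}
```

Note that this only protects callers that fetch the reader after the close. A thread that is still using a reader obtained before `close()` would need additional coordination (for example, holding the lock for the duration of the read), which the sketch does not attempt.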
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398869#comment-17398869 ]
ASF GitHub Bot commented on HUDI-2119:
vinothchandar commented on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-898664569
This seems like a general NPE issue. I ran the tests again locally and they failed. Something is flaky, but there is definitely an underlying issue. Looking into that.
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398430#comment-17398430 ]
ASF GitHub Bot commented on HUDI-2119:
hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566
## CI report:
* 96c73ba8ed99f51f31efdac0ab60d6b95bc782ed Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1699)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398418#comment-17398418 ]
ASF GitHub Bot commented on HUDI-2119:
hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566
## CI report:
* a9dcb727c23272b2c8b74647b467f413b9e83f5d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1698)
* 96c73ba8ed99f51f31efdac0ab60d6b95bc782ed Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1699)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398417#comment-17398417 ]
ASF GitHub Bot commented on HUDI-2119:
hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566
## CI report:
* a9dcb727c23272b2c8b74647b467f413b9e83f5d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1698)
* 96c73ba8ed99f51f31efdac0ab60d6b95bc782ed UNKNOWN
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398372#comment-17398372 ]
ASF GitHub Bot commented on HUDI-2119:
hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566
## CI report:
* a9dcb727c23272b2c8b74647b467f413b9e83f5d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1698)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398361#comment-17398361 ]
ASF GitHub Bot commented on HUDI-2119:
hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566
## CI report:
* 23f261c5ebd97be2b7bd5cc9bb9c536c8866da57 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1694)
* a9dcb727c23272b2c8b74647b467f413b9e83f5d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1698)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398360#comment-17398360 ]
ASF GitHub Bot commented on HUDI-2119:
hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566
## CI report:
* 23f261c5ebd97be2b7bd5cc9bb9c536c8866da57 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1694)
* a9dcb727c23272b2c8b74647b467f413b9e83f5d UNKNOWN
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398351#comment-17398351 ]
ASF GitHub Bot commented on HUDI-2119:
vinothchandar commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r688154275
## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedMetadata.java ##
@@ -480,6 +482,131 @@ public void testRollbackUnsyncedCommit(HoodieTableType tableType) throws Excepti
 client.syncTableMetadata();
 validateMetadata(client);
 }
+
+// If an unsynced commit is automatically rolled back during next commit, the rollback commit gets a timestamp
+// greater than than the new commit which is started. Ensure that in this case the rollback is not processed
+// as the earlier failed commit would not have been committed.
+//
+// Dataset: C1C2 C3.inflight[failed] C4 R5[rolls back C3]
+// Metadata: C1.delta C2.delta
+//
+// When R5 completes, C3.xxx will be deleted. When C4 completes, C4 and R5 will be committed to Metadata Table in
Review comment: Okay, let's revisit after the sync design, and I'll add a flag down the line if needed.
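The scenario in the quoted test comment (C3 fails without ever being synced, C4 is synced, then R5 rolls back C3) can be sketched as below. This is a minimal illustration under stated assumptions, not the change made in the pull request: `RollbackSyncSketch` and its methods are hypothetical names, and instant times are modelled as fixed-width strings that compare lexicographically. It contrasts the timestamp-only assumption described in the issue with an explicit check of whether the rolled-back instant was actually written to the metadata table.

```java
import java.util.TreeSet;

// Hypothetical illustration only; not Hudi's actual API or the PR's fix.
final class RollbackSyncSketch {
    // Instant times known to have been committed to the metadata table.
    private final TreeSet<String> syncedInstants = new TreeSet<>();

    void markSynced(String instantTime) {
        syncedInstants.add(instantTime);
    }

    // Timestamp-only rule from the issue description: apply the rollback if
    // the rolled-back instant is older than the last synced deltacommit.
    boolean shouldApplyByTimestamp(String rolledBackInstant) {
        return !syncedInstants.isEmpty()
            && rolledBackInstant.compareTo(syncedInstants.last()) < 0;
    }

    // Explicit rule: apply the rollback only if the rolled-back instant was
    // itself written to the metadata table earlier.
    boolean shouldApplyByMembership(String rolledBackInstant) {
        return syncedInstants.contains(rolledBackInstant);
    }
}
```

With instants "001", "002" and "004" synced and "003" never synced, the timestamp-only rule would apply a rollback of "003" (its deletes would have nothing to consolidate against), while the membership rule would skip it.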
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398243#comment-17398243 ]
ASF GitHub Bot commented on HUDI-2119:
hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566
## CI report:
* 23f261c5ebd97be2b7bd5cc9bb9c536c8866da57 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1694)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398234#comment-17398234 ]
ASF GitHub Bot commented on HUDI-2119:
hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566
## CI report:
* e10bec1dffd99a3cf3007891a7464a30e83cf09d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1690)
* 23f261c5ebd97be2b7bd5cc9bb9c536c8866da57 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1694)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398213#comment-17398213 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* e10bec1dffd99a3cf3007891a7464a30e83cf09d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1690)
* 23f261c5ebd97be2b7bd5cc9bb9c536c8866da57 UNKNOWN
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398212#comment-17398212 ]

ASF GitHub Bot commented on HUDI-2119:

prashantwason commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r687976889

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadata.java

@@ -106,5 +109,13 @@ static HoodieTableMetadata create(HoodieEngineContext engineContext, HoodieMetad
    */
   Option getSyncedInstantTime();
+
+  /**
+   * Get the instant time to which the metadata is synced w.r.t data timeline on the reader side.
+   *
+   * This is different from the getSyncedInstantTime() because the reader should sync all completed instants
+   * from the data timeline in order to always provide the most up to date view of the files within the dataset.
+   */
+  Option getSyncedInstantTimeForReader();

Review comment:
Removed from public interface. Can easily be accessed in testing by casting the interface.
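The reply above (reaching a reader-only accessor from tests by casting, rather than declaring it on the public interface) can be sketched roughly as follows. This is only an illustration under stated assumptions: `TableMetadata` and `BaseMetadata` are hypothetical stand-ins, not the actual Hudi types, `java.util.Optional` is used in place of Hudi's `Option`, and the instant values are made up.

```java
import java.util.Optional;

public class CastInTestSketch {

    // Hypothetical public interface: the reader-side accessor is NOT declared here.
    interface TableMetadata {
        Optional<String> getSyncedInstantTime();
    }

    // Hypothetical implementation that still exposes the reader-side accessor.
    static class BaseMetadata implements TableMetadata {
        @Override
        public Optional<String> getSyncedInstantTime() {
            return Optional.of("001"); // made-up instant time
        }

        public Optional<String> getSyncedInstantTimeForReader() {
            return Optional.of("002"); // made-up instant time
        }
    }

    // What a test helper could do: cast the interface to the implementation
    // to reach the accessor that the interface does not expose.
    static Optional<String> readerSyncedInstant(TableMetadata metadata) {
        if (metadata instanceof BaseMetadata) {
            return ((BaseMetadata) metadata).getSyncedInstantTimeForReader();
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        TableMetadata metadata = new BaseMetadata();
        System.out.println(readerSyncedInstant(metadata));
    }
}
```

The trade-off is that tests become coupled to the concrete class, which is usually acceptable when the alternative is widening an interface that many callers depend on.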
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398205#comment-17398205 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* e10bec1dffd99a3cf3007891a7464a30e83cf09d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1690)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398202#comment-17398202 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 169b88e4ee1126779f2f44f1716687e874392d38 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1670)
* e10bec1dffd99a3cf3007891a7464a30e83cf09d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1690)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398200#comment-17398200 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 169b88e4ee1126779f2f44f1716687e874392d38 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1670)
* e10bec1dffd99a3cf3007891a7464a30e83cf09d UNKNOWN
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398201#comment-17398201 ]

ASF GitHub Bot commented on HUDI-2119:

prashantwason commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r687956408

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java

@@ -344,8 +326,18 @@ protected HoodieEngineContext getEngineContext() {
     return engineContext != null ? engineContext : new HoodieLocalEngineContext(hadoopConf.get());
   }
 
+  public HoodieMetadataConfig getMetadataConfig() {
+    return metadataConfig;
+  }
+
   protected String getLatestDatasetInstantTime() {
     return datasetMetaClient.getActiveTimeline().filterCompletedInstants().lastInstant()
         .map(HoodieInstant::getTimestamp).orElse(SOLO_COMMIT_TIMESTAMP);
   }
+
+  @Override
+  public Option getSyncedInstantTimeForReader() {
+    return Option.ofNullable(timelineMergedMetadata).map(TimelineMergedTableMetadata::getSyncedInstantTime());

Review comment:
Oh. Don't know why it compiles for me. I will fix it.
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398199#comment-17398199 ]

ASF GitHub Bot commented on HUDI-2119:

prashantwason commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r687955736

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/TimelineMergedTableMetadata.java

@@ -112,4 +116,15 @@ private void processNextRecord(HoodieRecord hoodi
   public Option> getRecordByKey(String key) {
     return Option.ofNullable((HoodieRecord) timelineMergedRecords.get(key));
   }
+
+  /**
+   * Returns the timestamp of the latest synced instant.
+   */
+  public Option getSyncedInstantTime() {

Review comment:
Renamed.
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397856#comment-17397856 ]

ASF GitHub Bot commented on HUDI-2119:

vinothchandar commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r687429575

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadata.java

@@ -106,5 +109,13 @@ static HoodieTableMetadata create(HoodieEngineContext engineContext, HoodieMetad
    */
   Option getSyncedInstantTime();
+
+  /**
+   * Get the instant time to which the metadata is synced w.r.t data timeline on the reader side.
+   *
+   * This is different from the getSyncedInstantTime() because the reader should sync all completed instants
+   * from the data timeline in order to always provide the most up to date view of the files within the dataset.
+   */
+  Option getSyncedInstantTimeForReader();

Review comment:
Can we avoid this if we can? This is a pretty important interface; having additional methods for testing may be overkill. Happy to take this up.
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397854#comment-17397854 ]

ASF GitHub Bot commented on HUDI-2119:

vinothchandar commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r687428759

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java

@@ -344,8 +326,18 @@ protected HoodieEngineContext getEngineContext() {
     return engineContext != null ? engineContext : new HoodieLocalEngineContext(hadoopConf.get());
   }
 
+  public HoodieMetadataConfig getMetadataConfig() {
+    return metadataConfig;
+  }
+
   protected String getLatestDatasetInstantTime() {
     return datasetMetaClient.getActiveTimeline().filterCompletedInstants().lastInstant()
         .map(HoodieInstant::getTimestamp).orElse(SOLO_COMMIT_TIMESTAMP);
   }
+
+  @Override
+  public Option getSyncedInstantTimeForReader() {
+    return Option.ofNullable(timelineMergedMetadata).map(TimelineMergedTableMetadata::getSyncedInstantTime());

Review comment:
Use `TimelineMergedTableMetadata::getSyncedInstantTime` (without the parentheses) to get past the compile issues. But `getSyncedInstantTime()` still returns an Option by itself, so maybe we do need to go back to the following?

```
@Override
public Option getSyncedInstantTimeForReader() {
  if (timelineMergedMetadata == null) {
    return Option.empty();
  }
  return timelineMergedMetadata.getSyncedInstantTime();
}
```
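The nesting problem raised in the review comment above (calling `map` with a method that itself returns an optional value) can be illustrated with a small sketch. It assumes `java.util.Optional` as a stand-in for Hudi's `Option`, and `MergedMetadata` is a hypothetical placeholder for `TimelineMergedTableMetadata`. Whether Hudi's `Option` offers a `flatMap` in this version is not confirmed by the thread, so the explicit null check proposed by the reviewer is shown alongside it.

```java
import java.util.Optional;

public class SyncedInstantSketch {

    // Hypothetical stand-in for TimelineMergedTableMetadata.
    static final class MergedMetadata {
        private final String lastInstant; // null when nothing has been synced yet

        MergedMetadata(String lastInstant) {
            this.lastInstant = lastInstant;
        }

        Optional<String> getSyncedInstantTime() {
            return Optional.ofNullable(lastInstant);
        }
    }

    // map() wraps the already-optional result once more, so the type is nested.
    // Note the method reference has no trailing parentheses.
    static Optional<Optional<String>> nested(MergedMetadata merged) {
        return Optional.ofNullable(merged).map(MergedMetadata::getSyncedInstantTime);
    }

    // flatMap() keeps a single level of optionality.
    static Optional<String> flattened(MergedMetadata merged) {
        return Optional.ofNullable(merged).flatMap(MergedMetadata::getSyncedInstantTime);
    }

    // The explicit null check suggested in the review gives the same result.
    static Optional<String> explicit(MergedMetadata merged) {
        if (merged == null) {
            return Optional.empty();
        }
        return merged.getSyncedInstantTime();
    }

    public static void main(String[] args) {
        MergedMetadata synced = new MergedMetadata("20210812101500"); // made-up instant time
        System.out.println(nested(synced));
        System.out.println(flattened(synced));
        System.out.println(explicit(null));
    }
}
```

Both `flattened` and `explicit` return the same value for every input; the explicit form simply avoids depending on a `flatMap` being available on the optional type in use.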
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397708#comment-17397708 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 169b88e4ee1126779f2f44f1716687e874392d38 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1670)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397707#comment-17397707 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 1a90aaff7b511d52759b741db52fd81738fb5418 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1667)
* 169b88e4ee1126779f2f44f1716687e874392d38 UNKNOWN
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397703#comment-17397703 ]

ASF GitHub Bot commented on HUDI-2119:

prashantwason commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r687268778

## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java

@@ -504,6 +511,20 @@ public void close() throws Exception {
     }
   }
 
+  /**
+   * Return the timestamp of the latest synced instant.
+   */
+  @Override
+  public Option getLatestSyncedInstantTime() {

Review comment:
Yep. Remove this method.
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397696#comment-17397696 ]

ASF GitHub Bot commented on HUDI-2119:

prashantwason commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r687264482

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadata.java

@@ -260,6 +265,20 @@ public synchronized void close() throws Exception {
     logRecordScanner = null;
   }
 
+  /**
+   * Return the timestamp of the latest synced instant.
+   */
+  @Override
+  public Option getSyncedInstantTime() {

Review comment:
I don't see this on the writer side. The writer side uses metadata.getSyncedInstantTime().
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397690#comment-17397690 ]

ASF GitHub Bot commented on HUDI-2119:

prashantwason commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r687261435

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java

@@ -322,8 +303,22 @@ protected HoodieEngineContext getEngineContext() {
     return engineContext != null ? engineContext : new HoodieLocalEngineContext(hadoopConf.get());
   }
 
+  public HoodieMetadataConfig getMetadataConfig() {
+    return metadataConfig;
+  }
+
   protected String getLatestDatasetInstantTime() {
     return datasetMetaClient.getActiveTimeline().filterCompletedInstants().lastInstant()
         .map(HoodieInstant::getTimestamp).orElse(SOLO_COMMIT_TIMESTAMP);
   }
+
+  @Override
+  public Option getSyncedInstantTimeForReader() {
+    if (timelineMergedMetadata == null) {

Review comment:
Done. Much better now.
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397691#comment-17397691 ]

ASF GitHub Bot commented on HUDI-2119:

prashantwason commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r687261567

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataFileSystemView.java ##
@@ -73,4 +73,16 @@ public void close() {
       throw new HoodieException("Error closing metadata file system view.", e);
     }
   }
+
+  @Override
+  public void sync() {
+    // Sync the tableMetadata first as super.sync() may call listPartition
+    if (tableMetadata != null) {
+      BaseTableMetadata baseMetadata = (BaseTableMetadata)tableMetadata;

Review comment:
Done.
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397687#comment-17397687 ]

ASF GitHub Bot commented on HUDI-2119:

prashantwason commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r687261008

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadata.java ##
@@ -106,5 +109,13 @@ static HoodieTableMetadata create(HoodieEngineContext engineContext, HoodieMetad
    */
   Option<String> getSyncedInstantTime();

+  /**
+   * Get the instant time to which the metadata is synced w.r.t data timeline on the reader side.
+   *
+   * This is different from the getSyncedInstantTime() because the reader should sync all completed instants
+   * from the data timeline in order to always provide the most up to date view of the files within the dataset.
+   */
+  Option<String> getSyncedInstantTimeForReader();

Review comment:
Yes
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397674#comment-17397674 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 1a90aaff7b511d52759b741db52fd81738fb5418 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1667)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397655#comment-17397655 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 0eebf83fba8adcc6b635f97a48d54037ba0dfedf Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1646)
* 1a90aaff7b511d52759b741db52fd81738fb5418 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1667)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397653#comment-17397653 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 0eebf83fba8adcc6b635f97a48d54037ba0dfedf Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1646)
* 1a90aaff7b511d52759b741db52fd81738fb5418 UNKNOWN
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397647#comment-17397647 ]

ASF GitHub Bot commented on HUDI-2119:

vinothchandar commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r687045652

## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java ##
@@ -137,6 +138,9 @@ protected HoodieBackedTableMetadataWriter(Configuration hadoopConf, HoodieWriteC
   private HoodieWriteConfig createMetadataWriteConfig(HoodieWriteConfig writeConfig) {
     int parallelism = writeConfig.getMetadataInsertParallelism();

+    int minCommitsToKeep = Math.max(writeConfig.getMetadataMinCommitsToKeep(), writeConfig.getMinCommitsToKeep());

Review comment:
Nice.

## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java ##
@@ -504,6 +511,20 @@ public void close() throws Exception {
     }
   }

+  /**
+   * Return the timestamp of the latest synced instant.
+   */
+  @Override
+  public Option<String> getLatestSyncedInstantTime() {

Review comment:
Couldn't we call `metadata.getSyncedInstantTime()` here?

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadata.java ##
@@ -106,5 +109,13 @@ static HoodieTableMetadata create(HoodieEngineContext engineContext, HoodieMetad
    */
   Option<String> getSyncedInstantTime();

+  /**
+   * Get the instant time to which the metadata is synced w.r.t data timeline on the reader side.
+   *
+   * This is different from the getSyncedInstantTime() because the reader should sync all completed instants
+   * from the data timeline in order to always provide the most up to date view of the files within the dataset.
+   */
+  Option<String> getSyncedInstantTimeForReader();

Review comment:
So we need this only for tests?

```
./hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadata.java:  Option<String> getSyncedInstantTimeForReader();
./hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java:  public Option<String> getSyncedInstantTimeForReader() {
./hudi-common/src/main/java/org/apache/hudi/metadata/FileSystemBackedTableMetadata.java:  public Option<String> getSyncedInstantTimeForReader() {
./hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedMetadata.java:    assertEquals(metadata.getSyncedInstantTimeForReader().get(), newCommitTime);
```

## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedMetadata.java ##
@@ -480,6 +497,144 @@ public void testRollbackUnsyncedCommit(HoodieTableType tableType) throws Excepti
       client.syncTableMetadata();
       validateMetadata(client);
     }
+
+    // If an unsynced commit is automatically rolled back during next commit, the rollback commit gets a timestamp
+    // greater than than the new commit which is started. Ensure that in this case the rollback is not processed
+    // as the earlier failed commit would not have been committed.
+    //
+    // Dataset: C1 C2 C3.inflight[failed] C4 R5[rolls back C3]
+    // Metadata: C1.delta C2.delta

Review comment:
This fix also helps avoid the inconsistency exception in the following situation. cc @umehrot2

```
// Dataset: C1 C2 C3.inflight[failed] R4[rolls back C3] C4
// Metadata: C1.delta C2.delta
```

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/TimelineMergedTableMetadata.java ##
@@ -112,4 +116,15 @@ private void processNextRecord(HoodieRecord hoodi
   public Option<HoodieRecord<HoodieMetadataPayload>> getRecordByKey(String key) {
     return Option.ofNullable((HoodieRecord) timelineMergedRecords.get(key));
   }
+
+  /**
+   * Returns the timestamp of the latest synced instant.
+   */
+  public Option<String> getSyncedInstantTime() {

Review comment:
Could we call this something else? This basically returns the earliest unsynced instant time, right? The mixed `sync` terminology here is throwing me off as I try to grok this.
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397538#comment-17397538 ]

ASF GitHub Bot commented on HUDI-2119:

prashantwason commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r687072435

## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedMetadata.java ##
@@ -480,6 +482,131 @@ public void testRollbackUnsyncedCommit(HoodieTableType tableType) throws Excepti
       client.syncTableMetadata();
       validateMetadata(client);
     }
+
+    // If an unsynced commit is automatically rolled back during next commit, the rollback commit gets a timestamp
+    // greater than than the new commit which is started. Ensure that in this case the rollback is not processed
+    // as the earlier failed commit would not have been committed.
+    //
+    // Dataset: C1 C2 C3.inflight[failed] C4 R5[rolls back C3]
+    // Metadata: C1.delta C2.delta
+    //
+    // When R5 completes, C3.xxx will be deleted. When C4 completes, C4 and R5 will be committed to Metadata Table in

Review comment:
Having a config is acceptable, as there can be various use cases for HUDI.

We saw a production issue where the metadata table did not have the latest files. If there had been no exception:
1. Only an error would have been logged.
2. The metadata table would have returned an older file listing.
3. Data would have been written to older versions of the files.

The above is a classic data loss scenario, which would be very difficult to catch once the logs have rolled over. Hence, I want to err on the side of data consistency.

Contrast this with another component in HUDI that does something similar: if the RemoteFileSystemView is unable to get the listing from the TimelineServer, it logs an error and then falls back to file listing. That fallback raises no data consistency issues, so it makes sense there as a way to avoid user intervention. But wrong results from the metadata table can lead to data consistency issues.

BTW, there may be a better way to sync the rollback. Also, much of this complexity will go away with the synchronous design.
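Illustrative note, not part of the thread: the trade-off debated above (fail hard on an inconsistent metadata table versus log and keep the pipeline running) could be sketched as a configurable policy. Everything in this sketch is hypothetical, including the class, enum, and exception names; none of it is a Hudi API, and the fallback shown is a direct file-system listing rather than the stale metadata listing, which is an assumption made here so that the lenient mode never returns outdated files.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names are Hudi APIs.
final class MetadataListingPolicy {

    // FAIL mirrors the "err on the side of data consistency" position;
    // WARN_AND_FALLBACK mirrors the "log it and keep the pipeline online" position.
    enum OnInconsistency { FAIL, WARN_AND_FALLBACK }

    static final class InconsistentMetadataException extends RuntimeException {
        InconsistentMetadataException(String message) {
            super(message);
        }
    }

    private final OnInconsistency mode;

    MetadataListingPolicy(OnInconsistency mode) {
        this.mode = mode;
    }

    // Returns the metadata-table listing when it is consistent. Otherwise either
    // throws, or logs a warning and returns a direct file-system listing instead.
    List<String> listFiles(List<String> metadataListing,
                           boolean consistent,
                           Supplier<List<String>> fileSystemListing) {
        if (consistent) {
            return metadataListing;
        }
        if (mode == OnInconsistency.FAIL) {
            throw new InconsistentMetadataException("Metadata table listing is inconsistent");
        }
        System.err.println("WARN: metadata table is inconsistent, falling back to file listing");
        return fileSystemListing.get();
    }
}
```

With `FAIL`, an inconsistent listing stops the writer before it can write against stale files; with `WARN_AND_FALLBACK`, the caller pays for a direct listing but stays online.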
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397509#comment-17397509 ]

ASF GitHub Bot commented on HUDI-2119:

vinothchandar commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r687032405

## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedMetadata.java ##
@@ -480,6 +482,131 @@ public void testRollbackUnsyncedCommit(HoodieTableType tableType) throws Excepti
       client.syncTableMetadata();
       validateMetadata(client);
     }
+
+    // If an unsynced commit is automatically rolled back during next commit, the rollback commit gets a timestamp
+    // greater than than the new commit which is started. Ensure that in this case the rollback is not processed
+    // as the earlier failed commit would not have been committed.
+    //
+    // Dataset: C1 C2 C3.inflight[failed] C4 R5[rolls back C3]
+    // Metadata: C1.delta C2.delta
+    //
+    // When R5 completes, C3.xxx will be deleted. When C4 completes, C4 and R5 will be committed to Metadata Table in

Review comment:
On this one, I continue to disagree :). We can easily do a logger.warn or error and collect information in an automated fashion to fix the issues. In the current model, heavy user intervention is needed to fix the metadata table and then get the pipeline back online. This is not very user-friendly for folks in OSS.

I am thinking of adding an internal config to make this behavior configurable, so you can have it fail hard at Uber and we can do it the other way in OSS. It'll actually help us see a wider variety of issues and harden the implementation. wdyt?
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397142#comment-17397142 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 0eebf83fba8adcc6b635f97a48d54037ba0dfedf Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1646)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397124#comment-17397124 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 40b0fb75b0cf18dab3559a20a6fa44a3378764fa Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1416)
* 0eebf83fba8adcc6b635f97a48d54037ba0dfedf Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1646)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397122#comment-17397122 ]

ASF GitHub Bot commented on HUDI-2119:

prashantwason commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r686569471

## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedMetadata.java ##
@@ -480,6 +482,131 @@ public void testRollbackUnsyncedCommit(HoodieTableType tableType) throws Excepti
       client.syncTableMetadata();
       validateMetadata(client);
     }
+
+    // If an unsynced commit is automatically rolled back during next commit, the rollback commit gets a timestamp
+    // greater than than the new commit which is started. Ensure that in this case the rollback is not processed
+    // as the earlier failed commit would not have been committed.
+    //
+    // Dataset: C1 C2 C3.inflight[failed] C4 R5[rolls back C3]
+    // Metadata: C1.delta C2.delta
+    //
+    // When R5 completes, C3.xxx will be deleted. When C4 completes, C4 and R5 will be committed to Metadata Table in

Review comment:
When we relax this, we cannot catch issues wherein we may have missed syncing an instant earlier. See HUDI-2286.
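Illustrative note, not part of the thread: the issue description says the metadata table applies a rollback whenever the rolled-back instant is older than the last deltacommit on the metadata timeline, without checking that the instant was ever written there. The sketch below contrasts that timestamp-only rule with an explicit membership check. All names are hypothetical and this is not necessarily how PR #3210 implements the fix. It assumes fixed-width instant times, so that string order matches time order, and it assumes for illustration a metadata timeline that already contains C4.

```java
import java.util.Set;

// Illustrative only: these names are hypothetical and are not Hudi APIs.
final class RollbackSyncCheck {

    // Timestamp-only rule from the issue description: apply the rollback when the
    // rolled-back instant is older than the last deltacommit on the metadata timeline.
    static boolean applyByTimestamp(String rolledBackInstant, String lastMetadataDeltaCommit) {
        return rolledBackInstant.compareTo(lastMetadataDeltaCommit) < 0;
    }

    // Explicit rule: apply the rollback only when the rolled-back instant was
    // actually written to the metadata table.
    static boolean applyIfSynced(String rolledBackInstant, Set<String> syncedInstants) {
        return syncedInstants.contains(rolledBackInstant);
    }

    public static void main(String[] args) {
        // Dataset:  C1 C2 C3.inflight[failed] C4 R5[rolls back C3]
        // Metadata: C1.delta C2.delta C4.delta (assumed here; C3 failed and was never synced)
        Set<String> synced = Set.of("001", "002", "004");
        System.out.println(applyByTimestamp("003", "004")); // true: the rollback would be applied
        System.out.println(applyIfSynced("003", synced));   // false: C3 was never synced, so skip it
    }
}
```

Under the timestamp-only rule the rollback's "deletes" target files that the metadata table never recorded, which is the situation the description says ends in a HoodieMetadataException.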
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397123#comment-17397123 ]

ASF GitHub Bot commented on HUDI-2119:

prashantwason commented on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-896569499

@vinothchandar I reworked part of this so please take a fresh look again.
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397120#comment-17397120 ]

ASF GitHub Bot commented on HUDI-2119:

prashantwason commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r686568735

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataFileSystemView.java ##
@@ -73,4 +73,16 @@ public void close() {
       throw new HoodieException("Error closing metadata file system view.", e);
     }
   }
+
+  @Override
+  public void sync() {

Review comment:
Yes
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397118#comment-17397118 ]

ASF GitHub Bot commented on HUDI-2119:

prashantwason commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r686568347

## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java ##
@@ -137,6 +138,14 @@ protected HoodieBackedTableMetadataWriter(Configuration hadoopConf, HoodieWriteC
   private HoodieWriteConfig createMetadataWriteConfig(HoodieWriteConfig writeConfig) {
     int parallelism = writeConfig.getMetadataInsertParallelism();
+    // The metadata table timeline should always encompass the dataset timeline so that syncing of manual rollbacks can
+    // always check whether the rolled-back-instant was ever synced to the metadata table. Since all actions on the
+    // dataset always lead to a delta-commit on the Metadata Table, we need to preserve enough deltacommits that can
+    // cover total number of all the non-archived actions on the dataset.
+    int multiplier = HoodieTimeline.VALID_ACTIONS_IN_TIMELINE.length;

Review comment:
Removed this, as it was leading to too many instants being retained on the metadata table.
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397117#comment-17397117 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 40b0fb75b0cf18dab3559a20a6fa44a3378764fa Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1416)
* 0eebf83fba8adcc6b635f97a48d54037ba0dfedf UNKNOWN
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17395760#comment-17395760 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 40b0fb75b0cf18dab3559a20a6fa44a3378764fa Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1416)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17395594#comment-17395594 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 40b0fb75b0cf18dab3559a20a6fa44a3378764fa Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1416)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17394407#comment-17394407 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 40b0fb75b0cf18dab3559a20a6fa44a3378764fa Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1416)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17394404#comment-17394404 ]

ASF GitHub Bot commented on HUDI-2119:

vinothchandar commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r683866218

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java ##
@@ -236,30 +239,50 @@
    *
    * During a rollback files may be deleted (COW, MOR) or rollback blocks be appended (MOR only) to files. This
    * function will extract this change file for each partition.
-   *
+   * @param metadataTableTimeline Current timeline of the Metdata Table
    * @param rollbackMetadata {@code HoodieRollbackMetadata}
    * @param partitionToDeletedFiles The {@code Map} to fill with files deleted per partition.
    * @param partitionToAppendedFiles The {@code Map} to fill with files appended per partition and their sizes.
    */
-  private static void processRollbackMetadata(HoodieRollbackMetadata rollbackMetadata,
+  private static void processRollbackMetadata(HoodieTimeline metadataTableTimeline, HoodieRollbackMetadata rollbackMetadata,
                                               Map> partitionToDeletedFiles,
                                               Map> partitionToAppendedFiles, Option lastSyncTs) {
     rollbackMetadata.getPartitionMetadata().values().forEach(pm -> {
+      final String instantToRollback = rollbackMetadata.getCommitsRollback().get(0);
       // Has this rollback produced new files?
       boolean hasRollbackLogFiles = pm.getRollbackLogFiles() != null && !pm.getRollbackLogFiles().isEmpty();
       boolean hasNonZeroRollbackLogFiles = hasRollbackLogFiles && pm.getRollbackLogFiles().values().stream().mapToLong(Long::longValue).sum() > 0;
-      // If commit being rolled back has not been synced to metadata table yet then there is no need to update metadata
+
+      // If instant-to-rollback has not been synced to metadata table yet then there is no need to update metadata
+      // This can happen in two cases:

Review comment:
If it's uncommitted, we cannot sync the instant-to-rollback. I take this back.
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17394403#comment-17394403 ]

ASF GitHub Bot commented on HUDI-2119:

vinothchandar commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r683865065

## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedMetadata.java ##
@@ -480,6 +482,131 @@ public void testRollbackUnsyncedCommit(HoodieTableType tableType) throws Excepti
       client.syncTableMetadata();
       validateMetadata(client);
     }
+
+    // If an unsynced commit is automatically rolled back during next commit, the rollback commit gets a timestamp
+    // greater than than the new commit which is started. Ensure that in this case the rollback is not processed
+    // as the earlier failed commit would not have been committed.
+    //
+    // Dataset: C1 C2 C3.inflight[failed] C4 R5[rolls back C3]
+    // Metadata: C1.delta C2.delta
+    //
+    // When R5 completes, C3.xxx will be deleted. When C4 completes, C4 and R5 will be committed to Metadata Table in

Review comment:
I think we are doing this just for the sake of maintaining the invariant that deleting a file that does not exist should be a fatal exception. I think we can just relax this instead of introducing more complexity.
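The scenario in the test comment above (C3 fails, C4 starts, R5 rolls back C3) can be sketched in isolation. This is only an illustration of the two decision rules described in the issue, not the code from PR #3210: the class `RollbackSyncSketch`, both method names, and the use of a `TreeSet` of instant timestamps as a stand-in for the metadata table timeline are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical sketch; none of these names come from the Hudi code base.
class RollbackSyncSketch {

  // Heuristic described in the issue: apply the rollback when the rolled-back
  // instant is older than the last deltacommit on the metadata timeline.
  static boolean appliesByTimestamp(TreeSet<String> metadataDeltaCommits, String instantToRollback) {
    return !metadataDeltaCommits.isEmpty()
        && instantToRollback.compareTo(metadataDeltaCommits.last()) < 0;
  }

  // Explicit check: apply the rollback only when the rolled-back instant was
  // actually written to the metadata timeline.
  static boolean appliesByMembership(TreeSet<String> metadataDeltaCommits, String instantToRollback) {
    return metadataDeltaCommits.contains(instantToRollback);
  }

  public static void main(String[] args) {
    // Metadata timeline holds C1 and C2; C3 failed on the dataset and was never synced.
    TreeSet<String> metadata = new TreeSet<>(Arrays.asList("001", "002"));
    // C4 completes and is synced before the rollback R5 (of C3) is processed.
    metadata.add("004");

    String rolledBack = "003"; // C3
    System.out.println("timestamp heuristic applies rollback: " + appliesByTimestamp(metadata, rolledBack));
    System.out.println("membership check applies rollback:    " + appliesByMembership(metadata, rolledBack));
  }
}
```

With these inputs the timestamp heuristic decides to apply the rollback of C3 (because 003 sorts before 004), while the membership check skips it, which is the mismatch the issue describes. The review comment above proposes a different route, relaxing the invariant on deleting unknown files, which this sketch does not model.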
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17394399#comment-17394399 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 45925d349630415ee9e10cb8b32ae7fad7f25b25 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=664)
* 40b0fb75b0cf18dab3559a20a6fa44a3378764fa Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1416)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17394398#comment-17394398 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 45925d349630415ee9e10cb8b32ae7fad7f25b25 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=664)
* 40b0fb75b0cf18dab3559a20a6fa44a3378764fa UNKNOWN
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377125#comment-17377125 ]

ASF GitHub Bot commented on HUDI-2119:

vinothchandar commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r665906425

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataFileSystemView.java ##
@@ -73,4 +73,16 @@ public void close() {
       throw new HoodieException("Error closing metadata file system view.", e);
     }
   }
+
+  @Override
+  public void sync() {
+    // Sync the tableMetadata first as super.sync() may call listPartition
+    if (tableMetadata != null) {
+      BaseTableMetadata baseMetadata = (BaseTableMetadata)tableMetadata;

Review comment:
nit: space after `)`. Surprised that checkstyle is happy.

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java ##
@@ -322,8 +303,22 @@ protected HoodieEngineContext getEngineContext() {
     return engineContext != null ? engineContext : new HoodieLocalEngineContext(hadoopConf.get());
   }
 
+  public HoodieMetadataConfig getMetadataConfig() {
+    return metadataConfig;
+  }
+
   protected String getLatestDatasetInstantTime() {
     return datasetMetaClient.getActiveTimeline().filterCompletedInstants().lastInstant()
         .map(HoodieInstant::getTimestamp).orElse(SOLO_COMMIT_TIMESTAMP);
   }
+
+  @Override
+  public Option getSyncedInstantTimeForReader() {
+    if (timelineMergedMetadata == null) {

Review comment:
Just `return Option.ofNullable(timelineMergedMetadata).map(TimelineMergedTableMetadata::getSyncedInstantTime())`?
Meta point: we can just use the Option APIs like this, without explicit null checks.
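The "meta point" in the review comment above can be illustrated with a small standalone sketch. It uses `java.util.Optional` from the JDK as a stand-in for Hudi's own `Option` class, and `MergedMetadata` is a hypothetical placeholder for `TimelineMergedTableMetadata`; the real Hudi method may return a different type, so treat this only as a picture of the pattern. In Java, a method reference is written without trailing parentheses.

```java
import java.util.Optional;

// Hypothetical sketch; java.util.Optional stands in for Hudi's Option.
class OptionChainSketch {

  // Placeholder for the object that may or may not have been initialised.
  static class MergedMetadata {
    private final String syncedInstantTime;

    MergedMetadata(String syncedInstantTime) {
      this.syncedInstantTime = syncedInstantTime;
    }

    String getSyncedInstantTime() {
      return syncedInstantTime;
    }
  }

  // Explicit null check, in the style of the code under review.
  static Optional<String> explicitCheck(MergedMetadata merged) {
    if (merged == null) {
      return Optional.empty();
    }
    return Optional.ofNullable(merged.getSyncedInstantTime());
  }

  // Chained form in the style the reviewer suggests: ofNullable absorbs the null case,
  // and map yields an empty Optional if the getter returns null.
  static Optional<String> chained(MergedMetadata merged) {
    return Optional.ofNullable(merged).map(MergedMetadata::getSyncedInstantTime);
  }

  public static void main(String[] args) {
    System.out.println(chained(null).isPresent());
    System.out.println(chained(new MergedMetadata("20210810")).orElse("none"));
  }
}
```

Both methods return the same result for a null holder, a holder with a null timestamp, and a populated holder; the chained form simply moves the branching into the `Optional` API.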
## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java ##
@@ -137,6 +138,14 @@ protected HoodieBackedTableMetadataWriter(Configuration hadoopConf, HoodieWriteC
   private HoodieWriteConfig createMetadataWriteConfig(HoodieWriteConfig writeConfig) {
     int parallelism = writeConfig.getMetadataInsertParallelism();
+    // The metadata table timeline should always encompass the dataset timeline so that syncing of manual rollbacks can
+    // always check whether the rolled-back-instant was ever synced to the metadata table. Since all actions on the
+    // dataset always lead to a delta-commit on the Metadata Table, we need to preserve enough deltacommits that can
+    // cover total number of all the non-archived actions on the dataset.
+    int multiplier = HoodieTimeline.VALID_ACTIONS_IN_TIMELINE.length;

Review comment:
Why is this multiplier chosen? It's not guaranteed that the timeline will have all valid actions at once, right?

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataFileSystemView.java ##
@@ -73,4 +73,16 @@ public void close() {
       throw new HoodieException("Error closing metadata file system view.", e);
     }
   }
+
+  @Override
+  public void sync() {

Review comment:
IIUC this is to handle the file system view refreshes.

## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java ##
@@ -137,6 +138,14 @@ protected HoodieBackedTableMetadataWriter(Configuration hadoopConf, HoodieWriteC
   private HoodieWriteConfig createMetadataWriteConfig(HoodieWriteConfig writeConfig) {
     int parallelism = writeConfig.getMetadataInsertParallelism();
+    // The metadata table timeline should always encompass the dataset timeline so that syncing of manual rollbacks can
+    // always check whether the rolled-back-instant was ever synced to the metadata table. Since all actions on the
+    // dataset always lead to a delta-commit on the Metadata Table, we need to preserve enough deltacommits that can
+    // cover total number of all the non-archived actions on the dataset.
+    int multiplier = HoodieTimeline.VALID_ACTIONS_IN_TIMELINE.length;

Review comment:
I wonder if, say, 8x-ing the retention of the metadata timeline is really necessary.

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java ##
@@ -236,30 +239,50 @@
    *
    * During a rollback files may be deleted (COW, MOR) or rollback blocks be appended (MOR only) to files. This
    * function will extract this change file for each partition.
-   *
+   * @param metadataTableTimeline Current timeline of the Metdata Table
    * @param rollbackMetadata {@code HoodieRollbackMetadata}
    * @param partitionToDeletedFiles The {@code Map} to fill with files deleted per partition.
    * @param partitionToAppendedFiles The {@code Map} to fill with files appended per partition and their sizes.
    */
-  private static
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373847#comment-17373847 ] ASF GitHub Bot commented on HUDI-2119: -- codecov-commenter edited a comment on pull request #3210: URL: https://github.com/apache/hudi/pull/3210#issuecomment-872684476 # [Codecov](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report > Merging [#3210](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (45925d3) into [master](https://codecov.io/gh/apache/hudi/commit/6eca06d074520140d7bc67b48bd2b9a5b76f0a87?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (6eca06d) will **decrease** coverage by `1.82%`. > The diff coverage is `0.00%`. 
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3210/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff              @@
##             master    #3210      +/-   ##
- Coverage     47.51%   45.68%   -1.83%
+ Complexity     5429     4640     -789
  Files           922      826      -96
  Lines         40968    37546    -3422
  Branches       4105     3751     -354
- Hits          19464    17152    -2312
+ Misses        19780    18813     -967
+ Partials       1724     1581     -143
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `40.00% <ø> (ø)` | |
| hudiclient | `22.91% <0.00%> (-11.67%)` | :arrow_down: |
| hudicommon | `48.29% <0.00%> (-0.09%)` | :arrow_down: |
| hudiflink | `59.91% <ø> (-0.17%)` | :arrow_down: |
| hudihadoopmr | `51.29% <ø> (ø)` | |
| hudisparkdatasource | `67.24% <ø> (+0.13%)` | :arrow_up: |
| hudisync | `54.05% <ø> (ø)` | |
| huditimelineservice | `64.07% <ø> (ø)` | |
| hudiutilities | `58.05% <ø> (+0.03%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | | |---|---|---| | [...hudi/metadata/HoodieBackedTableMetadataWriter.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldGFkYXRhL0hvb2RpZUJhY2tlZFRhYmxlTWV0YWRhdGFXcml0ZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | | | [...va/org/apache/hudi/metadata/BaseTableMetadata.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0YWRhdGEvQmFzZVRhYmxlTWV0YWRhdGEuamF2YQ==) | `0.00% <0.00%> (ø)` | | | [...e/hudi/metadata/FileSystemBackedTableMetadata.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0YWRhdGEvRmlsZVN5c3RlbUJhY2tlZFRhYmxlTWV0YWRhdGEuamF2YQ==) | `89.74% <0.00%> (-2.37%)` | :arrow_down: | | [...pache/hudi/metadata/HoodieBackedTableMetadata.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0YWRhdGEvSG9vZGllQmFja2VkVGFibGVNZXRhZGF0YS5qYXZh) | `0.00% <0.00%> (ø)` | | | [...he/hudi/metadata/HoodieMetadataFileSystemView.java](https://codecov.io/gh/apache/hudi/pul
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373846#comment-17373846 ] ASF GitHub Bot commented on HUDI-2119: -- codecov-commenter edited a comment on pull request #3210: URL: https://github.com/apache/hudi/pull/3210#issuecomment-872684476 # [Codecov](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report > Merging [#3210](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (45925d3) into [master](https://codecov.io/gh/apache/hudi/commit/6eca06d074520140d7bc67b48bd2b9a5b76f0a87?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (6eca06d) will **decrease** coverage by `6.00%`. > The diff coverage is `0.00%`. 
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3210/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff              @@
##             master    #3210      +/-   ##
- Coverage     47.51%   41.51%   -6.01%
+ Complexity     5429     4014    -1415
  Files           922      748     -174
  Lines         40968    33175    -7793
  Branches       4105     3084    -1021
- Hits          19464    13771    -5693
+ Misses        19780    18305    -1475
+ Partials       1724     1099     -625
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `40.00% <ø> (ø)` | |
| hudiclient | `22.91% <0.00%> (-11.67%)` | :arrow_down: |
| hudicommon | `48.29% <0.00%> (-0.09%)` | :arrow_down: |
| hudiflink | `59.91% <ø> (-0.17%)` | :arrow_down: |
| hudihadoopmr | `51.29% <ø> (ø)` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.38% <ø> (-48.67%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `58.05% <ø> (+0.03%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | | |---|---|---| | [...hudi/metadata/HoodieBackedTableMetadataWriter.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldGFkYXRhL0hvb2RpZUJhY2tlZFRhYmxlTWV0YWRhdGFXcml0ZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | | | [...va/org/apache/hudi/metadata/BaseTableMetadata.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0YWRhdGEvQmFzZVRhYmxlTWV0YWRhdGEuamF2YQ==) | `0.00% <0.00%> (ø)` | | | [...e/hudi/metadata/FileSystemBackedTableMetadata.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0YWRhdGEvRmlsZVN5c3RlbUJhY2tlZFRhYmxlTWV0YWRhdGEuamF2YQ==) | `89.74% <0.00%> (-2.37%)` | :arrow_down: | | [...pache/hudi/metadata/HoodieBackedTableMetadata.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0YWRhdGEvSG9vZGllQmFja2VkVGFibGVNZXRhZGF0YS5qYXZh) | `0.00% <0.00%> (ø)` | | | [...he/hudi/metadata/HoodieMetadataFileSystemView.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tr
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373843#comment-17373843 ] ASF GitHub Bot commented on HUDI-2119: -- codecov-commenter edited a comment on pull request #3210: URL: https://github.com/apache/hudi/pull/3210#issuecomment-872684476 # [Codecov](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report > Merging [#3210](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (45925d3) into [master](https://codecov.io/gh/apache/hudi/commit/6eca06d074520140d7bc67b48bd2b9a5b76f0a87?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (6eca06d) will **decrease** coverage by `32.03%`. > The diff coverage is `0.00%`. 
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3210/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

```diff
@@              Coverage Diff              @@
##             master    #3210       +/-   ##
=============================================
- Coverage     47.51%   15.47%   -32.04%
+ Complexity     5429      479     -4950
=============================================
  Files           922      280      -642
  Lines         40968    11559    -29409
  Branches       4105      945     -3160
=============================================
- Hits          19464     1789    -17675
+ Misses        19780     9612    -10168
+ Partials       1724      158     -1566
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <0.00%> (-34.59%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.38% <ø> (-48.67%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `58.05% <ø> (+0.03%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | | |---|---|---| | [...hudi/metadata/HoodieBackedTableMetadataWriter.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldGFkYXRhL0hvb2RpZUJhY2tlZFRhYmxlTWV0YWRhdGFXcml0ZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | | | [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: | | [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: | | [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: | | 
[...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373840#comment-17373840 ] ASF GitHub Bot commented on HUDI-2119: -- codecov-commenter edited a comment on pull request #3210: URL: https://github.com/apache/hudi/pull/3210#issuecomment-872684476 # [Codecov](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report > Merging [#3210](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (45925d3) into [master](https://codecov.io/gh/apache/hudi/commit/6eca06d074520140d7bc67b48bd2b9a5b76f0a87?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (6eca06d) will **decrease** coverage by `44.62%`. > The diff coverage is `0.00%`. 
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3210/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

```diff
@@              Coverage Diff              @@
##             master    #3210       +/-   ##
- Coverage     47.51%    2.88%   -44.63%
+ Complexity     5429       82     -5347
  Files           922      280      -642
  Lines         40968    11559    -29409
  Branches       4105      945     -3160
- Hits          19464      334    -19130
+ Misses        19780    11199     -8581
+ Partials       1724       26     -1698
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <0.00%> (-34.59%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.38% <ø> (-48.67%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `9.31% <ø> (-48.72%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | | |---|---|---| | [...hudi/metadata/HoodieBackedTableMetadataWriter.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldGFkYXRhL0hvb2RpZUJhY2tlZFRhYmxlTWV0YWRhdGFXcml0ZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | | | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: | | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: | | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: | | 
[.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373809#comment-17373809 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 45925d349630415ee9e10cb8b32ae7fad7f25b25 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=664)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373788#comment-17373788 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 9d104830439f5bbe589599d9a3bb5dd6f140f223 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=628)
* 45925d349630415ee9e10cb8b32ae7fad7f25b25 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=664)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373785#comment-17373785 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 9d104830439f5bbe589599d9a3bb5dd6f140f223 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=628)
* 45925d349630415ee9e10cb8b32ae7fad7f25b25 UNKNOWN
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373184#comment-17373184 ] ASF GitHub Bot commented on HUDI-2119: -- codecov-commenter commented on pull request #3210: URL: https://github.com/apache/hudi/pull/3210#issuecomment-872684476 # [Codecov](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report > Merging [#3210](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9d10483) into [master](https://codecov.io/gh/apache/hudi/commit/6eca06d074520140d7bc67b48bd2b9a5b76f0a87?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (6eca06d) will **increase** coverage by `2.23%`. > The diff coverage is `n/a`. 
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3210/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

```diff
@@              Coverage Diff              @@
##             master    #3210       +/-   ##
+ Coverage     47.51%   49.74%    +2.23%
+ Complexity     5429      406     -5023
  Files           922       67      -855
  Lines         40968     2985    -37983
  Branches       4105      320     -3785
- Hits          19464     1485    -17979
+ Misses        19780     1365    -18415
+ Partials       1724      135     -1589
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `?` | |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `?` | |
| huditimelineservice | `?` | |
| hudiutilities | `49.74% <ø> (-8.28%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3210?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | | |---|---|---| | [...ies/exception/HoodieSnapshotExporterException.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVTbmFwc2hvdEV4cG9ydGVyRXhjZXB0aW9uLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: | | [.../apache/hudi/utilities/HoodieSnapshotExporter.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZVNuYXBzaG90RXhwb3J0ZXIuamF2YQ==) | `5.17% <0.00%> (-83.63%)` | :arrow_down: | | [...hudi/utilities/schema/JdbcbasedSchemaProvider.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9KZGJjYmFzZWRTY2hlbWFQcm92aWRlci5qYXZh) | `0.00% <0.00%> (-72.23%)` | :arrow_down: | | [...org/apache/hudi/utilities/HDFSParquetImporter.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hERlNQYXJxdWV0SW1wb3J0ZXIuamF2YQ==) | `0.00% <0.00%> (-71.82%)` | :arrow_down: | | 
[...he/hudi/utilities/transform/AWSDmsTransformer.java](https://codecov.io/gh/apache/hudi/pull/3210/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+A
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373053#comment-17373053 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot edited a comment on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 9d104830439f5bbe589599d9a3bb5dd6f140f223 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=628)
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373048#comment-17373048 ]

ASF GitHub Bot commented on HUDI-2119:

hudi-bot commented on pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#issuecomment-872541566

## CI report:

* 9d104830439f5bbe589599d9a3bb5dd6f140f223 UNKNOWN
[jira] [Commented] (HUDI-2119) Syncing of rollbacks to metadata table does not work in all cases
[ https://issues.apache.org/jira/browse/HUDI-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373047#comment-17373047 ]

ASF GitHub Bot commented on HUDI-2119:

prashantwason opened a new pull request #3210:
URL: https://github.com/apache/hudi/pull/3210

If the rolled-back instant was synced to the Metadata Table, a corresponding deltacommit with the same timestamp should have been created on the Metadata Table timeline. To ensure we can always perform this check, the Metadata Table instants should not be archived while their corresponding instants are still present in the dataset timeline.

## *Tips*
- *Thank you very much for contributing to Apache Hudi.*
- *Please review https://hudi.apache.org/contributing.html before opening a pull request.*

## What is the purpose of the pull request

## Brief change log

1. Check that the instant being rolled back was committed to the metadata table.
2. Keep a larger number of instants in the metadata table timeline so that the above check can be performed.

## Verify this pull request

This pull request is already covered by existing tests for the Metadata Table.
This change added tests to TestHoodieBackedMetadata.

## Committer checklist

- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

> Syncing of rollbacks to metadata table does not work in all cases
>
> Key: HUDI-2119
> URL: https://issues.apache.org/jira/browse/HUDI-2119
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Prashant Wason
> Assignee: Prashant Wason
> Priority: Blocker
> Fix For: 0.9.0
>
> This is an issue with inline automatic rollbacks.
>
> The metadata table assumes that a rollback is to be applied if the instant being rolled back has a timestamp less than the last deltacommit time on the metadata timeline. We do not explicitly check whether the instant being rolled back was actually written to the metadata table.
>
> A rollback adds a record to the metadata table which "deletes" files from a failed/earlier commit. If the files being deleted were never actually committed to the metadata table earlier, the deletes cannot be consolidated during metadata table reads. This leads to a HoodieMetadataException, as we cannot differentiate this from a bug where we might have missed committing a commit to the metadata table.
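The difference between the check described in the issue and the one proposed in the pull request can be sketched roughly as follows. This is only an illustration and does not use Hudi's actual classes: `RollbackSyncCheck`, `looksSyncedByTimestamp`, and `wasSynced` are hypothetical names, the metadata timeline is modelled as a plain sorted set of completed deltacommit timestamps, and timestamps are assumed to be fixed-width strings so that lexicographic order matches time order.

```java
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in for the metadata table timeline; not a Hudi class.
final class RollbackSyncCheck {
    // Completed deltacommit timestamps on the metadata timeline, kept sorted.
    private final TreeSet<String> metadataDeltaCommits;

    RollbackSyncCheck(List<String> completedDeltaCommits) {
        this.metadataDeltaCommits = new TreeSet<>(completedDeltaCommits);
    }

    // Behaviour described in the issue: the rollback is applied whenever the
    // rolled-back instant is older than the latest deltacommit, even if that
    // instant itself was never written to the metadata table.
    boolean looksSyncedByTimestamp(String rolledBackInstant) {
        return !metadataDeltaCommits.isEmpty()
            && rolledBackInstant.compareTo(metadataDeltaCommits.last()) < 0;
    }

    // Behaviour described in the pull request: the rollback is applied only if
    // a deltacommit with the same timestamp exists on the metadata timeline.
    boolean wasSynced(String rolledBackInstant) {
        return metadataDeltaCommits.contains(rolledBackInstant);
    }

    public static void main(String[] args) {
        // Instant "002" failed on the dataset and was never synced.
        RollbackSyncCheck check = new RollbackSyncCheck(List.of("001", "003"));
        System.out.println(check.looksSyncedByTimestamp("002")); // true
        System.out.println(check.wasSynced("002"));              // false
        System.out.println(check.wasSynced("001"));              // true
    }
}
```

In this sketch the timestamp-only check would sync a rollback for `002` and so write "deletes" for files the metadata table never recorded, while the exact-match check skips it. The exact-match check only works while `002`'s neighbours are still on the metadata timeline, which is presumably why the pull request also keeps more instants there before archiving.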