[jira] [Commented] (HUDI-2159) Supporting Clustering and Metadata Table together
    [ https://issues.apache.org/jira/browse/HUDI-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17425066#comment-17425066 ]

sivabalan narayanan commented on HUDI-2159:
-------------------------------------------

With the synchronous metadata table design, we can now run clustering with the metadata table enabled. Closing this for now. Please re-open if something is still pending.

> Supporting Clustering and Metadata Table together
> -------------------------------------------------
>
> Key: HUDI-2159
> URL: https://issues.apache.org/jira/browse/HUDI-2159
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Prashant Wason
> Assignee: Prashant Wason
> Priority: Blocker
> Fix For: 0.10.0
>
> I am testing clustering support for a metadata-enabled table and found a few issues.
>
> *Setup*
> Pipeline 1: Ingestion pipeline with Metadata Table enabled. Runs every 30 mins.
> Pipeline 2: Clustering pipeline with long running jobs (3-4 hours)
> Pipeline 3: Another clustering pipeline with long running jobs (3-4 hours)
>
> *Issue #1: Parallel commits on Metadata Table*
> Assume the Clustering pipeline is completing T5.replacecommit and the ingestion pipeline is completing T10.commit. Metadata Table will synced at an instant
> Now both the pipelines will call syncMetadataTable() which will do the following:
> # Find all un-synced instants from the dataset (T5, T6 ... T10)
> # Read each instant and perform a deltacommit on the Metadata Table with the same timestamp as the instant.
> There is a chance that two processes perform a deltacommit at T5 on the metadata table and one will fail (instant file already exists). This will raise an exception and be detected as a pipeline failure, leading to false-positive alerts.
>
> *Issue #2: No archiving/rollback support for failed clustering operations*
> If a clustering operation fails, it leaves a left-over T5.replacecommit.inflight. There is no automated way to roll back or archive these. Since clustering is generally a long-running operation and may be run through multiple pipelines at the same time, automated rollback of left-over inflights doesn't work, as we cannot be sure that the process is dead.
> Metadata Table sync only works in completion order. So if T5.replacecommit.inflight is left over, the Metadata Table will not sync beyond T5, causing a large number of LogBlocks to pile up. These will have to be merged in memory, leading to deteriorating performance.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
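The collision described under Issue #1 can be reproduced with a small toy model. This is only a sketch: `MetadataTimeline`, `InstantAlreadyExists`, `unsynced_instants` and `apply_instants` are hypothetical names invented for illustration, not Hudi classes or methods. A set stands in for the instant files on the Metadata Table timeline, and both pipelines are assumed to list the un-synced instants before either of them writes.

```python
class InstantAlreadyExists(Exception):
    """Raised when a deltacommit with the same timestamp was already written."""


class MetadataTimeline:
    """Toy stand-in for the Metadata Table timeline (hypothetical, not a Hudi class)."""

    def __init__(self):
        self.deltacommits = set()

    def create_deltacommit(self, instant):
        # Mimics exclusive file creation: a second writer for the same instant fails.
        if instant in self.deltacommits:
            raise InstantAlreadyExists(instant)
        self.deltacommits.add(instant)


def unsynced_instants(dataset_instants, timeline):
    """Step 1: find the dataset instants that have no deltacommit yet."""
    return [t for t in dataset_instants if t not in timeline.deltacommits]


def apply_instants(instants, timeline):
    """Step 2: write one deltacommit per instant, reusing the instant's timestamp."""
    for t in instants:
        timeline.create_deltacommit(t)


dataset = ["T5", "T6", "T7", "T8", "T9", "T10"]
timeline = MetadataTimeline()

# Both pipelines list the un-synced instants before either one writes.
clustering_view = unsynced_instants(dataset, timeline)
ingestion_view = unsynced_instants(dataset, timeline)

apply_instants(ingestion_view, timeline)  # ingestion happens to write first
try:
    apply_instants(clustering_view, timeline)
    clustering_failed = False
except InstantAlreadyExists as err:
    clustering_failed = True
    failed_instant = str(err)
```

In this model the Metadata Table ends up fully synced, yet the clustering pipeline still sees an exception at T5, which is the false-positive alert the description mentions.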
[jira] [Commented] (HUDI-2159) Supporting Clustering and Metadata Table together
    [ https://issues.apache.org/jira/browse/HUDI-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17394392#comment-17394392 ]

Vinoth Chandar commented on HUDI-2159:
--------------------------------------

[~nishith29] [~pwason] any updates on this? I'd like to get this fixed before 0.9.0 next week.

> Supporting Clustering and Metadata Table together
> -------------------------------------------------
>
> Key: HUDI-2159
> URL: https://issues.apache.org/jira/browse/HUDI-2159
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Prashant Wason
> Assignee: Prashant Wason
> Priority: Blocker
> Fix For: 0.9.0
[jira] [Commented] (HUDI-2159) Supporting Clustering and Metadata Table together
    [ https://issues.apache.org/jira/browse/HUDI-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17379423#comment-17379423 ]

Nishith Agarwal commented on HUDI-2159:
---------------------------------------

Thanks for the detailed analysis [~pwason]. I think it is definitely worth solving (1) for the 0.9.0 release. This is a legitimate situation that can surface; especially as users schedule ingestion at a lower frequency, there are more chances of such collisions.

For (2), since it is more of a perf degradation in cases of failure, we can address it right after 0.9 by landing the tailing timeline based on completion time.

> Supporting Clustering and Metadata Table together
> -------------------------------------------------
>
> Key: HUDI-2159
> URL: https://issues.apache.org/jira/browse/HUDI-2159
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Prashant Wason
> Assignee: Prashant Wason
> Priority: Blocker
> Fix For: 0.9.0
[jira] [Commented] (HUDI-2159) Supporting Clustering and Metadata Table together
    [ https://issues.apache.org/jira/browse/HUDI-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17378342#comment-17378342 ]

Vinoth Chandar commented on HUDI-2159:
--------------------------------------

> Metadata Table sync only works in completion order.

I almost feel like this is the sticking point in all the issues we hit :). We gained debuggability with the sync stuff, but there is too much complexity.

> Supporting Clustering and Metadata Table together
> -------------------------------------------------
>
> Key: HUDI-2159
> URL: https://issues.apache.org/jira/browse/HUDI-2159
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Prashant Wason
> Assignee: Prashant Wason
> Priority: Blocker
> Fix For: 0.9.0
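To make the completion-order constraint concrete, here is a minimal sketch of how a single left-over inflight instant holds back everything behind it. The function name `syncable_prefix` and the (instant, state) representation are assumptions made for this illustration; they do not come from the Hudi code base.

```python
def syncable_prefix(timeline):
    """Return the instants that can be synced, stopping at the first pending one.

    `timeline` is a list of (instant, state) pairs ordered by instant time;
    state is "completed" or "inflight". Hypothetical model, not Hudi's API.
    """
    synced = []
    for instant, state in timeline:
        if state != "completed":
            break  # sync cannot move past an instant that has not completed
        synced.append(instant)
    return synced


timeline = [
    ("T3", "completed"),
    ("T4", "completed"),
    ("T5", "inflight"),   # left-over T5.replacecommit.inflight
    ("T6", "completed"),
    ("T7", "completed"),
]

synced = syncable_prefix(timeline)
# Completed instants that stay un-synced because they sit behind the inflight one.
backlog = [t for t, s in timeline if s == "completed" and t not in synced]
```

Under this model the backlog keeps growing with every new commit until T5 is resolved, which matches the pile-up described in Issue #2.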
[jira] [Commented] (HUDI-2159) Supporting Clustering and Metadata Table together
    [ https://issues.apache.org/jira/browse/HUDI-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17378341#comment-17378341 ]

Vinoth Chandar commented on HUDI-2159:
--------------------------------------

> Since ingestion runs at a faster cadence, we can set hoodie.metadata.sync=true
> in the ingestion pipeline and hoodie.metadata.sync=false in all other pipelines.

This is a practical approach. I wonder again, though, if the multi-writer stuff already has something like this.

> Supporting Clustering and Metadata Table together
> -------------------------------------------------
>
> Key: HUDI-2159
> URL: https://issues.apache.org/jira/browse/HUDI-2159
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Prashant Wason
> Assignee: Prashant Wason
> Priority: Blocker
> Fix For: 0.9.0
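The per-pipeline split discussed here could look roughly like the following. Note that hoodie.metadata.sync is the option proposed in this thread, not a confirmed Hudi configuration key; the helper `should_sync_metadata` and the default of "true" for the sync flag are likewise assumptions made only for this sketch.

```python
# Ingestion pipeline: reads the Metadata Table and also syncs it.
INGESTION_PROPS = {
    "hoodie.metadata.enable": "true",
    "hoodie.metadata.sync": "true",   # proposed in this thread, hypothetical key
}

# Clustering pipelines: "reader mode", use the Metadata Table but never sync it.
CLUSTERING_PROPS = {
    "hoodie.metadata.enable": "true",
    "hoodie.metadata.sync": "false",  # proposed in this thread, hypothetical key
}


def should_sync_metadata(props):
    """Decide whether a writer would call syncMetadataTable() at the end of an operation."""
    enabled = props.get("hoodie.metadata.enable", "false") == "true"
    # Assumed default: sync stays on unless a pipeline opts out explicitly.
    sync = props.get("hoodie.metadata.sync", "true") == "true"
    return enabled and sync
```

With only one pipeline allowed to sync, two writers can no longer race on the same deltacommit, at the cost of the Metadata Table lagging until the next ingestion run.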
[jira] [Commented] (HUDI-2159) Supporting Clustering and Metadata Table together
    [ https://issues.apache.org/jira/browse/HUDI-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17378340#comment-17378340 ]

Vinoth Chandar commented on HUDI-2159:
--------------------------------------

> There is a chance that two processes perform a deltacommit at T5 on the metadata
> table and one will fail (instant file already exists).

Wouldn't the locking service we use for multi-writer solve all this?

> Supporting Clustering and Metadata Table together
> -------------------------------------------------
>
> Key: HUDI-2159
> URL: https://issues.apache.org/jira/browse/HUDI-2159
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Prashant Wason
> Assignee: Prashant Wason
> Priority: Blocker
> Fix For: 0.9.0
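As a rough illustration of why a lock could help, the sketch below serializes the sync and, importantly, computes the un-synced list inside the critical section. An in-process `threading.Lock` stands in for an external lock provider, and `sync_metadata_table` is a hypothetical function written for this example; whether Hudi's multi-writer locking covers the metadata sync in this way is exactly the open question in the comment.

```python
import threading

sync_lock = threading.Lock()   # stands in for an external lock provider
metadata_deltacommits = set()  # instants already applied to the Metadata Table


def sync_metadata_table(dataset_instants):
    """Apply un-synced instants while holding the lock; return what this caller wrote."""
    with sync_lock:
        # The un-synced list is computed inside the critical section, so a
        # second caller sees the first caller's deltacommits and skips them.
        pending = [t for t in dataset_instants if t not in metadata_deltacommits]
        metadata_deltacommits.update(pending)
        return pending


dataset = ["T5", "T6", "T7", "T8", "T9", "T10"]
results = []
threads = [
    threading.Thread(target=lambda: results.append(sync_metadata_table(dataset)))
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whichever caller gets the lock first writes all six deltacommits; the other finds nothing pending and returns an empty list instead of failing.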
[jira] [Commented] (HUDI-2159) Supporting Clustering and Metadata Table together
    [ https://issues.apache.org/jira/browse/HUDI-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17378260#comment-17378260 ]

Prashant Wason commented on HUDI-2159:
--------------------------------------

[~vinoth] [~nagarwal] [~satish] Please review.

> Supporting Clustering and Metadata Table together
> -------------------------------------------------
>
> Key: HUDI-2159
> URL: https://issues.apache.org/jira/browse/HUDI-2159
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Prashant Wason
> Assignee: Prashant Wason
> Priority: Major
[jira] [Commented] (HUDI-2159) Supporting Clustering and Metadata Table together
    [ https://issues.apache.org/jira/browse/HUDI-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17378257#comment-17378257 ]

Prashant Wason commented on HUDI-2159:
--------------------------------------

Possible solutions:
# Create a reader mode for the metadata table:
## hoodie.metadata.enable=true
## hoodie.metadata.sync=false

In this mode, the client won't call syncMetadataTable() at the end of its operations. Since ingestion runs at a faster cadence, we can set hoodie.metadata.sync=true in the ingestion pipeline and hoodie.metadata.sync=false in all other pipelines.

2. Clustering can be cleaned up based on timeout detection using heartbeats.

> Supporting Clustering and Metadata Table together
> -------------------------------------------------
>
> Key: HUDI-2159
> URL: https://issues.apache.org/jira/browse/HUDI-2159
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Prashant Wason
> Assignee: Prashant Wason
> Priority: Major
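The heartbeat idea in point 2 might be sketched as follows. Everything here is hypothetical: the function `expired_inflights`, the mapping from instant to last heartbeat time, and the 600-second timeout are illustrative assumptions, not Hudi's heartbeat implementation.

```python
def expired_inflights(inflight_heartbeats, now, timeout_secs):
    """Return inflight instants whose last heartbeat is older than the timeout.

    `inflight_heartbeats` maps instant -> last heartbeat time in epoch seconds.
    All names here are hypothetical.
    """
    return sorted(
        instant
        for instant, last_beat in inflight_heartbeats.items()
        if now - last_beat > timeout_secs
    )


heartbeats = {
    "T5.replacecommit.inflight": 1_000,   # writer stopped beating long ago
    "T8.replacecommit.inflight": 4_500,   # writer still alive
}
stale = expired_inflights(heartbeats, now=5_000, timeout_secs=600)
```

Only instants in `stale` would be candidates for rollback, so a clustering job that is merely slow but still sending heartbeats is left alone.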