[jira] [Commented] (HUDI-1860) Add INSERT_OVERWRITE support to DeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381088#comment-17381088 ] ASF GitHub Bot commented on HUDI-1860: -- codecov-commenter edited a comment on pull request #3184: URL: https://github.com/apache/hudi/pull/3184#issuecomment-870526141 # [Codecov](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report > Merging [#3184](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (740f439) into [master](https://codecov.io/gh/apache/hudi/commit/039aeb6dcee0a8eb4372c079ec07b8fc2582e41f?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (039aeb6) will **decrease** coverage by `30.41%`. > The diff coverage is `100.00%`. 
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3184/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff              @@
##             master    #3184       +/-   ##
=============================================
- Coverage     46.16%   15.74%   -30.42%
+ Complexity     5370      493     -4877
=============================================
  Files           921      284      -637
  Lines         39953    11839    -28114
  Branches       4288      982     -3306
=============================================
- Hits          18444     1864    -16580
+ Misses        19630     9812     -9818
+ Partials       1879      163     -1716
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <ø> (-30.47%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.85% <ø> (-49.20%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.18% <100.00%> (+0.74%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `70.64% <100.00%> (-0.64%)` | :arrow_down: |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3184: [HUDI-1860] Add INSERT_OVERWRITE and INSERT_OVERWRITE_TABLE support to DeltaStreamer
codecov-commenter edited a comment on pull request #3184:
URL: https://github.com/apache/hudi/pull/3184#issuecomment-870526141

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3184](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (740f439) into [master](https://codecov.io/gh/apache/hudi/commit/039aeb6dcee0a8eb4372c079ec07b8fc2582e41f?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (039aeb6) will **decrease** coverage by `30.41%`.
> The diff coverage is `100.00%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3184/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff              @@
##             master    #3184       +/-   ##
=============================================
- Coverage     46.16%   15.74%   -30.42%
+ Complexity     5370      493     -4877
=============================================
  Files           921      284      -637
  Lines         39953    11839    -28114
  Branches       4288      982     -3306
=============================================
- Hits          18444     1864    -16580
+ Misses        19630     9812     -9818
+ Partials       1879      163     -1716
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <ø> (-30.47%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.85% <ø> (-49.20%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.18% <100.00%> (+0.74%)` | :arrow_up: |

Flags with carried forward coverage won't be shown.
[Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `70.64% <100.00%> (-0.64%)` | :arrow_down: |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381074#comment-17381074 ] ASF GitHub Bot commented on HUDI-2164: -- hudi-bot edited a comment on pull request #3259:
URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249

## CI report:
* b4aa7869d8343a16b225a81844e907fbee63b576 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=919) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=922)

Bot commands @hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Build cluster plan and execute this plan at once for HoodieClusteringJob
>
> Key: HUDI-2164
> URL: https://issues.apache.org/jira/browse/HUDI-2164
> Project: Apache Hudi
> Issue Type: Task
> Reporter: Yue Zhang
> Priority: Major
> Labels: pull-request-available
>
> For now, Hudi lets users submit a HoodieClusteringJob to build a clustering plan or execute a clustering plan through the --schedule or --instant-time config.
> If users want to trigger a clustering job, they have to:
> # Submit a HoodieClusteringJob to build a clustering plan through the --schedule config.
> # Copy the created clustering instant time from the log info.
> # Submit the HoodieClusteringJob again to execute this created clustering plan through the --instant-time config.
> The pain point is that there are too many steps to trigger a clustering job, and
> the instant time has to be copied and pasted from the log file manually, so the
> process can't be automated.
>
> I have raised a PR to offer a new config named --mode (or -m for short):
> ||--mode||remarks||
> |execute|Execute a clustering plan at a given instant, which means --instant-time is needed here. Default value.|
> |schedule|Make a clustering plan.|
> |*scheduleAndExecute*|Make a clustering plan first and execute that plan immediately.|
> Now users can use --mode scheduleAndExecute to build a clustering plan and execute
> that plan at once using HoodieClusteringJob.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
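The scheduleAndExecute flow described above collapses the three manual steps into a single job submission. A minimal invocation sketch, assuming the new --mode flag from this PR is available; the jar path, base path, properties file, and table name are placeholders, and flag spellings should be verified against your Hudi version's HoodieClusteringJob help output:

```shell
# Build a clustering plan and execute it immediately in one submission.
# All paths and names below are illustrative placeholders.
spark-submit \
  --class org.apache.hudi.utilities.HoodieClusteringJob \
  /path/to/hudi-utilities-bundle.jar \
  --props /path/to/clustering.properties \
  --base-path hdfs:///data/hudi/my_table \
  --table-name my_table \
  --mode scheduleAndExecute
```

With --mode schedule the job only writes the plan and logs its instant time; with --mode execute an explicit --instant-time must still be supplied.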
[GitHub] [hudi] hudi-bot edited a comment on pull request #3259: [HUDI-2164] Let users build cluster plan and execute this plan at once using HoodieClusteringJob for async clustering
hudi-bot edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249 ## CI report: * b4aa7869d8343a16b225a81844e907fbee63b576 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=919) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=922) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HUDI-1860) Add INSERT_OVERWRITE support to DeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381068#comment-17381068 ] ASF GitHub Bot commented on HUDI-1860: -- Samrat002 edited a comment on pull request #3184:
URL: https://github.com/apache/hudi/pull/3184#issuecomment-879154991

@hudi-bot run azure

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Add INSERT_OVERWRITE support to DeltaStreamer
>
> Key: HUDI-1860
> URL: https://issues.apache.org/jira/browse/HUDI-1860
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Sagar Sumit
> Assignee: Samrat Deb
> Priority: Major
> Labels: pull-request-available
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> As discussed in [this RFC|https://cwiki.apache.org/confluence/display/HUDI/RFC+-+14+%3A+JDBC+incremental+puller], having full fetch mode use insert_overwrite to write to the sink would be better, as it can handle schema changes.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
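As background for the description above, DeltaStreamer selects its write operation through HoodieDeltaStreamer's --op flag; once this PR adds INSERT_OVERWRITE, a full-fetch pull can replace the target rather than upsert into it. A hedged sketch — the jar path, base path, table name, source class, and properties file are illustrative placeholders, and the accepted operation names should be checked against the merged PR:

```shell
# Full fetch written with INSERT_OVERWRITE so matching partitions are
# replaced wholesale, which tolerates upstream schema changes.
# All paths and property values are illustrative placeholders.
spark-submit \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  /path/to/hudi-utilities-bundle.jar \
  --table-type COPY_ON_WRITE \
  --op INSERT_OVERWRITE \
  --source-class org.apache.hudi.utilities.sources.JdbcSource \
  --target-base-path hdfs:///data/hudi/my_table \
  --target-table my_table \
  --props /path/to/jdbc-source.properties
```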
[jira] [Commented] (HUDI-1860) Add INSERT_OVERWRITE support to DeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381069#comment-17381069 ] ASF GitHub Bot commented on HUDI-1860: -- codecov-commenter edited a comment on pull request #3184: URL: https://github.com/apache/hudi/pull/3184#issuecomment-870526141 # [Codecov](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report > Merging [#3184](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (740f439) into [master](https://codecov.io/gh/apache/hudi/commit/039aeb6dcee0a8eb4372c079ec07b8fc2582e41f?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (039aeb6) will **decrease** coverage by `43.33%`. > The diff coverage is `0.00%`. 
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3184/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff              @@
##             master    #3184       +/-   ##
=============================================
- Coverage     46.16%    2.82%   -43.34%
+ Complexity     5370       85     -5285
=============================================
  Files           921      284      -637
  Lines         39953    11839    -28114
  Branches       4288      982     -3306
=============================================
- Hits          18444      335    -18109
+ Misses        19630    11478     -8152
+ Partials       1879       26     -1853
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <ø> (-30.47%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.85% <ø> (-49.20%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `9.10% <0.00%> (-49.34%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.29%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3184: [HUDI-1860] Add INSERT_OVERWRITE and INSERT_OVERWRITE_TABLE support to DeltaStreamer
codecov-commenter edited a comment on pull request #3184:
URL: https://github.com/apache/hudi/pull/3184#issuecomment-870526141

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3184](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (740f439) into [master](https://codecov.io/gh/apache/hudi/commit/039aeb6dcee0a8eb4372c079ec07b8fc2582e41f?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (039aeb6) will **decrease** coverage by `43.33%`.
> The diff coverage is `0.00%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3184/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff              @@
##             master    #3184       +/-   ##
=============================================
- Coverage     46.16%    2.82%   -43.34%
+ Complexity     5370       85     -5285
=============================================
  Files           921      284      -637
  Lines         39953    11839    -28114
  Branches       4288      982     -3306
=============================================
- Hits          18444      335    -18109
+ Misses        19630    11478     -8152
+ Partials       1879       26     -1853
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <ø> (-30.47%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.85% <ø> (-49.20%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `9.10% <0.00%> (-49.34%)` | :arrow_down: |

Flags with carried forward coverage won't be shown.
[Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.29%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/utilities/sourc
[GitHub] [hudi] Samrat002 edited a comment on pull request #3184: [HUDI-1860] Add INSERT_OVERWRITE and INSERT_OVERWRITE_TABLE support to DeltaStreamer
Samrat002 edited a comment on pull request #3184: URL: https://github.com/apache/hudi/pull/3184#issuecomment-879154991 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HUDI-1860) Add INSERT_OVERWRITE support to DeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381067#comment-17381067 ] ASF GitHub Bot commented on HUDI-1860: -- hudi-bot edited a comment on pull request #3184:
URL: https://github.com/apache/hudi/pull/3184#issuecomment-870410669

## CI report:
* cf901c664f7baaab834b3f02a819144b5558f952 UNKNOWN
* 814e45c99f54bb11ed54263cc4077e0a08689b48 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=921)
* 740f4390fd6dd5798d5bee5f046a258dffd9887a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=923)

Bot commands @hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Add INSERT_OVERWRITE support to DeltaStreamer
>
> Key: HUDI-1860
> URL: https://issues.apache.org/jira/browse/HUDI-1860
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Sagar Sumit
> Assignee: Samrat Deb
> Priority: Major
> Labels: pull-request-available
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> As discussed in [this RFC|https://cwiki.apache.org/confluence/display/HUDI/RFC+-+14+%3A+JDBC+incremental+puller], having full fetch mode use insert_overwrite to write to the sink would be better, as it can handle schema changes.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3184: [HUDI-1860] Add INSERT_OVERWRITE and INSERT_OVERWRITE_TABLE support to DeltaStreamer
hudi-bot edited a comment on pull request #3184: URL: https://github.com/apache/hudi/pull/3184#issuecomment-870410669 ## CI report: * cf901c664f7baaab834b3f02a819144b5558f952 UNKNOWN * 814e45c99f54bb11ed54263cc4077e0a08689b48 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=921) * 740f4390fd6dd5798d5bee5f046a258dffd9887a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=923) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HUDI-1860) Add INSERT_OVERWRITE support to DeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381060#comment-17381060 ] ASF GitHub Bot commented on HUDI-1860: -- hudi-bot edited a comment on pull request #3184:
URL: https://github.com/apache/hudi/pull/3184#issuecomment-870410669

## CI report:
* cf901c664f7baaab834b3f02a819144b5558f952 UNKNOWN
* 814e45c99f54bb11ed54263cc4077e0a08689b48 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=921)
* 740f4390fd6dd5798d5bee5f046a258dffd9887a UNKNOWN

Bot commands @hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Add INSERT_OVERWRITE support to DeltaStreamer
>
> Key: HUDI-1860
> URL: https://issues.apache.org/jira/browse/HUDI-1860
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Sagar Sumit
> Assignee: Samrat Deb
> Priority: Major
> Labels: pull-request-available
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> As discussed in [this RFC|https://cwiki.apache.org/confluence/display/HUDI/RFC+-+14+%3A+JDBC+incremental+puller], having full fetch mode use insert_overwrite to write to the sink would be better, as it can handle schema changes.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3184: [HUDI-1860] Add INSERT_OVERWRITE and INSERT_OVERWRITE_TABLE support to DeltaStreamer
hudi-bot edited a comment on pull request #3184: URL: https://github.com/apache/hudi/pull/3184#issuecomment-870410669 ## CI report: * cf901c664f7baaab834b3f02a819144b5558f952 UNKNOWN * 814e45c99f54bb11ed54263cc4077e0a08689b48 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=921) * 740f4390fd6dd5798d5bee5f046a258dffd9887a UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3184: [HUDI-1860] Add INSERT_OVERWRITE and INSERT_OVERWRITE_TABLE support to DeltaStreamer
codecov-commenter edited a comment on pull request #3184: URL: https://github.com/apache/hudi/pull/3184#issuecomment-870526141 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HUDI-1860) Add INSERT_OVERWRITE support to DeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381051#comment-17381051 ] ASF GitHub Bot commented on HUDI-1860: -- codecov-commenter edited a comment on pull request #3184:
URL: https://github.com/apache/hudi/pull/3184#issuecomment-870526141

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Add INSERT_OVERWRITE support to DeltaStreamer
>
> Key: HUDI-1860
> URL: https://issues.apache.org/jira/browse/HUDI-1860
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Sagar Sumit
> Assignee: Samrat Deb
> Priority: Major
> Labels: pull-request-available
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> As discussed in [this RFC|https://cwiki.apache.org/confluence/display/HUDI/RFC+-+14+%3A+JDBC+incremental+puller], having full fetch mode use insert_overwrite to write to the sink would be better, as it can handle schema changes.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HUDI-2181) Refine doc for FlinkCreateHandle
[ https://issues.apache.org/jira/browse/HUDI-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhangminglei updated HUDI-2181: --- Summary: Refine doc for FlinkCreateHandle (was: Refine the doc for FlinkCreateHandle)

> Refine doc for FlinkCreateHandle
>
> Key: HUDI-2181
> URL: https://issues.apache.org/jira/browse/HUDI-2181
> Project: Apache Hudi
> Issue Type: Improvement
> Components: Flink Integration
> Reporter: zhangminglei
> Priority: Major
>
> FlinkCreateHandle does not append to the original file for subsequent mini-batches; instead, every insert batch creates a new file.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HUDI-2181) Refine doc for FlinkCreateHandle
[ https://issues.apache.org/jira/browse/HUDI-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhangminglei reassigned HUDI-2181: -- Assignee: zhangminglei > Refine doc for FlinkCreateHandle > > > Key: HUDI-2181 > URL: https://issues.apache.org/jira/browse/HUDI-2181 > Project: Apache Hudi > Issue Type: Improvement > Components: Flink Integration > Reporter: zhangminglei > Assignee: zhangminglei > Priority: Major > > FlinkCreateHandle does not append to the original file for subsequent mini-batches; instead, every insert batch creates a new file.
[jira] [Created] (HUDI-2181) Refine the doc for FlinkCreateHandle
zhangminglei created HUDI-2181: -- Summary: Refine the doc for FlinkCreateHandle Key: HUDI-2181 URL: https://issues.apache.org/jira/browse/HUDI-2181 Project: Apache Hudi Issue Type: Improvement Components: Flink Integration Reporter: zhangminglei FlinkCreateHandle does not append to the original file for subsequent mini-batches; instead, every insert batch creates a new file.
[jira] [Commented] (HUDI-1860) Add INSERT_OVERWRITE support to DeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381045#comment-17381045 ] ASF GitHub Bot commented on HUDI-1860: -- codecov-commenter edited a comment on pull request #3184: URL: https://github.com/apache/hudi/pull/3184#issuecomment-870526141 # [Codecov](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report > Merging [#3184](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (814e45c) into [master](https://codecov.io/gh/apache/hudi/commit/d024439764ceeca6366cb33689b729a1c69a6272?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (d024439) will **decrease** coverage by `16.79%`. > The diff coverage is `n/a`. 
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3184/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

```diff
@@              Coverage Diff              @@
##             master    #3184       +/-   ##
=============================================
- Coverage     44.10%   27.30%   -16.80%
+ Complexity     5157     1292     -3865
=============================================
  Files           936      386      -550
  Lines         41629    15343    -26286
  Branches       4189     1339     -2850
=============================================
- Hits          18362     4190    -14172
+ Misses        21638    10849    -10789
+ Partials       1629      304     -1325
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `20.91% <0.00%> (-13.56%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.85% <0.00%> (-50.88%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.26% <0.00%> (+50.14%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...rg/apache/hudi/client/bootstrap/BootstrapMode.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3184: [HUDI-1860] Add INSERT_OVERWRITE and INSERT_OVERWRITE_TABLE support to DeltaStreamer
[jira] [Updated] (HUDI-2044) Extend support for RocksDB and compression for Spillable map to all consumers of ExternalSpillableMap
[ https://issues.apache.org/jira/browse/HUDI-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-2044: -- Description: # HUDI-2028 only implements RocksDB support for the spillable map in HoodieMergeHandle, since we are blocked on the configuration refactor PR landing # This ticket will track the implementation to extend RocksDB (and compression for Bitcask) support for the spillable map to all consumers of ExternalSpillableMap.java was: # HUDI-2028 only implements RocksDB support for the spillable map in HoodieMergeHandle, since we are blocked on the configuration refactor PR landing # This ticket will track the implementation to extend RocksDB support for the spillable map to all consumers of ExternalSpillableMap.java > Extend support for RocksDB and compression for Spillable map to all consumers of ExternalSpillableMap > > > Key: HUDI-2044 > URL: https://issues.apache.org/jira/browse/HUDI-2044 > Project: Apache Hudi > Issue Type: Improvement > Reporter: Rajesh Mahindra > Assignee: Rajesh Mahindra > Priority: Major > > # HUDI-2028 only implements RocksDB support for the spillable map in HoodieMergeHandle, since we are blocked on the configuration refactor PR landing > # This ticket will track the implementation to extend RocksDB (and compression for Bitcask) support for the spillable map to all consumers of ExternalSpillableMap.java
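The spillable-map idea this ticket builds on — keep entries in memory up to a budget and spill the overflow to a disk-backed store (a Bitcask-style file or RocksDB) — can be sketched as follows. This is a simplified, hypothetical illustration, not Hudi's actual ExternalSpillableMap API: the class name, the entry-count threshold (Hudi budgets by bytes), and the use of Python's stdlib `shelve` as the stand-in disk store are all assumptions of the sketch.

```python
import os
import shelve
import tempfile

class SpillableMap:
    """Toy spillable map: holds up to max_in_memory entries in a dict;
    overflow entries spill to an on-disk store (stdlib shelve here,
    standing in for a Bitcask-style file or RocksDB)."""

    def __init__(self, max_in_memory=1000):
        self.max_in_memory = max_in_memory
        self.memory = {}
        self.spill_dir = tempfile.mkdtemp()
        self.disk = shelve.open(os.path.join(self.spill_dir, "spill"))

    def put(self, key, value):
        if key in self.memory or len(self.memory) < self.max_in_memory:
            self.memory[key] = value      # fits in the in-memory budget
        else:
            self.disk[str(key)] = value   # over budget: spill to disk

    def get(self, key):
        if key in self.memory:
            return self.memory[key]
        return self.disk.get(str(key))    # fall back to the disk store

    def __len__(self):
        return len(self.memory) + len(self.disk)

    def close(self):
        self.disk.close()

m = SpillableMap(max_in_memory=2)
m.put("a", 1); m.put("b", 2); m.put("c", 3)  # "c" overflows to disk
print(len(m), m.get("a"), m.get("c"))        # prints: 3 1 3
```

Callers see one map interface either way; only the storage location of each entry changes, which is why adding RocksDB or compression behind the disk store does not affect the consumers the ticket wants to extend.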
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381044#comment-17381044 ] ASF GitHub Bot commented on HUDI-2164: -- hudi-bot edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249 ## CI report: * b4aa7869d8343a16b225a81844e907fbee63b576 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=919) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=922) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build > Build cluster plan and execute this plan at once for HoodieClusteringJob > > > Key: HUDI-2164 > URL: https://issues.apache.org/jira/browse/HUDI-2164 > Project: Apache Hudi > Issue Type: Task > Reporter: Yue Zhang > Priority: Major > Labels: pull-request-available > > For now, Hudi lets users submit a HoodieClusteringJob to build a clustering plan or execute a clustering plan through the --schedule or --instant-time config. > If users want to trigger a clustering job, they have to: > # Submit a HoodieClusteringJob to build a clustering plan through the --schedule config > # Copy the created clustering instant time from the log output > # Submit the HoodieClusteringJob again to execute this created clustering plan through the --instant-time config
> The pain point is that there are too many steps when triggering a clustering, and the instant time must be copied and pasted from the log file manually, so the process can't be automated.
> I just raised a PR to offer a new config named --mode (or -m for short):
> ||--mode||remarks||
> |execute|Execute a cluster plan at the given instant, which means --instant-time is needed here. Default value.|
> |schedule|Make a clustering plan.|
> |*scheduleAndExecute*|Make a cluster plan first and execute that plan immediately.|
> Now users can use --mode scheduleAndExecute to build a cluster plan and execute this plan at once using HoodieClusteringJob.
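With the scheduleAndExecute mode described above, the schedule/copy/execute dance collapses into a single submission. A sketch of the one-shot invocation — the spark-submit shape, bundle jar name, and the --props/--base-path/--table-name flags are illustrative assumptions; only --mode, --schedule, and --instant-time come from the description above:

```shell
# Before: two submissions — one with --schedule to create the plan,
# then another with --instant-time <copied from the logs> to execute it.
# After: a single submission that schedules and executes in one shot.
spark-submit \
  --class org.apache.hudi.utilities.HoodieClusteringJob \
  hudi-utilities-bundle.jar \
  --props /path/to/clusteringjob.properties \
  --base-path /tmp/hudi_table \
  --table-name hudi_table \
  --mode scheduleAndExecute
```

Because no instant time has to be read back from the logs, this form can be scheduled from cron or a workflow engine without manual steps.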
[GitHub] [hudi] hudi-bot edited a comment on pull request #3259: [HUDI-2164] Let users build cluster plan and execute this plan at once using HoodieClusteringJob for async clustering
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381043#comment-17381043 ] ASF GitHub Bot commented on HUDI-2164: -- zhangyue19921010 commented on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-880409359 @hudi-bot run azure > Build cluster plan and execute this plan at once for HoodieClusteringJob > > > Key: HUDI-2164 > URL: https://issues.apache.org/jira/browse/HUDI-2164 > Project: Apache Hudi > Issue Type: Task > Reporter: Yue Zhang > Priority: Major > Labels: pull-request-available > > For now, Hudi lets users submit a HoodieClusteringJob to build a clustering plan or execute a clustering plan through the --schedule or --instant-time config. > If users want to trigger a clustering job, they have to: > # Submit a HoodieClusteringJob to build a clustering plan through the --schedule config > # Copy the created clustering instant time from the log output > # Submit the HoodieClusteringJob again to execute this created clustering plan through the --instant-time config > The pain point is that there are too many steps when triggering a clustering, and the instant time must be copied and pasted from the log file manually, so the process can't be automated. > I just raised a PR to offer a new config named --mode (or -m for short): > ||--mode||remarks|| > |execute|Execute a cluster plan at the given instant, which means --instant-time is needed here. Default value.| > |schedule|Make a clustering plan.| > |*scheduleAndExecute*|Make a cluster plan first and execute that plan immediately.| > Now users can use --mode scheduleAndExecute to build a cluster plan and execute this plan at once using HoodieClusteringJob.
[jira] [Commented] (HUDI-1860) Add INSERT_OVERWRITE support to DeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381039#comment-17381039 ] ASF GitHub Bot commented on HUDI-1860: -- codecov-commenter edited a comment on pull request #3184: URL: https://github.com/apache/hudi/pull/3184#issuecomment-870526141 # [Codecov](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report > Merging [#3184](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (814e45c) into [master](https://codecov.io/gh/apache/hudi/commit/d024439764ceeca6366cb33689b729a1c69a6272?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (d024439) will **decrease** coverage by `28.35%`. > The diff coverage is `n/a`. 
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3184/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

```diff
@@              Coverage Diff              @@
##             master    #3184       +/-   ##
=============================================
- Coverage     44.10%   15.74%   -28.36%
+ Complexity     5157      493     -4664
=============================================
  Files           936      284      -652
  Lines         41629    11835    -29794
  Branches       4189      982     -3207
=============================================
- Hits          18362     1864    -16498
+ Misses        21638     9808    -11830
+ Partials       1629      163     -1466
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <0.00%> (-34.47%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.85% <0.00%> (-50.88%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.26% <0.00%> (+50.14%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...rg/apache/hudi/client/bootstrap/BootstrapMode.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=t
[jira] [Commented] (HUDI-1860) Add INSERT_OVERWRITE support to DeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381037#comment-17381037 ] ASF GitHub Bot commented on HUDI-1860: -- hudi-bot edited a comment on pull request #3184: URL: https://github.com/apache/hudi/pull/3184#issuecomment-870410669 ## CI report: * c0063ddcc875e3e13348861ebaf21ef47126a691 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=920) * cf901c664f7baaab834b3f02a819144b5558f952 UNKNOWN * 814e45c99f54bb11ed54263cc4077e0a08689b48 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=921) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build > Add INSERT_OVERWRITE support to DeltaStreamer > - > > Key: HUDI-1860 > URL: https://issues.apache.org/jira/browse/HUDI-1860 > Project: Apache Hudi > Issue Type: Sub-task > Reporter: Sagar Sumit > Assignee: Samrat Deb > Priority: Major > Labels: pull-request-available > Original Estimate: 72h > Remaining Estimate: 72h > > As discussed in [this RFC|https://cwiki.apache.org/confluence/display/HUDI/RFC+-+14+%3A+JDBC+incremental+puller], having full fetch mode use insert_overwrite to write to the sink would be better, as it can handle schema changes.
[jira] [Commented] (HUDI-1860) Add INSERT_OVERWRITE support to DeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381035#comment-17381035 ] ASF GitHub Bot commented on HUDI-1860: -- codecov-commenter edited a comment on pull request #3184: URL: https://github.com/apache/hudi/pull/3184#issuecomment-870526141 # [Codecov](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report > Merging [#3184](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (814e45c) into [master](https://codecov.io/gh/apache/hudi/commit/d024439764ceeca6366cb33689b729a1c69a6272?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (d024439) will **decrease** coverage by `41.27%`. > The diff coverage is `n/a`. 
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3184/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff              @@
##             master   #3184       +/-   ##
============================================
- Coverage     44.10%   2.83%   -41.28%
+ Complexity     5157      85     -5072
============================================
  Files           936     284      -652
  Lines         41629   11835    -29794
  Branches       4189     982     -3207
============================================
- Hits          18362     335    -18027
+ Misses        21638   11474    -10164
+ Partials       1629      26     -1603
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <0.00%> (-34.47%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.85% <0.00%> (-50.88%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `9.11% <0.00%> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...rg/apache/hudi/client/bootstrap/BootstrapMode.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9ib290c3RyYXAvQm9vdHN0cmFwTW9kZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
[jira] [Commented] (HUDI-1860) Add INSERT_OVERWRITE support to DeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381034#comment-17381034 ] ASF GitHub Bot commented on HUDI-1860: -- codecov-commenter edited a comment on pull request #3184: URL: https://github.com/apache/hudi/pull/3184#issuecomment-870526141 # [Codecov](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report > Merging [#3184](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (cf901c6) into [master](https://codecov.io/gh/apache/hudi/commit/d024439764ceeca6366cb33689b729a1c69a6272?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (d024439) will **decrease** coverage by `41.27%`. > The diff coverage is `n/a`. > :exclamation: Current head cf901c6 differs from pull request most recent head 814e45c. 
Consider uploading reports for the commit 814e45c to get more accurate results.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3184/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff              @@
##             master    #3184       +/-  ##
============================================
- Coverage     44.10%    2.83%    -41.28%
+ Complexity     5157       85      -5072
============================================
  Files           936      284       -652
  Lines         41629    11835     -29794
  Branches       4189      982      -3207
============================================
- Hits          18362      335     -18027
+ Misses        21638    11474     -10164
+ Partials       1629       26      -1603
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <0.00%> (-34.47%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.85% <0.00%> (-50.88%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `9.11% <0.00%> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3184?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3184/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
[jira] [Commented] (HUDI-1860) Add INSERT_OVERWRITE support to DeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381031#comment-17381031 ]

ASF GitHub Bot commented on HUDI-1860:
--
hudi-bot edited a comment on pull request #3184:
URL: https://github.com/apache/hudi/pull/3184#issuecomment-870410669

## CI report:

* c0063ddcc875e3e13348861ebaf21ef47126a691 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=920)
* cf901c664f7baaab834b3f02a819144b5558f952 UNKNOWN
* 814e45c99f54bb11ed54263cc4077e0a08689b48 UNKNOWN

Bot commands: @hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Add INSERT_OVERWRITE support to DeltaStreamer
> ---------------------------------------------
>
> Key: HUDI-1860
> URL: https://issues.apache.org/jira/browse/HUDI-1860
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Sagar Sumit
> Assignee: Samrat Deb
> Priority: Major
> Labels: pull-request-available
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> As discussed in [this RFC|https://cwiki.apache.org/confluence/display/HUDI/RFC+-+14+%3A+JDBC+incremental+puller], it would be better to have full-fetch mode use insert_overwrite to write to the sink, as it can handle schema changes.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
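For context on what the issue asks for: PR #3184 wires INSERT_OVERWRITE (and INSERT_OVERWRITE_TABLE) into DeltaStreamer as write operations. A sketch of how such a run might be launched is below; it assumes the operation is selected via DeltaStreamer's `--op` flag, and the jar name, paths, and properties file are placeholders rather than values taken from the PR:

```shell
# Hypothetical DeltaStreamer invocation; jar, paths, and props file are placeholders.
# --op INSERT_OVERWRITE replaces the data in the partitions touched by this batch,
# which suits the full-fetch JDBC mode discussed in RFC-14.
spark-submit \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  hudi-utilities-bundle.jar \
  --table-type COPY_ON_WRITE \
  --op INSERT_OVERWRITE \
  --source-class org.apache.hudi.utilities.sources.JdbcSource \
  --target-base-path /tmp/hudi/my_table \
  --target-table my_table \
  --props /path/to/source.properties
```

This is a sketch under stated assumptions, not the exact command from the PR; consult the PR and the DeltaStreamer docs for the authoritative flag set.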
[jira] [Commented] (HUDI-1860) Add INSERT_OVERWRITE support to DeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381030#comment-17381030 ]

ASF GitHub Bot commented on HUDI-1860:
--
hudi-bot edited a comment on pull request #3184:
URL: https://github.com/apache/hudi/pull/3184#issuecomment-870410669

## CI report:

* f15a539b1ea1ed7fb9a5b8a31cc9b88d68a6710f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=911)
* c0063ddcc875e3e13348861ebaf21ef47126a691 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=920)
* cf901c664f7baaab834b3f02a819144b5558f952 UNKNOWN

Bot commands: @hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381027#comment-17381027 ] ASF GitHub Bot commented on HUDI-2164: -- codecov-commenter edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821 # [Codecov](https://codecov.io/gh/apache/hudi/pull/3259?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report > Merging [#3259](https://codecov.io/gh/apache/hudi/pull/3259?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b4aa786) into [master](https://codecov.io/gh/apache/hudi/commit/d024439764ceeca6366cb33689b729a1c69a6272?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (d024439) will **increase** coverage by `3.65%`. > The diff coverage is `38.88%`. 
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3259/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3259?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff              @@
##             master    #3259      +/-   ##
============================================
+ Coverage     44.10%   47.76%     +3.65%
- Complexity     5157     5566       +409
============================================
  Files           936      936
  Lines         41629    41653        +24
  Branches       4189     4195         +6
============================================
+ Hits          18362    19897      +1535
+ Misses        21638    19987      -1651
- Partials       1629     1769       +140
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.97% <ø> (ø)` | |
| hudiclient | `34.47% <ø> (+<0.01%)` | :arrow_up: |
| hudicommon | `48.69% <ø> (ø)` | |
| hudiflink | `59.68% <ø> (ø)` | |
| hudihadoopmr | `52.02% <ø> (ø)` | |
| hudisparkdatasource | `67.21% <ø> (ø)` | |
| hudisync | `55.73% <ø> (ø)` | |
| huditimelineservice | `64.07% <ø> (ø)` | |
| hudiutilities | `58.96% <38.88%> (+49.84%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3259?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | `51.06% <38.88%> (+51.06%)` | :arrow_up: |
| [...e/hudi/client/heartbeat/HoodieHeartbeatClient.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9oZWFydGJlYXQvSG9vZGllSGVhcnRiZWF0Q2xpZW50LmphdmE=) | `69.15% <0.00%> (+0.93%)` | :arrow_up: |
| [.../apache/hudi/utilities/HoodieSnapshotExporter.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZVNuYXBzaG90RXhwb3J0ZXIuamF2YQ==) | `88.79% <0.00%> (+5.17%)` | :arrow_up: |
| [...e/hudi/utilities/transform/ChainedTransformer.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9DaGFpbmVkVHJhbnNmb3JtZXIuamF2YQ==) | `100.00% <0.00%> (+11.11%)` | :arrow_up: |
| [...g/apache/hudi/utilities/schema/SchemaProvider.java](https://codecov.io/gh/apach
[GitHub] [hudi] vinothchandar commented on pull request #3155: [Do-No-Merge][WIP] Running TestCleaner tests repeatedly
vinothchandar commented on pull request #3155:
URL: https://github.com/apache/hudi/pull/3155#issuecomment-880387121

closing this for now
[GitHub] [hudi] vinothchandar closed pull request #3155: [Do-No-Merge][WIP] Running TestCleaner tests repeatedly
vinothchandar closed pull request #3155:
URL: https://github.com/apache/hudi/pull/3155
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381025#comment-17381025 ] ASF GitHub Bot commented on HUDI-2164: -- codecov-commenter edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
> Build cluster plan and execute this plan at once for HoodieClusteringJob
>
> Key: HUDI-2164
> URL: https://issues.apache.org/jira/browse/HUDI-2164
> Project: Apache Hudi
> Issue Type: Task
> Reporter: Yue Zhang
> Priority: Major
> Labels: pull-request-available
>
> For now, Hudi lets users submit a HoodieClusteringJob to build a clustering plan or execute a clustering plan through the --schedule or --instant-time config.
> If users want to trigger a clustering job, they have to:
> # Submit a HoodieClusteringJob to build a clustering plan through the --schedule config.
> # Copy the created clustering instant time from the log info.
> # Submit the HoodieClusteringJob again to execute this created clustering plan through the --instant-time config.
> The pain point is that there are too many steps when triggering a clustering, and the instant time has to be copied and pasted from the log file manually, so the process can't be automated.
>
> I just raised a PR to offer a new config named --mode, or -m for short:
> ||--mode||remarks||
> |execute|Execute a cluster plan at the given instant, which means --instant-time is needed here. Default value.|
> |schedule|Make a clustering plan.|
> |*scheduleAndExecute*|Make a cluster plan first and execute that plan immediately.|
> Now users can use --mode scheduleAndExecute to build a cluster plan and execute this plan at once using HoodieClusteringJob.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
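For concreteness, the workflow described above can be sketched as spark-submit invocations. Only `--mode`, `--schedule`, and `--instant-time` come from the description; the bundle jar path, `--base-path`, `--table-name`, and the instant-time value are hypothetical placeholders, so treat this as a sketch rather than the exact CLI of the PR:

```shell
# Old two-step flow (placeholders: jar path, base path, table name, instant time).
spark-submit \
  --class org.apache.hudi.utilities.HoodieClusteringJob \
  /path/to/hudi-utilities-bundle.jar \
  --base-path /tmp/hudi/my_table \
  --table-name my_table \
  --schedule   # step 1: creates a clustering plan; instant time appears in the log

spark-submit \
  --class org.apache.hudi.utilities.HoodieClusteringJob \
  /path/to/hudi-utilities-bundle.jar \
  --base-path /tmp/hudi/my_table \
  --table-name my_table \
  --instant-time 20210716123000   # step 2: instant copied manually from the log

# New one-step flow proposed by this PR:
spark-submit \
  --class org.apache.hudi.utilities.HoodieClusteringJob \
  /path/to/hudi-utilities-bundle.jar \
  --base-path /tmp/hudi/my_table \
  --table-name my_table \
  --mode scheduleAndExecute   # schedule a plan and execute it immediately
```

The point of scheduleAndExecute is that no instant time ever has to be copied out of a log, so the job can be cron-scheduled without manual intervention.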
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3259: [HUDI-2164] Let users build cluster plan and execute this plan at once using HoodieClusteringJob for async clustering
codecov-commenter edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381022#comment-17381022 ] ASF GitHub Bot commented on HUDI-2164: -- hudi-bot edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249 ## CI report: * b4aa7869d8343a16b225a81844e907fbee63b576 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=919) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3259: [HUDI-2164] Let users build cluster plan and execute this plan at once using HoodieClusteringJob for async clustering
hudi-bot edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381014#comment-17381014 ] ASF GitHub Bot commented on HUDI-2164: -- codecov-commenter edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821 # [Codecov](https://codecov.io/gh/apache/hudi/pull/3259?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report > Merging [#3259](https://codecov.io/gh/apache/hudi/pull/3259?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b4aa786) into [master](https://codecov.io/gh/apache/hudi/commit/d024439764ceeca6366cb33689b729a1c69a6272?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (d024439) will **decrease** coverage by `16.80%`. > The diff coverage is `38.88%`. 
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3259/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3259?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff              @@
##           master    #3259        +/-   ##
============================================
- Coverage   44.10%   27.29%    -16.81%
+ Complexity   5157     1292      -3865
============================================
  Files         936      386       -550
  Lines       41629    15367     -26262
  Branches     4189     1345      -2844
============================================
- Hits        18362     4195     -14167
+ Misses      21638    10864     -10774
+ Partials     1629      308      -1321
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `20.91% <ø> (-13.56%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.85% <ø> (-50.88%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `58.96% <38.88%> (+49.84%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3259?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | | |---|---|---| | [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | `51.06% <38.88%> (+51.06%)` | :arrow_up: | | [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: | | [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: | | [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: | | 
[...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr&el=tree&utm_medium=refer
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3259: [HUDI-2164] Let users build cluster plan and execute this plan at once using HoodieClusteringJob for async clustering
codecov-commenter edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821
[jira] [Commented] (HUDI-1860) Add INSERT_OVERWRITE support to DeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381011#comment-17381011 ] ASF GitHub Bot commented on HUDI-1860: -- hudi-bot edited a comment on pull request #3184: URL: https://github.com/apache/hudi/pull/3184#issuecomment-870410669 ## CI report: * f15a539b1ea1ed7fb9a5b8a31cc9b88d68a6710f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=911) * c0063ddcc875e3e13348861ebaf21ef47126a691 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=920) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
> Add INSERT_OVERWRITE support to DeltaStreamer
>
> Key: HUDI-1860
> URL: https://issues.apache.org/jira/browse/HUDI-1860
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Sagar Sumit
> Assignee: Samrat Deb
> Priority: Major
> Labels: pull-request-available
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> As discussed in [this RFC|https://cwiki.apache.org/confluence/display/HUDI/RFC+-+14+%3A+JDBC+incremental+puller], having the full-fetch mode use insert_overwrite to write to the sink would be better, as it can handle schema changes.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
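As a sketch of what the requested support could look like from the DeltaStreamer CLI: the HoodieDeltaStreamer class and its `--op` flag exist in Hudi's utilities bundle, but INSERT_OVERWRITE as an `--op` value is exactly what this issue adds, and the jar path, source class, base path, and table name below are hypothetical placeholders rather than values taken from the PR:

```shell
# Hypothetical full-fetch JDBC ingest that overwrites matching partitions on
# each run instead of upserting, so upstream schema changes are absorbed.
spark-submit \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  /path/to/hudi-utilities-bundle.jar \
  --table-type COPY_ON_WRITE \
  --target-base-path /tmp/hudi/jdbc_table \
  --target-table jdbc_table \
  --source-class org.apache.hudi.utilities.sources.JdbcSource \
  --op INSERT_OVERWRITE   # the new operation this PR wires into DeltaStreamer
```

With INSERT_OVERWRITE_TABLE (also in the PR title), the whole table rather than only the touched partitions would be replaced on each run.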
[GitHub] [hudi] hudi-bot edited a comment on pull request #3184: [HUDI-1860] Add INSERT_OVERWRITE and INSERT_OVERWRITE_TABLE support to DeltaStreamer
hudi-bot edited a comment on pull request #3184: URL: https://github.com/apache/hudi/pull/3184#issuecomment-870410669
[jira] [Commented] (HUDI-1860) Add INSERT_OVERWRITE support to DeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381009#comment-17381009 ] ASF GitHub Bot commented on HUDI-1860: -- hudi-bot edited a comment on pull request #3184: URL: https://github.com/apache/hudi/pull/3184#issuecomment-870410669 ## CI report: * f15a539b1ea1ed7fb9a5b8a31cc9b88d68a6710f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=911) * c0063ddcc875e3e13348861ebaf21ef47126a691 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3184: [HUDI-1860] Add INSERT_OVERWRITE and INSERT_OVERWRITE_TABLE support to DeltaStreamer
hudi-bot edited a comment on pull request #3184: URL: https://github.com/apache/hudi/pull/3184#issuecomment-870410669
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381007#comment-17381007 ] ASF GitHub Bot commented on HUDI-2164: -- codecov-commenter edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821 # [Codecov](https://codecov.io/gh/apache/hudi/pull/3259?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report > Merging [#3259](https://codecov.io/gh/apache/hudi/pull/3259?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b4aa786) into [master](https://codecov.io/gh/apache/hudi/commit/d024439764ceeca6366cb33689b729a1c69a6272?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (d024439) will **decrease** coverage by `28.34%`. > The diff coverage is `38.88%`. 
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3259/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3259?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff              @@
##           master    #3259        +/-   ##
============================================
- Coverage   44.10%   15.76%    -28.35%
+ Complexity   5157      493      -4664
============================================
  Files         936      284       -652
  Lines       41629    11859     -29770
  Branches     4189      988      -3201
============================================
- Hits        18362     1869     -16493
+ Misses      21638     9823     -11815
+ Partials     1629      167      -1462
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <ø> (-34.47%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.85% <ø> (-50.88%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `58.96% <38.88%> (+49.84%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3259?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | | |---|---|---| | [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | `51.06% <38.88%> (+51.06%)` | :arrow_up: | | [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: | | [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: | | [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: | | 
[...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr&el=tree&utm_medium=referr
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3259: [HUDI-2164] Let users build cluster plan and execute this plan at once using HoodieClusteringJob for async clustering
codecov-commenter edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381001#comment-17381001 ] ASF GitHub Bot commented on HUDI-2164: -- zhangyue19921010 edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-880370120

> @zhangyue19921010 hello, Which company are you from? Can we add wechat? My wechat is lw19900302

It's my pleasure. I'm coming from freewheel :) Also, all the changes are done. PTAL, and thanks a lot for your review.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Build cluster plan and execute this plan at once for HoodieClusteringJob
>
> Key: HUDI-2164
> URL: https://issues.apache.org/jira/browse/HUDI-2164
> Project: Apache Hudi
> Issue Type: Task
> Reporter: Yue Zhang
> Priority: Major
> Labels: pull-request-available
>
> For now, Hudi lets users submit a HoodieClusteringJob to build a clustering plan or to execute a clustering plan through the --schedule or --instant-time config.
> If users want to trigger a clustering job, they have to:
> # Submit a HoodieClusteringJob to build a clustering plan through the --schedule config.
> # Copy the created clustering instant time from the log output.
> # Submit the HoodieClusteringJob again to execute the created clustering plan through the --instant-time config.
> The pain point is that there are too many steps when triggering a clustering job, and the instant time has to be copied and pasted from the log file manually, so the process cannot be automated.
>
> I just raised a PR to offer a new config named --mode, or -m in short:
> ||--mode||remarks||
> |execute|Execute a cluster plan at the given instant, which means --instant-time is needed here. Default value.|
> |schedule|Make a clustering plan.|
> |*scheduleAndExecute*|Make a cluster plan first and execute that plan immediately.|
> Now users can use --mode scheduleAndExecute to build a cluster plan and execute the plan at once using HoodieClusteringJob.
> -- This message was sent by Atlassian Jira (v8.3.4#803005)
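The workflow improvement behind the mode table above can be sketched as a toy model: with only schedule and execute, the instant time produced by the first run has to be carried by hand into the second, while scheduleAndExecute chains the two steps in one invocation. This is a minimal illustration, not Hudi's actual API — all class and method names below are made up, and the hard-coded instant time is a placeholder:

```java
import java.util.Optional;

// Hypothetical model of the three --mode behaviors; not the real HoodieClusteringJob API.
class ClusteringJobModel {
    // --mode schedule: plan only; the user must copy the returned instant from the logs.
    static Optional<String> schedule() {
        return Optional.of("20210715110511");  // placeholder clustering instant time
    }

    // --mode execute: needs the instant copied from a previous schedule run.
    static int execute(String instantTime) {
        return (instantTime != null && !instantTime.isEmpty()) ? 0 : -1;
    }

    // --mode scheduleAndExecute: one submission, no manual copy of the instant.
    static int scheduleAndExecute() {
        Optional<String> instant = schedule();
        return instant.isPresent() ? execute(instant.get()) : -1;
    }
}
```

The point of the sketch is that `scheduleAndExecute()` removes the manual hand-off of the instant time between two job submissions.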
[GitHub] [hudi] zhangyue19921010 edited a comment on pull request #3259: [HUDI-2164] Let users build cluster plan and execute this plan at once using HoodieClusteringJob for async clustering
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381000#comment-17381000 ] ASF GitHub Bot commented on HUDI-2164: -- zhangyue19921010 commented on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-880370120

> @zhangyue19921010 hello, Which company are you from? Can we add wechat? My wechat is lw19900302

It's my pleasure. I'm coming from freewheel :)
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380999#comment-17380999 ] ASF GitHub Bot commented on HUDI-2164: -- zhangyue19921010 commented on a change in pull request #3259: URL: https://github.com/apache/hudi/pull/3259#discussion_r670111508

## File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java
## @@ -1059,6 +1059,50 @@

```java
  @Test
  public void testHoodieAsyncClusteringJobWithScheduleAndExecute() throws Exception {
    String tableBasePath = dfsBasePath + "/asyncClustering2";
    // Keep it higher than batch-size to test continuous mode
    int totalRecords = 3000;

    // Initial bulk insert
    HoodieDeltaStreamer.Config cfg = TestHelpers.makeConfig(tableBasePath, WriteOperationType.INSERT);
    cfg.continuousMode = true;
    cfg.tableType = HoodieTableType.COPY_ON_WRITE.name();
    cfg.configs.add(String.format("%s=%d", SourceConfigs.MAX_UNIQUE_RECORDS_PROP, totalRecords));
    cfg.configs.add(String.format("%s=false", HoodieCompactionConfig.AUTO_CLEAN_PROP.key()));
    cfg.configs.add(String.format("%s=true", HoodieClusteringConfig.ASYNC_CLUSTERING_ENABLE_OPT_KEY.key()));
    HoodieDeltaStreamer ds = new HoodieDeltaStreamer(cfg, jsc);
    deltaStreamerTestRunner(ds, cfg, (r) -> {
      TestHelpers.assertAtLeastNCommits(2, tableBasePath, dfs);
      HoodieClusteringJob.Config scheduleClusteringConfig = buildHoodieClusteringUtilConfig(tableBasePath, null, true);
      scheduleClusteringConfig.runningMode = "scheduleAndExecute";
      HoodieClusteringJob scheduleClusteringJob = new HoodieClusteringJob(jsc, scheduleClusteringConfig);

      try {
        int result = scheduleClusteringJob.doScheduleAndCluster();
        if (result == 0) {
          LOG.info("Cluster success");
        } else {
          LOG.warn("Cluster failed");
          return false;
        }
      } catch (Exception e) {
        LOG.warn("ScheduleAndExecute clustering failed", e);
        return false;
      }

      HoodieTableMetaClient metaClient = HoodieTableMetaClient.builder().setConf(this.dfs.getConf()).setBasePath(tableBasePath).setLoadActiveTimelineOnLoad(true).build();
      int pendingReplaceSize = metaClient.getActiveTimeline().filterPendingReplaceTimeline().getInstants().toArray().length;
```

Review comment: Nice catch. Changed.
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380998#comment-17380998 ] ASF GitHub Bot commented on HUDI-2164: -- zhangyue19921010 commented on a change in pull request #3259: URL: https://github.com/apache/hudi/pull/3259#discussion_r670111402

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java
## @@ -171,4 +200,38 @@

```java
  @TestOnly
  public int doScheduleAndCluster() throws Exception {
    return this.doScheduleAndCluster(jsc);
  }

  public int doScheduleAndCluster(JavaSparkContext jsc) throws Exception {
    LOG.info("Step 1: Do schedule");
    String schemaStr = getSchemaFromLatestInstant();
    try (SparkRDDWriteClient client = UtilHelpers.createHoodieClient(jsc, cfg.basePath, schemaStr, cfg.parallelism, Option.empty(), props)) {
      Option instantTime;
      if (cfg.clusteringInstantTime != null) {
        client.scheduleClusteringAtInstant(cfg.clusteringInstantTime, Option.empty());
        instantTime = Option.of(cfg.clusteringInstantTime);
      } else {
        instantTime = client.scheduleClustering(Option.empty());
      }

      int result = instantTime.isPresent() ? 0 : -1;
```

Review comment: Actually, there are already doSchedule() and doCluster() functions. But if doScheduleAndCluster() called doSchedule() and doCluster() directly, it would start and stop a SparkRDDWriteClient twice, which is an expensive and unnecessary action — for example, it would start and stop the embedded timeline service twice. It is better to let the schedule action and the cluster action share a common SparkRDDWriteClient.

```
21/07/15 11:05:11 INFO EmbeddedTimelineService: Starting Timeline service !!
21/07/15 11:05:11 INFO EmbeddedTimelineService: Overriding hostIp to (localhost) found in spark-conf. It was null
21/07/15 11:05:11 INFO FileSystemViewManager: Creating View Manager with storage type :MEMORY
21/07/15 11:05:11 INFO FileSystemViewManager: Creating in-memory based Table View
21/07/15 11:05:11 INFO log: Logging initialized @4500ms to org.apache.hudi.org.eclipse.jetty.util.log.Slf4jLog
21/07/15 11:05:11 INFO Javalin: Starting Javalin ...
```
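The client-reuse argument from the review discussion above — run schedule and execute inside a single client lifecycle instead of opening the expensive client once per step — can be sketched independently of Hudi. Everything here is illustrative: `ExpensiveClient` is a hypothetical stand-in for SparkRDDWriteClient, and the instant-time string is a placeholder:

```java
import java.util.Optional;

// Hypothetical stand-in for SparkRDDWriteClient: constructing/closing it is the expensive part.
class ExpensiveClient implements AutoCloseable {
    static int opens = 0;  // counts expensive startups (e.g. the embedded timeline service)

    ExpensiveClient() { opens++; }

    Optional<String> schedule() { return Optional.of("20210715110511"); }  // placeholder instant
    boolean execute(String instant) { return instant != null; }

    @Override public void close() { }  // expensive shutdown
}

class ScheduleAndExecuteSketch {
    // Naive composition: each step opens its own client (two startups, two shutdowns).
    static int scheduleThenExecuteNaive() {
        Optional<String> instant;
        try (ExpensiveClient c = new ExpensiveClient()) {
            instant = c.schedule();
        }
        if (!instant.isPresent()) {
            return -1;
        }
        try (ExpensiveClient c = new ExpensiveClient()) {
            return c.execute(instant.get()) ? 0 : -1;
        }
    }

    // Shared-client version, mirroring the shape of doScheduleAndCluster():
    // one lifecycle covers both the schedule and the execute step.
    static int scheduleAndExecuteShared() {
        try (ExpensiveClient c = new ExpensiveClient()) {
            Optional<String> instant = c.schedule();
            if (!instant.isPresent()) {
                return -1;
            }
            return c.execute(instant.get()) ? 0 : -1;
        }
    }
}
```

Counting constructor calls makes the cost difference concrete: the naive version pays two startups per run, the shared version one.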
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380997#comment-17380997 ] ASF GitHub Bot commented on HUDI-2164: -- codecov-commenter edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878091821

# [Codecov](https://codecov.io/gh/apache/hudi/pull/3259?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3259](https://codecov.io/gh/apache/hudi/pull/3259?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b4aa786) into [master](https://codecov.io/gh/apache/hudi/commit/d024439764ceeca6366cb33689b729a1c69a6272?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (d024439) will **decrease** coverage by `0.00%`.
> The diff coverage is `0.00%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3259/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3259?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

```diff
@@            Coverage Diff             @@
##           master    #3259      +/-   ##
==========================================
- Coverage    2.83%    2.82%    -0.01%
  Complexity     85       85
==========================================
  Files         284      284
  Lines       11835    11859      +24
  Branches      982      988       +6
==========================================
  Hits          335      335
- Misses      11474    11498      +24
  Partials       26       26
```

| Flag | Coverage Δ | |
|---|---|---|
| hudiclient | `0.00% <ø> (ø)` | |
| hudisync | `4.85% <ø> (ø)` | |
| hudiutilities | `9.04% <0.00%> (-0.08%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3259?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/3259/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | `0.00% <0.00%> (ø)` | |

[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3259?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3259?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [d024439...b4aa786](https://codecov.io/gh/apache/hudi/pull/3259?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380996#comment-17380996 ] ASF GitHub Bot commented on HUDI-2164: -- zhangyue19921010 commented on a change in pull request #3259: URL: https://github.com/apache/hudi/pull/3259#discussion_r670110077

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java
## @@ -171,4 +200,38 @@

```java
  @TestOnly
  public int doScheduleAndCluster() throws Exception {
```

Review comment: Sure thing. Changed.
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380995#comment-17380995 ] ASF GitHub Bot commented on HUDI-2164: -- zhangyue19921010 commented on a change in pull request #3259: URL: https://github.com/apache/hudi/pull/3259#discussion_r670109986 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java ## @@ -121,17 +141,26 @@ public static void main(String[] args) { public int cluster(int retry) { this.fs = FSUtils.getFs(cfg.basePath, jsc.hadoopConfiguration()); int ret = UtilHelpers.retry(retry, () -> { - if (cfg.runSchedule) { -LOG.info("Do schedule"); -Option instantTime = doSchedule(jsc); -int result = instantTime.isPresent() ? 0 : -1; -if (result == 0) { - LOG.info("The schedule instant time is " + instantTime.get()); + String runningMode = cfg.runningMode == null ? "" : cfg.runningMode.toLowerCase(); + switch (runningMode) { +case SCHEDULE: { + LOG.info("Running Mode: [" + SCHEDULE + "]; Do schedule"); + Option instantTime = doSchedule(jsc); + int result = instantTime.isPresent() ? 0 : -1; + if (result == 0) { +LOG.info("The schedule instant time is " + instantTime.get()); + } + return result; +} +case SCHEDULE_AND_EXECUTE: { + LOG.info("Running Mode: [" + SCHEDULE_AND_EXECUTE + "]"); + return doScheduleAndCluster(jsc); +} +case EXECUTE: +default: { + LOG.info("Running Mode: [" + EXECUTE + "]; Do cluster"); Review comment: Nice catching. I changed the default behavior as `LOG.info("Unsupported running mode [" + runningMode + "], quit the job directly");` in case users set a wrong value of --mode like `--mode abcd`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
> Build cluster plan and execute this plan at once for HoodieClusteringJob
>
> Key: HUDI-2164
> URL: https://issues.apache.org/jira/browse/HUDI-2164
> Project: Apache Hudi
> Issue Type: Task
> Reporter: Yue Zhang
> Priority: Major
> Labels: pull-request-available
>
> For now, Hudi lets users submit a HoodieClusteringJob to build a clustering plan or execute a clustering plan through the --schedule or --instant-time config.
> If users want to trigger a clustering job, they have to:
> # Submit a HoodieClusteringJob to build a clustering plan through the --schedule config.
> # Copy the created clustering instant time from the log info.
> # Submit the HoodieClusteringJob again to execute this created clustering plan through the --instant-time config.
> The pain point is that there are too many steps when triggering a clustering, and the instant time has to be copied and pasted from the log file manually, so the process can't be automated.
>
> I just raised a PR to offer a new config named --mode, or -m for short:
> ||--mode||remarks||
> |execute|Execute a cluster plan at the given instant, which means --instant-time is needed here. Default value.|
> |schedule|Make a clustering plan.|
> |*scheduleAndExecute*|Make a cluster plan first and execute that plan immediately.|
> Now users can use --mode scheduleAndExecute to build a cluster plan and execute this plan at once using HoodieClusteringJob.
> -- This message was sent by Atlassian Jira (v8.3.4#803005)
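The three running modes in the table above boil down to a small dispatch, including the null-safe normalization and the unsupported-mode rejection discussed later in this thread. A minimal self-contained sketch — the method bodies are hypothetical stand-ins, not the actual HoodieClusteringJob code, which wires these branches to doSchedule/doScheduleAndCluster on Spark:

```java
// Simplified sketch of the --mode dispatch. SCHEDULE / EXECUTE /
// SCHEDULE_AND_EXECUTE mirror the constants in the PR discussion;
// doSchedule/doExecute here are stand-ins, not Hudi code.
public class ClusteringModeDispatch {
    static final String SCHEDULE = "schedule";
    static final String EXECUTE = "execute";
    static final String SCHEDULE_AND_EXECUTE = "scheduleandexecute";

    // Returns 0 on success, -1 on failure or an unsupported mode.
    static int cluster(String mode) {
        // Null-safe normalization: internal callers (e.g. tests) may not set a mode.
        String runningMode = mode == null ? "" : mode.toLowerCase();
        switch (runningMode) {
            case SCHEDULE:
                return doSchedule();
            case SCHEDULE_AND_EXECUTE:
                // Schedule first; only execute if a plan was actually created.
                return doSchedule() == 0 ? doExecute() : -1;
            case EXECUTE:
                return doExecute();
            default:
                // Unsupported mode: quit instead of silently executing.
                System.out.println("Unsupported running mode [" + runningMode + "], quit the job directly");
                return -1;
        }
    }

    static int doSchedule() { return 0; } // stand-in: pretend a plan was created
    static int doExecute()  { return 0; } // stand-in: pretend the plan ran

    public static void main(String[] args) {
        System.out.println(cluster("scheduleAndExecute")); // prints 0
        System.out.println(cluster("abcd"));               // rejected, prints -1
    }
}
```

Note how `--mode abcd` and a missing mode both fall through to the default branch rather than accidentally running an execute, matching the behavior change agreed in the review below.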
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380994#comment-17380994 ] ASF GitHub Bot commented on HUDI-2164: -- zhangyue19921010 commented on a change in pull request #3259: URL: https://github.com/apache/hudi/pull/3259#discussion_r670109400
## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java
## @@ -121,17 +141,26 @@ public static void main(String[] args) {
 public int cluster(int retry) {
 this.fs = FSUtils.getFs(cfg.basePath, jsc.hadoopConfiguration());
 int ret = UtilHelpers.retry(retry, () -> {
- if (cfg.runSchedule) {
- LOG.info("Do schedule");
- Option instantTime = doSchedule(jsc);
- int result = instantTime.isPresent() ? 0 : -1;
- if (result == 0) {
- LOG.info("The schedule instant time is " + instantTime.get());
+ String runningMode = cfg.runningMode == null ? "" : cfg.runningMode.toLowerCase();
Review comment: When developers call `public int cluster(int retry)` internally, like https://github.com/apache/hudi/blob/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52/hudi-utilities/src/test/java/org/apache/hudi/utilities/functional/TestHoodieDeltaStreamer.java#L1069 they may not set the running mode config, so we need to check this value to avoid an NPE.
-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380993#comment-17380993 ] ASF GitHub Bot commented on HUDI-2164: -- hudi-bot edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249 ## CI report: * 7ae050ed4b5ff0ce124a0ec580d51b3dfbb7f51a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=875) * b4aa7869d8343a16b225a81844e907fbee63b576 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=919) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HUDI-2164) Build cluster plan and execute this plan at once for HoodieClusteringJob
[ https://issues.apache.org/jira/browse/HUDI-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380992#comment-17380992 ] ASF GitHub Bot commented on HUDI-2164: -- hudi-bot edited a comment on pull request #3259: URL: https://github.com/apache/hudi/pull/3259#issuecomment-878086249 ## CI report: * 7ae050ed4b5ff0ce124a0ec580d51b3dfbb7f51a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=875) * b4aa7869d8343a16b225a81844e907fbee63b576 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HUDI-2059) When log exists in mor table, clustering is triggered. The query result shows that the update record in log is lost
[ https://issues.apache.org/jira/browse/HUDI-2059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380986#comment-17380986 ] ASF GitHub Bot commented on HUDI-2059: -- xiarixiaoyao closed pull request #3181: URL: https://github.com/apache/hudi/pull/3181 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
> When log exists in mor table, clustering is triggered. The query result shows that the update record in log is lost
>
> Key: HUDI-2059
> URL: https://issues.apache.org/jira/browse/HUDI-2059
> Project: Apache Hudi
> Issue Type: Bug
> Affects Versions: 0.8.0
> Environment: hadoop 3.1.1, spark 3.1.1/spark 2.4.5, hive 3.1.1
> Reporter: tao meng
> Assignee: tao meng
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.9.0
>
> When a log exists in a MOR table and clustering is triggered, the query result shows that the update record in the log is lost.
> The reason for this problem is that Hudi uses HoodieFileSliceReader to read table data and then do the clustering. HoodieFileSliceReader calls HoodieMergedLogRecordScanner.processNextRecord to merge update values and old values; when that function is called, the old value is kept and the update value is discarded, which is wrong.
> Test steps:
> // step1: create hudi mor table
> val df = spark.range(0, 1000).toDF("keyid")
> .withColumn("col3", expr("keyid"))
> .withColumn("age", lit(1))
> .withColumn("p", lit(2))
> df.write.format("hudi").
> option(DataSourceWriteOptions.TABLE_TYPE_OPT_KEY, DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL).
> option(PRECOMBINE_FIELD_OPT_KEY, "col3").
> option(RECORDKEY_FIELD_OPT_KEY, "keyid").
> option(PARTITIONPATH_FIELD_OPT_KEY, "p").
> option(DataSourceWriteOptions.OPERATION_OPT_KEY, "insert").
> option(HoodieWriteConfig.KEYGENERATOR_CLASS_PROP, classOf[org.apache.hudi.keygen.ComplexKeyGenerator].getName).
> option("hoodie.insert.shuffle.parallelism", "4").
> option("hoodie.upsert.shuffle.parallelism", "4").
> option(HoodieWriteConfig.TABLE_NAME, "hoodie_test")
> .mode(SaveMode.Overwrite).save(basePath)
> // step2: update age where keyid < 5 to produce log files
> val df1 = spark.range(0, 5).toDF("keyid")
> .withColumn("col3", expr("keyid"))
> .withColumn("age", lit(1 + 1000))
> .withColumn("p", lit(2))
> df1.write.format("hudi").
> option(DataSourceWriteOptions.TABLE_TYPE_OPT_KEY, DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL).
> option(PRECOMBINE_FIELD_OPT_KEY, "col3").
> option(RECORDKEY_FIELD_OPT_KEY, "keyid").
> option(PARTITIONPATH_FIELD_OPT_KEY, "p").
> option(DataSourceWriteOptions.OPERATION_OPT_KEY, "upsert").
> option(HoodieWriteConfig.KEYGENERATOR_CLASS_PROP, classOf[org.apache.hudi.keygen.ComplexKeyGenerator].getName).
> option("hoodie.insert.shuffle.parallelism", "4").
> option("hoodie.upsert.shuffle.parallelism", "4").
> option(HoodieWriteConfig.TABLE_NAME, "hoodie_test")
> .mode(SaveMode.Append).save(basePath)
> // step3: do cluster inline
> val df2 = spark.range(6, 10).toDF("keyid")
> .withColumn("col3", expr("keyid"))
> .withColumn("age", lit(1 + 2000))
> .withColumn("p", lit(2))
> df2.write.format("hudi").
> option(DataSourceWriteOptions.TABLE_TYPE_OPT_KEY, DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL).
> option(PRECOMBINE_FIELD_OPT_KEY, "col3").
> option(RECORDKEY_FIELD_OPT_KEY, "keyid").
> option(PARTITIONPATH_FIELD_OPT_KEY, "p").
> option(DataSourceWriteOptions.OPERATION_OPT_KEY, "upsert").
> option(HoodieWriteConfig.KEYGENERATOR_CLASS_PROP, classOf[org.apache.hudi.keygen.ComplexKeyGenerator].getName).
> option("hoodie.insert.shuffle.parallelism", "4").
> option("hoodie.upsert.shuffle.parallelism", "4").
> option("hoodie.parquet.small.file.limit", "0").
> option("hoodie.clustering.inline", "true").
> option("hoodie.clustering.inline.max.commits", "1").
> option("hoodie.clustering.plan.strategy.target.file.max.bytes", "1073741824").
> option("hoodie.clustering.plan.strategy.small.file.limit", "629145600").
> option("hoodie.clustering.plan.strategy.max.bytes.per.group", Long.MaxValue.toString)
> .option(HoodieWriteConfig.TABLE_NAME, "hoodie_test")
> .mode(SaveMode.Append).save(basePath)
> spark.read.format("hudi")
> .load(basePath).select("age").where("keyid = 0").show(100, false)
> +---+
> |age|
> +---+
> |1 |
> +---+
> The result is wrong, since we updated the value of age to 1001 at step 2.
>
> -- This message was sent by Atlassian Jira (v8.3.4#803005)
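The bug described above comes down to which side of the merge survives. A minimal sketch of the intended semantics, using a hypothetical record type rather than Hudi's HoodieRecordPayload API: when the scanner merges a base-file record with a log update for the same key, the update (the record with the greater ordering value) must be kept, not the stale base value:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the merge that goes wrong above (hypothetical Rec type, not
// Hudi's API): a log update and a base-file record share a key, and the
// correct processNextRecord behavior keeps the update.
public class MergeSketch {
    record Rec(long keyid, long orderingVal, int age) {}

    // Greater (or equal, i.e. later) ordering value wins the merge.
    static Rec merge(Rec oldRec, Rec update) {
        return update.orderingVal() >= oldRec.orderingVal() ? update : oldRec;
    }

    public static void main(String[] args) {
        Map<Long, Rec> merged = new HashMap<>();
        merged.put(0L, new Rec(0, 1, 1));            // base file: age = 1
        Rec update = new Rec(0, 2, 1001);            // log update: age = 1001
        merged.merge(0L, update, MergeSketch::merge); // remap (old, update)
        System.out.println(merged.get(0L).age());     // prints 1001, not the stale 1
    }
}
```

The reported bug is exactly the reversed choice: the old value survived the merge, so the query above returned age = 1 instead of 1001.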
[jira] [Commented] (HUDI-2059) When log exists in mor table, clustering is triggered. The query result shows that the update record in log is lost
[ https://issues.apache.org/jira/browse/HUDI-2059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380985#comment-17380985 ] ASF GitHub Bot commented on HUDI-2059: -- xiarixiaoyao commented on pull request #3181: URL: https://github.com/apache/hudi/pull/3181#issuecomment-880359721 @garyli1019 thanks for your review. Closing this PR, since HUDI-2170 solved this problem. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HUDI-2139) MergeInto MOR Table May Result InCorrect Result
[ https://issues.apache.org/jira/browse/HUDI-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380982#comment-17380982 ] ASF GitHub Bot commented on HUDI-2139: -- pengzhiwei2018 commented on a change in pull request #3230: URL: https://github.com/apache/hudi/pull/3230#discussion_r670098428 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java ## @@ -189,8 +191,14 @@ protected boolean isUpdateRecord(HoodieRecord hoodieRecord) { private Option getIndexedRecord(HoodieRecord hoodieRecord) { Option> recordMetadata = hoodieRecord.getData().getMetadata(); try { - Option avroRecord = hoodieRecord.getData().getInsertValue(tableSchema, - config.getProps()); + // Pass the isUpdateRecord to the props for HoodieRecordPayload to judge + // Whether it is a update or insert record. + boolean isUpdateRecord = isUpdateRecord(hoodieRecord); Review comment: Here I just pass the `isUpdateRecord` flag to the `ExpressionPayload`. So it can know current record is a matched record or not matched record. The matched record will execute the match-clause in merge-into, while the not-matched record will execute the not-match-clause. If we do not have such information, the result of merge into will be incorrect. ## File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/payload/ExpressionPayload.scala ## @@ -126,48 +140,62 @@ class ExpressionPayload(record: GenericRecord, } } + /** + * Process the not-matched record. Test if the record matched any of insert-conditions, + * if matched then return the result of insert-assignment. Or else return a + * {@link HoodieWriteHandle.IGNORE_RECORD} which will be ignored by HoodieWriteHandle. + * + * @param inputRecord The input record to process. + * @param properties The properties. + * @return The result of the record to insert. 
+ */ + private def processNotMatchedRecord(inputRecord: SqlTypedRecord, properties: Properties): HOption[IndexedRecord] = { +val insertConditionAndAssignmentsText = + properties.get(ExpressionPayload.PAYLOAD_INSERT_CONDITION_AND_ASSIGNMENTS) +// Get the evaluator for each condition and insert assignment. +initWriteSchemaIfNeed(properties) +val insertConditionAndAssignments = + ExpressionPayload.getEvaluator(insertConditionAndAssignmentsText.toString, writeSchema) +var resultRecordOpt: HOption[IndexedRecord] = null +for ((conditionEvaluator, assignmentEvaluator) <- insertConditionAndAssignments + if resultRecordOpt == null) { + val conditionVal = evaluate(conditionEvaluator, inputRecord).head.asInstanceOf[Boolean] + // If matched the insert condition then execute the assignment expressions to compute the + // result record. We will return the first matched record. + if (conditionVal) { +val results = evaluate(assignmentEvaluator, inputRecord) +resultRecordOpt = HOption.of(convertToRecord(results, writeSchema)) + } +} +if (resultRecordOpt != null) { + resultRecordOpt +} else { + // If there is no condition matched, just filter this record. + // Here we return a IGNORE_RECORD, HoodieCreateHandle will not handle it. + HOption.of(HoodieWriteHandle.IGNORE_RECORD) +} + } + override def getInsertValue(schema: Schema, properties: Properties): HOption[IndexedRecord] = { val incomingRecord = bytesToAvro(recordBytes, schema) if (isDeleteRecord(incomingRecord)) { HOption.empty[IndexedRecord]() } else { - val insertConditionAndAssignmentsText = - properties.get(ExpressionPayload.PAYLOAD_INSERT_CONDITION_AND_ASSIGNMENTS) - // Process insert val sqlTypedRecord = new SqlTypedRecord(incomingRecord) - // Get the evaluator for each condition and insert assignment. 
- initWriteSchemaIfNeed(properties) - val insertConditionAndAssignments = - ExpressionPayload.getEvaluator(insertConditionAndAssignmentsText.toString, writeSchema) - var resultRecordOpt: HOption[IndexedRecord] = null - for ((conditionEvaluator, assignmentEvaluator) <- insertConditionAndAssignments - if resultRecordOpt == null) { -val conditionVal = evaluate(conditionEvaluator, sqlTypedRecord).head.asInstanceOf[Boolean] -// If matched the insert condition then execute the assignment expressions to compute the -// result record. We will return the first matched record. -if (conditionVal) { - val results = evaluate(assignmentEvaluator, sqlTypedRecord) - resultRecordOpt = HOption.of(convertToRecord(results, writeSchema)) -} - } - - // Process delete for MOR - if (resultRecordOpt == null && isMORTable(properties)) { -val deleteConditionT
[GitHub] [hudi] pengzhiwei2018 commented on a change in pull request #3230: [HUDI-2139] MergeInto MOR Table May Result InCorrect Result
pengzhiwei2018 commented on a change in pull request #3230: URL: https://github.com/apache/hudi/pull/3230#discussion_r670098428

## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java ##

@@ -189,8 +191,14 @@ protected boolean isUpdateRecord(HoodieRecord hoodieRecord) {
   private Option getIndexedRecord(HoodieRecord hoodieRecord) {
     Option> recordMetadata = hoodieRecord.getData().getMetadata();
     try {
-      Option avroRecord = hoodieRecord.getData().getInsertValue(tableSchema,
-          config.getProps());
+      // Pass the isUpdateRecord to the props for HoodieRecordPayload to judge
+      // whether it is an update or an insert record.
+      boolean isUpdateRecord = isUpdateRecord(hoodieRecord);

Review comment: Here I just pass the `isUpdateRecord` flag to the `ExpressionPayload`, so it can tell whether the current record is a matched or a not-matched record. A matched record executes the matched clause of merge-into, while a not-matched record executes the not-matched clause. Without this information, the result of merge-into would be incorrect.

## File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/payload/ExpressionPayload.scala ##

@@ -126,48 +140,62 @@ class ExpressionPayload(record: GenericRecord,
     }
   }

+  /**
+   * Process the not-matched record. Test whether the record matches any of the
+   * insert conditions; if it does, return the result of the insert assignment.
+   * Otherwise return a {@link HoodieWriteHandle.IGNORE_RECORD}, which will be
+   * ignored by HoodieWriteHandle.
+   *
+   * @param inputRecord The input record to process.
+   * @param properties  The properties.
+   * @return The result of the record to insert.
+   */
+  private def processNotMatchedRecord(inputRecord: SqlTypedRecord, properties: Properties): HOption[IndexedRecord] = {
+    val insertConditionAndAssignmentsText =
+      properties.get(ExpressionPayload.PAYLOAD_INSERT_CONDITION_AND_ASSIGNMENTS)
+    // Get the evaluator for each condition and insert assignment.
+    initWriteSchemaIfNeed(properties)
+    val insertConditionAndAssignments =
+      ExpressionPayload.getEvaluator(insertConditionAndAssignmentsText.toString, writeSchema)
+    var resultRecordOpt: HOption[IndexedRecord] = null
+    for ((conditionEvaluator, assignmentEvaluator) <- insertConditionAndAssignments
+         if resultRecordOpt == null) {
+      val conditionVal = evaluate(conditionEvaluator, inputRecord).head.asInstanceOf[Boolean]
+      // If the insert condition matched, execute the assignment expressions to compute the
+      // result record. We return the first matched record.
+      if (conditionVal) {
+        val results = evaluate(assignmentEvaluator, inputRecord)
+        resultRecordOpt = HOption.of(convertToRecord(results, writeSchema))
+      }
+    }
+    if (resultRecordOpt != null) {
+      resultRecordOpt
+    } else {
+      // If no condition matched, just filter this record out.
+      // Here we return an IGNORE_RECORD, which HoodieCreateHandle will not handle.
+      HOption.of(HoodieWriteHandle.IGNORE_RECORD)
+    }
+  }
+
   override def getInsertValue(schema: Schema, properties: Properties): HOption[IndexedRecord] = {
     val incomingRecord = bytesToAvro(recordBytes, schema)
     if (isDeleteRecord(incomingRecord)) {
       HOption.empty[IndexedRecord]()
     } else {
-      val insertConditionAndAssignmentsText =
-        properties.get(ExpressionPayload.PAYLOAD_INSERT_CONDITION_AND_ASSIGNMENTS)
       // Process insert
       val sqlTypedRecord = new SqlTypedRecord(incomingRecord)
-      // Get the evaluator for each condition and insert assignment.
-      initWriteSchemaIfNeed(properties)
-      val insertConditionAndAssignments =
-        ExpressionPayload.getEvaluator(insertConditionAndAssignmentsText.toString, writeSchema)
-      var resultRecordOpt: HOption[IndexedRecord] = null
-      for ((conditionEvaluator, assignmentEvaluator) <- insertConditionAndAssignments
-           if resultRecordOpt == null) {
-        val conditionVal = evaluate(conditionEvaluator, sqlTypedRecord).head.asInstanceOf[Boolean]
-        // If matched the insert condition then execute the assignment expressions to compute the
-        // result record. We will return the first matched record.
-        if (conditionVal) {
-          val results = evaluate(assignmentEvaluator, sqlTypedRecord)
-          resultRecordOpt = HOption.of(convertToRecord(results, writeSchema))
-        }
-      }
-
-      // Process delete for MOR
-      if (resultRecordOpt == null && isMORTable(properties)) {
-        val deleteConditionText = properties.get(ExpressionPayload.PAYLOAD_DELETE_CONDITION)
-        if (deleteConditionText != null) {
-          val deleteCondition = getEvaluator(deleteConditionText.toString, writeSchema).head._1
-          val deleteConditionVal = evaluate
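The review discussion above describes a dispatch: a flag carried in the write properties tells the payload whether to run the merge-into matched path or the not-matched path. A minimal, self-contained sketch of that idea follows; the property key and class name are illustrative assumptions for this sketch, not the actual Hudi API.

```java
import java.util.Properties;

// Illustrative sketch: a payload-like class that picks a merge path based on
// a flag the write handle has stored in the per-record properties.
// The property key below is hypothetical; the real logic lives in
// HoodieAppendHandle / ExpressionPayload.
public class MergeDispatchSketch {
  // Assumed key; Hudi uses its own internal property name for this flag.
  static final String IS_UPDATE_KEY = "sketch.is.update.record";

  public static String process(String record, Properties props) {
    boolean isUpdateRecord =
        Boolean.parseBoolean(props.getProperty(IS_UPDATE_KEY, "false"));
    if (isUpdateRecord) {
      // Matched record: would run the MATCHED clause assignments of MERGE INTO.
      return "matched:" + record;
    } else {
      // Not-matched record: would run the NOT MATCHED clause assignments.
      return "notMatched:" + record;
    }
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(IS_UPDATE_KEY, "true");
    System.out.println(process("r1", props));            // matched:r1
    System.out.println(process("r2", new Properties())); // notMatched:r2
  }
}
```

Without the flag, both inserts and updates would look identical to the payload, which is why the review comment says the merge-into result would otherwise be incorrect.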
[jira] [Commented] (HUDI-2086) redo the logical of mor_incremental_view for hive
[ https://issues.apache.org/jira/browse/HUDI-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380980#comment-17380980 ] ASF GitHub Bot commented on HUDI-2086: -- xiarixiaoyao opened a new pull request #3203: URL: https://github.com/apache/hudi/pull/3203

## *Tips*
- *Thank you very much for contributing to Apache Hudi.*
- *Please review https://hudi.apache.org/contributing.html before opening a pull request.*

## What is the purpose of the pull request

Redo the logic of mor_incremental_view for Hive, to fix some bugs in the MOR incremental view for Hive/Spark SQL. Purpose of the pull request:
1) support reading the latest incremental data that is stored in log files
2) support reading incremental data from before a replacecommit
3) support reading file groups that contain only log files
4) keep the logic of mor_incremental_view consistent with the Spark datasource

## Brief change log

*(for example:)*
- *Modify AnnotationLocation checkstyle rule in checkstyle.xml*

## Verify this pull request

New UTs added.

## Committer checklist
- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> redo the logical of mor_incremental_view for hive
> -
>
> Key: HUDI-2086
> URL: https://issues.apache.org/jira/browse/HUDI-2086
> Project: Apache Hudi
> Issue Type: Bug
> Components: Hive Integration
> Environment: spark3.1.1
> hive3.1.1
> hadoop3.1.1
> os: suse
> Reporter: tao meng
> Assignee: tao meng
> Priority: Major
> Labels: pull-request-available
>
> Currently there are some problems with mor_incremental_view for Hive.
> For example:
> 1) *Hudi cannot read the latest incremental data that is stored in log files.*
> Consider: create a MOR table with bulk_insert, then do an upsert on that table.
> Now we want to query the latest incremental data via Hive/Spark SQL, but the
> latest incremental data is stored in log files, so the query returns nothing.
> step1: prepare data
> val df = spark.sparkContext.parallelize(0 to 20, 2).map(x => testCase(x, x+"jack", Random.nextInt(2))).toDF()
>   .withColumn("col3", expr("keyid + 3000"))
>   .withColumn("p", lit(1))
> step2: do bulk_insert
> mergePartitionTable(df, 4, "default", "inc", tableType = DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL, op = "bulk_insert")
> step3: do upsert
> mergePartitionTable(df, 4, "default", "inc", tableType = DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL, op = "upsert")
> step4: check the latest commit time and run the query
> spark.sql("set hoodie.inc.consume.mode=INCREMENTAL")
> spark.sql("set hoodie.inc.consume.max.commits=1")
> spark.sql("set hoodie.inc.consume.start.timestamp=20210628103935")
> spark.sql("select keyid, col3 from inc_rt where `_hoodie_commit_time` > '20210628103935' order by keyid").show(100, false)
> +-----+----+
> |keyid|col3|
> +-----+----+
> +-----+----+
>
> 2) *If we do insert_overwrite/insert_overwrite_table on a Hudi MOR table, the
> incremental query result is wrong when we query the data from before the
> insert_overwrite/insert_overwrite_table.*
> step1: do bulk_insert
> mergePartitionTable(df, 4, "default", "overInc", tableType = DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL, op = "bulk_insert")
> now the commits are: [20210628160614.deltacommit]
> step2: do insert_overwrite_table
> mergePartitionTable(df, 4, "default", "overInc", tableType = DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL, op = "insert_overwrite_table")
> now the commits are: [20210628160614.deltacommit, 20210628160923.replacecommit]
> step3: query the data from before the insert_overwrite_table
> spark.sql("set hoodie.overInc.consume.mode=INCREMENTAL")
> spark.sql("set hoodie.overInc.consume.max.commits=1")
> spark.sql("set hoodie.overInc.consume.start.timestamp=0")
> spark.sql("select keyid, col3 from overInc_rt where `_hoodie_commit_time` > '0' order by keyid").show(100, false)
> +-----+----+
> |keyid|col3|
> +-----+----+
> +-----+----+
>
> 3) *Hive/Presto/Flink cannot read file groups that contain only log files.*
> When we use the HBase/in-memory index, a MOR table produces log files instead
> of parquet files, but currently Hive/Presto cannot
[GitHub] [hudi] xiarixiaoyao opened a new pull request #3203: [HUDI-2086] Redo the logical of mor_incremental_view for hive
xiarixiaoyao opened a new pull request #3203: URL: https://github.com/apache/hudi/pull/3203

## *Tips*
- *Thank you very much for contributing to Apache Hudi.*
- *Please review https://hudi.apache.org/contributing.html before opening a pull request.*

## What is the purpose of the pull request

Redo the logic of mor_incremental_view for Hive, to fix some bugs in the MOR incremental view for Hive/Spark SQL. Purpose of the pull request:
1) support reading the latest incremental data that is stored in log files
2) support reading incremental data from before a replacecommit
3) support reading file groups that contain only log files
4) keep the logic of mor_incremental_view consistent with the Spark datasource

## Brief change log

*(for example:)*
- *Modify AnnotationLocation checkstyle rule in checkstyle.xml*

## Verify this pull request

New UTs added.

## Committer checklist
- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HUDI-2086) redo the logical of mor_incremental_view for hive
[ https://issues.apache.org/jira/browse/HUDI-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380979#comment-17380979 ] ASF GitHub Bot commented on HUDI-2086: -- xiarixiaoyao closed pull request #3203: URL: https://github.com/apache/hudi/pull/3203 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> redo the logical of mor_incremental_view for hive
> -
>
> Key: HUDI-2086
> URL: https://issues.apache.org/jira/browse/HUDI-2086
> Project: Apache Hudi
> Issue Type: Bug
> Components: Hive Integration
> Environment: spark3.1.1
> hive3.1.1
> hadoop3.1.1
> os: suse
> Reporter: tao meng
> Assignee: tao meng
> Priority: Major
> Labels: pull-request-available
>
> Currently there are some problems with mor_incremental_view for Hive.
> For example:
> 1) *Hudi cannot read the latest incremental data that is stored in log files.*
> Consider: create a MOR table with bulk_insert, then do an upsert on that table.
> Now we want to query the latest incremental data via Hive/Spark SQL, but the
> latest incremental data is stored in log files, so the query returns nothing.
> step1: prepare data
> val df = spark.sparkContext.parallelize(0 to 20, 2).map(x => testCase(x, x+"jack", Random.nextInt(2))).toDF()
>   .withColumn("col3", expr("keyid + 3000"))
>   .withColumn("p", lit(1))
> step2: do bulk_insert
> mergePartitionTable(df, 4, "default", "inc", tableType = DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL, op = "bulk_insert")
> step3: do upsert
> mergePartitionTable(df, 4, "default", "inc", tableType = DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL, op = "upsert")
> step4: check the latest commit time and run the query
> spark.sql("set hoodie.inc.consume.mode=INCREMENTAL")
> spark.sql("set hoodie.inc.consume.max.commits=1")
> spark.sql("set hoodie.inc.consume.start.timestamp=20210628103935")
> spark.sql("select keyid, col3 from inc_rt where `_hoodie_commit_time` > '20210628103935' order by keyid").show(100, false)
> +-----+----+
> |keyid|col3|
> +-----+----+
> +-----+----+
>
> 2) *If we do insert_overwrite/insert_overwrite_table on a Hudi MOR table, the
> incremental query result is wrong when we query the data from before the
> insert_overwrite/insert_overwrite_table.*
> step1: do bulk_insert
> mergePartitionTable(df, 4, "default", "overInc", tableType = DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL, op = "bulk_insert")
> now the commits are: [20210628160614.deltacommit]
> step2: do insert_overwrite_table
> mergePartitionTable(df, 4, "default", "overInc", tableType = DataSourceWriteOptions.MOR_TABLE_TYPE_OPT_VAL, op = "insert_overwrite_table")
> now the commits are: [20210628160614.deltacommit, 20210628160923.replacecommit]
> step3: query the data from before the insert_overwrite_table
> spark.sql("set hoodie.overInc.consume.mode=INCREMENTAL")
> spark.sql("set hoodie.overInc.consume.max.commits=1")
> spark.sql("set hoodie.overInc.consume.start.timestamp=0")
> spark.sql("select keyid, col3 from overInc_rt where `_hoodie_commit_time` > '0' order by keyid").show(100, false)
> +-----+----+
> |keyid|col3|
> +-----+----+
> +-----+----+
>
> 3) *Hive/Presto/Flink cannot read file groups that contain only log files.*
> When we use the HBase/in-memory index, a MOR table produces log files instead
> of parquet files, but currently Hive/Presto cannot read those files, since
> those files are log files.
> *HUDI-2048* mentions this problem.
>
> However, when we use the Spark datasource to execute an incremental query,
> none of the problems above occur. It is necessary to keep the logic of
> mor_incremental_view for Hive consistent with the Spark datasource.
> We redo the logic of mor_incremental_view for Hive to solve the problems
> above and keep it consistent with the Spark datasource.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] xiarixiaoyao closed pull request #3203: [HUDI-2086] Redo the logical of mor_incremental_view for hive
xiarixiaoyao closed pull request #3203: URL: https://github.com/apache/hudi/pull/3203 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HUDI-1676) Support SQL with spark3
[ https://issues.apache.org/jira/browse/HUDI-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380977#comment-17380977 ] ASF GitHub Bot commented on HUDI-1676: -- xiarixiaoyao closed pull request #2761: URL: https://github.com/apache/hudi/pull/2761 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Support SQL with spark3
> ---
>
> Key: HUDI-1676
> URL: https://issues.apache.org/jira/browse/HUDI-1676
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: Spark Integration
> Affects Versions: 0.9.0
> Reporter: tao meng
> Assignee: tao meng
> Priority: Major
> Labels: pull-request-available, sev:normal
> Fix For: 0.9.0
>
> 1. Support CTAS for Spark 3
> 3. Support INSERT for Spark 3
> 4. Support MERGE, UPDATE, and DELETE without a RowKey constraint for Spark 3
> 5. Support DataSourceV2 for Spark 3
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-2029) Implement compression for DiskBasedMap in Spillable Map
[ https://issues.apache.org/jira/browse/HUDI-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380978#comment-17380978 ] ASF GitHub Bot commented on HUDI-2029: -- nsivabalan merged pull request #3128: URL: https://github.com/apache/hudi/pull/3128 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Implement compression for DiskBasedMap in Spillable Map
> ---
>
> Key: HUDI-2029
> URL: https://issues.apache.org/jira/browse/HUDI-2029
> Project: Apache Hudi
> Issue Type: Improvement
> Components: Performance
> Reporter: Rajesh Mahindra
> Assignee: Rajesh Mahindra
> Priority: Major
> Labels: pull-request-available
>
> Implement compression for DiskBasedMap in Spillable Map.
> Without compression, DiskBasedMap causes more spilling to disk than RocksDB.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch master updated (75040ee -> d024439)
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git.

from 75040ee [HUDI-2149] Ensure and Audit docs for every configuration class in the codebase (#3272)
add  d024439 [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map (#3128)

No new revisions were added by this update.

Summary of changes:
 .../org/apache/hudi/config/HoodieWriteConfig.java  | 14 
 .../java/org/apache/hudi/io/HoodieMergeHandle.java |  2 +-
 .../common/util/collection/BitCaskDiskMap.java     | 93 ++
 .../util/collection/ExternalSpillableMap.java      | 10 ++-
 .../common/util/collection/LazyFileIterable.java   |  9 ++-
 .../common/util/collection/TestBitCaskDiskMap.java | 40 ++
 .../util/collection/TestExternalSpillableMap.java  | 53 +++-
 7 files changed, 167 insertions(+), 54 deletions(-)
[GitHub] [hudi] nsivabalan merged pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map
nsivabalan merged pull request #3128: URL: https://github.com/apache/hudi/pull/3128 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] xiarixiaoyao closed pull request #2761: [HUDI-1676] Support SQL with spark3
xiarixiaoyao closed pull request #2761: URL: https://github.com/apache/hudi/pull/2761 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HUDI-2029) Implement compression for DiskBasedMap in Spillable Map
[ https://issues.apache.org/jira/browse/HUDI-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380976#comment-17380976 ] ASF GitHub Bot commented on HUDI-2029: -- nsivabalan commented on a change in pull request #3128: URL: https://github.com/apache/hudi/pull/3128#discussion_r669909750

## File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/BitCaskDiskMap.java ##

@@ -188,21 +204,25 @@ public R get(Object key) {
   }

   private R get(ValueMetadata entry) {
-    return get(entry, getRandomAccessFile());
+    return get(entry, getRandomAccessFile(), isCompressionEnabled);
   }

-  public static R get(ValueMetadata entry, RandomAccessFile file) {
+  public static R get(ValueMetadata entry, RandomAccessFile file, boolean isCompressionEnabled) {
     try {
-      return SerializationUtils
-          .deserialize(SpillableMapUtils.readBytesFromDisk(file, entry.getOffsetOfValue(), entry.getSizeOfValue()));
+      byte[] bytesFromDisk = SpillableMapUtils.readBytesFromDisk(file, entry.getOffsetOfValue(), entry.getSizeOfValue());
+      if (isCompressionEnabled) {
+        return SerializationUtils.deserialize(DISK_COMPRESSION_REF.get().decompressBytes(bytesFromDisk));
+      }

Review comment: Not required to fix in this patch, but something to keep in mind: it would be good to have an explicit else block for line 216. This "if" block is just one line, so it is fine here. But if it were a large "if" block, a reader might wonder whether some code path fails to return from within the "if" block, since there is a return outside of it. So whenever you write "if"/"else" branches, try to add an explicit else block.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Implement compression for DiskBasedMap in Spillable Map
> ---
>
> Key: HUDI-2029
> URL: https://issues.apache.org/jira/browse/HUDI-2029
> Project: Apache Hudi
> Issue Type: Improvement
> Components: Performance
> Reporter: Rajesh Mahindra
> Assignee: Rajesh Mahindra
> Priority: Major
> Labels: pull-request-available
>
> Implement compression for DiskBasedMap in Spillable Map.
> Without compression, DiskBasedMap causes more spilling to disk than RocksDB.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-2170) Always choose the latest record for HoodieRecordPayload
[ https://issues.apache.org/jira/browse/HUDI-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380974#comment-17380974 ] ASF GitHub Bot commented on HUDI-2170: -- nsivabalan commented on a change in pull request #3267: URL: https://github.com/apache/hudi/pull/3267#discussion_r670095143

## File path: hudi-common/src/main/java/org/apache/hudi/common/model/OverwriteWithLatestAvroPayload.java ##

@@ -49,7 +49,7 @@ public OverwriteWithLatestAvroPayload(Option record) {
   @Override
   public OverwriteWithLatestAvroPayload preCombine(OverwriteWithLatestAvroPayload another) {
     // pick the payload with greatest ordering value
-    if (another.orderingVal.compareTo(orderingVal) > 0) {
+    if (another.orderingVal.compareTo(orderingVal) >= 0) {

Review comment: Looks good to me. @vinothchandar: Can you think of any particular reason why it was done this way?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Always choose the latest record for HoodieRecordPayload
> ---
>
> Key: HUDI-2170
> URL: https://issues.apache.org/jira/browse/HUDI-2170
> Project: Apache Hudi
> Issue Type: Improvement
> Components: Common Core
> Reporter: Danny Chen
> Assignee: Danny Chen
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.9.0
>
> Now in {{OverwriteWithLatestAvroPayload.preCombine}}, we still choose the old
> record when the new record has the same preCombine field value as the old one;
> it is more natural to keep the new incoming record instead. The
> {{DefaultHoodieRecordPayload.combineAndGetUpdateValue}} method already does
> that.
> See issue: https://github.com/apache/hudi/issues/3266.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] nsivabalan commented on a change in pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map
nsivabalan commented on a change in pull request #3128: URL: https://github.com/apache/hudi/pull/3128#discussion_r669909750

## File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/BitCaskDiskMap.java ##

@@ -188,21 +204,25 @@ public R get(Object key) {
   }

   private R get(ValueMetadata entry) {
-    return get(entry, getRandomAccessFile());
+    return get(entry, getRandomAccessFile(), isCompressionEnabled);
   }

-  public static R get(ValueMetadata entry, RandomAccessFile file) {
+  public static R get(ValueMetadata entry, RandomAccessFile file, boolean isCompressionEnabled) {
     try {
-      return SerializationUtils
-          .deserialize(SpillableMapUtils.readBytesFromDisk(file, entry.getOffsetOfValue(), entry.getSizeOfValue()));
+      byte[] bytesFromDisk = SpillableMapUtils.readBytesFromDisk(file, entry.getOffsetOfValue(), entry.getSizeOfValue());
+      if (isCompressionEnabled) {
+        return SerializationUtils.deserialize(DISK_COMPRESSION_REF.get().decompressBytes(bytesFromDisk));
+      }

Review comment: Not required to fix in this patch, but something to keep in mind: it would be good to have an explicit else block for line 216. This "if" block is just one line, so it is fine here. But if it were a large "if" block, a reader might wonder whether some code path fails to return from within the "if" block, since there is a return outside of it. So whenever you write "if"/"else" branches, try to add an explicit else block.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
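The change quoted above decompresses the on-disk bytes before deserializing, but only when compression is enabled. A rough standalone illustration of that read path follows, using java.util.zip as a stand-in for Hudi's compression utility; the class and method names below are assumptions for the sketch, not the Hudi API.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Sketch of the BitCaskDiskMap read path with optional compression.
// java.util.zip is used here purely as an illustrative codec.
public class CompressedReadSketch {

  // Compress a byte array (stand-in for the compression done at write time).
  public static byte[] compress(byte[] input) throws Exception {
    Deflater deflater = new Deflater();
    deflater.setInput(input);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[256];
    while (!deflater.finished()) {
      out.write(buf, 0, deflater.deflate(buf));
    }
    deflater.end();
    return out.toByteArray();
  }

  // Decompress a byte array (stand-in for decompressBytes in the diff).
  public static byte[] decompress(byte[] input) throws Exception {
    Inflater inflater = new Inflater();
    inflater.setInput(input);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[256];
    while (!inflater.finished()) {
      out.write(buf, 0, inflater.inflate(buf));
    }
    inflater.end();
    return out.toByteArray();
  }

  // Mirrors the structure of get(): decompress only when the flag is set,
  // with an explicit else branch as the review comment suggests.
  public static byte[] readValue(byte[] bytesFromDisk, boolean isCompressionEnabled) throws Exception {
    if (isCompressionEnabled) {
      return decompress(bytesFromDisk);
    } else {
      return bytesFromDisk;
    }
  }

  public static void main(String[] args) throws Exception {
    byte[] raw = "hello spillable map".getBytes("UTF-8");
    byte[] stored = compress(raw);
    System.out.println(new String(readValue(stored, true), "UTF-8"));
  }
}
```

Note the explicit else branch in `readValue`: it makes both return paths visible at a glance, which is exactly the style point raised in the review comment.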
[GitHub] [hudi] nsivabalan commented on a change in pull request #3267: [HUDI-2170] Always choose the latest record for HoodieRecordPayload
nsivabalan commented on a change in pull request #3267: URL: https://github.com/apache/hudi/pull/3267#discussion_r670095143

## File path: hudi-common/src/main/java/org/apache/hudi/common/model/OverwriteWithLatestAvroPayload.java ##

@@ -49,7 +49,7 @@ public OverwriteWithLatestAvroPayload(Option record) {
   @Override
   public OverwriteWithLatestAvroPayload preCombine(OverwriteWithLatestAvroPayload another) {
     // pick the payload with greatest ordering value
-    if (another.orderingVal.compareTo(orderingVal) > 0) {
+    if (another.orderingVal.compareTo(orderingVal) >= 0) {

Review comment: Looks good to me. @vinothchandar: Can you think of any particular reason why it was done this way?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
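The effect of the one-character change above (`>` to `>=`) is easiest to see on a tie: with equal ordering values, the incoming record now wins. A simplified sketch of that tie-breaking behavior, using plain strings and longs instead of the actual Avro payload types:

```java
// Simplified stand-in for OverwriteWithLatestAvroPayload.preCombine.
// 'another' plays the role of the incoming record being combined with
// the already-held 'current' record; real payloads carry Avro bytes.
public class PreCombineSketch {

  public static String preCombine(String current, long currentOrderingVal,
                                  String another, long anotherOrderingVal) {
    // With '>=' (instead of '>'), the incoming record also wins when the
    // ordering values are equal, so equal preCombine values keep the
    // latest write rather than the oldest.
    if (anotherOrderingVal >= currentOrderingVal) {
      return another;
    } else {
      return current;
    }
  }

  public static void main(String[] args) {
    // Same ordering value: the incoming record wins the tie.
    System.out.println(preCombine("old", 5L, "new", 5L));
  }
}
```

With the old `>` comparison, `preCombine("old", 5L, "new", 5L)` would have kept "old", which is the surprising behavior reported in issue #3266.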
[GitHub] [hudi] danny0405 commented on issue #3266: [SUPPORT] Upsert data with an identical record key and pre-combine field
danny0405 commented on issue #3266: URL: https://github.com/apache/hudi/issues/3266#issuecomment-880349759 Yes, the PR would be merged soon once the CI tests pass. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HUDI-2029) Implement compression for DiskBasedMap in Spillable Map
[ https://issues.apache.org/jira/browse/HUDI-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380973#comment-17380973 ] ASF GitHub Bot commented on HUDI-2029: -- rmahindra123 commented on a change in pull request #3128: URL: https://github.com/apache/hudi/pull/3128#discussion_r670094851

## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java ##

@@ -200,7 +200,7 @@ protected void initializeIncomingRecordsMap() {
     LOG.info("MaxMemoryPerPartitionMerge => " + memoryForMerge);
     this.keyToNewRecords = new ExternalSpillableMap<>(memoryForMerge, config.getSpillableMapBasePath(),
         new DefaultSizeEstimator(), new HoodieRecordSizeEstimator(tableSchema),
-        config.getSpillableDiskMapType());
+        config.getSpillableDiskMapType(), config.isBitCaskDiskMapCompressionEnabled());

Review comment: Good point, will be done in a follow-up PR: https://issues.apache.org/jira/browse/HUDI-2044

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Implement compression for DiskBasedMap in Spillable Map
> ---
>
> Key: HUDI-2029
> URL: https://issues.apache.org/jira/browse/HUDI-2029
> Project: Apache Hudi
> Issue Type: Improvement
> Components: Performance
> Reporter: Rajesh Mahindra
> Assignee: Rajesh Mahindra
> Priority: Major
> Labels: pull-request-available
>
> Implement compression for DiskBasedMap in Spillable Map.
> Without compression, DiskBasedMap causes more spilling to disk than RocksDB.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HUDI-2044) Extend support for rockDB and compression for Spillable map to all consumers of ExternalSpillableMap
[ https://issues.apache.org/jira/browse/HUDI-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-2044: -- Summary: Extend support for rockDB and compression for Spillable map to all consumers of ExternalSpillableMap (was: Extend support for rocked for spoilable map to all consumers of ExternalSpillableMap)

> Extend support for rockDB and compression for Spillable map to all consumers
> of ExternalSpillableMap
>
> Key: HUDI-2044
> URL: https://issues.apache.org/jira/browse/HUDI-2044
> Project: Apache Hudi
> Issue Type: Improvement
> Reporter: Rajesh Mahindra
> Assignee: Rajesh Mahindra
> Priority: Major
>
> # HUDI-2028 only implements RocksDB support for the Spillable map in
> HoodieMergeHandle, since we are blocked on the configuration refactor PR landing.
> # This ticket tracks the implementation to extend RocksDB support for the
> Spillable Map to all consumers of ExternalSpillableMap.java
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] rmahindra123 commented on a change in pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map
rmahindra123 commented on a change in pull request #3128: URL: https://github.com/apache/hudi/pull/3128#discussion_r670094851

## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java ##

@@ -200,7 +200,7 @@ protected void initializeIncomingRecordsMap() {
     LOG.info("MaxMemoryPerPartitionMerge => " + memoryForMerge);
     this.keyToNewRecords = new ExternalSpillableMap<>(memoryForMerge, config.getSpillableMapBasePath(),
         new DefaultSizeEstimator(), new HoodieRecordSizeEstimator(tableSchema),
-        config.getSpillableDiskMapType());
+        config.getSpillableDiskMapType(), config.isBitCaskDiskMapCompressionEnabled());

Review comment: Good point, will be done in a follow-up PR: https://issues.apache.org/jira/browse/HUDI-2044

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HUDI-2029) Implement compression for DiskBasedMap in Spillable Map
[ https://issues.apache.org/jira/browse/HUDI-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380972#comment-17380972 ] ASF GitHub Bot commented on HUDI-2029: -- nsivabalan commented on a change in pull request #3128: URL: https://github.com/apache/hudi/pull/3128#discussion_r669909750

## File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/BitCaskDiskMap.java ##

@@ -188,21 +204,25 @@ public R get(Object key) {
   }

   private R get(ValueMetadata entry) {
-    return get(entry, getRandomAccessFile());
+    return get(entry, getRandomAccessFile(), isCompressionEnabled);
   }

-  public static R get(ValueMetadata entry, RandomAccessFile file) {
+  public static R get(ValueMetadata entry, RandomAccessFile file, boolean isCompressionEnabled) {
     try {
-      return SerializationUtils
-          .deserialize(SpillableMapUtils.readBytesFromDisk(file, entry.getOffsetOfValue(), entry.getSizeOfValue()));
+      byte[] bytesFromDisk = SpillableMapUtils.readBytesFromDisk(file, entry.getOffsetOfValue(), entry.getSizeOfValue());
+      if (isCompressionEnabled) {
+        return SerializationUtils.deserialize(DISK_COMPRESSION_REF.get().decompressBytes(bytesFromDisk));
+      }

Review comment: Not required to fix in this patch, but something to keep in mind: it would be good to have an explicit else block for line 216. This "if" block is just one line, so it is fine here. But if it were a large "if" block, a reader might wonder whether some code path fails to return from within the "if" block, since there is a return outside of it. So whenever you write if/else branches, try to add an explicit else block.

## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java ##

@@ -200,7 +200,7 @@ protected void initializeIncomingRecordsMap() {
     LOG.info("MaxMemoryPerPartitionMerge => " + memoryForMerge);
     this.keyToNewRecords = new ExternalSpillableMap<>(memoryForMerge, config.getSpillableMapBasePath(),
         new DefaultSizeEstimator(), new HoodieRecordSizeEstimator(tableSchema),
-        config.getSpillableDiskMapType());
+        config.getSpillableDiskMapType(), config.isBitCaskDiskMapCompressionEnabled());

Review comment: Do you want to make the change in HoodieMergedLogRecordScanner as well? Or is that planned for a follow-up PR?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Implement compression for DiskBasedMap in Spillable Map
> ---
>
> Key: HUDI-2029
> URL: https://issues.apache.org/jira/browse/HUDI-2029
> Project: Apache Hudi
> Issue Type: Improvement
> Components: Performance
> Reporter: Rajesh Mahindra
> Assignee: Rajesh Mahindra
> Priority: Major
> Labels: pull-request-available
>
> Implement compression for DiskBasedMap in Spillable Map.
> Without compression, DiskBasedMap causes more spilling to disk than RocksDB.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-2170) Always choose the latest record for HoodieRecordPayload
[ https://issues.apache.org/jira/browse/HUDI-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380971#comment-17380971 ] ASF GitHub Bot commented on HUDI-2170: -- codecov-commenter edited a comment on pull request #3267: URL: https://github.com/apache/hudi/pull/3267#issuecomment-878977860 # [Codecov](https://codecov.io/gh/apache/hudi/pull/3267?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report > Merging [#3267](https://codecov.io/gh/apache/hudi/pull/3267?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (fd664b5) into [master](https://codecov.io/gh/apache/hudi/commit/b0089b894ad12da11fbd6a0fb08508c7adee68e6?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b0089b8) will **decrease** coverage by `21.38%`. > The diff coverage is `n/a`. 
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3267/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3267?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

```diff
@@             Coverage Diff              @@
##           master    #3267       +/-   ##
=============================================
- Coverage     47.72%   26.34%   -21.39%
+ Complexity     5529     1303     -4226
=============================================
  Files           934      386      -548
  Lines         41457    16006    -25451
  Branches       4167     1379     -2788
=============================================
- Hits          19787     4217    -15570
+ Misses        19908    11486     -8422
+ Partials       1762      303     -1459
```

| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `20.09% <ø> (-14.37%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.57% <ø> (-49.94%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.16% <ø> (-0.10%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3267?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3267/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3267/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3267/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3267/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
[...rg/apache/hudi/client/bootstrap/BootstrapMode.java](https://codecov.io/gh/apache/hudi/pull/3267/diff?src=pr&el=tree&utm_me
[jira] [Commented] (HUDI-1676) Support SQL with spark3
[ https://issues.apache.org/jira/browse/HUDI-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380966#comment-17380966 ] ASF GitHub Bot commented on HUDI-1676: -- xiarixiaoyao commented on pull request #2761: URL: https://github.com/apache/hudi/pull/2761#issuecomment-880347459 @lw309637554 Thank you for paying attention to this PR. Since #2645 has been merged, I will close this PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Support SQL with spark3
> ---
>
> Key: HUDI-1676
> URL: https://issues.apache.org/jira/browse/HUDI-1676
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: Spark Integration
> Affects Versions: 0.9.0
> Reporter: tao meng
> Assignee: tao meng
> Priority: Major
> Labels: pull-request-available, sev:normal
> Fix For: 0.9.0
>
> 1. support CTAS for spark3
> 3. support INSERT for spark3
> 4. support merge, update, delete without RowKey constraint for spark3
> 5. support dataSourceV2 for spark3

-- This message was sent by Atlassian Jira (v8.3.4#803005)