Re: [PR] [HUDI-7618] Add ability to ignore checkpoints in delta streamer [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11018: URL: https://github.com/apache/hudi/pull/11018#issuecomment-2060513736 ## CI report: * b2eda0f44dc17ccc3722be2eecbf001a2c57a955 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

(hudi) branch master updated: [HUDI-7625] Avoid unnecessary rewrite for metadata table (#11038)

2024-04-16 Thread rexan
This is an automated email from the ASF dual-hosted git repository. rexan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 5d63616a99f [HUDI-7625] Avoid unnecessary rewrite f

Re: [PR] [HUDI-7625] Avoid unnecessary rewrite for metadata table [hudi]

2024-04-16 Thread via GitHub
boneanxs merged PR #11038: URL: https://github.com/apache/hudi/pull/11038 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apach

Re: [PR] [HUDI-7625] Avoid unnecessary rewrite for metadata table [hudi]

2024-04-16 Thread via GitHub
boneanxs commented on PR #11038: URL: https://github.com/apache/hudi/pull/11038#issuecomment-2060483975 ![Screenshot 2024-04-17 at 14 35 12](https://github.com/apache/hudi/assets/10115332/eb424945-41aa-44f7-9c32-2d6e2f401258) CI passed

[jira] [Updated] (HUDI-7626) UserGroupInformation lost in the new thread of timeline service threadpool

2024-04-16 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7626: - Labels: pull-request-available (was: ) > UserGroupInformation lost in the new thread of timeline

[PR] [HUDI-7626] propagate UserGroupInformation from the main thread to the new thread of timeline service threadpool [hudi]

2024-04-16 Thread via GitHub
beyond1920 opened a new pull request, #11039: URL: https://github.com/apache/hudi/pull/11039 ### Change Logs The UserGroupInformation is lost in the new thread of the timeline service threadpool. If it does not match the UserGroupInformation from the main thread, the Spark writers might fa

Re: [PR] [MINOR] Optimization function MergeOnReadTableState#getRequiredPositions [hudi]

2024-04-16 Thread via GitHub
zhuanshenbsj1 commented on code in PR #11031: URL: https://github.com/apache/hudi/pull/11031#discussion_r1568296222 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/mor/MergeOnReadTableState.java: ## @@ -83,11 +96,20 @@ public int getOperationPos()

[jira] [Updated] (HUDI-7626) UserGroupInformation lost in the new thread of timeline service threadpool

2024-04-16 Thread Jing Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhang updated HUDI-7626: - Description: (was: https://github.com/apache/hudi/assets/1525333/fef39f4b-89c9-44be-b034-ae58c2615764";

[jira] [Updated] (HUDI-7626) UserGroupInformation lost in the new thread of timeline service threadpool

2024-04-16 Thread Jing Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhang updated HUDI-7626: - Description: see more information in https://github.com/apache/hudi/issues/11030 > UserGroupInformation l

[jira] [Created] (HUDI-7626) UserGroupInformation lost in the new thread of timeline service threadpool

2024-04-16 Thread Jing Zhang (Jira)
Jing Zhang created HUDI-7626: Summary: UserGroupInformation lost in the new thread of timeline service threadpool Key: HUDI-7626 URL: https://issues.apache.org/jira/browse/HUDI-7626 Project: Apache Hudi

Re: [PR] [MINOR] Optimization function MergeOnReadTableState#getRequiredPositions [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11031: URL: https://github.com/apache/hudi/pull/11031#issuecomment-2060453357 ## CI report: * 4e167a80603c4bb8c6bfcd5e69bf4d7f1065b36f Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [I] [SUPPORT]Data Loss Issue with Hudi Table After 3 Days of Continuous Writes [hudi]

2024-04-16 Thread via GitHub
juice411 commented on issue #11016: URL: https://github.com/apache/hudi/issues/11016#issuecomment-2060452646 ![image](https://github.com/apache/hudi/assets/10968514/9c567a2c-9237-453c-8706-af380cf28a6b) During our testing, we've encountered an unusual issue with the Hudi stream read tabl

Re: [PR] [MINOR] Optimization function MergeOnReadTableState#getRequiredPositions [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11031: URL: https://github.com/apache/hudi/pull/11031#issuecomment-2060444207 ## CI report: * 4e167a80603c4bb8c6bfcd5e69bf4d7f1065b36f Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [PR] [HUDI-7235] Fix checkpoint bug for S3/GCS Incremental Source [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #10336: URL: https://github.com/apache/hudi/pull/10336#issuecomment-2060442988 ## CI report: * de49a9da9db751d6fd6e0eaa1a750f8726a55018 UNKNOWN * 5edeb0f1a9a1db3ba3df651a8f39a2459d7d4f21 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4

Re: [PR] [HUDI-7623] Refactoring of RemoteHoodieTableFileSystemView and RequestHandler [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11032: URL: https://github.com/apache/hudi/pull/11032#issuecomment-2060435999 ## CI report: * 254fbf794c65c5d54251f388ec7ea8fdbae29d03 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [PR] [MINOR] Optimization function MergeOnReadTableState#getRequiredPositions [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11031: URL: https://github.com/apache/hudi/pull/11031#issuecomment-2060435950 ## CI report: * 4e167a80603c4bb8c6bfcd5e69bf4d7f1065b36f Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [PR] [HUDI-7235] Fix checkpoint bug for S3/GCS Incremental Source [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #10336: URL: https://github.com/apache/hudi/pull/10336#issuecomment-2060364623 ## CI report: * de49a9da9db751d6fd6e0eaa1a750f8726a55018 UNKNOWN * 1bb5e222b29e392a69ef2ff0a4c990ba0ad7219d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4

[jira] [Updated] (HUDI-7625) Avoid unnecessary rewrite for metadata table

2024-04-16 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-7625: - Status: Patch Available (was: In Progress) > Avoid unnecessary rewrite for metadata table > -

[jira] [Updated] (HUDI-7625) Avoid unnecessary rewrite for metadata table

2024-04-16 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-7625: - Status: In Progress (was: Open) > Avoid unnecessary rewrite for metadata table >

[jira] [Updated] (HUDI-7625) Avoid unnecessary rewrite for metadata table

2024-04-16 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-7625: - Sprint: Sprint 2024-03-25 > Avoid unnecessary rewrite for metadata table > ---

Re: [PR] [HUDI-7625] Avoid unnecessary rewrite for metadata table [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11038: URL: https://github.com/apache/hudi/pull/11038#issuecomment-2060358767 ## CI report: * 5b12c14eaeef50a4da32f5f7c4cf1e13c638616f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [PR] [HUDI-7618] Add ability to ignore checkpoints in delta streamer [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11018: URL: https://github.com/apache/hudi/pull/11018#issuecomment-2060358663 ## CI report: * e7967d543ae82dba4a9024214872894700f673de Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [PR] [HUDI-7235] Fix checkpoint bug for S3/GCS Incremental Source [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #10336: URL: https://github.com/apache/hudi/pull/10336#issuecomment-2060357745 ## CI report: * de49a9da9db751d6fd6e0eaa1a750f8726a55018 UNKNOWN * 1bb5e222b29e392a69ef2ff0a4c990ba0ad7219d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4

Re: [PR] [HUDI-7625] Avoid unnecessary rewrite for metadata table [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11038: URL: https://github.com/apache/hudi/pull/11038#issuecomment-2060351379 ## CI report: * 5b12c14eaeef50a4da32f5f7c4cf1e13c638616f UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run th

Re: [PR] [HUDI-7618] Add ability to ignore checkpoints in delta streamer [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11018: URL: https://github.com/apache/hudi/pull/11018#issuecomment-2060351263 ## CI report: * e7967d543ae82dba4a9024214872894700f673de Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [PR] [HUDI-7623] Refactoring of RemoteHoodieTableFileSystemView and RequestHandler [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11032: URL: https://github.com/apache/hudi/pull/11032#issuecomment-2060310749 ## CI report: * 4106226e5d9706b385b13e7c3347f7a50416fd48 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [PR] [MINOR] Optimization function MergeOnReadTableState#getRequiredPositions [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11031: URL: https://github.com/apache/hudi/pull/11031#issuecomment-2060310726 ## CI report: * 821e081933a557b4e064677f03f807711c3ffdd5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [PR] [HUDI-7618] Add ability to ignore checkpoints in delta streamer [hudi]

2024-04-16 Thread via GitHub
sampan-s-nayak commented on code in PR #11018: URL: https://github.com/apache/hudi/pull/11018#discussion_r1568189219 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/HoodieStreamer.java: ## @@ -129,6 +130,14 @@ public class HoodieStreamer implements Serializabl

[jira] [Updated] (HUDI-7625) Avoid unnecessary rewrite for metadata table

2024-04-16 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7625: - Labels: pull-request-available (was: ) > Avoid unnecessary rewrite for metadata table > -

[PR] [HUDI-7625] Avoid unnecessary rewrite for metadata table [hudi]

2024-04-16 Thread via GitHub
danny0405 opened a new pull request, #11038: URL: https://github.com/apache/hudi/pull/11038 ### Change Logs Follow-up for #11028, improvement for MDT cow and compaction performance. ### Impact none ### Risk level (write none, low medium or high below) none

Re: [PR] [HUDI-7623] Refactoring of RemoteHoodieTableFileSystemView and RequestHandler [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11032: URL: https://github.com/apache/hudi/pull/11032#issuecomment-2060304921 ## CI report: * 4106226e5d9706b385b13e7c3347f7a50416fd48 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [PR] [MINOR] Optimization function MergeOnReadTableState#getRequiredPositions [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11031: URL: https://github.com/apache/hudi/pull/11031#issuecomment-2060304865 ## CI report: * 821e081933a557b4e064677f03f807711c3ffdd5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [PR] [HUDI-7623] Refactoring of RemoteHoodieTableFileSystemView and RequestHandler [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11032: URL: https://github.com/apache/hudi/pull/11032#issuecomment-2060299490 ## CI report: * 4106226e5d9706b385b13e7c3347f7a50416fd48 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

[jira] [Created] (HUDI-7625) Avoid unnecessary rewrite for metadata table

2024-04-16 Thread Danny Chen (Jira)
Danny Chen created HUDI-7625: Summary: Avoid unnecessary rewrite for metadata table Key: HUDI-7625 URL: https://issues.apache.org/jira/browse/HUDI-7625 Project: Apache Hudi Issue Type: Improvemen

Re: [PR] [MINOR] Optimization function MergeOnReadTableState#getRequiredPositions [hudi]

2024-04-16 Thread via GitHub
danny0405 commented on code in PR #11031: URL: https://github.com/apache/hudi/pull/11031#discussion_r1568173216 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/mor/MergeOnReadTableState.java: ## @@ -83,11 +96,20 @@ public int getOperationPos() {

Re: [PR] [HUDI-7624] Fixing index tagging duration [hudi]

2024-04-16 Thread via GitHub
danny0405 commented on code in PR #11035: URL: https://github.com/apache/hudi/pull/11035#discussion_r1568172191 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metrics/HoodieMetrics.java: ## @@ -207,6 +210,13 @@ public Timer.Context getIndexCtx() { return in

Re: [PR] [HUDI-7578] Avoid unnecessary rewriting when copy old data from old base to new base file to improve compaction performance [hudi]

2024-04-16 Thread via GitHub
danny0405 commented on PR #10980: URL: https://github.com/apache/hudi/pull/10980#issuecomment-2060281356 Closing because it is fixed in #11028.

[jira] [Closed] (HUDI-7578) Avoid unnecessary rewriting when copy old data from old base to new base file to improve compaction performance

2024-04-16 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-7578. Resolution: Fixed Fixed via master branch: 876c4e26ecf2b710e37e826f583cbf7c5722f246 > Avoid unnecessary rew

(hudi) branch master updated: [HUDI-7578] Avoid unnecessary rewriting to improve performance (#11028)

2024-04-16 Thread danny0405
danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 876c4e26ecf [HUDI-7578] Avoid unnecessary rewri

Re: [PR] [HUDI-7578] Avoid unnecessary rewriting to improve performance [hudi]

2024-04-16 Thread via GitHub
danny0405 merged PR #11028: URL: https://github.com/apache/hudi/pull/11028

Re: [PR] [HUDI-6386][WIP] Fixing compaction plan parsing issue when compaction.requested is empty [hudi]

2024-04-16 Thread via GitHub
bvaradar commented on PR #9085: URL: https://github.com/apache/hudi/pull/9085#issuecomment-2060278858 @nsivabalan: It looks like we assume that files (.requested) that are being created are immediately available to readers. S3, for example, does not allow this. Also, I enabled this

Re: [PR] [MINOR] Optimization function MergeOnReadTableState#getRequiredPositions [hudi]

2024-04-16 Thread via GitHub
zhuanshenbsj1 commented on code in PR #11031: URL: https://github.com/apache/hudi/pull/11031#discussion_r1568167071 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/mor/MergeOnReadTableState.java: ## @@ -83,11 +96,20 @@ public int getOperationPos()

Re: [PR] [MINOR] Optimization function MergeOnReadTableState#getRequiredPositions [hudi]

2024-04-16 Thread via GitHub
zhuanshenbsj1 commented on code in PR #11031: URL: https://github.com/apache/hudi/pull/11031#discussion_r1568164751 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/mor/MergeOnReadTableState.java: ## @@ -83,11 +96,20 @@ public int getOperationPos()

Re: [PR] [HUDI-7623] Refactoring of RemoteHoodieTableFileSystemView and RequestHandler [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11032: URL: https://github.com/apache/hudi/pull/11032#issuecomment-2060268040 ## CI report: * 4106226e5d9706b385b13e7c3347f7a50416fd48 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [PR] [HUDI-7623] Refactoring of RemoteHoodieTableFileSystemView and RequestHandler [hudi]

2024-04-16 Thread via GitHub
wombatu-kun commented on PR #11032: URL: https://github.com/apache/hudi/pull/11032#issuecomment-2060266213 @hudi-bot run azure

Re: [PR] [HUDI-7623] Refactoring of RemoteHoodieTableFileSystemView and RequestHandler [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11032: URL: https://github.com/apache/hudi/pull/11032#issuecomment-2060256473 ## CI report: * 4106226e5d9706b385b13e7c3347f7a50416fd48 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [I] [SUPPORT]Data Loss Issue with Hudi Table After 3 Days of Continuous Writes [hudi]

2024-04-16 Thread via GitHub
juice411 commented on issue #11016: URL: https://github.com/apache/hudi/issues/11016#issuecomment-2060230825 @danny0405 The previous versions we were using were Hudi 0.14.1 and Flink 1.17.2. Also, we believe our issue is not related to the precombine field as we have a unique ID to identify

Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #10591: URL: https://github.com/apache/hudi/pull/10591#issuecomment-2060214047 ## CI report: * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN * 7c517227bb1079621647852c99dd7836f9900025 UNKNOWN * fc9e3e0248bc828273da6497a6bf837999ebfbf0 Azure: [FAIL

Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #10591: URL: https://github.com/apache/hudi/pull/10591#issuecomment-2060208077 ## CI report: * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN * 7c517227bb1079621647852c99dd7836f9900025 UNKNOWN * b010f8000fcdcf82a67adf5d047934530dfe1a8b Azure: [FAIL

Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #10591: URL: https://github.com/apache/hudi/pull/10591#issuecomment-2060201744 ## CI report: * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN * 7c517227bb1079621647852c99dd7836f9900025 UNKNOWN * 23f2e7932da90cb97e0e475ad43ca6cd1b2be7e4 Azure: [FAIL

Re: [PR] [HUDI-7532] Include only compaction instants for lastCompaction in getDeltaCommitsSinceLatestCompaction [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #10915: URL: https://github.com/apache/hudi/pull/10915#issuecomment-2060202131 ## CI report: * ae80da44ef0b736d26626a961aaac8fd017e80a8 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [PR] [HUDI-7580] Fix order of fields when records inserted out of order [hudi]

2024-04-16 Thread via GitHub
jonvex commented on PR #11019: URL: https://github.com/apache/hudi/pull/11019#issuecomment-2060191158 From your test ``` spark.sql( s""" |create table $tableName ( | id int, | name string, | price i

[jira] [Updated] (HUDI-7574) Auto-scaling for Flink Hudi sink tasks

2024-04-16 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7574: - Summary: Auto-scaling for Flink Hudi sink tasks (was: Auto-pilot for Flink Hudi sink tasks) > Au

[jira] [Updated] (HUDI-7574) Auto-scaling for Flink Hudi sink tasks

2024-04-16 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7574: - Status: Patch Available (was: In Progress) > Auto-scaling for Flink Hudi sink tasks > ---

(hudi) branch master updated: [MINOR] Rename location to path in `makeQualified` (#11037)

2024-04-16 Thread jonvex
jonvex pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 7f20e108897 [MINOR] Rename location to path in `ma

Re: [PR] [MINOR] Rename location to path in `makeQualified` [hudi]

2024-04-16 Thread via GitHub
jonvex merged PR #11037: URL: https://github.com/apache/hudi/pull/11037

Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #10591: URL: https://github.com/apache/hudi/pull/10591#issuecomment-2060168522 ## CI report: * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN * 2aaf5a81b3aa54133276b51cbb8df16d09bbb887 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4

Re: [PR] [HUDI-7623] Refactoring of RemoteHoodieTableFileSystemView and RequestHandler [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11032: URL: https://github.com/apache/hudi/pull/11032#issuecomment-2060162963 ## CI report: * 4599676b2b37f991f21b5031337500daf2b0e519 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [PR] [HUDI-7532] Include only compaction instants for lastCompaction in getDeltaCommitsSinceLatestCompaction [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #10915: URL: https://github.com/apache/hudi/pull/10915#issuecomment-2060162686 ## CI report: * acfe81fa3814c77cd04a39eb2bbddb9960bdc437 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22

Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #10591: URL: https://github.com/apache/hudi/pull/10591#issuecomment-2060161031 ## CI report: * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN * 2aaf5a81b3aa54133276b51cbb8df16d09bbb887 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4

Re: [PR] [HUDI-7532] Include only compaction instants for lastCompaction in getDeltaCommitsSinceLatestCompaction [hudi]

2024-04-16 Thread via GitHub
danny0405 commented on code in PR #10915: URL: https://github.com/apache/hudi/pull/10915#discussion_r1568087606 ## hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCExtractor.java: ## @@ -114,6 +114,24 @@ public Map> extractCDCFileSplits() { ValidationUti

Re: [PR] [HUDI-7623] Refactoring of RemoteHoodieTableFileSystemView and RequestHandler [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11032: URL: https://github.com/apache/hudi/pull/11032#issuecomment-2060153469 ## CI report: * 4599676b2b37f991f21b5031337500daf2b0e519 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [PR] [HUDI-7532] Include only compaction instants for lastCompaction in getDeltaCommitsSinceLatestCompaction [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #10915: URL: https://github.com/apache/hudi/pull/10915#issuecomment-2060153308 ## CI report: * acfe81fa3814c77cd04a39eb2bbddb9960bdc437 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22

Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #10591: URL: https://github.com/apache/hudi/pull/10591#issuecomment-2060152878 ## CI report: * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN * 2c375f28f41a2beb0fa2b76bf7e822865427b9d1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4

Re: [PR] [MINOR] Rename location to path in `makeQualified` [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11037: URL: https://github.com/apache/hudi/pull/11037#issuecomment-2060146480 ## CI report: * 3b4ea58d6239cb6101859f5d7af4dfa7bb1cf193 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [PR] [HUDI-7623] Refactoring of RemoteHoodieTableFileSystemView and RequestHandler [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11032: URL: https://github.com/apache/hudi/pull/11032#issuecomment-2060146431 ## CI report: * 4599676b2b37f991f21b5031337500daf2b0e519 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [I] [SUPPORT] Flink-Hudi - Upsert into the same Hudi table via two different Flink pipelines (stream and batch) [hudi]

2024-04-16 Thread via GitHub
danny0405 commented on issue #10914: URL: https://github.com/apache/hudi/issues/10914#issuecomment-2060138661 Closing it now; feel free to reopen it if you still think it is an issue.

Re: [I] [SUPPORT] Flink-Hudi - Upsert into the same Hudi table via two different Flink pipelines (stream and batch) [hudi]

2024-04-16 Thread via GitHub
danny0405 closed issue #10914: [SUPPORT] Flink-Hudi - Upsert into the same Hudi table via two different Flink pipelines (stream and batch) URL: https://github.com/apache/hudi/issues/10914

Re: [PR] [MINOR] Optimization function MergeOnReadTableState#getRequiredPositions [hudi]

2024-04-16 Thread via GitHub
danny0405 commented on code in PR #11031: URL: https://github.com/apache/hudi/pull/11031#discussion_r1568078282 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/mor/MergeOnReadTableState.java: ## @@ -83,11 +96,20 @@ public int getOperationPos() {

Re: [PR] [MINOR] Optimization function MergeOnReadTableState#getRequiredPositions [hudi]

2024-04-16 Thread via GitHub
danny0405 commented on code in PR #11031: URL: https://github.com/apache/hudi/pull/11031#discussion_r1568078034 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/mor/MergeOnReadTableState.java: ## @@ -83,11 +96,20 @@ public int getOperationPos() {

Re: [PR] [HUDI-7532] Include only compaction instants for lastCompaction in getDeltaCommitsSinceLatestCompaction [hudi]

2024-04-16 Thread via GitHub
nsivabalan commented on code in PR #10915: URL: https://github.com/apache/hudi/pull/10915#discussion_r1568072027 ## hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCExtractor.java: ## @@ -114,6 +114,24 @@ public Map> extractCDCFileSplits() { ValidationUt

Re: [PR] [HUDI-7532] Include only compaction instants for lastCompaction in getDeltaCommitsSinceLatestCompaction [hudi]

2024-04-16 Thread via GitHub
nsivabalan commented on code in PR #10915: URL: https://github.com/apache/hudi/pull/10915#discussion_r1568072805 ## hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCExtractor.java: ## @@ -114,6 +114,24 @@ public Map> extractCDCFileSplits() { ValidationUt

Re: [I] [SUPPORT]Data Loss Issue with Hudi Table After 3 Days of Continuous Writes [hudi]

2024-04-16 Thread via GitHub
danny0405 commented on issue #11016: URL: https://github.com/apache/hudi/issues/11016#issuecomment-2060119989 Can you also provide the Hudi and Flink releases you use here?

Re: [PR] [HUDI-7532] Include only compaction instants for lastCompaction in getDeltaCommitsSinceLatestCompaction [hudi]

2024-04-16 Thread via GitHub
nsivabalan commented on code in PR #10915: URL: https://github.com/apache/hudi/pull/10915#discussion_r1568072027 ## hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCExtractor.java: ## @@ -114,6 +114,24 @@ public Map> extractCDCFileSplits() { ValidationUt

Re: [PR] [HUDI-7515] Fix partition metadata write failure [hudi]

2024-04-16 Thread via GitHub
danny0405 commented on code in PR #10886: URL: https://github.com/apache/hudi/pull/10886#discussion_r1568071623 ## hudi-common/src/main/java/org/apache/hudi/common/model/HoodiePartitionMetadata.java: ## @@ -92,11 +92,12 @@ public int getPartitionDepth() { /** * Write th

Re: [PR] [HUDI-7503] Compaction and LogCompaction executions should start a heartbeat on every attempt and block concurrent executions of same plan [hudi]

2024-04-16 Thread via GitHub
kbuci commented on code in PR #10965: URL: https://github.com/apache/hudi/pull/10965#discussion_r1568070078 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java: ## @@ -1135,8 +1138,36 @@ protected void completeLogCompaction(HoodieCo

[jira] [Assigned] (HUDI-6912) Avoid using Hadoop classes and APIs in HoodieFileGroupReader

2024-04-16 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo reassigned HUDI-6912: --- Assignee: Jonathan Vexler (was: Ethan Guo) > Avoid using Hadoop classes and APIs in HoodieFileGroupR

Re: [PR] [HUDI-7623] Refactoring of RemoteHoodieTableFileSystemView and RequestHandler [hudi]

2024-04-16 Thread via GitHub
wombatu-kun commented on PR #11032: URL: https://github.com/apache/hudi/pull/11032#issuecomment-2060109199 @hudi-bot run azure

Re: [PR] [HUDI-7503] Compaction and LogCompaction executions should start a heartbeat on every attempt and block concurrent executions of same plan [hudi]

2024-04-16 Thread via GitHub
kbuci commented on code in PR #10965: URL: https://github.com/apache/hudi/pull/10965#discussion_r1556157237 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java: ## @@ -1135,8 +1138,36 @@ protected void completeLogCompaction(HoodieCo

Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #10591: URL: https://github.com/apache/hudi/pull/10591#issuecomment-2060102910 ## CI report: * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN * 6b3af98bb8f376043bb585109c5e3707b337deda Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4

Re: [PR] [MINOR] Rename location to path in `makeQualified` [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11037: URL: https://github.com/apache/hudi/pull/11037#issuecomment-2060086516 ## CI report: * 3b4ea58d6239cb6101859f5d7af4dfa7bb1cf193 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [PR] [MINOR] Rename location to path in `makeQualified` [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11037: URL: https://github.com/apache/hudi/pull/11037#issuecomment-2060051829 ## CI report: * 3b4ea58d6239cb6101859f5d7af4dfa7bb1cf193 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run th

Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #10591: URL: https://github.com/apache/hudi/pull/10591#issuecomment-2060051275 ## CI report: * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN * 6b3af98bb8f376043bb585109c5e3707b337deda Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4

Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #10591: URL: https://github.com/apache/hudi/pull/10591#issuecomment-2060045371 ## CI report: * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN * 303733a1dfd60bfc3edf20a9a487d459a1fe851e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4

[PR] [MINOR] Rename location to path in `makeQualified` [hudi]

2024-04-16 Thread via GitHub
yihua opened a new pull request, #11037: URL: https://github.com/apache/hudi/pull/11037 ### Change Logs As above. ### Impact Code quality improvement. ### Risk level none ### Documentation Update none ### Contributor's checklist -

Re: [PR] [HUDI-7618] Add ability to ignore checkpoints in delta streamer [hudi]

2024-04-16 Thread via GitHub
nsivabalan commented on code in PR #11018: URL: https://github.com/apache/hudi/pull/11018#discussion_r1568020073 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/HoodieStreamer.java: ## @@ -129,6 +130,14 @@ public class HoodieStreamer implements Serializable {

Re: [PR] [HUDI-7618] Add ability to ignore checkpoints in delta streamer [hudi]

2024-04-16 Thread via GitHub
rmahindra123 commented on code in PR #11018: URL: https://github.com/apache/hudi/pull/11018#discussion_r1566876795 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/HoodieStreamer.java: ## @@ -129,6 +130,14 @@ public class HoodieStreamer implements Serializable

Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #10591: URL: https://github.com/apache/hudi/pull/10591#issuecomment-2059998574 ## CI report: * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN * 303733a1dfd60bfc3edf20a9a487d459a1fe851e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4

Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #10591: URL: https://github.com/apache/hudi/pull/10591#issuecomment-2059990399 ## CI report: * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN * f7ab315084f8534388db563a20d34b174cc63fa3 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4

[jira] [Assigned] (HUDI-7591) Implement InlineFS in HoodieStorage

2024-04-16 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo reassigned HUDI-7591: --- Assignee: Jonathan Vexler (was: Ethan Guo) > Implement InlineFS in HoodieStorage > -

Re: [PR] [HUDI-7624] Fixing index tagging duration [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11035: URL: https://github.com/apache/hudi/pull/11035#issuecomment-2059983018 ## CI report: * 244e2a201eb3b482089a30812f5ff53065ac8918 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

[I] [SUPPORT] Flink-Hudi Unable to use Hudi metadata with S3 [hudi]

2024-04-16 Thread via GitHub
ChiehFu opened a new issue, #11036: URL: https://github.com/apache/hudi/issues/11036 **Describe the problem you faced** Hi, I was creating a Flink SQL stream pipeline in AWS EMR to compact data into a Hudi COW table. Because of S3 slowdown errors that occasionally happened dur

Re: [PR] [HUDI-7624] Fixing index tagging duration [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11035: URL: https://github.com/apache/hudi/pull/11035#issuecomment-2059929841 ## CI report: * 244e2a201eb3b482089a30812f5ff53065ac8918 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [PR] [HUDI-7624] Fixing index tagging duration [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #11035: URL: https://github.com/apache/hudi/pull/11035#issuecomment-2059920162 ## CI report: * 244e2a201eb3b482089a30812f5ff53065ac8918 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run th

Re: [PR] Test fg reader hive3 [hudi]

2024-04-16 Thread via GitHub
hudi-bot commented on PR #10996: URL: https://github.com/apache/hudi/pull/10996#issuecomment-2059919883 ## CI report: * 974f610eb5a569dfe2ea0b58c9d80d0ca350dfc3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23

Re: [PR] [HUDI-7618] Add ability to ignore checkpoints in delta streamer [hudi]

2024-04-16 Thread via GitHub
nsivabalan commented on code in PR #11018: URL: https://github.com/apache/hudi/pull/11018#discussion_r1567935589 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/HoodieStreamer.java: ## @@ -150,21 +159,27 @@ public HoodieStreamer(Config cfg, JavaSparkContext js

Re: [I] [SUPPORT] Flink-Hudi - Upsert into the same Hudi table via two different Flink pipelines (stream and batch) [hudi]

2024-04-16 Thread via GitHub
ChiehFu commented on issue #10914: URL: https://github.com/apache/hudi/issues/10914#issuecomment-2059909294 @danny0405 Thanks for the answer. The SQL hint worked for me for disabling index bootstrap. -- This is an automated message from the Apache Git Service. To respond to the message,

[PR] [HUDI-7624] Fixing index tagging duration [hudi]

2024-04-16 Thread via GitHub
nsivabalan opened a new pull request, #11035: URL: https://github.com/apache/hudi/pull/11035 ### Change Logs Index lookup duration we emit as of now is buggy. We compute the duration before and after tag() call which is actually lazy. So, the actual lookup was not even triggered, but

Re: [PR] [HUDI-7624] Fix index duration metrics [hudi]

2024-04-16 Thread via GitHub
nsivabalan closed pull request #11034: [HUDI-7624] Fix index duration metrics URL: https://github.com/apache/hudi/pull/11034 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[PR] [HUDI-7624] Fix index duration metrics [hudi]

2024-04-16 Thread via GitHub
nsivabalan opened a new pull request, #11034: URL: https://github.com/apache/hudi/pull/11034 ### Change Logs Index lookup duration we emit as of now is buggy. We compute the duration before and after tag() call which is actually lazy. So, the actual lookup was not even triggered, but
