[GitHub] [hudi] prashantwason commented on pull request #8487: [HUDI-6093] Use the correct partitionToReplacedFileIds during commit.

2023-04-19 Thread via GitHub
prashantwason commented on PR #8487: URL: https://github.com/apache/hudi/pull/8487#issuecomment-1515805437 It says all checks have passed. Those test failures are probably flaky and not caused by this commit. -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [hudi] rohan-uptycs commented on a diff in pull request #8503: [HUDI-6047] Clustering operation on consistent hashing index resulting in duplicate data

2023-04-19 Thread via GitHub
rohan-uptycs commented on code in PR #8503: URL: https://github.com/apache/hudi/pull/8503#discussion_r1172085324 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/HoodieTimelineArchiver.java: ## @@ -441,6 +441,8 @@ private Stream getCommitInstantsToArchive()

[hudi] branch master updated (04517310ae9 -> 68c3be5df5b)

2023-04-19 Thread leesf
This is an automated email from the ASF dual-hosted git repository. leesf pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 04517310ae9 [HUDI-5987] Fix clustering on bootstrap table with row writer disabled (#8342) add 68c3be5df5b [HUDI-55

[GitHub] [hudi] leesf merged pull request #7614: [HUDI-5509] check if dfs support atomic creation when using filesyste…

2023-04-19 Thread via GitHub
leesf merged PR #7614: URL: https://github.com/apache/hudi/pull/7614 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

[GitHub] [hudi] prashantwason commented on a diff in pull request #8484: [HUDI-6092] Reuse schema objects while deserializing log blocks.

2023-04-19 Thread via GitHub
prashantwason commented on code in PR #8484: URL: https://github.com/apache/hudi/pull/8484#discussion_r1172149905 ## hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieHFileDataBlock.java: ## @@ -195,14 +193,16 @@ protected ClosableIterator> lookupRecords(L

[jira] [Created] (HUDI-6107) Fix java.lang.IllegalArgumentException for bootstrap

2023-04-19 Thread weiming (Jira)
weiming created HUDI-6107: - Summary: Fix java.lang.IllegalArgumentException for bootstrap Key: HUDI-6107 URL: https://issues.apache.org/jira/browse/HUDI-6107 Project: Apache Hudi Issue Type: Bug

[jira] [Updated] (HUDI-6107) Fix java.lang.IllegalArgumentException for bootstrap

2023-04-19 Thread weiming (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] weiming updated HUDI-6107: -- Priority: Critical (was: Major) > Fix java.lang.IllegalArgumentException for bootstrap >

[jira] [Assigned] (HUDI-6107) Fix java.lang.IllegalArgumentException for bootstrap

2023-04-19 Thread weiming (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] weiming reassigned HUDI-6107: - Assignee: weiming > Fix java.lang.IllegalArgumentException for bootstrap >

[GitHub] [hudi] Amar1404 opened a new issue, #8509: Support MetaSync Even there is no new data, Just create the table if not exist in DeltaStream

2023-04-19 Thread via GitHub
Amar1404 opened a new issue, #8509: URL: https://github.com/apache/hudi/issues/8509 I wanted to sync the Hudi table to the Hive catalog, but the sync does not run because there is no new data in the source while using DeltaStream. It should create the table if it does not exist, or leave it as is. If it is empt

[GitHub] [hudi] prashantwason commented on a diff in pull request #8467: [HUDI-6084] Added FailOnFirstErrorWriteStatus for MDT to ensure that write operations fail fast on any error.

2023-04-19 Thread via GitHub
prashantwason commented on code in PR #8467: URL: https://github.com/apache/hudi/pull/8467#discussion_r1172143179 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java: ## @@ -170,9 +171,12 @@ protected HoodieBackedTableM

[GitHub] [hudi] XuQianJin-Stars commented on a diff in pull request #8488: [HUDI-5957] Fix table not exist when using 'db.table' in createHoodieClientFromPath

2023-04-19 Thread via GitHub
XuQianJin-Stars commented on code in PR #8488: URL: https://github.com/apache/hudi/pull/8488#discussion_r1172121964 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieCLIUtils.scala: ## @@ -27,21 +27,29 @@ import org.apache.spark.api.java.JavaSparkCo

[GitHub] [hudi] hudi-bot commented on pull request #8385: [HUDI-6040]Stop writing and reading compaction plans from .aux folder

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8385: URL: https://github.com/apache/hudi/pull/8385#issuecomment-1515758278 ## CI report: * 3874447e48c21cb336f28625e1682b8f229f623c UNKNOWN * bfe8f905ffd99f983173a653481915c34a0c4fbe Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[hudi] branch master updated (b2b1ae3cb78 -> 04517310ae9)

2023-04-19 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from b2b1ae3cb78 [HUDI-6017] Sort the results of Call help Procedure with no params (#8358) add 04517310ae9 [HUDI-5987]

[GitHub] [hudi] yihua merged pull request #8342: [HUDI-5987] Fix clustering on bootstrap table with row writer disabled

2023-04-19 Thread via GitHub
yihua merged PR #8342: URL: https://github.com/apache/hudi/pull/8342

[GitHub] [hudi] codope commented on a diff in pull request #8508: [MINOR][DOCS] Update docs for fs based lock provider

2023-04-19 Thread via GitHub
codope commented on code in PR #8508: URL: https://github.com/apache/hudi/pull/8508#discussion_r1172107083 ## website/docs/concurrency_control.md: ## @@ -51,11 +51,22 @@ There are 4 different lock providers that require different configurations to be **`FileSystem`** based l

[GitHub] [hudi] yihua commented on a diff in pull request #8342: [HUDI-5987] Fix clustering on bootstrap table with row writer disabled

2023-04-19 Thread via GitHub
yihua commented on code in PR #8342: URL: https://github.com/apache/hudi/pull/8342#discussion_r1172103490 ## hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieBootstrapFileReader.java: ## @@ -0,0 +1,117 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

[GitHub] [hudi] xushiyan opened a new pull request, #8508: [MINOR][DOCS] Update docs for fs based lock provider

2023-04-19 Thread via GitHub
xushiyan opened a new pull request, #8508: URL: https://github.com/apache/hudi/pull/8508 ### Change Logs Update docs for FileSystem based lock provider ### Impact NA ### Risk level NA ### Documentation Update NA ### Contributor's checkli

[GitHub] [hudi] voonhous commented on a diff in pull request #8501: [HUDI-6103] Validate required columns when fetching required positions

2023-04-19 Thread via GitHub
voonhous commented on code in PR #8501: URL: https://github.com/apache/hudi/pull/8501#discussion_r1172090557 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/mor/MergeOnReadTableState.java: ## @@ -85,7 +86,7 @@ public int getOperationPos() { publ

[GitHub] [hudi] hudi-bot commented on pull request #8319: [HUDI-5934] Remove archival configs for metadata table

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8319: URL: https://github.com/apache/hudi/pull/8319#issuecomment-1515722209 ## CI report: * 6fd96ac5e47dbca48aebb74c5166add63074b63f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1646

[GitHub] [hudi] hudi-bot commented on pull request #8481: [HUDI-6091] Add Java 11 and 17 to bundle validation

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8481: URL: https://github.com/apache/hudi/pull/8481#issuecomment-1515716696 ## CI report: * a2cd52729bec940ced866ab16174f9365dbcc8d2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1647

[GitHub] [hudi] hudi-bot commented on pull request #8355: [HUDI-6016] HoodieCLIUtils supports creating HoodieClient with non-default database

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8355: URL: https://github.com/apache/hudi/pull/8355#issuecomment-1515716287 ## CI report: * bc165df2e2fcf0cbdb3646d35c3b0d5a2cdcc1ce Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1646

[GitHub] [hudi] hudi-bot commented on pull request #8319: [HUDI-5934] Remove archival configs for metadata table

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8319: URL: https://github.com/apache/hudi/pull/8319#issuecomment-1515716157 ## CI report: * 6fd96ac5e47dbca48aebb74c5166add63074b63f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1646

[GitHub] [hudi] hudi-bot commented on pull request #8506: [HUDI-6104] Clean deleted partition with clean policy

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8506: URL: https://github.com/apache/hudi/pull/8506#issuecomment-1515711149 ## CI report: * 8ec5618449900b90ce465037700b129c9d51cde4 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1647

[GitHub] [hudi] hudi-bot commented on pull request #8481: [HUDI-6091] Add Java 11 and 17 to bundle validation

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8481: URL: https://github.com/apache/hudi/pull/8481#issuecomment-1515711046 ## CI report: * a2cd52729bec940ced866ab16174f9365dbcc8d2 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #8355: [HUDI-6016] HoodieCLIUtils supports creating HoodieClient with non-default database

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8355: URL: https://github.com/apache/hudi/pull/8355#issuecomment-1515710796 ## CI report: * bc165df2e2fcf0cbdb3646d35c3b0d5a2cdcc1ce Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1646

[GitHub] [hudi] danny0405 commented on a diff in pull request #8503: [HUDI-6047] Clustering operation on consistent hashing index resulting in duplicate data

2023-04-19 Thread via GitHub
danny0405 commented on code in PR #8503: URL: https://github.com/apache/hudi/pull/8503#discussion_r1172077964 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/HoodieTimelineArchiver.java: ## @@ -441,6 +441,8 @@ private Stream getCommitInstantsToArchive() t

[GitHub] [hudi] yihua commented on a diff in pull request #8319: [HUDI-5934] Remove archival configs for metadata table

2023-04-19 Thread via GitHub
yihua commented on code in PR #8319: URL: https://github.com/apache/hudi/pull/8319#discussion_r1172071709 ## hudi-utilities/src/test/java/org/apache/hudi/utilities/deltastreamer/TestHoodieDeltaStreamer.java: ## @@ -1088,28 +1088,30 @@ public void testCleanerDeleteReplacedDataWi

[GitHub] [hudi] xccui opened a new issue, #8507: [SUPPORT] NoClassDefFoundError for org.apache.hudi.org.apache.hadoop.hbase.io.hfile.HFile

2023-04-19 Thread via GitHub
xccui opened a new issue, #8507: URL: https://github.com/apache/hudi/issues/8507 We occasionally hit the following exception when running a Flink writer job. The job won't self-heal, but can be recovered by manually restarting the TaskManager. ``` java.lang.NoClassDefFoundError: Could

[GitHub] [hudi] xccui commented on issue #6428: [BUG] S3 Deltastreamer: Block has already been inflated

2023-04-19 Thread via GitHub
xccui commented on issue #6428: URL: https://github.com/apache/hudi/issues/6428#issuecomment-1515692373 I don't know the detailed logic here, but apparently recursively invoking `inflate()` when hitting an `IOException` will cause the state check to fail. ![image](https://user-images

[GitHub] [hudi] hudi-bot commented on pull request #8506: [HUDI-6104] Clean deleted partition with clean policy

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8506: URL: https://github.com/apache/hudi/pull/8506#issuecomment-1515686287 ## CI report: * 8ec5618449900b90ce465037700b129c9d51cde4 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #8488: [HUDI-5957] Fix table not exist when using 'db.table' in createHoodieClientFromPath

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8488: URL: https://github.com/apache/hudi/pull/8488#issuecomment-1515686189 ## CI report: * fae0ad3bf18befc79e59e4428bc3ed86c51a42df Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1642

[GitHub] [hudi] huangxiaopingRD commented on a diff in pull request #8355: [HUDI-6016] HoodieCLIUtils supports creating HoodieClient with non-default database

2023-04-19 Thread via GitHub
huangxiaopingRD commented on code in PR #8355: URL: https://github.com/apache/hudi/pull/8355#discussion_r1172059909 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieCLIUtils.scala: ## @@ -40,8 +41,9 @@ object HoodieCLIUtils extends ProvidesHoodieCo

[GitHub] [hudi] xccui commented on issue #6428: [BUG] S3 Deltastreamer: Block has already been inflated

2023-04-19 Thread via GitHub
xccui commented on issue #6428: URL: https://github.com/apache/hudi/issues/6428#issuecomment-1515681584 Hit this when using Flink 1.16 and Hudi bdb50ddccc9631317dfb06a06abc38cbd3714ce8 on EKS. Metadata table was disabled. ``` 2023-04-19 23:55:23 org.apache.hudi.exception.HoodieM

[GitHub] [hudi] prm-xingcan commented on issue #6428: [BUG] S3 Deltastreamer: Block has already been inflated

2023-04-19 Thread via GitHub
prm-xingcan commented on issue #6428: URL: https://github.com/apache/hudi/issues/6428#issuecomment-1515680981 Hit this when using Flink 1.16 and Hudi bdb50ddccc9631317dfb06a06abc38cbd3714ce8 on EKS. Metadata table was disabled. ``` 2023-04-19 23:55:23 org.apache.hudi.exception.H

[GitHub] [hudi] hudi-bot commented on pull request #8488: [HUDI-5957] Fix table not exist when using 'db.table' in createHoodieClientFromPath

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8488: URL: https://github.com/apache/hudi/pull/8488#issuecomment-1515680917 ## CI report: * fae0ad3bf18befc79e59e4428bc3ed86c51a42df Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1642

[jira] [Updated] (HUDI-6104) Clean deleted partition with clean policy

2023-04-19 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6104: - Labels: pull-request-available (was: ) > Clean deleted partition with clean policy >

[GitHub] [hudi] hbgstc123 opened a new pull request, #8506: [HUDI-6104] Clean deleted partition with clean policy

2023-04-19 Thread via GitHub
hbgstc123 opened a new pull request, #8506: URL: https://github.com/apache/hudi/pull/8506 ### Change Logs Currently the cleaner cleans a deleted partition right away, regardless of the clean policy in use. This PR proposes cleaning deleted partitions according to the clean policy, that is, if a partiti

[GitHub] [hudi] hudi-bot commented on pull request #8491: [HUDI-6095] Refactor the judgment condition of WorkloadProfile

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8491: URL: https://github.com/apache/hudi/pull/8491#issuecomment-1515676365 ## CI report: * bcd54355f02696b50cd3998e8cc93f5e64cfc338 UNKNOWN * e14ff9c13938b28ec4e74af8572eb09a664df717 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #8385: [HUDI-6040]Stop writing and reading compaction plans from .aux folder

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8385: URL: https://github.com/apache/hudi/pull/8385#issuecomment-1515676161 ## CI report: * 3874447e48c21cb336f28625e1682b8f229f623c UNKNOWN * bfe8f905ffd99f983173a653481915c34a0c4fbe Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] jhchee commented on issue #8499: [SUPPORT] Support partial insert in merge into command

2023-04-19 Thread via GitHub
jhchee commented on issue #8499: URL: https://github.com/apache/hudi/issues/8499#issuecomment-1515675847 @danny0405 I think the ticket title is slightly misleading. It should be Partial insert for merge into.

[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-04-19 Thread via GitHub
danny0405 commented on code in PR #8505: URL: https://github.com/apache/hudi/pull/8505#discussion_r1172048066 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java: ## @@ -292,7 +292,9 @@ protected void completeCompaction(HoodieCommitMeta

[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-04-19 Thread via GitHub
danny0405 commented on code in PR #8505: URL: https://github.com/apache/hudi/pull/8505#discussion_r1172047827 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java: ## @@ -292,7 +292,9 @@ protected void completeCompaction(HoodieCommitMeta

[GitHub] [hudi] Zouxxyy commented on a diff in pull request #8488: [HUDI-5957] Fix table not exist when using 'db.table' in createHoodieClientFromPath

2023-04-19 Thread via GitHub
Zouxxyy commented on code in PR #8488: URL: https://github.com/apache/hudi/pull/8488#discussion_r1172047050 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieCLIUtils.scala: ## @@ -40,8 +40,12 @@ object HoodieCLIUtils extends ProvidesHoodieConfig{

[GitHub] [hudi] danny0405 commented on a diff in pull request #8394: [HUDI-6085] Eliminate cleaning tasks for flink mor table if async cleaning is disabled

2023-04-19 Thread via GitHub
danny0405 commented on code in PR #8394: URL: https://github.com/apache/hudi/pull/8394#discussion_r1172046008 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableSink.java: ## @@ -103,6 +103,8 @@ public SinkRuntimeProvider getSinkRuntimeProvider(Co

[GitHub] [hudi] danny0405 commented on pull request #8489: [HUDI-6094] make utilities kafka send call from async to sync

2023-04-19 Thread via GitHub
danny0405 commented on PR #8489: URL: https://github.com/apache/hudi/pull/8489#issuecomment-1515662740 Thanks for the contribution. Can you elaborate a little more on what the gains are here?

[GitHub] [hudi] danny0405 commented on a diff in pull request #8355: [HUDI-6016] HoodieCLIUtils supports creating HoodieClient with non-default database

2023-04-19 Thread via GitHub
danny0405 commented on code in PR #8355: URL: https://github.com/apache/hudi/pull/8355#discussion_r1172044384 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieCLIUtils.scala: ## @@ -40,8 +41,9 @@ object HoodieCLIUtils extends ProvidesHoodieConfig{

[GitHub] [hudi] Mulavar commented on pull request #8385: [HUDI-6040]Stop writing and reading compaction plans from .aux folder

2023-04-19 Thread via GitHub
Mulavar commented on PR #8385: URL: https://github.com/apache/hudi/pull/8385#issuecomment-1515660755 @hudi-bot run azure

[GitHub] [hudi] danny0405 commented on a diff in pull request #8501: [HUDI-6103] Validate required columns when fetching required positions

2023-04-19 Thread via GitHub
danny0405 commented on code in PR #8501: URL: https://github.com/apache/hudi/pull/8501#discussion_r1172043586 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/mor/MergeOnReadTableState.java: ## @@ -85,7 +86,7 @@ public int getOperationPos() { pub

[GitHub] [hudi] Mulavar commented on pull request #8385: [HUDI-6040]Stop writing and reading compaction plans from .aux folder

2023-04-19 Thread via GitHub
Mulavar commented on PR #8385: URL: https://github.com/apache/hudi/pull/8385#issuecomment-1515659350 @bvaradar Yeah, but I'm not very familiar with the compaction test. Could you give me some tips on which example I should look at?

[jira] [Assigned] (HUDI-6105) Partial update for MERGE INTO

2023-04-19 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen reassigned HUDI-6105: Assignee: Jing Zhang > Partial update for MERGE INTO > - > >

[jira] [Assigned] (HUDI-6105) Partial update for MERGE INTO

2023-04-19 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen reassigned HUDI-6105: Assignee: Jing Zhang (was: Danny Chen) > Partial update for MERGE INTO > -

[jira] [Assigned] (HUDI-6105) Partial update for MERGE INTO

2023-04-19 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen reassigned HUDI-6105: Assignee: (was: Jing Zhang) > Partial update for MERGE INTO > - > >

[GitHub] [hudi] danny0405 commented on issue #8500: [DISCUSS] Hive Sync will lose some partitions in multi writer scenario

2023-04-19 Thread via GitHub
danny0405 commented on issue #8500: URL: https://github.com/apache/hudi/issues/8500#issuecomment-1515654829 You are right, that's why recently we are trying to address the issue by introducing the real transition time on the timeline: https://github.com/apache/hudi/pull/7627, by using the t

[GitHub] [hudi] danny0405 commented on issue #8276: [SUPPORT] Flink Exceeded checkpoint tolerable failure threshold.

2023-04-19 Thread via GitHub
danny0405 commented on issue #8276: URL: https://github.com/apache/hudi/issues/8276#issuecomment-1515653560 Nice findings. Can you help dig into the root cause, @gfunc?

[GitHub] [hudi] hudi-bot commented on pull request #8491: [HUDI-6095] Refactor the judgment condition of WorkloadProfile

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8491: URL: https://github.com/apache/hudi/pull/8491#issuecomment-1515651440 ## CI report: * a7e4751d6c6ac5f2aa813020002ca57b4ba9a77a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1643

[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8505: URL: https://github.com/apache/hudi/pull/8505#issuecomment-1515651524 ## CI report: * 55007361a8c01779a883cee54ecf45ce94e25dce Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1647

[GitHub] [hudi] danny0405 commented on a diff in pull request #8319: [HUDI-5934] Remove archival configs for metadata table

2023-04-19 Thread via GitHub
danny0405 commented on code in PR #8319: URL: https://github.com/apache/hudi/pull/8319#discussion_r1172038393 ## hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/TestHoodieIncrSource.java: ## @@ -96,7 +96,7 @@ public void testHoodieIncrSource(HoodieTableType tableT

[GitHub] [hudi] hudi-bot commented on pull request #8385: [HUDI-6040]Stop writing and reading compaction plans from .aux folder

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8385: URL: https://github.com/apache/hudi/pull/8385#issuecomment-1515651124 ## CI report: * 3874447e48c21cb336f28625e1682b8f229f623c UNKNOWN * bfe8f905ffd99f983173a653481915c34a0c4fbe Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] danny0405 commented on a diff in pull request #8319: [HUDI-5934] Remove archival configs for metadata table

2023-04-19 Thread via GitHub
danny0405 commented on code in PR #8319: URL: https://github.com/apache/hudi/pull/8319#discussion_r1172038257 ## hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/TestGcsEventsHoodieIncrSource.java: ## @@ -215,7 +215,7 @@ private HoodieRecord getGcsMetadataRecord(St

[GitHub] [hudi] danny0405 commented on a diff in pull request #8319: [HUDI-5934] Remove archival configs for metadata table

2023-04-19 Thread via GitHub
danny0405 commented on code in PR #8319: URL: https://github.com/apache/hudi/pull/8319#discussion_r1172037993 ## hudi-utilities/src/test/java/org/apache/hudi/utilities/deltastreamer/TestHoodieDeltaStreamer.java: ## @@ -1088,28 +1088,30 @@ public void testCleanerDeleteReplacedDa

[GitHub] [hudi] stream2000 commented on issue #8504: [SUPPORT] JettyServer Threadpool keeps running and ocp job hangs

2023-04-19 Thread via GitHub
stream2000 commented on issue #8504: URL: https://github.com/apache/hudi/issues/8504#issuecomment-1515646260 Hi, does PR #8335 fix your problem? You can also try setting `hoodie.embed.timeline.server=false` to disable the timeline service and check whether your app is hung by the timelin

[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8505: URL: https://github.com/apache/hudi/pull/8505#issuecomment-1515645860 ## CI report: * 55007361a8c01779a883cee54ecf45ce94e25dce UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] bvaradar commented on pull request #8029: [HUDI-5832] add relocated prefix for hbase classes in hbase-site.xml

2023-04-19 Thread via GitHub
bvaradar commented on PR #8029: URL: https://github.com/apache/hudi/pull/8029#issuecomment-1515645800 @stayrascal : Since this is an old PR, can you please rebase it so that we can land it?

[GitHub] [hudi] hudi-bot commented on pull request #8491: [HUDI-6095] Refactor the judgment condition of WorkloadProfile

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8491: URL: https://github.com/apache/hudi/pull/8491#issuecomment-1515645752 ## CI report: * a7e4751d6c6ac5f2aa813020002ca57b4ba9a77a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1643

[GitHub] [hudi] bvaradar commented on pull request #8385: [HUDI-6040]Stop writing and reading compaction plans from .aux folder

2023-04-19 Thread via GitHub
bvaradar commented on PR #8385: URL: https://github.com/apache/hudi/pull/8385#issuecomment-1515644714 @Mulavar : W.r.t running compaction after upgrade/downgrade, this will help prove that the upgrade will be safer if we can ensure compaction did not fail in any place after upgrade/downgrad

[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8394: [HUDI-6085] Eliminate cleaning tasks for flink mor table if async cleaning is disabled

2023-04-19 Thread via GitHub
zhuanshenbsj1 commented on code in PR #8394: URL: https://github.com/apache/hudi/pull/8394#discussion_r1172031561 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java: ## @@ -320,7 +322,9 @@ protected HoodieWriteMetadata> logCompact(Str

[GitHub] [hudi] hudi-bot commented on pull request #8491: [HUDI-6095] Refactor the judgment condition of WorkloadProfile

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8491: URL: https://github.com/apache/hudi/pull/8491#issuecomment-1515640123 ## CI report: * a7e4751d6c6ac5f2aa813020002ca57b4ba9a77a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1643

[GitHub] [hudi] hudi-bot commented on pull request #8385: [HUDI-6040]Stop writing and reading compaction plans from .aux folder

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8385: URL: https://github.com/apache/hudi/pull/8385#issuecomment-1515639862 ## CI report: * 3874447e48c21cb336f28625e1682b8f229f623c UNKNOWN * 1cd0db680780d02ff786121f394dccfcd621d37d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] huangxiaopingRD commented on pull request #8491: [HUDI-6095] Refactor the judgment condition of WorkloadProfile

2023-04-19 Thread via GitHub
huangxiaopingRD commented on PR #8491: URL: https://github.com/apache/hudi/pull/8491#issuecomment-1515638492 I forcibly triggered the check, and the same UnknownHostException as other PRs appeared here

[jira] [Created] (HUDI-6106) Spark offline compaction/Clustering Job will do clean like Flink job

2023-04-19 Thread zhuanshenbsj1 (Jira)
zhuanshenbsj1 created HUDI-6106: --- Summary: Spark offline compaction/Clustering Job will do clean like Flink job Key: HUDI-6106 URL: https://issues.apache.org/jira/browse/HUDI-6106 Project: Apache Hudi

[GitHub] [hudi] zhuanshenbsj1 opened a new pull request, #8505: Spark offline compaction/Clustering Job will do clean like Flink job

2023-04-19 Thread via GitHub
zhuanshenbsj1 opened a new pull request, #8505: URL: https://github.com/apache/hudi/pull/8505 ### Change Logs Adjust the cleaning operation in SparkRDDWriteClient#cluster/compact: when ASYNC_CLEAN is true, an asynchronous clean runs in prewrite; otherwise a synchronous clean runs in

[hudi] branch master updated (8afe5498c1b -> b2b1ae3cb78)

2023-04-19 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 8afe5498c1b [HUDI-6099] Improved the performance of checking for valid commits when tagging record location (#8494)

[GitHub] [hudi] bvaradar merged pull request #8358: [HUDI-6017] Sort the results of Call help Procedure with no params

2023-04-19 Thread via GitHub
bvaradar merged PR #8358: URL: https://github.com/apache/hudi/pull/8358

[jira] [Closed] (HUDI-6099) Improve performance of checking for valid commits when tagging record location

2023-04-19 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-6099. Fix Version/s: 0.13.1 0.14.0 Resolution: Fixed Fixed via master branch: 8afe5498c1

[hudi] branch master updated (f9f110695fc -> 8afe5498c1b)

2023-04-19 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from f9f110695fc [HUDI-6056] Validate archival configs alignment with cleaner configs with policy based on hours (#8422)

[GitHub] [hudi] danny0405 merged pull request #8494: [HUDI-6099] Improved the performance of checking for valid commits when tagging record location.

2023-04-19 Thread via GitHub
danny0405 merged PR #8494: URL: https://github.com/apache/hudi/pull/8494 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache

[jira] [Created] (HUDI-6105) Partial update for MERGE INTO

2023-04-19 Thread Danny Chen (Jira)
Danny Chen created HUDI-6105: Summary: Partial update for MERGE INTO Key: HUDI-6105 URL: https://issues.apache.org/jira/browse/HUDI-6105 Project: Apache Hudi Issue Type: New Feature Com

[GitHub] [hudi] hudi-bot commented on pull request #8491: [HUDI-6095] Refactor the judgment condition of WorkloadProfile

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8491: URL: https://github.com/apache/hudi/pull/8491#issuecomment-1515635574 ## CI report: * a7e4751d6c6ac5f2aa813020002ca57b4ba9a77a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1643

[GitHub] [hudi] danny0405 commented on issue #8499: [SUPPORT] Support partial insert in merge into command

2023-04-19 Thread via GitHub
danny0405 commented on issue #8499: URL: https://github.com/apache/hudi/issues/8499#issuecomment-1515635501 Reasonable, issue created: https://issues.apache.org/jira/browse/HUDI-6105 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to Gi

[jira] [Created] (HUDI-6104) Clean deleted partition with clean policy

2023-04-19 Thread HBG (Jira)
HBG created HUDI-6104: - Summary: Clean deleted partition with clean policy Key: HUDI-6104 URL: https://issues.apache.org/jira/browse/HUDI-6104 Project: Apache Hudi Issue Type: Task Reporter:

[GitHub] [hudi] danny0405 commented on a diff in pull request #8422: [HUDI-6056] Validate archival configs alignment with cleaner configs with policy based on hours

2023-04-19 Thread via GitHub
danny0405 commented on code in PR #8422: URL: https://github.com/apache/hudi/pull/8422#discussion_r1172021405 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/HoodieTimelineArchiver.java: ## @@ -475,7 +485,24 @@ private Stream getCommitInstantsToArchive()

[GitHub] [hudi] danny0405 commented on a diff in pull request #8394: [HUDI-6085] Eliminate cleaning tasks for flink mor table if async cleaning is disabled

2023-04-19 Thread via GitHub
danny0405 commented on code in PR #8394: URL: https://github.com/apache/hudi/pull/8394#discussion_r1172020216 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java: ## @@ -320,7 +322,9 @@ protected HoodieWriteMetadata> logCompact(String

[GitHub] [hudi] danny0405 commented on pull request #8491: [HUDI-6095] Refactor the judgment condition of WorkloadProfile

2023-04-19 Thread via GitHub
danny0405 commented on PR #8491: URL: https://github.com/apache/hudi/pull/8491#issuecomment-1515624824 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. T

[GitHub] [hudi] dubin555 commented on pull request #8489: [HUDI-6094] make utilities kafka send call from async to sync

2023-04-19 Thread via GitHub
dubin555 commented on PR #8489: URL: https://github.com/apache/hudi/pull/8489#issuecomment-1515618874 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. T

[GitHub] [hudi] XuQianJin-Stars commented on a diff in pull request #8488: [HUDI-5957] Fix table not exist when using 'db.table' in createHoodieClientFromPath

2023-04-19 Thread via GitHub
XuQianJin-Stars commented on code in PR #8488: URL: https://github.com/apache/hudi/pull/8488#discussion_r1172009398 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieCLIUtils.scala: ## @@ -40,8 +40,12 @@ object HoodieCLIUtils extends ProvidesHoodieC

[jira] [Closed] (HUDI-6046) Remove clean operator for comapaction turn off in mor table upsert

2023-04-19 Thread zhuanshenbsj1 (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhuanshenbsj1 closed HUDI-6046. --- Resolution: Fixed > Remove clean operator for comapaction turn off in mor table upsert > -

[GitHub] [hudi] Zouxxyy commented on a diff in pull request #8488: [HUDI-5957] Fix table not exist when using 'db.table' in createHoodieClientFromPath

2023-04-19 Thread via GitHub
Zouxxyy commented on code in PR #8488: URL: https://github.com/apache/hudi/pull/8488#discussion_r1171998031 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieCLIUtils.scala: ## @@ -40,8 +40,12 @@ object HoodieCLIUtils extends ProvidesHoodieConfig{

[GitHub] [hudi] hudi-bot commented on pull request #8481: [HUDI-6091] Add Java 11 and 17 to bundle validation

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8481: URL: https://github.com/apache/hudi/pull/8481#issuecomment-1515567821 ## CI report: * 1e96d226fa07318b9200d98d92664c9b35488097 UNKNOWN * 1a7bd62f368c6e82f21b9801adc5ed8bf910d38f UNKNOWN * a2cd52729bec940ced866ab16174f9365dbcc8d2 Azure: [FAILUR

[GitHub] [hudi] hudi-bot commented on pull request #8481: [HUDI-6091] Add Java 11 and 17 to bundle validation

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8481: URL: https://github.com/apache/hudi/pull/8481#issuecomment-1515562245 ## CI report: * 1e96d226fa07318b9200d98d92664c9b35488097 UNKNOWN * 50220066a729cd58dbf64e8583376bf691458d09 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #8481: [HUDI-6091] Add Java 11 and 17 to bundle validation

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8481: URL: https://github.com/apache/hudi/pull/8481#issuecomment-1515558078 ## CI report: * 1e96d226fa07318b9200d98d92664c9b35488097 UNKNOWN * 50220066a729cd58dbf64e8583376bf691458d09 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] soumilshah1995 commented on issue #8400: [SUPPORT] Hudi Offline Compaction in EMR Serverless 6.10 for YouTube Video

2023-04-19 Thread via GitHub
soumilshah1995 commented on issue #8400: URL: https://github.com/apache/hudi/issues/8400#issuecomment-1515543681 I shall test this again on weekends with new JAR files -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use th

[GitHub] [hudi] rahil-c commented on a diff in pull request #8441: Upgrade aws java sdk to v2

2023-04-19 Thread via GitHub
rahil-c commented on code in PR #8441: URL: https://github.com/apache/hudi/pull/8441#discussion_r1171959227 ## packaging/hudi-utilities-bundle/pom.xml: ## @@ -153,6 +153,10 @@ commons-codec:commons-codec commons-io:commons-io

[GitHub] [hudi] rahil-c commented on a diff in pull request #8441: Upgrade aws java sdk to v2

2023-04-19 Thread via GitHub
rahil-c commented on code in PR #8441: URL: https://github.com/apache/hudi/pull/8441#discussion_r1171958869 ## packaging/hudi-utilities-bundle/pom.xml: ## @@ -153,6 +153,10 @@ commons-codec:commons-codec commons-io:commons-io

[GitHub] [hudi] hudi-bot commented on pull request #8481: [DNM][HUDI-6091] Add Java 11 and 17 to bundle validation image

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8481: URL: https://github.com/apache/hudi/pull/8481#issuecomment-1515533664 ## CI report: * 1e96d226fa07318b9200d98d92664c9b35488097 UNKNOWN * 50220066a729cd58dbf64e8583376bf691458d09 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] rahil-c commented on a diff in pull request #8441: Upgrade aws java sdk to v2

2023-04-19 Thread via GitHub
rahil-c commented on code in PR #8441: URL: https://github.com/apache/hudi/pull/8441#discussion_r1171957728 ## hudi-utilities/pom.xml: ## @@ -456,6 +468,14 @@ test + + + + software.amazon.awssdk + sqs + ${aws.sdk.version} + + Re

[GitHub] [hudi] hudi-bot commented on pull request #8481: [DNM][HUDI-6091] Add Java 11 and 17 to bundle validation image

2023-04-19 Thread via GitHub
hudi-bot commented on PR #8481: URL: https://github.com/apache/hudi/pull/8481#issuecomment-1515524200 ## CI report: * 632fbc22aceebaf5bae97c7ad65af1af790fd055 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1644

[GitHub] [hudi] rahil-c commented on a diff in pull request #8441: Upgrade aws java sdk to v2

2023-04-19 Thread via GitHub
rahil-c commented on code in PR #8441: URL: https://github.com/apache/hudi/pull/8441#discussion_r1171954215 ## hudi-utilities/pom.xml: ## @@ -136,6 +136,18 @@ hudi-aws ${project.version} + + + org.apache.httpcomponents + httpclient + ${aws.

[GitHub] [hudi] rahil-c commented on a diff in pull request #8441: Upgrade aws java sdk to v2

2023-04-19 Thread via GitHub
rahil-c commented on code in PR #8441: URL: https://github.com/apache/hudi/pull/8441#discussion_r1171953158 ## hudi-aws/src/main/java/org/apache/hudi/aws/utils/DynamoTableUtils.java: ## @@ -0,0 +1,265 @@ +/* + * Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
