[GitHub] [hudi] zhuanshenbsj1 commented on pull request #9039: [HUDI-6424] getOldestInstantToRetainForCompaction needs to add clean validation

2023-06-25 Thread via GitHub
zhuanshenbsj1 commented on PR #9039: URL: https://github.com/apache/hudi/pull/9039#issuecomment-1606735274 > Nice findings, should already be fixed via: #8373 If there are too many partitions and files, the cost of full cleaning is too high. And without this verification, there will

[GitHub] [hudi] hudi-bot commented on pull request #9037: [HUDI-6420] Fixing Hfile on-demand and prefix based reads to use optimized apis

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9037: URL: https://github.com/apache/hudi/pull/9037#issuecomment-1606721795 ## CI report: * 7712591445d940ba79574a6e59b2d528a7ea0091 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1810

[jira] [Created] (HUDI-6438) Fix issue while inserting non-nullable array columns to nullable columns

2023-06-25 Thread Aditya Goenka (Jira)
Aditya Goenka created HUDI-6438: --- Summary: Fix issue while inserting non-nullable array columns to nullable columns Key: HUDI-6438 URL: https://issues.apache.org/jira/browse/HUDI-6438 Project: Apache Hu

[GitHub] [hudi] ad1happy2go commented on issue #9042: [SUPPORT] Cannot write nullable values to non-null column

2023-06-25 Thread via GitHub
ad1happy2go commented on issue #9042: URL: https://github.com/apache/hudi/issues/9042#issuecomment-1606711938 @dht7 Able to reproduce this issue. The root cause is that the source table has the column as nullable = false, while the Hudi table has it as nullable = true. Source table https://github.com/a

[GitHub] [hudi] flashJd commented on pull request #9048: [HUDI-6434] Fix illegalArgumentException when do read_optimized read in Flink

2023-06-25 Thread via GitHub
flashJd commented on PR #9048: URL: https://github.com/apache/hudi/pull/9048#issuecomment-1606692365 Flink generates log files on the first write to a MOR table. I'm puzzled by this, since Spark generates the base file first. I'd like to know the reasoning behind it, @danny0405 -- This is an automated me

[GitHub] [hudi] xushiyan closed issue #7059: [SUPPORT] Metadata table column_stats index not used for range pruning when using a CompositeKey of two columns

2023-06-25 Thread via GitHub
xushiyan closed issue #7059: [SUPPORT] Metadata table column_stats index not used for range pruning when using a CompositeKey of two columns URL: https://github.com/apache/hudi/issues/7059 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [hudi] ad1happy2go commented on issue #7059: [SUPPORT] Metadata table column_stats index not used for range pruning when using a CompositeKey of two columns

2023-06-25 Thread via GitHub
ad1happy2go commented on issue #7059: URL: https://github.com/apache/hudi/issues/7059#issuecomment-1606687255 @ssandona This issue was fixed in later versions of Hudi. I can reproduce this behaviour with a composite key on Hudi 0.11.1, but with 0.13.x we no longer see the f

[GitHub] [hudi] hudi-bot commented on pull request #7343: [HUDI-5303] Allow users to control the concurrency to submit jobs in clustering

2023-06-25 Thread via GitHub
hudi-bot commented on PR #7343: URL: https://github.com/apache/hudi/pull/7343#issuecomment-1606663218 ## CI report: * f018119dfff1a1ecb6dbdaa79bb06473e3412585 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1810

[GitHub] [hudi] hudi-bot commented on pull request #7343: [HUDI-5303] Allow users to control the concurrency to submit jobs in clustering

2023-06-25 Thread via GitHub
hudi-bot commented on PR #7343: URL: https://github.com/apache/hudi/pull/7343#issuecomment-1606655448 ## CI report: * f018119dfff1a1ecb6dbdaa79bb06473e3412585 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1810

[GitHub] [hudi] ad1happy2go commented on issue #9050: [SUPPORT] Hudi Metadata BloomIndex stats failed (Failed to get the bloom filter)

2023-06-25 Thread via GitHub
ad1happy2go commented on issue #9050: URL: https://github.com/apache/hudi/issues/9050#issuecomment-1606634740 @neerajpadarthi I was able to reproduce this issue with Hudi 0.11.x, but it is fixed in Hudi 0.12.x. Code - ``` val path="file:///tmp/output/issu

[GitHub] [hudi] boneanxs commented on a diff in pull request #7343: [HUDI-5303] Allow users to control the concurrency to submit jobs in clustering

2023-06-25 Thread via GitHub
boneanxs commented on code in PR #7343: URL: https://github.com/apache/hudi/pull/7343#discussion_r1241517646 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java: ## @@ -143,6 +143,15 @@ public class HoodieClusteringConfig extends Ho

[GitHub] [hudi] hudi-bot commented on pull request #9051: [HUDI-6436] Make the function of AlterHoodieTableChangeColumnCommand …

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9051: URL: https://github.com/apache/hudi/pull/9051#issuecomment-1606568295 ## CI report: * 380eb02521631bcbe138829dae332e496b335281 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1810

[GitHub] [hudi] danny0405 commented on a diff in pull request #9035: [HUDI-6416] Completion markers for handling execution engine (spark) …

2023-06-25 Thread via GitHub
danny0405 commented on code in PR #9035: URL: https://github.com/apache/hudi/pull/9035#discussion_r1241489704 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieWriteHandle.java: ## @@ -138,9 +139,35 @@ protected Path makeNewFilePath(String partitionPath,

[GitHub] [hudi] danny0405 commented on a diff in pull request #9035: [HUDI-6416] Completion markers for handling execution engine (spark) …

2023-06-25 Thread via GitHub
danny0405 commented on code in PR #9035: URL: https://github.com/apache/hudi/pull/9035#discussion_r1241489103 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java: ## @@ -612,6 +612,20 @@ public class HoodieWriteConfig extends HoodieConfi

[GitHub] [hudi] danny0405 commented on a diff in pull request #9035: [HUDI-6416] Completion markers for handling execution engine (spark) …

2023-06-25 Thread via GitHub
danny0405 commented on code in PR #9035: URL: https://github.com/apache/hudi/pull/9035#discussion_r1241488956 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java: ## @@ -901,6 +901,9 @@ private void startCommit(String instantTime, St

[GitHub] [hudi] hudi-bot commented on pull request #9051: [HUDI-6436] Make the function of AlterHoodieTableChangeColumnCommand …

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9051: URL: https://github.com/apache/hudi/pull/9051#issuecomment-1606563050 ## CI report: * 380eb02521631bcbe138829dae332e496b335281 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] danny0405 commented on a diff in pull request #9035: [HUDI-6416] Completion markers for handling execution engine (spark) …

2023-06-25 Thread via GitHub
danny0405 commented on code in PR #9035: URL: https://github.com/apache/hudi/pull/9035#discussion_r1241487644 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/marker/DirectWriteMarkers.java: ## @@ -119,7 +139,7 @@ public Set createdAndMergedDataPaths(Hoodie

[jira] [Closed] (HUDI-6226) Leverage parquet bloom filter feature

2023-06-25 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-6226. Resolution: Fixed Fixed via master branch: e038901efb9f8eadd2a62c98500ef8cbd1fc0bc3 > Leverage parquet bloo

[jira] [Updated] (HUDI-6226) Leverage parquet bloom filter feature

2023-06-25 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-6226: - Fix Version/s: 0.14.0 > Leverage parquet bloom filter feature > - > >

[hudi] branch master updated (f39327c3c1a -> e038901efb9)

2023-06-25 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from f39327c3c1a [HUDI-6433] Make the meta sync of streaming sink thread safe (#9046) add e038901efb9 [HUDI-6226] Su

[GitHub] [hudi] hudi-bot commented on pull request #9037: [HUDI-6420] Fixing Hfile on-demand and prefix based reads to use optimized apis

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9037: URL: https://github.com/apache/hudi/pull/9037#issuecomment-1606555093 ## CI report: * 07f58aad9d3d0ad866f5ee44e05d7ee5022c43c1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1807

[GitHub] [hudi] danny0405 merged pull request #8716: [HUDI-6226] Support parquet native bloom filters

2023-06-25 Thread via GitHub
danny0405 merged PR #8716: URL: https://github.com/apache/hudi/pull/8716 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache

[GitHub] [hudi] danny0405 commented on a diff in pull request #7343: [HUDI-5303] Allow users to control the concurrency to submit jobs in clustering

2023-06-25 Thread via GitHub
danny0405 commented on code in PR #7343: URL: https://github.com/apache/hudi/pull/7343#discussion_r1241483430 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java: ## @@ -143,6 +143,15 @@ public class HoodieClusteringConfig extends H

[GitHub] [hudi] danny0405 commented on a diff in pull request #9051: [HUDI-6436] Make the function of AlterHoodieTableChangeColumnCommand …

2023-06-25 Thread via GitHub
danny0405 commented on code in PR #9051: URL: https://github.com/apache/hudi/pull/9051#discussion_r1241479716 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/command/AlterHoodieTableChangeColumnCommand.scala: ## @@ -102,4 +104,8 @@ case class

[GitHub] [hudi] boneanxs commented on a diff in pull request #7343: [HUDI-5303] Allow users to control the concurrency to submit jobs in clustering

2023-06-25 Thread via GitHub
boneanxs commented on code in PR #7343: URL: https://github.com/apache/hudi/pull/7343#discussion_r1241478711 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java: ## @@ -143,6 +143,15 @@ public class HoodieClusteringConfig extends Ho

[GitHub] [hudi] danny0405 closed pull request #9039: [HUDI-6424] getOldestInstantToRetainForCompaction needs to add clean validation

2023-06-25 Thread via GitHub
danny0405 closed pull request #9039: [HUDI-6424] getOldestInstantToRetainForCompaction needs to add clean validation URL: https://github.com/apache/hudi/pull/9039 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL ab

[GitHub] [hudi] danny0405 commented on pull request #9039: [HUDI-6424] getOldestInstantToRetainForCompaction needs to add clean validation

2023-06-25 Thread via GitHub
danny0405 commented on PR #9039: URL: https://github.com/apache/hudi/pull/9039#issuecomment-1606546919 Nice findings, should already be fixed via: https://github.com/apache/hudi/pull/8373 -- This is an automated message from the Apache Git Service. To respond to the message, please log o

[jira] [Updated] (HUDI-6436) AlterHoodieTableChangeColumnCommand should not change column nullable

2023-06-25 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6436: - Labels: pull-request-available (was: ) > AlterHoodieTableChangeColumnCommand should not change co

[GitHub] [hudi] Zouxxyy opened a new pull request, #9051: [HUDI-6436] Make the function of AlterHoodieTableChangeColumnCommand …

2023-06-25 Thread via GitHub
Zouxxyy opened a new pull request, #9051: URL: https://github.com/apache/hudi/pull/9051 …consistent with AlterTableChangeColumnCommand ### Change Logs Currently `AlterTableChangeColumnCommand` only supports modifying the comment; `AlterHoodieTableChangeColumnCommand` should be cons

[GitHub] [hudi] danny0405 commented on a diff in pull request #9049: [HUDI-6435] Add some logs to the updateHeartbeat method

2023-06-25 Thread via GitHub
danny0405 commented on code in PR #9049: URL: https://github.com/apache/hudi/pull/9049#discussion_r1241471454 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/util/ClientIds.java: ## @@ -167,6 +167,7 @@ private void updateHeartbeat(Path heartbeatFilePath) throws

[GitHub] [hudi] danny0405 commented on a diff in pull request #9049: [HUDI-6435] Add some logs to the updateHeartbeat method

2023-06-25 Thread via GitHub
danny0405 commented on code in PR #9049: URL: https://github.com/apache/hudi/pull/9049#discussion_r1241471218 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/heartbeat/HoodieHeartbeatClient.java: ## @@ -262,6 +262,7 @@ private void updateHeartbeat(String i

[jira] [Closed] (HUDI-6433) Make the meta sync of streaming sink thread safe

2023-06-25 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-6433. Resolution: Fixed Fixed via master branch: f39327c3c1aa668faaeded6e789cc74150d08923 > Make the meta sync of

[hudi] branch master updated: [HUDI-6433] Make the meta sync of streaming sink thread safe (#9046)

2023-06-25 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new f39327c3c1a [HUDI-6433] Make the meta sync of s

[GitHub] [hudi] danny0405 merged pull request #9046: [HUDI-6433] Make the meta sync of streaming sink thread safe

2023-06-25 Thread via GitHub
danny0405 merged PR #9046: URL: https://github.com/apache/hudi/pull/9046 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache

[GitHub] [hudi] danny0405 commented on a diff in pull request #9013: [HUDI-6437] Refine avg record size by considering both commit and deltacommit

2023-06-25 Thread via GitHub
danny0405 commented on code in PR #9013: URL: https://github.com/apache/hudi/pull/9013#discussion_r1241469975 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java: ## @@ -170,7 +171,7 @@ private void assignInserts(WorkloadProf

[jira] [Updated] (HUDI-6437) Refine avg record size by considering both commit and deltacommit

2023-06-25 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6437: - Labels: pull-request-available (was: ) > Refine avg record size by considering both commit and de

[GitHub] [hudi] guanlisheng commented on a diff in pull request #9013: [HUDI-6437] Refine avg record size by considering both commit and deltacommit

2023-06-25 Thread via GitHub
guanlisheng commented on code in PR #9013: URL: https://github.com/apache/hudi/pull/9013#discussion_r1241468919 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java: ## @@ -170,7 +171,7 @@ private void assignInserts(WorkloadPr

[jira] [Updated] (HUDI-6437) Refine avg record size by considering both commit and deltacommit

2023-06-25 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-6437: - Fix Version/s: 0.14.0 > Refine avg record size by considering both commit and deltacommit > --

[jira] [Created] (HUDI-6437) Refine avg record size by considering both commit and deltacommit

2023-06-25 Thread Danny Chen (Jira)
Danny Chen created HUDI-6437: Summary: Refine avg record size by considering both commit and deltacommit Key: HUDI-6437 URL: https://issues.apache.org/jira/browse/HUDI-6437 Project: Apache Hudi

[GitHub] [hudi] hudi-bot commented on pull request #9037: [HUDI-6420] Fixing Hfile on-demand and prefix based reads to use optimized apis

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9037: URL: https://github.com/apache/hudi/pull/9037#issuecomment-1606530870 ## CI report: * 07f58aad9d3d0ad866f5ee44e05d7ee5022c43c1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1807

[GitHub] [hudi] danny0405 commented on a diff in pull request #9013: refine avg record size by considering both commit and deltacommit

2023-06-25 Thread via GitHub
danny0405 commented on code in PR #9013: URL: https://github.com/apache/hudi/pull/9013#discussion_r1241465457 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java: ## @@ -170,7 +171,7 @@ private void assignInserts(WorkloadProf

[jira] [Closed] (HUDI-5902) Parallelise glue sync calls

2023-06-25 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-5902. Resolution: Fixed Fixed via master branch: dcb1135207a1c5f151c7c5a4f9e85e0cd7e7bf32 > Parallelise glue sync

[jira] [Updated] (HUDI-5902) Parallelise glue sync calls

2023-06-25 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-5902: - Fix Version/s: 0.14.0 > Parallelise glue sync calls > --- > > Key:

[hudi] branch master updated (096fe11482d -> dcb1135207a)

2023-06-25 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 096fe11482d [HUDI-6432] Fix TestCleanPlanExecutor (#9045) add dcb1135207a [HUDI-5902] Parallelize glue sync call

[GitHub] [hudi] danny0405 merged pull request #8124: [HUDI-5902]: Parallelise glue sync calls

2023-06-25 Thread via GitHub
danny0405 merged PR #8124: URL: https://github.com/apache/hudi/pull/8124 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache

[GitHub] [hudi] danny0405 commented on pull request #8124: [HUDI-5902]: Parallelise glue sync calls

2023-06-25 Thread via GitHub
danny0405 commented on PR #8124: URL: https://github.com/apache/hudi/pull/8124#issuecomment-1606526751 I assume you have run the tests locally. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [hudi] danny0405 closed pull request #9030: [HUDI-6328] Flink support generate resize plan for consistent bucket index

2023-06-25 Thread via GitHub
danny0405 closed pull request #9030: [HUDI-6328] Flink support generate resize plan for consistent bucket index URL: https://github.com/apache/hudi/pull/9030 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above t

[GitHub] [hudi] danny0405 commented on a diff in pull request #7343: [HUDI-5303] Allow users to control the concurrency to submit jobs in clustering

2023-06-25 Thread via GitHub
danny0405 commented on code in PR #7343: URL: https://github.com/apache/hudi/pull/7343#discussion_r1241458726 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java: ## @@ -143,6 +143,15 @@ public class HoodieClusteringConfig extends H

[GitHub] [hudi] hudi-bot commented on pull request #9047: [Hudi 6422] Solve the issues of compiling dependency on Hadoop 3.1.1

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9047: URL: https://github.com/apache/hudi/pull/9047#issuecomment-1606431331 ## CI report: * b21b404def382da18dcb406bd9957a844913c280 UNKNOWN * 3bfadb0f7a5cf7c8db36272ce3791b8cc853a5d0 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #9047: [Hudi 6422] Solve the issues of compiling dependency on Hadoop 3.1.1

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9047: URL: https://github.com/apache/hudi/pull/9047#issuecomment-1606423520 ## CI report: * b21b404def382da18dcb406bd9957a844913c280 UNKNOWN * 3bfadb0f7a5cf7c8db36272ce3791b8cc853a5d0 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] zhuanshenbsj1 commented on pull request #9047: [Hudi 6422] Solve the issues of compiling dependency on Hadoop 3.1.1

2023-06-25 Thread via GitHub
zhuanshenbsj1 commented on PR #9047: URL: https://github.com/apache/hudi/pull/9047#issuecomment-1606414825 @hudi-bot run azur -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [hudi] hudi-bot commented on pull request #9041: [HUDI-6431] Support update partition path in record-level index

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9041: URL: https://github.com/apache/hudi/pull/9041#issuecomment-1606224076 ## CI report: * b681df04a7ad0febbcd9235622c2ee7f98759cf9 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1808

[GitHub] [hudi] neerajpadarthi opened a new issue, #9050: [SUPPORT] Hudi Metadata BloomIndex stats failed (Failed to get the bloom filter)

2023-06-25 Thread via GitHub
neerajpadarthi opened a new issue, #9050: URL: https://github.com/apache/hudi/issues/9050 Hello team, I am using EMR 6.7 with Hudi version 0.11.0. During ingestion, I enabled the metadata bloom filter stats (**hoodie.metadata.index.bloom.filter.enable**), and to refer these stats in Upsert

[GitHub] [hudi] hudi-bot commented on pull request #9041: [HUDI-6431] Support update partition path in record-level index

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9041: URL: https://github.com/apache/hudi/pull/9041#issuecomment-1606197589 ## CI report: * b681df04a7ad0febbcd9235622c2ee7f98759cf9 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1808

[GitHub] [hudi] hudi-bot commented on pull request #9041: [HUDI-6431] Support update partition path in record-level index

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9041: URL: https://github.com/apache/hudi/pull/9041#issuecomment-1606169674 ## CI report: * b681df04a7ad0febbcd9235622c2ee7f98759cf9 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] xushiyan commented on pull request #9017: [WIP][HUDI-6393] Add functional tests for RecordLevelIndex

2023-06-25 Thread via GitHub
xushiyan commented on PR #9017: URL: https://github.com/apache/hudi/pull/9017#issuecomment-1606166153 @lokeshj1703 Can we disable the known failing test cases and land this PR first? Some code clean-up and runtime optimization of the test cases would still be needed, though -- This is an automated

[GitHub] [hudi] hudi-bot commented on pull request #7343: [HUDI-5303] Allow users to control the concurrency to submit jobs in clustering

2023-06-25 Thread via GitHub
hudi-bot commented on PR #7343: URL: https://github.com/apache/hudi/pull/7343#issuecomment-1606159682 ## CI report: * f018119dfff1a1ecb6dbdaa79bb06473e3412585 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1810

[GitHub] [hudi] hudi-bot commented on pull request #8076: [HUDI-5884] Support bulk_insert for insert_overwrite and insert_overwrite_table

2023-06-25 Thread via GitHub
hudi-bot commented on PR #8076: URL: https://github.com/apache/hudi/pull/8076#issuecomment-1606156576 ## CI report: * 6a239ada8998fd440f19c0082b26d206ed589870 UNKNOWN * 02b3132bf72d0d456d495a06501fd051d50b381e Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #9049: [HUDI-6435] Add some logs to the updateHeartbeat method

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9049: URL: https://github.com/apache/hudi/pull/9049#issuecomment-1606141694 ## CI report: * 8960860b33c4b0a0016d8ee718525cb58f0a6959 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809

[jira] [Created] (HUDI-6436) AlterHoodieTableChangeColumnCommand should not change column nullable

2023-06-25 Thread zouxxyy (Jira)
zouxxyy created HUDI-6436: - Summary: AlterHoodieTableChangeColumnCommand should not change column nullable Key: HUDI-6436 URL: https://issues.apache.org/jira/browse/HUDI-6436 Project: Apache Hudi Is

[GitHub] [hudi] hudi-bot commented on pull request #9047: [Hudi 6422] Solve the issues of compiling dependency on Hadoop 3.1.1

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9047: URL: https://github.com/apache/hudi/pull/9047#issuecomment-1606121649 ## CI report: * b21b404def382da18dcb406bd9957a844913c280 UNKNOWN * 3bfadb0f7a5cf7c8db36272ce3791b8cc853a5d0 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #8076: [HUDI-5884] Support bulk_insert for insert_overwrite and insert_overwrite_table

2023-06-25 Thread via GitHub
hudi-bot commented on PR #8076: URL: https://github.com/apache/hudi/pull/8076#issuecomment-1606121266 ## CI report: * 6a239ada8998fd440f19c0082b26d206ed589870 UNKNOWN * f4cf8eb1001906a9f93677f37c2cb028dd049106 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #7343: [HUDI-5303] Allow users to control the concurrency to submit jobs in clustering

2023-06-25 Thread via GitHub
hudi-bot commented on PR #7343: URL: https://github.com/apache/hudi/pull/7343#issuecomment-1606121103 ## CI report: * aa529e6c286b7bde0301751a1ad452c899d36479 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809

[GitHub] [hudi] hudi-bot commented on pull request #8076: [HUDI-5884] Support bulk_insert for insert_overwrite and insert_overwrite_table

2023-06-25 Thread via GitHub
hudi-bot commented on PR #8076: URL: https://github.com/apache/hudi/pull/8076#issuecomment-1606107948 ## CI report: * 6a239ada8998fd440f19c0082b26d206ed589870 UNKNOWN * f4cf8eb1001906a9f93677f37c2cb028dd049106 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #7343: [HUDI-5303] Allow users to control the concurrency to submit jobs in clustering

2023-06-25 Thread via GitHub
hudi-bot commented on PR #7343: URL: https://github.com/apache/hudi/pull/7343#issuecomment-1606107753 ## CI report: * aa529e6c286b7bde0301751a1ad452c899d36479 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809

[GitHub] [hudi] boneanxs opened a new pull request, #8452: [HUDI-6077] Add more partition push down filters

2023-06-25 Thread via GitHub
boneanxs opened a new pull request, #8452: URL: https://github.com/apache/hudi/pull/8452 ### Change Logs 1. Implement some basic `Expression`s for HUDI 2. Try to convert all spark `Expression` to HUDI `Expression` 3. Implement `PartialBindVisitor` and `BindVistor` to bind values

[GitHub] [hudi] boneanxs closed pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-25 Thread via GitHub
boneanxs closed pull request #8452: [HUDI-6077] Add more partition push down filters URL: https://github.com/apache/hudi/pull/8452 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

[GitHub] [hudi] hudi-bot commented on pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-25 Thread via GitHub
hudi-bot commented on PR #8452: URL: https://github.com/apache/hudi/pull/8452#issuecomment-1606105821 ## CI report: * 8082df232089396b2a9f9be2b915e51b3645f172 UNKNOWN * 4df779ff644aa4dca955c8db1f14ba091b1ba82d Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #8076: [HUDI-5884] Support bulk_insert for insert_overwrite and insert_overwrite_table

2023-06-25 Thread via GitHub
hudi-bot commented on PR #8076: URL: https://github.com/apache/hudi/pull/8076#issuecomment-1606102136 ## CI report: * 6a239ada8998fd440f19c0082b26d206ed589870 UNKNOWN * f4cf8eb1001906a9f93677f37c2cb028dd049106 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #9048: [HUDI-6434] Fix illegalArgumentException when do read_optimized read in Flink

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9048: URL: https://github.com/apache/hudi/pull/9048#issuecomment-1606082805 ## CI report: * 94a99ae8e528129288071087c9ebd0f7f567 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809

[GitHub] [hudi] hudi-bot commented on pull request #7343: [HUDI-5303] Allow users to control the concurrency to submit jobs in clustering

2023-06-25 Thread via GitHub
hudi-bot commented on PR #7343: URL: https://github.com/apache/hudi/pull/7343#issuecomment-1606074223 ## CI report: * aa529e6c286b7bde0301751a1ad452c899d36479 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809

[GitHub] [hudi] hudi-bot commented on pull request #9049: [HUDI-6435] Add some logs to the updateHeartbeat method

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9049: URL: https://github.com/apache/hudi/pull/9049#issuecomment-1606061425 ## CI report: * 8960860b33c4b0a0016d8ee718525cb58f0a6959 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809

[GitHub] [hudi] hudi-bot commented on pull request #9047: [Hudi 6422] Solve the issues of compiling dependency on Hadoop 3.1.1

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9047: URL: https://github.com/apache/hudi/pull/9047#issuecomment-1606057759 ## CI report: * b21b404def382da18dcb406bd9957a844913c280 UNKNOWN * 1b759be0a856503f51d8c054a68ad098e5c869bd Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-

[GitHub] [hudi] hudi-bot commented on pull request #9030: [HUDI-6328] Flink support generate resize plan for consistent bucket index

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9030: URL: https://github.com/apache/hudi/pull/9030#issuecomment-1606057729 ## CI report: * 4fb1de96d7f1ceb26b45228c67dab6697908b9cb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809

[GitHub] [hudi] hudi-bot commented on pull request #9049: [HUDI-6435] Add some logs to the updateHeartbeat method

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9049: URL: https://github.com/apache/hudi/pull/9049#issuecomment-1606057771 ## CI report: * 8960860b33c4b0a0016d8ee718525cb58f0a6959 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #7343: [HUDI-5303] Allow users to control the concurrency to submit jobs in clustering

2023-06-25 Thread via GitHub
hudi-bot commented on PR #7343: URL: https://github.com/apache/hudi/pull/7343#issuecomment-1606056977 ## CI report: * 6052ffe99900de18214c07820c3f9c50f536ba83 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=180

[GitHub] [hudi] hudi-bot commented on pull request #9047: [Hudi 6422] Solve the issues of compiling dependency on Hadoop 3.1.1

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9047: URL: https://github.com/apache/hudi/pull/9047#issuecomment-1606055539 ## CI report: * b21b404def382da18dcb406bd9957a844913c280 UNKNOWN * 1b759be0a856503f51d8c054a68ad098e5c869bd Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-

[GitHub] [hudi] hudi-bot commented on pull request #7343: [HUDI-5303] Allow users to control the concurrency to submit jobs in clustering

2023-06-25 Thread via GitHub
hudi-bot commented on PR #7343: URL: https://github.com/apache/hudi/pull/7343#issuecomment-1606054978 ## CI report: * a1538aeaf01d9b02c5c26334b1212fa24c4c210c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1345

[jira] [Updated] (HUDI-6435) Add some logs to the updateHeartbeat method

2023-06-25 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6435: - Labels: pull-request-available (was: ) > Add some logs to the updateHeartbeat method > --

[GitHub] [hudi] c-f-cooper opened a new pull request, #9049: [HUDI-6435] Add some logs to the updateHeartbeat method

2023-06-25 Thread via GitHub
c-f-cooper opened a new pull request, #9049: URL: https://github.com/apache/hudi/pull/9049 ### Change Logs Add some logs to the updateHeartbeat method in order to find concurrency issues. ### Impact updateHeartbeat ### Risk level (write none, low medium or high below)

[GitHub] [hudi] boneanxs commented on a diff in pull request #7343: [HUDI-5303] Allow users to control the concurrency to submit jobs in clustering

2023-06-25 Thread via GitHub
boneanxs commented on code in PR #7343: URL: https://github.com/apache/hudi/pull/7343#discussion_r1241140092 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java: ## @@ -143,6 +143,15 @@ public class HoodieClusteringConfig extends Ho

[GitHub] [hudi] hudi-bot commented on pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-25 Thread via GitHub
hudi-bot commented on PR #8452: URL: https://github.com/apache/hudi/pull/8452#issuecomment-1606038124 ## CI report: * 8082df232089396b2a9f9be2b915e51b3645f172 UNKNOWN * 678535aab0611b66b98e4f6fdcaf64ac9954d7d5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #8076: [HUDI-5884] Support bulk_insert for insert_overwrite and insert_overwrite_table

2023-06-25 Thread via GitHub
hudi-bot commented on PR #8076: URL: https://github.com/apache/hudi/pull/8076#issuecomment-1606036699 ## CI report: * 6a239ada8998fd440f19c0082b26d206ed589870 UNKNOWN * 4d8ba7c81f123af23dca05642c5562ac79eed4ff Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #7343: [HUDI-5303] Allow users to control the concurrency to submit jobs in clustering

2023-06-25 Thread via GitHub
hudi-bot commented on PR #7343: URL: https://github.com/apache/hudi/pull/7343#issuecomment-1606035793 ## CI report: * a1538aeaf01d9b02c5c26334b1212fa24c4c210c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1345

[GitHub] [hudi] hudi-bot commented on pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-25 Thread via GitHub
hudi-bot commented on PR #8452: URL: https://github.com/apache/hudi/pull/8452#issuecomment-1606028164 ## CI report: * 8082df232089396b2a9f9be2b915e51b3645f172 UNKNOWN * 678535aab0611b66b98e4f6fdcaf64ac9954d7d5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #8076: [HUDI-5884] Support bulk_insert for insert_overwrite and insert_overwrite_table

2023-06-25 Thread via GitHub
hudi-bot commented on PR #8076: URL: https://github.com/apache/hudi/pull/8076#issuecomment-1606028004 ## CI report: * 6a239ada8998fd440f19c0082b26d206ed589870 UNKNOWN * 4d8ba7c81f123af23dca05642c5562ac79eed4ff Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #9048: [HUDI-6434] Fix illegalArgumentException when do read_optimized read in Flink

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9048: URL: https://github.com/apache/hudi/pull/9048#issuecomment-1606025223 ## CI report: * 94a99ae8e528129288071087c9ebd0f7f567 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809

[GitHub] [hudi] hudi-bot commented on pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-25 Thread via GitHub
hudi-bot commented on PR #8452: URL: https://github.com/apache/hudi/pull/8452#issuecomment-1606024796 ## CI report: * 8082df232089396b2a9f9be2b915e51b3645f172 UNKNOWN * 678535aab0611b66b98e4f6fdcaf64ac9954d7d5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[jira] [Created] (HUDI-6435) Add some logs to the updateHeartbeat method

2023-06-25 Thread lichangfu (Jira)
lichangfu created HUDI-6435: --- Summary: Add some logs to the updateHeartbeat method Key: HUDI-6435 URL: https://issues.apache.org/jira/browse/HUDI-6435 Project: Apache Hudi Issue Type: Wish

[GitHub] [hudi] boneanxs commented on pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-25 Thread via GitHub
boneanxs commented on PR #8452: URL: https://github.com/apache/hudi/pull/8452#issuecomment-1606013767 > if you could attach the query plan for before and after this change, it would be helpful. There's no query plan difference between before and after, since all filters will be pushed to

[GitHub] [hudi] hudi-bot commented on pull request #9048: [HUDI-6434] Fix illegalArgumentException when do read_optimized read in Flink

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9048: URL: https://github.com/apache/hudi/pull/9048#issuecomment-1606007029 ## CI report: * 94a99ae8e528129288071087c9ebd0f7f567 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #9047: [Hudi 6422] Solve the issues of compiling dependency on Hadoop 3.1.1

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9047: URL: https://github.com/apache/hudi/pull/9047#issuecomment-1606001189 ## CI report: * b21b404def382da18dcb406bd9957a844913c280 UNKNOWN * 1b759be0a856503f51d8c054a68ad098e5c869bd Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-

[jira] [Updated] (HUDI-6434) IllegalArgumentException when do read_optimized read in Flink

2023-06-25 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6434: - Labels: pull-request-available (was: ) > IllegalArgumentException when do read_optimized read in

[GitHub] [hudi] flashJd opened a new pull request, #9048: [HUDI-6434] Fix illegalArgumentException when do read_optimized read in Flink

2023-06-25 Thread via GitHub
flashJd opened a new pull request, #9048: URL: https://github.com/apache/hudi/pull/9048 ### Change Logs When doing a read_optimized read in Flink on a table that contains only log files, an IllegalArgumentException is thrown; we should actually get an empty set. ### Impact N/A

[jira] [Created] (HUDI-6434) IllegalArgumentException when do read_optimized read in Flink

2023-06-25 Thread yonghua jian (Jira)
yonghua jian created HUDI-6434: -- Summary: IllegalArgumentException when do read_optimized read in Flink Key: HUDI-6434 URL: https://issues.apache.org/jira/browse/HUDI-6434 Project: Apache Hudi

[GitHub] [hudi] hudi-bot commented on pull request #9047: [Hudi 6422] Solve the issues of compiling dependency on Hadoop 3.1.1

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9047: URL: https://github.com/apache/hudi/pull/9047#issuecomment-1605998469 ## CI report: * b21b404def382da18dcb406bd9957a844913c280 UNKNOWN * 1b759be0a856503f51d8c054a68ad098e5c869bd Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] zhuanshenbsj1 commented on pull request #9047: [Hudi 6422] Solve the issues of compiling dependency on Hadoop 3.1.1

2023-06-25 Thread via GitHub
zhuanshenbsj1 commented on PR #9047: URL: https://github.com/apache/hudi/pull/9047#issuecomment-1605992461 @hudi-bot run azur -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [hudi] hudi-bot commented on pull request #9047: [Hudi 6422] Solve the issues of compiling dependency on Hadoop 3.1.1

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9047: URL: https://github.com/apache/hudi/pull/9047#issuecomment-1605970336 ## CI report: * b21b404def382da18dcb406bd9957a844913c280 UNKNOWN * 1b759be0a856503f51d8c054a68ad098e5c869bd UNKNOWN Bot commands @hudi-bot supports the following

[GitHub] [hudi] hudi-bot commented on pull request #9030: [HUDI-6328] Flink support generate resize plan for consistent bucket index

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9030: URL: https://github.com/apache/hudi/pull/9030#issuecomment-1605970028 ## CI report: * 9d724202d3fde4914bb1351baf9c355977c08eb6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1807

[GitHub] [hudi] hudi-bot commented on pull request #9047: [Hudi 6422] Solve the issues of compiling dependency on Hadoop 3.1.1

2023-06-25 Thread via GitHub
hudi-bot commented on PR #9047: URL: https://github.com/apache/hudi/pull/9047#issuecomment-1605956497 ## CI report: * b21b404def382da18dcb406bd9957a844913c280 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the
