Re: [PR] [MINOR] Make ordering deterministic in small file selection [hudi]

2024-04-12 Thread via GitHub
hudi-bot commented on PR #11008: URL: https://github.com/apache/hudi/pull/11008#issuecomment-2051884870 ## CI report: * e7dde68f9c2bda3e1045d3bcda6c2472072395a0 Azure:

Re: [PR] [HUDI-7565] Create spark file readers to read a single file instead of an entire partition [hudi]

2024-04-12 Thread via GitHub
hudi-bot commented on PR #10954: URL: https://github.com/apache/hudi/pull/10954#issuecomment-2051884556 ## CI report: * 8f1ba6d46d8777f39c522d8bcac545ba3d4fd544 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]

2024-04-12 Thread via GitHub
jonvex commented on code in PR #10615: URL: https://github.com/apache/hudi/pull/10615#discussion_r1562569055 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieWriterUtils.scala: ## @@ -201,8 +201,26 @@ object HoodieWriterUtils {

Re: [PR] [MINOR] Make ordering deterministic in small file selection [hudi]

2024-04-12 Thread via GitHub
hudi-bot commented on PR #11008: URL: https://github.com/apache/hudi/pull/11008#issuecomment-2051806863 ## CI report: * e7dde68f9c2bda3e1045d3bcda6c2472072395a0 Azure:

Re: [PR] [MINOR] Make ordering deterministic in small file selection [hudi]

2024-04-12 Thread via GitHub
hudi-bot commented on PR #11008: URL: https://github.com/apache/hudi/pull/11008#issuecomment-2051793682 ## CI report: * e7dde68f9c2bda3e1045d3bcda6c2472072395a0 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] [HUDI-7609] Support array field type whose element type can be nullable [hudi]

2024-04-12 Thread via GitHub
hudi-bot commented on PR #11006: URL: https://github.com/apache/hudi/pull/11006#issuecomment-2051780035 ## CI report: * 33451d51be0e7999695483b980aba6d57052bf1b Azure:

Re: [I] [SUPPORT] Rollback failed clustering 0.12.2 [hudi]

2024-04-12 Thread via GitHub
VitoMakarevich commented on issue #10964: URL: https://github.com/apache/hudi/issues/10964#issuecomment-2051734341 I managed to do it with [hoodie.clustering.updates.strategy](https://hudi.apache.org/docs/configurations/#hoodieclusteringupdatesstrategy) ->

Re: [PR] [HUDI-7576] Improve efficiency of getRelativePartitionPath, reduce computation of partitionPath in AbstractTableFileSystemView [hudi]

2024-04-12 Thread via GitHub
the-other-tim-brown commented on PR #11001: URL: https://github.com/apache/hudi/pull/11001#issuecomment-2051712705 @danny0405 https://github.com/apache/hudi/pull/11008 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[PR] [MINOR] Make ordering deterministic in small file selection [hudi]

2024-04-12 Thread via GitHub
the-other-tim-brown opened a new pull request, #11008: URL: https://github.com/apache/hudi/pull/11008 ### Change Logs Makes the ordering deterministic to get consistent results and avoid any issues in tests ### Impact Small file selection is consistent (mostly helps

Re: [PR] [HUDI-7576] Improve efficiency of getRelativePartitionPath, reduce computation of partitionPath in AbstractTableFileSystemView [hudi]

2024-04-12 Thread via GitHub
the-other-tim-brown commented on PR #11001: URL: https://github.com/apache/hudi/pull/11001#issuecomment-2051701476 @danny0405 error is: ``` TestUpsertPartitioner.testUpsertPartitionerWithSmallFileHandlingPickingMultipleCandidates:470 expected: <[BucketInfo {bucketType=UPDATE,

Re: [PR] [HUDI-7608] Fix Flink table creation configuration not taking effect when writing… [hudi]

2024-04-12 Thread via GitHub
hudi-bot commented on PR #11005: URL: https://github.com/apache/hudi/pull/11005#issuecomment-2051683072 ## CI report: * c0ca195bf69614784e60bd51d300df04a61fdf21 Azure:

[I] [SUPPORT]After compacting, there are a large number of logs with size 0, and they can never be cleared. [hudi]

2024-04-12 Thread via GitHub
MrAladdin opened a new issue, #11007: URL: https://github.com/apache/hudi/issues/11007 **Describe the problem you faced** 1. spark structured streaming: upsert mor (record_index) 2. After compacting, there are a large number of logs with size 0, and they can never be cleared.

Re: [PR] [HUDI-7609] Support array field type whose element type can be nullable [hudi]

2024-04-12 Thread via GitHub
hudi-bot commented on PR #11006: URL: https://github.com/apache/hudi/pull/11006#issuecomment-2051614975 ## CI report: * 33451d51be0e7999695483b980aba6d57052bf1b Azure:

[jira] [Updated] (HUDI-7609) Spark cannot write the hudi table containing array type created by flink

2024-04-12 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7609: - Labels: pull-request-available (was: ) > Spark cannot write the hudi table containing array type

Re: [PR] [HUDI-7609] Support array field type whose element type can be nullable [hudi]

2024-04-12 Thread via GitHub
hudi-bot commented on PR #11006: URL: https://github.com/apache/hudi/pull/11006#issuecomment-2051604069 ## CI report: * 33451d51be0e7999695483b980aba6d57052bf1b UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [I] [SUPPORT] Issue with Repartition on Kafka Input DataFrame and Same Precombine Value Rows In One Batch [hudi]

2024-04-12 Thread via GitHub
ad1happy2go commented on issue #10995: URL: https://github.com/apache/hudi/issues/10995#issuecomment-2051561910 @brightwon Yes changing precombining key will not be allowed. I do understand you trying to repartition to scale the tagging stage. You can try repartition on record key and see

[jira] [Created] (HUDI-7609) Spark cannot write the hudi table containing array type created by flink

2024-04-12 Thread Jira
陈磊 created HUDI-7609: Summary: Spark cannot write the hudi table containing array type created by flink Key: HUDI-7609 URL: https://issues.apache.org/jira/browse/HUDI-7609 Project: Apache Hudi Issue

[PR] Support array field type whose element type can be nullable [hudi]

2024-04-12 Thread via GitHub
empcl opened a new pull request, #11006: URL: https://github.com/apache/hudi/pull/11006 ### Change Logs _Support array field type whose element type can be nullable._ ### Impact _none._ ### Risk level (write none, low medium or high below) _none._ --

Re: [PR] [HUDI-7608] Fix Flink table creation configuration not taking effect when writing… [hudi]

2024-04-12 Thread via GitHub
hudi-bot commented on PR #11005: URL: https://github.com/apache/hudi/pull/11005#issuecomment-2051533385 ## CI report: * c0ca195bf69614784e60bd51d300df04a61fdf21 Azure:

Re: [PR] [HUDI-7608] Fix Flink table creation configuration not taking effect when writing… [hudi]

2024-04-12 Thread via GitHub
hudi-bot commented on PR #11005: URL: https://github.com/apache/hudi/pull/11005#issuecomment-2051522148 ## CI report: * c0ca195bf69614784e60bd51d300df04a61fdf21 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

[PR] [HUDI-7608] Fix Flink table creation configuration not taking effect when writing… [hudi]

2024-04-12 Thread via GitHub
empcl opened a new pull request, #11005: URL: https://github.com/apache/hudi/pull/11005 … to Spark ### Change Logs Fix Flink table creation configuration not taking effect when writing to Spark ### Impact _none._ ### Risk level (write none, low medium or

[jira] [Updated] (HUDI-7608) Flink table creation configuration not taking effect when writing to Spark

2024-04-12 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7608: - Labels: pull-request-available (was: ) > Flink table creation configuration not taking effect

[jira] [Created] (HUDI-7608) Flink table creation configuration not taking effect when writing to Spark

2024-04-12 Thread Jira
陈磊 created HUDI-7608: Summary: Flink table creation configuration not taking effect when writing to Spark Key: HUDI-7608 URL: https://issues.apache.org/jira/browse/HUDI-7608 Project: Apache Hudi Issue

Re: [I] [SUPPORT]insert_overwrite_table table slow [hudi]

2024-04-12 Thread via GitHub
wkhappy1 commented on issue #10979: URL: https://github.com/apache/hudi/issues/10979#issuecomment-2051443367 @ad1happy2go i try insert_overwrite_table append with test data, find it has two rdd cache in memory

Re: [PR] [HUDI-7606] Unpersist RDDs after table services, mainly compaction [hudi]

2024-04-12 Thread via GitHub
hudi-bot commented on PR #11000: URL: https://github.com/apache/hudi/pull/11000#issuecomment-2051442199 ## CI report: * 6c81f312f55df6d28363cb836202aa8ec7173a3e Azure:

[I] [SUPPORT] StreamWriteFunction support Exectly-Once in Flink ? [hudi]

2024-04-12 Thread via GitHub
seekforshell opened a new issue, #11004: URL: https://github.com/apache/hudi/issues/11004 **Describe the problem you faced** flink1.14.3 + hudi 0.12.1 when i use org.apache.hudi.sink.StreamWriteFunction in flink stream job, if jobmanager.execution.failover-strategy, region is

Re: [PR] [HUDI-7606] Unpersist RDDs after table services, mainly compaction [hudi]

2024-04-12 Thread via GitHub
nsivabalan commented on code in PR #11000: URL: https://github.com/apache/hudi/pull/11000#discussion_r1562262882 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java: ## @@ -268,6 +268,7 @@ public boolean commitStats(String

Re: [PR] [HUDI-7606] Unpersist RDDs after table services, mainly compaction [hudi]

2024-04-12 Thread via GitHub
hudi-bot commented on PR #11000: URL: https://github.com/apache/hudi/pull/11000#issuecomment-2051329377 ## CI report: * 12cf06d732847bf9ca925bf2bb4e2e0eb39b8855 Azure:

Re: [PR] [HUDI-7606] Unpersist RDDs after table services, mainly compaction [hudi]

2024-04-12 Thread via GitHub
hudi-bot commented on PR #11000: URL: https://github.com/apache/hudi/pull/11000#issuecomment-2051316642 ## CI report: * 12cf06d732847bf9ca925bf2bb4e2e0eb39b8855 Azure:

Re: [PR] [HUDI-7606] Unpersist RDDs after table services, mainly compaction [hudi]

2024-04-12 Thread via GitHub
rmahindra123 commented on code in PR #11000: URL: https://github.com/apache/hudi/pull/11000#discussion_r1562199875 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java: ## @@ -236,8 +236,8 @@ public boolean commitStats(String

Re: [PR] [HUDI-7606] Unpersist RDDs after table services, mainly compaction [hudi]

2024-04-12 Thread via GitHub
nsivabalan commented on code in PR #11000: URL: https://github.com/apache/hudi/pull/11000#discussion_r1562194553 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java: ## @@ -236,8 +236,8 @@ public boolean commitStats(String

Re: [PR] [HUDI-7606] Unpersist RDDs after table services, mainly compaction [hudi]

2024-04-12 Thread via GitHub
rmahindra123 commented on code in PR #11000: URL: https://github.com/apache/hudi/pull/11000#discussion_r1562176259 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java: ## @@ -236,8 +236,8 @@ public boolean commitStats(String

Re: [PR] [HUDI-7606] Unpersist RDDs after table services, mainly compaction [hudi]

2024-04-12 Thread via GitHub
nsivabalan commented on code in PR #11000: URL: https://github.com/apache/hudi/pull/11000#discussion_r1562161402 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java: ## @@ -236,8 +236,8 @@ public boolean commitStats(String

Re: [PR] [HUDI-7577] Avoid MDT compaction instant time conflicts [hudi]

2024-04-12 Thread via GitHub
hudi-bot commented on PR #10992: URL: https://github.com/apache/hudi/pull/10992#issuecomment-2051187608 ## CI report: * 1f421909625781304a531ccadcbf6a37ca5185a4 UNKNOWN * d8dda49ff97feca5172346047aacb007746568ae Azure:

Re: [PR] [MINOR] Streamer test setup performance [hudi]

2024-04-12 Thread via GitHub
hudi-bot commented on PR #10806: URL: https://github.com/apache/hudi/pull/10806#issuecomment-2051187027 ## CI report: * e0414708ebbd734156c0383cb4e5dbfe5ff4151a UNKNOWN * 11c19fa8fd39ed058a4e3487c99c793610b61564 UNKNOWN * d9f583043f1a5ffd532d613b2ce95aa7a8fddc47 Azure:

Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]

2024-04-12 Thread via GitHub
hudi-bot commented on PR #10615: URL: https://github.com/apache/hudi/pull/10615#issuecomment-2051092856 ## CI report: * dfab8e1285bf0241eea2e71f9d85607c647446d7 Azure:

Re: [PR] [HUDI-7577] Avoid MDT compaction instant time conflicts [hudi]

2024-04-12 Thread via GitHub
hudi-bot commented on PR #10992: URL: https://github.com/apache/hudi/pull/10992#issuecomment-2051093641 ## CI report: * 1f421909625781304a531ccadcbf6a37ca5185a4 UNKNOWN * c8423769cd6ef01b7afcaafd63f51b9f450ec7ea Azure:

Re: [PR] [HUDI-7577] Avoid MDT compaction instant time conflicts [hudi]

2024-04-12 Thread via GitHub
hudi-bot commented on PR #10992: URL: https://github.com/apache/hudi/pull/10992#issuecomment-2051085060 ## CI report: * 1f421909625781304a531ccadcbf6a37ca5185a4 UNKNOWN * c8423769cd6ef01b7afcaafd63f51b9f450ec7ea Azure:

(hudi) branch master updated: [HUDI-7601] Add heartbeat mechanism to refresh lock (#10994)

2024-04-12 Thread leesf
This is an automated email from the ASF dual-hosted git repository. leesf pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 4fbf5c52f19 [HUDI-7601] Add heartbeat mechanism to

Re: [PR] [HUDI-7601] Add heartbeat mechanism to refresh lock [hudi]

2024-04-12 Thread via GitHub
leesf merged PR #10994: URL: https://github.com/apache/hudi/pull/10994 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:
