Re: [PR] [HUDI-7775] Remove unused APIs in HoodieStorage [hudi]

2024-05-23 Thread via GitHub
hudi-bot commented on PR #11281: URL: https://github.com/apache/hudi/pull/11281#issuecomment-2128686656 ## CI report: * 79939ac35fd5d7240fcb55a3d49e3f9ea42e5b56 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run th

(hudi) branch master updated (f2e276a38e5 -> bcc1f8de4d9)

2024-05-23 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from f2e276a38e5 [MINOR] Query index warning check (#11276) add bcc1f8de4d9 [HUDI-7785] Keep public APIs in utilities mod

Re: [PR] [HUDI-7785] Keep public APIs in utilities module the same as before HoodieStorage abstraction [hudi]

2024-05-23 Thread via GitHub
yihua merged PR #11279: URL: https://github.com/apache/hudi/pull/11279 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.o

Re: [PR] [MINOR] Release 0.15.0 roaring bit map dependency [hudi]

2024-05-23 Thread via GitHub
yihua closed pull request #11282: [MINOR] Release 0.15.0 roaring bit map dependency URL: https://github.com/apache/hudi/pull/11282 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

Re: [PR] [MINOR] Release 0.15.0 roaring bit map dependency [hudi]

2024-05-23 Thread via GitHub
yihua commented on PR #11282: URL: https://github.com/apache/hudi/pull/11282#issuecomment-2128654871 The same changes are merged to `branch-0.x` through #11283 . -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] [HUDI-7785] Keep public APIs in utilities module the same as before HoodieStorage abstraction [hudi]

2024-05-23 Thread via GitHub
yihua merged PR #11280: URL: https://github.com/apache/hudi/pull/11280 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.o

(hudi) branch branch-0.x updated: [HUDI-7785] Keep public APIs in utilities module the same as before HoodieStorage abstraction (#11280)

2024-05-23 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a commit to branch branch-0.x in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/branch-0.x by this push: new ace55a93e58 [HUDI-7785] Keep public APIs in

(hudi) branch release-0.15.0 updated: [HUDI-7786] Fix roaring bitmap dependency in hudi-integ-test-bundle (#11283)

2024-05-23 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a commit to branch release-0.15.0 in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/release-0.15.0 by this push: new 0de2e8018ff [HUDI-7786] Fix roaring

[jira] [Updated] (HUDI-7786) Fix roaring bitmap dependency in hudi-integ-test-bundle

2024-05-23 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7786: Summary: Fix roaring bitmap dependency in hudi-integ-test-bundle (was: Fix roaring bitmap dependency in int

[jira] [Updated] (HUDI-7786) Fix roaring bitmap dependency in hudi-integ-test-bundle

2024-05-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7786: - Labels: pull-request-available (was: ) > Fix roaring bitmap dependency in hudi-integ-test-bundle

(hudi) branch branch-0.x updated (2e39b41be07 -> 1162234541c)

2024-05-23 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a change to branch branch-0.x in repository https://gitbox.apache.org/repos/asf/hudi.git from 2e39b41be07 [HUDI-7784] Fix serde of HoodieHadoopConfiguration in Spark (#11270) add 1162234541c [HUDI-7786] Fi

Re: [PR] [HUDI-7786] Fix roaring bitmap dependency in hudi-integ-test-bundle [hudi]

2024-05-23 Thread via GitHub
yihua merged PR #11283: URL: https://github.com/apache/hudi/pull/11283 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.o

[jira] [Created] (HUDI-7786) Fix roaring bitmap dependency in integ test bundle

2024-05-23 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-7786: --- Summary: Fix roaring bitmap dependency in integ test bundle Key: HUDI-7786 URL: https://issues.apache.org/jira/browse/HUDI-7786 Project: Apache Hudi Issue Type: Bug

[PR] [MINOR] Release 0.15.0 roaring bit map dependency [hudi]

2024-05-23 Thread via GitHub
nsivabalan opened a new pull request, #11283: URL: https://github.com/apache/hudi/pull/11283 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any perform

[PR] [MINOR] Release 0.15.0 roaring bit map dependency [hudi]

2024-05-23 Thread via GitHub
nsivabalan opened a new pull request, #11282: URL: https://github.com/apache/hudi/pull/11282 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any perform

[PR] [HUDI-7775] Remove unused APIs in HoodieStorage [hudi]

2024-05-23 Thread via GitHub
yihua opened a new pull request, #11281: URL: https://github.com/apache/hudi/pull/11281 ### Change Logs PR targeting master: https://github.com/apache/hudi/pull/11255 This PR targeting `branch-0.x` contains the same changes as the above. As above. ### Impact Sim

Re: [PR] [HUDI-7785][branch-0.x] Keep public APIs in utilities module the same as before HoodieStorage abstraction [hudi]

2024-05-23 Thread via GitHub
yihua commented on PR #11280: URL: https://github.com/apache/hudi/pull/11280#issuecomment-2128566853 Azure CI is green. https://github.com/apache/hudi/assets/2497195/4b71299c-259a-42b1-9a77-e053c484ac3a -- This is an automated message from the Apache Git Service. To respond to th

Re: [PR] [HUDI-7785] Keep public APIs in utilities module the same as before HoodieStorage abstraction [hudi]

2024-05-23 Thread via GitHub
yihua commented on PR #11279: URL: https://github.com/apache/hudi/pull/11279#issuecomment-2128566038 Azure CI is green. https://github.com/apache/hudi/assets/2497195/81fbadce-3391-4054-8f65-eefac5eee824 -- This is an automated message from the Apache Git Service. To respond to th

Re: [PR] [HUDI-7785] Keep public APIs in utilities module the same as before HoodieStorage abstraction [hudi]

2024-05-23 Thread via GitHub
yihua commented on PR #11279: URL: https://github.com/apache/hudi/pull/11279#issuecomment-2128564075 > Did you comb through entire hudi utilities for similar changes? for eg, adhoc jobs like metadata validator, snapshot exporter etc? Yes, I went through all classes in `hudi-utilities`

Re: [PR] [HUDI-7785][branch-0.x] Keep public APIs in utilities module the same as before HoodieStorage abstraction [hudi]

2024-05-23 Thread via GitHub
hudi-bot commented on PR #11280: URL: https://github.com/apache/hudi/pull/11280#issuecomment-2128378909 ## CI report: * d278d260c57ff33656e3d51f4fc8200c634ae2c3 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run th

[jira] [Updated] (HUDI-7779) Guarding archival to not archive unintended commits

2024-05-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-7779: -- Description: Archiving commits from active timeline could lead to data consistency issue

[jira] [Assigned] (HUDI-7779) Guarding archival to not archive unintended commits

2024-05-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-7779: - Assignee: sivabalan narayanan > Guarding archival to not archive unintended commi

[jira] [Updated] (HUDI-7779) Guarding archival to not archive unintended commits

2024-05-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-7779: -- Description: Archiving commits from active timeline could lead to data consistency issue

[jira] [Commented] (HUDI-7779) Guarding archival to not archive unintended commits

2024-05-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849139#comment-17849139 ] sivabalan narayanan commented on HUDI-7779: --- Hey Sagar, I updated the Jira

Re: [PR] [HUDI-7785][branch-0.x] Keep public APIs in utilities module the same as before HoodieStorage abstraction [hudi]

2024-05-23 Thread via GitHub
nsivabalan commented on PR #11280: URL: https://github.com/apache/hudi/pull/11280#issuecomment-2128345854 left comments in https://github.com/apache/hudi/pull/11279 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] [HUDI-7785] Keep public APIs in utilities module the same as before HoodieStorage abstraction [hudi]

2024-05-23 Thread via GitHub
hudi-bot commented on PR #11279: URL: https://github.com/apache/hudi/pull/11279#issuecomment-2128325249 ## CI report: * bc54848427ed5e47cfb4c53abb3d87716630cb51 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run th

[jira] [Updated] (HUDI-7779) Guarding archival to not archive unintended commits

2024-05-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-7779: -- Description: Archiving commits from active timeline could lead to data consistency issue

[PR] [HUDI-7785][branch-0.x] Keep public APIs in utilities module the same as before HoodieStorage abstraction [hudi]

2024-05-23 Thread via GitHub
yihua opened a new pull request, #11280: URL: https://github.com/apache/hudi/pull/11280 ### Change Logs Same changes targeting master: https://github.com/apache/hudi/pull/11279 This PR targets at `branch-0.x`. `BaseErrorTableWriter`, `HoodieStreamer`, `StreamSync`, and `Stre

[jira] [Updated] (HUDI-7785) Keep public APIs in utilities module the same as before HoodieStorage abstraction

2024-05-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7785: - Labels: hoodie-storage pull-request-available (was: hoodie-storage) > Keep public APIs in utiliti

[PR] [HUDI-7785] Keep public APIs in utilities module the same as before HoodieStorage abstraction [hudi]

2024-05-23 Thread via GitHub
yihua opened a new pull request, #11279: URL: https://github.com/apache/hudi/pull/11279 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performance

[jira] [Updated] (HUDI-7785) Keep public APIs in utilities module the same as before HoodieStorage abstraction

2024-05-23 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7785: Summary: Keep public APIs in utilities module the same as before HoodieStorage abstraction (was: Keep the A

[jira] [Updated] (HUDI-7785) Keep the APIs in utilities module the same as before HoodieStorage abstraction

2024-05-23 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7785: Description: BaseErrorTableWriter, HoodieStreamer, StreamSync, etc., are public API classes and contain publ

[jira] [Updated] (HUDI-7785) Keep the APIs in utilities module the same as before HoodieStorage abstraction

2024-05-23 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7785: Summary: Keep the APIs in utilities module the same as before HoodieStorage abstraction (was: Keep the Base

(hudi) branch master updated: [MINOR] Query index warning check (#11276)

2024-05-23 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new f2e276a38e5 [MINOR] Query index warning check (

Re: [PR] [MINOR] query index warning check [hudi]

2024-05-23 Thread via GitHub
danny0405 merged PR #11276: URL: https://github.com/apache/hudi/pull/11276 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apac

[jira] [Closed] (HUDI-4491) Re-enable TestHoodieFlinkQuickstart

2024-05-23 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-4491. Resolution: Fixed Fixed via master branch: 8d4a35b1f2e60457cc4316b82c0e1b221ac1ca7e > Re-enable TestHoodieF

Re: [PR] [MINOR] query index warning check [hudi]

2024-05-23 Thread via GitHub
danny0405 commented on PR #11276: URL: https://github.com/apache/hudi/pull/11276#issuecomment-2128293727 The tests have passed: https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=24027&view=results -- This is an automated message from the Apache Git Service. To

[jira] [Updated] (HUDI-4491) Re-enable TestHoodieFlinkQuickstart

2024-05-23 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-4491: - Fix Version/s: 1.0.0 > Re-enable TestHoodieFlinkQuickstart > > >

(hudi) branch master updated: [HUDI-4491] Re-enable TestHoodieFlinkQuickstart (#11272)

2024-05-23 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 8d4a35b1f2e [HUDI-4491] Re-enable TestHoodieFli

Re: [PR] [HUDI-4491] Re-enable TestHoodieFlinkQuickstart [hudi]

2024-05-23 Thread via GitHub
danny0405 merged PR #11272: URL: https://github.com/apache/hudi/pull/11272 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apac

Re: [PR] [HUDI-4491] Re-enable TestHoodieFlinkQuickstart [hudi]

2024-05-23 Thread via GitHub
danny0405 commented on PR #11272: URL: https://github.com/apache/hudi/pull/11272#issuecomment-2128292776 The tests passed: https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=24025&view=results -- This is an automated message from the Apache Git Service. To resp

Re: [I] Failed insert schema compatibility mismatch issue [hudi]

2024-05-23 Thread via GitHub
danny0405 commented on issue #11277: URL: https://github.com/apache/hudi/issues/11277#issuecomment-2128291509 Did you try to use the `drop column` statement? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abov

Re: [PR] [MINOR] query index warning check [hudi]

2024-05-23 Thread via GitHub
danny0405 commented on code in PR #11276: URL: https://github.com/apache/hudi/pull/11276#discussion_r1612519725 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/ColumnStatsIndexSupport.scala: ## @@ -116,8 +116,8 @@ class ColumnStatsIndexSupport(spark: Sp

[jira] [Updated] (HUDI-7785) Keep the BaseErrorTableWriter APIs the same as before HoodieStorage abstraction

2024-05-23 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7785: Description: BaseErrorTableWriter is a public API class which should be kept the same as before,  (was: Bas

[jira] [Updated] (HUDI-7785) Keep the BaseErrorTableWriter APIs the same as before HoodieStorage abstraction

2024-05-23 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7785: Description: BaseErrorTableWriter is a public API class which should be kept the same as before. (was: Base

[jira] [Updated] (HUDI-7785) Keep the BaseErrorTableWriter APIs the same as before HoodieStorage abstraction

2024-05-23 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7785: Labels: hoodie-storage (was: ) > Keep the BaseErrorTableWriter APIs the same as before HoodieStorage > abs

[jira] [Updated] (HUDI-7785) Keep the BaseErrorTableWriter APIs the same as before HoodieStorage abstraction

2024-05-23 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7785: Priority: Blocker (was: Major) > Keep the BaseErrorTableWriter APIs the same as before HoodieStorage > abs

[jira] [Assigned] (HUDI-7785) Keep the BaseErrorTableWriter APIs the same as before HoodieStorage abstraction

2024-05-23 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo reassigned HUDI-7785: --- Assignee: Ethan Guo > Keep the BaseErrorTableWriter APIs the same as before HoodieStorage > abstract

[jira] [Updated] (HUDI-7785) Keep the BaseErrorTableWriter APIs the same as before HoodieStorage abstraction

2024-05-23 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7785: Description: BaseErrorTableWriter is a public API class > Keep the BaseErrorTableWriter APIs the same as bef

[jira] [Updated] (HUDI-7785) Keep the BaseErrorTableWriter APIs the same as before HoodieStorage abstraction

2024-05-23 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7785: Fix Version/s: 0.15.0 1.0.0 > Keep the BaseErrorTableWriter APIs the same as before Hoodi

[jira] [Created] (HUDI-7785) Keep the BaseErrorTableWriter APIs the same as before HoodieStorage abstraction

2024-05-23 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-7785: --- Summary: Keep the BaseErrorTableWriter APIs the same as before HoodieStorage abstraction Key: HUDI-7785 URL: https://issues.apache.org/jira/browse/HUDI-7785 Project: Apache Hud

[I] [SUPPORT] Datadog Metrics reporter fails with null pointer exception using hudi 0.14.0 [hudi]

2024-05-23 Thread via GitHub
hpalaniswamy opened a new issue, #11278: URL: https://github.com/apache/hudi/issues/11278 **Describe the problem you faced** I am trying to setup the datadog metrics reporter with an api key for some hudi spark jobs and getting the following issue when using `0.14.0` and spark `3.1.3` o

Re: [I] [SUPPORT] HiveSyncTool failure - Unable to create a `_ro` table when writing data [hudi]

2024-05-23 Thread via GitHub
shubhamn21 commented on issue #11254: URL: https://github.com/apache/hudi/issues/11254#issuecomment-2127897417 Closing this as it is no longer an issue. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [I] [SUPPORT] HiveSyncTool failure - Unable to create a `_ro` table when writing data [hudi]

2024-05-23 Thread via GitHub
shubhamn21 closed issue #11254: [SUPPORT] HiveSyncTool failure - Unable to create a `_ro` table when writing data URL: https://github.com/apache/hudi/issues/11254 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL ab

Re: [PR] [HUDI-7713] Enforce ordering of fields during schema reconciliation [hudi]

2024-05-23 Thread via GitHub
the-other-tim-brown commented on code in PR #11154: URL: https://github.com/apache/hudi/pull/11154#discussion_r1612095281 ## hudi-spark-datasource/hudi-spark-common/src/test/java/org/apache/hudi/TestHoodieSchemaUtils.java: ## @@ -239,6 +240,51 @@ void testMissingColumn(boolean a

Re: [PR] [HUDI-7713] Enforce ordering of fields during schema reconciliation [hudi]

2024-05-23 Thread via GitHub
the-other-tim-brown commented on code in PR #11154: URL: https://github.com/apache/hudi/pull/11154#discussion_r1612095035 ## hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestBasicSchemaEvolution.scala: ## @@ -169,20 +169,35 @@ class TestBasicSchemaE

Re: [PR] [HUDI-7713] Enforce ordering of fields during schema reconciliation [hudi]

2024-05-23 Thread via GitHub
jonvex commented on code in PR #11154: URL: https://github.com/apache/hudi/pull/11154#discussion_r1612085493 ## hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestBasicSchemaEvolution.scala: ## @@ -169,20 +169,35 @@ class TestBasicSchemaEvolution exte

Re: [PR] [HUDI-7713] Enforce ordering of fields during schema reconciliation [hudi]

2024-05-23 Thread via GitHub
jonvex commented on code in PR #11154: URL: https://github.com/apache/hudi/pull/11154#discussion_r1612082279 ## hudi-spark-datasource/hudi-spark-common/src/test/java/org/apache/hudi/TestHoodieSchemaUtils.java: ## @@ -239,6 +240,51 @@ void testMissingColumn(boolean allowDroppedCo

Re: [PR] [HUDI-7713] Enforce ordering of fields during schema reconciliation [hudi]

2024-05-23 Thread via GitHub
jonvex commented on code in PR #11154: URL: https://github.com/apache/hudi/pull/11154#discussion_r1612076217 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSchemaUtils.scala: ## @@ -93,14 +93,14 @@ object HoodieSchemaUtils { // in the ta

Re: [PR] [HUDI-7713] Enforce ordering of fields during schema reconciliation [hudi]

2024-05-23 Thread via GitHub
jonvex commented on code in PR #11154: URL: https://github.com/apache/hudi/pull/11154#discussion_r1612070797 ## hudi-common/src/main/java/org/apache/hudi/internal/schema/InternalSchemaBuilder.java: ## @@ -67,6 +68,10 @@ public Map buildNameToId(Type type) { return visit(typ

(hudi) 02/02: [HUDI-7784] Fix serde of HoodieHadoopConfiguration in Spark (#11270)

2024-05-23 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a commit to branch release-0.15.0 in repository https://gitbox.apache.org/repos/asf/hudi.git commit ed2dc912003a7f5f13f09374aec99399ff8d614b Author: Y Ethan Guo AuthorDate: Wed May 22 15:27:48 2024 -0700 [HUDI-

(hudi) 01/02: [MINOR] [BRANCH-0.x] Added condition to check default value to fix extracting password from credential store (#11247)

2024-05-23 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a commit to branch release-0.15.0 in repository https://gitbox.apache.org/repos/asf/hudi.git commit f5a7c0f4607b1e8b39fa1eeff00a16bfb5b24851 Author: Aditya Goenka <63430370+ad1happy...@users.noreply.github.com> Autho

(hudi) branch release-0.15.0 updated (16ba7f23887 -> ed2dc912003)

2024-05-23 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a change to branch release-0.15.0 in repository https://gitbox.apache.org/repos/asf/hudi.git from 16ba7f23887 Remove local change from 0.14.0 new f5a7c0f4607 [MINOR] [BRANCH-0.x] Added condition to check de

Re: [I] Failed insert schema compatibility mismatch issue [hudi]

2024-05-23 Thread via GitHub
SamarthRaval commented on issue #11277: URL: https://github.com/apache/hudi/issues/11277#issuecomment-2127477765 @xushiyan @ad1happy2go @bhasudha Could you please help me here? Thank you. -- This is an automated message from the Apache Git Service. To respond to the message, please log

Re: [I] Failed insert schema compatibility mismatch issue [hudi]

2024-05-23 Thread via GitHub
SamarthRaval commented on issue #11277: URL: https://github.com/apache/hudi/issues/11277#issuecomment-2127475324 While trying to do an insert operation after a bulk-insert, I ran into the above error. Not sure what to do here. -- This is an automated message from the Apache Git Service. To re

[I] Failed insert schema compatibility mismatch issue [hudi]

2024-05-23 Thread via GitHub
SamarthRaval opened a new issue, #11277: URL: https://github.com/apache/hudi/issues/11277 **Describe the problem you faced** - I did bulk-insert operation for my data, which ran fine, but for incoming files I did insert operation [For incoming data there were few columns missing and

Re: [I] [SUPPORT]Performance degrade for migrating from Hudi 0.7 to Hudi 0.14 [hudi]

2024-05-23 Thread via GitHub
KnightChess commented on issue #11274: URL: https://github.com/apache/hudi/issues/11274#issuecomment-2127326143 @bibhu107 As for why the shuffle data grew, I haven't looked at the code in detail; the following is just my guess. You have too many reducers, so the shuffle data may need more meta.

Re: [I] [SUPPORT]Performance degrade for migrating from Hudi 0.7 to Hudi 0.14 [hudi]

2024-05-23 Thread via GitHub
bibhu107 commented on issue #11274: URL: https://github.com/apache/hudi/issues/11274#issuecomment-2127279409 Hi @KnightChess Thanks for commenting. But my major doubt is why shuffle write is nearly doubled in hudi 0.14? And that is leading to issues in step 121 -- This is an automated m

Re: [PR] [HUDI-7007] Add bloom_filters index support on read side [hudi]

2024-05-23 Thread via GitHub
KnightChess commented on code in PR #11043: URL: https://github.com/apache/hudi/pull/11043#discussion_r1611744651 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/BloomFiltersIndexSupport.scala: ## @@ -0,0 +1,94 @@ +/* + * Licensed to the Apache Software

Re: [PR] [HUDI-7007] Add bloom_filters index support on read side [hudi]

2024-05-23 Thread via GitHub
KnightChess commented on code in PR #11043: URL: https://github.com/apache/hudi/pull/11043#discussion_r1611759448 ## hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/analysis/HoodiePruneFileSourceFiles.scala: ## @@ -0,0 +1,146 @@ +/* + * Licensed to the

Re: [PR] [HUDI-7007] Add bloom_filters index support on read side [hudi]

2024-05-23 Thread via GitHub
codope commented on PR #11043: URL: https://github.com/apache/hudi/pull/11043#issuecomment-2127182553 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] [MINOR] query index warning check [hudi]

2024-05-23 Thread via GitHub
hudi-bot commented on PR #11276: URL: https://github.com/apache/hudi/pull/11276#issuecomment-2127123812 ## CI report: * f7382d1c527cd033c11d452b2d5dd137522e2122 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run th

Re: [PR] [MINOR] query index warning check [hudi]

2024-05-23 Thread via GitHub
KnightChess commented on code in PR #11276: URL: https://github.com/apache/hudi/pull/11276#discussion_r1611629758 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/ColumnStatsIndexSupport.scala: ## @@ -116,8 +116,8 @@ class ColumnStatsIndexSupport(spark:

[PR] [MINOR] query index warning check [hudi]

2024-05-23 Thread via GitHub
KnightChess opened a new pull request, #11276: URL: https://github.com/apache/hudi/pull/11276 ### Change Logs - add warning check - some `isIndexAvailable` checks throw an exception, which interrupts index selection. ### Impact none ### Risk level (write none, l

[I] [SUPPORT] [hudi]

2024-05-23 Thread via GitHub
Pavan792reddy opened a new issue, #11275: URL: https://github.com/apache/hudi/issues/11275 **_Tips before filing an issue_** - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? - Join the mailing list to engage in conversations and get faster support at dev-

[I] [SUPPORT]Performance degrade for migrating from Hudi 0.7 to Hudi 0.14 [hudi]

2024-05-23 Thread via GitHub
bibhu107 opened a new issue, #11274: URL: https://github.com/apache/hudi/issues/11274 Hi Team, I am upgrading my Spark EMR jobs **FROM [Spark 2.4.8, EMR-5.36.1, Hudi 0.7]** **TO** **[Spark 3.3.1, EMR 6.10.1, and Hudi 0.14]**. This upgrade is leading to a 230% performance degradation.

[I] [SUPPORT]Hudi Deltastreamer compaction is taking longer duration [hudi]

2024-05-23 Thread via GitHub
SuneethaYamani opened a new issue, #11273: URL: https://github.com/apache/hudi/issues/11273 Hi, I am creating a COW table. I want to run compaction separately instead of along with my write operation, so I used hoodie.datasource.write.streaming.disable.compaction=true. Still compac

Re: [PR] [MINOR] LSMTimeline needs to handle case for tables which has not performed first archived yet [hudi]

2024-05-23 Thread via GitHub
danny0405 merged PR #11271: URL: https://github.com/apache/hudi/pull/11271 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apac

Re: [PR] [MINOR] LSMTimeline needs to handle case for tables which has not performed first archived yet [hudi]

2024-05-23 Thread via GitHub
danny0405 commented on code in PR #11271: URL: https://github.com/apache/hudi/pull/11271#discussion_r1611302439 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/LSMTimeline.java: ## @@ -158,13 +159,18 @@ public static int latestSnapshotVersion(HoodieTableMetaC

(hudi) branch master updated: [MINOR] LSMTimeline needs to handle case for tables which has not performed first archived yet (#11271)

2024-05-23 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 94b564151a2 [MINOR] LSMTimeline needs to handle