Re: [PR] [HUDI-7493] Consistent naming of Cleaner configuration parameters [hudi]

2024-03-12 Thread via GitHub
hudi-bot commented on PR #10851: URL: https://github.com/apache/hudi/pull/10851#issuecomment-1993682626 ## CI report: * 7bce9399d616a570e8a04c783b06e7e2f404dc5a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22

Re: [PR] [HUDI-7492] fix the issue of incorrect keygenerator specification when creating m… [hudi]

2024-03-12 Thread via GitHub
hudi-bot commented on PR #10840: URL: https://github.com/apache/hudi/pull/10840#issuecomment-1993682540 ## CI report: * cf41aa0ce79b39dc6f09db500db4b123fed34ff0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22

Re: [PR] [HUDI-7492] fix the issue of incorrect keygenerator specification when creating m… [hudi]

2024-03-12 Thread via GitHub
hudi-bot commented on PR #10840: URL: https://github.com/apache/hudi/pull/10840#issuecomment-1993675304 ## CI report: * cf41aa0ce79b39dc6f09db500db4b123fed34ff0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22

Re: [I] [SUPPORT] ERROR BaseSparkCommitActionExecutor: Error upserting bucketType UPDATE for partition :13 [hudi]

2024-03-12 Thread via GitHub
codope closed issue #9119: [SUPPORT] ERROR BaseSparkCommitActionExecutor: Error upserting bucketType UPDATE for partition :13 URL: https://github.com/apache/hudi/issues/9119 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [I] [SUPPORT] ERROR BaseSparkCommitActionExecutor: Error upserting bucketType UPDATE for partition :13 [hudi]

2024-03-12 Thread via GitHub
ad1happy2go commented on issue #9119: URL: https://github.com/apache/hudi/issues/9119#issuecomment-1993660645 Closing this, as it was fixed via: https://github.com/apache/hudi/pull/9984 -- This is an automated message from the Apache Git Service. To respond to the message, please log on

Re: [PR] [HUDI-7493] Consistent naming of Cleaner configuration parameters [hudi]

2024-03-12 Thread via GitHub
hudi-bot commented on PR #10851: URL: https://github.com/apache/hudi/pull/10851#issuecomment-1993620652 ## CI report: * 7a74bf1e9e175c7ea4ad31c99f6fc88db81b46ee Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22

Re: [PR] [HUDI-7493] Consistent naming of Cleaner configuration parameters [hudi]

2024-03-12 Thread via GitHub
hudi-bot commented on PR #10851: URL: https://github.com/apache/hudi/pull/10851#issuecomment-1993614748 ## CI report: * 7a74bf1e9e175c7ea4ad31c99f6fc88db81b46ee Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22

Re: [PR] [HUDI-7493] Consistent naming of Cleaner configuration parameters [hudi]

2024-03-12 Thread via GitHub
geserdugarov commented on PR #10851: URL: https://github.com/apache/hudi/pull/10851#issuecomment-1993528107 > Old key name `hoodie.clean.automatic` in `TestCDCDataFrameSuite.testCOWDataSourceWrite` doesn't work. Search for the reason. The reason is that in `HoodieCDCTestBase` `Hoodie

Re: [PR] [HUDI-7497] Add a global timeline mingled with active and archived instants [hudi]

2024-03-12 Thread via GitHub
hudi-bot commented on PR #10845: URL: https://github.com/apache/hudi/pull/10845#issuecomment-1993363802 ## CI report: * c50e42d4b21dc1af358b61b0d814cfb50248bfe0 UNKNOWN * 948d9ecb6dc661628b787ba800756b78d52791af Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4

Re: [PR] [HUDI-7497] Add a global timeline mingled with active and archived instants [hudi]

2024-03-12 Thread via GitHub
hudi-bot commented on PR #10845: URL: https://github.com/apache/hudi/pull/10845#issuecomment-1993226243 ## CI report: * c50e42d4b21dc1af358b61b0d814cfb50248bfe0 UNKNOWN * 998a987f33866407fd1b2d8350e6c2f2386f59ad Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4

Re: [PR] [HUDI-7497] Add a global timeline mingled with active and archived instants [hudi]

2024-03-12 Thread via GitHub
hudi-bot commented on PR #10845: URL: https://github.com/apache/hudi/pull/10845#issuecomment-1993208787 ## CI report: * c50e42d4b21dc1af358b61b0d814cfb50248bfe0 UNKNOWN * 998a987f33866407fd1b2d8350e6c2f2386f59ad Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4

Re: [PR] [HUDI-7497] Add a global timeline mingled with active and archived instants [hudi]

2024-03-12 Thread via GitHub
danny0405 commented on code in PR #10845: URL: https://github.com/apache/hudi/pull/10845#discussion_r152204 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieGlobalTimeline.java: ## @@ -0,0 +1,203 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] [HUDI-7497] Add a global timeline mingled with active and archived instants [hudi]

2024-03-12 Thread via GitHub
danny0405 commented on code in PR #10845: URL: https://github.com/apache/hudi/pull/10845#discussion_r1522287754 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieGlobalTimeline.java: ## @@ -0,0 +1,203 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] [HUDI-7497] Add a global timeline mingled with active and archived instants [hudi]

2024-03-12 Thread via GitHub
danny0405 commented on code in PR #10845: URL: https://github.com/apache/hudi/pull/10845#discussion_r1522286800 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieGlobalTimeline.java: ## @@ -0,0 +1,203 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] [HUDI-7497] Add a global timeline mingled with active and archived instants [hudi]

2024-03-12 Thread via GitHub
danny0405 commented on code in PR #10845: URL: https://github.com/apache/hudi/pull/10845#discussion_r1522285239 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieDefaultTimeline.java: ## @@ -167,12 +167,13 @@ public HoodieDefaultTimeline getWriteTimeline()

Re: [PR] [HUDI-7497] Add a global timeline mingled with active and archived instants [hudi]

2024-03-12 Thread via GitHub
danny0405 commented on code in PR #10845: URL: https://github.com/apache/hudi/pull/10845#discussion_r1522284552 ## hudi-client/hudi-client-common/src/test/java/org/apache/hudi/common/table/timeline/TestHoodieGlobalTimeline.java: ## @@ -0,0 +1,153 @@ +/* + * Licensed to the Apach

Re: [PR] DOCS-updated videos [hudi]

2024-03-12 Thread via GitHub
nfarah86 commented on PR #10855: URL: https://github.com/apache/hudi/pull/10855#issuecomment-1992652425 https://github.com/apache/hudi/assets/5392555/264db195-dfe0-4712-b12f-d60122feb8dc @bhasudha @xushiyan pr for videos -- This is an automated message from the Apache Git Servic

[PR] updated videos [hudi]

2024-03-12 Thread via GitHub
nfarah86 opened a new pull request, #10855: URL: https://github.com/apache/hudi/pull/10855 ### Change Logs updated videos ### Impact none ### Risk level (write none, low medium or high below) none ### Documentation Update none - _The config

Re: [I] [SUPPORT] java.lang.NoClassDefFoundError: org/apache/hudi/com/fasterxml/jackson/module/scala/DefaultScalaModule$ when doing an Incremental CDC Query in 0.14.1 [hudi]

2024-03-12 Thread via GitHub
Tyler-Rendina commented on issue #10590: URL: https://github.com/apache/hudi/issues/10590#issuecomment-1992510921 While I can kick off backfills, they eventually fail alongside streams with `java.lang.NoSuchMethodError: com.amazonaws.transform.JsonUnmarshallerContext.getCurrentToken()Lcom/

Re: [PR] [DO NOT MERGE][DOCS] Add more users in the powered-by page [hudi]

2024-03-12 Thread via GitHub
bhasudha commented on PR #10854: URL: https://github.com/apache/hudi/pull/10854#issuecomment-1992331710 Tested locally. Screenshots here! ![Screenshot 2024-03-12 at 11 49 27 AM](https://github.com/apache/hudi/assets/2179254/d0d65d05-d081-44dc-854f-cafed6126cfe) ![Screenshot 2024-03-12

[PR] [DO NOT MERGE][DOCS] Add more users in the powered-by page [hudi]

2024-03-12 Thread via GitHub
bhasudha opened a new pull request, #10854: URL: https://github.com/apache/hudi/pull/10854 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performan

Re: [I] [SUPPORT] java.lang.NoClassDefFoundError: org/apache/hudi/com/fasterxml/jackson/module/scala/DefaultScalaModule$ when doing an Incremental CDC Query in 0.14.1 [hudi]

2024-03-12 Thread via GitHub
Tyler-Rendina commented on issue #10590: URL: https://github.com/apache/hudi/issues/10590#issuecomment-1992114835 Final note, and apologies for the number of posts, but this may help EMR users with Glue as their Hive service. Make sure to build Hudi using Java 8; if you are on ARM, use som

(hudi) branch asf-site updated: [DOCS][MINOR] Update powered-by page (#10853)

2024-03-12 Thread xushiyan
This is an automated email from the ASF dual-hosted git repository. xushiyan pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 03ea1e1b180 [DOCS][MINOR] Update powered-by

Re: [PR] [DOCS][MINOR] Update powered-by page [hudi]

2024-03-12 Thread via GitHub
xushiyan merged PR #10853: URL: https://github.com/apache/hudi/pull/10853 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apach

Re: [PR] [HUDI-7497] Add a global timeline mingled with active and archived instants [hudi]

2024-03-12 Thread via GitHub
vinothchandar commented on code in PR #10845: URL: https://github.com/apache/hudi/pull/10845#discussion_r1521484287 ## hudi-client/hudi-client-common/src/test/java/org/apache/hudi/common/table/timeline/TestHoodieGlobalTimeline.java: ## @@ -0,0 +1,153 @@ +/* + * Licensed to the A

Re: [I] RLI Spark Hudi Error occurs when executing map [hudi]

2024-03-12 Thread via GitHub
bksrepo commented on issue #10609: URL: https://github.com/apache/hudi/issues/10609#issuecomment-1991619367 I am using spark 3.4.1 with hudi bundle 'hudi-spark3.4-bundle_2.12-0.14.0.jar', Hadoop is 3.3.6 and source database is mysql version 8.0.36 Reported ERROR comes at the

Re: [PR] [HUDI-7493] Consistent naming of Cleaner configuration parameters [hudi]

2024-03-12 Thread via GitHub
hudi-bot commented on PR #10851: URL: https://github.com/apache/hudi/pull/10851#issuecomment-1991552838 ## CI report: * 7a74bf1e9e175c7ea4ad31c99f6fc88db81b46ee Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22

Re: [PR] [HUDI-7493] Consistent naming of Cleaner configuration parameters [hudi]

2024-03-12 Thread via GitHub
geserdugarov commented on PR #10851: URL: https://github.com/apache/hudi/pull/10851#issuecomment-1991532665 Need to figure out the reason for the failures in `TestCDCDataFrameSuite.testCOWDataSourceWrite`. -- This is an automated message from the Apache Git Service. To respond to the message,

[jira] [Comment Edited] (HUDI-7493) Clean configuration for clean service

2024-03-12 Thread Geser Dugarov (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17825613#comment-17825613 ] Geser Dugarov edited comment on HUDI-7493 at 3/12/24 12:17 PM: -

[PR] [DOCS][MINOR] Update powered-by page [hudi]

2024-03-12 Thread via GitHub
bhasudha opened a new pull request, #10853: URL: https://github.com/apache/hudi/pull/10853 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performan

Re: [I] [SUPPORT] Incremental query not working on COW table [hudi]

2024-03-12 Thread via GitHub
NishantBaheti commented on issue #10850: URL: https://github.com/apache/hudi/issues/10850#issuecomment-1991494289 ![image](https://github.com/apache/hudi/assets/31793052/89d5982e-f029-4bca-8438-0b623a99d6b8) doesn't work. another issue. -- This is an automated message from the

Re: [PR] [HUDI-7493] Consistent naming of Cleaner configuration parameters [hudi]

2024-03-12 Thread via GitHub
hudi-bot commented on PR #10851: URL: https://github.com/apache/hudi/pull/10851#issuecomment-1991474440 ## CI report: * 7a74bf1e9e175c7ea4ad31c99f6fc88db81b46ee Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22

Re: [PR] [HUDI-7493] Consistent naming of Cleaner configuration parameters [hudi]

2024-03-12 Thread via GitHub
hudi-bot commented on PR #10851: URL: https://github.com/apache/hudi/pull/10851#issuecomment-1991462452 ## CI report: * 7a74bf1e9e175c7ea4ad31c99f6fc88db81b46ee UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run th

Re: [I] [SUPPORT] Incremental query not working on COW table [hudi]

2024-03-12 Thread via GitHub
ad1happy2go commented on issue #10850: URL: https://github.com/apache/hudi/issues/10850#issuecomment-1991461575 @NishantBaheti I checked before; the incremental query works fine with 0.14.1. Can you paste the full reproducible script or the table/writer properties you used to populate. Which

Re: [I] [SUPPORT] Incremental query not working on COW table [hudi]

2024-03-12 Thread via GitHub
ad1happy2go commented on issue #10850: URL: https://github.com/apache/hudi/issues/10850#issuecomment-1991460541 @NishantBaheti I checked before; the incremental query works fine with 0.14.1. Can you paste the full reproducible script or the table/writer properties you used to populate. I chec

Re: [I] [SUPPORT] Needed a way to load the specific data from the HUDI DATALAKE. [hudi]

2024-03-12 Thread via GitHub
jayesh2424 commented on issue #10852: URL: https://github.com/apache/hudi/issues/10852#issuecomment-1991447595 @ad1happy2go Okay, maybe the question is not clear. But what you have suggested is to do a full load of the entire datalake and then have it in a dataframe. So that doing df.createOrRe

Re: [I] [SUPPORT] Needed a way to load the specific data from the HUDI DATALAKE. [hudi]

2024-03-12 Thread via GitHub
ad1happy2go commented on issue #10852: URL: https://github.com/apache/hudi/issues/10852#issuecomment-1991416169 @jayesh2424 Sorry, but I am not exactly clear on the question. In case you are asking how to read a specific part of the table, you can read a data frame and do where/filter on
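
The where/filter suggestion in the reply above can be sketched in PySpark roughly as follows. This is only an illustration under stated assumptions: the helper names `hudi_read_options` and `load_subset`, the base path, and the predicate are hypothetical and do not come from the thread. The option keys are the standard Hudi datasource read configs. Whether a filter actually avoids a full scan depends on the table's partitioning and on the predicate.

```python
from typing import Dict, Optional


def hudi_read_options(begin_instant: Optional[str] = None,
                      end_instant: Optional[str] = None) -> Dict[str, str]:
    """Build Hudi reader options (hypothetical helper).

    With no begin_instant this is a plain snapshot read; with one,
    only records committed after that instant are returned.
    """
    if begin_instant is None:
        return {"hoodie.datasource.query.type": "snapshot"}
    opts = {
        "hoodie.datasource.query.type": "incremental",
        "hoodie.datasource.read.begin.instanttime": begin_instant,
    }
    if end_instant is not None:
        opts["hoodie.datasource.read.end.instanttime"] = end_instant
    return opts


def load_subset(spark, base_path: str, predicate: str,
                begin_instant: Optional[str] = None):
    """Read a Hudi table and keep only rows matching a SQL predicate.

    `spark` is an existing SparkSession with the Hudi bundle on its
    classpath; `predicate` is a SQL expression such as "region = 'EU'".
    """
    reader = spark.read.format("hudi").options(**hudi_read_options(begin_instant))
    return reader.load(base_path).where(predicate)
```

A call might look like `load_subset(spark, "s3://my-bucket/my_table", "region = 'EU'")`, where the bucket, table, and column names are placeholders. Filtering on a partition column is what usually lets Spark skip whole directories instead of loading the full table.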

Re: [I] [SUPPORT] Needed a way to load the specific data from the HUDI DATALAKE. [hudi]

2024-03-12 Thread via GitHub
jayesh2424 commented on issue #10852: URL: https://github.com/apache/hudi/issues/10852#issuecomment-1991371544 @xushiyan, @ad1happy2go and @codope could you please help me out with this ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[I] [SUPPORT] Needed a way to load the specific data from the HUDI DATALAKE. [hudi]

2024-03-12 Thread via GitHub
jayesh2424 opened a new issue, #10852: URL: https://github.com/apache/hudi/issues/10852 I have a Hudi datalake in AWS. Currently, for an ETL operation, I usually do a full load of the Hudi datalake. I want to know how I can have only a particular set of data from the Hu

[PR] [HUDI-7493] Consistent naming of Clean configuration parameters [hudi]

2024-03-12 Thread via GitHub
geserdugarov opened a new pull request, #10851: URL: https://github.com/apache/hudi/pull/10851 ### Change Logs `ConfigProperty.key()` and `ConfigOption.key()` are used for generating docs, and we need to move towards consistent naming of all parameters in Hudi. This MR proposes the f

[jira] [Updated] (HUDI-7493) Clean configuration for clean service

2024-03-12 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7493: - Labels: pull-request-available (was: ) > Clean configuration for clean service >

[jira] [Commented] (HUDI-7493) Clean configuration for clean service

2024-03-12 Thread Geser Dugarov (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17825613#comment-17825613 ] Geser Dugarov commented on HUDI-7493: - Could be labeled by the ["Config Simplification"

Re: [PR] [HUDI-6806] Support Spark 3.5.0 [hudi]

2024-03-12 Thread via GitHub
ranwani commented on PR #9717: URL: https://github.com/apache/hudi/pull/9717#issuecomment-1991259976 @yihua : We need to use Hudi with Spark 3.5. Can you let me know when the Hudi 0.15.0 release is planned? -- This is an automated message from the Apache Git Service. To respond to the messag

Re: [I] [SUPPORT] Incremental query not working on COW table [hudi]

2024-03-12 Thread via GitHub
NishantBaheti commented on issue #10850: URL: https://github.com/apache/hudi/issues/10850#issuecomment-1991167509 Hello, I am using this jar - hudi-spark3.3-bundle_2.12-0.14.1.jar - spark 3.3 - hudi 0.14.1 -- This is an automated message from the Apache Git Service. To respond to

Re: [I] [SUPPORT] Incremental query not working on COW table [hudi]

2024-03-12 Thread via GitHub
danny0405 commented on issue #10850: URL: https://github.com/apache/hudi/issues/10850#issuecomment-1991109123 Hi @NishantBaheti, thanks for your feedback. Could you also provide the release versions of Spark and Hudi, respectively? -- This is an automated message from the Apache Git S

Re: [PR] [HUDI-7436] Fix the conditions for determining whether the records need to be rewritten [hudi]

2024-03-12 Thread via GitHub
danny0405 commented on code in PR #10727: URL: https://github.com/apache/hudi/pull/10727#discussion_r1521102072 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/commit/HoodieMergeHelper.java: ## @@ -202,7 +202,9 @@ private Option> composeSchemaEvolut

Re: [PR] [HUDI-7436] Fix the conditions for determining whether the records need to be rewritten [hudi]

2024-03-12 Thread via GitHub
xiarixiaoyao commented on code in PR #10727: URL: https://github.com/apache/hudi/pull/10727#discussion_r1521039170 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/commit/HoodieMergeHelper.java: ## @@ -202,7 +202,9 @@ private Option> composeSchemaEvo

[I] [SUPPORT] Incremental query not working on COW table [hudi]

2024-03-12 Thread via GitHub
NishantBaheti opened a new issue, #10850: URL: https://github.com/apache/hudi/issues/10850 ## Error Error Category: QUERY_ERROR; AnalysisException: Found duplicate column(s) in the data schema: `_hoodie_commit_seqno`, `_hoodie_commit_time`, `_hoodie_file_name`, `_hoodie_partition_path`,
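
When Spark reports "Found duplicate column(s)", one quick check is to look for names that occur more than once in the column list being read. The sketch below is a generic, hypothetical helper (`duplicate_columns` is not part of Hudi or Spark), and it does not claim to identify the cause of this particular issue. It compares names case-insensitively, which matches Spark's default behaviour.

```python
from collections import Counter
from typing import Iterable, List


def duplicate_columns(columns: Iterable[str]) -> List[str]:
    """Return the column names that occur more than once, ignoring case."""
    counts = Counter(name.lower() for name in columns)
    return sorted(name for name, n in counts.items() if n > 1)


# Illustrative column list, not taken from the reported table.
example = ["_hoodie_commit_time", "id", "_hoodie_commit_time", "ID", "value"]
print(duplicate_columns(example))
```

On a real DataFrame or schema, the list of names would normally come from `df.columns` or `schema.fieldNames()`.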