[GitHub] [hudi] hudi-bot commented on pull request #6634: [HUDI-4813] Fix infer keygen not work in sparksql side issue

2022-09-08 Thread GitBox
hudi-bot commented on PR #6634: URL: https://github.com/apache/hudi/pull/6634#issuecomment-1241053170 ## CI report: * 1a7511e07003745ea2d0d7a802ea5bf1731bd9a4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1125

[GitHub] [hudi] hudi-bot commented on pull request #6633: [HUDI-4811] Fix the checkstyle of hudi flink

2022-09-08 Thread GitBox
hudi-bot commented on PR #6633: URL: https://github.com/apache/hudi/pull/6633#issuecomment-1241053135 ## CI report: * 0c0a1c94e70675f5034fd4fd8bcc0812f27a272c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124

[GitHub] [hudi] rahil-c commented on issue #6552: [SUPPORT] AWSDmsAvroPayload does not work correctly with any version above 0.10.0

2022-09-08 Thread GitBox
rahil-c commented on issue #6552: URL: https://github.com/apache/hudi/issues/6552#issuecomment-1241038767 Draft pr: https://github.com/apache/hudi/pull/6637 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abov

[GitHub] [hudi] minihippo commented on a diff in pull request #4676: [HUDI-3304] support partial update on mor table

2022-09-08 Thread GitBox
minihippo commented on code in PR #4676: URL: https://github.com/apache/hudi/pull/4676#discussion_r966225565 ## hudi-common/src/main/java/org/apache/hudi/common/model/PartialUpdateAvroPayload.java: ## @@ -0,0 +1,196 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

[GitHub] [hudi] rahil-c opened a new pull request, #6637: Fix AWSDmsAvroPayload#getInsertValue,combineAndGetUpdateValue to invo…

2022-09-08 Thread GitBox
rahil-c opened a new pull request, #6637: URL: https://github.com/apache/hudi/pull/6637 …ke correct api ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature cha

[GitHub] [hudi] hudi-bot commented on pull request #6632: [HUDI-4753] more accurate record size estimation for log writing and spillable map

2022-09-08 Thread GitBox
hudi-bot commented on PR #6632: URL: https://github.com/apache/hudi/pull/6632#issuecomment-1240978121 ## CI report: * d9e12ddf962b670b8ec1e2260d5389c688e16001 UNKNOWN * ba3513d5b65e39f7cbb71e851ddd34cfe9d846a0 UNKNOWN * a77a425c017a4be221304a12f6330bbc986f3dd2 Azure: [SUCCES

[GitHub] [hudi] minihippo commented on pull request #6630: [HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in lo…

2022-09-08 Thread GitBox
minihippo commented on PR #6630: URL: https://github.com/apache/hudi/pull/6630#issuecomment-1240919039 LGTM

[GitHub] [hudi] minihippo commented on a diff in pull request #6630: [HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in lo…

2022-09-08 Thread GitBox
minihippo commented on code in PR #6630: URL: https://github.com/apache/hudi/pull/6630#discussion_r966149554 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java: ## @@ -72,6 +73,26 @@ public static List getLatestBaseFilesForPartition(

[jira] [Updated] (HUDI-3617) MOR compact improve

2022-09-08 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3617: - Sprint: 2022/09/19 > MOR compact improve > --- > > Key: HUDI-3617 >

[jira] [Updated] (HUDI-3617) MOR compact improve

2022-09-08 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3617: - Sprint: (was: 2022/09/05) > MOR compact improve > --- > > Key: HUDI-3617

[GitHub] [hudi] hudi-bot commented on pull request #5091: [HUDI-3453] Fix HoodieBackedTableMetadata concurrent reading issue

2022-09-08 Thread GitBox
hudi-bot commented on PR #5091: URL: https://github.com/apache/hudi/pull/5091#issuecomment-1240892039 ## CI report: * 0ac5e2ce35cf25599bcc7274a9f3dbbc59200adb Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124

[GitHub] [hudi] hudi-bot commented on pull request #6633: [HUDI-4811] Fix the checkstyle of hudi flink

2022-09-08 Thread GitBox
hudi-bot commented on PR #6633: URL: https://github.com/apache/hudi/pull/6633#issuecomment-1240887022 ## CI report: * fc4fcb0d0b7f98e4925f806a58d5dfbaf4a2be9c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124

[GitHub] [hudi] minihippo commented on pull request #6595: [HUDI-4777] Fix flink gen bucket index of mor table not consistent wi…

2022-09-08 Thread GitBox
minihippo commented on PR #6595: URL: https://github.com/apache/hudi/pull/6595#issuecomment-1240879029 > but on the Flink side, I think deduplication should also be enabled by default for MOR tables; when duplicates are written to the log file, it is very hard for compaction to read them, which also makes the MOR table unstable

[GitHub] [hudi] joao-miranda commented on issue #6601: [SUPPORT] "default" folder not outputted by Hudi for non-partitioned tables when used with Spark

2022-09-08 Thread GitBox
joao-miranda commented on issue #6601: URL: https://github.com/apache/hudi/issues/6601#issuecomment-1240849925 Thank you for replying. You are correct that we should be using a different Generator. However that ended up not being necessary since our solution was just to add an empty c

[GitHub] [hudi] joao-miranda closed issue #6601: [SUPPORT] "default" folder not outputted by Hudi for non-partitioned tables when used with Spark

2022-09-08 Thread GitBox
joao-miranda closed issue #6601: [SUPPORT] "default" folder not outputted by Hudi for non-partitioned tables when used with Spark URL: https://github.com/apache/hudi/issues/6601

[GitHub] [hudi] hudi-bot commented on pull request #5478: [HUDI-3998] Fix getCommitsSinceLastCleaning failed when async cleaning

2022-09-08 Thread GitBox
hudi-bot commented on PR #5478: URL: https://github.com/apache/hudi/pull/5478#issuecomment-1240822244 ## CI report: * 07f8c3922c20d3350a21ead05f0104ba57af0092 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124

[GitHub] [hudi] hudi-bot commented on pull request #5478: [HUDI-3998] Fix getCommitsSinceLastCleaning failed when async cleaning

2022-09-08 Thread GitBox
hudi-bot commented on PR #5478: URL: https://github.com/apache/hudi/pull/5478#issuecomment-1240814678 ## CI report: * 07f8c3922c20d3350a21ead05f0104ba57af0092 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124

[GitHub] [hudi] hudi-bot commented on pull request #6632: [HUDI-4753] more accurate record size estimation for log writing and spillable map

2022-09-08 Thread GitBox
hudi-bot commented on PR #6632: URL: https://github.com/apache/hudi/pull/6632#issuecomment-1240793844 ## CI report: * d9e12ddf962b670b8ec1e2260d5389c688e16001 UNKNOWN * ba3513d5b65e39f7cbb71e851ddd34cfe9d846a0 UNKNOWN * aa48e1f17d294980b6219a52d031c8a6cd8f Azure: [PENDIN

[GitHub] [hudi] nsivabalan commented on pull request #5958: [HUDI-3900] [UBER] Support log compaction action for MOR tables

2022-09-08 Thread GitBox
nsivabalan commented on PR #5958: URL: https://github.com/apache/hudi/pull/5958#issuecomment-1240759220 hey @suryaprasanna: we also wanted to see if it's feasible to not introduce a new action, but re-use the ones for compaction. Can you check the feasibility of that?

[jira] [Updated] (HUDI-4815) Hudi report metrics to AWS cloudwatch error

2022-09-08 Thread Power Liu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Power Liu updated HUDI-4815: Description: *HUDI version:* 0.11.1 *Application:* Spark 3.1.3 *Running Environment:* AWS EKS pod (kuberne

[jira] [Created] (HUDI-4815) Hudi report metrics to AWS cloudwatch error

2022-09-08 Thread Power Liu (Jira)
Power Liu created HUDI-4815: --- Summary: Hudi report metrics to AWS cloudwatch error Key: HUDI-4815 URL: https://issues.apache.org/jira/browse/HUDI-4815 Project: Apache Hudi Issue Type: Bug

[jira] [Updated] (HUDI-4815) Hudi report metrics to AWS cloudwatch error

2022-09-08 Thread Power Liu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Power Liu updated HUDI-4815: Description: *HUDI version:* 0.11.1 *Application:* Spark 3.1.3 *Running Environment:* AWS EKS pod (kuberne

[GitHub] [hudi] hudi-bot commented on pull request #6636: add new index RANGE_BUCKET , when primary key is auto-increment like most mysql table

2022-09-08 Thread GitBox
hudi-bot commented on PR #6636: URL: https://github.com/apache/hudi/pull/6636#issuecomment-1240718297 ## CI report: * b837b813fb706508b1fccc0924f839275e9373c3 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1125

[GitHub] [hudi] codope commented on pull request #6629: [HUDI-4807] Use base table instant for metadata table initialization

2022-09-08 Thread GitBox
codope commented on PR #6629: URL: https://github.com/apache/hudi/pull/6629#issuecomment-1240716869 @nsivabalan Can you take a look as well?

[GitHub] [hudi] hudi-bot commented on pull request #6636: add new index RANGE_BUCKET , when primary key is auto-increment like most mysql table

2022-09-08 Thread GitBox
hudi-bot commented on PR #6636: URL: https://github.com/apache/hudi/pull/6636#issuecomment-1240711307 ## CI report: * b837b813fb706508b1fccc0924f839275e9373c3 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[jira] [Commented] (HUDI-3391) presto and hive beeline fails to read MOR table w/ 2 or more array fields

2022-09-08 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17601819#comment-17601819 ] Sagar Sumit commented on HUDI-3391: --- I have attempted to reproduce the issue with latest

[GitHub] [hudi] hudi-bot commented on pull request #6196: [HUDI-4071] Enable schema reconciliation by default

2022-09-08 Thread GitBox
hudi-bot commented on PR #6196: URL: https://github.com/apache/hudi/pull/6196#issuecomment-1240702987 ## CI report: * 04542752f07caf843d43cc25efacfb487b5b79d3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1026

[jira] [Updated] (HUDI-4661) Test COW: Hive QL with bootstrap

2022-09-08 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-4661: -- Status: In Progress (was: Open) > Test COW: Hive QL with bootstrap >

[GitHub] [hudi] wqwl611 opened a new pull request, #6636: add new index RANGE_BUCKET

2022-09-08 Thread GitBox
wqwl611 opened a new pull request, #6636: URL: https://github.com/apache/hudi/pull/6636 ### Change Logs Usually, in the mysql... table, there is an auto-increment primary key. Based on this fact and the Bucket index, we propose a Range_Bucket index, which gets good performance in my practice.

[GitHub] [hudi] hudi-bot commented on pull request #5478: [HUDI-3998] Fix getCommitsSinceLastCleaning failed when async cleaning

2022-09-08 Thread GitBox
hudi-bot commented on PR #5478: URL: https://github.com/apache/hudi/pull/5478#issuecomment-1240626015 ## CI report: * 07f8c3922c20d3350a21ead05f0104ba57af0092 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124

[GitHub] [hudi] TJX2014 commented on pull request #6634: [HUDI-4813] Fix infer keygen not work in sparksql side issue

2022-09-08 Thread GitBox
TJX2014 commented on PR #6634: URL: https://github.com/apache/hudi/pull/6634#issuecomment-1240623881 Seems `TestHoodieSparkSqlWriter.testToWriteWithoutParametersIncludedInHoodieTableConfig:868 » NoSuchElement` is not related to this PR.

[GitHub] [hudi] hudi-bot commented on pull request #6196: [HUDI-4071] Enable schema reconciliation by default

2022-09-08 Thread GitBox
hudi-bot commented on PR #6196: URL: https://github.com/apache/hudi/pull/6196#issuecomment-1240561233 ## CI report: * 5a9d4eb8ff3160e20c534d4eff1912a07ba4e9fd Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124

[GitHub] [hudi] hudi-bot commented on pull request #6502: HUDI-4722 Added locking metrics for Hudi

2022-09-08 Thread GitBox
hudi-bot commented on PR #6502: URL: https://github.com/apache/hudi/pull/6502#issuecomment-1240556180 ## CI report: * fbedf9a29c4c574ad4d69406416dbb057c080345 UNKNOWN * 8b1585464429a60d9eff4cfa2cb9f937b1ac6f0d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #6196: [HUDI-4071] Enable schema reconciliation by default

2022-09-08 Thread GitBox
hudi-bot commented on PR #6196: URL: https://github.com/apache/hudi/pull/6196#issuecomment-1240555646 ## CI report: * 5a9d4eb8ff3160e20c534d4eff1912a07ba4e9fd Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124

[GitHub] [hudi] hudi-bot commented on pull request #6631: [HUDI-4810] Fixing Hudi bundles requiring log4j2 on the classpath

2022-09-08 Thread GitBox
hudi-bot commented on PR #6631: URL: https://github.com/apache/hudi/pull/6631#issuecomment-1240550924 ## CI report: * e8e8c4d8047b5985764f7534bd84e82763c3ad28 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124

[GitHub] [hudi] guanziyue commented on pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

2022-09-08 Thread GitBox
guanziyue commented on PR #6602: URL: https://github.com/apache/hudi/pull/6602#issuecomment-1240533451 > SizeAwareFSDataOutputStream Thanks for your clarification. I do agree with you that the correct way is to change the result returned by SizeAwareFSDataOutputStream.

[GitHub] [hudi] TJX2014 commented on pull request #6634: [HUDI-4813] Fix infer keygen not work in sparksql side issue

2022-09-08 Thread GitBox
TJX2014 commented on PR #6634: URL: https://github.com/apache/hudi/pull/6634#issuecomment-1240518290 I will check this CI failure locally.

[GitHub] [hudi] wangzheyuan opened a new issue, #6635: [SUPPORT] Failed to build hudi 0.12.0 with spark 3.2.2

2022-09-08 Thread GitBox
wangzheyuan opened a new issue, #6635: URL: https://github.com/apache/hudi/issues/6635 **Describe the problem you faced** Failed to build hudi 0.12.0 with spark 3.2.2 **To Reproduce** Steps to reproduce the behavior: ```shell mvn clean package -DskipTests -Dscal

[GitHub] [hudi] hudi-bot commented on pull request #6632: [HUDI-4753] more accurate record size estimation for log writing and spillable map

2022-09-08 Thread GitBox
hudi-bot commented on PR #6632: URL: https://github.com/apache/hudi/pull/6632#issuecomment-1240488982 ## CI report: * d9e12ddf962b670b8ec1e2260d5389c688e16001 UNKNOWN * ba3513d5b65e39f7cbb71e851ddd34cfe9d846a0 UNKNOWN * aa48e1f17d294980b6219a52d031c8a6cd8f Azure: [PENDIN

[GitHub] [hudi] hudi-bot commented on pull request #6634: [HUDI-4813] Fix infer keygen not work in sparksql side issue

2022-09-08 Thread GitBox
hudi-bot commented on PR #6634: URL: https://github.com/apache/hudi/pull/6634#issuecomment-1240483194 ## CI report: * 1a7511e07003745ea2d0d7a802ea5bf1731bd9a4 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1125

[GitHub] [hudi] hudi-bot commented on pull request #6632: [HUDI-4753] more accurate record size estimation for log writing and spillable map

2022-09-08 Thread GitBox
hudi-bot commented on PR #6632: URL: https://github.com/apache/hudi/pull/6632#issuecomment-1240483085 ## CI report: * d9e12ddf962b670b8ec1e2260d5389c688e16001 UNKNOWN * ba3513d5b65e39f7cbb71e851ddd34cfe9d846a0 UNKNOWN * aa48e1f17d294980b6219a52d031c8a6cd8f Azure: [PENDIN

[GitHub] [hudi] hudi-bot commented on pull request #6634: [HUDI-4813] Fix infer keygen not work in sparksql side issue

2022-09-08 Thread GitBox
hudi-bot commented on PR #6634: URL: https://github.com/apache/hudi/pull/6634#issuecomment-1240477132 ## CI report: * 1a7511e07003745ea2d0d7a802ea5bf1731bd9a4 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #5269: [HUDI-3636] Create new write clients for async table services in DeltaStreamer

2022-09-08 Thread GitBox
hudi-bot commented on PR #5269: URL: https://github.com/apache/hudi/pull/5269#issuecomment-1240474858 ## CI report: * a360d286f9a9bff3f60cc7231bc0abfe86675a88 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124

[GitHub] [hudi] hudi-bot commented on pull request #6630: [HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in lo…

2022-09-08 Thread GitBox
hudi-bot commented on PR #6630: URL: https://github.com/apache/hudi/pull/6630#issuecomment-1240470610 ## CI report: * 85a8f5166c17ec5ce9fa00e2c38846f440582acf Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1123

[GitHub] [hudi] hudi-bot commented on pull request #6629: [HUDI-4807] Use base table instant for metadata table initialization

2022-09-08 Thread GitBox
hudi-bot commented on PR #6629: URL: https://github.com/apache/hudi/pull/6629#issuecomment-1240470563 ## CI report: * c88a869d5d8e748edac75698c7c504176a06e47d Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1123

[jira] [Updated] (HUDI-4814) There are many request and inflight clustering in the .hoodie directory

2022-09-08 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric updated HUDI-4814: --- Description: [#6573 |https://github.com/apache/hudi/issues/6573] (was: [[SUPPORT] There are many request and inflight

[jira] [Updated] (HUDI-4814) There are many request and inflight clustering in the .hoodie directory

2022-09-08 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric updated HUDI-4814: --- Description: [[SUPPORT] There are many request and inflight clustering in the .hoodie directory. · Issue #6573 · apach

[hudi] branch master updated (adf36093d2 -> dcb55b7019)

2022-09-08 Thread forwardxu
This is an automated email from the ASF dual-hosted git repository. forwardxu pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from adf36093d2 [HUDI-4766] Strengthen flink clustering job (#6566) add dcb55b7019 [HUDI-4797] fix merge into table f

[GitHub] [hudi] XuQianJin-Stars merged pull request #6620: [HUDI-4797] fix merge into table for source table with different column order

2022-09-08 Thread GitBox
XuQianJin-Stars merged PR #6620: URL: https://github.com/apache/hudi/pull/6620

[jira] [Reopened] (HUDI-4814) There are many request and inflight clustering in the .hoodie directory

2022-09-08 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric reopened HUDI-4814: > There are many request and inflight clustering in the .hoodie directory >

[jira] [Resolved] (HUDI-4814) There are many request and inflight clustering in the .hoodie directory

2022-09-08 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric resolved HUDI-4814. > There are many request and inflight clustering in the .hoodie directory >

[jira] [Updated] (HUDI-4814) There are many request and inflight clustering in the .hoodie directory

2022-09-08 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric updated HUDI-4814: --- Description: #6573 (was: [ #6573|https://github.com/apache/hudi/issues/6573] [#6574|https://github.com/apache/hudi/p

[jira] [Updated] (HUDI-4814) There are many request and inflight clustering in the .hoodie directory

2022-09-08 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric updated HUDI-4814: --- Description: [ #6573|https://github.com/apache/hudi/issues/6573] [#6574|https://github.com/apache/hudi/pull/6574]

[GitHub] [hudi] eric9204 commented on issue #6573: [SUPPORT] There are many request and inflight clustering in the .hoodie directory.

2022-09-08 Thread GitBox
eric9204 commented on issue #6573: URL: https://github.com/apache/hudi/issues/6573#issuecomment-1240442753 https://issues.apache.org/jira/projects/HUDI/issues/HUDI-4814?filter=allissues

[jira] [Updated] (HUDI-4814) There are many request and inflight clustering in the .hoodie directory

2022-09-08 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric updated HUDI-4814: --- Description: [[SUPPORT] There are many request and inflight clustering in the .hoodie directory. · Issue #6573 · apac

[jira] [Updated] (HUDI-4814) There are many request and inflight clustering in the .hoodie directory

2022-09-08 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric updated HUDI-4814: --- Description: [[SUPPORT] There are many request and inflight clustering in the .hoodie directory. · Issue #6573 · apach

[jira] [Updated] (HUDI-4814) There are many request and inflight clustering in the .hoodie directory

2022-09-08 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric updated HUDI-4814: --- Description: h1. #6573 > There are many request and inflight clustering in the .hoodie directory > ---

[jira] [Updated] (HUDI-4814) There are many request and inflight clustering in the .hoodie directory

2022-09-08 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric updated HUDI-4814: --- Component/s: clustering > There are many request and inflight clustering in the .hoodie directory > --

[jira] [Updated] (HUDI-4814) There are many request and inflight clustering in the .hoodie directory

2022-09-08 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric updated HUDI-4814: --- Fix Version/s: 0.13.0 > There are many request and inflight clustering in the .hoodie directory >

[jira] [Created] (HUDI-4814) There are many request and inflight clustering in the .hoodie directory

2022-09-08 Thread eric (Jira)
eric created HUDI-4814: -- Summary: There are many request and inflight clustering in the .hoodie directory Key: HUDI-4814 URL: https://issues.apache.org/jira/browse/HUDI-4814 Project: Apache Hudi Issue

[jira] [Updated] (HUDI-4814) There are many request and inflight clustering in the .hoodie directory

2022-09-08 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric updated HUDI-4814: --- Affects Version/s: 0.12.0 > There are many request and inflight clustering in the .hoodie directory >

[GitHub] [hudi] TJX2014 commented on pull request #6634: [HUDI-4813] Fix infer keygen not work in sparksql side issue

2022-09-08 Thread GitBox
TJX2014 commented on PR #6634: URL: https://github.com/apache/hudi/pull/6634#issuecomment-1240430808 Hi @danny0405, after testing, I find that the inferFunc of https://github.com/apache/hudi/pull/5815 is not called; please correct me if I am wrong.

[jira] [Updated] (HUDI-4813) Infer keygen not work in sparksql side

2022-09-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4813: - Labels: pull-request-available (was: ) > Infer keygen not work in sparksql side > ---

[GitHub] [hudi] TJX2014 opened a new pull request, #6634: [HUDI-4813] Fix infer keygen not work in sparksql side issue

2022-09-08 Thread GitBox
TJX2014 opened a new pull request, #6634: URL: https://github.com/apache/hudi/pull/6634 ### Change Logs Fix KEY_GENERATOR_CLASS_NAME will be assigned default as `ComplexKeyGenerator` in `org.apache.spark.sql.catalyst.catalog.HoodieCatalogTable#extraTableConfig` The `org.apache.hudi.Da

[jira] [Created] (HUDI-4813) Infer keygen not work in sparksql side

2022-09-08 Thread JinxinTang (Jira)
JinxinTang created HUDI-4813: Summary: Infer keygen not work in sparksql side Key: HUDI-4813 URL: https://issues.apache.org/jira/browse/HUDI-4813 Project: Apache Hudi Issue Type: Bug

[GitHub] [hudi] hudi-bot commented on pull request #6633: [HUDI-4811] Fix the checkstyle of hudi flink

2022-09-08 Thread GitBox
hudi-bot commented on PR #6633: URL: https://github.com/apache/hudi/pull/6633#issuecomment-1240411438 ## CI report: * fc4fcb0d0b7f98e4925f806a58d5dfbaf4a2be9c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124

[GitHub] [hudi] hudi-bot commented on pull request #6196: [HUDI-4071] Enable schema reconciliation by default

2022-09-08 Thread GitBox
hudi-bot commented on PR #6196: URL: https://github.com/apache/hudi/pull/6196#issuecomment-1240410317 ## CI report: * 5a9d4eb8ff3160e20c534d4eff1912a07ba4e9fd Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124

[GitHub] [hudi] hudi-bot commented on pull request #6196: [HUDI-4071] Enable schema reconciliation by default

2022-09-08 Thread GitBox
hudi-bot commented on PR #6196: URL: https://github.com/apache/hudi/pull/6196#issuecomment-1240404127 ## CI report: * 5a9d4eb8ff3160e20c534d4eff1912a07ba4e9fd Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124

[GitHub] [hudi] hudi-bot commented on pull request #6633: [HUDI-4811] Fix the checkstyle of hudi flink

2022-09-08 Thread GitBox
hudi-bot commented on PR #6633: URL: https://github.com/apache/hudi/pull/6633#issuecomment-1240398984 ## CI report: * fc4fcb0d0b7f98e4925f806a58d5dfbaf4a2be9c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124

[GitHub] [hudi] hudi-bot commented on pull request #6632: [HUDI-4753] more accurate record size estimation for log writing and spillable map

2022-09-08 Thread GitBox
hudi-bot commented on PR #6632: URL: https://github.com/apache/hudi/pull/6632#issuecomment-1240398927 ## CI report: * d9e12ddf962b670b8ec1e2260d5389c688e16001 UNKNOWN * ba3513d5b65e39f7cbb71e851ddd34cfe9d846a0 UNKNOWN * aa48e1f17d294980b6219a52d031c8a6cd8f Azure: [PENDIN

[GitHub] [hudi] hudi-bot commented on pull request #6633: [HUDI-4811] Fix the checkstyle of hudi flink

2022-09-08 Thread GitBox
hudi-bot commented on PR #6633: URL: https://github.com/apache/hudi/pull/6633#issuecomment-1240392517 ## CI report: * fc4fcb0d0b7f98e4925f806a58d5dfbaf4a2be9c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124

[GitHub] [hudi] hudi-bot commented on pull request #6574: Keep a clustering running at the same time.#6573

2022-09-08 Thread GitBox
hudi-bot commented on PR #6574: URL: https://github.com/apache/hudi/pull/6574#issuecomment-1240392195 ## CI report: * b158b5a580ffb609380dcac27a299c9a7557d649 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1123

[GitHub] [hudi] dongkelun commented on pull request #5406: [HUDI-3954] Don't keep the last commit before the earliest commit to retain

2022-09-08 Thread GitBox
dongkelun commented on PR #5406: URL: https://github.com/apache/hudi/pull/5406#issuecomment-1240374420 > hey @dongkelun : maybe there is some rationale behind the original intent. It's just deducting 1 commit from what the user wants, right? As of now, I don't feel this is giving us much or fixin

[GitHub] [hudi] YuweiXiao commented on a diff in pull request #6600: [RFC-62] Diagnostic Reporter

2022-09-08 Thread GitBox
YuweiXiao commented on code in PR #6600: URL: https://github.com/apache/hudi/pull/6600#discussion_r965619569 ## rfc/rfc-62/rfc-62.md: ## @@ -0,0 +1,443 @@ + +# RFC-62: Diagnostic Reporter + + + +## Proposers + +- zhangyue19921...@163.com + +## Approvers + - @codope + - @xushiyan

[GitHub] [hudi] TJX2014 commented on a diff in pull request #6630: [HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in lo…

2022-09-08 Thread GitBox
TJX2014 commented on code in PR #6630: URL: https://github.com/apache/hudi/pull/6630#discussion_r965606704 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java: ## @@ -72,6 +73,26 @@ public static List getLatestBaseFilesForPartition(

[GitHub] [hudi] hudi-bot commented on pull request #6633: [HUDI-4811] Fix the checkstyle of hudi flink

2022-09-08 Thread GitBox
hudi-bot commented on PR #6633: URL: https://github.com/apache/hudi/pull/6633#issuecomment-1240337342 ## CI report: * fc4fcb0d0b7f98e4925f806a58d5dfbaf4a2be9c UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #5091: [HUDI-3453] Fix HoodieBackedTableMetadata concurrent reading issue

2022-09-08 Thread GitBox
hudi-bot commented on PR #5091: URL: https://github.com/apache/hudi/pull/5091#issuecomment-1240335341 ## CI report: * c711e86c12cc97e9bb28afefe1de0334a07d840a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1123

[jira] [Created] (HUDI-4812) Delay file groups fetching after partition prune in Spark Query

2022-09-08 Thread Yuwei Xiao (Jira)
Yuwei Xiao created HUDI-4812: Summary: Delay file groups fetching after partition prune in Spark Query Key: HUDI-4812 URL: https://issues.apache.org/jira/browse/HUDI-4812 Project: Apache Hudi Is

[GitHub] [hudi] hudi-bot commented on pull request #6632: [HUDI-4753] more accurate record size estimation for log writing and spillable map

2022-09-08 Thread GitBox
hudi-bot commented on PR #6632: URL: https://github.com/apache/hudi/pull/6632#issuecomment-1240331863 ## CI report: * d9e12ddf962b670b8ec1e2260d5389c688e16001 UNKNOWN * ba3513d5b65e39f7cbb71e851ddd34cfe9d846a0 UNKNOWN * aa48e1f17d294980b6219a52d031c8a6cd8f UNKNOWN

[jira] [Updated] (HUDI-4811) Fix the checkstyle of hudi flink

2022-09-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4811: - Labels: pull-request-available (was: ) > Fix the checkstyle of hudi flink > -

[GitHub] [hudi] danny0405 opened a new pull request, #6633: [HUDI-4811] Fix the checkstyle of hudi flink

2022-09-08 Thread GitBox
danny0405 opened a new pull request, #6633: URL: https://github.com/apache/hudi/pull/6633 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performanc

[GitHub] [hudi] hudi-bot commented on pull request #5091: [HUDI-3453] Fix HoodieBackedTableMetadata concurrent reading issue

2022-09-08 Thread GitBox
hudi-bot commented on PR #5091: URL: https://github.com/apache/hudi/pull/5091#issuecomment-1240329864 ## CI report: * c711e86c12cc97e9bb28afefe1de0334a07d840a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1123

[GitHub] [hudi] hudi-bot commented on pull request #6632: [HUDI-4753] more accurate record size estimation for log writing and spillable map

2022-09-08 Thread GitBox
hudi-bot commented on PR #6632: URL: https://github.com/apache/hudi/pull/6632#issuecomment-1240326450 ## CI report: * d9e12ddf962b670b8ec1e2260d5389c688e16001 UNKNOWN * ba3513d5b65e39f7cbb71e851ddd34cfe9d846a0 UNKNOWN Bot commands @hudi-bot supports the following

[GitHub] [hudi] hudi-bot commented on pull request #6196: [HUDI-4071] Enable schema reconciliation by default

2022-09-08 Thread GitBox
hudi-bot commented on PR #6196: URL: https://github.com/apache/hudi/pull/6196#issuecomment-1240325589 ## CI report: * 5a9d4eb8ff3160e20c534d4eff1912a07ba4e9fd Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1124

[GitHub] [hudi] codope commented on issue #6024: [SUPPORT] DELETE_PARTITION causes AWS Athena Query failure

2022-09-08 Thread GitBox
codope commented on issue #6024: URL: https://github.com/apache/hudi/issues/6024#issuecomment-1240325386 @Gatsby-Lee Thanks and I have noted your point. Would you mind upstreaming your fix (logic that checks if the target partition exists). I believe this would be helpful for other users as

[GitHub] [hudi] loukey-lj commented on pull request #6602: [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large

2022-09-08 Thread GitBox
loukey-lj commented on PR #6602: URL: https://github.com/apache/hudi/pull/6602#issuecomment-1240324396 > Hi loukey-lj, could you share which kind of filesystem you use? HDFS or S3 or any other type. The problem you mentioned should be covered in UT TestHoodieLogFormat#testRollover. I just h

[GitHub] [hudi] codope closed issue #5452: Schema Evolution: Missing column for previous records when new entry does not have the same while upsert.

2022-09-08 Thread GitBox
codope closed issue #5452: Schema Evolution: Missing column for previous records when new entry does not have the same while upsert. URL: https://github.com/apache/hudi/issues/5452 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [hudi] codope commented on issue #5452: Schema Evolution: Missing column for previous records when new entry does not have the same while upsert.

2022-09-08 Thread GitBox
codope commented on issue #5452: URL: https://github.com/apache/hudi/issues/5452#issuecomment-1240323859 Great! Gonna close this issue then. FYI, we also plan to flip the default for schema reconciliation in the next release. See #6196 -- This is an automated message from the Apache Git

[GitHub] [hudi] scxwhite commented on a diff in pull request #5030: [HUDI-3617] MOR compact improve

2022-09-08 Thread GitBox
scxwhite commented on code in PR #5030: URL: https://github.com/apache/hudi/pull/5030#discussion_r965590717 ## hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieMergedLogRecordScanner.java: ## @@ -123,25 +133,24 @@ public long getNumMergedRecordsInLog() { ret

[GitHub] [hudi] hudi-bot commented on pull request #6632: [HUDI-4753] more accurate record size estimation for log writing and spillable map

2022-09-08 Thread GitBox
hudi-bot commented on PR #6632: URL: https://github.com/apache/hudi/pull/6632#issuecomment-1240321256 ## CI report: * d9e12ddf962b670b8ec1e2260d5389c688e16001 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #6196: [HUDI-4071] Enable schema reconciliation by default

2022-09-08 Thread GitBox
hudi-bot commented on PR #6196: URL: https://github.com/apache/hudi/pull/6196#issuecomment-1240320365 ## CI report: * 5a9d4eb8ff3160e20c534d4eff1912a07ba4e9fd UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] codope commented on a diff in pull request #6196: [HUDI-4071] Enable schema reconciliation by default

2022-09-08 Thread GitBox
codope commented on code in PR #6196: URL: https://github.com/apache/hudi/pull/6196#discussion_r965587959 ## hudi-common/src/main/java/org/apache/hudi/common/config/HoodieCommonConfig.java: ## @@ -38,7 +38,7 @@ public class HoodieCommonConfig extends HoodieConfig { public s

[GitHub] [hudi] hudi-bot commented on pull request #5091: [HUDI-3453] Fix HoodieBackedTableMetadata concurrent reading issue

2022-09-08 Thread GitBox
hudi-bot commented on PR #5091: URL: https://github.com/apache/hudi/pull/5091#issuecomment-1240319125 ## CI report: * c711e86c12cc97e9bb28afefe1de0334a07d840a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1123
