[jira] [Updated] (HUDI-6257) upgrade table version with hive style path will not check default path

2023-05-23 Thread KnightChess (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KnightChess updated HUDI-6257: -- Description: upgrading the table version from four to five will check the default partition, but it does not work for table

[GitHub] [hudi] danny0405 closed pull request #8776: [HUDI-5994] Bucket index supports bulk insert row writer

2023-05-23 Thread via GitHub
danny0405 closed pull request #8776: [HUDI-5994] Bucket index supports bulk insert row writer URL: https://github.com/apache/hudi/pull/8776 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[jira] [Created] (HUDI-6257) upgrade table version with hive style path will not check default path

2023-05-23 Thread KnightChess (Jira)
KnightChess created HUDI-6257: - Summary: upgrade table version with hive style path will not check default path Key: HUDI-6257 URL: https://issues.apache.org/jira/browse/HUDI-6257 Project: Apache Hudi

[GitHub] [hudi] xushiyan commented on pull request #8790: [DNM][Test CI][TEST] Hudi 3088 default spark32 3

2023-05-23 Thread via GitHub
xushiyan commented on PR #8790: URL: https://github.com/apache/hudi/pull/8790#issuecomment-1560492299 @zhangyue19921010 thanks!

[GitHub] [hudi] prashantwason commented on a diff in pull request #8724: [HUDI-6220] Add HUDI code version to commit files and hoodie.properties.

2023-05-23 Thread via GitHub
prashantwason commented on code in PR #8724: URL: https://github.com/apache/hudi/pull/8724#discussion_r1203473413 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/common/HoodieSparkEngineContext.java: ## @@ -214,4 +214,17 @@ public List

[GitHub] [hudi] ad1happy2go commented on issue #8791: [SUPPORT] ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.rdd.MapPartitionsRDD.f

2023-05-23 Thread via GitHub
ad1happy2go commented on issue #8791: URL: https://github.com/apache/hudi/issues/8791#issuecomment-1560490347 @lucienoz We normally see this error due to an incompatibility between the Spark and Hudi versions with respect to Scala. Please confirm if you are using this artifact only -

[GitHub] [hudi] prashantwason commented on a diff in pull request #8724: [HUDI-6220] Add HUDI code version to commit files and hoodie.properties.

2023-05-23 Thread via GitHub
prashantwason commented on code in PR #8724: URL: https://github.com/apache/hudi/pull/8724#discussion_r1203470106 ## hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableConfig.java: ## @@ -310,12 +312,16 @@ private static Properties

[GitHub] [hudi] prashantwason commented on a diff in pull request #8724: [HUDI-6220] Add HUDI code version to commit files and hoodie.properties.

2023-05-23 Thread via GitHub
prashantwason commented on code in PR #8724: URL: https://github.com/apache/hudi/pull/8724#discussion_r1203468779 ## hudi-common/src/main/java/org/apache/hudi/HoodieVersion.java: ## @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

[GitHub] [hudi] prashantwason commented on a diff in pull request #8724: [HUDI-6220] Add HUDI code version to commit files and hoodie.properties.

2023-05-23 Thread via GitHub
prashantwason commented on code in PR #8724: URL: https://github.com/apache/hudi/pull/8724#discussion_r1203468378 ## hudi-client/hudi-java-client/src/main/java/org/apache/hudi/client/common/HoodieJavaEngineContext.java: ## @@ -161,4 +162,11 @@ public List

[jira] [Updated] (HUDI-4067) Add Spanner based lock provider for GCP

2023-05-23 Thread kazdy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kazdy updated HUDI-4067: Status: Open (was: In Progress) > Add Spanner based lock provider for GCP >

[jira] [Updated] (HUDI-4068) Add Cosmos based lock provider for Azure

2023-05-23 Thread kazdy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kazdy updated HUDI-4068: Status: Open (was: In Progress) > Add Cosmos based lock provider for Azure >

[jira] [Updated] (HUDI-4067) Add Spanner based lock provider for GCP

2023-05-23 Thread kazdy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kazdy updated HUDI-4067: Status: In Progress (was: Open) > Add Spanner based lock provider for GCP >

[GitHub] [hudi] xushiyan commented on a diff in pull request #8445: [HUDI-3088] Use Spark 3.2 as default Spark version

2023-05-23 Thread via GitHub
xushiyan commented on code in PR #8445: URL: https://github.com/apache/hudi/pull/8445#discussion_r1203466299 ## hudi-spark-datasource/hudi-spark3-common/src/test/java/org/apache/hudi/spark3/internal/TestHoodieDataSourceInternalBatchWrite.java: ## @@ -70,7 +71,7 @@ private

[jira] [Updated] (HUDI-4068) Add Cosmos based lock provider for Azure

2023-05-23 Thread kazdy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kazdy updated HUDI-4068: Status: In Progress (was: Open) > Add Cosmos based lock provider for Azure >

[GitHub] [hudi] xushiyan commented on a diff in pull request #8445: [HUDI-3088] Use Spark 3.2 as default Spark version

2023-05-23 Thread via GitHub
xushiyan commented on code in PR #8445: URL: https://github.com/apache/hudi/pull/8445#discussion_r1203463851 ## hudi-integ-test/pom.xml: ## @@ -100,6 +98,21 @@ test + + + org.apache.parquet + parquet-avro + ${parquet.version} + test +

[GitHub] [hudi] xushiyan commented on a diff in pull request #8445: [HUDI-3088] Use Spark 3.2 as default Spark version

2023-05-23 Thread via GitHub
xushiyan commented on code in PR #8445: URL: https://github.com/apache/hudi/pull/8445#discussion_r1203462529 ## hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/index/hbase/TestSparkHoodieHBaseIndex.java: ## @@ -114,6 +114,9 @@ public class TestSparkHoodieHBaseIndex

[GitHub] [hudi] xushiyan commented on a diff in pull request #8445: [HUDI-3088] Use Spark 3.2 as default Spark version

2023-05-23 Thread via GitHub
xushiyan commented on code in PR #8445: URL: https://github.com/apache/hudi/pull/8445#discussion_r1203461726 ## hudi-client/hudi-client-common/src/test/java/org/apache/hudi/io/storage/TestHoodieHFileReaderWriter.java: ## @@ -198,10 +200,10 @@ public void

[GitHub] [hudi] prashantwason commented on a diff in pull request #8724: [HUDI-6220] Add HUDI code version to commit files and hoodie.properties.

2023-05-23 Thread via GitHub
prashantwason commented on code in PR #8724: URL: https://github.com/apache/hudi/pull/8724#discussion_r1203459409 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/embedded/EmbeddedTimelineService.java: ## @@ -112,7 +112,7 @@ private void setHostAddr(String

[GitHub] [hudi] prashantwason commented on a diff in pull request #8724: [HUDI-6220] Add HUDI code version to commit files and hoodie.properties.

2023-05-23 Thread via GitHub
prashantwason commented on code in PR #8724: URL: https://github.com/apache/hudi/pull/8724#discussion_r1203459158 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/embedded/EmbeddedTimelineService.java: ## @@ -112,7 +112,7 @@ private void setHostAddr(String

[GitHub] [hudi] prashantwason commented on pull request #8609: [HUDI-6154] Introduced retry while reading hoodie.properties to deal with parallel updates.

2023-05-23 Thread via GitHub
prashantwason commented on PR #8609: URL: https://github.com/apache/hudi/pull/8609#issuecomment-1560479644 @nsivabalan PTAL again

[GitHub] [hudi] hudi-bot commented on pull request #8604: [HUDI-6151] Rollback previously applied commits to MDT when operations are retried.

2023-05-23 Thread via GitHub
hudi-bot commented on PR #8604: URL: https://github.com/apache/hudi/pull/8604#issuecomment-1560478315 ## CI report: * f0cf8b8e2b8bc3a2ac30c77762bfe574597a9606 Azure:

[GitHub] [hudi] prashantwason commented on pull request #8606: [MINOR] Check the return value from delete during rollback and finalize to ensure the files actually got deleted.

2023-05-23 Thread via GitHub
prashantwason commented on PR #8606: URL: https://github.com/apache/hudi/pull/8606#issuecomment-1560478261 @danny0405 All checks have passed. PTAL.

[GitHub] [hudi] the-other-tim-brown commented on a diff in pull request #8638: added new exception types

2023-05-23 Thread via GitHub
the-other-tim-brown commented on code in PR #8638: URL: https://github.com/apache/hudi/pull/8638#discussion_r1203453231 ## hudi-common/src/main/java/org/apache/hudi/exception/HoodieMetaSyncException.java: ## @@ -18,12 +18,18 @@ package org.apache.hudi.exception; +import

[GitHub] [hudi] prashantwason commented on a diff in pull request #8605: [HUDI-6152] Fixed the check for older timestamps with second granularity during index tagLocation.

2023-05-23 Thread via GitHub
prashantwason commented on code in PR #8605: URL: https://github.com/apache/hudi/pull/8605#discussion_r1203451740 ## hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieIndex.java: ## @@ -587,6 +594,43 @@ public void

[GitHub] [hudi] zhangyue19921010 closed pull request #8790: [DNM][Test CI][TEST] Hudi 3088 default spark32 3

2023-05-23 Thread via GitHub
zhangyue19921010 closed pull request #8790: [DNM][Test CI][TEST] Hudi 3088 default spark32 3 URL: https://github.com/apache/hudi/pull/8790

[GitHub] [hudi] zhangyue19921010 commented on pull request #8790: [DNM][Test CI][TEST] Hudi 3088 default spark32 3

2023-05-23 Thread via GitHub
zhangyue19921010 commented on PR #8790: URL: https://github.com/apache/hudi/pull/8790#issuecomment-1560475229 Closing this test PR.

[GitHub] [hudi] zhangyue19921010 commented on pull request #8790: [DNM][Test CI][TEST] Hudi 3088 default spark32 3

2023-05-23 Thread via GitHub
zhangyue19921010 commented on PR #8790: URL: https://github.com/apache/hudi/pull/8790#issuecomment-1560474994 `Java CI / test-spark (scala-2.12, spark3.0, hudi-spark-datasource/hudi-spark3.0.x)` passed. We only need to add `3.0.1` in the spark3.0 profile. cc @xushiyan

[GitHub] [hudi] hudi-bot commented on pull request #8604: [HUDI-6151] Rollback previously applied commits to MDT when operations are retried.

2023-05-23 Thread via GitHub
hudi-bot commented on PR #8604: URL: https://github.com/apache/hudi/pull/8604#issuecomment-1560473281 ## CI report: * f1653d9899f1c925e1d662ea8ee6ae26edae573b Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8526: [HUDI-6116] Optimize log block reading by removing seeks to check corrupted blocks.

2023-05-23 Thread via GitHub
hudi-bot commented on PR #8526: URL: https://github.com/apache/hudi/pull/8526#issuecomment-1560473142 ## CI report: * 0f2f4ddd192879cdc6a9c91aa2b2c5c6813ab490 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8526: [HUDI-6116] Optimize log block reading by removing seeks to check corrupted blocks.

2023-05-23 Thread via GitHub
hudi-bot commented on PR #8526: URL: https://github.com/apache/hudi/pull/8526#issuecomment-1560468070 ## CI report: * 0f2f4ddd192879cdc6a9c91aa2b2c5c6813ab490 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8487: [HUDI-6093] Use the correct partitionToReplacedFileIds during commit.

2023-05-23 Thread via GitHub
hudi-bot commented on PR #8487: URL: https://github.com/apache/hudi/pull/8487#issuecomment-1560467979 ## CI report: * 195a90f292a61ec35698ab1d2472c56316c4a611 Azure:

[GitHub] [hudi] prashantwason commented on a diff in pull request #8605: [HUDI-6152] Fixed the check for older timestamps with second granularity during index tagLocation.

2023-05-23 Thread via GitHub
prashantwason commented on code in PR #8605: URL: https://github.com/apache/hudi/pull/8605#discussion_r1203436289 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java: ## @@ -170,9 +171,35 @@ public static List filterKeysFromFile(Path

[GitHub] [hudi] prashantwason commented on a diff in pull request #8605: [HUDI-6152] Fixed the check for older timestamps with second granularity during index tagLocation.

2023-05-23 Thread via GitHub
prashantwason commented on code in PR #8605: URL: https://github.com/apache/hudi/pull/8605#discussion_r1203435656 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java: ## @@ -170,9 +171,35 @@ public static List filterKeysFromFile(Path

[GitHub] [hudi] prashantwason commented on a diff in pull request #8604: [HUDI-6151] Rollback previously applied commits to MDT when operations are retried.

2023-05-23 Thread via GitHub
prashantwason commented on code in PR #8604: URL: https://github.com/apache/hudi/pull/8604#discussion_r1203427378 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/metadata/SparkHoodieBackedTableMetadataWriter.java: ## @@ -161,27 +161,28 @@ protected void

[GitHub] [hudi] prashantwason commented on a diff in pull request #8604: [HUDI-6151] Rollback previously applied commits to MDT when operations are retried.

2023-05-23 Thread via GitHub
prashantwason commented on code in PR #8604: URL: https://github.com/apache/hudi/pull/8604#discussion_r1203427058 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/metadata/SparkHoodieBackedTableMetadataWriter.java: ## @@ -161,27 +161,28 @@ protected void

[GitHub] [hudi] prashantwason commented on a diff in pull request #8604: [HUDI-6151] Rollback previously applied commits to MDT when operations are retried.

2023-05-23 Thread via GitHub
prashantwason commented on code in PR #8604: URL: https://github.com/apache/hudi/pull/8604#discussion_r1203423423 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/metadata/SparkHoodieBackedTableMetadataWriter.java: ## @@ -161,27 +161,28 @@ protected void

[GitHub] [hudi] prashantwason commented on a diff in pull request #8526: [HUDI-6116] Optimize log block reading by removing seeks to check corrupted blocks.

2023-05-23 Thread via GitHub
prashantwason commented on code in PR #8526: URL: https://github.com/apache/hudi/pull/8526#discussion_r1203418156 ## hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFileReader.java: ## @@ -152,98 +153,107 @@ private void addShutDownHook() { // TODO :

[GitHub] [hudi] prashantwason commented on a diff in pull request #8526: [HUDI-6116] Optimize log block reading by removing seeks to check corrupted blocks.

2023-05-23 Thread via GitHub
prashantwason commented on code in PR #8526: URL: https://github.com/apache/hudi/pull/8526#discussion_r1203417127 ## hudi-common/src/test/java/org/apache/hudi/common/functional/TestHoodieLogFormat.java: ## @@ -2571,6 +2579,174 @@ public void

[GitHub] [hudi] prashantwason commented on a diff in pull request #8526: [HUDI-6116] Optimize log block reading by removing seeks to check corrupted blocks.

2023-05-23 Thread via GitHub
prashantwason commented on code in PR #8526: URL: https://github.com/apache/hudi/pull/8526#discussion_r1203416332 ## hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFileReader.java: ## @@ -152,98 +153,107 @@ private void addShutDownHook() { // TODO :

[GitHub] [hudi] parisni commented on a diff in pull request #8716: [HUDI-6226] Support parquet native bloom filters

2023-05-23 Thread via GitHub
parisni commented on code in PR #8716: URL: https://github.com/apache/hudi/pull/8716#discussion_r1203415278 ## hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieBaseParquetWriter.java: ## @@ -67,6 +89,22 @@ public HoodieBaseParquetWriter(Path file,

[GitHub] [hudi] hudi-bot commented on pull request #8792: [HUDI-6256] Fix the data table archiving and MDT cleaning config conf…

2023-05-23 Thread via GitHub
hudi-bot commented on PR #8792: URL: https://github.com/apache/hudi/pull/8792#issuecomment-1560442201 ## CI report: * 778dc47926bc4c02961ca36e0f0e2056e1e99bed Azure:

[GitHub] [hudi] bvaradar commented on pull request #8387: [HUDI-6041] add `options` input to Bootstrap Procedure for passing hudi properties

2023-05-23 Thread via GitHub
bvaradar commented on PR #8387: URL: https://github.com/apache/hudi/pull/8387#issuecomment-1560439825 @lvyanquan: Can you fix the conflicts?

[GitHub] [hudi] hudi-bot commented on pull request #8792: [HUDI-6256] Fix the data table archiving and MDT cleaning config conf…

2023-05-23 Thread via GitHub
hudi-bot commented on PR #8792: URL: https://github.com/apache/hudi/pull/8792#issuecomment-1560438163 ## CI report: * 778dc47926bc4c02961ca36e0f0e2056e1e99bed UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[jira] [Closed] (HUDI-6213) Parallelize deletion of files during rollback.

2023-05-23 Thread Prashant Wason (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Wason closed HUDI-6213. Fix Version/s: 0.14.0 Resolution: Fixed > Parallelize deletion of files during rollback. >

[GitHub] [hudi] prashantwason commented on a diff in pull request #8599: [MINOR] Ensure metrics prefix does not contain any dot.

2023-05-23 Thread via GitHub
prashantwason commented on code in PR #8599: URL: https://github.com/apache/hudi/pull/8599#discussion_r1203396182 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java: ## @@ -2175,7 +2175,8 @@ public boolean

[GitHub] [hudi] hudi-bot commented on pull request #8638: added new exception types

2023-05-23 Thread via GitHub
hudi-bot commented on PR #8638: URL: https://github.com/apache/hudi/pull/8638#issuecomment-1560433831 ## CI report: * c8cf2d86b1be30d3215b3b6e89b8bda33a1fe5dc UNKNOWN * 333d9faa53e71ba535a7cb8c60ce8b350a33452c UNKNOWN * b34d619d7162d5d70012f4a0b186f2f0f0810455 Azure:

[jira] [Closed] (HUDI-6117) Parallelize creation of initial file groups for MDT partitions

2023-05-23 Thread Prashant Wason (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Wason closed HUDI-6117. Resolution: Fixed > Parallelize creation of initial file groups for MDT partitions >

[GitHub] [hudi] danny0405 commented on a diff in pull request #8792: [HUDI-6256] Fix the data table archiving and MDT cleaning config conf…

2023-05-23 Thread via GitHub
danny0405 commented on code in PR #8792: URL: https://github.com/apache/hudi/pull/8792#discussion_r1203388715 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataWriteUtils.java: ## @@ -93,7 +92,7 @@ public static HoodieWriteConfig

[GitHub] [hudi] flashJd opened a new pull request, #8792: [HUDI-6256] Fix the data table archiving and MDT cleaning config conf…

2023-05-23 Thread via GitHub
flashJd opened a new pull request, #8792: URL: https://github.com/apache/hudi/pull/8792 ### Background 1) table config: ("hoodie.cleaner.commits.retained", "2"), ("hoodie.keep.min.commits", "3") 2) v0.12.2 works well, but upgrading from 0.12 to 0.13 throws

[jira] [Created] (HUDI-6256) fix the data table archiving and MDT cleaning config conflict

2023-05-23 Thread yonghua jian (Jira)
yonghua jian created HUDI-6256: -- Summary: fix the data table archiving and MDT cleaning config conflict Key: HUDI-6256 URL: https://issues.apache.org/jira/browse/HUDI-6256 Project: Apache Hudi

[GitHub] [hudi] hudi-bot commented on pull request #8776: [HUDI-5994] Bucket index supports bulk insert row writer

2023-05-23 Thread via GitHub
hudi-bot commented on PR #8776: URL: https://github.com/apache/hudi/pull/8776#issuecomment-1560409336 ## CI report: * 23435aedff88e482ea60143c0e88fa88f723f5c1 Azure:

[GitHub] [hudi] danny0405 commented on a diff in pull request #8787: [HUDI-6254] Allow using absolute path in ManifestFileWriter

2023-05-23 Thread via GitHub
danny0405 commented on code in PR #8787: URL: https://github.com/apache/hudi/pull/8787#discussion_r1203363874 ## hudi-gcp/src/main/java/org/apache/hudi/gcp/bigquery/BigQuerySyncTool.java: ## @@ -96,7 +96,7 @@ private void syncCoWTable(HoodieBigQuerySyncClient bqSyncClient) {

[GitHub] [hudi] lucienoz opened a new issue, #8791: [SUPPORT]

2023-05-23 Thread via GitHub
lucienoz opened a new issue, #8791: URL: https://github.com/apache/hudi/issues/8791 **_Tips before filing an issue_** - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? - Join the mailing list to engage in conversations and get faster support at

[GitHub] [hudi] hudi-bot commented on pull request #8776: [HUDI-5994] Bucket index supports bulk insert row writer

2023-05-23 Thread via GitHub
hudi-bot commented on PR #8776: URL: https://github.com/apache/hudi/pull/8776#issuecomment-1560404999 ## CI report: * 23435aedff88e482ea60143c0e88fa88f723f5c1 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8790: [DNM][Test CI][TEST] Hudi 3088 default spark32 3

2023-05-23 Thread via GitHub
hudi-bot commented on PR #8790: URL: https://github.com/apache/hudi/pull/8790#issuecomment-1560405056 ## CI report: * 70cad2e0ed2fd9448027565e6e4de55f765add82 UNKNOWN * 9de7680e6340c396db3fcaa314afe4420493fe91 UNKNOWN

[GitHub] [hudi] nsivabalan commented on a diff in pull request #8638: added new exception types

2023-05-23 Thread via GitHub
nsivabalan commented on code in PR #8638: URL: https://github.com/apache/hudi/pull/8638#discussion_r1203336055 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/AvroConversionUtils.scala: ## @@ -138,18 +139,26 @@ object AvroConversionUtils { def

[jira] [Updated] (HUDI-6099) Improve performance of checking for valid commits when tagging record location

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-6099: Fix Version/s: (was: 0.13.1) > Improve performance of checking for valid commits when tagging record

[jira] [Updated] (HUDI-5909) Reuse hive client if possible

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5909: Fix Version/s: (was: 0.13.1) > Reuse hive client if possible > - > >

[jira] [Updated] (HUDI-5983) Improve loading data via cloud store incr source

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5983: Fix Version/s: (was: 0.13.1) > Improve loading data via cloud store incr source >

[jira] [Updated] (HUDI-5972) Fix the flink 1.13 bundle jar version on website

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5972: Fix Version/s: (was: 0.13.1) > Fix the flink 1.13 bundle jar version on website >

[jira] [Updated] (HUDI-5922) Reuse IMetaStoreClient between HoodieHiveSyncClient and DDLExecutor

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5922: Fix Version/s: (was: 0.13.1) > Reuse IMetaStoreClient between HoodieHiveSyncClient and DDLExecutor >

[jira] [Updated] (HUDI-5916) flink bundle jar includes the hive-exec core by default

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5916: Fix Version/s: (was: 0.13.1) > flink bundle jar includes the hive-exec core by default >

[jira] [Updated] (HUDI-5887) Should not mark the concurrency mode as OCC by default when MDT is enabled

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5887: Fix Version/s: (was: 0.13.1) > Should not mark the concurrency mode as OCC by default when MDT is

[jira] [Updated] (HUDI-5870) Fix some comments about column type change rules

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5870: Fix Version/s: 0.14.0 (was: 0.13.1) > Fix some comments about column type change

[jira] [Updated] (HUDI-5947) Update the README for flink jar building

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5947: Fix Version/s: (was: 0.13.1) > Update the README for flink jar building >

[GitHub] [hudi] hudi-bot commented on pull request #8725: [HUDI-6219] Ensure consistency between Spark catalog schema and Hudi schema

2023-05-23 Thread via GitHub
hudi-bot commented on PR #8725: URL: https://github.com/apache/hudi/pull/8725#issuecomment-1560400891 ## CI report: * f1ac941150308aa36ed897ba78b5d655498e29b1 Azure:

[jira] [Updated] (HUDI-5873) The pending compactions of dataset table should not block MDT compaction

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5873: Fix Version/s: (was: 0.13.1) > The pending compactions of dataset table should not block MDT compaction

[jira] [Updated] (HUDI-5881) Handle pending clean instants while running savepoint

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5881: Fix Version/s: (was: 0.13.1) 0.14.0 > Handle pending clean instants while running

[jira] [Updated] (HUDI-5611) Revisit metadata-table-based file listing calls and use batch lookup instead

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5611: Fix Version/s: 0.14.0 (was: 0.13.1) > Revisit metadata-table-based file listing

[jira] [Closed] (HUDI-6235) Update and Delete statements for Flink

2023-05-23 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-6235. Resolution: Fixed Fixed via master branch: 2f6d24f53d9937d44bf208f26d7edcaaa4eda19b > Update and Delete

[jira] [Updated] (HUDI-5809) Keep RFC-56 early conflict detection update to date

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5809: Fix Version/s: 0.14.0 (was: 0.13.1) > Keep RFC-56 early conflict detection update to

[jira] [Updated] (HUDI-5848) No PreCombineField mode - make COMBINE_BEFORE_UPSERT=false automatically

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5848: Fix Version/s: 0.14.0 (was: 0.13.1) > No PreCombineField mode - make

[jira] [Updated] (HUDI-5794) Fail any new commits if there is any inflight restore in timeline

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5794: Fix Version/s: 0.14.0 (was: 0.13.1) > Fail any new commits if there is any inflight

[GitHub] [hudi] danny0405 merged pull request #8749: [HUDI-6235] Update and Delete statements for Flink

2023-05-23 Thread via GitHub
danny0405 merged PR #8749: URL: https://github.com/apache/hudi/pull/8749

[hudi] branch master updated: [HUDI-6235] Update and Delete statements for Flink (#8749)

2023-05-23 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 2f6d24f53d9 [HUDI-6235] Update and Delete

[jira] [Updated] (HUDI-5845) Remove usage of deprecated getTableAvroSchemaWithoutMetadataFields

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5845: Fix Version/s: (was: 0.13.1) > Remove usage of deprecated getTableAvroSchemaWithoutMetadataFields >

[jira] [Updated] (HUDI-5853) Add infer function for BQ sync configs

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5853: Fix Version/s: 0.14.0 (was: 0.13.1) > Add infer function for BQ sync configs >

[jira] [Updated] (HUDI-5812) Optimize the data size check in HoodieBaseParquetWriter

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5812: Fix Version/s: (was: 0.13.1) > Optimize the data size check in HoodieBaseParquetWriter >

[jira] [Updated] (HUDI-5651) sort the inputs by record keys for bulk insert tasks

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5651: Fix Version/s: (was: 0.13.1) > sort the inputs by record keys for bulk insert tasks >

[jira] [Updated] (HUDI-5664) Improve SqlQueryPreCommitValidator#queries Parallelism

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5664: Fix Version/s: (was: 0.13.1) > Improve SqlQueryPreCommitValidator#queries Parallelism >

[jira] [Updated] (HUDI-5791) Handle empty payloads for AbstractDebeziumAvroPayload

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5791: Fix Version/s: 0.14.0 (was: 0.13.1) > Handle empty payloads for

[jira] [Updated] (HUDI-5724) Test MOR table w/ global index w/ update partition path to true

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5724: Fix Version/s: 0.14.0 (was: 0.13.1) > Test MOR table w/ global index w/ update

[jira] [Updated] (HUDI-5612) Integrate metadata table with SpillableMapBasedFileSystemView and RocksDbBasedFileSystemView

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5612: Fix Version/s: 0.14.0 (was: 0.13.1) > Integrate metadata table with

[jira] [Updated] (HUDI-5531) RECENT_DAYS strategy of ClusteringPlanPartitionFilterMode should rename to RECENT_PARTITIONS

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5531: Fix Version/s: 0.14.0 (was: 0.13.1) > RECENT_DAYS strategy of

[jira] [Updated] (HUDI-5449) Misc and adhoc fixes to add record level index support to MDT

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5449: Fix Version/s: 0.14.0 (was: 0.13.1) > Misc and adhoc fixes to add record level index

[jira] [Updated] (HUDI-5445) Fix/Unify deletion code paths for MDT

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5445: Fix Version/s: 0.14.0 (was: 0.13.1) > Fix/Unify deletion code paths for MDT >

[jira] [Updated] (HUDI-5300) Optimize initial commit w/ metadata table

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5300: Fix Version/s: 0.14.0 (was: 0.13.1) > Optimize initial commit w/ metadata table >

[jira] [Updated] (HUDI-4585) Optimize query performance on Presto Hudi connector

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-4585: Fix Version/s: 0.14.0 (was: 0.13.1) > Optimize query performance on Presto Hudi

[jira] [Updated] (HUDI-5451) Ensure switching "001" and "002" suffix for compaction and cleaning in MDT is backwards compatible

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5451: Fix Version/s: 0.14.0 (was: 0.13.1) > Ensure switching "001" and "002" suffix for

[jira] [Updated] (HUDI-5245) Honor pruned partitions while looking up in col stats partition in MDT

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5245: Fix Version/s: 0.14.0 (was: 0.13.1) > Honor pruned partitions while looking up in

[jira] [Updated] (HUDI-5173) Skip if there is only one file in clusteringGroup

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5173: Fix Version/s: (was: 0.13.1) > Skip if there is only one file in clusteringGroup >

[jira] [Updated] (HUDI-5453) Ensure new fileId format is good across all code paths and backwards compatible

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5453: Fix Version/s: 0.14.0 (was: 0.13.1) > Ensure new fileId format is good across all

[jira] [Updated] (HUDI-5498) Update docs for reading Hudi tables on Databricks runtime

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5498: Fix Version/s: 0.14.0 (was: 0.13.1) > Update docs for reading Hudi tables on

[jira] [Updated] (HUDI-5427) Cache final snapshot records for Metadata table FILES partition

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5427: Fix Version/s: 0.14.0 (was: 0.13.1) > Cache final snapshot records for Metadata

[jira] [Updated] (HUDI-5448) Add metrics to record level index in MDT

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5448: Fix Version/s: 0.14.0 (was: 0.13.1) > Add metrics to record level index in MDT >

[jira] [Updated] (HUDI-5440) Benchmark MDT performance for diff sizes of datasets

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5440: Fix Version/s: 0.14.0 (was: 0.13.1) > Benchmark MDT performance for diff sizes of

[jira] [Updated] (HUDI-5292) Exclude the test resources from every module packaging

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-5292: Fix Version/s: 0.14.0 (was: 0.13.1) > Exclude the test resources from every module

[jira] [Updated] (HUDI-4329) Add separate control for compaction operation sync/async mode

2023-05-23 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-4329: Fix Version/s: 0.14.0 (was: 0.13.1) > Add separate control for compaction operation

[GitHub] [hudi] danny0405 commented on a diff in pull request #8716: [HUDI-6226] Support parquet native bloom filters

2023-05-23 Thread via GitHub
danny0405 commented on code in PR #8716: URL: https://github.com/apache/hudi/pull/8716#discussion_r1203315990 ## hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieBaseParquetWriter.java: ## @@ -67,6 +89,22 @@ public HoodieBaseParquetWriter(Path file,

[GitHub] [hudi] hudi-bot commented on pull request #8790: [DNM][Test CI][TEST] Hudi 3088 default spark32 3

2023-05-23 Thread via GitHub
hudi-bot commented on PR #8790: URL: https://github.com/apache/hudi/pull/8790#issuecomment-1560367102 ## CI report: * 70cad2e0ed2fd9448027565e6e4de55f765add82 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the
