[GitHub] [hudi] BruceKellan closed pull request #5953: [HUDI-4314] Improve the performance of reading from the specified ins…

2022-07-21 Thread GitBox
BruceKellan closed pull request #5953: [HUDI-4314] Improve the performance of reading from the specified ins… URL: https://github.com/apache/hudi/pull/5953 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [hudi] wzx140 commented on a diff in pull request #5629: [HUDI-3384][HUDI-3385] Spark specific file reader/writer.

2022-07-21 Thread GitBox
wzx140 commented on code in PR #5629: URL: https://github.com/apache/hudi/pull/5629#discussion_r927340028 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/common/table/log/HoodieFileSliceReader.java: ## @@ -21,64 +21,46 @@ import org.apache.hudi.common.model.Ho

[jira] [Assigned] (HUDI-4412) Multiple writers NPE when Insert_overwrite

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4412: - Assignee: liujinhui > Multiple writers NPE when Insert_overwrite > --

[jira] [Commented] (HUDI-4415) Support spark writer running on thrift server

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17569838#comment-17569838 ] Rajesh Mahindra commented on HUDI-4415: --- [~minihippo] are you planning to work on it

[jira] [Updated] (HUDI-4418) Implement ProtoKafkaSource

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-4418: -- Fix Version/s: 0.13.0 > Implement ProtoKafkaSource > -- > >

[GitHub] [hudi] danny0405 commented on pull request #5954: [HUDI-4303] Use Hive sentinel value as partition default to avoid casting err

2022-07-21 Thread GitBox
danny0405 commented on PR #5954: URL: https://github.com/apache/hudi/pull/5954#issuecomment-1192238932 > @codope looks like IT still fails after rerunning. Could you check the failure? The hudi-flink test case is flaky and I'm fixing it in https://github.com/apache/hudi/pull/6181, so

[jira] [Commented] (HUDI-4422) read parquet failed due to length is 0 or corrupt parquet file

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17569836#comment-17569836 ] Rajesh Mahindra commented on HUDI-4422: --- [~JinxinTang] Feel free to raise the PR aft

[GitHub] [hudi] wzx140 commented on a diff in pull request #6132: [HUDI-4414] Update the RFC-46 doc to fix comments feedback

2022-07-21 Thread GitBox
wzx140 commented on code in PR #6132: URL: https://github.com/apache/hudi/pull/6132#discussion_r927337196 ## rfc/rfc-46/rfc-46.md: ## @@ -84,59 +84,90 @@ is known to have poor performance (compared to non-reflection based instantiatio Record Merge API -Stateless compo

[GitHub] [hudi] danny0405 commented on a diff in pull request #6020: [HUDI-4348] fix merge into sql data quality in concurrent scene

2022-07-21 Thread GitBox
danny0405 commented on code in PR #6020: URL: https://github.com/apache/hudi/pull/6020#discussion_r927336771 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/command/payload/SqlTypedRecord.scala: ## @@ -53,6 +53,11 @@ object SqlTypedRecord {

[GitHub] [hudi] jiezi2026 commented on issue #6158: [SUPPORT] Spark 3.2.1, the value of hoodie.datasource.write.hive_style_partitioning in HMS catalog is different with that in hoodie.properties

2022-07-21 Thread GitBox
jiezi2026 commented on issue #6158: URL: https://github.com/apache/hudi/issues/6158#issuecomment-1192237718 @xiaozhch5 I think this is the same problem as #6070

[jira] [Assigned] (HUDI-4429) Make Spark 3.1.3 the default profile

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4429: - Assignee: Rahil Chertara > Make Spark 3.1.3 the default profile > --

[jira] [Comment Edited] (HUDI-4430) Incorrect type casting while reading HUDI table created with CustomKeyGenerator and unixtimestamp paritioning field

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17569833#comment-17569833 ] Rajesh Mahindra edited comment on HUDI-4430 at 7/22/22 6:30 AM:

[jira] [Updated] (HUDI-4451) Multiple writer using insert_overwrite loses some hive partitions

2022-07-21 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui updated HUDI-4451: Description: At that time, hudi was used to write concurrently, with multiple writers. Sync to hive, find lo

[jira] [Commented] (HUDI-4430) Incorrect type casting while reading HUDI table created with CustomKeyGenerator and unixtimestamp paritioning field

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17569833#comment-17569833 ] Rajesh Mahindra commented on HUDI-4430: --- Looks like your input column is of type str

[jira] [Updated] (HUDI-4451) Multiple writer sync hive using insert_overwrite loses some partitions

2022-07-21 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui updated HUDI-4451: Summary: Multiple writer sync hive using insert_overwrite loses some partitions (was: Multiple writer hive

[jira] [Updated] (HUDI-4451) Multiple writer using insert_overwrite loses some hive partitions

2022-07-21 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui updated HUDI-4451: Summary: Multiple writer using insert_overwrite loses some hive partitions (was: Multiple writer sync hive

[jira] [Created] (HUDI-4451) Multiple writer hive using insert_overwrite loses some partitions

2022-07-21 Thread liujinhui (Jira)
liujinhui created HUDI-4451: --- Summary: Multiple writer hive using insert_overwrite loses some partitions Key: HUDI-4451 URL: https://issues.apache.org/jira/browse/HUDI-4451 Project: Apache Hudi Is

[jira] [Assigned] (HUDI-4434) Disable EMRFS and EMR spark related properties

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4434: - Assignee: Rahil Chertara > Disable EMRFS and EMR spark related properties >

[jira] [Assigned] (HUDI-4440) Treat boostrapped table as non-partitioned in HudiFileIndex if partition column is missing from schema

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4440: - Assignee: Rahil Chertara > Treat boostrapped table as non-partitioned in HudiFileIndex if

[jira] [Assigned] (HUDI-4439) Fix Amazon CloudWatch reporter for metadata enabled tables

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4439: - Assignee: Rahil Chertara > Fix Amazon CloudWatch reporter for metadata enabled tables > -

[jira] [Updated] (HUDI-4442) Converting from json to avro does not sanitize field names

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-4442: -- Fix Version/s: 0.12.0 > Converting from json to avro does not sanitize field names > ---

[jira] [Updated] (HUDI-4443) Add DeltaStreamer support for AWS managed Kafka (MSK)

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-4443: -- Labels: blocker (was: ) > Add DeltaStreamer support for AWS managed Kafka (MSK) >

[jira] [Updated] (HUDI-4443) Add DeltaStreamer support for AWS managed Kafka (MSK)

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-4443: -- Fix Version/s: 0.13.0 > Add DeltaStreamer support for AWS managed Kafka (MSK) > ---

[GitHub] [hudi] trushev commented on a diff in pull request #5830: [HUDI-3981] Flink engine support for comprehensive schema evolution(RFC-33)

2022-07-21 Thread GitBox
trushev commented on code in PR #5830: URL: https://github.com/apache/hudi/pull/5830#discussion_r927284257 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/mor/MergeOnReadInputFormat.java: ## @@ -135,6 +139,12 @@ */ private boolean closed = t

[jira] [Updated] (HUDI-4445) Fix few things related to S3 Incremental Source

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-4445: -- Description: # Decode file resource url before operating on it. # Fix serializability of hadoop

[jira] [Assigned] (HUDI-4448) Remove the latest commit refresh for timeline server

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4448: - Assignee: Danny Chen > Remove the latest commit refresh for timeline server > ---

[jira] [Assigned] (HUDI-4450) Revert the checkpoint abort notification

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4450: - Assignee: Danny Chen > Revert the checkpoint abort notification > ---

[GitHub] [hudi] hudi-bot commented on pull request #6170: try adding in propery enabling bridge

2022-07-21 Thread GitBox
hudi-bot commented on PR #6170: URL: https://github.com/apache/hudi/pull/6170#issuecomment-1192220568 ## CI report: * 86690888d4002b8fe3aa8ef07b8f4347cc615304 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1017

[GitHub] [hudi] hudi-bot commented on pull request #5943: [HUDI-4186] Support Hudi with Spark 3.3.0

2022-07-21 Thread GitBox
hudi-bot commented on PR #5943: URL: https://github.com/apache/hudi/pull/5943#issuecomment-1192220299 ## CI report: * fa048b175c2b3b5a80c6ef8d0b9709097b822cfb UNKNOWN * b94604147edcfc5040b6cf8a1a649e9a0cf1eb2a UNKNOWN * 0fdc1347c43459f3946b27cdf6753e3166ea6055 UNKNOWN * af

[GitHub] [hudi] rmahindra123 commented on issue #6171: [SUPPORT] ClassNotFound Exception when saving DataFrame as Hudi in EMR

2022-07-21 Thread GitBox
rmahindra123 commented on issue #6171: URL: https://github.com/apache/hudi/issues/6171#issuecomment-1192218126 @lewis262626 While I try to reproduce the steps, could you try specifying the confs before the main class? spark-submit --conf "spark.serializer=org.apache.spark.serializer.

[GitHub] [hudi] hudi-bot commented on pull request #6182: [DO NOT MERGE] 0.11.1 release patch branch

2022-07-21 Thread GitBox
hudi-bot commented on PR #6182: URL: https://github.com/apache/hudi/pull/6182#issuecomment-1192217911 ## CI report: * 98e9df75d6475813609c8c92ee73417205acd1f7 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1018

[GitHub] [hudi] hudi-bot commented on pull request #6170: try adding in propery enabling bridge

2022-07-21 Thread GitBox
hudi-bot commented on PR #6170: URL: https://github.com/apache/hudi/pull/6170#issuecomment-1192217842 ## CI report: * 86690888d4002b8fe3aa8ef07b8f4347cc615304 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1017

[GitHub] [hudi] hudi-bot commented on pull request #5943: [HUDI-4186] Support Hudi with Spark 3.3.0

2022-07-21 Thread GitBox
hudi-bot commented on PR #5943: URL: https://github.com/apache/hudi/pull/5943#issuecomment-1192217509 ## CI report: * fa048b175c2b3b5a80c6ef8d0b9709097b822cfb UNKNOWN * b94604147edcfc5040b6cf8a1a649e9a0cf1eb2a UNKNOWN * 0fdc1347c43459f3946b27cdf6753e3166ea6055 UNKNOWN * af

[GitHub] [hudi] trushev commented on a diff in pull request #5830: [HUDI-3981] Flink engine support for comprehensive schema evolution(RFC-33)

2022-07-21 Thread GitBox
trushev commented on code in PR #5830: URL: https://github.com/apache/hudi/pull/5830#discussion_r927319541 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/FormatUtils.java: ## @@ -130,6 +132,7 @@ public static HoodieMergedLogRecordScanner logScanne

[GitHub] [hudi] trushev commented on a diff in pull request #5830: [HUDI-3981] Flink engine support for comprehensive schema evolution(RFC-33)

2022-07-21 Thread GitBox
trushev commented on code in PR #5830: URL: https://github.com/apache/hudi/pull/5830#discussion_r927319245 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cow/CopyOnWriteInputFormat.java: ## @@ -99,10 +113,36 @@ public CopyOnWriteInputFormat(

[GitHub] [hudi] hudi-bot commented on pull request #6181: [HUDI-4450] Revert the checkpoint abort notification

2022-07-21 Thread GitBox
hudi-bot commented on PR #6181: URL: https://github.com/apache/hudi/pull/6181#issuecomment-1192214939 ## CI report: * 85e29c39fa38db707b990f3435128981640f9de5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1018

[GitHub] [hudi] hudi-bot commented on pull request #6179: [HUDI-4448] Remove the latest commit refresh for timeline server

2022-07-21 Thread GitBox
hudi-bot commented on PR #6179: URL: https://github.com/apache/hudi/pull/6179#issuecomment-1192214905 ## CI report: * d7f999096d9a730c470cfc5a548fcbf634acd086 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1018

[GitHub] [hudi] hudi-bot commented on pull request #5328: [HUDI-3883] Fix Bulk Insert to repartition the dataset based on Partition Path

2022-07-21 Thread GitBox
hudi-bot commented on PR #5328: URL: https://github.com/apache/hudi/pull/5328#issuecomment-1192214134 ## CI report: * 0f0fae82a029d42fa9db7ea8d2df4ba1787fded6 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8129

[GitHub] [hudi] trushev commented on a diff in pull request #5830: [HUDI-3981] Flink engine support for comprehensive schema evolution(RFC-33)

2022-07-21 Thread GitBox
trushev commented on code in PR #5830: URL: https://github.com/apache/hudi/pull/5830#discussion_r927312625 ## hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieUnMergedLogRecordScanner.java: ## @@ -135,10 +137,15 @@ public Builder withLogRecordScannerCallback(Log

[GitHub] [hudi] trushev commented on a diff in pull request #5830: [HUDI-3981] Flink engine support for comprehensive schema evolution(RFC-33)

2022-07-21 Thread GitBox
trushev commented on code in PR #5830: URL: https://github.com/apache/hudi/pull/5830#discussion_r927312625 ## hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieUnMergedLogRecordScanner.java: ## @@ -135,10 +137,15 @@ public Builder withLogRecordScannerCallback(Log

[GitHub] [hudi] trushev commented on a diff in pull request #5830: [HUDI-3981] Flink engine support for comprehensive schema evolution(RFC-33)

2022-07-21 Thread GitBox
trushev commented on code in PR #5830: URL: https://github.com/apache/hudi/pull/5830#discussion_r927312625 ## hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieUnMergedLogRecordScanner.java: ## @@ -135,10 +137,15 @@ public Builder withLogRecordScannerCallback(Log

[GitHub] [hudi] trushev commented on a diff in pull request #5830: [HUDI-3981] Flink engine support for comprehensive schema evolution(RFC-33)

2022-07-21 Thread GitBox
trushev commented on code in PR #5830: URL: https://github.com/apache/hudi/pull/5830#discussion_r927306341 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java: ## @@ -279,7 +279,7 @@ private void saveInternalSchema(HoodieTable table,

[GitHub] [hudi] trushev commented on a diff in pull request #5830: [HUDI-3981] Flink engine support for comprehensive schema evolution(RFC-33)

2022-07-21 Thread GitBox
trushev commented on code in PR #5830: URL: https://github.com/apache/hudi/pull/5830#discussion_r927309714 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java: ## @@ -1667,16 +1667,14 @@ public void reOrderColPosition(String colName,

[GitHub] [hudi] trushev commented on a diff in pull request #5830: [HUDI-3981] Flink engine support for comprehensive schema evolution(RFC-33)

2022-07-21 Thread GitBox
trushev commented on code in PR #5830: URL: https://github.com/apache/hudi/pull/5830#discussion_r927306341 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java: ## @@ -279,7 +279,7 @@ private void saveInternalSchema(HoodieTable table,

[GitHub] [hudi] hudi-bot commented on pull request #6179: [HUDI-4448] Remove the latest commit refresh for timeline server

2022-07-21 Thread GitBox
hudi-bot commented on PR #6179: URL: https://github.com/apache/hudi/pull/6179#issuecomment-1192190313 ## CI report: * d7f999096d9a730c470cfc5a548fcbf634acd086 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1018

[GitHub] [hudi] hudi-bot commented on pull request #6176: [HUDI-4445] S3 Incremental source improvements

2022-07-21 Thread GitBox
hudi-bot commented on PR #6176: URL: https://github.com/apache/hudi/pull/6176#issuecomment-1192190280 ## CI report: * 0ff79e179d4ed9d6e300dfb609379860b7b9c9ce Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1018

[GitHub] [hudi] hudi-bot commented on pull request #5328: [HUDI-3883] Fix Bulk Insert to repartition the dataset based on Partition Path

2022-07-21 Thread GitBox
hudi-bot commented on PR #5328: URL: https://github.com/apache/hudi/pull/5328#issuecomment-1192189718 ## CI report: * 0f0fae82a029d42fa9db7ea8d2df4ba1787fded6 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8129

[GitHub] [hudi] danny0405 commented on a diff in pull request #5643: [HUDI-4071] Change defaults for some of the configs

2022-07-21 Thread GitBox
danny0405 commented on code in PR #5643: URL: https://github.com/apache/hudi/pull/5643#discussion_r927295285 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java: ## @@ -349,7 +349,7 @@ public class HoodieWriteConfig extends HoodieConfig

[GitHub] [hudi] danny0405 commented on a diff in pull request #6179: [HUDI-4448] Remove the latest commit refresh for timeline server

2022-07-21 Thread GitBox
danny0405 commented on code in PR #6179: URL: https://github.com/apache/hudi/pull/6179#discussion_r927294480 ## hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/compact/ITTestHoodieFlinkCompactor.java: ## @@ -253,6 +253,7 @@ public void testHoodieFlinkCompact

[GitHub] [hudi] alexeykudinkin commented on pull request #5328: [HUDI-3883] Fix Bulk Insert to repartition the dataset based on Partition Path

2022-07-21 Thread GitBox
alexeykudinkin commented on PR #5328: URL: https://github.com/apache/hudi/pull/5328#issuecomment-1192187988 @hudi-bot run azure

[GitHub] [hudi] danny0405 commented on pull request #6179: [HUDI-4448] Remove the latest commit refresh for timeline server

2022-07-21 Thread GitBox
danny0405 commented on PR #6179: URL: https://github.com/apache/hudi/pull/6179#issuecomment-1192187215 cc @nsivabalan, you may need to take a look at this ~

[GitHub] [hudi] hudi-bot commented on pull request #6182: [DO NOT MERGE] 0.11.1 release patch branch

2022-07-21 Thread GitBox
hudi-bot commented on PR #6182: URL: https://github.com/apache/hudi/pull/6182#issuecomment-1192185722 ## CI report: * 98e9df75d6475813609c8c92ee73417205acd1f7 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #5643: [HUDI-4071] Change defaults for some of the configs

2022-07-21 Thread GitBox
hudi-bot commented on PR #5643: URL: https://github.com/apache/hudi/pull/5643#issuecomment-1192185324 ## CI report: * 8c5174c329ab0bdb4de1eabd2a04e667f1206424 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8785

[GitHub] [hudi] hudi-bot commented on pull request #5643: [HUDI-4071] Change defaults for some of the configs

2022-07-21 Thread GitBox
hudi-bot commented on PR #5643: URL: https://github.com/apache/hudi/pull/5643#issuecomment-1192183377 ## CI report: * 8c5174c329ab0bdb4de1eabd2a04e667f1206424 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8785

[GitHub] [hudi] hudi-bot commented on pull request #6179: [HUDI-4448] Remove the latest commit refresh for timeline server

2022-07-21 Thread GitBox
hudi-bot commented on PR #6179: URL: https://github.com/apache/hudi/pull/6179#issuecomment-1192181606 ## CI report: * d7f999096d9a730c470cfc5a548fcbf634acd086 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1018

[GitHub] [hudi] hudi-bot commented on pull request #6180: [HUDI-4447] fix the sync problem when performing delete table operation

2022-07-21 Thread GitBox
hudi-bot commented on PR #6180: URL: https://github.com/apache/hudi/pull/6180#issuecomment-1192181625 ## CI report: * d371724cc9596ccabbc751bf73d1ccf625f8d471 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1018

[GitHub] [hudi] hudi-bot commented on pull request #5708: [HUDI-4420][Stacked on 5430] Fixing table schema delineation on partition/data schema for Spark relations

2022-07-21 Thread GitBox
hudi-bot commented on PR #5708: URL: https://github.com/apache/hudi/pull/5708#issuecomment-1192181156 ## CI report: * 192e15ade1d6b8a291d003477b287bd7a5ef9e76 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1018

[GitHub] [hudi] trushev commented on a diff in pull request #5830: [HUDI-3981] Flink engine support for comprehensive schema evolution(RFC-33)

2022-07-21 Thread GitBox
trushev commented on code in PR #5830: URL: https://github.com/apache/hudi/pull/5830#discussion_r927284257 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/mor/MergeOnReadInputFormat.java: ## @@ -135,6 +139,12 @@ */ private boolean closed = t

[GitHub] [hudi] trushev commented on a diff in pull request #5830: [HUDI-3981] Flink engine support for comprehensive schema evolution(RFC-33)

2022-07-21 Thread GitBox
trushev commented on code in PR #5830: URL: https://github.com/apache/hudi/pull/5830#discussion_r927287333 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/commit/BaseMergeHelper.java: ## @@ -130,4 +145,48 @@ protected Void getResult() { return

[GitHub] [hudi] codope commented on a diff in pull request #6176: [HUDI-4445] S3 Incremental source improvements

2022-07-21 Thread GitBox
codope commented on code in PR #6176: URL: https://github.com/apache/hudi/pull/6176#discussion_r927282574 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/S3EventsHoodieIncrSource.java: ## @@ -156,53 +214,52 @@ public Pair>, String> fetchNextBatch(Option lastCk

[GitHub] [hudi] trushev commented on a diff in pull request #5830: [HUDI-3981] Flink engine support for comprehensive schema evolution(RFC-33)

2022-07-21 Thread GitBox
trushev commented on code in PR #5830: URL: https://github.com/apache/hudi/pull/5830#discussion_r927284257 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/mor/MergeOnReadInputFormat.java: ## @@ -135,6 +139,12 @@ */ private boolean closed = t

[GitHub] [hudi] danny0405 opened a new pull request, #6182: 0.11 patch

2022-07-21 Thread GitBox
danny0405 opened a new pull request, #6182: URL: https://github.com/apache/hudi/pull/6182 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the purpo

[GitHub] [hudi] rahil-c commented on a diff in pull request #6151: [HUDI-4429] Make Spark3.1 the default profile

2022-07-21 Thread GitBox
rahil-c commented on code in PR #6151: URL: https://github.com/apache/hudi/pull/6151#discussion_r927278345 ## hudi-integ-test/prepare_integration_suite.sh: ## @@ -42,7 +42,7 @@ get_spark_command() { else scala=$scala fi - echo "spark-submit --packages org.apache.spar

[GitHub] [hudi] rahil-c commented on a diff in pull request #6151: [HUDI-4429] Make Spark3.1 the default profile

2022-07-21 Thread GitBox
rahil-c commented on code in PR #6151: URL: https://github.com/apache/hudi/pull/6151#discussion_r927276458 ## hudi-client/hudi-spark-client/pom.xml: ## @@ -48,10 +48,22 @@ org.apache.spark spark-core_${scala.binary.version} + + + org.ap

[GitHub] [hudi] rahil-c commented on a diff in pull request #6151: [HUDI-4429] Make Spark3.1 the default profile

2022-07-21 Thread GitBox
rahil-c commented on code in PR #6151: URL: https://github.com/apache/hudi/pull/6151#discussion_r927266490 ## hudi-utilities/pom.xml: ## @@ -241,6 +245,17 @@ + + org.apache.spark + spark-hive_${scala.binary.version} + + + * +

[GitHub] [hudi] rahil-c commented on a diff in pull request #6151: [HUDI-4429] Make Spark3.1 the default profile

2022-07-21 Thread GitBox
rahil-c commented on code in PR #6151: URL: https://github.com/apache/hudi/pull/6151#discussion_r927272053 ## packaging/hudi-spark-bundle/pom.xml: ## @@ -95,6 +95,12 @@ org.antlr:stringtemplate org.apache.parquet:parquet-avro +

[GitHub] [hudi] hudi-bot commented on pull request #6181: [HUDI-4450] Revert the checkpoint abort notification

2022-07-21 Thread GitBox
hudi-bot commented on PR #6181: URL: https://github.com/apache/hudi/pull/6181#issuecomment-1192153974 ## CI report: * 85e29c39fa38db707b990f3435128981640f9de5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1018

[GitHub] [hudi] hudi-bot commented on pull request #5708: [HUDI-4420][Stacked on 5430] Fixing table schema delineation on partition/data schema for Spark relations

2022-07-21 Thread GitBox
hudi-bot commented on PR #5708: URL: https://github.com/apache/hudi/pull/5708#issuecomment-1192153606 ## CI report: * e689b295bf78d07ec16ecad0da2956672987862a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1016

[GitHub] [hudi] rahil-c commented on a diff in pull request #6151: [HUDI-4429] Make Spark3.1 the default profile

2022-07-21 Thread GitBox
rahil-c commented on code in PR #6151: URL: https://github.com/apache/hudi/pull/6151#discussion_r927266490 ## hudi-utilities/pom.xml: ## @@ -241,6 +245,17 @@ + + org.apache.spark + spark-hive_${scala.binary.version} + + + * +

[GitHub] [hudi] hudi-bot commented on pull request #6181: [HUDI-4450] Revert the checkpoint abort notification

2022-07-21 Thread GitBox
hudi-bot commented on PR #6181: URL: https://github.com/apache/hudi/pull/6181#issuecomment-1192152056 ## CI report: * 85e29c39fa38db707b990f3435128981640f9de5 UNKNOWN

[GitHub] [hudi] hudi-bot commented on pull request #6176: [HUDI-4445] S3 Incremental source improvements

2022-07-21 Thread GitBox
hudi-bot commented on PR #6176: URL: https://github.com/apache/hudi/pull/6176#issuecomment-1192152022 ## CI report: * e434c6b5bb7b4dc8ee24164a01d58cf0e03207cd Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=101

[GitHub] [hudi] hudi-bot commented on pull request #5708: [HUDI-4420][Stacked on 5430] Fixing table schema delineation on partition/data schema for Spark relations

2022-07-21 Thread GitBox
hudi-bot commented on PR #5708: URL: https://github.com/apache/hudi/pull/5708#issuecomment-1192151709 ## CI report: * e689b295bf78d07ec16ecad0da2956672987862a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1016

[GitHub] [hudi] hudi-bot commented on pull request #5523: [HUDI-4039][Stacked on 5470] Make sure all builtin `KeyGenerator`s properly implement Spark specific APIs

2022-07-21 Thread GitBox
hudi-bot commented on PR #5523: URL: https://github.com/apache/hudi/pull/5523#issuecomment-1192151599 ## CI report: * 24da4bb8af969e5c0706c91eb4ae5e64ecb5d1dc UNKNOWN * dbaafef2a8cfe013976422897ddd18bf0038a7dc Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #5943: [HUDI-4186] Support Hudi with Spark 3.3.0

2022-07-21 Thread GitBox
hudi-bot commented on PR #5943: URL: https://github.com/apache/hudi/pull/5943#issuecomment-1192150133 ## CI report: * fa048b175c2b3b5a80c6ef8d0b9709097b822cfb UNKNOWN * b94604147edcfc5040b6cf8a1a649e9a0cf1eb2a UNKNOWN * 0fdc1347c43459f3946b27cdf6753e3166ea6055 UNKNOWN * af

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #5708: [HUDI-4420][Stacked on 5430] Fixing table schema delineation on partition/data schema for Spark relations

2022-07-21 Thread GitBox
alexeykudinkin commented on code in PR #5708: URL: https://github.com/apache/hudi/pull/5708#discussion_r927259929 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala: ## @@ -564,42 +538,57 @@ abstract class HoodieBaseRelation(val sq

[jira] [Assigned] (HUDI-4362) Spark: Support dynamic partition filtering in 3.2

2022-07-21 Thread Forward Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Forward Xu reassigned HUDI-4362: Assignee: chenliang > Spark: Support dynamic partition filtering in 3.2 > -

[jira] [Updated] (HUDI-4362) Spark: Support dynamic partition filtering in 3.2

2022-07-21 Thread Forward Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Forward Xu updated HUDI-4362: - Epic Link: HUDI-1297 > Spark: Support dynamic partition filtering in 3.2 > ---

[jira] [Updated] (HUDI-4449) Spark: Support DataSourceV2 Read

2022-07-21 Thread Forward Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Forward Xu updated HUDI-4449: - Epic Link: HUDI-1297 > Spark: Support DataSourceV2 Read > - > >

[jira] [Updated] (HUDI-4449) Spark: Support DataSourceV2 Read

2022-07-21 Thread Forward Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Forward Xu updated HUDI-4449: - Labels: (was: HUDI-1297) > Spark: Support DataSourceV2 Read > - > >

[jira] [Updated] (HUDI-4449) Spark: Support DataSourceV2 Read

2022-07-21 Thread Forward Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Forward Xu updated HUDI-4449: - Labels: HUDI-1297 (was: ) > Spark: Support DataSourceV2 Read > - > >

[jira] [Assigned] (HUDI-4449) Spark: Support DataSourceV2 Read

2022-07-21 Thread Forward Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Forward Xu reassigned HUDI-4449: Assignee: Forward Xu > Spark: Support DataSourceV2 Read > - > >

[jira] [Assigned] (HUDI-4449) Spark: Support DataSourceV2 Read

2022-07-21 Thread Forward Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Forward Xu reassigned HUDI-4449: Assignee: chenliang (was: Forward Xu) > Spark: Support DataSourceV2 Read > --

[jira] [Updated] (HUDI-4450) Revert the checkpoint abort notification

2022-07-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4450: - Labels: pull-request-available (was: ) > Revert the checkpoint abort notification > -

[GitHub] [hudi] danny0405 opened a new pull request, #6181: [HUDI-4450] Revert the checkpoint abort notification

2022-07-21 Thread GitBox
danny0405 opened a new pull request, #6181: URL: https://github.com/apache/hudi/pull/6181 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the purpo

[jira] [Created] (HUDI-4450) Revert the checkpoint abort notification

2022-07-21 Thread Danny Chen (Jira)
Danny Chen created HUDI-4450: Summary: Revert the checkpoint abort notification Key: HUDI-4450 URL: https://issues.apache.org/jira/browse/HUDI-4450 Project: Apache Hudi Issue Type: Bug

[jira] [Updated] (HUDI-4445) Fix few things related to S3 Incremental Source

2022-07-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4445: - Labels: pull-request-available (was: ) > Fix few things related to S3 Incremental Source > --

[GitHub] [hudi] hudi-bot commented on pull request #6176: [HUDI-4445] S3 Incremental source improvements

2022-07-21 Thread GitBox
hudi-bot commented on PR #6176: URL: https://github.com/apache/hudi/pull/6176#issuecomment-1192129920 ## CI report: * e434c6b5bb7b4dc8ee24164a01d58cf0e03207cd Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=101

[GitHub] [hudi] hudi-bot commented on pull request #5430: [HUDI-3979][Stacked on 5428] Optimize out mandatory columns when no merging is performed

2022-07-21 Thread GitBox
hudi-bot commented on PR #5430: URL: https://github.com/apache/hudi/pull/5430#issuecomment-1192127121 ## CI report: * 5b241061bde4ca74684f07677c7f5afa828e269c UNKNOWN * 0a1470f333aaa644ba74a5616d4140380fde66a4 UNKNOWN * 0ed8c2e27c4a25625a0b48a54fff2f841e85115f Azure: [CANCEL

[GitHub] [hudi] hudi-bot commented on pull request #6180: [HUDI-4447] fix the sync problem when performing delete table operation

2022-07-21 Thread GitBox
hudi-bot commented on PR #6180: URL: https://github.com/apache/hudi/pull/6180#issuecomment-1192125800 ## CI report: * d371724cc9596ccabbc751bf73d1ccf625f8d471 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1018

[GitHub] [hudi] hudi-bot commented on pull request #6179: [HUDI-4448] Remove the latest commit refresh for timeline server

2022-07-21 Thread GitBox
hudi-bot commented on PR #6179: URL: https://github.com/apache/hudi/pull/6179#issuecomment-1192125788 ## CI report: * d7f999096d9a730c470cfc5a548fcbf634acd086 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1018

[GitHub] [hudi] hudi-bot commented on pull request #6176: S3 Incremental source improvements

2022-07-21 Thread GitBox
hudi-bot commented on PR #6176: URL: https://github.com/apache/hudi/pull/6176#issuecomment-1192125776 ## CI report: * e434c6b5bb7b4dc8ee24164a01d58cf0e03207cd Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=101

[GitHub] [hudi] hudi-bot commented on pull request #6180: [HUDI-4447] fix the sync problem when performing delete table operation

2022-07-21 Thread GitBox
hudi-bot commented on PR #6180: URL: https://github.com/apache/hudi/pull/6180#issuecomment-1192123816 ## CI report: * d371724cc9596ccabbc751bf73d1ccf625f8d471 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #6179: [HUDI-4448] Remove the latest commit refresh for timeline server

2022-07-21 Thread GitBox
hudi-bot commented on PR #6179: URL: https://github.com/apache/hudi/pull/6179#issuecomment-1192123787 ## CI report: * d7f999096d9a730c470cfc5a548fcbf634acd086 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #6176: S3 Incremental source improvements

2022-07-21 Thread GitBox
hudi-bot commented on PR #6176: URL: https://github.com/apache/hudi/pull/6176#issuecomment-1192123762 ## CI report: * e434c6b5bb7b4dc8ee24164a01d58cf0e03207cd UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #6170: try adding in propery enabling bridge

2022-07-21 Thread GitBox
hudi-bot commented on PR #6170: URL: https://github.com/apache/hudi/pull/6170#issuecomment-1192121731 ## CI report: * 86690888d4002b8fe3aa8ef07b8f4347cc615304 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1017

[GitHub] [hudi] hudi-bot commented on pull request #6123: [HUDI-4437] Fix test conflicts by clearing file system cache

2022-07-21 Thread GitBox
hudi-bot commented on PR #6123: URL: https://github.com/apache/hudi/pull/6123#issuecomment-1192121619 ## CI report: * 9ad631abf88daf6fa3fe73320f0f1931ea4248ee UNKNOWN * f1508223a2661b830ff72d7b0db3927cba1771a8 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] danny0405 commented on pull request #6179: [HUDI-4448] Remove the latest commit refresh for timeline server

2022-07-21 Thread GitBox
danny0405 commented on PR #6179: URL: https://github.com/apache/hudi/pull/6179#issuecomment-1192120583 > ## What is the purpose of the pull request > In our production env, the fresh based on the latest commit cause data loss for all kinds of corner cases, when there are async table servi

[jira] [Updated] (HUDI-4449) Spark: Support DataSourceV2 Read

2022-07-21 Thread chenliang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenliang updated HUDI-4449: Status: In Progress (was: Open) > Spark: Support DataSourceV2 Read > - > >

[jira] [Created] (HUDI-4449) Spark: Support DataSourceV2 Read

2022-07-21 Thread chenliang (Jira)
chenliang created HUDI-4449: --- Summary: Spark: Support DataSourceV2 Read Key: HUDI-4449 URL: https://issues.apache.org/jira/browse/HUDI-4449 Project: Apache Hudi Issue Type: Improvement C

[jira] [Updated] (HUDI-4447) Hive Sync fails fails when performing delete table data operation

2022-07-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4447: - Labels: pull-request-available (was: ) > Hive Sync fails fails when performing delete table data
