[GitHub] [hudi] waywtdcc commented on issue #4508: [SUPPORT]Duplicate Flink Hudi data

2022-01-04 Thread GitBox
waywtdcc commented on issue #4508: URL: https://github.com/apache/hudi/issues/4508#issuecomment-1005459959 > Did you set up the state TTL already? I didn't set any TTL-related parameters. -- This is an automated message from the Apache Git Service. To respond to the message, please log o

[GitHub] [hudi] hudi-bot commented on pull request #4512: [HUDI-3170] Do not preserve filename when preserveCommitMetadata enabled

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4512: URL: https://github.com/apache/hudi/pull/4512#issuecomment-1005458302 ## CI report: * 88fed889b20d81fa71c156a7b8e87c2c3651de2f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4512: [HUDI-3170] Do not preserve filename when preserveCommitMetadata enabled

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4512: URL: https://github.com/apache/hudi/pull/4512#issuecomment-1005456587 ## CI report: * 88fed889b20d81fa71c156a7b8e87c2c3651de2f UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run

[GitHub] [hudi] hudi-bot commented on pull request #4512: [HUDI-3170] Do not preserve filename when preserveCommitMetadata enabled

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4512: URL: https://github.com/apache/hudi/pull/4512#issuecomment-1005456587 ## CI report: * 88fed889b20d81fa71c156a7b8e87c2c3651de2f UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure`

[GitHub] [hudi] xushiyan commented on issue #4474: [SUPPORT] Should we shade all aws dependencies to avoid class conflicts?

2022-01-04 Thread GitBox
xushiyan commented on issue #4474: URL: https://github.com/apache/hudi/issues/4474#issuecomment-1005455990 @boneanxs @a0x Thanks for sharing the info and ideas. I've filed https://issues.apache.org/jira/browse/HUDI-3157. I'll defer to @zhedoubushishi to give some guidance from AWS :)

[jira] [Updated] (HUDI-3170) Clustering preserve commit metadata retains filegroup id despite writes going to new filegroup

2022-01-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3170: - Labels: pull-request-available (was: ) > Clustering preserve commit metadata retains filegroup id

[GitHub] [hudi] codope opened a new pull request #4512: [HUDI-3170] Do not preserve filename when preserveCommitMetadata enabled

2022-01-04 Thread GitBox
codope opened a new pull request #4512: URL: https://github.com/apache/hudi/pull/4512 ## What is the purpose of the pull request [#3419](https://github.com/apache/hudi/pull/3419) made it possible to preserve commit metadata while clustering so as to support incremental queries with replaceco

[GitHub] [hudi] danny0405 commented on issue #4508: [SUPPORT]Duplicate Flink Hudi data

2022-01-04 Thread GitBox
danny0405 commented on issue #4508: URL: https://github.com/apache/hudi/issues/4508#issuecomment-1005455296 Did you set up the state TTL already?

[GitHub] [hudi] hudi-bot commented on pull request #4486: [HUDI-3132] Minor fixes for HoodieCatalog

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4486: URL: https://github.com/apache/hudi/pull/4486#issuecomment-1005451852 ## CI report: * d96f3e5662350471fd8ff14c47f3daf12e5f151f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4486: [HUDI-3132] Minor fixes for HoodieCatalog

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4486: URL: https://github.com/apache/hudi/pull/4486#issuecomment-1005421687 ## CI report: * d96f3e5662350471fd8ff14c47f3daf12e5f151f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot removed a comment on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4333: URL: https://github.com/apache/hudi/pull/4333#issuecomment-1005424000 ## CI report: * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN * de0d4385394dc5d820964cefc872f099cee7a

[GitHub] [hudi] hudi-bot commented on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4333: URL: https://github.com/apache/hudi/pull/4333#issuecomment-1005448595 ## CI report: * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN * de0d4385394dc5d820964cefc872f099cee7a02b UNKN

[GitHub] [hudi] dongkelun commented on issue #143: Tracking ticket for folks to be added to slack group

2022-01-04 Thread GitBox
dongkelun commented on issue #143: URL: https://github.com/apache/hudi/issues/143#issuecomment-1005439345 Hi, could you add me to the Slack group? My email is 1412359...@qq.com

[GitHub] [hudi] hudi-bot removed a comment on pull request #4511: [HUDI-3171] Sync empty table to hive metastore

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4511: URL: https://github.com/apache/hudi/pull/4511#issuecomment-1005429251 ## CI report: * 33f1af47efe9185c591280ea30932cbe970116a6 UNKNOWN * c01354222d525bae737c2db0455a86af500dd2c6 UNKNOWN Bot commands @hudi-bot sup

[GitHub] [hudi] hudi-bot commented on pull request #4511: [HUDI-3171] Sync empty table to hive metastore

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4511: URL: https://github.com/apache/hudi/pull/4511#issuecomment-1005430617 ## CI report: * 33f1af47efe9185c591280ea30932cbe970116a6 UNKNOWN * c01354222d525bae737c2db0455a86af500dd2c6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org

[GitHub] [hudi] hudi-bot removed a comment on pull request #4511: [HUDI-3171] Sync empty table to hive metastore

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4511: URL: https://github.com/apache/hudi/pull/4511#issuecomment-1005427827 ## CI report: * 33f1af47efe9185c591280ea30932cbe970116a6 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run

[GitHub] [hudi] hudi-bot commented on pull request #4511: [HUDI-3171] Sync empty table to hive metastore

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4511: URL: https://github.com/apache/hudi/pull/4511#issuecomment-1005429251 ## CI report: * 33f1af47efe9185c591280ea30932cbe970116a6 UNKNOWN * c01354222d525bae737c2db0455a86af500dd2c6 UNKNOWN Bot commands @hudi-bot supports th

[GitHub] [hudi] hudi-bot commented on pull request #4511: [HUDI-3171] Sync empty table to hive metastore

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4511: URL: https://github.com/apache/hudi/pull/4511#issuecomment-1005427827 ## CI report: * 33f1af47efe9185c591280ea30932cbe970116a6 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure`

[jira] [Updated] (HUDI-3171) Sync empty table to hive metastore

2022-01-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3171: - Labels: pull-request-available (was: ) > Sync empty table to hive metastore > ---

[GitHub] [hudi] danny0405 opened a new pull request #4511: [HUDI-3171] Sync empty table to hive metastore

2022-01-04 Thread GitBox
danny0405 opened a new pull request #4511: URL: https://github.com/apache/hudi/pull/4511 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the purpo

[jira] [Created] (HUDI-3171) Sync empty table to hive metastore

2022-01-04 Thread Danny Chen (Jira)
Danny Chen created HUDI-3171: Summary: Sync empty table to hive metastore Key: HUDI-3171 URL: https://issues.apache.org/jira/browse/HUDI-3171 Project: Apache Hudi Issue Type: Improvement

[GitHub] [hudi] hudi-bot removed a comment on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4333: URL: https://github.com/apache/hudi/pull/4333#issuecomment-1005414443 ## CI report: * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN * de0d4385394dc5d820964cefc872f099cee7a

[GitHub] [hudi] hudi-bot commented on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4333: URL: https://github.com/apache/hudi/pull/4333#issuecomment-1005424000 ## CI report: * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN * de0d4385394dc5d820964cefc872f099cee7a02b UNKN

[GitHub] [hudi] hudi-bot commented on pull request #4486: [HUDI-3132] Minor fixes for HoodieCatalog

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4486: URL: https://github.com/apache/hudi/pull/4486#issuecomment-1005421687 ## CI report: * d96f3e5662350471fd8ff14c47f3daf12e5f151f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4486: [HUDI-3132] Minor fixes for HoodieCatalog

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4486: URL: https://github.com/apache/hudi/pull/4486#issuecomment-1004919082 ## CI report: * d96f3e5662350471fd8ff14c47f3daf12e5f151f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot removed a comment on pull request #4471: [HUDI-3125] spark-sql write timestamp directly

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4471: URL: https://github.com/apache/hudi/pull/4471#issuecomment-1005420426 ## CI report: * 95d370a1a73fe177912bb6fee2362296f01ee779 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4471: [HUDI-3125] spark-sql write timestamp directly

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4471: URL: https://github.com/apache/hudi/pull/4471#issuecomment-1005421630 ## CI report: * 29b1742747a4195db690d09f09de972ab7f409db Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-01-04 Thread GitBox
alexeykudinkin commented on a change in pull request #4333: URL: https://github.com/apache/hudi/pull/4333#discussion_r778577147 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFileReader.java ## @@ -208,71 +155,85 @@ private HoodieLogBlock rea

[GitHub] [hudi] danny0405 commented on pull request #4486: [HUDI-3132] Minor fixes for HoodieCatalog

2022-01-04 Thread GitBox
danny0405 commented on pull request #4486: URL: https://github.com/apache/hudi/pull/4486#issuecomment-1005420695 @hudi-bot run azure

[GitHub] [hudi] hudi-bot removed a comment on pull request #4471: [HUDI-3125] spark-sql write timestamp directly

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4471: URL: https://github.com/apache/hudi/pull/4471#issuecomment-1003094024 ## CI report: * 95d370a1a73fe177912bb6fee2362296f01ee779 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4471: [HUDI-3125] spark-sql write timestamp directly

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4471: URL: https://github.com/apache/hudi/pull/4471#issuecomment-1005420426 ## CI report: * 95d370a1a73fe177912bb6fee2362296f01ee779 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[jira] [Updated] (HUDI-3157) Shade aws-dependencies to avoid class conflicts

2022-01-04 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3157: - Sprint: Hudi 0.10.1 - 2021/01/03 > Shade aws-dependencies to avoid class conflicts >

[jira] [Updated] (HUDI-3158) Reduce warn logs in Spark SQL INSERT OVERWRITE

2022-01-04 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3158: - Sprint: Hudi 0.10.1 - 2021/01/03 > Reduce warn logs in Spark SQL INSERT OVERWRITE > -

[GitHub] [hudi] hudi-bot commented on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4333: URL: https://github.com/apache/hudi/pull/4333#issuecomment-1005414443 ## CI report: * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN * de0d4385394dc5d820964cefc872f099cee7a02b UNKN

[GitHub] [hudi] hudi-bot removed a comment on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4333: URL: https://github.com/apache/hudi/pull/4333#issuecomment-1005356347 ## CI report: * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN * de0d4385394dc5d820964cefc872f099cee7a

[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-01-04 Thread GitBox
alexeykudinkin commented on a change in pull request #4333: URL: https://github.com/apache/hudi/pull/4333#discussion_r778523226 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFileReader.java ## @@ -88,76 +92,24 @@ public HoodieLogFileReader(F

[jira] [Updated] (HUDI-3125) Spark SQL writing timestamp type don't need to disable `spark.sql.datetime.java8API.enabled` manually

2022-01-04 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3125: - Status: Patch Available (was: In Progress) > Spark SQL writing timestamp type don't need to disable > `s

[jira] [Updated] (HUDI-3125) Spark SQL writing timestamp type don't need to disable `spark.sql.datetime.java8API.enabled` manually

2022-01-04 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3125: - Status: In Progress (was: Open) > Spark SQL writing timestamp type don't need to disable > `spark.sql.da

[jira] [Updated] (HUDI-3125) Spark SQL writing timestamp type don't need to disable `spark.sql.datetime.java8API.enabled` manually

2022-01-04 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3125: - Status: In Progress (was: Open) > Spark SQL writing timestamp type don't need to disable > `spark.sql.da

[jira] [Updated] (HUDI-3125) Spark SQL writing timestamp type don't need to disable `spark.sql.datetime.java8API.enabled` manually

2022-01-04 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3125: - Status: Open (was: Patch Available) > Spark SQL writing timestamp type don't need to disable > `spark.sq

[GitHub] [hudi] xushiyan commented on a change in pull request #4471: [HUDI-3125] spark-sql write timestamp directly

2022-01-04 Thread GitBox
xushiyan commented on a change in pull request #4471: URL: https://github.com/apache/hudi/pull/4471#discussion_r778566200 ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/keygen/RowKeyGeneratorHelper.java ## @@ -116,6 +119,7 @@ public static String get

[GitHub] [hudi] dongkelun removed a comment on pull request #4083: [HUDI-2837] The original hoodie.table.name should be maintained in Spark SQL

2022-01-04 Thread GitBox
dongkelun removed a comment on pull request #4083: URL: https://github.com/apache/hudi/pull/4083#issuecomment-1005402855 > @nsivabalan no this won't go to 0.10.1 as it introduces new config. @dongkelun as this won't be included in 0.10.1, can we hold this off until next week to land? just

[GitHub] [hudi] dongkelun commented on pull request #4083: [HUDI-2837] The original hoodie.table.name should be maintained in Spark SQL

2022-01-04 Thread GitBox
dongkelun commented on pull request #4083: URL: https://github.com/apache/hudi/pull/4083#issuecomment-1005402855 > @nsivabalan no this won't go to 0.10.1 as it introduces new config. @dongkelun as this won't be included in 0.10.1, can we hold this off until next week to land? just try to a

[GitHub] [hudi] dongkelun commented on pull request #4083: [HUDI-2837] The original hoodie.table.name should be maintained in Spark SQL

2022-01-04 Thread GitBox
dongkelun commented on pull request #4083: URL: https://github.com/apache/hudi/pull/4083#issuecomment-1005402820 > @nsivabalan no this won't go to 0.10.1 as it introduces new config. @dongkelun as this won't be included in 0.10.1, can we hold this off until next week to land? just try to a

[GitHub] [hudi] waywtdcc edited a comment on issue #4508: [SUPPORT]Duplicate Flink Hudi data

2022-01-04 Thread GitBox
waywtdcc edited a comment on issue #4508: URL: https://github.com/apache/hudi/issues/4508#issuecomment-1005396888 > try to set **index.global.enabled=true** I set index.global.enabled to true; however, the data is still duplicated. Queries returned no duplicates for about 4 days at the beg

[GitHub] [hudi] waywtdcc commented on issue #4508: [SUPPORT]Duplicate Flink Hudi data

2022-01-04 Thread GitBox
waywtdcc commented on issue #4508: URL: https://github.com/apache/hudi/issues/4508#issuecomment-1005396888 > try to set **index.global.enabled=true** I set index.global.enabled to true; however, the data is still duplicated. Queries returned no duplicates for about 4 days at the beginning.

[GitHub] [hudi] xushiyan commented on pull request #4083: [HUDI-2837] The original hoodie.table.name should be maintained in Spark SQL

2022-01-04 Thread GitBox
xushiyan commented on pull request #4083: URL: https://github.com/apache/hudi/pull/4083#issuecomment-1005394634 @nsivabalan no this won't go to 0.10.1 as it introduces new config. @dongkelun as this won't be included in 0.10.1, can we hold this off until next week to land? just try to avoi

[GitHub] [hudi] xushiyan commented on a change in pull request #4489: [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql

2022-01-04 Thread GitBox
xushiyan commented on a change in pull request #4489: URL: https://github.com/apache/hudi/pull/4489#discussion_r778551692 ## File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java ## @@ -109,15 +111,31 @@ public static void deleteMetadataTab

[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4234: [HUDI-2950] Addressing performance traps in Bulk Insert/Layout Optimization

2022-01-04 Thread GitBox
alexeykudinkin commented on a change in pull request #4234: URL: https://github.com/apache/hudi/pull/4234#discussion_r778543602 ## File path: hudi-common/src/main/java/org/apache/hudi/common/util/ObjectSizeCalculator.java ## @@ -90,7 +90,7 @@ public static long getObjectSize(O

[GitHub] [hudi] a0x edited a comment on issue #4474: [SUPPORT] Should we shade all aws dependencies to avoid class conflicts?

2022-01-04 Thread GitBox
a0x edited a comment on issue #4474: URL: https://github.com/apache/hudi/issues/4474#issuecomment-1005364631 Hi, I solved my issue by removing aws deps in the pom file. #4442 What's more, I found that in Hudi 0.8 there are no aws deps in the same pom file. Are there any plans on addin

[GitHub] [hudi] a0x edited a comment on issue #4442: [SUPPORT] PySpark(3.1.2) with Hudi(0.10.0) failed when querying spark sql

2022-01-04 Thread GitBox
a0x edited a comment on issue #4442: URL: https://github.com/apache/hudi/issues/4442#issuecomment-1005363447 Finally I fixed this problem by removing aws deps in `packaging/hudi-spark-bundle/pom.xml` and recompiling it myself. ```xml ```

[GitHub] [hudi] a0x commented on issue #4474: [SUPPORT] Should we shade all aws dependencies to avoid class conflicts?

2022-01-04 Thread GitBox
a0x commented on issue #4474: URL: https://github.com/apache/hudi/issues/4474#issuecomment-1005364631 Hi, I solved my issue by removing aws deps in the pom file. #4442 What's more, I found that in Hudi 0.8 there are no aws deps in the same pom file. Are there any plans for adding aws

[GitHub] [hudi] a0x commented on issue #4442: [SUPPORT] PySpark(3.1.2) with Hudi(0.10.0) failed when querying spark sql

2022-01-04 Thread GitBox
a0x commented on issue #4442: URL: https://github.com/apache/hudi/issues/4442#issuecomment-1005363447 Finally I fixed this problem by removing aws deps in `packaging/hudi-spark-bundle/pom.xml` and recompiling it myself. ```xml com.amazonaws:dynamodb-lock-client com.amazonaws:

[GitHub] [hudi] a0x closed issue #4442: [SUPPORT] PySpark(3.1.2) with Hudi(0.10.0) failed when querying spark sql

2022-01-04 Thread GitBox
a0x closed issue #4442: URL: https://github.com/apache/hudi/issues/4442

[GitHub] [hudi] xushiyan closed pull request #4438: [HUDI-2989] Update location during hive sync

2022-01-04 Thread GitBox
xushiyan closed pull request #4438: URL: https://github.com/apache/hudi/pull/4438

[GitHub] [hudi] xushiyan commented on pull request #4438: [HUDI-2989] Update location during hive sync

2022-01-04 Thread GitBox
xushiyan commented on pull request #4438: URL: https://github.com/apache/hudi/pull/4438#issuecomment-1005361941 @nsivabalan Closing this as won't do. We decided not to update it. We should optimize the hive sync steps instead: https://issues.apache.org/jira/browse/HUDI-3150

[jira] [Assigned] (HUDI-3163) Validate/certify hudi against diff spark 3 versions

2022-01-04 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reassigned HUDI-3163: Assignee: Raymond Xu (was: Raymond Xu) > Validate/certify hudi against diff spark 3 versions > --

[jira] [Updated] (HUDI-3158) Reduce warn logs in Spark SQL INSERT OVERWRITE

2022-01-04 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3158: -- Fix Version/s: 0.11.0 0.10.1 > Reduce warn logs in Spark SQL INSERT O

[GitHub] [hudi] hudi-bot removed a comment on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4333: URL: https://github.com/apache/hudi/pull/4333#issuecomment-1005336147 ## CI report: * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN * de0d4385394dc5d820964cefc872f099cee7a

[GitHub] [hudi] hudi-bot commented on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4333: URL: https://github.com/apache/hudi/pull/4333#issuecomment-1005356347 ## CI report: * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN * de0d4385394dc5d820964cefc872f099cee7a02b UNKN

[GitHub] [hudi] hudi-bot commented on pull request #4507: [HUDI-52] Enabling savepoint and restore for MOR table

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4507: URL: https://github.com/apache/hudi/pull/4507#issuecomment-1005353560 ## CI report: * 2968b1793b9b3e339f3a5267984269e02bdf6c83 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4507: [HUDI-52] Enabling savepoint and restore for MOR table

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4507: URL: https://github.com/apache/hudi/pull/4507#issuecomment-1005329110 ## CI report: * 123962df6f27dbc18a111ebd5c3784e1339ea4f1 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/

[jira] [Updated] (HUDI-3170) Clustering preserve commit metadata retains filegroup id despite writes going to new filegroup

2022-01-04 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3170: -- Fix Version/s: 0.10.1 > Clustering preserve commit metadata retains filegroup id despite writes going >

[jira] [Assigned] (HUDI-3170) Clustering preserve commit metadata retains filegroup id despite writes going to new filegroup

2022-01-04 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit reassigned HUDI-3170: - Epic Link: HUDI-1042 Story Points: 2 Assignee: Sagar Sumit Priority: Blocke

[jira] [Created] (HUDI-3170) Clustering preserve commit metadata retains filegroup id despite writes going to new filegroup

2022-01-04 Thread Sagar Sumit (Jira)
Sagar Sumit created HUDI-3170: - Summary: Clustering preserve commit metadata retains filegroup id despite writes going to new filegroup Key: HUDI-3170 URL: https://issues.apache.org/jira/browse/HUDI-3170

[GitHub] [hudi] hudi-bot commented on pull request #4509: [HUDI-3168] Fixing null schema with empty commit in incremental relation

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4509: URL: https://github.com/apache/hudi/pull/4509#issuecomment-1005346122 ## CI report: * 81384070a72540a9571cbf5c96eb1d185ce0fc90 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4509: [HUDI-3168] Fixing null schema with empty commit in incremental relation

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4509: URL: https://github.com/apache/hudi/pull/4509#issuecomment-1005322404 ## CI report: * 81384070a72540a9571cbf5c96eb1d185ce0fc90 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] YannByron commented on pull request #4203: [HUDI-2909] Handle logical type in TimestampBasedKeyGenerator

2022-01-04 Thread GitBox
YannByron commented on pull request #4203: URL: https://github.com/apache/hudi/pull/4203#issuecomment-1005339630 @nsivabalan @codope I'd like to discuss something related to this implementation. In this PR, most of the work is just to pass `isConsistentLogicalTimestampEnabled` to the method `HoodieAvroUt

[hudi] branch master updated (37b15ff -> a66212d)

2022-01-04 Thread mengtao
This is an automated email from the ASF dual-hosted git repository. mengtao pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 37b15ff [HUDI-3147] Add endpoint_url to dynamodb lock provider (#4500) add a66212d [HUDI-2966] Closing LogRecor

[GitHub] [hudi] xiarixiaoyao merged pull request #4478: [HUDI-2966] Closing LogRecordScanner in compactor

2022-01-04 Thread GitBox
xiarixiaoyao merged pull request #4478: URL: https://github.com/apache/hudi/pull/4478

[GitHub] [hudi] hudi-bot removed a comment on pull request #4478: [HUDI-2966] Closing LogRecordScanner in compactor

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4478: URL: https://github.com/apache/hudi/pull/4478#issuecomment-1005311683 ## CI report: * 0e6ae14a0643c581966b8b7eb1a35d8abf831812 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4478: [HUDI-2966] Closing LogRecordScanner in compactor

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4478: URL: https://github.com/apache/hudi/pull/4478#issuecomment-1005336306 ## CI report: * 7cd3ebebb43de27fa46dde06dfcfc60589d18b96 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot commented on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4333: URL: https://github.com/apache/hudi/pull/4333#issuecomment-1005336147 ## CI report: * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN * de0d4385394dc5d820964cefc872f099cee7a02b UNKN

[GitHub] [hudi] hudi-bot removed a comment on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4333: URL: https://github.com/apache/hudi/pull/4333#issuecomment-17098 ## CI report: * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN * de0d4385394dc5d820964cefc872f099cee7a0

[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2022-01-04 Thread GitBox
alexeykudinkin commented on a change in pull request #4333: URL: https://github.com/apache/hudi/pull/4333#discussion_r778468264 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieDataBlock.java ## @@ -110,59 +136,94 @@ public static HoodieLogB

[jira] [Created] (HUDI-3169) Add comments for hoodie table schema

2022-01-04 Thread Danny Chen (Jira)
Danny Chen created HUDI-3169: Summary: Add comments for hoodie table schema Key: HUDI-3169 URL: https://issues.apache.org/jira/browse/HUDI-3169 Project: Apache Hudi Issue Type: New Feature

[GitHub] [hudi] Guanpx commented on issue #4508: [SUPPORT]Duplicate Flink Hudi data

2022-01-04 Thread GitBox
Guanpx commented on issue #4508: URL: https://github.com/apache/hudi/issues/4508#issuecomment-1005334861 try to set **index.global.enabled=true**

[GitHub] [hudi] Guanpx edited a comment on issue #4510: [SUPPORT] Impala query error

2022-01-04 Thread GitBox
Guanpx edited a comment on issue #4510: URL: https://github.com/apache/hudi/issues/4510#issuecomment-1005333590 **HDFS files and Compaction status** ![image](https://user-images.githubusercontent.com/29246713/148152279-9eaad5fb-b45a-4c73-ab9b-4982d1b2beb4.png) ![image](https://us

[GitHub] [hudi] Guanpx commented on issue #4510: [SUPPORT] Impala query error

2022-01-04 Thread GitBox
Guanpx commented on issue #4510: URL: https://github.com/apache/hudi/issues/4510#issuecomment-1005333590 ![image](https://user-images.githubusercontent.com/29246713/148152279-9eaad5fb-b45a-4c73-ab9b-4982d1b2beb4.png)![image](https://user-images.githubusercontent.com/29246713/148152295-db4ac

[GitHub] [hudi] YannByron commented on a change in pull request #4203: [HUDI-2909] Handle logical type in TimestampBasedKeyGenerator

2022-01-04 Thread GitBox
YannByron commented on a change in pull request #4203: URL: https://github.com/apache/hudi/pull/4203#discussion_r778511325 ## File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/SqlKeyGenerator.scala ## @@ -96,7 +96,7 @@ class SqlKeyGen

[GitHub] [hudi] Guanpx opened a new issue #4510: [SUPPORT]

2022-01-04 Thread GitBox
Guanpx opened a new issue #4510: URL: https://github.com/apache/hudi/issues/4510 **Describe the problem you faced** A clear and concise description of the problem. **To Reproduce** Steps to reproduce the behavior: 1. hudi sync hive 2. CREATE EXTERNAL IMPALA

[GitHub] [hudi] YannByron commented on a change in pull request #4203: [HUDI-2909] Handle logical type in TimestampBasedKeyGenerator

2022-01-04 Thread GitBox
YannByron commented on a change in pull request #4203: URL: https://github.com/apache/hudi/pull/4203#discussion_r778508342 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java ## @@ -889,6 +890,13 @@ public String getKeyGener

[GitHub] [hudi] hudi-bot commented on pull request #4507: [HUDI-52] Enabling savepoint and restore for MOR table

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4507: URL: https://github.com/apache/hudi/pull/4507#issuecomment-1005329110 ## CI report: * 123962df6f27dbc18a111ebd5c3784e1339ea4f1 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?

[GitHub] [hudi] hudi-bot removed a comment on pull request #4507: [HUDI-52] Enabling savepoint and restore for MOR table

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4507: URL: https://github.com/apache/hudi/pull/4507#issuecomment-1005309464 ## CI report: * 123962df6f27dbc18a111ebd5c3784e1339ea4f1 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/

[GitHub] [hudi] hudi-bot commented on pull request #4509: [HUDI-3168] Fixing null schema with empty commit in incremental relation

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4509: URL: https://github.com/apache/hudi/pull/4509#issuecomment-1005322404 ## CI report: * 81384070a72540a9571cbf5c96eb1d185ce0fc90 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4509: [HUDI-3168] Fixing null schema with empty commit in incremental relation

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4509: URL: https://github.com/apache/hudi/pull/4509#issuecomment-1005321184 ## CI report: * 81384070a72540a9571cbf5c96eb1d185ce0fc90 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run

[GitHub] [hudi] hudi-bot commented on pull request #4509: [HUDI-3168] Fixing null schema with empty commit in incremental relation

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4509: URL: https://github.com/apache/hudi/pull/4509#issuecomment-1005321184 ## CI report: * 81384070a72540a9571cbf5c96eb1d185ce0fc90 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure`

[jira] [Updated] (HUDI-3168) NPE with null schema with empty commit in Incremental Relation

2022-01-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3168: - Labels: pull-request-available user-support-issues (was: user-support-issues) > NPE with null sc

[GitHub] [hudi] nsivabalan opened a new pull request #4509: [HUDI-3168] Fixing null schema with empty commit in incremental relation

2022-01-04 Thread GitBox
nsivabalan opened a new pull request #4509: URL: https://github.com/apache/hudi/pull/4509 ## What is the purpose of the pull request When a table created via deltastreamer has only one commit which is empty, there are chances that there is no schema (depending on how schema provider

[jira] [Created] (HUDI-3168) NPE with null schema with empty commit in Incremental Relation

2022-01-04 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-3168: - Summary: NPE with null schema with empty commit in Incremental Relation Key: HUDI-3168 URL: https://issues.apache.org/jira/browse/HUDI-3168 Project: Apache

[jira] [Assigned] (HUDI-3168) NPE with null schema with empty commit in Incremental Relation

2022-01-04 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-3168: - Assignee: sivabalan narayanan > NPE with null schema with empty commit in Increme

[jira] [Updated] (HUDI-3168) NPE with null schema with empty commit in Incremental Relation

2022-01-04 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3168: -- Labels: user-support-issues (was: ) > NPE with null schema with empty commit in Increme

[jira] [Updated] (HUDI-3168) NPE with null schema with empty commit in Incremental Relation

2022-01-04 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3168: -- Fix Version/s: 0.11.0 0.10.1 > NPE with null schema with empty commit

[GitHub] [hudi] waywtdcc opened a new issue #4508: [SUPPORT]Duplicate Flink Hudi data

2022-01-04 Thread GitBox
waywtdcc opened a new issue #4508: URL: https://github.com/apache/hudi/issues/4508 **Describe the problem you faced** Duplicate Flink Hudi data **To Reproduce** Steps to reproduce the behavior: CREATE TABLE hudi.datagen_test3 ( id BIGINT, nam

[GitHub] [hudi] hudi-bot commented on pull request #4478: [HUDI-2966] Closing LogRecordScanner in compactor

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4478: URL: https://github.com/apache/hudi/pull/4478#issuecomment-1005311683 ## CI report: * 0e6ae14a0643c581966b8b7eb1a35d8abf831812 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4478: [HUDI-2966] Closing LogRecordScanner in compactor

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4478: URL: https://github.com/apache/hudi/pull/4478#issuecomment-1005310503 ## CI report: * 0e6ae14a0643c581966b8b7eb1a35d8abf831812 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot removed a comment on pull request #4478: [HUDI-2966] Closing LogRecordScanner in compactor

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4478: URL: https://github.com/apache/hudi/pull/4478#issuecomment-1004489100 ## CI report: * 0e6ae14a0643c581966b8b7eb1a35d8abf831812 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4478: [HUDI-2966] Closing LogRecordScanner in compactor

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4478: URL: https://github.com/apache/hudi/pull/4478#issuecomment-1005310503 ## CI report: * 0e6ae14a0643c581966b8b7eb1a35d8abf831812 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4507: [HUDI-52] Enabling savepoint and restore for MOR table

2022-01-04 Thread GitBox
hudi-bot removed a comment on pull request #4507: URL: https://github.com/apache/hudi/pull/4507#issuecomment-1005308343 ## CI report: * 123962df6f27dbc18a111ebd5c3784e1339ea4f1 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4507: [HUDI-52] Enabling savepoint and restore for MOR table

2022-01-04 Thread GitBox
hudi-bot commented on pull request #4507: URL: https://github.com/apache/hudi/pull/4507#issuecomment-1005309464 ## CI report: * 123962df6f27dbc18a111ebd5c3784e1339ea4f1 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?
