[jira] [Updated] (HUDI-33) Introduce config to allow users to control case-sensitivity in column projections #431

2022-02-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-33?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-33: --- Reviewers: Raymond Xu > Introduce config to allow users to control case-sensitivity in column > projections #43

[GitHub] [hudi] danny0405 commented on a change in pull request #4879: # modify the flink SmallFile list get error when async-compaction has been configured

2022-02-23 Thread GitBox
danny0405 commented on a change in pull request #4879: URL: https://github.com/apache/hudi/pull/4879#discussion_r812633423 ## File path: hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/profile/DeltaWriteProfile.java ## @@ -59,7 +59,7 @@ public DeltaWriteProfile(Hoodi

[GitHub] [hudi] hudi-bot commented on pull request #3808: [HUDI-2560] introduce id_based schema to support full schema evolution.

2022-02-23 Thread GitBox
hudi-bot commented on pull request #3808: URL: https://github.com/apache/hudi/pull/3808#issuecomment-1048531651 ## CI report: * b8bd3b506616cea82e29e4e5a0e95624a60b811a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #3808: [HUDI-2560] introduce id_based schema to support full schema evolution.

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #3808: URL: https://github.com/apache/hudi/pull/3808#issuecomment-1048493298 ## CI report: * a1b1cdbcb5277e7c3061c3ec3bf665027508a468 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot removed a comment on pull request #4752: [WIP][HUDI-3088] Use Spark 3.2 as default Spark version

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4752: URL: https://github.com/apache/hudi/pull/4752#issuecomment-1048512845 ## CI report: * d5f1fbad92cd451d5ac7cf81f5f8612ff18d85ed UNKNOWN * a2f0214488f066837403b0c6c4f6630831d055da Azure: [FAILURE](https://dev.azure.com/apache-hud

[GitHub] [hudi] hudi-bot commented on pull request #4752: [WIP][HUDI-3088] Use Spark 3.2 as default Spark version

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4752: URL: https://github.com/apache/hudi/pull/4752#issuecomment-1048534528 ## CI report: * d5f1fbad92cd451d5ac7cf81f5f8612ff18d85ed UNKNOWN * a2f0214488f066837403b0c6c4f6630831d055da Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org

[GitHub] [hudi] hudi-bot commented on pull request #4878: [HUDI-3465] Add validation of column stats and bloom filters in HoodieMetadataTableValidator

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4878: URL: https://github.com/apache/hudi/pull/4878#issuecomment-1048537032 ## CI report: * d53a320408f02a2c87fed46e1776b40bc218ea84 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4878: [HUDI-3465] Add validation of column stats and bloom filters in HoodieMetadataTableValidator

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4878: URL: https://github.com/apache/hudi/pull/4878#issuecomment-1048498515 ## CI report: * ca5cc35a6d19365efe276483dc6c3fbdff0d8809 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/

[GitHub] [hudi] xushiyan commented on pull request #4720: [HUDI-3221] Support querying a table as of a savepoint

2022-02-23 Thread GitBox
xushiyan commented on pull request #4720: URL: https://github.com/apache/hudi/pull/4720#issuecomment-1048537726 > @xushiyan, is there anything I can do from my side to help you on this PR? I can help with testing using real datasets. Hey @fedsp, thank you for offering help. Feel fre

[GitHub] [hudi] yanenze commented on a change in pull request #4879: # modify the flink SmallFile list get error when async-compaction has been configured

2022-02-23 Thread GitBox
yanenze commented on a change in pull request #4879: URL: https://github.com/apache/hudi/pull/4879#discussion_r812649194 ## File path: hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/profile/DeltaWriteProfile.java ## @@ -59,7 +59,7 @@ public DeltaWriteProfile(HoodieW

[GitHub] [hudi] hudi-bot removed a comment on pull request #4848: [HUDI-3356][HUDI-3203] HoodieData for metadata index records, bloom and colstats init

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4848: URL: https://github.com/apache/hudi/pull/4848#issuecomment-1048483132 ## CI report: * 19ba560542a8769475948561e2b607f85f70b548 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/

[GitHub] [hudi] hudi-bot commented on pull request #4848: [HUDI-3356][HUDI-3203] HoodieData for metadata index records, bloom and colstats init

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4848: URL: https://github.com/apache/hudi/pull/4848#issuecomment-1048543998 ## CI report: * 19ba560542a8769475948561e2b607f85f70b548 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?

[GitHub] [hudi] hudi-bot removed a comment on pull request #4879: # modify the flink SmallFile list get error when async-compaction has been configured

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4879: URL: https://github.com/apache/hudi/pull/4879#issuecomment-1048518510 ## CI report: * 25c05634e335b1f138f2907c5ea71ec5abbbfaa6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4879: # modify the flink SmallFile list get error when async-compaction has been configured

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4879: URL: https://github.com/apache/hudi/pull/4879#issuecomment-1048546648 ## CI report: * 25c05634e335b1f138f2907c5ea71ec5abbbfaa6 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] zhilinli123 opened a new issue #4881: Full incremental Enable index loading to discover duplicate data

2022-02-23 Thread GitBox
zhilinli123 opened a new issue #4881: URL: https://github.com/apache/hudi/issues/4881 We use flink CDC to monitor mysql's latest binlog send kafka consumption Kafka load index full import data index Importing HDFS in batches offline Enable incremental intervention in bulk insert mod

[GitHub] [hudi] hudi-bot removed a comment on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4441: URL: https://github.com/apache/hudi/pull/4441#issuecomment-1048512623 ## CI report: * ee74d4492baca5f4fe9caab22afa349805bd4d5f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4441: URL: https://github.com/apache/hudi/pull/4441#issuecomment-1048548793 ## CI report: * ee74d4492baca5f4fe9caab22afa349805bd4d5f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[jira] [Updated] (HUDI-2752) The MOR DELETE block breaks the event time sequence of CDC

2022-02-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2752: - Sprint: Hudi-Sprint-Jan-3, Hudi-Sprint-Jan-10, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Jan-3, Hudi-Sprint-Ja

[GitHub] [hudi] danny0405 commented on a change in pull request #3808: [HUDI-2560] introduce id_based schema to support full schema evolution.

2022-02-23 Thread GitBox
danny0405 commented on a change in pull request #3808: URL: https://github.com/apache/hudi/pull/3808#discussion_r812656919 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/WriteOperationType.java ## @@ -48,6 +48,8 @@ INSERT_OVERWRITE_TABLE("insert_overwr

[GitHub] [hudi] hudi-bot commented on pull request #4879: # modify the flink SmallFile list get error when async-compaction has been configured

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4879: URL: https://github.com/apache/hudi/pull/4879#issuecomment-1048549461 ## CI report: * 25c05634e335b1f138f2907c5ea71ec5abbbfaa6 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4879: # modify the flink SmallFile list get error when async-compaction has been configured

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4879: URL: https://github.com/apache/hudi/pull/4879#issuecomment-1048546648 ## CI report: * 25c05634e335b1f138f2907c5ea71ec5abbbfaa6 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] cdzryou opened a new issue #4882: [SUPPORT]How to clean action files theses are not used with flink task?

2022-02-23 Thread GitBox
cdzryou opened a new issue #4882: URL: https://github.com/apache/hudi/issues/4882 - How to auto-clean timeline action files in Flink? These are files like `*.commit`, `.*deltacommit`, `.*clean` in the HDFS path '../.hoodie/' - it may create a lot of small files in flink streaming task,

[GitHub] [hudi] danny0405 commented on pull request #4879: The flink small file list should exclude file slices with pending compaction

2022-02-23 Thread GitBox
danny0405 commented on pull request #4879: URL: https://github.com/apache/hudi/pull/4879#issuecomment-1048551748 I have changed the commit title: `The flink small file list should exclude file slices with pending compaction` Can we also fire a JIRA issue here: https://issues.apache.

[GitHub] [hudi] danny0405 commented on pull request #4821: [HUDI-3435] Do not throw exception when instant to rollback does not …

2022-02-23 Thread GitBox
danny0405 commented on pull request #4821: URL: https://github.com/apache/hudi/pull/4821#issuecomment-1048552544 Hello @nsivabalan, I think after this patch, the compaction instant constraint can be removed. Can you double-check this? -- This is an automated message from the Apache Git

[GitHub] [hudi] prashantwason commented on a change in pull request #4821: [HUDI-3435] Do not throw exception when instant to rollback does not …

2022-02-23 Thread GitBox
prashantwason commented on a change in pull request #4821: URL: https://github.com/apache/hudi/pull/4821#discussion_r812666713 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/HoodieTimelineArchiver.java ## @@ -429,25 +428,21 @@ public void mer

[GitHub] [hudi] hudi-bot removed a comment on pull request #4773: [HUDI-3400] Avoid throw exception when create hoodie table

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4773: URL: https://github.com/apache/hudi/pull/4773#issuecomment-1042937208 ## CI report: * cba4eaeedfa9c5fa6971b2fe24e98ff4469a0a15 UNKNOWN * ebdfdb21fc482f2b7792e5b5392cc5ef3f342882 Azure: [CANCELED](https://dev.azure.com/apache-hu

[GitHub] [hudi] hudi-bot commented on pull request #4773: [HUDI-3400] Avoid throw exception when create hoodie table

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4773: URL: https://github.com/apache/hudi/pull/4773#issuecomment-1048561524 ## CI report: * cba4eaeedfa9c5fa6971b2fe24e98ff4469a0a15 UNKNOWN * ebdfdb21fc482f2b7792e5b5392cc5ef3f342882 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-or

[jira] [Created] (HUDI-3488) The flink small file list should exclude file slices with pending compaction

2022-02-23 Thread yanenze (Jira)
yanenze created HUDI-3488: - Summary: The flink small file list should exclude file slices with pending compaction Key: HUDI-3488 URL: https://issues.apache.org/jira/browse/HUDI-3488 Project: Apache Hudi

[GitHub] [hudi] hudi-bot removed a comment on pull request #4773: [HUDI-3400] Avoid throw exception when create hoodie table

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4773: URL: https://github.com/apache/hudi/pull/4773#issuecomment-1048561524 ## CI report: * cba4eaeedfa9c5fa6971b2fe24e98ff4469a0a15 UNKNOWN * ebdfdb21fc482f2b7792e5b5392cc5ef3f342882 Azure: [CANCELED](https://dev.azure.com/apache-hu

[GitHub] [hudi] hudi-bot commented on pull request #4773: [HUDI-3400] Avoid throw exception when create hoodie table

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4773: URL: https://github.com/apache/hudi/pull/4773#issuecomment-1048563964 ## CI report: * cba4eaeedfa9c5fa6971b2fe24e98ff4469a0a15 UNKNOWN * ebdfdb21fc482f2b7792e5b5392cc5ef3f342882 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-or

[jira] [Resolved] (HUDI-3488) The flink small file list should exclude file slices with pending compaction

2022-02-23 Thread yanenze (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yanenze resolved HUDI-3488. --- > The flink small file list should exclude file slices with pending compaction > -

[GitHub] [hudi] yanenze commented on pull request #4879: The flink small file list should exclude file slices with pending compaction

2022-02-23 Thread GitBox
yanenze commented on pull request #4879: URL: https://github.com/apache/hudi/pull/4879#issuecomment-1048566375 > I have changed the commit title: `The flink small file list should exclude file slices with pending compaction` > > Can we also fire a JIRA issue here: https://issues.apa

[GitHub] [hudi] yanenze edited a comment on pull request #4879: [HUDI-3488] The flink small file list should exclude file slices with pending compaction

2022-02-23 Thread GitBox
yanenze edited a comment on pull request #4879: URL: https://github.com/apache/hudi/pull/4879#issuecomment-1048566375 > I have changed the commit title: `The flink small file list should exclude file slices with pending compaction` > > Can we also fire a JIRA issue here: https://iss

[jira] [Updated] (HUDI-3488) The flink small file list should exclude file slices with pending compaction

2022-02-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3488: - Labels: flink hudi pull-request-available (was: flink hudi) > The flink small file list should ex

[jira] [Updated] (HUDI-3161) Add Call Produce Command for spark sql

2022-02-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3161: - Status: In Progress (was: Open) > Add Call Produce Command for spark sql > --

[jira] [Updated] (HUDI-3161) Add Call Produce Command for spark sql

2022-02-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3161: - Status: Patch Available (was: In Progress) > Add Call Produce Command for spark sql > ---

[GitHub] [hudi] cuibo01 commented on pull request #4699: [HUDI-3336][HUDI-FLINK] Support custom hadoop config options for flink

2022-02-23 Thread GitBox
cuibo01 commented on pull request #4699: URL: https://github.com/apache/hudi/pull/4699#issuecomment-1048571140 > I thought about it a little, and we should hold off on this patch; people usually do not pass Hadoop config options through SQL options. Can you describe your use cases again, what kind

[jira] [Updated] (HUDI-3469) Refactor HoodieTestDataGenerator to enable reproducible builds

2022-02-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3469: - Status: In Progress (was: Open) > Refactor HoodieTestDataGenerator to enable reproducible builds > --

[jira] [Updated] (HUDI-3469) Refactor HoodieTestDataGenerator to enable reproducible builds

2022-02-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3469: - Status: Patch Available (was: In Progress) > Refactor HoodieTestDataGenerator to enable reproducible buil

[jira] [Updated] (HUDI-3457) Refactor Spark Relations to avoid code duplication

2022-02-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3457: - Status: Patch Available (was: In Progress) > Refactor Spark Relations to avoid code duplication > ---

[GitHub] [hudi] hudi-bot removed a comment on pull request #4752: [WIP][HUDI-3088] Use Spark 3.2 as default Spark version

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4752: URL: https://github.com/apache/hudi/pull/4752#issuecomment-1048534528 ## CI report: * d5f1fbad92cd451d5ac7cf81f5f8612ff18d85ed UNKNOWN * a2f0214488f066837403b0c6c4f6630831d055da Azure: [FAILURE](https://dev.azure.com/apache-hud

[GitHub] [hudi] hudi-bot commented on pull request #4752: [WIP][HUDI-3088] Use Spark 3.2 as default Spark version

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4752: URL: https://github.com/apache/hudi/pull/4752#issuecomment-1048574852 ## CI report: * d5f1fbad92cd451d5ac7cf81f5f8612ff18d85ed UNKNOWN * d6aebc9d12c8b666d81498dc3d4690088082eda7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org

[GitHub] [hudi] prashantwason commented on a change in pull request #4640: [HUDI-3225] [RFC-45] for async metadata indexing

2022-02-23 Thread GitBox
prashantwason commented on a change in pull request #4640: URL: https://github.com/apache/hudi/pull/4640#discussion_r812687856 ## File path: rfc/rfc-45/rfc-45.md ## @@ -0,0 +1,229 @@ + + +# RFC-45: Asynchronous Metadata Indexing + +## Proposers + +- @codope +- @manojpec + +## A

[GitHub] [hudi] prashantwason commented on a change in pull request #4640: [HUDI-3225] [RFC-45] for async metadata indexing

2022-02-23 Thread GitBox
prashantwason commented on a change in pull request #4640: URL: https://github.com/apache/hudi/pull/4640#discussion_r812688821 ## File path: rfc/rfc-45/rfc-45.md ## @@ -0,0 +1,229 @@ + + +# RFC-45: Asynchronous Metadata Indexing + +## Proposers + +- @codope +- @manojpec + +## A

[GitHub] [hudi] hudi-bot commented on pull request #4880: [HUDI-2752] The MOR DELETE block breaks the event time sequence of CDC

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4880: URL: https://github.com/apache/hudi/pull/4880#issuecomment-1048583870 ## CI report: * 3d7f2d4f3e4ce5c195be0ea9b9fec4edb191525d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4880: [HUDI-2752] The MOR DELETE block breaks the event time sequence of CDC

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4880: URL: https://github.com/apache/hudi/pull/4880#issuecomment-1048518541 ## CI report: * 3d7f2d4f3e4ce5c195be0ea9b9fec4edb191525d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[jira] [Created] (HUDI-3489) Unify config to avoid duplicate code

2022-02-23 Thread leesf (Jira)
leesf created HUDI-3489: --- Summary: Unify config to avoid duplicate code Key: HUDI-3489 URL: https://issues.apache.org/jira/browse/HUDI-3489 Project: Apache Hudi Issue Type: Improvement Repo

[GitHub] [hudi] leesf commented on pull request #4883: [HUDI-3489] Unify config to avoid duplicate code

2022-02-23 Thread GitBox
leesf commented on pull request #4883: URL: https://github.com/apache/hudi/pull/4883#issuecomment-1048592909 CC @xushiyan

[GitHub] [hudi] leesf opened a new pull request #4883: [HUDI-3489] Unify config to avoid duplicate code

2022-02-23 Thread GitBox
leesf opened a new pull request #4883: URL: https://github.com/apache/hudi/pull/4883 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the purpose o

[jira] [Updated] (HUDI-3489) Unify config to avoid duplicate code

2022-02-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3489: - Labels: pull-request-available (was: ) > Unify config to avoid duplicate code > -

[GitHub] [hudi] hudi-bot commented on pull request #4883: [HUDI-3489] Unify config to avoid duplicate code

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4883: URL: https://github.com/apache/hudi/pull/4883#issuecomment-1048595455 ## CI report: * d94fa4f3e0b1d8f223655062abb2ca1661d2ea14 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure`

[GitHub] [hudi] hudi-bot removed a comment on pull request #4883: [HUDI-3489] Unify config to avoid duplicate code

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4883: URL: https://github.com/apache/hudi/pull/4883#issuecomment-1048595455 ## CI report: * d94fa4f3e0b1d8f223655062abb2ca1661d2ea14 UNKNOWN

[GitHub] [hudi] hudi-bot commented on pull request #4883: [HUDI-3489] Unify config to avoid duplicate code

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4883: URL: https://github.com/apache/hudi/pull/4883#issuecomment-1048598371 ## CI report: * d94fa4f3e0b1d8f223655062abb2ca1661d2ea14 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4848: [HUDI-3356][HUDI-3203] HoodieData for metadata index records, bloom and colstats init

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4848: URL: https://github.com/apache/hudi/pull/4848#issuecomment-1048543998 ## CI report: * 19ba560542a8769475948561e2b607f85f70b548 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/

[GitHub] [hudi] hudi-bot commented on pull request #4848: [HUDI-3356][HUDI-3203] HoodieData for metadata index records, bloom and colstats init

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4848: URL: https://github.com/apache/hudi/pull/4848#issuecomment-1048604126 ## CI report: * 19ba560542a8769475948561e2b607f85f70b548 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?

[GitHub] [hudi] hudi-bot commented on pull request #4773: [HUDI-3400] Avoid throw exception when create hoodie table

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4773: URL: https://github.com/apache/hudi/pull/4773#issuecomment-1048606686 ## CI report: * cba4eaeedfa9c5fa6971b2fe24e98ff4469a0a15 UNKNOWN * ebdfdb21fc482f2b7792e5b5392cc5ef3f342882 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-or

[GitHub] [hudi] hudi-bot removed a comment on pull request #4773: [HUDI-3400] Avoid throw exception when create hoodie table

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4773: URL: https://github.com/apache/hudi/pull/4773#issuecomment-1048563964 ## CI report: * cba4eaeedfa9c5fa6971b2fe24e98ff4469a0a15 UNKNOWN * ebdfdb21fc482f2b7792e5b5392cc5ef3f342882 Azure: [CANCELED](https://dev.azure.com/apache-hu

[GitHub] [hudi] hudi-bot removed a comment on pull request #4848: [HUDI-3356][HUDI-3203] HoodieData for metadata index records, bloom and colstats init

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4848: URL: https://github.com/apache/hudi/pull/4848#issuecomment-1048604126 ## CI report: * 19ba560542a8769475948561e2b607f85f70b548 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/

[GitHub] [hudi] hudi-bot commented on pull request #4848: [HUDI-3356][HUDI-3203] HoodieData for metadata index records, bloom and colstats init

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4848: URL: https://github.com/apache/hudi/pull/4848#issuecomment-1048606918 ## CI report: * 125d2cd385219cb9187e0ce6ac90b00cfea863fc Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?

[GitHub] [hudi] hudi-bot removed a comment on pull request #4879: [HUDI-3488] The flink small file list should exclude file slices with pending compaction

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4879: URL: https://github.com/apache/hudi/pull/4879#issuecomment-1048549461 ## CI report: * 25c05634e335b1f138f2907c5ea71ec5abbbfaa6 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4879: [HUDI-3488] The flink small file list should exclude file slices with pending compaction

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4879: URL: https://github.com/apache/hudi/pull/4879#issuecomment-1048630005 ## CI report: * d38b6702a8d74c0328ab6241bced66223cc86851 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #3808: [HUDI-2560] introduce id_based schema to support full schema evolution.

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #3808: URL: https://github.com/apache/hudi/pull/3808#issuecomment-1048531651 ## CI report: * b8bd3b506616cea82e29e4e5a0e95624a60b811a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #3808: [HUDI-2560] introduce id_based schema to support full schema evolution.

2022-02-23 Thread GitBox
hudi-bot commented on pull request #3808: URL: https://github.com/apache/hudi/pull/3808#issuecomment-1048634966 ## CI report: * b8bd3b506616cea82e29e4e5a0e95624a60b811a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot commented on pull request #3808: [HUDI-2560] introduce id_based schema to support full schema evolution.

2022-02-23 Thread GitBox
hudi-bot commented on pull request #3808: URL: https://github.com/apache/hudi/pull/3808#issuecomment-1048637982 ## CI report: * b8bd3b506616cea82e29e4e5a0e95624a60b811a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] xiarixiaoyao commented on a change in pull request #3808: [HUDI-2560] introduce id_based schema to support full schema evolution.

2022-02-23 Thread GitBox
xiarixiaoyao commented on a change in pull request #3808: URL: https://github.com/apache/hudi/pull/3808#discussion_r812748275 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/WriteOperationType.java ## @@ -48,6 +48,8 @@ INSERT_OVERWRITE_TABLE("insert_ove

[GitHub] [hudi] hudi-bot removed a comment on pull request #3808: [HUDI-2560] introduce id_based schema to support full schema evolution.

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #3808: URL: https://github.com/apache/hudi/pull/3808#issuecomment-1048634966 ## CI report: * b8bd3b506616cea82e29e4e5a0e95624a60b811a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot removed a comment on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4441: URL: https://github.com/apache/hudi/pull/4441#issuecomment-1048548793 ## CI report: * ee74d4492baca5f4fe9caab22afa349805bd4d5f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4441: URL: https://github.com/apache/hudi/pull/4441#issuecomment-1048638375 ## CI report: * fb474d121dc178d2284c2d8a80b7f9380aa902a8 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[jira] [Created] (HUDI-3490) Timestamp conversion (parquet)

2022-02-23 Thread Istvan Darvas (Jira)
Istvan Darvas created HUDI-3490: --- Summary: Timestamp conversion (parquet) Key: HUDI-3490 URL: https://issues.apache.org/jira/browse/HUDI-3490 Project: Apache Hudi Issue Type: Bug Re

[jira] [Updated] (HUDI-3459) Add schema on read for parquet and json DFS sources

2022-02-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3459: -- Description: Add schema on read for parquet and json DFS sources, just like we do for CS

[GitHub] [hudi] hudi-bot commented on pull request #4883: [HUDI-3489] Unify config to avoid duplicate code

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4883: URL: https://github.com/apache/hudi/pull/4883#issuecomment-1048677895 ## CI report: * d94fa4f3e0b1d8f223655062abb2ca1661d2ea14 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4883: [HUDI-3489] Unify config to avoid duplicate code

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4883: URL: https://github.com/apache/hudi/pull/4883#issuecomment-1048598371 ## CI report: * d94fa4f3e0b1d8f223655062abb2ca1661d2ea14 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] nsivabalan commented on a change in pull request #4811: [HUDI-3213] Making commit preserve metadata to true for compaction

2022-02-23 Thread GitBox
nsivabalan commented on a change in pull request #4811: URL: https://github.com/apache/hudi/pull/4811#discussion_r812798456 ## File path: hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java ## @@ -373,10 +373,36 @@ public static GenericRecord rewriteRecord(Gener

[GitHub] [hudi] nsivabalan commented on pull request #4811: [HUDI-3213] Making commit preserve metadata to true for compaction

2022-02-23 Thread GitBox
nsivabalan commented on pull request #4811: URL: https://github.com/apache/hudi/pull/4811#issuecomment-1048689691 @xushiyan : this is good to review.

[GitHub] [hudi] hudi-bot commented on pull request #4773: [HUDI-3400] Avoid throw exception when create hoodie table

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4773: URL: https://github.com/apache/hudi/pull/4773#issuecomment-1048700801 ## CI report: * cba4eaeedfa9c5fa6971b2fe24e98ff4469a0a15 UNKNOWN * d32f1ae27baf09d619b065bfaa174c154239c1cc UNKNOWN * 79a1360fb8ff7978bd1f9e9535b5299e8e930f97 Azur

[GitHub] [hudi] hudi-bot removed a comment on pull request #4773: [HUDI-3400] Avoid throw exception when create hoodie table

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4773: URL: https://github.com/apache/hudi/pull/4773#issuecomment-1048606686 ## CI report: * cba4eaeedfa9c5fa6971b2fe24e98ff4469a0a15 UNKNOWN * ebdfdb21fc482f2b7792e5b5392cc5ef3f342882 Azure: [CANCELED](https://dev.azure.com/apache-hu

[GitHub] [hudi] hudi-bot removed a comment on pull request #3808: [HUDI-2560] introduce id_based schema to support full schema evolution.

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #3808: URL: https://github.com/apache/hudi/pull/3808#issuecomment-1048637982 ## CI report: * b8bd3b506616cea82e29e4e5a0e95624a60b811a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #3808: [HUDI-2560] introduce id_based schema to support full schema evolution.

2022-02-23 Thread GitBox
hudi-bot commented on pull request #3808: URL: https://github.com/apache/hudi/pull/3808#issuecomment-1048712991 ## CI report: * e28d8f0fbfdbd67bba30c60d3dfa79e444f2d9c5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4848: [HUDI-3356][HUDI-3203] HoodieData for metadata index records, bloom and colstats init

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4848: URL: https://github.com/apache/hudi/pull/4848#issuecomment-1048606918 ## CI report: * 125d2cd385219cb9187e0ce6ac90b00cfea863fc Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/

[GitHub] [hudi] hudi-bot commented on pull request #4848: [HUDI-3356][HUDI-3203] HoodieData for metadata index records, bloom and colstats init

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4848: URL: https://github.com/apache/hudi/pull/4848#issuecomment-1048722584 ## CI report: * 48399d1f4e5fc3acf04ded4e9ed6e1fbfb34aebd Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[jira] [Commented] (HUDI-3490) Timestamp conversion (parquet)

2022-02-23 Thread Istvan Darvas (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496713#comment-17496713 ] Istvan Darvas commented on HUDI-3490: - DeltaStreamer from Kafka/Json => S3/Hudi table

[jira] [Comment Edited] (HUDI-3490) Timestamp conversion (parquet)

2022-02-23 Thread Istvan Darvas (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496713#comment-17496713 ] Istvan Darvas edited comment on HUDI-3490 at 2/23/22, 12:24 PM:

[GitHub] [hudi] hudi-bot commented on pull request #4752: [WIP][HUDI-3088] Use Spark 3.2 as default Spark version

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4752: URL: https://github.com/apache/hudi/pull/4752#issuecomment-1048730777 ## CI report: * d5f1fbad92cd451d5ac7cf81f5f8612ff18d85ed UNKNOWN * d6aebc9d12c8b666d81498dc3d4690088082eda7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org

[GitHub] [hudi] hudi-bot removed a comment on pull request #4752: [WIP][HUDI-3088] Use Spark 3.2 as default Spark version

2022-02-23 Thread GitBox
hudi-bot removed a comment on pull request #4752: URL: https://github.com/apache/hudi/pull/4752#issuecomment-1048574852 ## CI report: * d5f1fbad92cd451d5ac7cf81f5f8612ff18d85ed UNKNOWN * d6aebc9d12c8b666d81498dc3d4690088082eda7 Azure: [FAILURE](https://dev.azure.com/apache-hud

[jira] [Created] (HUDI-3491) Remove PathFilter from DirectoryLister

2022-02-23 Thread Sagar Sumit (Jira)
Sagar Sumit created HUDI-3491: - Summary: Remove PathFilter from DirectoryLister Key: HUDI-3491 URL: https://issues.apache.org/jira/browse/HUDI-3491 Project: Apache Hudi Issue Type: Task

[jira] [Updated] (HUDI-945) Cleanup spillable map files eagerly as part of close

2022-02-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-945: - Sprint: Cont' improve - 2022/02/14 (was: Cont' improve - 2022/02/14, Cont' improve - 20

[jira] [Comment Edited] (HUDI-3490) Timestamp conversion (parquet)

2022-02-23 Thread Istvan Darvas (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496713#comment-17496713 ] Istvan Darvas edited comment on HUDI-3490 at 2/23/22, 12:30 PM:

[GitHub] [hudi] XuQianJin-Stars commented on a change in pull request #4752: [WIP][HUDI-3088] Use Spark 3.2 as default Spark version

2022-02-23 Thread GitBox
XuQianJin-Stars commented on a change in pull request #4752: URL: https://github.com/apache/hudi/pull/4752#discussion_r802433824 ## File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/testutils/UtilitiesTestBase.java ## @@ -410,6 +411,16 @@ public static String t

[GitHub] [hudi] fedsp commented on pull request #4720: [HUDI-3221] Support querying a table as of a savepoint

2022-02-23 Thread GitBox
fedsp commented on pull request #4720: URL: https://github.com/apache/hudi/pull/4720#issuecomment-1048755045 @xushiyan great! I will do this by today. I'm planning to use it on aws glue which unfortunately only offers spark 3.1 today. I know that Hudi documentation says explicitly that the

[jira] [Updated] (HUDI-3482) Add insert overwrite tests for deltastreamer

2022-02-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3482: -- Sprint: Hudi-Sprint-Feb-22 > Add insert overwrite tests for deltastreamer > ---

[jira] [Updated] (HUDI-3483) Add insert overwrite tests for spark DS

2022-02-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3483: -- Sprint: Hudi-Sprint-Feb-22 > Add insert overwrite tests for spark DS > -

[jira] [Updated] (HUDI-3480) Add integ test suite job for medium scale w/o deleting input

2022-02-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3480: -- Sprint: Hudi-Sprint-Feb-22 > Add integ test suite job for medium scale w/o deleting inpu

[jira] [Updated] (HUDI-3479) Add long running multi-clustering deltastreamer tests

2022-02-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3479: -- Sprint: Hudi-Sprint-Feb-22 > Add long running multi-clustering deltastreamer tests > ---

[jira] [Updated] (HUDI-3481) add long running spark DS tests w/ inline clustering

2022-02-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3481: -- Sprint: Hudi-Sprint-Feb-22 > add long running spark DS tests w/ inline clustering > ---

[jira] [Updated] (HUDI-3484) add medium scale tests for spark DS w/o input data deletion

2022-02-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3484: -- Sprint: Hudi-Sprint-Feb-22 > add medium scale tests for spark DS w/o input data deletion

[GitHub] [hudi] nsivabalan opened a new pull request #4884: [HUDI-3480][HUDI-3481] Enchancements to integ test suite

2022-02-23 Thread GitBox
nsivabalan opened a new pull request #4884: URL: https://github.com/apache/hudi/pull/4884 ## What is the purpose of the pull request - Most of our existing tests delete input data at the end of every iteration so that we can do long-running tests. So validation at the end of every

[jira] [Updated] (HUDI-3480) Add integ test suite job for medium scale w/o deleting input

2022-02-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3480: - Labels: pull-request-available (was: ) > Add integ test suite job for medium scale w/o deleting i

[GitHub] [hudi] nsivabalan merged pull request #4883: [HUDI-3489] Unify config to avoid duplicate code

2022-02-23 Thread GitBox
nsivabalan merged pull request #4883: URL: https://github.com/apache/hudi/pull/4883 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubs

[GitHub] [hudi] hudi-bot commented on pull request #4884: [HUDI-3480][HUDI-3481] Enchancements to integ test suite

2022-02-23 Thread GitBox
hudi-bot commented on pull request #4884: URL: https://github.com/apache/hudi/pull/4884#issuecomment-1048769858 ## CI report: * a370abe330b1f0da059edf0afad5adf21277d59c UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure`

[hudi] branch master updated (4e8accc -> 2a93b8e)

2022-02-23 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 4e8accc [HUDI-3486] Fix wrong field order for constructing HoodieMetadataColumnStats (#4875) add 2a93b8e [HU
