[jira] [Updated] (HUDI-4453) Support partition pruning for tables Bootstrapped from Source Hive Style partitioned tables

2022-09-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4453: -- Status: Patch Available (was: In Progress) > Support partition pruning for tables Boots

[jira] [Updated] (HUDI-915) Partition Columns missing in files upserted after Metadata Bootstrap

2022-09-15 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-915: --- Status: In Progress (was: Open) > Partition Columns missing in files upserted after Metadata Bootstrap > -

[jira] [Updated] (HUDI-4785) Cannot find partition column when querying bootstrapped table in Spark

2022-09-15 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4785: Status: In Progress (was: Open) > Cannot find partition column when querying bootstrapped table in Spark >

[jira] [Updated] (HUDI-4784) Full-record bootstrap does not generate correct partition path

2022-09-15 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4784: Status: In Progress (was: Open) > Full-record bootstrap does not generate correct partition path >

[jira] [Updated] (HUDI-4783) Hive-style partition path ("partition=value") does not work with bootstrap

2022-09-15 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4783: Status: In Progress (was: Open) > Hive-style partition path ("partition=value") does not work with bootstra

[jira] [Updated] (HUDI-4453) Support partition pruning for tables Bootstrapped from Source Hive Style partitioned tables

2022-09-15 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4453: Sprint: 2022/09/05 (was: 2022/09/19) > Support partition pruning for tables Bootstrapped from Source Hive S

[jira] [Updated] (HUDI-4453) Support partition pruning for tables Bootstrapped from Source Hive Style partitioned tables

2022-09-15 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4453: Status: In Progress (was: Open) > Support partition pruning for tables Bootstrapped from Source Hive Style

[GitHub] [hudi] hudi-bot commented on pull request #6575: [HUDI-4754] Add compliance check in github actions

2022-09-15 Thread GitBox
hudi-bot commented on PR #6575: URL: https://github.com/apache/hudi/pull/6575#issuecomment-1248275427 ## CI report: * 1600e31836157c8d05e3bc8b9e08e1717471f1a6 UNKNOWN * 4d02f2c64a5fc4b89889677ee639a20b53cec26a UNKNOWN * 48147d19c835e7868102fd2d083659e6ee2ac343 UNKNOWN * c6

[jira] [Commented] (HUDI-4759) Fix website Quick start guide to add validations

2022-09-15 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17605419#comment-17605419 ] Jonathan Vexler commented on HUDI-4759: --- insert data python code error: no "\" at e

[jira] [Commented] (HUDI-4759) Fix website Quick start guide to add validations

2022-09-15 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17605418#comment-17605418 ] Jonathan Vexler commented on HUDI-4759: --- time travel query python code error: uses

[GitHub] [hudi] hudi-bot commented on pull request #6575: [HUDI-4754] Add compliance check in github actions

2022-09-15 Thread GitBox
hudi-bot commented on PR #6575: URL: https://github.com/apache/hudi/pull/6575#issuecomment-1248268779 ## CI report: * 1600e31836157c8d05e3bc8b9e08e1717471f1a6 UNKNOWN * 4d02f2c64a5fc4b89889677ee639a20b53cec26a UNKNOWN * 48147d19c835e7868102fd2d083659e6ee2ac343 UNKNOWN * c6

[GitHub] [hudi] hudi-bot commented on pull request #6630: [HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in lo…

2022-09-15 Thread GitBox
hudi-bot commented on PR #6630: URL: https://github.com/apache/hudi/pull/6630#issuecomment-1248261319 ## CI report: * c6fe58f992656d26e60f24e2b5791613f55e5bd3 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1138

[jira] [Updated] (HUDI-992) For hive-style partitioned source data, partition columns synced with Hive will always have String type

2022-09-15 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-992: --- Status: In Progress (was: Open) > For hive-style partitioned source data, partition columns synced with Hive

[jira] [Closed] (HUDI-3403) Ensure immutable hudi configurations are set properly and not changed later

2022-09-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-3403. - Resolution: Fixed > Ensure immutable hudi configurations are set properly and not changed later >

[GitHub] [hudi] jonvex opened a new pull request, #6682: ran checkstyle on the site pages

2022-09-15 Thread GitBox
jonvex opened a new pull request, #6682: URL: https://github.com/apache/hudi/pull/6682 ### Change Logs Ran the checkstyle on quick-start-guide and the 0.12.0 version of that. ### Impact _Describe any public API or user-facing feature change or any performance impact._

[hudi] branch master updated: [HUDI-3403] Ensure keygen props are set for bootstrap (#6645)

2022-09-15 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new a3921a845f [HUDI-3403] Ensure keygen props are set

[GitHub] [hudi] yihua merged pull request #6645: [HUDI-3403] Ensure keygen props are set for bootstrap

2022-09-15 Thread GitBox
yihua merged PR #6645: URL: https://github.com/apache/hudi/pull/6645 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

[GitHub] [hudi] dmenin commented on issue #3975: [SUPPORT] Question on hudi's delete statment taking too long

2022-09-15 Thread GitBox
dmenin commented on issue #3975: URL: https://github.com/apache/hudi/issues/3975#issuecomment-1248234846 I totally understand that it would be tricky to automate that logic; however, I would be fine with being responsible for submitting this list of partitions as part of the job parameters, a

[jira] [Updated] (HUDI-619) Investigate and implement mechanism to have hive/presto/sparksql queries avoid stitching and return null values for hoodie columns

2022-09-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-619: - Sprint: 2022/09/19 (was: 2022/09/05) > Investigate and implement mechanism to have hive/presto/sparksql qu

[jira] [Updated] (HUDI-4785) Cannot find partition column when querying bootstrapped table in Spark

2022-09-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-4785: -- Sprint: 2022/09/05 > Cannot find partition column when querying bootstrapped table in Spark > --

[jira] [Assigned] (HUDI-4651) Test COW: Spark datasource writing with non-Hudi partitions

2022-09-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit reassigned HUDI-4651: - Assignee: Sagar Sumit (was: Ethan Guo) > Test COW: Spark datasource writing with non-Hudi partit

[GitHub] [hudi] bhasudha merged pull request #6654: [DOCS] fix site tags various style issues

2022-09-15 Thread GitBox
bhasudha merged PR #6654: URL: https://github.com/apache/hudi/pull/6654

[hudi] branch asf-site updated: [DOCS] fix site tags various style issues (#6654)

2022-09-15 Thread bhavanisudha
bhavanisudha pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new af54b82e97 [DOCS] fix site tags various

[GitHub] [hudi] hudi-bot commented on pull request #6476: [HUDI-3478] Support CDC for Spark in Hudi

2022-09-15 Thread GitBox
hudi-bot commented on PR #6476: URL: https://github.com/apache/hudi/pull/6476#issuecomment-1248191245 ## CI report: * eef1ba6269240d968d2e0c6e9acd4c51b84c6eb5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1138

[hudi] branch master updated (a1dedf3d59 -> c22568ee28)

2022-09-15 Thread codope
codope pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from a1dedf3d59 [HUDI-4844] Skip partition value resolving when the field does not exists for MergeOnReadInputFormat#getRead

[GitHub] [hudi] codope merged pull request #6666: [MINOR] Fix the Spark job status description for metadata-only bootstrap operation

2022-09-15 Thread GitBox
codope merged PR #: URL: https://github.com/apache/hudi/pull/

[GitHub] [hudi] pratyakshsharma commented on a diff in pull request #6662: [HUDI-4832] Fix drop partition meta sync

2022-09-15 Thread GitBox
pratyakshsharma commented on code in PR #6662: URL: https://github.com/apache/hudi/pull/6662#discussion_r972062851 ## hudi-sync/hudi-sync-common/src/main/java/org/apache/hudi/sync/common/HoodieSyncClient.java: ## @@ -83,18 +87,24 @@ public boolean isBootstrap() { return met

[GitHub] [hudi] hudi-bot commented on pull request #6681: [HUDI-4071] Remove default value for mandatory record key field

2022-09-15 Thread GitBox
hudi-bot commented on PR #6681: URL: https://github.com/apache/hudi/pull/6681#issuecomment-1248184126 ## CI report: * 3ae2406a29ff29a8cf6ae0ea36c4cb048b26da8a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1138

[GitHub] [hudi] hudi-bot commented on pull request #6681: [HUDI-4071] Remove default value for mandatory record key field

2022-09-15 Thread GitBox
hudi-bot commented on PR #6681: URL: https://github.com/apache/hudi/pull/6681#issuecomment-1248176339 ## CI report: * 3ae2406a29ff29a8cf6ae0ea36c4cb048b26da8a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1138

[GitHub] [hudi] hudi-bot commented on pull request #6662: [HUDI-4832] Fix drop partition meta sync

2022-09-15 Thread GitBox
hudi-bot commented on PR #6662: URL: https://github.com/apache/hudi/pull/6662#issuecomment-1248176123 ## CI report: * 8d100f5793803b053673f0730ea34b7c75e1d41c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1132

[GitHub] [hudi] hudi-bot commented on pull request #6665: Incremental Ingestion from GCS

2022-09-15 Thread GitBox
hudi-bot commented on PR #6665: URL: https://github.com/apache/hudi/pull/6665#issuecomment-1248176198 ## CI report: * c23c9e5ddfe4867a5d1a03d0670fe26f9f27118c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1133

[GitHub] [hudi] hudi-bot commented on pull request #6476: [HUDI-3478] Support CDC for Spark in Hudi

2022-09-15 Thread GitBox
hudi-bot commented on PR #6476: URL: https://github.com/apache/hudi/pull/6476#issuecomment-1248175455 ## CI report: * 34088aeee92daffe28ef3a17c04bb8e000f233e7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1136

[GitHub] [hudi] hudi-bot commented on pull request #6681: [HUDI-4071] Remove default value for mandatory record key field

2022-09-15 Thread GitBox
hudi-bot commented on PR #6681: URL: https://github.com/apache/hudi/pull/6681#issuecomment-1248167030 ## CI report: * 3ae2406a29ff29a8cf6ae0ea36c4cb048b26da8a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1138

[GitHub] [hudi] hudi-bot commented on pull request #6665: Incremental Ingestion from GCS

2022-09-15 Thread GitBox
hudi-bot commented on PR #6665: URL: https://github.com/apache/hudi/pull/6665#issuecomment-1248166878 ## CI report: * c23c9e5ddfe4867a5d1a03d0670fe26f9f27118c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1133

[GitHub] [hudi] hudi-bot commented on pull request #6662: [HUDI-4832] Fix drop partition meta sync

2022-09-15 Thread GitBox
hudi-bot commented on PR #6662: URL: https://github.com/apache/hudi/pull/6662#issuecomment-1248166767 ## CI report: * 8d100f5793803b053673f0730ea34b7c75e1d41c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1132

[GitHub] [hudi] hudi-bot commented on pull request #6476: [HUDI-3478] Support CDC for Spark in Hudi

2022-09-15 Thread GitBox
hudi-bot commented on PR #6476: URL: https://github.com/apache/hudi/pull/6476#issuecomment-1248166076 ## CI report: * 34088aeee92daffe28ef3a17c04bb8e000f233e7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1136

[GitHub] [hudi] pratyakshsharma commented on a diff in pull request #6662: [HUDI-4832] Fix drop partition meta sync

2022-09-15 Thread GitBox
pratyakshsharma commented on code in PR #6662: URL: https://github.com/apache/hudi/pull/6662#discussion_r972042716 ## hudi-sync/hudi-sync-common/src/main/java/org/apache/hudi/sync/common/HoodieSyncClient.java: ## @@ -83,18 +87,24 @@ public boolean isBootstrap() { return met

[GitHub] [hudi] codope commented on a diff in pull request #6662: [HUDI-4832] Fix drop partition meta sync

2022-09-15 Thread GitBox
codope commented on code in PR #6662: URL: https://github.com/apache/hudi/pull/6662#discussion_r971980886 ## hudi-sync/hudi-sync-common/src/main/java/org/apache/hudi/sync/common/HoodieSyncClient.java: ## @@ -83,18 +87,24 @@ public boolean isBootstrap() { return metaClient.g

[GitHub] [hudi] codope commented on a diff in pull request #6662: [HUDI-4832] Fix drop partition meta sync

2022-09-15 Thread GitBox
codope commented on code in PR #6662: URL: https://github.com/apache/hudi/pull/6662#discussion_r971978788 ## hudi-sync/hudi-hive-sync/src/test/java/org/apache/hudi/hive/TestHiveSyncTool.java: ## @@ -88,8 +88,8 @@ public class TestHiveSyncTool { private static final List SY

[GitHub] [hudi] hudi-bot commented on pull request #6634: [HUDI-4813] Fix infer keygen not work in sparksql side issue

2022-09-15 Thread GitBox
hudi-bot commented on PR #6634: URL: https://github.com/apache/hudi/pull/6634#issuecomment-1248071289 ## CI report: * f32cf8515a4984c75bc26641abd3a5b042ebe372 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1137

[jira] [Commented] (HUDI-621) Presto Integration for supporting Bootstrapped table

2022-09-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17605320#comment-17605320 ] Sagar Sumit commented on HUDI-621: -- This is implemented in [https://github.com/prestodb/p

[jira] [Closed] (HUDI-621) Presto Integration for supporting Bootstrapped table

2022-09-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-621. Resolution: Done > Presto Integration for supporting Bootstrapped table > ---

[jira] [Closed] (HUDI-955) Test MOR : Presto Read Optimized Query with metadata bootstrap

2022-09-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-955. Resolution: Done > Test MOR : Presto Read Optimized Query with metadata bootstrap > -

[jira] [Commented] (HUDI-3983) ClassNotFoundException when using hudi-spark-bundle to write table with hbase index

2022-09-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17605317#comment-17605317 ] Sagar Sumit commented on HUDI-3983: --- [~xichaomin] It's best not to remove the relocation

[jira] [Updated] (HUDI-3983) ClassNotFoundException when using hudi-spark-bundle to write table with hbase index

2022-09-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3983: -- Status: In Progress (was: Open) > ClassNotFoundException when using hudi-spark-bundle to write table wi

[GitHub] [hudi] hudi-bot commented on pull request #6681: [HUDI-4071] Remove default value for mandatory record key field

2022-09-15 Thread GitBox
hudi-bot commented on PR #6681: URL: https://github.com/apache/hudi/pull/6681#issuecomment-1248003265 ## CI report: * 3ae2406a29ff29a8cf6ae0ea36c4cb048b26da8a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1138

[GitHub] [hudi] hudi-bot commented on pull request #6630: [HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in lo…

2022-09-15 Thread GitBox
hudi-bot commented on PR #6630: URL: https://github.com/apache/hudi/pull/6630#issuecomment-1248003080 ## CI report: * bdf3e337bd0b82ba1e59887f10cffdbe50fbde99 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1137

[GitHub] [hudi] hudi-bot commented on pull request #6476: [HUDI-3478] Support CDC for Spark in Hudi

2022-09-15 Thread GitBox
hudi-bot commented on PR #6476: URL: https://github.com/apache/hudi/pull/6476#issuecomment-1248002706 ## CI report: * 34088aeee92daffe28ef3a17c04bb8e000f233e7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1136

[jira] [Closed] (HUDI-4256) Bulk insert of a large dataset with S3 fails w/ timeline server based markers

2022-09-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-4256. - Resolution: Cannot Reproduce Hudi has an exponential backoff [retry mechanism|https://github.com/apache/h

[jira] [Updated] (HUDI-3403) Ensure immutable hudi configurations are set properly and not changed later

2022-09-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3403: -- Summary: Ensure immutable hudi configurations are set properly and not changed later (was: Manage immut

[GitHub] [hudi] hudi-bot commented on pull request #6681: [HUDI-4071] Remove default value for mandatory record key field

2022-09-15 Thread GitBox
hudi-bot commented on PR #6681: URL: https://github.com/apache/hudi/pull/6681#issuecomment-1247997929 ## CI report: * 3ae2406a29ff29a8cf6ae0ea36c4cb048b26da8a UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #6630: [HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in lo…

2022-09-15 Thread GitBox
hudi-bot commented on PR #6630: URL: https://github.com/apache/hudi/pull/6630#issuecomment-1247997685 ## CI report: * bdf3e337bd0b82ba1e59887f10cffdbe50fbde99 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1137

[GitHub] [hudi] hudi-bot commented on pull request #6476: [HUDI-3478] Support CDC for Spark in Hudi

2022-09-15 Thread GitBox
hudi-bot commented on PR #6476: URL: https://github.com/apache/hudi/pull/6476#issuecomment-1247997320 ## CI report: * 34088aeee92daffe28ef3a17c04bb8e000f233e7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1136

[jira] [Commented] (HUDI-4844) Skip partition value resolving when the field does not exists for MergeOnReadInputFormat#getReader

2022-09-15 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17605286#comment-17605286 ] Danny Chen commented on HUDI-4844: -- Fixed via master branch: a1dedf3d595e3e1a7bd0aaab4c43

[jira] [Resolved] (HUDI-4844) Skip partition value resolving when the field does not exists for MergeOnReadInputFormat#getReader

2022-09-15 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen resolved HUDI-4844. -- > Skip partition value resolving when the field does not exists for > MergeOnReadInputFormat#getReader > --

[hudi] branch master updated (851c6e12db -> a1dedf3d59)

2022-09-15 Thread danny0405
danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 851c6e12db [HUDI-4780] hoodie.logfile.max.size It does not take effect, causing the log file to be too large (#6602)

[GitHub] [hudi] danny0405 merged pull request #6678: [HUDI-4844] Skip partition value resolving when the field does not ex…

2022-09-15 Thread GitBox
danny0405 merged PR #6678: URL: https://github.com/apache/hudi/pull/6678

[GitHub] [hudi] codope opened a new pull request, #6681: [HUDI-4071] Remove default value for mandatory record key field

2022-09-15 Thread GitBox
codope opened a new pull request, #6681: URL: https://github.com/apache/hudi/pull/6681 ### Change Logs Record key field is mandatory, so this PR removed the default value for that field. This is not backwards compatible. Most users set the record key, but if some user relied on `uuid

[jira] [Created] (HUDI-4849) [DOCS] Removal of default value for record key field

2022-09-15 Thread Sagar Sumit (Jira)
Sagar Sumit created HUDI-4849: - Summary: [DOCS] Removal of default value for record key field Key: HUDI-4849 URL: https://issues.apache.org/jira/browse/HUDI-4849 Project: Apache Hudi Issue Type:

[GitHub] [hudi] TJX2014 commented on pull request #6634: [HUDI-4813] Fix infer keygen not work in sparksql side issue

2022-09-15 Thread GitBox
TJX2014 commented on PR #6634: URL: https://github.com/apache/hudi/pull/6634#issuecomment-1247975433 Hi @danny0405, https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=11379&view=results is the latest run, right? It seems CI will retain at least one failed record:

[GitHub] [hudi] low-on-mana commented on issue #5180: [SUPPORT] Which hudi deltastreamer configuration to use with JSONKafkaSource when target schema is fixed and source contains sparse events

2022-09-15 Thread GitBox
low-on-mana commented on issue #5180: URL: https://github.com/apache/hudi/issues/5180#issuecomment-1247970752 > I came across this and added a unit test case in _TestOverwriteNonDefaultsWithLatestAvroPayload_ to check whether partial updates work or not by using _OverwriteN

[GitHub] [hudi] TJX2014 commented on a diff in pull request #6630: [HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in lo…

2022-09-15 Thread GitBox
TJX2014 commented on code in PR #6630: URL: https://github.com/apache/hudi/pull/6630#discussion_r971860117 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bucket/HoodieSimpleBucketIndex.java: ## @@ -52,10 +54,20 @@ private Map loadPartitionBucketIdFileIdMa

[GitHub] [hudi] danny0405 commented on a diff in pull request #6630: [HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in lo…

2022-09-15 Thread GitBox
danny0405 commented on code in PR #6630: URL: https://github.com/apache/hudi/pull/6630#discussion_r971858003 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bucket/HoodieSimpleBucketIndex.java: ## @@ -52,10 +54,20 @@ private Map loadPartitionBucketIdFileId

[GitHub] [hudi] koochiswathiTR commented on issue #6606: Observing data duplication with Single Writer

2022-09-15 Thread GitBox
koochiswathiTR commented on issue #6606: URL: https://github.com/apache/hudi/issues/6606#issuecomment-1247962307 @zhedoubushishi @nsivabalan @xushiyan It would be great to hear from you soon. Complete stack trace: "java.util.ConcurrentModificationException: Cannot resolv

[GitHub] [hudi] hudi-bot commented on pull request #6678: [HUDI-4844] Skip partition value resolving when the field does not ex…

2022-09-15 Thread GitBox
hudi-bot commented on PR #6678: URL: https://github.com/apache/hudi/pull/6678#issuecomment-1247920707 ## CI report: * 8fee8139ce7ecf9271c6c40cd414c60a03d26eba Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1137

[GitHub] [hudi] hudi-bot commented on pull request #6634: [HUDI-4813] Fix infer keygen not work in sparksql side issue

2022-09-15 Thread GitBox
hudi-bot commented on PR #6634: URL: https://github.com/apache/hudi/pull/6634#issuecomment-1247920497 ## CI report: * 81546636356aa5b224b87087bad14a1da1a06a4d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1131

[GitHub] [hudi] hudi-bot commented on pull request #6630: [HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in lo…

2022-09-15 Thread GitBox
hudi-bot commented on PR #6630: URL: https://github.com/apache/hudi/pull/6630#issuecomment-1247920439 ## CI report: * 85a8f5166c17ec5ce9fa00e2c38846f440582acf Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1123

[GitHub] [hudi] zhangyue19921010 commented on pull request #5416: [HUDI-3963] Use Lock-Free Message Queue Disruptor Improving Hoodie Writing Efficiency

2022-09-15 Thread GitBox
zhangyue19921010 commented on PR #5416: URL: https://github.com/apache/hudi/pull/5416#issuecomment-1247917674 > Thanks for running the benchmarks @zhangyue19921010! > > Please sum them up in a table so that they are easier to digest and compare. I will prioritize reviewing this PR thi

[GitHub] [hudi] hudi-bot commented on pull request #6630: [HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in lo…

2022-09-15 Thread GitBox
hudi-bot commented on PR #6630: URL: https://github.com/apache/hudi/pull/6630#issuecomment-1247914535 ## CI report: * 85a8f5166c17ec5ce9fa00e2c38846f440582acf Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1123

[GitHub] [hudi] hudi-bot commented on pull request #6634: [HUDI-4813] Fix infer keygen not work in sparksql side issue

2022-09-15 Thread GitBox
hudi-bot commented on PR #6634: URL: https://github.com/apache/hudi/pull/6634#issuecomment-1247914631 ## CI report: * 81546636356aa5b224b87087bad14a1da1a06a4d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1131

[GitHub] [hudi] zhangyue19921010 commented on a diff in pull request #6600: [RFC-62] Diagnostic Reporter

2022-09-15 Thread GitBox
zhangyue19921010 commented on code in PR #6600: URL: https://github.com/apache/hudi/pull/6600#discussion_r971793084 ## rfc/rfc-62/rfc-62.md: ## @@ -0,0 +1,443 @@ + +# RFC-62: Diagnostic Reporter + + + +## Proposers + +- zhangyue19921...@163.com + +## Approvers + - @codope + - @x

[GitHub] [hudi] zhangyue19921010 commented on a diff in pull request #6600: [RFC-62] Diagnostic Reporter

2022-09-15 Thread GitBox
zhangyue19921010 commented on code in PR #6600: URL: https://github.com/apache/hudi/pull/6600#discussion_r971808477 ## rfc/rfc-62/rfc-62.md: ## @@ -0,0 +1,443 @@ + +# RFC-62: Diagnostic Reporter + + + +## Proposers + +- zhangyue19921...@163.com + +## Approvers + - @codope + - @x

[GitHub] [hudi] zhangyue19921010 commented on pull request #6600: [RFC-62] Diagnostic Reporter

2022-09-15 Thread GitBox
zhangyue19921010 commented on PR #6600: URL: https://github.com/apache/hudi/pull/6600#issuecomment-1247891091 > Could we also mention the need for obfuscation of enterprise information such as IP addresses, hostnames, buckets, table or column names, and so on? Sure we can collect these inf

[GitHub] [hudi] zhangyue19921010 commented on pull request #6600: [RFC-62] Diagnostic Reporter

2022-09-15 Thread GitBox
zhangyue19921010 commented on PR #6600: URL: https://github.com/apache/hudi/pull/6600#issuecomment-1247885073 Hi @codope and @YuweiXiao, I really appreciate your efforts here! All comments are addressed. PTAL!

[GitHub] [hudi] zhangyue19921010 commented on a diff in pull request #6600: [RFC-62] Diagnostic Reporter

2022-09-15 Thread GitBox
zhangyue19921010 commented on code in PR #6600: URL: https://github.com/apache/hudi/pull/6600#discussion_r971794995 ## rfc/rfc-62/rfc-62.md: ## @@ -0,0 +1,443 @@ + +# RFC-62: Diagnostic Reporter + + + +## Proposers + +- zhangyue19921...@163.com + +## Approvers + - @codope + - @x

[GitHub] [hudi] zhangyue19921010 commented on a diff in pull request #6600: [RFC-62] Diagnostic Reporter

2022-09-15 Thread GitBox
zhangyue19921010 commented on code in PR #6600: URL: https://github.com/apache/hudi/pull/6600#discussion_r971793519 ## rfc/rfc-62/rfc-62.md: ## @@ -0,0 +1,443 @@ + +# RFC-62: Diagnostic Reporter + + + +## Proposers + +- zhangyue19921...@163.com + +## Approvers + - @codope + - @x

[GitHub] [hudi] zhangyue19921010 commented on a diff in pull request #6600: [RFC-62] Diagnostic Reporter

2022-09-15 Thread GitBox
zhangyue19921010 commented on code in PR #6600: URL: https://github.com/apache/hudi/pull/6600#discussion_r971792776 ## rfc/rfc-62/rfc-62.md: ## @@ -0,0 +1,443 @@ + +# RFC-62: Diagnostic Reporter + + + +## Proposers + +- zhangyue19921...@163.com + +## Approvers + - @codope + - @x

[GitHub] [hudi] ganczarek commented on issue #4656: [SUPPORT] Slow file listing after update to Hudi 0.10.0

2022-09-15 Thread GitBox
ganczarek commented on issue #4656: URL: https://github.com/apache/hudi/issues/4656#issuecomment-1247863914 @nsivabalan I'm sorry, but I no longer have a setup to test it the same way as I did back in January. I will update to Hudi v0.12 and enable metadata table to see if there are perform

[jira] [Updated] (HUDI-4848) Fix tooling for deprecated partition

2022-09-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4848: -- Fix Version/s: 0.12.1 > Fix tooling for deprecated partition >

[jira] [Assigned] (HUDI-4848) Fix tooling for deprecated partition

2022-09-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-4848: - Assignee: sivabalan narayanan > Fix tooling for deprecated partition > -

[jira] [Created] (HUDI-4848) Fix tooling for deprecated partition

2022-09-15 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-4848: - Summary: Fix tooling for deprecated partition Key: HUDI-4848 URL: https://issues.apache.org/jira/browse/HUDI-4848 Project: Apache Hudi Issue Type:

[jira] [Assigned] (HUDI-4847) hive sync fails w/ utilities bundle in 0.13-snapshot, but succeeds w/ 0.11

2022-09-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-4847: - Assignee: Sagar Sumit > hive sync fails w/ utilities bundle in 0.13-snapshot, but

[jira] [Created] (HUDI-4847) hive sync fails w/ utilities bundle in 0.13-snapshot, but succeeds w/ 0.11

2022-09-15 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-4847: - Summary: hive sync fails w/ utilities bundle in 0.13-snapshot, but succeeds w/ 0.11 Key: HUDI-4847 URL: https://issues.apache.org/jira/browse/HUDI-4847 Proj

[GitHub] [hudi] TJX2014 commented on a diff in pull request #6630: [HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in lo…

2022-09-15 Thread GitBox
TJX2014 commented on code in PR #6630: URL: https://github.com/apache/hudi/pull/6630#discussion_r971770725 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java: ## @@ -72,6 +73,26 @@ public static List getLatestBaseFilesForPartition(

[GitHub] [hudi] pratyakshsharma commented on issue #6422: [SUPPORT]: hudi build failing for hudi-flink-client when no maven build option is provided

2022-09-15 Thread GitBox
pratyakshsharma commented on issue #6422: URL: https://github.com/apache/hudi/issues/6422#issuecomment-1247841811 Yeah this can be closed. Thank you for the support. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [hudi] pratyakshsharma closed issue #6422: [SUPPORT]: hudi build failing for hudi-flink-client when no maven build option is provided

2022-09-15 Thread GitBox
pratyakshsharma closed issue #6422: [SUPPORT]: hudi build failing for hudi-flink-client when no maven build option is provided URL: https://github.com/apache/hudi/issues/6422 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and us

[GitHub] [hudi] pratyakshsharma commented on issue #5539: [SUPPORT]Job aborted due to stage failure. Caused by: AvroTypeException: Invalid default for field CDC_TS: "null" not a ["null","string"]

2022-09-15 Thread GitBox
pratyakshsharma commented on issue #5539: URL: https://github.com/apache/hudi/issues/5539#issuecomment-1247835150 @nleena123 Any updates here? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [hudi] pratyakshsharma commented on issue #5551: HoodieCommitException: Failed to archive commits Caused by: IOException: Not an Avro data file at org.apache.hudi.table.HoodieTimelineArchiveL

2022-09-15 Thread GitBox
pratyakshsharma commented on issue #5551: URL: https://github.com/apache/hudi/issues/5551#issuecomment-1247833911 Closing this due to inactivity. Feel free to reopen if you are still facing issues @nleena123 -- This is an automated message from the Apache Git Service. To respond to the m

[GitHub] [hudi] pratyakshsharma closed issue #5551: HoodieCommitException: Failed to archive commits Caused by: IOException: Not an Avro data file at org.apache.hudi.table.HoodieTimelineArchiveLog.arc

2022-09-15 Thread GitBox
pratyakshsharma closed issue #5551: HoodieCommitException: Failed to archive commits Caused by: IOException: Not an Avro data file at org.apache.hudi.table.HoodieTimelineArchiveLog.archive(HoodieTimelineArchiveLog.java:324) URL: https://github.com/apache/hudi/issues/5551 -- This is an automa

[GitHub] [hudi] pratyakshsharma commented on pull request #1650: [HUDI-541]: replaced dataFile/df with baseFile/bf throughout code base

2022-09-15 Thread GitBox
pratyakshsharma commented on PR #1650: URL: https://github.com/apache/hudi/pull/1650#issuecomment-1247831339 @yihua I am still interested in completing this PR but I am not getting reviews and hence did not spend time in resolving the conflicts here. If you can help me review this, I can dr

[GitHub] [hudi] TJX2014 commented on pull request #6634: [HUDI-4813] Fix infer keygen not work in sparksql side issue

2022-09-15 Thread GitBox
TJX2014 commented on PR #6634: URL: https://github.com/apache/hudi/pull/6634#issuecomment-1247832138 > Can you rebase with the latest master code and force push to re-trigger the tests. Ok. -- This is an automated message from the Apache Git Service. To respond to the message, plea

[GitHub] [hudi] pratyakshsharma commented on issue #5020: [SUPPORT] The cleaning strategy breaks the reader view completeness

2022-09-15 Thread GitBox
pratyakshsharma commented on issue #5020: URL: https://github.com/apache/hudi/issues/5020#issuecomment-1247828014 I guess https://github.com/apache/hudi/pull/5406 does not address the issue discussed here. As per this comment by Danny - https://github.com/apache/hudi/issues/5020#issuecommen

[GitHub] [hudi] YannByron commented on a diff in pull request #6476: [HUDI-3478] Support CDC for Spark in Hudi

2022-09-15 Thread GitBox
YannByron commented on code in PR #6476: URL: https://github.com/apache/hudi/pull/6476#discussion_r971743633 ## hudi-common/src/main/java/org/apache/hudi/common/model/HoodieWriteStat.java: ## @@ -70,6 +75,11 @@ public class HoodieWriteStat implements Serializable { */ pri

[GitHub] [hudi] YannByron commented on a diff in pull request #6476: [HUDI-3478] Support CDC for Spark in Hudi

2022-09-15 Thread GitBox
YannByron commented on code in PR #6476: URL: https://github.com/apache/hudi/pull/6476#discussion_r971741076 ## hudi-common/src/main/java/org/apache/hudi/common/model/HoodieCommitMetadata.java: ## @@ -236,6 +241,44 @@ public static T fromJsonString(String jsonStr, Class clazz)

[GitHub] [hudi] koochiswathiTR commented on issue #6606: Observing data duplication with Single Writer

2022-09-15 Thread GitBox
koochiswathiTR commented on issue #6606: URL: https://github.com/apache/hudi/issues/6606#issuecomment-1247812774 @nsivabalan Pls respond on this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to t

[GitHub] [hudi] koochiswathiTR commented on issue #6606: Observing data duplication with Single Writer

2022-09-15 Thread GitBox
koochiswathiTR commented on issue #6606: URL: https://github.com/apache/hudi/issues/6606#issuecomment-1247810081 We solved this issue by having the DynamoDB table created automatically; the table is created with the partition key (key). We were able to set it up without AWS keys. With multi writer s

[jira] [Updated] (HUDI-4812) Lazy partition listing and file groups fetching in Spark Query

2022-09-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4812: - Labels: pull-request-available (was: ) > Lazy partition listing and file groups fetching in Spark

[GitHub] [hudi] YuweiXiao opened a new pull request, #6680: [WIP][HUDI-4812] lazy fetching partition path & file slice for HoodieFileIndex

2022-09-15 Thread GitBox
YuweiXiao opened a new pull request, #6680: URL: https://github.com/apache/hudi/pull/6680 ### Change Logs Lazy fetching partition path & file slice for HoodieFileIndex ### Impact No API change, and will improve performance for spark query with partition filter.

[GitHub] [hudi] pratyakshsharma commented on pull request #5071: [HUDI-1881]: draft implementation for trigger based on data availability

2022-09-15 Thread GitBox
pratyakshsharma commented on PR #5071: URL: https://github.com/apache/hudi/pull/5071#issuecomment-1247786946 @yihua yes, I plan to complete it this week. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above t
