[GitHub] [hudi] hudi-bot commented on pull request #5176: [HUDI-3700] Add new hudi-utilities-bundle profile without spark

2022-03-29 Thread GitBox
hudi-bot commented on pull request #5176: URL: https://github.com/apache/hudi/pull/5176#issuecomment-1082650234 ## CI report: * 9035463d1e8e7db12ec42671fc4fe2da2baffe93 Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #5176: [HUDI-3700] Add new hudi-utilities-bundle profile without spark

2022-03-29 Thread GitBox
hudi-bot removed a comment on pull request #5176: URL: https://github.com/apache/hudi/pull/5176#issuecomment-1082648531 ## CI report: * 9035463d1e8e7db12ec42671fc4fe2da2baffe93 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot

[GitHub] [hudi] codope commented on a change in pull request #4693: [HUDI-2488][HUDI-3175] Implement async metadata indexing

2022-03-29 Thread GitBox
codope commented on a change in pull request #4693: URL: https://github.com/apache/hudi/pull/4693#discussion_r838143398 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/index/ScheduleIndexActionExecutor.java ## @@ -0,0 +1,138 @@ +/* + *

[GitHub] [hudi] hudi-bot commented on pull request #5176: [HUDI-3700] Add new hudi-utilities-bundle profile without spark

2022-03-29 Thread GitBox
hudi-bot commented on pull request #5176: URL: https://github.com/apache/hudi/pull/5176#issuecomment-1082648531 ## CI report: * 9035463d1e8e7db12ec42671fc4fe2da2baffe93 UNKNOWN

[jira] [Updated] (HUDI-3700) Revisit hudi-utilities-bundle build wrt Spark versions

2022-03-29 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3700: - Labels: pull-request-available (was: ) > Revisit hudi-utilities-bundle build wrt Spark versions

[GitHub] [hudi] yihua opened a new pull request #5176: [HUDI-3700] Add new hudi-utilities-bundle profile without spark

2022-03-29 Thread GitBox
yihua opened a new pull request #5176: URL: https://github.com/apache/hudi/pull/5176 ## What is the purpose of the pull request *(For example: This pull request adds quick-start document.)* ## Brief change log *(for example:)* - *Modify AnnotationLocation

[GitHub] [hudi] hudi-bot removed a comment on pull request #5168: [HUDI-3729][SPARK] fixed the per regression by enable vectorizeReader for parquet file

2022-03-29 Thread GitBox
hudi-bot removed a comment on pull request #5168: URL: https://github.com/apache/hudi/pull/5168#issuecomment-1082561598 ## CI report: * f76bbaa375989de317451e753297596917fe2f77 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5168: [HUDI-3729][SPARK] fixed the per regression by enable vectorizeReader for parquet file

2022-03-29 Thread GitBox
hudi-bot commented on pull request #5168: URL: https://github.com/apache/hudi/pull/5168#issuecomment-1082644996 ## CI report: * 72c2185a5add2ec5c073fcafd1e8d3ae404ae23e Azure:

[GitHub] [hudi] codope commented on a change in pull request #4693: [HUDI-2488][HUDI-3175] Implement async metadata indexing

2022-03-29 Thread GitBox
codope commented on a change in pull request #4693: URL: https://github.com/apache/hudi/pull/4693#discussion_r838136779 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java ## @@ -511,24 +523,42 @@ private

[jira] [Updated] (HUDI-3700) Revisit hudi-utilities-bundle build wrt Spark versions

2022-03-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3700: Status: In Progress (was: Open) > Revisit hudi-utilities-bundle build wrt Spark versions >

[hudi] branch master updated (4fed8dd -> 7fa3639)

2022-03-29 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 4fed8dd [HUDI-3485] Adding scheduler pool configs for async clustering (#5043) add 7fa3639 [HUDI-3745]

[GitHub] [hudi] nsivabalan merged pull request #5170: [HUDI-3745] Support for spark datasource options in S3EventsHoodieInc…

2022-03-29 Thread GitBox
nsivabalan merged pull request #5170: URL: https://github.com/apache/hudi/pull/5170 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[jira] [Closed] (HUDI-3244) UnsupportedOperationException when bulk insert to hudi

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-3244. Assignee: leesf Resolution: Fixed Fixed in https://github.com/apache/hudi/pull/4498 >

[GitHub] [hudi] hudi-bot commented on pull request #4925: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single sink table from multiple source tables

2022-03-29 Thread GitBox
hudi-bot commented on pull request #4925: URL: https://github.com/apache/hudi/pull/4925#issuecomment-1082637975 ## CI report: * 333da7447af7d602ffa3067a759cecc62e4365d8 UNKNOWN * f8f4a172dbe60949e83d889e3d97c8d5dc2b8585 Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #4925: [HUDI-3103] Enable MultiTableDeltaStreamer to update a single sink table from multiple source tables

2022-03-29 Thread GitBox
hudi-bot removed a comment on pull request #4925: URL: https://github.com/apache/hudi/pull/4925#issuecomment-1082544271 ## CI report: * 333da7447af7d602ffa3067a759cecc62e4365d8 UNKNOWN * a119a051dbe0a0921b6bd58fbb0f1bbd3d647fa8 Azure:

[jira] [Closed] (HUDI-3472) Compilation failure for release-0.10.1

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-3472. Fix Version/s: (was: 0.11.0) Resolution: Not A Problem > Compilation failure for release-0.10.1

[jira] [Updated] (HUDI-3667) Unit tests in hudi-integ-tests are not executed in CI

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3667: - Status: In Progress (was: Open) > Unit tests in hudi-integ-tests are not executed in CI >

[jira] [Updated] (HUDI-3667) Unit tests in hudi-integ-tests are not executed in CI

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3667: - Status: Patch Available (was: In Progress) > Unit tests in hudi-integ-tests are not executed in CI >

[jira] [Closed] (HUDI-3703) Reset taskID in restoreWriteMetadata

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-3703. Assignee: Zhaojing Yu Resolution: Fixed > Reset taskID in restoreWriteMetadata >

[GitHub] [hudi] hudi-bot commented on pull request #5174: [Minor]Make cli 'commit rollback' using rollbackUsingMarkers false as default

2022-03-29 Thread GitBox
hudi-bot commented on pull request #5174: URL: https://github.com/apache/hudi/pull/5174#issuecomment-1082627856 ## CI report: * 6d09991ecdecbf15c37352e713a3add4213755e0 Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #5174: [Minor]Make cli 'commit rollback' using rollbackUsingMarkers false as default

2022-03-29 Thread GitBox
hudi-bot removed a comment on pull request #5174: URL: https://github.com/apache/hudi/pull/5174#issuecomment-1082531603 ## CI report: * 6d09991ecdecbf15c37352e713a3add4213755e0 Azure:

[GitHub] [hudi] xushiyan edited a comment on pull request #4012: [HUDI-2777] Data import performance deteriorates because multiple Spark jobs are started when data is written to disks.

2022-03-29 Thread GitBox
xushiyan edited a comment on pull request #4012: URL: https://github.com/apache/hudi/pull/4012#issuecomment-1082623653 @xiarixiaoyao thanks for sharing the analysis. sorry haven't been able to catch up on time. You're right; `collect()` happens right after `isEmpty()` or `count()`. A full

[GitHub] [hudi] alexeykudinkin commented on pull request #5168: [HUDI-3729][SPARK] fixed the per regression by enable vectorizeReader for parquet file

2022-03-29 Thread GitBox
alexeykudinkin commented on pull request #5168: URL: https://github.com/apache/hudi/pull/5168#issuecomment-1082624552 @xiarixiaoyao it depends on the data distribution in your table: filter push down would have an impact for ex, when it could rule out whole file reading column stats from

[GitHub] [hudi] xushiyan commented on pull request #4012: [HUDI-2777] Data import performance deteriorates because multiple Spark jobs are started when data is written to disks.

2022-03-29 Thread GitBox
xushiyan commented on pull request #4012: URL: https://github.com/apache/hudi/pull/4012#issuecomment-1082623653 @xiarixiaoyao thanks for sharing the analysis. sorry haven't been able to catch up on time. You're right; `collect()` happens right after `isEmpty()` or `count()`. A full

[GitHub] [hudi] codope commented on a change in pull request #4693: [HUDI-2488][HUDI-3175] Implement async metadata indexing

2022-03-29 Thread GitBox
codope commented on a change in pull request #4693: URL: https://github.com/apache/hudi/pull/4693#discussion_r838119416 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieIndexer.java ## @@ -0,0 +1,292 @@ +/* + * Licensed to the Apache Software

[jira] [Assigned] (HUDI-3744) NoSuchMethodError of getReadStatistics with Spark 3.2/Hadoop 3.2 using HBase

2022-03-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo reassigned HUDI-3744: --- Assignee: Raymond Xu (was: Ethan Guo) > NoSuchMethodError of getReadStatistics with Spark

[jira] [Closed] (HUDI-2893) Metadata table not found warn logs repeated

2022-03-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo closed HUDI-2893. --- Resolution: Invalid > Metadata table not found warn logs repeated >

[jira] [Commented] (HUDI-2893) Metadata table not found warn logs repeated

2022-03-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17514439#comment-17514439 ] Ethan Guo commented on HUDI-2893: - I confirmed by running the spark quick start guide that this is no

[GitHub] [hudi] codope commented on a change in pull request #4693: [HUDI-2488][HUDI-3175] Implement async metadata indexing

2022-03-29 Thread GitBox
codope commented on a change in pull request #4693: URL: https://github.com/apache/hudi/pull/4693#discussion_r838114020 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieIndexer.java ## @@ -0,0 +1,292 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] XuQianJin-Stars removed a comment on pull request #4489: [HUDI-3135] Make delete partitions lazy to be executed by the cleaner

2022-03-29 Thread GitBox
XuQianJin-Stars removed a comment on pull request #4489: URL: https://github.com/apache/hudi/pull/4489#issuecomment-1082615319 @hudi-bot run azure

[GitHub] [hudi] hudi-bot commented on pull request #4489: [HUDI-3135] Make delete partitions lazy to be executed by the cleaner

2022-03-29 Thread GitBox
hudi-bot commented on pull request #4489: URL: https://github.com/apache/hudi/pull/4489#issuecomment-1082616503 ## CI report: * 4100259b0ea8381e2ab6c01eebcdefa29925ee84 Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #4489: [HUDI-3135] Make delete partitions lazy to be executed by the cleaner

2022-03-29 Thread GitBox
hudi-bot removed a comment on pull request #4489: URL: https://github.com/apache/hudi/pull/4489#issuecomment-1082602477 ## CI report: * 4100259b0ea8381e2ab6c01eebcdefa29925ee84 Azure:

[jira] [Updated] (HUDI-3467) Check shutdown logic with async compaction in Spark Structured Streaming

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3467: - Description: Related issue https://github.com/apache/hudi/issues/5046 > Check shutdown logic with

[jira] [Updated] (HUDI-3467) Check shutdown logic with async compaction in Spark Structured Streaming

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3467: - Fix Version/s: 0.11.0 (was: 0.12.0) > Check shutdown logic with async compaction

[GitHub] [hudi] hudi-bot commented on pull request #5175: [minor] Follow 3178, fix the flink metadata table compaction

2022-03-29 Thread GitBox
hudi-bot commented on pull request #5175: URL: https://github.com/apache/hudi/pull/5175#issuecomment-1082615391 ## CI report: * e19192cdd29ebfd94c73a2e06a9322ec2de35f71 Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #5175: [minor] Follow 3178, fix the flink metadata table compaction

2022-03-29 Thread GitBox
hudi-bot removed a comment on pull request #5175: URL: https://github.com/apache/hudi/pull/5175#issuecomment-1082613855 ## CI report: * e19192cdd29ebfd94c73a2e06a9322ec2de35f71 UNKNOWN

[GitHub] [hudi] XuQianJin-Stars commented on pull request #4489: [HUDI-3135] Make delete partitions lazy to be executed by the cleaner

2022-03-29 Thread GitBox
XuQianJin-Stars commented on pull request #4489: URL: https://github.com/apache/hudi/pull/4489#issuecomment-1082615319 @hudi-bot run azure

[GitHub] [hudi] hudi-bot commented on pull request #4957: [HUDI-3406] Rollback incorrectly relying on FS listing instead of Com…

2022-03-29 Thread GitBox
hudi-bot commented on pull request #4957: URL: https://github.com/apache/hudi/pull/4957#issuecomment-1082615150 ## CI report: * 9ba5c351c32f9f364c30b4bc9a814075150d9728 UNKNOWN * 53b4cec7d05dd0bb5a622ffb33597471da7711de UNKNOWN * e3140bb9abef7d5e9dc0405b9f8b12f11c2438da

[GitHub] [hudi] hudi-bot removed a comment on pull request #4957: [HUDI-3406] Rollback incorrectly relying on FS listing instead of Com…

2022-03-29 Thread GitBox
hudi-bot removed a comment on pull request #4957: URL: https://github.com/apache/hudi/pull/4957#issuecomment-1082595287 ## CI report: * 9ba5c351c32f9f364c30b4bc9a814075150d9728 UNKNOWN * 53b4cec7d05dd0bb5a622ffb33597471da7711de UNKNOWN *

[GitHub] [hudi] XuQianJin-Stars commented on pull request #5169: [HUDI-3743] Support DELETE_PARTITION for metadata table

2022-03-29 Thread GitBox
XuQianJin-Stars commented on pull request #5169: URL: https://github.com/apache/hudi/pull/5169#issuecomment-1082614642 LGTM, In addition, I want to wait for https://github.com/apache/hudi/pull/4489 to merge in and then merge this?

[jira] [Updated] (HUDI-2606) Ensure query engines not access MDT if disabled

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2606: - Description: This is to visit all the read code paths and ensure when metadata is disabled, query engines

[GitHub] [hudi] hudi-bot commented on pull request #5175: [minor] Follow 3178, fix the flink metadata table compaction

2022-03-29 Thread GitBox
hudi-bot commented on pull request #5175: URL: https://github.com/apache/hudi/pull/5175#issuecomment-1082613855 ## CI report: * e19192cdd29ebfd94c73a2e06a9322ec2de35f71 UNKNOWN

[GitHub] [hudi] vingov commented on pull request #5153: [HUDI-3020] Utility to create manifest file

2022-03-29 Thread GitBox
vingov commented on pull request #5153: URL: https://github.com/apache/hudi/pull/5153#issuecomment-1082612051 @nsivabalan or @xushiyan - Can you please approve the workflow? @codejoyan - Thanks for working on this feature, I tested the code it works well, I need you to remove the

[GitHub] [hudi] danny0405 opened a new pull request #5175: [minor] Follow 3178, fix the flink metadata table compaction

2022-03-29 Thread GitBox
danny0405 opened a new pull request #5175: URL: https://github.com/apache/hudi/pull/5175 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the

[jira] [Updated] (HUDI-3710) Fix testHoodieAsyncClusteringJob in TestHoodieDeltaStreamer

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3710: - Priority: Blocker (was: Critical) > Fix testHoodieAsyncClusteringJob in TestHoodieDeltaStreamer >

[jira] [Updated] (HUDI-3636) Clustering fails due to marker creation failure

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3636: - Sprint: Hudi-Sprint-Mar-21, Hudi-Sprint-Mar-22 (was: Hudi-Sprint-Mar-21) > Clustering fails due to

[jira] [Updated] (HUDI-3066) Very slow file listing after enabling metadata for existing tables in 0.10.0 release

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3066: - Sprint: Hudi-Sprint-Mar-22 > Very slow file listing after enabling metadata for existing tables in 0.10.0

[jira] [Updated] (HUDI-2893) Metadata table not found warn logs repeated

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2893: - Sprint: Hudi-Sprint-Mar-22 > Metadata table not found warn logs repeated >

[jira] [Closed] (HUDI-3688) Double check MT init behavior for MT rollout

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-3688. Resolution: Done > Double check MT init behavior for MT rollout >

[jira] [Updated] (HUDI-3688) Double check MT init behavior for MT rollout

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3688: - Priority: Blocker (was: Major) > Double check MT init behavior for MT rollout >

[jira] [Updated] (HUDI-3673) Add a common hudi-hbase-shaded for shaded hbase dependencies

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3673: - Fix Version/s: 0.12.0 (was: 0.11.0) > Add a common hudi-hbase-shaded for shaded

[jira] [Updated] (HUDI-3673) Add a common hudi-hbase-shaded for shaded hbase dependencies

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3673: - Component/s: dependencies > Add a common hudi-hbase-shaded for shaded hbase dependencies >

[jira] [Updated] (HUDI-3668) Fix failing unit tests in hudi-integ-test

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3668: - Fix Version/s: 0.12.0 (was: 0.11.0) > Fix failing unit tests in hudi-integ-test >

[jira] [Updated] (HUDI-3668) Fix failing unit tests in hudi-integ-test

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3668: - Component/s: tests-ci > Fix failing unit tests in hudi-integ-test >

[jira] [Assigned] (HUDI-3668) Fix failing unit tests in hudi-integ-test

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reassigned HUDI-3668: Assignee: sivabalan narayanan > Fix failing unit tests in hudi-integ-test >

[GitHub] [hudi] codope commented on a change in pull request #4693: [HUDI-2488][HUDI-3175] Implement async metadata indexing

2022-03-29 Thread GitBox
codope commented on a change in pull request #4693: URL: https://github.com/apache/hudi/pull/4693#discussion_r838101454 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java ## @@ -925,6 +928,53 @@ public boolean

[jira] [Updated] (HUDI-3649) Add HoodieTableConfig defaults to HoodieWriteConfig

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3649: - Fix Version/s: 0.12.0 (was: 0.11.0) > Add HoodieTableConfig defaults to

[jira] [Updated] (HUDI-3649) Add HoodieTableConfig defaults to HoodieWriteConfig

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3649: - Component/s: configs Epic Link: HUDI-1239 > Add HoodieTableConfig defaults to HoodieWriteConfig >

[jira] [Updated] (HUDI-3648) Failed to execute rollback due to HoodieIOException: Could not delete instant

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3648: - Fix Version/s: 0.12.0 (was: 0.11.0) > Failed to execute rollback due to

[jira] [Updated] (HUDI-3648) Failed to execute rollback due to HoodieIOException: Could not delete instant

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3648: - Component/s: deltastreamer > Failed to execute rollback due to HoodieIOException: Could not delete

[jira] [Updated] (HUDI-3647) Ignore errors if metadata table has not been initialized fully

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3647: - Priority: Blocker (was: Major) > Ignore errors if metadata table has not been initialized fully >

[jira] [Updated] (HUDI-3647) Ignore errors if metadata table has not been initialized fully

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3647: - Priority: Critical (was: Blocker) > Ignore errors if metadata table has not been initialized fully >

[GitHub] [hudi] hudi-bot commented on pull request #4489: [HUDI-3135] Make delete partitions lazy to be executed by the cleaner

2022-03-29 Thread GitBox
hudi-bot commented on pull request #4489: URL: https://github.com/apache/hudi/pull/4489#issuecomment-1082602477 ## CI report: * 4100259b0ea8381e2ab6c01eebcdefa29925ee84 Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #4489: [HUDI-3135] Make delete partitions lazy to be executed by the cleaner

2022-03-29 Thread GitBox
hudi-bot removed a comment on pull request #4489: URL: https://github.com/apache/hudi/pull/4489#issuecomment-1082531223 ## CI report: * e1e36f49a55f272ad4871c7fe7d66075394c4eee Azure:

[jira] [Updated] (HUDI-3635) Fix HoodieMetadataTableValidator around comparison of partition path listing

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3635: - Component/s: code-quality metadata > Fix HoodieMetadataTableValidator around comparison

[jira] [Updated] (HUDI-3636) Clustering fails due to marker creation failure

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3636: - Component/s: multi-writer > Clustering fails due to marker creation failure >

[jira] [Updated] (HUDI-3635) Fix HoodieMetadataTableValidator around comparison of partition path listing

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3635: - Fix Version/s: 0.12.0 (was: 0.11.0) > Fix HoodieMetadataTableValidator around

[jira] [Updated] (HUDI-3636) Clustering fails due to marker creation failure

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3636: - Priority: Blocker (was: Critical) > Clustering fails due to marker creation failure >

[jira] [Closed] (HUDI-3610) Validate Hudi Kafka Connect Sink writing to S3

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-3610. Fix Version/s: (was: 0.11.0) Resolution: Not A Problem > Validate Hudi Kafka Connect Sink

[jira] [Assigned] (HUDI-3610) Validate Hudi Kafka Connect Sink writing to S3

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reassigned HUDI-3610: Assignee: Rajesh Mahindra (was: Raymond Xu) > Validate Hudi Kafka Connect Sink writing to S3 >

[jira] [Commented] (HUDI-3610) Validate Hudi Kafka Connect Sink writing to S3

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17514429#comment-17514429 ] Raymond Xu commented on HUDI-3610: -- Solved w/ user

[jira] [Updated] (HUDI-3579) Add timeline commands in hudi-cli

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3579: - Fix Version/s: 0.12.0 (was: 0.11.0) > Add timeline commands in hudi-cli >

[jira] [Updated] (HUDI-3579) Add timeline commands in hudi-cli

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3579: - Issue Type: New Feature (was: Task) > Add timeline commands in hudi-cli >

[jira] [Updated] (HUDI-3532) Refactor FileSystemBackedTableMetadata and related classes to support getColumnStats directly

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3532: - Issue Type: Improvement (was: Task) > Refactor FileSystemBackedTableMetadata and related classes to

[jira] [Updated] (HUDI-3533) Refactor FileSystemBackedTableMetadata and related classes to support getBloomFilters directly

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3533: - Fix Version/s: 0.12.0 (was: 0.11.0) > Refactor FileSystemBackedTableMetadata and

[jira] [Updated] (HUDI-3533) Refactor FileSystemBackedTableMetadata and related classes to support getBloomFilters directly

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3533: - Component/s: code-quality > Refactor FileSystemBackedTableMetadata and related classes to support >

[jira] [Updated] (HUDI-3532) Refactor FileSystemBackedTableMetadata and related classes to support getColumnStats directly

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3532: - Fix Version/s: 0.12.0 (was: 0.11.0) > Refactor FileSystemBackedTableMetadata and

[jira] [Updated] (HUDI-3533) Refactor FileSystemBackedTableMetadata and related classes to support getBloomFilters directly

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3533: - Issue Type: Improvement (was: Task) > Refactor FileSystemBackedTableMetadata and related classes to

[jira] [Updated] (HUDI-3532) Refactor FileSystemBackedTableMetadata and related classes to support getColumnStats directly

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3532: - Component/s: code-quality > Refactor FileSystemBackedTableMetadata and related classes to support >

[jira] [Updated] (HUDI-3495) Reading keys in parallel from HoodieMetadataMergedLogRecordReader may lead to empty results even if key exists

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3495: - Fix Version/s: (was: 0.11.0) > Reading keys in parallel from HoodieMetadataMergedLogRecordReader may

[jira] [Updated] (HUDI-3495) Reading keys in parallel from HoodieMetadataMergedLogRecordReader may lead to empty results even if key exists

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3495: - Priority: Blocker (was: Major) > Reading keys in parallel from HoodieMetadataMergedLogRecordReader may

[jira] [Updated] (HUDI-3427) Investigate timeouts in Hudi Kafka Connect Sink

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3427: - Priority: Blocker (was: Critical) > Investigate timeouts in Hudi Kafka Connect Sink >

[jira] [Updated] (HUDI-3427) Investigate timeouts in Hudi Kafka Connect Sink

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3427: - Issue Type: Bug (was: Improvement) > Investigate timeouts in Hudi Kafka Connect Sink >

[jira] [Updated] (HUDI-3427) Investigate timeouts in Hudi Kafka Connect Sink

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3427: - Component/s: kafka-connect > Investigate timeouts in Hudi Kafka Connect Sink >

[jira] [Updated] (HUDI-3427) Investigate timeouts in Hudi Kafka Connect Sink

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3427: - Fix Version/s: 0.12.0 (was: 0.11.0) > Investigate timeouts in Hudi Kafka Connect

[jira] [Updated] (HUDI-3321) HFileWriter, HFileReader and HFileDataBlock should avoid hardcoded key field name

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3321: - Labels: (was: code-cleanup) > HFileWriter, HFileReader and HFileDataBlock should avoid hardcoded key

[GitHub] [hudi] hudi-bot commented on pull request #5042: Three bulk_insert files are concurrently submitted and executed with a difference of 2s, the insert fails occasionally.

2022-03-29 Thread GitBox
hudi-bot commented on pull request #5042: URL: https://github.com/apache/hudi/pull/5042#issuecomment-1082595329 ## CI report: * 7ee24be4d11864af37bf300250d571e15d5f9ae9 UNKNOWN * 9a9e544ba48a52c7b54134fc9533c3e5a51ccfff UNKNOWN * 7217b7d0fac7b96b042b0a91d587f3518f3ce3e4

[GitHub] [hudi] hudi-bot removed a comment on pull request #5042: Three bulk_insert files are concurrently submitted and executed with a difference of 2s, the insert fails occasionally.

2022-03-29 Thread GitBox
hudi-bot removed a comment on pull request #5042: URL: https://github.com/apache/hudi/pull/5042#issuecomment-1082530026 ## CI report: * 7ee24be4d11864af37bf300250d571e15d5f9ae9 UNKNOWN * 9a9e544ba48a52c7b54134fc9533c3e5a51ccfff UNKNOWN *

[GitHub] [hudi] hudi-bot removed a comment on pull request #4957: [HUDI-3406] Rollback incorrectly relying on FS listing instead of Com…

2022-03-29 Thread GitBox
hudi-bot removed a comment on pull request #4957: URL: https://github.com/apache/hudi/pull/4957#issuecomment-1082590199 ## CI report: * 9ba5c351c32f9f364c30b4bc9a814075150d9728 UNKNOWN * 53b4cec7d05dd0bb5a622ffb33597471da7711de UNKNOWN *

[GitHub] [hudi] hudi-bot commented on pull request #4957: [HUDI-3406] Rollback incorrectly relying on FS listing instead of Com…

2022-03-29 Thread GitBox
hudi-bot commented on pull request #4957: URL: https://github.com/apache/hudi/pull/4957#issuecomment-1082595287 ## CI report: * 9ba5c351c32f9f364c30b4bc9a814075150d9728 UNKNOWN * 53b4cec7d05dd0bb5a622ffb33597471da7711de UNKNOWN * e3140bb9abef7d5e9dc0405b9f8b12f11c2438da

[jira] [Closed] (HUDI-3307) Implement a repair tool for bring back table into a clean state

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-3307. Resolution: Done > Implement a repair tool for bring back table into a clean state >

[jira] [Assigned] (HUDI-3307) Implement a repair tool for bring back table into a clean state

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reassigned HUDI-3307: Assignee: sivabalan narayanan (was: Ethan Guo) > Implement a repair tool for bring back table

[jira] [Updated] (HUDI-3307) Implement a repair tool for bring back table into a clean state

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3307: - Issue Type: New Feature (was: Task) > Implement a repair tool for bring back table into a clean state >

[jira] [Updated] (HUDI-3307) Implement a repair tool for bring back table into a clean state

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3307: - Component/s: cli > Implement a repair tool for bring back table into a clean state >

[jira] [Updated] (HUDI-3301) MergedLogRecordReader inline reading should be stateless and thread safe

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3301: - Labels: (was: HUDI-bug) > MergedLogRecordReader inline reading should be stateless and thread safe >

[jira] [Updated] (HUDI-3301) MergedLogRecordReader inline reading should be stateless and thread safe

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3301: - Fix Version/s: 0.12.0 (was: 0.11.0) > MergedLogRecordReader inline reading should

[GitHub] [hudi] xiarixiaoyao commented on pull request #4012: [HUDI-2777] Data import performance deteriorates because multiple Spark jobs are started when data is written to disks.

2022-03-29 Thread GitBox
xiarixiaoyao commented on pull request #4012: URL: https://github.com/apache/hudi/pull/4012#issuecomment-1082593749 @vinothchandar @xushiyan in line 121 SparkRDDWriteClient.java, we call collect for rdd, which will trigger compute again. ``` List writeStats =

[jira] [Updated] (HUDI-3300) Timeline server FSViewManager should avoid point lookup for metadata file partition

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3300: - Issue Type: Bug (was: Task) > Timeline server FSViewManager should avoid point lookup for metadata file

[jira] [Updated] (HUDI-3300) Timeline server FSViewManager should avoid point lookup for metadata file partition

2022-03-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3300: - Fix Version/s: 0.12.0 (was: 0.11.0) > Timeline server FSViewManager should avoid
