[jira] [Closed] (HUDI-47) Revisit null checks in the Log Blocks, merge lazyreading with this null check #340

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-47?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin closed HUDI-47. --- Resolution: Fixed > Revisit null checks in the Log Blocks, merge lazyreading with this null check > #34

[jira] [Closed] (HUDI-2788) Z-ordering Layout Optimization Strategy fails w/ Data Skipping enabled

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin closed HUDI-2788. - Resolution: Fixed > Z-ordering Layout Optimization Strategy fails w/ Data Skipping enabled > -

[jira] [Closed] (HUDI-2814) Address issues w/ Z-order Layout Optimization

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin closed HUDI-2814. - Resolution: Fixed > Address issues w/ Z-order Layout Optimization > --

[jira] [Reopened] (HUDI-2814) Address issues w/ Z-order Layout Optimization

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin reopened HUDI-2814: --- > Address issues w/ Z-order Layout Optimization > - >

[jira] [Reopened] (HUDI-2788) Z-ordering Layout Optimization Strategy fails w/ Data Skipping enabled

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin reopened HUDI-2788: --- > Z-ordering Layout Optimization Strategy fails w/ Data Skipping enabled > ---

[GitHub] [hudi] alexeykudinkin commented on pull request #6355: add auto col cast in MergeIntoHoodieTableCommand

2022-09-19 Thread GitBox
alexeykudinkin commented on PR #6355: URL: https://github.com/apache/hudi/pull/6355#issuecomment-1251512909 @xushiyan i don't think we will be able to land this as is, as it might heavily restrict usability of the `MERGE INTO` -- This is an automated message from the Apache Git Service. T

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6355: add auto col cast in MergeIntoHoodieTableCommand

2022-09-19 Thread GitBox
alexeykudinkin commented on code in PR #6355: URL: https://github.com/apache/hudi/pull/6355#discussion_r974659357 ## hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/MergeIntoHoodieTableCommand.scala: ## @@ -150,15 +151,35 @@ case class MergeInto

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6196: [HUDI-4071] Enable schema reconciliation by default

2022-09-19 Thread GitBox
alexeykudinkin commented on code in PR #6196: URL: https://github.com/apache/hudi/pull/6196#discussion_r974656615 ## hudi-common/src/main/java/org/apache/hudi/common/config/HoodieCommonConfig.java: ## @@ -38,7 +38,7 @@ public class HoodieCommonConfig extends HoodieConfig {

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6196: [HUDI-4071] Enable schema reconciliation by default

2022-09-19 Thread GitBox
alexeykudinkin commented on code in PR #6196: URL: https://github.com/apache/hudi/pull/6196#discussion_r974651748 ## hudi-common/src/main/java/org/apache/hudi/common/config/HoodieCommonConfig.java: ## @@ -38,7 +38,7 @@ public class HoodieCommonConfig extends HoodieConfig {

[jira] [Updated] (HUDI-4872) Support automatic schema evolution for SQL MERGE INTO with UPDATE */ INSERT *

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4872: -- Epic Link: HUDI-1297 > Support automatic schema evolution for SQL MERGE INTO with UPDATE */ INES

[jira] [Updated] (HUDI-4430) Incorrect type casting while reading HUDI table created with CustomKeyGenerator and unixtimestamp partitioning field

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4430: -- Affects Version/s: 0.12.0 > Incorrect type casting while reading HUDI table created with > Cust

[jira] [Updated] (HUDI-4430) Incorrect type casting while reading HUDI table created with CustomKeyGenerator and unixtimestamp partitioning field

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4430: -- Issue Type: Bug (was: Improvement) > Incorrect type casting while reading HUDI table created wi

[jira] [Updated] (HUDI-4430) Incorrect type casting while reading HUDI table created with CustomKeyGenerator and unixtimestamp partitioning field

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4430: -- Component/s: writer-core > Incorrect type casting while reading HUDI table created with > Custo

[jira] [Assigned] (HUDI-4430) Incorrect type casting while reading HUDI table created with CustomKeyGenerator and unixtimestamp partitioning field

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin reassigned HUDI-4430: - Assignee: Alexey Kudinkin > Incorrect type casting while reading HUDI table created with

[jira] [Updated] (HUDI-4430) Incorrect type casting while reading HUDI table created with CustomKeyGenerator and unixtimestamp partitioning field

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4430: -- Fix Version/s: 0.13.0 > Incorrect type casting while reading HUDI table created with > CustomKe

[GitHub] [hudi] hudi-bot commented on pull request #6717: [HUDI-4877] Fix org.apache.hudi.index.bucket.TestHoodieSimpleBucketIndex#testTagLocation not work correct issue

2022-09-19 Thread GitBox
hudi-bot commented on PR #6717: URL: https://github.com/apache/hudi/pull/6717#issuecomment-1251481174 ## CI report: * 668b55c8b04a755f1a7b3d135bc8acea8c04e844 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1150

[GitHub] [hudi] hudi-bot commented on pull request #6665: [HUDI-4850] Incremental Ingestion from GCS

2022-09-19 Thread GitBox
hudi-bot commented on PR #6665: URL: https://github.com/apache/hudi/pull/6665#issuecomment-1251481026 ## CI report: * 4780c62abec84ed0e92dec0316aa132bb6bf8450 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1150

[jira] [Comment Edited] (HUDI-4472) Revisit schema handling in HoodieSparkSqlWriter

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17572119#comment-17572119 ] Alexey Kudinkin edited comment on HUDI-4472 at 9/19/22 7:55 PM:

[jira] [Updated] (HUDI-4472) Revisit schema handling in HoodieSparkSqlWriter

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4472: -- Story Points: 8 (was: 10) > Revisit schema handling in HoodieSparkSqlWriter > -

[jira] [Updated] (HUDI-4472) Revisit schema handling in HoodieSparkSqlWriter

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4472: -- Component/s: writer-core > Revisit schema handling in HoodieSparkSqlWriter > ---

[jira] [Updated] (HUDI-4472) Revisit schema handling in HoodieSparkSqlWriter

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4472: -- Affects Version/s: 0.12.0 > Revisit schema handling in HoodieSparkSqlWriter > --

[jira] [Updated] (HUDI-4472) Revisit schema handling in HoodieSparkSqlWriter

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4472: -- Story Points: 10 (was: 4) > Revisit schema handling in HoodieSparkSqlWriter > -

[jira] [Updated] (HUDI-4472) Revisit schema handling in HoodieSparkSqlWriter

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4472: -- Fix Version/s: 0.13.0 (was: 0.12.0) > Revisit schema handling in HoodieSp

[jira] [Reopened] (HUDI-4472) Revisit schema handling in HoodieSparkSqlWriter

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin reopened HUDI-4472: --- > Revisit schema handling in HoodieSparkSqlWriter > --

[jira] [Updated] (HUDI-4692) Clean up HoodieSparkSqlWriter

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4692: -- Issue Type: Improvement (was: Bug) > Clean up HoodieSparkSqlWriter > --

[jira] [Updated] (HUDI-4872) Support automatic schema evolution for SQL MERGE INTO with UPDATE */ INSERT *

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4872: -- Component/s: writer-core > Support automatic schema evolution for SQL MERGE INTO with UPDATE */

[jira] [Assigned] (HUDI-4872) Support automatic schema evolution for SQL MERGE INTO with UPDATE */ INSERT *

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin reassigned HUDI-4872: - Assignee: Alexey Kudinkin > Support automatic schema evolution for SQL MERGE INTO with UP

[jira] [Updated] (HUDI-4872) Support automatic schema evolution for SQL MERGE INTO with UPDATE */ INSERT *

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4872: -- Fix Version/s: 0.13.0 > Support automatic schema evolution for SQL MERGE INTO with UPDATE */ INE

[jira] [Updated] (HUDI-4872) Support automatic schema evolution for SQL MERGE INTO with UPDATE */ INSERT *

2022-09-19 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4872: -- Priority: Major (was: Minor) > Support automatic schema evolution for SQL MERGE INTO with UPDAT

[jira] [Commented] (HUDI-4430) Incorrect type casting while reading HUDI table created with CustomKeyGenerator and unixtimestamp partitioning field

2022-09-19 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606735#comment-17606735 ] Rajesh Mahindra commented on HUDI-4430: --- [~alexey.kudinkin] Will help with this. >

[GitHub] [hudi] xccui commented on issue #5870: [SUPPORT] Issues when querying data partitioned by year with Flink

2022-09-19 Thread GitBox
xccui commented on issue #5870: URL: https://github.com/apache/hudi/issues/5870#issuecomment-1251447373 Hi @danny0405, I wonder if you could take a look at this issue

[GitHub] [hudi] hudi-bot commented on pull request #5920: [HUDI-4326] add updateTableSerDeInfo for HiveSyncTool

2022-09-19 Thread GitBox
hudi-bot commented on PR #5920: URL: https://github.com/apache/hudi/pull/5920#issuecomment-1251415233 ## CI report: * 2be98deb1d10d00a627fffabeabc318cb045981f Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1150

[GitHub] [hudi] hudi-bot commented on pull request #6516: [HUDI-4729] Fix fq can not be queried in pending compaction when query ro table with spark

2022-09-19 Thread GitBox
hudi-bot commented on PR #6516: URL: https://github.com/apache/hudi/pull/6516#issuecomment-1251405340 ## CI report: * f5cc48018e7d4e542c3d0b2dc677f4b708f03f11 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1150

[jira] [Updated] (HUDI-4734) Add table config change validation in deltastreamer

2022-09-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4734: -- Sprint: 2022/09/19 > Add table config change validation in deltastreamer > -

[jira] [Assigned] (HUDI-4734) Add table config change validation in deltastreamer

2022-09-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-4734: - Assignee: Jonathan Vexler (was: Vamshi Gudavarthi) > Add table config change val

[GitHub] [hudi] hudi-bot commented on pull request #6665: [HUDI-4850] Incremental Ingestion from GCS

2022-09-19 Thread GitBox
hudi-bot commented on PR #6665: URL: https://github.com/apache/hudi/pull/6665#issuecomment-1251334612 ## CI report: * dc2f497daf743a6605183ada6d412f9491890331 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1139

[GitHub] [hudi] hudi-bot commented on pull request #6349: [HUDI-4433] Hudi-CLI repair deduplicate not working with non-partitio…

2022-09-19 Thread GitBox
hudi-bot commented on PR #6349: URL: https://github.com/apache/hudi/pull/6349#issuecomment-1251333988 ## CI report: * 6b06c7819c5c9ebf11c85a0ba3a4fb03963cbce5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1144

[jira] [Deleted] (HUDI-4869) Fix test for HUDI-4780

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu deleted HUDI-4869: - > Fix test for HUDI-4780 > -- > > Key: HUDI-4869 > URL: h

[GitHub] [hudi] hudi-bot commented on pull request #6665: [HUDI-4850] Incremental Ingestion from GCS

2022-09-19 Thread GitBox
hudi-bot commented on PR #6665: URL: https://github.com/apache/hudi/pull/6665#issuecomment-1251328667 ## CI report: * dc2f497daf743a6605183ada6d412f9491890331 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1139

[GitHub] [hudi] hudi-bot commented on pull request #6349: [HUDI-4433] Hudi-CLI repair deduplicate not working with non-partitio…

2022-09-19 Thread GitBox
hudi-bot commented on PR #6349: URL: https://github.com/apache/hudi/pull/6349#issuecomment-1251328048 ## CI report: * 6b06c7819c5c9ebf11c85a0ba3a4fb03963cbce5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1144

[jira] [Closed] (HUDI-4780) hoodie.logfile.max.size It does not take effect, causing the log file to be too large

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-4780. Resolution: Fixed > hoodie.logfile.max.size It does not take effect, causing the log file to be > too larg

[GitHub] [hudi] hudi-bot commented on pull request #6697: [HUDI-3478] Spark CDC Write

2022-09-19 Thread GitBox
hudi-bot commented on PR #6697: URL: https://github.com/apache/hudi/pull/6697#issuecomment-1251252254 ## CI report: * 197d7d1cc1e02fc046e217c238650bc95705421b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1149

[GitHub] [hudi] hudi-bot commented on pull request #6697: [HUDI-3478] Spark CDC Write

2022-09-19 Thread GitBox
hudi-bot commented on PR #6697: URL: https://github.com/apache/hudi/pull/6697#issuecomment-1251245708 ## CI report: * 197d7d1cc1e02fc046e217c238650bc95705421b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1149

[GitHub] [hudi] hudi-bot commented on pull request #6349: [HUDI-4433] Hudi-CLI repair deduplicate not working with non-partitio…

2022-09-19 Thread GitBox
hudi-bot commented on PR #6349: URL: https://github.com/apache/hudi/pull/6349#issuecomment-1251237935 ## CI report: * 6b06c7819c5c9ebf11c85a0ba3a4fb03963cbce5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1144

[GitHub] [hudi] hudi-bot commented on pull request #6388: [HUDI-4617] Fix delete_record's preCombine logic when changelog disabled

2022-09-19 Thread GitBox
hudi-bot commented on PR #6388: URL: https://github.com/apache/hudi/pull/6388#issuecomment-1251238082 ## CI report: * 61055ec5e2a5c1282b7acf329276d345c4f83cc8 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1149

[jira] [Updated] (HUDI-3967) Automatic savepoint in Hudi

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3967: - Story Points: 0 > Automatic savepoint in Hudi > --- > > Key: HUDI-

[jira] [Updated] (HUDI-3601) Support multi-arch builds in docker setup

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3601: - Story Points: 0 > Support multi-arch builds in docker setup > - >

[jira] [Updated] (HUDI-1574) Trim existing unit tests to finish in much shorter amount of time

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1574: - Story Points: 0 Description: spark-client-tests 278.165 s - in org.apache.hudi.table.TestHoodieMerg

[jira] [Updated] (HUDI-1265) Efficient bootstrap and migration of existing non-Hudi dataset

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1265: - Story Points: 0 > Efficient bootstrap and migration of existing non-Hudi dataset > ---

[jira] [Updated] (HUDI-4659) Develop a validation tool for bootstrap table

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4659: - Sprint: 2022/09/05 (was: 2022/09/05, 2022/09/19) > Develop a validation tool for bootstrap table > --

[jira] [Updated] (HUDI-4652) Test COW: Deltastreamer writing with non-Hudi partitions

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4652: - Sprint: 2022/08/22, 2022/09/05 (was: 2022/08/22, 2022/09/05, 2022/09/19) > Test COW: Deltastreamer writin

[jira] [Updated] (HUDI-992) For hive-style partitioned source data, partition columns synced with Hive will always have String type

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-992: Sprint: 2022/09/05 (was: 2022/09/05, 2022/09/19) > For hive-style partitioned source data, partition columns

[jira] [Updated] (HUDI-2071) Support Reading Bootstrap MOR RT Table In Spark DataSource Table

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2071: - Sprint: (was: 2022/09/19) > Support Reading Bootstrap MOR RT Table In Spark DataSource Table >

[jira] [Updated] (HUDI-915) Partition Columns missing in files upserted after Metadata Bootstrap

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-915: Sprint: 2022/09/05 (was: 2022/09/05, 2022/09/19) > Partition Columns missing in files upserted after Metadat

[jira] [Updated] (HUDI-4785) Cannot find partition column when querying bootstrapped table in Spark

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4785: - Sprint: 2022/09/05 (was: 2022/09/05, 2022/09/19) > Cannot find partition column when querying bootstrappe

[jira] [Updated] (HUDI-619) Investigate and implement mechanism to have hive/presto/sparksql queries avoid stitching and return null values for hoodie columns

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-619: Sprint: (was: 2022/09/19) > Investigate and implement mechanism to have hive/presto/sparksql queries > avo

[jira] [Updated] (HUDI-1001) Add implementation to translate source partition paths when doing metadata bootstrap

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1001: - Sprint: 2022/09/05 (was: 2022/09/05, 2022/09/19) > Add implementation to translate source partition paths

[jira] [Updated] (HUDI-4783) Hive-style partition path ("partition=value") does not work with bootstrap

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4783: - Sprint: 2022/09/05 (was: 2022/09/05, 2022/09/19) > Hive-style partition path ("partition=value") does not

[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1779: - Sprint: 2022/08/22, 2022/09/05 (was: 2022/08/22, 2022/09/05, 2022/09/19) > Fail to bootstrap/upsert a tab

[jira] [Updated] (HUDI-4453) Support partition pruning for tables Bootstrapped from Source Hive Style partitioned tables

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4453: - Sprint: 2022/09/05 (was: 2022/09/05, 2022/09/19) > Support partition pruning for tables Bootstrapped from

[jira] [Updated] (HUDI-4854) Deltastreamer does not respect partition selector regex for metadata-only bootstrap

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4854: - Sprint: 2022/09/05 (was: 2022/09/05, 2022/09/19) > Deltastreamer does not respect partition selector rege

[jira] [Updated] (HUDI-1157) Optimization whether to query Bootstrapped table using HoodieBootstrapRelation vs Sparks Parquet datasource

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1157: - Sprint: (was: 2022/09/19) > Optimization whether to query Bootstrapped table using > HoodieBootstrapRel

[jira] [Updated] (HUDI-4651) Test COW: Spark datasource writing with non-Hudi partitions

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4651: - Sprint: 2022/08/22, 2022/09/05 (was: 2022/08/22, 2022/09/05, 2022/09/19) > Test COW: Spark datasource wri

[jira] [Updated] (HUDI-4784) Full-record bootstrap does not generate correct partition path

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4784: - Sprint: 2022/09/05 (was: 2022/09/05, 2022/09/19) > Full-record bootstrap does not generate correct partit

[jira] [Updated] (HUDI-1369) Bootstrap datasource jobs from hanging via spark-submit

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1369: - Sprint: 2022/09/05 (was: 2022/09/05, 2022/09/19) > Bootstrap datasource jobs from hanging via spark-submi

[jira] [Updated] (HUDI-4125) Add IT (Azure CI) around bootstrapped Hudi table

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4125: - Sprint: (was: 2022/09/19) > Add IT (Azure CI) around bootstrapped Hudi table > -

[jira] [Updated] (HUDI-4662) Test MOR: Spark datasource and SQL with bootstrap

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4662: - Sprint: (was: 2022/09/19) > Test MOR: Spark datasource and SQL with bootstrap >

[jira] [Updated] (HUDI-4855) Bootstrap table from Deltastreamer cannot be read in Spark

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4855: - Sprint: 2022/09/05 (was: 2022/09/05, 2022/09/19) > Bootstrap table from Deltastreamer cannot be read in S

[jira] [Updated] (HUDI-1158) Optimizations in parallelized listing behaviour for markers and bootstrap source files

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1158: - Sprint: (was: 2022/09/19) > Optimizations in parallelized listing behaviour for markers and bootstrap >

[jira] [Updated] (HUDI-4663) Test MOR: Hive QL with bootstrap

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4663: - Sprint: (was: 2022/09/19) > Test MOR: Hive QL with bootstrap > > >

[jira] [Updated] (HUDI-4664) Evaluate performance of bootstrap using large dataset

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4664: - Sprint: (was: 2022/09/19) > Evaluate performance of bootstrap using large dataset >

[jira] [Assigned] (HUDI-4722) Add support for metrics for locking infra

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reassigned HUDI-4722: Assignee: Jagmeet Bali > Add support for metrics for locking infra > --

[jira] [Updated] (HUDI-4789) Convert FileSystem usage in hudi connector to use TrinoFileSystem interface

2022-09-19 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-4789: -- Story Points: 10 (was: 2) > Convert FileSystem usage in hudi connector to use TrinoFileSystem interface

[jira] [Assigned] (HUDI-3961) Encounter NoClassDefFoundError when using Spark 3.1 bundle and utilities slim bundle

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reassigned HUDI-3961: Assignee: Raymond Xu > Encounter NoClassDefFoundError when using Spark 3.1 bundle and utilities sli

[GitHub] [hudi] hudi-bot commented on pull request #6349: [HUDI-4433] Hudi-CLI repair deduplicate not working with non-partitio…

2022-09-19 Thread GitBox
hudi-bot commented on PR #6349: URL: https://github.com/apache/hudi/pull/6349#issuecomment-1251174058 ## CI report: * 6b06c7819c5c9ebf11c85a0ba3a4fb03963cbce5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1144

[jira] [Updated] (HUDI-4199) Clean up row writer path for url encoding, consistent logical timestamp

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4199: - Sprint: 2022/09/05, 2022/09/19 (was: 2022/09/05) > Clean up row writer path for url encoding, consistent

[jira] [Updated] (HUDI-3967) Automatic savepoint in Hudi

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3967: - Sprint: 2022/08/22, 2022/09/05, 2022/09/19 (was: 2022/08/22, 2022/09/05) > Automatic savepoint in Hudi >

[jira] [Updated] (HUDI-4749) PartitionsForFullCleaning in CleanPlanner is using FileSystemBasedListing

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4749: - Sprint: 2022/08/22, 2022/09/05, 2022/09/19 (was: 2022/08/22, 2022/09/05) > PartitionsForFullCleaning in C

[jira] [Updated] (HUDI-4836) Remove "hbase-default.xml" colliding w/ "hbase-site.xml" in Hudi bundles

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4836: - Sprint: 2022/09/05, 2022/09/19 (was: 2022/09/05) > Remove "hbase-default.xml" colliding w/ "hbase-site.xm

[jira] [Updated] (HUDI-4758) Enhance validations for hudi-examples quick start for spark and pyspark

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4758: - Sprint: 2022/08/22, 2022/09/05, 2022/09/19 (was: 2022/08/22, 2022/09/05) > Enhance validations for hudi-e

[jira] [Updated] (HUDI-4765) Compared inserting data via spark-sql with spark-shell,_hoodie_record_key generation logic is different, which might affects data upsert

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4765: - Sprint: 2022/09/05, 2022/09/19 (was: 2022/09/05) > Compared inserting data via spark-sql with spark-shell

[jira] [Updated] (HUDI-3397) Make sure Spark RDDs triggering actual FS activity are only dereferenced once

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3397: - Sprint: 2022/09/05, 2022/09/19 (was: 2022/09/05) > Make sure Spark RDDs triggering actual FS activity are

[jira] [Updated] (HUDI-4754) Add compliance check in GH actions

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4754: - Sprint: 2022/08/22, 2022/09/05, 2022/09/19 (was: 2022/08/22, 2022/09/05) > Add compliance check in GH act

[jira] [Updated] (HUDI-4759) Fix website Quick start guide to add validations

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4759: - Sprint: 2022/09/05, 2022/09/19 (was: 2022/09/05) > Fix website Quick start guide to add validations > ---

[jira] [Updated] (HUDI-4792) Speed up cleaning with metadata table enabled

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4792: - Sprint: 2022/09/05, 2022/09/19 (was: 2022/09/05) > Speed up cleaning with metadata table enabled > -

[jira] [Updated] (HUDI-2786) Failed to connect to namenode in Docker Demo on Apple M1 chip

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2786: - Sprint: 2022/09/05, 2022/09/19 (was: 2022/09/05) > Failed to connect to namenode in Docker Demo on Apple

[jira] [Updated] (HUDI-4784) Full-record bootstrap does not generate correct partition path

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4784: - Sprint: 2022/09/05, 2022/09/19 (was: 2022/09/05) > Full-record bootstrap does not generate correct partit

[jira] [Updated] (HUDI-4396) Add a boolean parameter to decide whether the partition is cascade or not when hive table columns changes

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4396: - Sprint: 2022/09/05, 2022/09/19 (was: 2022/09/05) > Add a boolean parameter to decide whether the partitio

[jira] [Updated] (HUDI-4261) OOM in bulk-insert when using "NONE" sort-mode for table w/ large # of partitions

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4261: - Sprint: 2022/08/22, 2022/09/05, 2022/09/19 (was: 2022/08/22, 2022/09/05) > OOM in bulk-insert when using

[jira] [Updated] (HUDI-4363) Support Clustering row writer to improve performance

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4363: - Sprint: 2022/09/05, 2022/09/19 (was: 2022/09/05) > Support Clustering row writer to improve performance >

[jira] [Updated] (HUDI-4722) Add support for metrics for locking infra

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4722: - Sprint: 2022/09/05, 2022/09/19 (was: 2022/09/05) > Add support for metrics for locking infra > --

[jira] [Updated] (HUDI-3636) Clustering fails due to marker creation failure

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3636: - Sprint: 2022/08/22, 2022/09/05, 2022/09/19 (was: 2022/08/22, 2022/09/05) > Clustering fails due to marker

[jira] [Updated] (HUDI-4761) Test using spark listeners that guards any changes to DAG

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4761: - Sprint: 2022/09/05, 2022/09/19 (was: 2022/09/05) > Test using spark listeners that guards any changes to

[jira] [Updated] (HUDI-4652) Test COW: Deltastreamer writing with non-Hudi partitions

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4652: - Sprint: 2022/08/22, 2022/09/05, 2022/09/19 (was: 2022/08/22, 2022/09/05) > Test COW: Deltastreamer writin

[jira] [Updated] (HUDI-4878) Fix incremental cleaning for clean based on LATEST_FILE_VERSIONS

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4878: - Sprint: 2022/09/05, 2022/09/19 (was: 2022/09/05) > Fix incremental cleaning for clean based on LATEST_FIL

[jira] [Updated] (HUDI-3207) Hudi Trino connector PR review

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3207: - Sprint: Hudi-Sprint-Jan-10, Hudi-Sprint-Jan-18, Hudi-Sprint-Jan-24, Hudi-Sprint-Jan-31, Hudi-Sprint-Feb-7,

[jira] [Updated] (HUDI-4757) Enhance hudi-examples to add pyspark examples

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4757: - Sprint: 2022/08/22, 2022/09/05, 2022/09/19 (was: 2022/08/22, 2022/09/05) > Enhance hudi-examples to add p

[jira] [Updated] (HUDI-4805) Update docs for workaround to make HBase working with HDFS on Hadoop 3

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4805: - Sprint: 2022/09/05, 2022/09/19 (was: 2022/09/05) > Update docs for workaround to make HBase working with

[jira] [Updated] (HUDI-4586) Address S3 timeouts in Bloom Index with metadata table

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4586: - Sprint: 2022/08/08, 2022/08/22, 2022/09/05, 2022/09/19 (was: 2022/08/08, 2022/08/22, 2022/09/05) > Addre

[jira] [Updated] (HUDI-4674) change the default value of inputFormat for the MOR table

2022-09-19 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4674: - Sprint: 2022/08/22, 2022/09/05, 2022/09/19 (was: 2022/08/22, 2022/09/05) > change the default value of in
