[GitHub] [hudi] danny0405 commented on a diff in pull request #6000: [HUDI-4340] fix not parsable text DateTimeParseException in HoodieInstantTimeGenerator.parseDateFromInstantTime

2022-08-23 Thread GitBox
danny0405 commented on code in PR #6000: URL: https://github.com/apache/hudi/pull/6000#discussion_r953405253 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java: ## @@ -74,16 +74,57 @@ public class HoodieActiveTimeline extends HoodieDefa

[GitHub] [hudi] danny0405 commented on a diff in pull request #6000: [HUDI-4340] fix not parsable text DateTimeParseException in HoodieInstantTimeGenerator.parseDateFromInstantTime

2022-08-23 Thread GitBox
danny0405 commented on code in PR #6000: URL: https://github.com/apache/hudi/pull/6000#discussion_r953404878 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java: ## @@ -74,16 +74,57 @@ public class HoodieActiveTimeline extends HoodieDefa

[GitHub] [hudi] danny0405 commented on a diff in pull request #6000: [HUDI-4340] fix not parsable text DateTimeParseException in HoodieInstantTimeGenerator.parseDateFromInstantTime

2022-08-23 Thread GitBox
danny0405 commented on code in PR #6000: URL: https://github.com/apache/hudi/pull/6000#discussion_r953404485 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java: ## @@ -74,16 +74,57 @@ public class HoodieActiveTimeline extends HoodieDefa

[GitHub] [hudi] 15663671003 opened a new issue, #6483: [SUPPORT]

2022-08-23 Thread GitBox
15663671003 opened a new issue, #6483: URL: https://github.com/apache/hudi/issues/6483 **Describe the problem you faced** can't connect to hive when sync **Expected behavior** Install spark3.2.2 in CDH6.2.1 environment, and run an operation to write to hudi0.12.0, the p

[GitHub] [hudi] danny0405 commented on a diff in pull request #6000: [HUDI-4340] fix not parsable text DateTimeParseException in HoodieInstantTimeGenerator.parseDateFromInstantTime

2022-08-23 Thread GitBox
danny0405 commented on code in PR #6000: URL: https://github.com/apache/hudi/pull/6000#discussion_r953403629 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java: ## @@ -74,16 +74,57 @@ public class HoodieActiveTimeline extends HoodieDefa

[GitHub] [hudi] china-shang commented on issue #6479: [SUPPORT] How to query the previous SNAPSHOT in Hive

2022-08-23 Thread GitBox
china-shang commented on issue #6479: URL: https://github.com/apache/hudi/issues/6479#issuecomment-1225264914 > I guess it's still under development > > https://issues.apache.org/jira/browse/HUDI-1460 Oh... No... -- This is an automated message from the Apache Git Service. T

[jira] [Updated] (HUDI-4704) bulk insert overwrite table will delete the table and then recreate a table

2022-08-23 Thread zouxxyy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zouxxyy updated HUDI-4704: -- Description: When hoodie.sql.bulk.insert.enable is enabled, executing insert overwrite will delete the table an

[jira] [Updated] (HUDI-4704) bulk insert overwrite table will delete the table and then recreate a table

2022-08-23 Thread zouxxyy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zouxxyy updated HUDI-4704: -- Description: When hoodie.sql.bulk.insert.enable is enabled, executing insert overwrite will delete the table an

[jira] [Updated] (HUDI-4704) bulk insert overwrite table will delete the table and then recreate a table

2022-08-23 Thread zouxxyy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zouxxyy updated HUDI-4704: -- Description: When `hoodie.sql.bulk.insert.enable` is enabled, executing insert overwrite will delete the table

[jira] [Created] (HUDI-4704) bulk insert overwrite table will delete the table and then recreate a table

2022-08-23 Thread zouxxyy (Jira)
zouxxyy created HUDI-4704: - Summary: bulk insert overwrite table will delete the table and then recreate a table Key: HUDI-4704 URL: https://issues.apache.org/jira/browse/HUDI-4704 Project: Apache Hudi

[jira] [Assigned] (HUDI-4704) bulk insert overwrite table will delete the table and then recreate a table

2022-08-23 Thread zouxxyy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zouxxyy reassigned HUDI-4704: - Assignee: zouxxyy > bulk insert overwrite table will delete the table and then recreate a table > ---

[GitHub] [hudi] nsivabalan commented on issue #6212: [SUPPORT] Hudi creates duplicate, redundant file during clustering

2022-08-23 Thread GitBox
nsivabalan commented on issue #6212: URL: https://github.com/apache/hudi/issues/6212#issuecomment-1225242795 give me two days. I am gonna take a stab at this and will update here. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitH

[GitHub] [hudi] hudi-bot commented on pull request #6000: [HUDI-4340] fix not parsable text DateTimeParseException in HoodieInstantTimeGenerator.parseDateFromInstantTime

2022-08-23 Thread GitBox
hudi-bot commented on PR #6000: URL: https://github.com/apache/hudi/pull/6000#issuecomment-1225193356 ## CI report: * 06f352b0235cbbac215174c2755fca24009799c5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1091

[GitHub] [hudi] nsivabalan commented on issue #4622: [SUPPORT] Can't query Redshift rows even after downgrade from 0.10

2022-08-23 Thread GitBox
nsivabalan commented on issue #4622: URL: https://github.com/apache/hudi/issues/4622#issuecomment-1225182724 thanks @nochimow for the update. appreciate it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abov

[GitHub] [hudi] fengjian428 commented on issue #6441: Status on PR: 2666: Support update partial fields for CoW table

2022-08-23 Thread GitBox
fengjian428 commented on issue #6441: URL: https://github.com/apache/hudi/issues/6441#issuecomment-1225175998 > What I understand -> OverwriteNonDefaultsWithLatestAvroPayload can update the non-null fields in the new data(cdc) to the old data(Hudi table) But what if I have multiple changes

[GitHub] [hudi] brskiran1 commented on issue #6304: Hudi MultiTable Deltastreamer not updating glue catalog when new column added on Source

2022-08-23 Thread GitBox
brskiran1 commented on issue #6304: URL: https://github.com/apache/hudi/issues/6304#issuecomment-1225173640 @rmahindra123 responding on behalf of @SubashRanganathan . we have tried this without the flag hoodie.schema.on.read.enable set to true. Still dont see glue catalog updated with new

[jira] [Commented] (HUDI-4698) Rename the package 'org.apache.flink.table.data' to avoid conflicts with flink table core

2022-08-23 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17583983#comment-17583983 ] Danny Chen commented on HUDI-4698: -- Fixed via master branch: 822c1397e04936b89fda771bb1c2

[jira] [Resolved] (HUDI-4698) Rename the package 'org.apache.flink.table.data' to avoid conflicts with flink table core

2022-08-23 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen resolved HUDI-4698. -- > Rename the package 'org.apache.flink.table.data' to avoid conflicts with > flink table core > ---

[hudi] branch master updated (16a80e6d41 -> 822c1397e0)

2022-08-23 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 16a80e6d41 [HUDI-4637] Release thread in RateLimiter doesn't been terminated (#6433) add 822c1397e0 [HUDI-4698]

[GitHub] [hudi] namuny commented on issue #6212: [SUPPORT] Hudi creates duplicate, redundant file during clustering

2022-08-23 Thread GitBox
namuny commented on issue #6212: URL: https://github.com/apache/hudi/issues/6212#issuecomment-1225170438 Gentle bump to see if anyone has any further recommendations on what information we could provide to help with reproducing the issue. -- This is an automated message from the Apache Gi

[GitHub] [hudi] danny0405 merged pull request #6481: [HUDI-4698] Rename the package 'org.apache.flink.table.data' to avoid…

2022-08-23 Thread GitBox
danny0405 merged PR #6481: URL: https://github.com/apache/hudi/pull/6481 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache

[GitHub] [hudi] danny0405 commented on pull request #6481: [HUDI-4698] Rename the package 'org.apache.flink.table.data' to avoid…

2022-08-23 Thread GitBox
danny0405 commented on PR #6481: URL: https://github.com/apache/hudi/pull/6481#issuecomment-1225170342 The failed test case is flaky, should not be caused by this patch, would merge this PR and fix it in another PR. -- This is an automated message from the Apache Git Service. To respond t

[GitHub] [hudi] hudi-bot commented on pull request #6000: [HUDI-4340] fix not parsable text DateTimeParseException in HoodieInstantTimeGenerator.parseDateFromInstantTime

2022-08-23 Thread GitBox
hudi-bot commented on PR #6000: URL: https://github.com/apache/hudi/pull/6000#issuecomment-1225160370 ## CI report: * b54e1a1397b1294cc4dc6e28bdfea7fb4ccaceab Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1089

[GitHub] [hudi] hudi-bot commented on pull request #6000: [HUDI-4340] fix not parsable text DateTimeParseException in HoodieInstantTimeGenerator.parseDateFromInstantTime

2022-08-23 Thread GitBox
hudi-bot commented on PR #6000: URL: https://github.com/apache/hudi/pull/6000#issuecomment-1225157843 ## CI report: * b54e1a1397b1294cc4dc6e28bdfea7fb4ccaceab Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1089

[GitHub] [hudi] TengHuo commented on pull request #6000: [HUDI-4340] fix not parsable text DateTimeParseException in HoodieInstantTimeGenerator.parseDateFromInstantTime

2022-08-23 Thread GitBox
TengHuo commented on PR #6000: URL: https://github.com/apache/hudi/pull/6000#issuecomment-1225157773 Done, updated the method `parseDateFromInstantTimeSafely`, it will log a warning message and return `Option.empty` when get an invalid timestamp, so won't output metrics when the timestamp i

[GitHub] [hudi] Zouxxyy commented on issue #6479: [SUPPORT] How to query the previous SNAPSHOT in Hive

2022-08-23 Thread GitBox
Zouxxyy commented on issue #6479: URL: https://github.com/apache/hudi/issues/6479#issuecomment-1225129309 I guess it's still under development -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [hudi] TengHuo commented on a diff in pull request #6000: [HUDI-4340] fix not parsable text DateTimeParseException in HoodieInstantTimeGenerator.parseDateFromInstantTime

2022-08-23 Thread GitBox
TengHuo commented on code in PR #6000: URL: https://github.com/apache/hudi/pull/6000#discussion_r953286608 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java: ## @@ -75,16 +75,56 @@ public class HoodieActiveTimeline extends HoodieDefaul

[GitHub] [hudi] danny0405 commented on a diff in pull request #6000: [HUDI-4340] fix not parsable text DateTimeParseException in HoodieInstantTimeGenerator.parseDateFromInstantTime

2022-08-23 Thread GitBox
danny0405 commented on code in PR #6000: URL: https://github.com/apache/hudi/pull/6000#discussion_r953281948 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java: ## @@ -75,16 +75,56 @@ public class HoodieActiveTimeline extends HoodieDefa

[GitHub] [hudi] TengHuo commented on a diff in pull request #6000: [HUDI-4340] fix not parsable text DateTimeParseException in HoodieInstantTimeGenerator.parseDateFromInstantTime

2022-08-23 Thread GitBox
TengHuo commented on code in PR #6000: URL: https://github.com/apache/hudi/pull/6000#discussion_r953269114 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java: ## @@ -75,16 +75,56 @@ public class HoodieActiveTimeline extends HoodieDefaul

[jira] [Closed] (HUDI-4637) Release thread in RateLimiter is not terminated

2022-08-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan closed HUDI-4637. - Resolution: Fixed > Release thread in RateLimiter is not terminated >

[hudi] branch master updated (ca8a57a21d -> 16a80e6d41)

2022-08-23 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from ca8a57a21d [HUDI-4515] Fix savepoints will be cleaned in keeping latest versions policy (#6267) add 16a80e6d41

[GitHub] [hudi] nsivabalan merged pull request #6433: [HUDI-4637] Release thread in RateLimiter is not terminated

2022-08-23 Thread GitBox
nsivabalan merged PR #6433: URL: https://github.com/apache/hudi/pull/6433 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apach

[GitHub] [hudi] linfey90 commented on a diff in pull request #6456: [HUDI-4674]Change the default value of inputFormat for the MOR table

2022-08-23 Thread GitBox
linfey90 commented on code in PR #6456: URL: https://github.com/apache/hudi/pull/6456#discussion_r953253259 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/command/CreateHoodieTableCommand.scala: ## @@ -120,10 +119,8 @@ object CreateHoodieTabl

[GitHub] [hudi] nsivabalan commented on pull request #5920: [HUDI-4326] add updateTableSerDeInfo for HiveSyncTool

2022-08-23 Thread GitBox
nsivabalan commented on PR #5920: URL: https://github.com/apache/hudi/pull/5920#issuecomment-1225077158 hey @kk17 : is there any updates on this patch. once its ready, let me know. I can take another look. -- This is an automated message from the Apache Git Service. To respond to the mes

[GitHub] [hudi] nsivabalan commented on a diff in pull request #6000: [HUDI-4340] fix not parsable text DateTimeParseException in HoodieInstantTimeGenerator.parseDateFromInstantTime

2022-08-23 Thread GitBox
nsivabalan commented on code in PR #6000: URL: https://github.com/apache/hudi/pull/6000#discussion_r953246798 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java: ## @@ -75,16 +75,56 @@ public class HoodieActiveTimeline extends HoodieDef

[jira] [Closed] (HUDI-4515) savepoints will be clean in keeping latest versions policy

2022-08-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan closed HUDI-4515. - Resolution: Fixed > savepoints will be clean in keeping latest versions policy > -

[GitHub] [hudi] nsivabalan merged pull request #6267: [HUDI-4515] Fix savepoints will be cleaned in keeping latest versions policy

2022-08-23 Thread GitBox
nsivabalan merged PR #6267: URL: https://github.com/apache/hudi/pull/6267 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apach

[hudi] branch master updated (1879efa45d -> ca8a57a21d)

2022-08-23 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 1879efa45d [HUDI-4686] Flip option 'write.ignore.failed' to default false (#6467) add ca8a57a21d [HUDI-4515] Fi

[GitHub] [hudi] nsivabalan commented on pull request #6157: [HUDI-4431] Fix log file will not roll over to a new file

2022-08-23 Thread GitBox
nsivabalan commented on PR #6157: URL: https://github.com/apache/hudi/pull/6157#issuecomment-1225068635 @XuQianJin-Stars : hey. can you follow up on this. do we need a fix or if its already taken care. let us know. we can close it out. -- This is an automated message from the Apache Git S

[GitHub] [hudi] bhasudha commented on pull request #6482: [DOCS] Add youtube channel and Office hours page

2022-08-23 Thread GitBox
bhasudha commented on PR #6482: URL: https://github.com/apache/hudi/pull/6482#issuecomment-1225060413 **Image of the header** https://user-images.githubusercontent.com/2179254/186296081-1401a649-663e-4db0-9c67-5aef18ff6042.png The logo is updated but is not usually visible in local w

[GitHub] [hudi] bhasudha opened a new pull request, #6482: [DOCS] Add youtube channel and Office hours page

2022-08-23 Thread GitBox
bhasudha opened a new pull request, #6482: URL: https://github.com/apache/hudi/pull/6482 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performance

[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

2022-08-23 Thread GitBox
hudi-bot commented on PR #6135: URL: https://github.com/apache/hudi/pull/6135#issuecomment-1225016353 ## CI report: * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN * f70abbc3b45005d40e74252814edc0078a50030e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #6105: Make Spark 3.2 the default profile

2022-08-23 Thread GitBox
hudi-bot commented on PR #6105: URL: https://github.com/apache/hudi/pull/6105#issuecomment-1224985799 ## CI report: * ec2ecf42597af2586cd3864b297f15b881cf204d UNKNOWN * 326f8f69ea423a58df8c98f382528efb9424d053 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

2022-08-23 Thread GitBox
hudi-bot commented on PR #6135: URL: https://github.com/apache/hudi/pull/6135#issuecomment-1224982778 ## CI report: * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN * 1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #6105: Make Spark 3.2 the default profile

2022-08-23 Thread GitBox
hudi-bot commented on PR #6105: URL: https://github.com/apache/hudi/pull/6105#issuecomment-1224982712 ## CI report: * ec2ecf42597af2586cd3864b297f15b881cf204d UNKNOWN * 269aef1e346d379cdb5b76eb2aab9fc2945dcfc9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

2022-08-23 Thread GitBox
hudi-bot commented on PR #6135: URL: https://github.com/apache/hudi/pull/6135#issuecomment-1224979442 ## CI report: * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN * 1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #6105: Make Spark 3.2 the default profile

2022-08-23 Thread GitBox
hudi-bot commented on PR #6105: URL: https://github.com/apache/hudi/pull/6105#issuecomment-1224979379 ## CI report: * ec2ecf42597af2586cd3864b297f15b881cf204d UNKNOWN * 269aef1e346d379cdb5b76eb2aab9fc2945dcfc9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #6432: [HUDI-4586] Improve metadata fetching in bloom index

2022-08-23 Thread GitBox
hudi-bot commented on PR #6432: URL: https://github.com/apache/hudi/pull/6432#issuecomment-1224976328 ## CI report: * ed15f57dc58b2e9142dd33a0ecd078bf4c236afc Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1088

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6352: [HUDI-4584] Fixing `SQLConf` not being propagated to executor

2022-08-23 Thread GitBox
alexeykudinkin commented on code in PR #6352: URL: https://github.com/apache/hudi/pull/6352#discussion_r953174810 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/execution/SQLConfInjectingRDD.scala: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] yihua commented on a diff in pull request #6352: [HUDI-4584] Fixing `SQLConf` not being propagated to executor

2022-08-23 Thread GitBox
yihua commented on code in PR #6352: URL: https://github.com/apache/hudi/pull/6352#discussion_r953160525 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/execution/SQLConfInjectingRDD.scala: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundati

[GitHub] [hudi] hudi-bot commented on pull request #6432: [HUDI-4586] Improve metadata fetching in bloom index

2022-08-23 Thread GitBox
hudi-bot commented on PR #6432: URL: https://github.com/apache/hudi/pull/6432#issuecomment-1224932396 ## CI report: * ed15f57dc58b2e9142dd33a0ecd078bf4c236afc Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1088

[GitHub] [hudi] hudi-bot commented on pull request #6105: Make Spark 3.2 the default profile

2022-08-23 Thread GitBox
hudi-bot commented on PR #6105: URL: https://github.com/apache/hudi/pull/6105#issuecomment-1224931869 ## CI report: * ec2ecf42597af2586cd3864b297f15b881cf204d UNKNOWN * 269aef1e346d379cdb5b76eb2aab9fc2945dcfc9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #6432: [HUDI-4586] Improve metadata fetching in bloom index

2022-08-23 Thread GitBox
hudi-bot commented on PR #6432: URL: https://github.com/apache/hudi/pull/6432#issuecomment-1224927464 ## CI report: * ed15f57dc58b2e9142dd33a0ecd078bf4c236afc UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] dyang108 commented on issue #6428: [SUPPORT] S3 Deltastreamer: Block has already been inflated

2022-08-23 Thread GitBox
dyang108 commented on issue #6428: URL: https://github.com/apache/hudi/issues/6428#issuecomment-1224925928 Update: I got it working on an older version of Hudi 0.10.1, so seems like a regression -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [hudi] nikspatel03 commented on issue #6441: Status on PR: 2666: Support update partial fields for CoW table

2022-08-23 Thread GitBox
nikspatel03 commented on issue #6441: URL: https://github.com/apache/hudi/issues/6441#issuecomment-1224887332 What I understand -> OverwriteNonDefaultsWithLatestAvroPayload can update the non-null fields in the new data(cdc) to the old data(Hudi table) But what if I have multiple changes f

[GitHub] [hudi] hudi-bot commented on pull request #6105: Make Spark 3.2 the default profile

2022-08-23 Thread GitBox
hudi-bot commented on PR #6105: URL: https://github.com/apache/hudi/pull/6105#issuecomment-1224873599 ## CI report: * ec2ecf42597af2586cd3864b297f15b881cf204d UNKNOWN * 35c07f36c6409d471e1810833cec0b27cbf78cf9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] nochimow closed issue #4622: [SUPPORT] Can't query Redshift rows even after downgrade from 0.10

2022-08-23 Thread GitBox
nochimow closed issue #4622: [SUPPORT] Can't query Redshift rows even after downgrade from 0.10 URL: https://github.com/apache/hudi/issues/4622 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

[GitHub] [hudi] nochimow commented on issue #4622: [SUPPORT] Can't query Redshift rows even after downgrade from 0.10

2022-08-23 Thread GitBox
nochimow commented on issue #4622: URL: https://github.com/apache/hudi/issues/4622#issuecomment-1224870906 Even with AWS saying that only 0.10.0 is "supported", I did some compatibility tests with Hudi 0.10, 0.11 and 0.12. All versions worked fine, like it wasn't before. (Prior to that,

[GitHub] [hudi] nsivabalan commented on issue #6474: [SUPPORT] Hudi Deltastreamer fails to acquire lock with DynamoDB Lock Provider.

2022-08-23 Thread GitBox
nsivabalan commented on issue #6474: URL: https://github.com/apache/hudi/issues/6474#issuecomment-1224816176 yeah. From what I see, the cleaner waits for the lock (which was acquired to apply `20220822020402958` to metadata table), but after retrying, before giving up, looks like the cleane

[GitHub] [hudi] hudi-bot commented on pull request #6105: Make Spark 3.2 the default profile

2022-08-23 Thread GitBox
hudi-bot commented on PR #6105: URL: https://github.com/apache/hudi/pull/6105#issuecomment-1224756201 ## CI report: * ec2ecf42597af2586cd3864b297f15b881cf204d UNKNOWN * 35c07f36c6409d471e1810833cec0b27cbf78cf9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #6105: Make Spark3 the default profile

2022-08-23 Thread GitBox
hudi-bot commented on PR #6105: URL: https://github.com/apache/hudi/pull/6105#issuecomment-1224734575 ## CI report: * ec2ecf42597af2586cd3864b297f15b881cf204d UNKNOWN * 58aadea50328122e1a9a1b01d38e3af12e33fbe1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #6105: Make Spark3 the default profile

2022-08-23 Thread GitBox
hudi-bot commented on PR #6105: URL: https://github.com/apache/hudi/pull/6105#issuecomment-1224723686 ## CI report: * ec2ecf42597af2586cd3864b297f15b881cf204d UNKNOWN * 58aadea50328122e1a9a1b01d38e3af12e33fbe1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] minihippo commented on pull request #5920: [HUDI-4326] add updateTableSerDeInfo for HiveSyncTool

2022-08-23 Thread GitBox
minihippo commented on PR #5920: URL: https://github.com/apache/hudi/pull/5920#issuecomment-1224555479 > > can we please write a test for the changes made. > > any instruction on how to write a test? Hi @kk17, you can refer to the ut in `TestHiveSyncTool`, mock a table performe

[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

2022-08-23 Thread GitBox
hudi-bot commented on PR #6135: URL: https://github.com/apache/hudi/pull/6135#issuecomment-1224544985 ## CI report: * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN * 1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[jira] [Updated] (HUDI-2369) Blog on bulk insert sort modes

2022-08-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2369: -- Sprint: 2022/09/05 > Blog on bulk insert sort modes > -- > >

[jira] [Updated] (HUDI-2369) Blog on bulk insert sort modes

2022-08-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2369: -- Fix Version/s: 0.12.1 > Blog on bulk insert sort modes > --

[GitHub] [hudi] yihua commented on pull request #6442: [HUDI-4449] Support DataSourceV2 Read for Spark3.2

2022-08-23 Thread GitBox
yihua commented on PR #6442: URL: https://github.com/apache/hudi/pull/6442#issuecomment-1224494779 @alexeykudinkin FYI -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[jira] [Updated] (HUDI-4496) ORC fails w/ Spark 3.1

2022-08-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4496: -- Fix Version/s: 13.0 > ORC fails w/ Spark 3.1 > -- > >

[jira] [Updated] (HUDI-4389) Make HoodieStreamingSink idempotent

2022-08-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4389: -- Sprint: 2022/08/22 (was: 2022/09/19) > Make HoodieStreamingSink idempotent > --

[jira] [Updated] (HUDI-2673) Add integration/e2e test for kafka-connect functionality

2022-08-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2673: -- Sprint: Hudi-Sprint-Apr-19, Hudi-Sprint-Apr-25, 2022/05/02, 2022/05/16, 2022/08/22 (was

[GitHub] [hudi] yihua commented on pull request #6196: [HUDI-4071] Enable schema reconciliation by default

2022-08-23 Thread GitBox
yihua commented on PR #6196: URL: https://github.com/apache/hudi/pull/6196#issuecomment-1224476707 @alexeykudinkin could you also review this PR? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to t

[jira] [Updated] (HUDI-4212) kafka-connect module: Unresolved dependency: 'jdk.tools:jdk.tools:jar:1.7'

2022-08-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4212: -- Sprint: 2022/08/08, 2022/09/05 (was: 2022/08/08, 2022/08/22) > kafka-connect module: Un

[GitHub] [hudi] rmahindra123 commented on issue #6348: [SUPPORT] Hudi error while running HoodieMultiTableDeltaStreamer: Commit 20220809112130103 failed and rolled-back !

2022-08-23 Thread GitBox
rmahindra123 commented on issue #6348: URL: https://github.com/apache/hudi/issues/6348#issuecomment-1224406489 For Multitable Deltastreamer, it runs the ingestion sequentially, so it will first ingest table1 and then table2. Let me know if you still are facing issues. -- This is an auto

[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

2022-08-23 Thread GitBox
hudi-bot commented on PR #6135: URL: https://github.com/apache/hudi/pull/6135#issuecomment-1224389948 ## CI report: * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN * 14115a6f79de39f538ddfba407f84249c35ebca5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6456: [HUDI-4674]Change the default value of inputFormat for the MOR table

2022-08-23 Thread GitBox
alexeykudinkin commented on code in PR #6456: URL: https://github.com/apache/hudi/pull/6456#discussion_r952916053 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/command/CreateHoodieTableCommand.scala: ## @@ -120,10 +119,8 @@ object CreateHood

[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

2022-08-23 Thread GitBox
hudi-bot commented on PR #6135: URL: https://github.com/apache/hudi/pull/6135#issuecomment-1224382862 ## CI report: * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN * 14115a6f79de39f538ddfba407f84249c35ebca5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] alexeykudinkin closed pull request #6193: [WIP] Fixing logging dependencies and configs

2022-08-23 Thread GitBox
alexeykudinkin closed pull request #6193: [WIP] Fixing logging dependencies and configs URL: https://github.com/apache/hudi/pull/6193

[GitHub] [hudi] alexeykudinkin commented on pull request #6193: [WIP] Fixing logging dependencies and configs

2022-08-23 Thread GitBox
alexeykudinkin commented on PR #6193: URL: https://github.com/apache/hudi/pull/6193#issuecomment-1224359433 Yeah, this could be closed

[GitHub] [hudi] yihua commented on pull request #6193: [WIP] Fixing logging dependencies and configs

2022-08-23 Thread GitBox
yihua commented on PR #6193: URL: https://github.com/apache/hudi/pull/6193#issuecomment-1224348069 @alexeykudinkin Is this still needed or replaced by #6170?

[jira] [Updated] (HUDI-4586) Address S3 timeouts in Bloom Index with metadata table

2022-08-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4586: - Story Points: 1 (was: 5) > Address S3 timeouts in Bloom Index with metadata table > -

[jira] [Updated] (HUDI-4635) Update roadmap page based on H2 2022 plan

2022-08-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4635: - Story Points: 0.5 (was: 1) > Update roadmap page based on H2 2022 plan >

[jira] [Updated] (HUDI-3636) Clustering fails due to marker creation failure

2022-08-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3636: - Story Points: 2 (was: 4) > Clustering fails due to marker creation failure >

[jira] [Updated] (HUDI-4585) Optimize query performance on Presto Hudi connector

2022-08-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4585: - Story Points: 0 (was: 2) > Optimize query performance on Presto Hudi connector > ---

[jira] [Updated] (HUDI-2955) Upgrade Hadoop to 3.3.x

2022-08-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2955: - Sprint: Hudi-Sprint-Feb-14, Hudi-Sprint-Mar-14, Hudi-Sprint-Mar-21, Hudi-Sprint-Mar-22, Hudi-Sprint-Apr-05

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6352: [HUDI-4584] Fixing `SQLConf` not being propagated to executor

2022-08-23 Thread GitBox
alexeykudinkin commented on code in PR #6352: URL: https://github.com/apache/hudi/pull/6352#discussion_r952852924 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/execution/SQLConfInjectingRDD.scala: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] pomaster commented on issue #6344: [SUPPORT] spark-sql schema_evolution

2022-08-23 Thread GitBox
pomaster commented on issue #6344: URL: https://github.com/apache/hudi/issues/6344#issuecomment-1224294973 @nsivabalan Looks like @KnightChess has updated the doc already. Thanks @KnightChess.

[jira] [Updated] (HUDI-4659) Develop a validation tool for bootstrap table

2022-08-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4659: - Sprint: 2022/09/05 (was: 2022/08/22) > Develop a validation tool for bootstrap table > --

[jira] [Updated] (HUDI-1369) Bootstrap datasource jobs from hanging via spark-submit

2022-08-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1369: - Sprint: 2022/09/05 (was: 2022/08/22) > Bootstrap datasource jobs from hanging via spark-submit >

[jira] [Updated] (HUDI-4125) Add IT (Azure CI) around bootstrapped Hudi table

2022-08-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4125: - Sprint: 2022/09/05 (was: 2022/08/22) > Add IT (Azure CI) around bootstrapped Hudi table > ---

[GitHub] [hudi] hudi-bot commented on pull request #6481: [HUDI-4698] Rename the package 'org.apache.flink.table.data' to avoid…

2022-08-23 Thread GitBox
hudi-bot commented on PR #6481: URL: https://github.com/apache/hudi/pull/6481#issuecomment-1224281869 ## CI report: * 3eb012affd4283f9970445bf3dbf4cb48afc25bf Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1090

[GitHub] [hudi] hudi-bot commented on pull request #6480: [HUDI-4687] add show_invalid_parquet procedure

2022-08-23 Thread GitBox
hudi-bot commented on PR #6480: URL: https://github.com/apache/hudi/pull/6480#issuecomment-1224281819 ## CI report: * 9d161840463bb97d4872ce8a2c376cb9e0d00440 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1090

[GitHub] [hudi] rmahindra123 commented on issue #6278: [SUPPORT] Deltastreamer fails with data and timestamp related exception after upgrading to EMR 6.5 and spark3

2022-08-23 Thread GitBox
rmahindra123 commented on issue #6278: URL: https://github.com/apache/hudi/issues/6278#issuecomment-1224268630 Confirmed that #6352 resolves the issue after adding the following config: --conf spark.sql.avro.datetimeRebaseModeInWrite=LEGACY

[jira] [Updated] (HUDI-4585) Optimize query performance on Presto Hudi connector

2022-08-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4585: - Story Points: 2 (was: 10) > Optimize query performance on Presto Hudi connector > --

[GitHub] [hudi] rmahindra123 commented on issue #6278: [SUPPORT] Deltastreamer fails with data and timestamp related exception after upgrading to EMR 6.5 and spark3

2022-08-23 Thread GitBox
rmahindra123 commented on issue #6278: URL: https://github.com/apache/hudi/issues/6278#issuecomment-1224263388 Was able to reproduce by adding the following line in my source: newDataSet = newDataSet.withColumn("invalidDates", functions.lit("1000-01-11").cast(DataTypes.DateType));

[jira] [Updated] (HUDI-4468) Simplify TimeTravel logic for Spark 3.3

2022-08-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4468: - Sprint: 2022/09/19 (was: 2022/08/22) > Simplify TimeTravel logic for Spark 3.3 >

[jira] [Updated] (HUDI-4467) Port borrowed code from Spark 3.3

2022-08-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4467: - Sprint: 2022/09/19 (was: 2022/08/22) > Port borrowed code from Spark 3.3 > --

[jira] [Updated] (HUDI-4465) Optimizing file-listing path in MT

2022-08-23 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4465: -- Story Points: 2 (was: 4) > Optimizing file-listing path in MT > ---

[jira] [Updated] (HUDI-4588) Ingestion failing if source column is dropped

2022-08-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4588: - Story Points: 4 (was: 12) > Ingestion failing if source column is dropped > -

[jira] [Updated] (HUDI-4691) Deduplicate Spark 3.2 and Spark 3.3 integrations

2022-08-23 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4691: - Story Points: 6 (was: 12) > Deduplicate Spark 3.2 and Spark 3.3 integrations > --

[jira] [Updated] (HUDI-4690) Remove code duplicated over from Spark

2022-08-23 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4690: -- Story Points: 12 (was: 5) > Remove code duplicated over from Spark > --
