[jira] [Closed] (HUDI-6277) Clustering improvements

2023-05-30 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-6277. Resolution: Fixed Fixed via master branch: 9c7d856656f3f3a01c073a2aed444d90c740c913 > Clustering improvemen

[hudi] branch master updated (87740003822 -> 9c7d856656f)

2023-05-30 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 87740003822 [HUDI-6152] Fixed the check for older timestamps with second granularity during index tagLocation. (#860

[GitHub] [hudi] danny0405 merged pull request #8829: [HUDI-6277][UBER] Clustering enhancements

2023-05-30 Thread via GitHub
danny0405 merged PR #8829: URL: https://github.com/apache/hudi/pull/8829 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache

[GitHub] [hudi] ad1happy2go commented on issue #8576: [SUPPORT] Doubt about handling old data arrival in hudi

2023-05-30 Thread via GitHub
ad1happy2go commented on issue #8576: URL: https://github.com/apache/hudi/issues/8576#issuecomment-1569577071 @pravin1406 Not sure if we have benchmarking on this. It would be a great help if you could benchmark and share the results with the community.

[GitHub] [hudi] ad1happy2go commented on issue #8670: [SUPPORT] Hudi cannot multi-write referring to case #7653

2023-05-30 Thread via GitHub
ad1happy2go commented on issue #8670: URL: https://github.com/apache/hudi/issues/8670#issuecomment-1569569791 @tomyanth Closing this issue as per Siva's comments. Please reopen in case of any other queries.

[GitHub] [hudi] danny0405 commented on a diff in pull request #8839: [HUDI-6287] Fix Memory Leak in RealtimeCompactedRecordReader

2023-05-30 Thread via GitHub
danny0405 commented on code in PR #8839: URL: https://github.com/apache/hudi/pull/8839#discussion_r1211144330 ## hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieRealtimeRecordReader.java: ## @@ -69,8 +69,18 @@ private static RecordReader constructRecordReader

[GitHub] [hudi] ertanden commented on issue #8265: [SUPPORT] Flink Table planner not loading problem

2023-05-30 Thread via GitHub
ertanden commented on issue #8265: URL: https://github.com/apache/hudi/issues/8265#issuecomment-1569564073 > Do you mean I should put `flink-table-planner-loader` in the jobs shadowJar? I didn't try that. But, my understanding from Flink documentation is that `flink-table-pla

[GitHub] [hudi] harris233 commented on pull request #8839: [HUDI-6287] Fix Memory Leak in RealtimeCompactedRecordReader

2023-05-30 Thread via GitHub
harris233 commented on PR #8839: URL: https://github.com/apache/hudi/pull/8839#issuecomment-1569559847 @hudi-bot run azure re-run the last Azure build

[GitHub] [hudi] hudi-bot commented on pull request #8850: [HUDI-6290] Fix Flink MDT compaction strategy

2023-05-30 Thread via GitHub
hudi-bot commented on PR #8850: URL: https://github.com/apache/hudi/pull/8850#issuecomment-1569540348 ## CI report: * b0b700bfca2396e298e540dbac2292da7b97da84 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1750

[GitHub] [hudi] hudi-bot commented on pull request #8849: [UBER] Rollback enhancements

2023-05-30 Thread via GitHub
hudi-bot commented on PR #8849: URL: https://github.com/apache/hudi/pull/8849#issuecomment-1569540313 ## CI report: * 3a119ed3dfa7b2514f0603d8018a933410f08875 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=175

[jira] [Created] (HUDI-6291) Environment variable not getting picked up using variable HUDI_CONF_DIR

2023-05-30 Thread Aditya Goenka (Jira)
Aditya Goenka created HUDI-6291: --- Summary: Environment variable not getting picked up using variable HUDI_CONF_DIR Key: HUDI-6291 URL: https://issues.apache.org/jira/browse/HUDI-6291 Project: Apache Hud

[GitHub] [hudi] ad1happy2go commented on issue #8730: The HudiDeltaStream is not able to Load Conf by setting Enviorment Variable

2023-05-30 Thread via GitHub
ad1happy2go commented on issue #8730: URL: https://github.com/apache/hudi/issues/8730#issuecomment-1569537108 @Amar1404 We need some improvement for the configs. JIRA created - https://issues.apache.org/jira/browse/HUDI-6291

[GitHub] [hudi] hudi-bot commented on pull request #8850: [HUDI-6290] Fix Flink MDT compaction strategy

2023-05-30 Thread via GitHub
hudi-bot commented on PR #8850: URL: https://github.com/apache/hudi/pull/8850#issuecomment-1569534211 ## CI report: * b0b700bfca2396e298e540dbac2292da7b97da84 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #8849: [UBER] Rollback enhancements

2023-05-30 Thread via GitHub
hudi-bot commented on PR #8849: URL: https://github.com/apache/hudi/pull/8849#issuecomment-1569534183 ## CI report: * 3a119ed3dfa7b2514f0603d8018a933410f08875 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1750

[GitHub] [hudi] bvaradar commented on pull request #8452: [HUDI-6077] Add more partition push down filters

2023-05-30 Thread via GitHub
bvaradar commented on PR #8452: URL: https://github.com/apache/hudi/pull/8452#issuecomment-1569531504 @boneanxs : Please update the PR description with the conditional dependency on url encoding.

[GitHub] [hudi] ad1happy2go commented on issue #8761: [SUPPORT] "Illegal Lambda Deserialization" When Leveraging PostgresDebeziumSource

2023-05-30 Thread via GitHub
ad1happy2go commented on issue #8761: URL: https://github.com/apache/hudi/issues/8761#issuecomment-1569529718 @samserpoosh Can you refer to these configs and shell script and see if it's not a config miss - https://gist.github.com/ad1happy2go/49b81f015c1a2964fee489214658cf44

[GitHub] [hudi] bvaradar commented on a diff in pull request #8452: [HUDI-6077] Add more partition push down filters

2023-05-30 Thread via GitHub
bvaradar commented on code in PR #8452: URL: https://github.com/apache/hudi/pull/8452#discussion_r123893 ## hudi-common/src/main/java/org/apache/hudi/metadata/FileSystemBackedTableMetadata.java: ## @@ -96,11 +109,32 @@ public List getPartitionPathWithPathPrefixes(List relat

[GitHub] [hudi] hudi-bot commented on pull request #8829: [HUDI-6277][UBER] Clustering enhancements

2023-05-30 Thread via GitHub
hudi-bot commented on PR #8829: URL: https://github.com/apache/hudi/pull/8829#issuecomment-1569527755 ## CI report: * 20a8dbafc6ced9f1dd26e080d116ae6b44e58010 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1749

[GitHub] [hudi] ad1happy2go commented on issue #8791: [SUPPORT] ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.rdd.MapPartitionsRDD.f

2023-05-30 Thread via GitHub
ad1happy2go commented on issue #8791: URL: https://github.com/apache/hudi/issues/8791#issuecomment-1569527034 @lucienoz Gentle ping.

[GitHub] [hudi] harris233 commented on a diff in pull request #8839: [HUDI-6287] Fix Memory Leak in RealtimeCompactedRecordReader

2023-05-30 Thread via GitHub
harris233 commented on code in PR #8839: URL: https://github.com/apache/hudi/pull/8839#discussion_r1211108874 ## hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieRealtimeRecordReader.java: ## @@ -69,8 +69,18 @@ private static RecordReader constructRecordReader

[GitHub] [hudi] Riddle4045 commented on issue #8848: [SUPPORT] Hive Sync tool fails to sync Hoodi table written using Flink 1.16 to HMS

2023-05-30 Thread via GitHub
Riddle4045 commented on issue #8848: URL: https://github.com/apache/hudi/issues/8848#issuecomment-1569510718 @xicm makes sense, I wanted to confirm I wasn't missing anything. I am going to add a `dev` flag, it'll - Trigger installation of compatible hadoop & hive versions tha

[jira] [Updated] (HUDI-6290) Fix Flink MDT compaction strategy

2023-05-30 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6290: - Labels: pull-request-available (was: ) > Fix Flink MDT compaction strategy >

[GitHub] [hudi] danny0405 opened a new pull request, #8850: [HUDI-6290] Fix Flink MDT compaction strategy

2023-05-30 Thread via GitHub
danny0405 opened a new pull request, #8850: URL: https://github.com/apache/hudi/pull/8850 ### Change Logs Because #8797 has been fixed, we can now unblock the compaction of MDT, to be more conservative, only Flink MDT compaction strategy restriction is loosened because the streaming i

[jira] [Created] (HUDI-6290) Fix Flink MDT compaction strategy

2023-05-30 Thread Danny Chen (Jira)
Danny Chen created HUDI-6290: Summary: Fix Flink MDT compaction strategy Key: HUDI-6290 URL: https://issues.apache.org/jira/browse/HUDI-6290 Project: Apache Hudi Issue Type: Improvement

[GitHub] [hudi] hudi-bot commented on pull request #8849: [UBER] Rollback enhancements

2023-05-30 Thread via GitHub
hudi-bot commented on PR #8849: URL: https://github.com/apache/hudi/pull/8849#issuecomment-1569494339 ## CI report: * 3a119ed3dfa7b2514f0603d8018a933410f08875 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1750

[GitHub] [hudi] hudi-bot commented on pull request #8839: [HUDI-6287] Fix Memory Leak in RealtimeCompactedRecordReader

2023-05-30 Thread via GitHub
hudi-bot commented on PR #8839: URL: https://github.com/apache/hudi/pull/8839#issuecomment-1569494310 ## CI report: * 5b836de96e422930069eaff5614726dbcfb863a1 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1747

[jira] [Created] (HUDI-6289) Fix Flink MDT compaction strategy

2023-05-30 Thread Danny Chen (Jira)
Danny Chen created HUDI-6289: Summary: Fix Flink MDT compaction strategy Key: HUDI-6289 URL: https://issues.apache.org/jira/browse/HUDI-6289 Project: Apache Hudi Issue Type: Improvement

[jira] [Updated] (HUDI-6287) Fix Memory Leak in RealtimeCompactedRecordReader

2023-05-30 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6287: - Labels: pull-request-available (was: ) > Fix Memory Leak in RealtimeCompactedRecordReader > -

[GitHub] [hudi] hudi-bot commented on pull request #8849: [UBER] Rollback enhancements

2023-05-30 Thread via GitHub
hudi-bot commented on PR #8849: URL: https://github.com/apache/hudi/pull/8849#issuecomment-1569490283 ## CI report: * 3a119ed3dfa7b2514f0603d8018a933410f08875 UNKNOWN

[GitHub] [hudi] hudi-bot commented on pull request #8839: [HUDI-6287] Fix Memory Leak in RealtimeCompactedRecordReader

2023-05-30 Thread via GitHub
hudi-bot commented on PR #8839: URL: https://github.com/apache/hudi/pull/8839#issuecomment-1569490233 ## CI report: * 5b836de96e422930069eaff5614726dbcfb863a1 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1747

[GitHub] [hudi] hudi-bot commented on pull request #8190: [HUDI-5936] Fix serialization problem when FileStatus is not serializable

2023-05-30 Thread via GitHub
hudi-bot commented on PR #8190: URL: https://github.com/apache/hudi/pull/8190#issuecomment-1569485206 ## CI report: * 1ca7cd7dcd3534fa8d21fcf80eefc760c16e6f75 UNKNOWN * 30d043fb18dca954f9df59c450671106a7fa070e UNKNOWN * ada7e29d46179057b839f62ca4241a3ef4ac9c04 UNKNOWN * 63

[GitHub] [hudi] MarlboroBoy commented on issue #8838: [SUPPORT] Flink sql create COPY_ON_WRITE partitioned table use hive query raise `UnsupportedOperationException '

2023-05-30 Thread via GitHub
MarlboroBoy commented on issue #8838: URL: https://github.com/apache/hudi/issues/8838#issuecomment-1569479056 > Have you ever restarted hiveserver2? I ran this case, it works. Yes, I have already restarted. Using hive3, I modified the following code during compilation. The co

[jira] [Updated] (HUDI-5728) HoodieTimelineArchiver archives the latest instant before inflight replacecommit

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5728: Fix Version/s: (was: 0.13.1) > HoodieTimelineArchiver archives the latest instant before inflight > rep

[jira] [Updated] (HUDI-5919) Fix the validation of partition listing in metadata table validator

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5919: Fix Version/s: 0.14.0 (was: 0.13.1) > Fix the validation of partition listing in meta

[jira] [Updated] (HUDI-5799) Fix Spark partition validation in TestBulkInsertInternalPartitionerForRows

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5799: Fix Version/s: 0.14.0 (was: 0.13.1) > Fix Spark partition validation in TestBulkInser

[jira] [Updated] (HUDI-5816) Avoid loading archived timeline during meta sync

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5816: Fix Version/s: 0.14.0 (was: 0.13.1) > Avoid loading archived timeline during meta syn

[jira] [Updated] (HUDI-5406) Troubleshoot NoClassDefFoundError in Integration Tests logs

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5406: Fix Version/s: 0.14.0 (was: 0.13.1) > Troubleshoot NoClassDefFoundError in Integratio

[jira] [Updated] (HUDI-5957) Run cluster should fail when using bucket index

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5957: Fix Version/s: 0.14.0 (was: 0.13.1) > Run cluster should fail when using bucket index

[jira] [Updated] (HUDI-4968) Fix ambiguous stream read config

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4968: Fix Version/s: 0.14.0 (was: 0.13.1) > Fix ambiguous stream read config >

[jira] [Updated] (HUDI-5772) Align Flink clustering configuration with HoodieClusteringConfig

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5772: Fix Version/s: 0.14.0 (was: 0.13.1) > Align Flink clustering configuration with Hoodi

[jira] [Updated] (HUDI-5720) Improve validation script of staged source release

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5720: Fix Version/s: 0.14.0 (was: 0.13.1) > Improve validation script of staged source rele

[jira] [Updated] (HUDI-5835) spark cannot read mor table after execute update statement

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5835: Fix Version/s: 0.14.0 (was: 0.13.1) > spark cannot read mor table after execute updat

[jira] [Updated] (HUDI-5721) Add Github actions on more validations

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5721: Fix Version/s: 0.14.0 (was: 0.13.1) > Add Github actions on more validations > --

[jira] [Updated] (HUDI-5509) check if dfs support atomic creation when using filesystem base lock provider

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5509: Fix Version/s: 0.14.0 (was: 0.13.1) > check if dfs support atomic creation when using

[jira] [Updated] (HUDI-6199) CDC payload with op field for deletes do not work

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6199: Fix Version/s: 0.14.0 (was: 0.13.1) > CDC payload with op field for deletes do not wo

[jira] [Updated] (HUDI-6015) Refresh the table after executing rollback to instantTime

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6015: Fix Version/s: 0.14.0 (was: 0.13.1) > Refresh the table after executing rollback to i

[jira] [Updated] (HUDI-6068) Improve logic of getOldestInstantToRetainForClustering when archive timeline

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6068: Fix Version/s: 0.14.0 (was: 0.13.1) > Improve logic of getOldestInstantToRetainForClu

[jira] [Updated] (HUDI-6039) Fix FS based listing in clean planner

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6039: Fix Version/s: 0.14.0 (was: 0.13.1) > Fix FS based listing in clean planner > ---

[jira] [Updated] (HUDI-5800) Fix test failure in TestHoodieMergeOnReadTable

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5800: Fix Version/s: 0.14.0 (was: 0.13.1) > Fix test failure in TestHoodieMergeOnReadTable

[jira] [Updated] (HUDI-5455) Add commons configuration2 to cli bundle

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5455: Fix Version/s: 0.14.0 (was: 0.13.1) > Add commons configuration2 to cli bundle >

[jira] [Updated] (HUDI-6009) Let the jetty server in TimelineService create daemon threads

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6009: Fix Version/s: 0.14.0 (was: 0.13.1) > Let the jetty server in TimelineService create

[jira] [Updated] (HUDI-4920) fix PartialUpdatePayload cannot return deleted record in preCombine function issue

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4920: Fix Version/s: 0.14.0 (was: 0.13.1) > fix PartialUpdatePayload cannot return deleted

[jira] [Updated] (HUDI-5329) spark reads hudi table error when flink creates the table without preCombine fields

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5329: Fix Version/s: (was: 0.13.1) > spark reads hudi table error when flink creates the table without preComb

[jira] [Updated] (HUDI-5931) Improve the description of operation in HoodieDeltaStreamer

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5931: Fix Version/s: 0.14.0 (was: 0.13.1) > Improve the description of operation in HoodieD

[jira] [Updated] (HUDI-5979) Replace individual hudi modules by hudi-trino-bundle in Trino Hudi connector

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5979: Fix Version/s: 0.14.0 (was: 0.13.1) > Replace individual hudi modules by hudi-trino-b

[jira] [Updated] (HUDI-6184) Improve the test on incremental queries

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6184: Fix Version/s: 0.14.0 (was: 0.13.1) > Improve the test on incremental queries > -

[jira] [Updated] (HUDI-5987) Clustering on bootstrap table fails when row writer is disabled

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5987: Fix Version/s: 0.14.0 (was: 0.13.1) > Clustering on bootstrap table fails when row wr

[jira] [Updated] (HUDI-6027) Unnecessary scala-maven-plugin causes build issue with JDK17

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6027: Fix Version/s: 0.14.0 (was: 0.13.1) > Unnecessary scala-maven-plugin causes build iss

[jira] [Updated] (HUDI-6174) Fix flaky test testCleanerDeleteReplacedDataWithArchive

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6174: Fix Version/s: 0.14.0 (was: 0.13.1) > Fix flaky test testCleanerDeleteReplacedDataWit

[jira] [Updated] (HUDI-6090) Optimise payload size for list of FileGroupDTO

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6090: Fix Version/s: 0.14.0 (was: 0.13.1) > Optimise payload size for list of FileGroupDTO

[jira] [Updated] (HUDI-5960) Allow bootstrap procedure to throw an exception when execution fails

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5960: Fix Version/s: 0.14.0 (was: 0.13.1) > Allow bootstrap procedure to throw an exception

[jira] [Updated] (HUDI-6091) Add Java 11 and 17 to bundle validation image

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6091: Fix Version/s: 0.14.0 (was: 0.13.1) > Add Java 11 and 17 to bundle validation image >

[jira] [Updated] (HUDI-6204) Add Spark 3.3.2 in bundle validation

2023-05-30 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6204: Fix Version/s: 0.14.0 (was: 0.13.1) > Add Spark 3.3.2 in bundle validation >

[GitHub] [hudi] suryaprasanna opened a new pull request, #8849: [UBER] Rollback enhancements

2023-05-30 Thread via GitHub
suryaprasanna opened a new pull request, #8849: URL: https://github.com/apache/hudi/pull/8849 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any perfor

[GitHub] [hudi] nfarah86 commented on issue #8824: [SUPPORT] Performance and Data Integrity Issues with Hudi for Long-Term Data Retention

2023-05-30 Thread via GitHub
nfarah86 commented on issue #8824: URL: https://github.com/apache/hudi/issues/8824#issuecomment-1569461129 following up from slack: 6 years of data in the active timeline is a lot of data. 1) what kind of queries are you running? Do you need incremental queries across 6 years of da

[GitHub] [hudi] xuzifu666 commented on a diff in pull request #8795: [HUDI-6258] support olap engine query mor table in table name without ro/rt suffix

2023-05-30 Thread via GitHub
xuzifu666 commented on code in PR #8795: URL: https://github.com/apache/hudi/pull/8795#discussion_r1210992276 ## hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncTool.java: ## @@ -190,6 +190,8 @@ protected void doSync() { syncHoodieTable(roTableNa

[jira] [Updated] (HUDI-4575) Initial Kafka Global Offsets in Hudi Kafka Sink Connector

2023-05-30 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-4575: Fix Version/s: 0.14.0 (was: 0.13.1) > Initial Kafka Global Offsets in Hudi Kafka Sin

[jira] [Updated] (HUDI-4735) Spark2 bundles made from master after 2022-07-23 failed to stop

2023-05-30 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-4735: Fix Version/s: 0.14.0 (was: 0.13.1) > Spark2 bundles made from master after 2022-07-2

[jira] [Updated] (HUDI-2506) Hudi dependency governance

2023-05-30 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-2506: Fix Version/s: (was: 0.13.1) > Hudi dependency governance > -- > >

[jira] [Updated] (HUDI-4574) Failed to create timeline-server marker due to HoodieRemoteException

2023-05-30 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-4574: Fix Version/s: 0.14.0 (was: 0.13.1) > Failed to create timeline-server marker due to

[jira] [Updated] (HUDI-4557) Support validation of column stats of avro log files in tests

2023-05-30 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yue Zhang updated HUDI-4557: Fix Version/s: 0.14.0 (was: 0.13.1) > Support validation of column stats of avro log

[GitHub] [hudi] danny0405 commented on a diff in pull request #8840: [HUDI-5352] Fix `LocalDate` serialization in colstats

2023-05-30 Thread via GitHub
danny0405 commented on code in PR #8840: URL: https://github.com/apache/hudi/pull/8840#discussion_r1211045326 ## hudi-common/src/main/java/org/apache/hudi/common/util/JsonUtils.java: ## @@ -20,41 +20,74 @@ package org.apache.hudi.common.util; import org.apache.hudi.exception

[GitHub] [hudi] xicm commented on issue #8848: [SUPPORT] Hive Sync tool fails to sync Hoodi table written using Flink 1.16 to HMS

2023-05-30 Thread via GitHub
xicm commented on issue #8848: URL: https://github.com/apache/hudi/issues/8848#issuecomment-1569428397 As you say, adding calcite-core to the class path can solve the problem. You can raise a PR to improve the script.

[GitHub] [hudi] hudi-bot commented on pull request #8795: [HUDI-6258] support olap engine query mor table in table name without ro/rt suffix

2023-05-30 Thread via GitHub
hudi-bot commented on PR #8795: URL: https://github.com/apache/hudi/pull/8795#issuecomment-1569423204 ## CI report: * 002c9331b0f360178bc5f705467a5a51c9707d0a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1733

[hudi] branch master updated (efa141f121e -> 87740003822)

2023-05-30 Thread yihua
yihua pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from efa141f121e [HUDI-6275] Fix POM for building bundle jars of Spark 2 (#8825) add 87740003822 [HUDI-6152] Fixed the c

[GitHub] [hudi] yihua merged pull request #8605: [HUDI-6152] Fixed the check for older timestamps with second granularity during index tagLocation.

2023-05-30 Thread via GitHub
yihua merged PR #8605: URL: https://github.com/apache/hudi/pull/8605

[hudi] branch master updated (6f5aaf6586d -> efa141f121e)

2023-05-30 Thread yihua
yihua pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 6f5aaf6586d [HUDI-5988] Add a param, Implement a full partition sync operation wh… (#8301) add efa141f121e [HUDI-62

[GitHub] [hudi] yihua merged pull request #8825: [HUDI-6275] Fix POM for building bundle jars of Spark 2

2023-05-30 Thread via GitHub
yihua merged PR #8825: URL: https://github.com/apache/hudi/pull/8825

[GitHub] [hudi] hudi-bot commented on pull request #8795: [HUDI-6258] support olap engine query mor table in table name without ro/rt suffix

2023-05-30 Thread via GitHub
hudi-bot commented on PR #8795: URL: https://github.com/apache/hudi/pull/8795#issuecomment-1569418794 ## CI report: * 002c9331b0f360178bc5f705467a5a51c9707d0a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1733

[GitHub] [hudi] Zouxxyy commented on a diff in pull request #8795: [HUDI-6258] support olap engine query mor table in table name without ro/rt suffix

2023-05-30 Thread via GitHub
Zouxxyy commented on code in PR #8795: URL: https://github.com/apache/hudi/pull/8795#discussion_r1211028391 ## hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncConfigHolder.java: ## @@ -85,6 +85,10 @@ public class HiveSyncConfigHolder { .defaultValue("f

[GitHub] [hudi] Riddle4045 commented on issue #8848: [SUPPORT] Hive Sync tool fails to sync Hoodi table written using Flink 1.16 to HMS

2023-05-30 Thread via GitHub
Riddle4045 commented on issue #8848: URL: https://github.com/apache/hudi/issues/8848#issuecomment-1569415940 > Flink supports syncing Hive with params; did you try that? > > ```sql > CREATE TABLE t1( > uuid VARCHAR(20), > name VARCHAR(10), > age INT, > ts TIMESTA

[jira] [Updated] (HUDI-6288) Create new conflict resolution strategy that gives priority to Ingestion writers over other rewriters

2023-05-30 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6288: - Labels: pull-request-available (was: ) > Create new conflict resolution strategy that gives prior

[GitHub] [hudi] hudi-bot commented on pull request #8832: [HUDI-6288][UBER] Create IngestionPrimaryWriterBasedConflictResolutionStrategy to prioritize ingestion writers over other writers

2023-05-30 Thread via GitHub
hudi-bot commented on PR #8832: URL: https://github.com/apache/hudi/pull/8832#issuecomment-1569414172 ## CI report: * 874cfa4508c17606cfdf4fb313334fea601deb53 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1744

[GitHub] [hudi] danny0405 commented on issue #8810: [SUPPORT] when read parquet files which the file name starts with dot(.) by spark, there will create an error like "Caused by: java.lang.RuntimeExce

2023-05-30 Thread via GitHub
danny0405 commented on issue #8810: URL: https://github.com/apache/hudi/issues/8810#issuecomment-1569409786 Not sure, maybe you can delete the hidden files manually; there is no automatic fix when upgrading to 0.13.1.

[GitHub] [hudi] danny0405 commented on issue #8848: [SUPPORT] Hive Sync tool fails to sync Hoodi table written using Flink 1.16 to HMS

2023-05-30 Thread via GitHub
danny0405 commented on issue #8848: URL: https://github.com/apache/hudi/issues/8848#issuecomment-1569407728 Flink supports syncing to Hive with params; did you try that? ```sql CREATE TABLE t1( uuid VARCHAR(20), name VARCHAR(10), age INT, ts TIMESTAMP(3), `part

[GitHub] [hudi] king5holiday commented on issue #8810: [SUPPORT] when read parquet files which the file name starts with dot(.) by spark, there will create an error like "Caused by: java.lang.RuntimeE

2023-05-30 Thread via GitHub
king5holiday commented on issue #8810: URL: https://github.com/apache/hudi/issues/8810#issuecomment-1569406447 > Did you read the table by specifying hudi as the format? Or just read it as a raw parquet table. Hi, here is the code: `spark.read .format("org.

[GitHub] [hudi] xicm commented on issue #8838: [SUPPORT] Flink sql create COPY_ON_WRITE partitioned table use hive query raise `UnsupportedOperationException '

2023-05-30 Thread via GitHub
xicm commented on issue #8838: URL: https://github.com/apache/hudi/issues/8838#issuecomment-1569405268 Have you ever restarted hiveserver2? I ran this case, it works.

[GitHub] [hudi] weimingdiit commented on pull request #8301: [HUDI-5988] Add a param, Implement a full partition sync operation wh…

2023-05-30 Thread via GitHub
weimingdiit commented on PR #8301: URL: https://github.com/apache/hudi/pull/8301#issuecomment-1569404992 @yihua @danny0405 Thanks for your review and edit

[GitHub] [hudi] MarlboroBoy commented on issue #8838: [SUPPORT] Flink sql create COPY_ON_WRITE partitioned table use hive query raise `UnsupportedOperationException '

2023-05-30 Thread via GitHub
MarlboroBoy commented on issue #8838: URL: https://github.com/apache/hudi/issues/8838#issuecomment-1569402289 @danny0405 @xicm I have already attempted to build based on the master branch, but the issue still persists.

[GitHub] [hudi] xuzifu666 commented on a diff in pull request #8795: [HUDI-6258] support olap engine query mor table in table name without ro/rt suffix

2023-05-30 Thread via GitHub
xuzifu666 commented on code in PR #8795: URL: https://github.com/apache/hudi/pull/8795#discussion_r1210992276 ## hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncTool.java: ## @@ -190,6 +190,8 @@ protected void doSync() { syncHoodieTable(roTableNa

[GitHub] [hudi] danny0405 closed issue #8843: Memory leak caused by hudi if got exception when constructing record reader

2023-05-30 Thread via GitHub
danny0405 closed issue #8843: Memory leak caused by hudi if got exception when constructing record reader URL: https://github.com/apache/hudi/issues/8843

[GitHub] [hudi] xuzifu666 commented on pull request #8795: [HUDI-6258] support olap engine query mor table in table name without ro/rt suffix

2023-05-30 Thread via GitHub
xuzifu666 commented on PR #8795: URL: https://github.com/apache/hudi/pull/8795#issuecomment-1569384489 @XuQianJin-Stars cc

[GitHub] [hudi] xuzifu666 commented on a diff in pull request #8795: [HUDI-6258] support olap engine query mor table in table name without ro/rt suffix

2023-05-30 Thread via GitHub
xuzifu666 commented on code in PR #8795: URL: https://github.com/apache/hudi/pull/8795#discussion_r1211004649 ## hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncTool.java: ## @@ -190,6 +190,8 @@ protected void doSync() { syncHoodieTable(roTableNa

[GitHub] [hudi] danny0405 commented on issue #8265: [SUPPORT] Flink Table planner not loading problem

2023-05-30 Thread via GitHub
danny0405 commented on issue #8265: URL: https://github.com/apache/hudi/issues/8265#issuecomment-1569381945 Did you have the `flink-table-planner-loader` in the classpath?

[jira] [Created] (HUDI-6288) Create new conflict resolution strategy that gives priority to Ingestion writers over other rewriters

2023-05-30 Thread Surya Prasanna Yalla (Jira)
Surya Prasanna Yalla created HUDI-6288: -- Summary: Create new conflict resolution strategy that gives priority to Ingestion writers over other rewriters Key: HUDI-6288 URL: https://issues.apache.org/jira/brows

[GitHub] [hudi] hudi-bot commented on pull request #8832: [UBER] Create IngestionPrimaryWriterBasedConflictResolutionStrategy to prioritize ingestion writers over other writers

2023-05-30 Thread via GitHub
hudi-bot commented on PR #8832: URL: https://github.com/apache/hudi/pull/8832#issuecomment-1569376055 ## CI report: * 874cfa4508c17606cfdf4fb313334fea601deb53 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1744

[GitHub] [hudi] hudi-bot commented on pull request #8829: [HUDI-6277][UBER] Clustering enhancements

2023-05-30 Thread via GitHub
hudi-bot commented on PR #8829: URL: https://github.com/apache/hudi/pull/8829#issuecomment-1569376000 ## CI report: * 8b1116b265af364a08c59d74136a56eee77e2350 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1743

[jira] [Closed] (HUDI-5988) Implement a full partition sync operation when partitions are lost

2023-05-30 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-5988. Fix Version/s: 0.14.0 Resolution: Fixed Fixed via master branch: 6f5aaf6586d6296c0b07efa0095e93d0ca30

[hudi] branch master updated (7cf0c9571a5 -> 6f5aaf6586d)

2023-05-30 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 7cf0c9571a5 [MINOR] Fix typo about `recordMergerStrategyId` (#8846) add 6f5aaf6586d [HUDI-5988] Add a param, Imp

[GitHub] [hudi] danny0405 merged pull request #8301: [HUDI-5988] Add a param, Implement a full partition sync operation wh…

2023-05-30 Thread via GitHub
danny0405 merged PR #8301: URL: https://github.com/apache/hudi/pull/8301
