[GitHub] [hudi] hudi-bot commented on pull request #8758: [HUDI-53] Implementation of record_index - a HUDI index based on the metadata table.

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8758: URL: https://github.com/apache/hudi/pull/8758#issuecomment-1583201880 ## CI report: * b30678e5d37b724f02d731c8e14a5127db221086 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8908: [DNM][MINOR] Add some logs to investigate flaky testUpsertsContinuousModeWithMultipleWriters

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8908: URL: https://github.com/apache/hudi/pull/8908#issuecomment-1583146754 ## CI report: * 9d6633418e12c8a06c7bdb3e271f535096299bd2 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8908: [DNM][MINOR] Add some logs to investigate flaky testUpsertsContinuousModeWithMultipleWriters

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8908: URL: https://github.com/apache/hudi/pull/8908#issuecomment-1583137285 ## CI report: * 9d6633418e12c8a06c7bdb3e271f535096299bd2 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] nsivabalan opened a new pull request, #8908: [DNM][MINOR] Add some logs to investigate flaky testUpsertsContinuousModeWithMultipleWriters

2023-06-08 Thread via GitHub
nsivabalan opened a new pull request, #8908: URL: https://github.com/apache/hudi/pull/8908 ### Change Logs [DNM][MINOR] Add some logs to investigate flaky testUpsertsContinuousModeWithMultipleWriters ### Impact none ### Risk level (write none, low medium or high

[GitHub] [hudi] nsivabalan commented on pull request #8907: [DNM][MINOR] Add some logs to investigate flaky testUpsertsContinuousModeWithMultipleWriters

2023-06-08 Thread via GitHub
nsivabalan commented on PR #8907: URL: https://github.com/apache/hudi/pull/8907#issuecomment-1583098117 1 cent. When we are investigating a test failure, let's see if we can disable other modules in the Azure pipelines. Even for the module of interest, we should try to filter

[GitHub] [hudi] hudi-bot commented on pull request #8907: [DNM][MINOR] Add some logs to investigate flaky testUpsertsContinuousModeWithMultipleWriters

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8907: URL: https://github.com/apache/hudi/pull/8907#issuecomment-1583073766 ## CI report: * ed947b39f1c42f690cbb79257399c1ec967859e9 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8907: [DNM][MINOR] Add some logs to investigate flaky testUpsertsContinuousModeWithMultipleWriters

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8907: URL: https://github.com/apache/hudi/pull/8907#issuecomment-1583064894 ## CI report: * ed947b39f1c42f690cbb79257399c1ec967859e9 UNKNOWN

[GitHub] [hudi] hudi-bot commented on pull request #8905: [HUDI-6337] Incremental Clean ignore partitions affected by append write commits/delta commits

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8905: URL: https://github.com/apache/hudi/pull/8905#issuecomment-1583055590 ## CI report: * 40921a2c95fd2121cf6b1af01cf1cbebd204b1b2 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8684: [HUDI-6200] Enhancements to the MDT for improving performance of larger indexes.

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8684: URL: https://github.com/apache/hudi/pull/8684#issuecomment-1583054671 ## CI report: * da0e00e3655e06b064d11e20500317e2a1d2229c UNKNOWN * fd2ec462e554190843ddcc8d946317ae76cb81e1 UNKNOWN * 86902aa16075d75d61347a93819295d78cda0d2d Azure:

[GitHub] [hudi] codope opened a new pull request, #8907: [DNM][MINOR] Add some logs to investigate flaky testUpsertsContinuousModeWithMultipleWriters

2023-06-08 Thread via GitHub
codope opened a new pull request, #8907: URL: https://github.com/apache/hudi/pull/8907 ### Change Logs Add some logs to investigate flaky testUpsertsContinuousModeWithMultipleWriters ### Impact none ### Risk level (write none, low medium or high below)

[GitHub] [hudi] hudi-bot commented on pull request #8885: [HUDI-6198] Support Hudi on Spark 3.4.0

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8885: URL: https://github.com/apache/hudi/pull/8885#issuecomment-1583004757 ## CI report: * 5e4e74413ff5ce0237dfd70620b9ab2c78680e06 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8905: [HUDI-6337] Incremental Clean ignore partitions affected by append write commits/delta commits

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8905: URL: https://github.com/apache/hudi/pull/8905#issuecomment-1583004915 ## CI report: * 40921a2c95fd2121cf6b1af01cf1cbebd204b1b2 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8847: [HUDI-2071] Support Reading Bootstrap MOR RT Table In Spark DataSource Table

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8847: URL: https://github.com/apache/hudi/pull/8847#issuecomment-1583004506 ## CI report: * fe991dc492e5bec19b4bfd91dc0b210e6b152b7a UNKNOWN * ffe28db124ce5eba9e8bc406fc24972dec19a782 Azure:

[jira] [Updated] (HUDI-6341) Add docs for Spark 3.4.0 support

2023-06-08 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6341: Fix Version/s: 0.14.0 > Add docs for Spark 3.4.0 support > > >

[GitHub] [hudi] hudi-bot commented on pull request #8905: [HUDI-6337] Incremental Clean ignore partitions affected by append write commits/delta commits

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8905: URL: https://github.com/apache/hudi/pull/8905#issuecomment-1582994285 ## CI report: * 89ee7d567c40cb8017a82e74486d19325a828e35 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8885: [DNM][HUDI-6198] Support Hudi on Spark 3.4.0

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8885: URL: https://github.com/apache/hudi/pull/8885#issuecomment-1582994129 ## CI report: * 5e4e74413ff5ce0237dfd70620b9ab2c78680e06 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8847: [HUDI-2071] Support Reading Bootstrap MOR RT Table In Spark DataSource Table

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8847: URL: https://github.com/apache/hudi/pull/8847#issuecomment-1582993854 ## CI report: * fe991dc492e5bec19b4bfd91dc0b210e6b152b7a UNKNOWN * ffe28db124ce5eba9e8bc406fc24972dec19a782 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8758: [HUDI-53] Implementation of record_index - a HUDI index based on the metadata table.

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8758: URL: https://github.com/apache/hudi/pull/8758#issuecomment-1582993413 ## CI report: * c8679dfb6e1ddea34c5aa19cfe7e8f55bf78abb1 Azure:

[jira] [Assigned] (HUDI-6341) Add docs for Spark 3.4.0 support

2023-06-08 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo reassigned HUDI-6341: --- Assignee: Ethan Guo > Add docs for Spark 3.4.0 support > > >

[jira] [Updated] (HUDI-6341) Add docs for Spark 3.4.0 support

2023-06-08 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6341: Description: Docs update for HUDI-6198 > Add docs for Spark 3.4.0 support >

[jira] [Commented] (HUDI-6198) Support Spark 3.4.0

2023-06-08 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730635#comment-17730635 ] Ethan Guo commented on HUDI-6198: - Docs update ticket: HUDI-6341 > Support Spark 3.4.0 >

[jira] [Created] (HUDI-6341) Add docs for Spark 3.4.0 support

2023-06-08 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-6341: --- Summary: Add docs for Spark 3.4.0 support Key: HUDI-6341 URL: https://issues.apache.org/jira/browse/HUDI-6341 Project: Apache Hudi Issue Type: Improvement

[jira] [Updated] (HUDI-6341) Add docs for Spark 3.4.0 support

2023-06-08 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6341: Story Points: 2 > Add docs for Spark 3.4.0 support > > >

[jira] [Updated] (HUDI-6198) Support Spark 3.4.0

2023-06-08 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6198: Fix Version/s: 0.14.0 > Support Spark 3.4.0 > --- > > Key: HUDI-6198 >

[jira] [Assigned] (HUDI-6198) Support Spark 3.4.0

2023-06-08 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo reassigned HUDI-6198: --- Assignee: Ethan Guo > Support Spark 3.4.0 > --- > > Key: HUDI-6198 >

[jira] [Updated] (HUDI-6198) Support Spark 3.4.0

2023-06-08 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6198: Story Points: 60 Priority: Blocker (was: Major) > Support Spark 3.4.0 > --- > >

[GitHub] [hudi] hudi-bot commented on pull request #8758: [HUDI-53] Implementation of record_index - a HUDI index based on the metadata table.

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8758: URL: https://github.com/apache/hudi/pull/8758#issuecomment-1582974896 ## CI report: * c8679dfb6e1ddea34c5aa19cfe7e8f55bf78abb1 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8452: URL: https://github.com/apache/hudi/pull/8452#issuecomment-1582971556 ## CI report: * 8082df232089396b2a9f9be2b915e51b3645f172 UNKNOWN * 4c1b5f01eec0440f14d35c5cc7daf29e96086b4d Azure:

[GitHub] [hudi] Vsevolod3 commented on issue #8071: [SUPPORT]How to improve the speed of Flink writing to hudi ?

2023-06-08 Thread via GitHub
Vsevolod3 commented on issue #8071: URL: https://github.com/apache/hudi/issues/8071#issuecomment-1582970788 We're having a similar issue with write performance. The Hudi stream_write task takes between 8 and 10 minutes for a MoR table and between 9 and 11 minutes for a CoW table to write

[GitHub] [hudi] jonvex commented on pull request #8847: [HUDI-2071] Support Reading Bootstrap MOR RT Table In Spark DataSource Table

2023-06-08 Thread via GitHub
jonvex commented on PR #8847: URL: https://github.com/apache/hudi/pull/8847#issuecomment-1582888526 Created alternate constructors for the iterators -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [hudi] hudi-bot commented on pull request #8900: [HUDI-6334] Integrate logcompaction table service to metadata table and provides various bugfixes to metadata table

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8900: URL: https://github.com/apache/hudi/pull/8900#issuecomment-1582840441 ## CI report: * 2728af07076e721cc98f1ab1cc1e814fbc7b147c Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8900: [HUDI-6334] Integrate logcompaction table service to metadata table and provides various bugfixes to metadata table

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8900: URL: https://github.com/apache/hudi/pull/8900#issuecomment-1582820525 ## CI report: * 2728af07076e721cc98f1ab1cc1e814fbc7b147c Azure:

[GitHub] [hudi] nsivabalan commented on a diff in pull request #8684: [HUDI-6200] Enhancements to the MDT for improving performance of larger indexes.

2023-06-08 Thread via GitHub
nsivabalan commented on code in PR #8684: URL: https://github.com/apache/hudi/pull/8684#discussion_r1223138356 ## hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java: ## @@ -1453,7 +1453,11 @@ public static String

[GitHub] [hudi] hudi-bot commented on pull request #8905: [HUDI-6337] Incremental Clean skip fetch commit metadata for append mode

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8905: URL: https://github.com/apache/hudi/pull/8905#issuecomment-1582702675 ## CI report: * 89ee7d567c40cb8017a82e74486d19325a828e35 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8905: [HUDI-6337] Incremental Clean skip fetch commit metadata for append mode

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8905: URL: https://github.com/apache/hudi/pull/8905#issuecomment-1582685180 ## CI report: * 89ee7d567c40cb8017a82e74486d19325a828e35 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8684: [HUDI-6200] Enhancements to the MDT for improving performance of larger indexes.

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8684: URL: https://github.com/apache/hudi/pull/8684#issuecomment-1582661903 ## CI report: * da0e00e3655e06b064d11e20500317e2a1d2229c UNKNOWN * fd2ec462e554190843ddcc8d946317ae76cb81e1 UNKNOWN * da92c6c310fccdb4f1e797b481eaf6bb027e551f Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8684: [HUDI-6200] Enhancements to the MDT for improving performance of larger indexes.

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8684: URL: https://github.com/apache/hudi/pull/8684#issuecomment-1582590820 ## CI report: * da0e00e3655e06b064d11e20500317e2a1d2229c UNKNOWN * fd2ec462e554190843ddcc8d946317ae76cb81e1 UNKNOWN * df3a5b57a375fd344814b1f5a2451cfb79fea2c3 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8452: URL: https://github.com/apache/hudi/pull/8452#issuecomment-1582590050 ## CI report: * 8082df232089396b2a9f9be2b915e51b3645f172 UNKNOWN * bb8be867576e7a658994061abd0a1f4907c7fb7f Azure:

[GitHub] [hudi] ChestnutQiang commented on issue #8126: [SUPPORT] Exit code 137 (interrupted by signal 9: SIGKILL) when StreamWriteFunction detect object size

2023-06-08 Thread via GitHub
ChestnutQiang commented on issue #8126: URL: https://github.com/apache/hudi/issues/8126#issuecomment-1582587226 I tried using JDK azu1-1.8_372 but encountered the same problem. I resolved it by changing the JDK version to 1.8_144.

[GitHub] [hudi] hudi-bot commented on pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8452: URL: https://github.com/apache/hudi/pull/8452#issuecomment-1582578399 ## CI report: * 8082df232089396b2a9f9be2b915e51b3645f172 UNKNOWN * bb8be867576e7a658994061abd0a1f4907c7fb7f Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8905: [HUDI-6337] Incremental Clean skip fetch commit metadata for append mode

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8905: URL: https://github.com/apache/hudi/pull/8905#issuecomment-1582568099 ## CI report: * 89ee7d567c40cb8017a82e74486d19325a828e35 Azure:

[jira] [Created] (HUDI-6340) Investigate and fix testReadability

2023-06-08 Thread Sagar Sumit (Jira)
Sagar Sumit created HUDI-6340: - Summary: Investigate and fix testReadability Key: HUDI-6340 URL: https://issues.apache.org/jira/browse/HUDI-6340 Project: Apache Hudi Issue Type: Test

[GitHub] [hudi] boneanxs commented on a diff in pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-08 Thread via GitHub
boneanxs commented on code in PR #8452: URL: https://github.com/apache/hudi/pull/8452#discussion_r1222992789 ## hudi-common/src/main/java/org/apache/hudi/metadata/AbstractHoodieTableMetadata.java: ## @@ -0,0 +1,111 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] [hudi] leesf commented on pull request #8437: [HUDI-6066] HoodieTableSource supports parquet predicate push down

2023-06-08 Thread via GitHub
leesf commented on PR #8437: URL: https://github.com/apache/hudi/pull/8437#issuecomment-1582526236 @danny0405 @XuQianJin-Stars will merge this PR if you do not have any active comments.

[jira] [Created] (HUDI-6339) Ability to Disable Partition Deletes during Clean

2023-06-08 Thread Dave Hagman (Jira)
Dave Hagman created HUDI-6339: - Summary: Ability to Disable Partition Deletes during Clean Key: HUDI-6339 URL: https://issues.apache.org/jira/browse/HUDI-6339 Project: Apache Hudi Issue Type:

[jira] [Updated] (HUDI-6339) Ability to Disable Partition Deletes during Clean

2023-06-08 Thread Dave Hagman (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Hagman updated HUDI-6339: -- Description: We recently experienced a large data loss in one of our largest Hudi tables. We observed

[GitHub] [hudi] boneanxs commented on pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-08 Thread via GitHub
boneanxs commented on PR #8452: URL: https://github.com/apache/hudi/pull/8452#issuecomment-1582505519 @hudi-bot run azure

[GitHub] [hudi] hudi-bot commented on pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8452: URL: https://github.com/apache/hudi/pull/8452#issuecomment-1582479221 ## CI report: * 8082df232089396b2a9f9be2b915e51b3645f172 UNKNOWN * bb8be867576e7a658994061abd0a1f4907c7fb7f Azure:

[jira] [Created] (HUDI-6338) Refactor HoodieBackedTableMetadataWriter

2023-06-08 Thread Sagar Sumit (Jira)
Sagar Sumit created HUDI-6338: - Summary: Refactor HoodieBackedTableMetadataWriter Key: HUDI-6338 URL: https://issues.apache.org/jira/browse/HUDI-6338 Project: Apache Hudi Issue Type: Improvement

[GitHub] [hudi] SteNicholas commented on pull request #8062: [HUDI-5823][RFC-65] RFC for Partition TTL Management

2023-06-08 Thread via GitHub
SteNicholas commented on PR #8062: URL: https://github.com/apache/hudi/pull/8062#issuecomment-1582405189 @stream2000, do you have any updates?

[GitHub] [hudi] hudi-bot commented on pull request #8885: [DNM][HUDI-6198] Support Hudi on Spark 3.4.0

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8885: URL: https://github.com/apache/hudi/pull/8885#issuecomment-1582397757 ## CI report: * 5e4e74413ff5ce0237dfd70620b9ab2c78680e06 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8684: [HUDI-6200] Enhancements to the MDT for improving performance of larger indexes.

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8684: URL: https://github.com/apache/hudi/pull/8684#issuecomment-1582396153 ## CI report: * da0e00e3655e06b064d11e20500317e2a1d2229c UNKNOWN * fd2ec462e554190843ddcc8d946317ae76cb81e1 UNKNOWN * df3a5b57a375fd344814b1f5a2451cfb79fea2c3 Azure:

[GitHub] [hudi] zyclove opened a new issue, #8906: [SUPPORT] hudi upsert error: java.lang.NumberFormatException: For input string: "d880d4ea"

2023-06-08 Thread via GitHub
zyclove opened a new issue, #8906: URL: https://github.com/apache/hudi/issues/8906 **Describe the problem you faced** **To Reproduce** After upgrading from Hudi 0.9 to Hudi 0.12.3, data cannot be upserted into the old table. Code: `jsonDataSet.write()

[GitHub] [hudi] hudi-bot commented on pull request #8684: [HUDI-6200] Enhancements to the MDT for improving performance of larger indexes.

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8684: URL: https://github.com/apache/hudi/pull/8684#issuecomment-1582340844 ## CI report: * da0e00e3655e06b064d11e20500317e2a1d2229c UNKNOWN * fd2ec462e554190843ddcc8d946317ae76cb81e1 UNKNOWN * 1d49e658315a2ae22a2c29e3d388e82f869b146d Azure:

[GitHub] [hudi] voonhous commented on issue #8892: [SUPPORT] [BUG] Duplicate fileID ??? from bucket ?? of partition found during the BucketStreamWriteFunction index bootstrap.

2023-06-08 Thread via GitHub
voonhous commented on issue #8892: URL: https://github.com/apache/hudi/issues/8892#issuecomment-1582326573 I noticed 3 bucketIds being repeated and they all have instants between **20230511183601566** and **20230510170043301**. Can you please share your timeline between these 2

[GitHub] [hudi] danny0405 commented on issue #8902: [SUPPORT] Flink Async Compaction MOR Table,OutOfMemoryError: Requested array size exceeds VM limit

2023-06-08 Thread via GitHub
danny0405 commented on issue #8902: URL: https://github.com/apache/hudi/issues/8902#issuecomment-1582320471 It seems the filesystem view takes too much memory. It starts on the client machine from which you submit the job.

[GitHub] [hudi] danny0405 commented on issue #8902: [SUPPORT] Flink Async Compaction MOR Table,OutOfMemoryError: Requested array size exceeds VM limit

2023-06-08 Thread via GitHub
danny0405 commented on issue #8902: URL: https://github.com/apache/hudi/issues/8902#issuecomment-1582317124 How much memory did you configure for the JM?

[GitHub] [hudi] danny0405 commented on issue #8903: [SUPPORT] aws spark3.2.1 & hudi 0.13.1 with java.lang.NoSuchMethodError: org.apache.spark.sql.execution.datasources.PartitionedFile

2023-06-08 Thread via GitHub
danny0405 commented on issue #8903: URL: https://github.com/apache/hudi/issues/8903#issuecomment-1582311028 Hi @umehrot2, can you take a look at this? It seems to be a class conflict.

[GitHub] [hudi] hudi-bot commented on pull request #8874: [HUDI-6310] CreateHoodieTableCommand::createHiveDataSourceTable arguments refactor

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8874: URL: https://github.com/apache/hudi/pull/8874#issuecomment-1582308968 ## CI report: * 6c0328f11f729b18ed59ef0362e69e2499086836 Azure:

[GitHub] [hudi] danny0405 commented on issue #8882: [SUPPORT] Using hive to read rt table exception

2023-06-08 Thread via GitHub
danny0405 commented on issue #8882: URL: https://github.com/apache/hudi/issues/8882#issuecomment-1582306023 Nice findings ~

[GitHub] [hudi] danny0405 commented on a diff in pull request #8684: [HUDI-6200] Enhancements to the MDT for improving performance of larger indexes.

2023-06-08 Thread via GitHub
danny0405 commented on code in PR #8684: URL: https://github.com/apache/hudi/pull/8684#discussion_r1222772904 ## hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/metadata/FlinkHoodieBackedTableMetadataWriter.java: ## @@ -104,40 +106,19 @@ protected void

[GitHub] [hudi] thomasg19930417 commented on issue #8882: [SUPPORT] Using hive to read rt table exception

2023-06-08 Thread via GitHub
thomasg19930417 commented on issue #8882: URL: https://github.com/apache/hudi/issues/8882#issuecomment-1582168044 Look at this test, the logic of this split should not be careless ![image](https://github.com/apache/hudi/assets/20243868/4b0c1b9f-4520-485b-8728-360270f85d4e) --

[GitHub] [hudi] hudi-bot commented on pull request #8905: [HUDI-6337] Incremental Clean skip fetch commit metadata for append mode

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8905: URL: https://github.com/apache/hudi/pull/8905#issuecomment-1582140102 ## CI report: * 89ee7d567c40cb8017a82e74486d19325a828e35 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8905: [HUDI-6337] Incremental Clean skip fetch commit metadata for append mode

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8905: URL: https://github.com/apache/hudi/pull/8905#issuecomment-1582128707 ## CI report: * 89ee7d567c40cb8017a82e74486d19325a828e35 UNKNOWN

[GitHub] [hudi] hudi-bot commented on pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8452: URL: https://github.com/apache/hudi/pull/8452#issuecomment-1582126900 ## CI report: * 8082df232089396b2a9f9be2b915e51b3645f172 UNKNOWN * 75b8cbded462b93578b7703069e15d52f321a5b7 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8684: [HUDI-6200] Enhancements to the MDT for improving performance of larger indexes.

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8684: URL: https://github.com/apache/hudi/pull/8684#issuecomment-1582114650 ## CI report: * da0e00e3655e06b064d11e20500317e2a1d2229c UNKNOWN * fd2ec462e554190843ddcc8d946317ae76cb81e1 UNKNOWN * 1d49e658315a2ae22a2c29e3d388e82f869b146d Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8452: URL: https://github.com/apache/hudi/pull/8452#issuecomment-1582112961 ## CI report: * 8082df232089396b2a9f9be2b915e51b3645f172 UNKNOWN * c0673394e9bbced23d85463e2e6827e45b743fdd Azure:

[GitHub] [hudi] xuzifu666 commented on pull request #8874: [HUDI-6310] CreateHoodieTableCommand::createHiveDataSourceTable arguments refactor

2023-06-08 Thread via GitHub
xuzifu666 commented on PR #8874: URL: https://github.com/apache/hudi/pull/8874#issuecomment-1582110951 @danny0405 cc

[GitHub] [hudi] stream2000 opened a new pull request, #8905: [HUDI-6337] Incremental Clean skip fetch commit metadata for append mode

2023-06-08 Thread via GitHub
stream2000 opened a new pull request, #8905: URL: https://github.com/apache/hudi/pull/8905 ### Change Logs Incremental clean skips fetching commit metadata for append mode. We don't need to clean the data files at all because there is always only one version per file group.

[jira] [Updated] (HUDI-6337) Incremental Clean skip fetch commit metadata for append mode

2023-06-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6337: - Labels: pull-request-available (was: ) > Incremental Clean skip fetch commit metadata for append

[jira] [Created] (HUDI-6337) Incremental Clean skip fetch commit metadata for append mode

2023-06-08 Thread Qijun Fu (Jira)
Qijun Fu created HUDI-6337: -- Summary: Incremental Clean skip fetch commit metadata for append mode Key: HUDI-6337 URL: https://issues.apache.org/jira/browse/HUDI-6337 Project: Apache Hudi Issue

[GitHub] [hudi] hudi-bot commented on pull request #8885: [DNM][HUDI-6198] Support Hudi on Spark 3.4.0

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8885: URL: https://github.com/apache/hudi/pull/8885#issuecomment-1582054577 ## CI report: * d927bbbca29949d551eeed521f6a1e31e2fe498d Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8684: [HUDI-6200] Enhancements to the MDT for improving performance of larger indexes.

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8684: URL: https://github.com/apache/hudi/pull/8684#issuecomment-1582053742 ## CI report: * da0e00e3655e06b064d11e20500317e2a1d2229c UNKNOWN * fd2ec462e554190843ddcc8d946317ae76cb81e1 UNKNOWN * 1d49e658315a2ae22a2c29e3d388e82f869b146d Azure:

[jira] [Closed] (HUDI-6263) Update hoodie.properties will cause reader failed: hoodie.properties: No such file or directory!

2023-06-08 Thread Qijun Fu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qijun Fu closed HUDI-6263. -- Resolution: Duplicate > Update hoodie.properties will cause reader failed: hoodie.properties: No such > file

[GitHub] [hudi] hudi-bot commented on pull request #8885: [DNM][HUDI-6198] Support Hudi on Spark 3.4.0

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8885: URL: https://github.com/apache/hudi/pull/8885#issuecomment-1582038557 ## CI report: * 96f974edcae6d91b5293abd0b8a9ba68007c7a1f Azure:

[jira] [Created] (HUDI-6336) Support TimelineBased Checkpoint Metadata for flink

2023-06-08 Thread Qijun Fu (Jira)
Qijun Fu created HUDI-6336: -- Summary: Support TimelineBased Checkpoint Metadata for flink Key: HUDI-6336 URL: https://issues.apache.org/jira/browse/HUDI-6336 Project: Apache Hudi Issue Type:

[GitHub] [hudi] codope commented on a diff in pull request #8684: [HUDI-6200] Enhancements to the MDT for improving performance of larger indexes.

2023-06-08 Thread via GitHub
codope commented on code in PR #8684: URL: https://github.com/apache/hudi/pull/8684#discussion_r1222567953 ## hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/metadata/FlinkHoodieBackedTableMetadataWriter.java: ## @@ -104,40 +106,19 @@ protected void initRegistry() {

[GitHub] [hudi] hudi-bot commented on pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8452: URL: https://github.com/apache/hudi/pull/8452#issuecomment-1582035580 ## CI report: * 8082df232089396b2a9f9be2b915e51b3645f172 UNKNOWN * c0673394e9bbced23d85463e2e6827e45b743fdd Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8874: [HUDI-6310] CreateHoodieTableCommand::createHiveDataSourceTable arguments refactor

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8874: URL: https://github.com/apache/hudi/pull/8874#issuecomment-1582018910 ## CI report: * 6c0328f11f729b18ed59ef0362e69e2499086836 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8452: URL: https://github.com/apache/hudi/pull/8452#issuecomment-1582015784 ## CI report: * 8082df232089396b2a9f9be2b915e51b3645f172 UNKNOWN * e6bf09a37c06ad1fd765a83e685dbf5e775f6216 Azure:

[GitHub] [hudi] danny0405 commented on a diff in pull request #8684: [HUDI-6200] Enhancements to the MDT for improving performance of larger indexes.

2023-06-08 Thread via GitHub
danny0405 commented on code in PR #8684: URL: https://github.com/apache/hudi/pull/8684#discussion_r1222548105 ## hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java: ## @@ -1453,7 +1453,11 @@ public static String

[GitHub] [hudi] zyclove opened a new issue, #8904: [SUPPORT] spark-sql hudi table Caused by: org.apache.avro.AvroTypeException: Found string, expecting union

2023-06-08 Thread via GitHub
zyclove opened a new issue, #8904: URL: https://github.com/apache/hudi/issues/8904 **Describe the problem you faced** run spark-sql works select * from bi_ods_real.ods_api_test_task_log_rt limit 10;

[GitHub] [hudi] danny0405 commented on a diff in pull request #8684: [HUDI-6200] Enhancements to the MDT for improving performance of larger indexes.

2023-06-08 Thread via GitHub
danny0405 commented on code in PR #8684: URL: https://github.com/apache/hudi/pull/8684#discussion_r1222533305 ## hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java: ## @@ -1453,7 +1453,11 @@ public static String

[GitHub] [hudi] zyclove opened a new issue, #8903: [SUPPORT] aws spark3.2.1 & hudi 0.13.1 with java.lang.NoSuchMethodError: org.apache.spark.sql.execution.datasources.PartitionedFile

2023-06-08 Thread via GitHub
zyclove opened a new issue, #8903: URL: https://github.com/apache/hudi/issues/8903 **Describe the problem you faced** run : spark-sql --packages org.apache.hudi:hudi-spark3.2-bundle_2.12:0.13.1 \ > --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \ >

[GitHub] [hudi] hudi-bot commented on pull request #8885: [DNM][HUDI-6198] Support Hudi on Spark 3.4.0

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8885: URL: https://github.com/apache/hudi/pull/8885#issuecomment-1581967151 ## CI report: * 96f974edcae6d91b5293abd0b8a9ba68007c7a1f Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8874: [HUDI-6310] CreateHoodieTableCommand::createHiveDataSourceTable arguments refactor

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8874: URL: https://github.com/apache/hudi/pull/8874#issuecomment-1581967040 ## CI report: * 6c0328f11f729b18ed59ef0362e69e2499086836 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8452: URL: https://github.com/apache/hudi/pull/8452#issuecomment-1581965985 ## CI report: * 8082df232089396b2a9f9be2b915e51b3645f172 UNKNOWN * e6bf09a37c06ad1fd765a83e685dbf5e775f6216 Azure:

[GitHub] [hudi] BohanZhang0222 commented on issue #8902: [SUPPORT] Flink Async Compaction MOR Table,OutOfMemoryError: Requested array size exceeds VM limit

2023-06-08 Thread via GitHub
BohanZhang0222 commented on issue #8902: URL: https://github.com/apache/hudi/issues/8902#issuecomment-1581964681 > There is an option for the sort memory, did you try that, did you try to tune the memory for flink JM? 1. Can you provide the specific option? Thanks. 2. My start

[GitHub] [hudi] hudi-bot commented on pull request #8885: [DNM][HUDI-6198] Support Hudi on Spark 3.4.0

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8885: URL: https://github.com/apache/hudi/pull/8885#issuecomment-1581960282 ## CI report: * 96f974edcae6d91b5293abd0b8a9ba68007c7a1f Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8452: URL: https://github.com/apache/hudi/pull/8452#issuecomment-1581959224 ## CI report: * 8082df232089396b2a9f9be2b915e51b3645f172 UNKNOWN * e6bf09a37c06ad1fd765a83e685dbf5e775f6216 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8684: [HUDI-6200] Enhancements to the MDT for improving performance of larger indexes.

2023-06-08 Thread via GitHub
hudi-bot commented on PR #8684: URL: https://github.com/apache/hudi/pull/8684#issuecomment-1581951617 ## CI report: * da0e00e3655e06b064d11e20500317e2a1d2229c UNKNOWN * fd2ec462e554190843ddcc8d946317ae76cb81e1 UNKNOWN * 1d49e658315a2ae22a2c29e3d388e82f869b146d Azure:

[GitHub] [hudi] bvaradar commented on a diff in pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-08 Thread via GitHub
bvaradar commented on code in PR #8452: URL: https://github.com/apache/hudi/pull/8452#discussion_r1222501769 ## hudi-common/src/main/java/org/apache/hudi/internal/schema/utils/Conversions.java: ## @@ -0,0 +1,87 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] [hudi] boneanxs commented on a diff in pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-08 Thread via GitHub
boneanxs commented on code in PR #8452: URL: https://github.com/apache/hudi/pull/8452#discussion_r1222499227 ## hudi-common/src/main/java/org/apache/hudi/metadata/AbstractHoodieTableMetadata.java: ## @@ -0,0 +1,111 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] [hudi] boneanxs commented on a diff in pull request #8452: [HUDI-6077] Add more partition push down filters

2023-06-08 Thread via GitHub
boneanxs commented on code in PR #8452: URL: https://github.com/apache/hudi/pull/8452#discussion_r1222426029 ## hudi-common/src/main/java/org/apache/hudi/internal/schema/utils/Conversions.java: ## @@ -0,0 +1,87 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] [hudi] danny0405 commented on issue #8902: [SUPPORT] Flink Async Compaction MOR Table,OutOfMemoryError: Requested array size exceeds VM limit

2023-06-08 Thread via GitHub
danny0405 commented on issue #8902: URL: https://github.com/apache/hudi/issues/8902#issuecomment-1581937874 There is an option for the sort memory, did you try that, did you try to tune the memory for flink JM? -- This is an automated message from the Apache Git Service. To respond to
