Re: [PR] [HUDI-7578] Avoid unnecessary rewriting when copy old data from old base to new base file to improve compaction performance [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #11028: URL: https://github.com/apache/hudi/pull/11028#issuecomment-2058284585 ## CI report: * 67ca721df255223c873303aeccf7900c29f7811a Azure:

Re: [I] [SUPPORT] The Hive run_sync_tool's Logged Command & The Actual Command Do Not Match [hudi]

2024-04-15 Thread via GitHub
danny0405 commented on issue #11029: URL: https://github.com/apache/hudi/issues/11029#issuecomment-2058252483 Sure, thanks for the nice findings and welcome to any contributions:) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [I] [SUPPORT] How we can speed up individual file write(HoodieMergeHandle part) [hudi]

2024-04-15 Thread via GitHub
xushiyan commented on issue #10997: URL: https://github.com/apache/hudi/issues/10997#issuecomment-2058238697 > we have clustering to group rows together, but it's still thousands of files affected. 75th percentile of individual file overwrite(task in the Doing partition and writing data

(hudi) branch master updated (40d4f489389 -> c54be848f96)

2024-04-15 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 40d4f489389 [HUDI-7577] Avoid MDT compaction instant time conflicts (#10992) add c54be848f96 [MINOR] Remove

Re: [PR] [MINOR] Remove redundant lines in StreamSync and TestStreamSyncUnitTests [hudi]

2024-04-15 Thread via GitHub
yihua merged PR #11027: URL: https://github.com/apache/hudi/pull/11027

Re: [PR] [HUDI-7578] Avoid unnecessary rewriting when copy old data from old base to new base file to improve compaction performance [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #11028: URL: https://github.com/apache/hudi/pull/11028#issuecomment-2058193126 ## CI report: * 67832fce75903cce3b3f66beb125f6a02fb82e11 Azure:

Re: [PR] [HUDI-7578] Avoid unnecessary rewriting when copy old data from old base to new base file to improve compaction performance [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #11028: URL: https://github.com/apache/hudi/pull/11028#issuecomment-2058187793 ## CI report: * 67832fce75903cce3b3f66beb125f6a02fb82e11 Azure:

Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10591: URL: https://github.com/apache/hudi/pull/10591#issuecomment-2058182194 ## CI report: * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN * f7ab315084f8534388db563a20d34b174cc63fa3 Azure:

Re: [PR] [HUDI-7515] Fix partition metadata write failure [hudi]

2024-04-15 Thread via GitHub
boneanxs commented on code in PR #10886: URL: https://github.com/apache/hudi/pull/10886#discussion_r1566657358 ## hudi-common/src/main/java/org/apache/hudi/common/model/HoodiePartitionMetadata.java: ## @@ -92,11 +92,12 @@ public int getPartitionDepth() { /** * Write

Re: [PR] [HUDI-7578] Avoid unnecessary rewriting when copy old data from old base to new base file to improve compaction performance [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #11028: URL: https://github.com/apache/hudi/pull/11028#issuecomment-2058148444 ## CI report: * 67832fce75903cce3b3f66beb125f6a02fb82e11 Azure:

Re: [PR] [HUDI-7578] Avoid unnecessary rewriting when copy old data from old base to new base file to improve compaction performance [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #11028: URL: https://github.com/apache/hudi/pull/11028#issuecomment-2058142670 ## CI report: * 8fc55507a82ee1295f14c1125876b8395cfc27df Azure:

Re: [PR] [MINOR] Remove redundant lines in StreamSync and TestStreamSyncUnitTests [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #11027: URL: https://github.com/apache/hudi/pull/11027#issuecomment-2058142651 ## CI report: * b96388ad837c124fb63a8655f295fadebc37319f Azure:

Re: [PR] [HUDI-7578] Avoid unnecessary rewriting when copy old data from old base to new base file to improve compaction performance [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #11028: URL: https://github.com/apache/hudi/pull/11028#issuecomment-2058137317 ## CI report: * 8fc55507a82ee1295f14c1125876b8395cfc27df Azure:

Re: [PR] [HUDI-7526] Fix constructors for bulkinsert sort partitioners to ensure we could use it as user defined partitioners [hudi]

2024-04-15 Thread via GitHub
wombatu-kun commented on PR #10942: URL: https://github.com/apache/hudi/pull/10942#issuecomment-2058119247 @nsivabalan Hi! Sorry to bother you, but you are the reporter of this task. Could you please review my PR? Or close the PR if I totally misunderstood the task and did it wrong.

[jira] [Closed] (HUDI-6762) Remove usages of MetadataRecordsGenerationParams

2024-04-15 Thread Vova Kolmakov (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vova Kolmakov closed HUDI-6762. --- Reviewers: Sagar Sumit Resolution: Fixed Fixed via master branch:

Re: [PR] [HUDI-7578] Avoid unnecessary rewriting when copy old data from old base to new base file to improve compaction performance [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #11028: URL: https://github.com/apache/hudi/pull/11028#issuecomment-2058104791 ## CI report: * 8fc55507a82ee1295f14c1125876b8395cfc27df Azure:

Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10591: URL: https://github.com/apache/hudi/pull/10591#issuecomment-2058104279 ## CI report: * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN * 7f43200dfc27f8ff499c9d0c9c375b635120f67e Azure:

[I] [SUPPORT] The Hive run_sync_tool's Logged Output & The Actual Command Do Not Match [hudi]

2024-04-15 Thread via GitHub
samserpoosh opened a new issue, #11029: URL: https://github.com/apache/hudi/issues/11029 This is a pretty small/minor issue I noticed while working with the `HiveSyncTool`. Essentially what's being logged does **not** match what's actually being executed:

Re: [PR] [HUDI-7578] Avoid unnecessary rewriting when copy old data from old base to new base file to improve compaction performance [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #11028: URL: https://github.com/apache/hudi/pull/11028#issuecomment-2058098918 ## CI report: * 8fc55507a82ee1295f14c1125876b8395cfc27df UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10591: URL: https://github.com/apache/hudi/pull/10591#issuecomment-2058098358 ## CI report: * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN * d5f312761099a9c57394f89c9b481e58773cb17f Azure:

[jira] [Commented] (HUDI-7596) Enable Jacoco code coverage report across multiple modules

2024-04-15 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837477#comment-17837477 ] Ethan Guo commented on HUDI-7596: - Say module A is depended on by module B, and there are functional tests

[jira] [Commented] (HUDI-7596) Enable Jacoco code coverage report across multiple modules

2024-04-15 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837476#comment-17837476 ] Ethan Guo commented on HUDI-7596: -

[jira] [Assigned] (HUDI-7596) Enable Jacoco code coverage report across multiple modules

2024-04-15 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo reassigned HUDI-7596: --- Assignee: Danny Chen > Enable Jacoco code coverage report across multiple modules >

[jira] [Updated] (HUDI-7596) Enable Jacoco code coverage report across multiple modules

2024-04-15 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7596: Reviewers: Danny Chen > Enable Jacoco code coverage report across multiple modules >

[jira] [Updated] (HUDI-7596) Enable Jacoco code coverage report across multiple modules

2024-04-15 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7596: Reviewers: (was: Danny Chen) > Enable Jacoco code coverage report across multiple modules >

[jira] [Updated] (HUDI-6699) An indexed global timeline (phase2)

2024-04-15 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6699: - Status: Open (was: In Progress) > An indexed global timeline (phase2) >

[jira] [Updated] (HUDI-6787) Hive Integrate FileGroupReader with HoodieMergeOnReadSnapshotReader and RealtimeCompactedRecordReader for Hive

2024-04-15 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6787: - Story Points: 25 (was: 5) > Hive Integrate FileGroupReader with HoodieMergeOnReadSnapshotReader

[jira] [Updated] (HUDI-6787) Hive Integrate FileGroupReader with HoodieMergeOnReadSnapshotReader and RealtimeCompactedRecordReader for Hive

2024-04-15 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6787: - Story Points: 5 (was: 25) > Hive Integrate FileGroupReader with HoodieMergeOnReadSnapshotReader

Re: [PR] [HUDI-7146] [RFC-77] RFC for secondary index [hudi]

2024-04-15 Thread via GitHub
codope commented on code in PR #10814: URL: https://github.com/apache/hudi/pull/10814#discussion_r1565589697 ## rfc/rfc-77/rfc-77.md: ## @@ -0,0 +1,247 @@ + + +# RFC-77: Secondary Indexes + +## Proposers + +- @bhat-vinay +- @codope + +## Approvers + - @vinothchandar + -

[jira] [Updated] (HUDI-7582) Fix NPE in FunctionalIndexSupport::loadFunctionalIndexDataFrame()

2024-04-15 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7582: - Story Points: 4 > Fix NPE in FunctionalIndexSupport::loadFunctionalIndexDataFrame() >

Re: [PR] [HUDI-7582] Fix functional index lookup [hudi]

2024-04-15 Thread via GitHub
bhat-vinay commented on code in PR #11021: URL: https://github.com/apache/hudi/pull/11021#discussion_r1566605569 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieFileIndex.scala: ## @@ -349,10 +350,15 @@ case class HoodieFileIndex(spark:

[jira] [Commented] (HUDI-7144) Support query for tables written as partitionBy but synced as non-partitioned

2024-04-15 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837474#comment-17837474 ] Vinoth Chandar commented on HUDI-7144: -- Things to ensure : 1. the out-of-box experience for

[jira] [Updated] (HUDI-7580) Inserting rows into partitioned table leads to data sanity issues

2024-04-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-7580: -- Reviewers: Jonathan Vexler > Inserting rows into partitioned table leads to data sanity issues >

[jira] [Updated] (HUDI-7144) Support query for tables written as partitionBy but synced as non-partitioned

2024-04-15 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7144: - Reviewers: Ethan Guo, Vinoth Chandar (was: Vinoth Chandar) > Support query for tables written as

[jira] [Updated] (HUDI-7570) Update RFC with details on API changes

2024-04-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-7570: -- Status: Patch Available (was: In Progress) > Update RFC with details on API changes >

[jira] [Updated] (HUDI-7570) Update RFC with details on API changes

2024-04-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-7570: -- Reviewers: Vinoth Chandar > Update RFC with details on API changes >

[jira] [Updated] (HUDI-7582) Fix NPE in FunctionalIndexSupport::loadFunctionalIndexDataFrame()

2024-04-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-7582: -- Sprint: Sprint 2024-03-25 > Fix NPE in FunctionalIndexSupport::loadFunctionalIndexDataFrame() >

[PR] [HUDI-7578] Avoid unnecessary rewriting when copy old data from old base to new base file to improve compaction performance [hudi]

2024-04-15 Thread via GitHub
danny0405 opened a new pull request, #11028: URL: https://github.com/apache/hudi/pull/11028 ### Change Logs There is no need to copy for most of the use cases. ### Impact no impact. ### Risk level (write none, low medium or high below) none ###

[jira] [Updated] (HUDI-7582) Fix NPE in FunctionalIndexSupport::loadFunctionalIndexDataFrame()

2024-04-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-7582: -- Status: In Progress (was: Open) > Fix NPE in FunctionalIndexSupport::loadFunctionalIndexDataFrame() >

[jira] [Updated] (HUDI-7582) Fix NPE in FunctionalIndexSupport::loadFunctionalIndexDataFrame()

2024-04-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-7582: -- Status: Patch Available (was: In Progress) > Fix NPE in

[jira] [Assigned] (HUDI-7582) Fix NPE in FunctionalIndexSupport::loadFunctionalIndexDataFrame()

2024-04-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit reassigned HUDI-7582: - Assignee: Sagar Sumit > Fix NPE in FunctionalIndexSupport::loadFunctionalIndexDataFrame() >

[jira] [Updated] (HUDI-7580) Inserting rows into partitioned table leads to data sanity issues

2024-04-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-7580: -- Status: Patch Available (was: In Progress) > Inserting rows into partitioned table leads to data

Re: [PR] [HUDI-7582] Fix functional index lookup [hudi]

2024-04-15 Thread via GitHub
bhat-vinay commented on code in PR #11021: URL: https://github.com/apache/hudi/pull/11021#discussion_r1566596914 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala: ## @@ -156,7 +156,7 @@ object DataSourceReadOptions { val

[jira] [Closed] (HUDI-7577) Avoid MDT compaction instant time conflicts

2024-04-15 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-7577. Resolution: Fixed Fixed via master branch: 40d4f489389083e3c6d69954361d3de4aec8186a > Avoid MDT compaction

(hudi) branch master updated: [HUDI-7577] Avoid MDT compaction instant time conflicts (#10992)

2024-04-15 Thread danny0405
danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 40d4f489389 [HUDI-7577] Avoid MDT compaction

Re: [PR] [HUDI-7577] Avoid MDT compaction instant time conflicts [hudi]

2024-04-15 Thread via GitHub
danny0405 merged PR #10992: URL: https://github.com/apache/hudi/pull/10992

Re: [PR] [HUDI-7577] Avoid MDT compaction instant time conflicts [hudi]

2024-04-15 Thread via GitHub
danny0405 commented on code in PR #10992: URL: https://github.com/apache/hudi/pull/10992#discussion_r1566592380 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java: ## @@ -106,17 +106,18 @@ public static String

Re: [PR] [HUDI-7577] Avoid MDT compaction instant time conflicts [hudi]

2024-04-15 Thread via GitHub
danny0405 commented on code in PR #10992: URL: https://github.com/apache/hudi/pull/10992#discussion_r1566591565 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java: ## @@ -1360,7 +1361,10 @@ protected void

Re: [PR] [MINOR] Remove redundant lines in StreamSync and TestStreamSyncUnitTests [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #11027: URL: https://github.com/apache/hudi/pull/11027#issuecomment-2058057742 ## CI report: * b96388ad837c124fb63a8655f295fadebc37319f Azure:

Re: [PR] [HUDI-7577] Avoid MDT compaction instant time conflicts [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10992: URL: https://github.com/apache/hudi/pull/10992#issuecomment-2058057634 ## CI report: * d8dda49ff97feca5172346047aacb007746568ae Azure:

Re: [I] [SUPPORT] Task serialization failed: java.lang.NoSuchMethodError: void org.apache.hudi.common.util.HoodieCommonKryoRegistrar.registerClasses(com.esotericsoftware.kryo.Kryo) [hudi]

2024-04-15 Thread via GitHub
danny0405 commented on issue #11026: URL: https://github.com/apache/hudi/issues/11026#issuecomment-2058058270 Looks like a jar conflict. Do you have multiple Hudi bundle jars on the classpath?

Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10591: URL: https://github.com/apache/hudi/pull/10591#issuecomment-2058057248 ## CI report: * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN * d5f312761099a9c57394f89c9b481e58773cb17f Azure:

Re: [PR] [HUDI-7577] Avoid MDT compaction instant time conflicts [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10992: URL: https://github.com/apache/hudi/pull/10992#issuecomment-2058052392 ## CI report: * d8dda49ff97feca5172346047aacb007746568ae UNKNOWN

Re: [PR] [HUDI-7577] Avoid MDT compaction instant time conflicts [hudi]

2024-04-15 Thread via GitHub
yihua commented on code in PR #10992: URL: https://github.com/apache/hudi/pull/10992#discussion_r1566585767 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java: ## @@ -1360,7 +1361,10 @@ protected void

Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10591: URL: https://github.com/apache/hudi/pull/10591#issuecomment-2058051993 ## CI report: * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN * d5f312761099a9c57394f89c9b481e58773cb17f Azure:

Re: [PR] [MINOR] Remove redundant lines in StreamSync and TestStreamSyncUnitTests [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #11027: URL: https://github.com/apache/hudi/pull/11027#issuecomment-2058052484 ## CI report: * b96388ad837c124fb63a8655f295fadebc37319f UNKNOWN

[PR] [MINOR] Remove redundant lines in StreamSync and TestStreamSyncUnitTests [hudi]

2024-04-15 Thread via GitHub
yihua opened a new pull request, #11027: URL: https://github.com/apache/hudi/pull/11027 ### Change Logs As above. ### Impact Cleaner code. ### Risk level none ### Documentation Update N/A ### Contributor's checklist - [ ] Read

Re: [PR] [HUDI-7608] Fix Flink table creation configuration not taking effect when writing… [hudi]

2024-04-15 Thread via GitHub
danny0405 commented on code in PR #11005: URL: https://github.com/apache/hudi/pull/11005#discussion_r1566579977 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/HoodieOptionConfig.scala: ## @@ -43,6 +43,11 @@ object HoodieOptionConfig { */

Re: [PR] [HUDI-7609] Support array field type whose element type can be nullable [hudi]

2024-04-15 Thread via GitHub
danny0405 commented on code in PR #11006: URL: https://github.com/apache/hudi/pull/11006#discussion_r1566577832 ## hudi-sync/hudi-sync-common/src/main/java/org/apache/hudi/sync/common/util/Parquet2SparkSchemaUtils.java: ## @@ -140,7 +141,7 @@ private static String

Re: [I] [SUPPORT] Metadata table not cleaned / compacted, log files growing rapidly [hudi]

2024-04-15 Thread via GitHub
danny0405 commented on issue #8567: URL: https://github.com/apache/hudi/issues/8567#issuecomment-2058029478 > 2a0969c9972ef746d377dbddd278ef13bf3d299d For a MOR table, it should be fine if it uses upsert semantics.

Re: [PR] [HUDI-7567] Add schema evolution to the filegroup reader [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10957: URL: https://github.com/apache/hudi/pull/10957#issuecomment-2057958997 ## CI report: * 94171e2cb1dd8066176589376d1af6c49f676b9c Azure:

Re: [PR] [HUDI-7503] Compaction and LogCompaction executions should start a heartbeat on every attempt and block concurrent executions of same plan [hudi]

2024-04-15 Thread via GitHub
nsivabalan commented on code in PR #10965: URL: https://github.com/apache/hudi/pull/10965#discussion_r1566489716 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java: ## @@ -1135,8 +1138,36 @@ protected void

Re: [PR] [HUDI-7567] Add schema evolution to the filegroup reader [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10957: URL: https://github.com/apache/hudi/pull/10957#issuecomment-2057876670 ## CI report: * aed811322f7c2a2fb539d293fc93b5054d550835 Azure:

Re: [PR] [HUDI-7269] Fallback to key based merge if positions are missing from log block [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10991: URL: https://github.com/apache/hudi/pull/10991#issuecomment-2057814505 ## CI report: * 9b5a2a5f69fa40f9dbd6e10d0c1c3fe9457b71da Azure:

Re: [PR] [HUDI-7567] Add schema evolution to the filegroup reader [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10957: URL: https://github.com/apache/hudi/pull/10957#issuecomment-2057814333 ## CI report: * 966e8c85f2afb0ffaf00e12d02eb41b41c68e0bc Azure:

(hudi) branch master updated: [HUDI-7566] Add schema evolution to spark file readers (#10956)

2024-04-15 Thread jonvex
jonvex pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new c71ac326b79 [HUDI-7566] Add schema evolution to

Re: [PR] [HUDI-7566] Add schema evolution to spark file readers [hudi]

2024-04-15 Thread via GitHub
jonvex merged PR #10956: URL: https://github.com/apache/hudi/pull/10956

Re: [PR] [HUDI-7566] Add schema evolution to spark file readers [hudi]

2024-04-15 Thread via GitHub
jonvex commented on PR #10956: URL: https://github.com/apache/hudi/pull/10956#issuecomment-2057767464 https://github.com/apache/hudi/assets/26940621/fba674a1-82fc-4b21-ab90-40623835d9f0 azure ci passing

[I] [SUPPORT] Task serialization failed: java.lang.NoSuchMethodError: void org.apache.hudi.common.util.HoodieCommonKryoRegistrar.registerClasses(com.esotericsoftware.kryo.Kryo) [hudi]

2024-04-15 Thread via GitHub
vbogretsov opened a new issue, #11026: URL: https://github.com/apache/hudi/issues/11026

Re: [PR] [HUDI-7618] Add ability to ignore checkpoints in delta streamer [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #11018: URL: https://github.com/apache/hudi/pull/11018#issuecomment-2057697998 ## CI report: * 755ddfdc5d0a02ac1cf1c35fbf5ccd21e1025a31 Azure:

Re: [PR] [HUDI-7582] Fix functional index lookup [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #11021: URL: https://github.com/apache/hudi/pull/11021#issuecomment-2057684920 ## CI report: * 8cdf539f2193660299a3894b59d16a7c2b1a59fb UNKNOWN * 24d5e7f082788b257a42598df5f1d2378e32b041 Azure:

Re: [PR] [HUDI-7566] Add schema evolution to spark file readers [hudi]

2024-04-15 Thread via GitHub
yihua commented on code in PR #10956: URL: https://github.com/apache/hudi/pull/10956#discussion_r1563556374 ## hudi-spark-datasource/hudi-spark3-common/src/main/scala/org/apache/spark/sql/execution/datasources/Spark3ParquetSchemaEvolutionUtils.scala: ## @@ -0,0 +1,194 @@ +/* +

Re: [PR] [HUDI-7269] Fallback to key based merge if positions are missing from log block [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10991: URL: https://github.com/apache/hudi/pull/10991#issuecomment-2057613466 ## CI report: * 2af03c004aef66248dae6283e9c2f1e63e062e75 Azure:

Re: [PR] [DO NOT MERGE][HUDI-7567] Add schema evolution to the filegroup reader [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10957: URL: https://github.com/apache/hudi/pull/10957#issuecomment-2057613346 ## CI report: * 966e8c85f2afb0ffaf00e12d02eb41b41c68e0bc Azure:

Re: [PR] [HUDI-7566] Add schema evolution to spark file readers [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10956: URL: https://github.com/apache/hudi/pull/10956#issuecomment-2057613284 ## CI report: * 8943bb4eaf741096203bed688905977d4bf59160 Azure:

Re: [PR] [DO NOT MERGE][HUDI-7567] Add schema evolution to the filegroup reader [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10957: URL: https://github.com/apache/hudi/pull/10957#issuecomment-2057600734 ## CI report: * 966e8c85f2afb0ffaf00e12d02eb41b41c68e0bc Azure:

Re: [PR] [HUDI-7269] Fallback to key based merge if positions are missing from log block [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10991: URL: https://github.com/apache/hudi/pull/10991#issuecomment-2057600936 ## CI report: * 2af03c004aef66248dae6283e9c2f1e63e062e75 Azure:

Re: [PR] [HUDI-7566] Add schema evolution to spark file readers [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10956: URL: https://github.com/apache/hudi/pull/10956#issuecomment-2057600621 ## CI report: * 8943bb4eaf741096203bed688905977d4bf59160 Azure:

Re: [PR] [HUDI-7566] Add schema evolution to spark file readers [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10956: URL: https://github.com/apache/hudi/pull/10956#issuecomment-2057588207 ## CI report: * 8943bb4eaf741096203bed688905977d4bf59160 Azure:

Re: [PR] [HUDI-7566] Add schema evolution to spark file readers [hudi]

2024-04-15 Thread via GitHub
jonvex commented on code in PR #10956: URL: https://github.com/apache/hudi/pull/10956#discussion_r1566258283 ## hudi-spark-datasource/hudi-spark2/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/Spark24ParquetReader.scala: ## @@ -156,30 +177,51 @@ class

Re: [PR] [HUDI-7566] Add schema evolution to spark file readers [hudi]

2024-04-15 Thread via GitHub
jonvex commented on code in PR #10956: URL: https://github.com/apache/hudi/pull/10956#discussion_r1566257427 ## hudi-spark-datasource/hudi-spark3-common/src/main/scala/org/apache/spark/sql/execution/datasources/Spark3ParquetSchemaEvolutionUtils.scala: ## @@ -0,0 +1,194 @@ +/* +

Re: [PR] [HUDI-7566] Add schema evolution to spark file readers [hudi]

2024-04-15 Thread via GitHub
jonvex commented on code in PR #10956: URL: https://github.com/apache/hudi/pull/10956#discussion_r1566256863 ## hudi-spark-datasource/hudi-spark3-common/src/main/scala/org/apache/spark/sql/execution/datasources/Spark3ParquetSchemaEvolutionUtils.scala: ## @@ -0,0 +1,194 @@ +/* +

Re: [PR] [HUDI-7566] Add schema evolution to spark file readers [hudi]

2024-04-15 Thread via GitHub
jonvex commented on code in PR #10956: URL: https://github.com/apache/hudi/pull/10956#discussion_r1566256714 ## hudi-spark-datasource/hudi-spark3.0.x/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/Spark30ParquetReader.scala: ## @@ -174,7 +190,7 @@ class

Re: [PR] [HUDI-7582] Fix functional index lookup [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #11021: URL: https://github.com/apache/hudi/pull/11021#issuecomment-2057514833 ## CI report: * 8cdf539f2193660299a3894b59d16a7c2b1a59fb UNKNOWN * 24d5e7f082788b257a42598df5f1d2378e32b041 Azure:

Re: [PR] [HUDI-7618] Add ability to ignore checkpoints in delta streamer [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #11018: URL: https://github.com/apache/hudi/pull/11018#issuecomment-2057514710 ## CI report: * c0923360a546fcfd71c0111b9ea29894fa1fe7f3 Azure:

Re: [PR] [HUDI-7566] Add schema evolution to spark file readers [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10956: URL: https://github.com/apache/hudi/pull/10956#issuecomment-2057514363 ## CI report: * be7795021e2cffe600a109448ed02e5860385b9f Azure:

Re: [PR] [HUDI-7582] Fix functional index lookup [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #11021: URL: https://github.com/apache/hudi/pull/11021#issuecomment-2057502357 ## CI report: * 8cdf539f2193660299a3894b59d16a7c2b1a59fb UNKNOWN * 24d5e7f082788b257a42598df5f1d2378e32b041 UNKNOWN

Re: [PR] [HUDI-7618] Add ability to ignore checkpoints in delta streamer [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #11018: URL: https://github.com/apache/hudi/pull/11018#issuecomment-2057502296 ## CI report: * c0923360a546fcfd71c0111b9ea29894fa1fe7f3 Azure:

Re: [PR] [HUDI-7566] Add schema evolution to spark file readers [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10956: URL: https://github.com/apache/hudi/pull/10956#issuecomment-2057501959 ## CI report: * be7795021e2cffe600a109448ed02e5860385b9f Azure:

Re: [PR] [HUDI-7582] Fix functional index lookup [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #11021: URL: https://github.com/apache/hudi/pull/11021#issuecomment-2057490440 ## CI report: * 8cdf539f2193660299a3894b59d16a7c2b1a59fb UNKNOWN

Re: [PR] [HUDI-7618] Add ability to ignore checkpoints in delta streamer [hudi]

2024-04-15 Thread via GitHub
sampan-s-nayak commented on code in PR #11018: URL: https://github.com/apache/hudi/pull/11018#discussion_r1566189979 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/HoodieStreamer.java: ## @@ -424,6 +439,11 @@ public static class Config implements

(hudi) branch master updated: [HUDI-6762] Removed usages of MetadataRecordsGenerationParams (#10962)

2024-04-15 Thread codope
codope pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 7c12decc86c [HUDI-6762] Removed usages of

Re: [PR] [HUDI-6762] Removed usages of MetadataRecordsGenerationParams [hudi]

2024-04-15 Thread via GitHub
codope merged PR #10962: URL: https://github.com/apache/hudi/pull/10962

Re: [PR] [HUDI-6762] Removed usages of MetadataRecordsGenerationParams [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10962: URL: https://github.com/apache/hudi/pull/10962#issuecomment-2057374962 ## CI report: * 04a008a504f9d9f1b8cfb8ae5199a85ec0fca6fe UNKNOWN * faf292c41024255b88f01c0f3193b8fd72a2849d Azure:

[jira] [Created] (HUDI-7620) Support querying multiple functional index in single query

2024-04-15 Thread Sagar Sumit (Jira)
Sagar Sumit created HUDI-7620: - Summary: Support querying multiple functional index in single query Key: HUDI-7620 URL: https://issues.apache.org/jira/browse/HUDI-7620 Project: Apache Hudi Issue

Re: [PR] [HUDI-7566] Add schema evolution to spark file readers [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10956: URL: https://github.com/apache/hudi/pull/10956#issuecomment-2057374664 ## CI report: * be7795021e2cffe600a109448ed02e5860385b9f Azure:

Re: [I] [SUPPORT] Metadata table not cleaned / compacted, log files growing rapidly [hudi]

2024-04-15 Thread via GitHub
Qiuzhuang commented on issue #8567: URL: https://github.com/apache/hudi/issues/8567#issuecomment-2057270718 > if you have any pending/inflight in data table timeline, metadata table compaction will stalled until that gets to completion. may be there is some lingering pending operation

Re: [PR] [HUDI-7566] Add schema evolution to spark file readers [hudi]

2024-04-15 Thread via GitHub
hudi-bot commented on PR #10956: URL: https://github.com/apache/hudi/pull/10956#issuecomment-2057219823 ## CI report: * c8f507bcac03c7183893400487a1885400c46853 Azure:

[jira] [Closed] (HUDI-7378) Fix Spark SQL DML with custom key generator

2024-04-15 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo closed HUDI-7378. --- Resolution: Fixed > Fix Spark SQL DML with custom key generator > ---
