[jira] [Created] (HUDI-7834) Setup table versions to differentiate HUDI 0.16.x and 1.0-beta versions

2024-06-05 Thread Balaji Varadarajan (Jira)
Balaji Varadarajan created HUDI-7834: Summary: Setup table versions to differentiate HUDI 0.16.x and 1.0-beta versions Key: HUDI-7834 URL: https://issues.apache.org/jira/browse/HUDI-7834 Project:

[jira] [Assigned] (HUDI-7834) Setup table versions to differentiate HUDI 0.16.x and 1.0-beta versions

2024-06-05 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan reassigned HUDI-7834: Assignee: Balaji Varadarajan > Setup table versions to differentiate HUDI 0.16.x

Re: [I] [SUPPORT] It failed to compile raw hudi src with error "oodieTableMetadataUtil.java:[189,7] no suitable method found for collect(java.util.stream.Collector

2024-06-05 Thread via GitHub
danny0405 commented on issue #5552: URL: https://github.com/apache/hudi/issues/5552#issuecomment-2151362100 We did have a fix for Windows OS paths with the special backslash. Do you encounter any issues compiling on Windows OS? -- This is an automated message from the Apache Git Service.

Re: [PR] [HUDI-7567] Add schema evolution to the filegroup reader [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #10957: URL: https://github.com/apache/hudi/pull/10957#issuecomment-2151300459 ## CI report: * c98242b22fb2518c0cc93c037df558037030500f UNKNOWN * 1e677e9b8b5d79cb23e85f2577407f9be840c762 Azure:

Re: [I] [SUPPORT] Serde properties missing after migrate from hivesync to gluesync [hudi]

2024-06-05 Thread via GitHub
danny0405 commented on issue #11397: URL: https://github.com/apache/hudi/issues/11397#issuecomment-2151283036 > I have fixed this for our internal use & would like to contribute the same That's great, can you share the patch with us? -- This is an automated message from the Apache

Re: [I] [SUPPORT] [hudi]

2024-06-05 Thread via GitHub
danny0405 commented on issue #11403: URL: https://github.com/apache/hudi/issues/11403#issuecomment-2151279640 I would suggest you use 0.12.3 or 0.14.1; 0.12.1 still has some stability issues. -- This is an automated message from the Apache Git Service. To respond to the message,

[jira] [Updated] (HUDI-6787) Hive Integrate FileGroupReader with HoodieMergeOnReadSnapshotReader and RealtimeCompactedRecordReader for Hive

2024-06-05 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6787: Reviewers: Balaji Varadarajan > Hive Integrate FileGroupReader with HoodieMergeOnReadSnapshotReader and >

[jira] [Closed] (HUDI-7384) Implement writer path support for secondary index

2024-06-05 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-7384. - Fix Version/s: 1.0.0 Resolution: Done > Implement writer path support for secondary index >

[jira] [Updated] (HUDI-7405) Implement reader path support for secondary index

2024-06-05 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-7405: -- Status: In Progress (was: Open) > Implement reader path support for secondary index >

[jira] [Closed] (HUDI-7795) Fix loading of input splits from look up table reader

2024-06-05 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo closed HUDI-7795. --- Resolution: Fixed > Fix loading of input splits from look up table reader >

[jira] [Updated] (HUDI-7405) Implement reader path support for secondary index

2024-06-05 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-7405: -- Status: Patch Available (was: In Progress) > Implement reader path support for secondary index >

[jira] [Updated] (HUDI-7779) Guarding archival to not archive unintended commits

2024-06-05 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7779: Fix Version/s: 0.16.0 1.0.0 > Guarding archival to not archive unintended commits >

[jira] [Updated] (HUDI-7779) Guarding archival to not archive unintended commits

2024-06-05 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7779: Status: In Progress (was: Open) > Guarding archival to not archive unintended commits >

[jira] [Updated] (HUDI-7779) Guarding archival to not archive unintended commits

2024-06-05 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7779: Sprint: 2024/06/03-16 > Guarding archival to not archive unintended commits >

[I] [SUPPORT] [hudi]

2024-06-05 Thread via GitHub
zaminhassnain06 opened a new issue, #11403: URL: https://github.com/apache/hudi/issues/11403 Hi Our organization is migrating from Hudi 0.6.0 to Hudi 0.12.1 and also updating the required spark and EMR versions. Our existing data sets (100s of TBs of data on S3) are written using Hudi

Re: [PR] [HUDI-7567] Add schema evolution to the filegroup reader [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #10957: URL: https://github.com/apache/hudi/pull/10957#issuecomment-2151246128 ## CI report: * c98242b22fb2518c0cc93c037df558037030500f UNKNOWN * e710020df011ae0e9aac4284126dbc226533e6d5 Azure:

Re: [PR] [HUDI-7567] Add schema evolution to the filegroup reader [hudi]

2024-06-05 Thread via GitHub
yihua commented on code in PR #10957: URL: https://github.com/apache/hudi/pull/10957#discussion_r1628627195 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/SparkFileFormatInternalRowReaderContext.scala: ## @@ -73,16 +84,27 @@ class

Re: [PR] [HUDI-7567] Add schema evolution to the filegroup reader [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #10957: URL: https://github.com/apache/hudi/pull/10957#issuecomment-2151233755 ## CI report: * c98242b22fb2518c0cc93c037df558037030500f UNKNOWN * e710020df011ae0e9aac4284126dbc226533e6d5 Azure:

Re: [PR] [HUDI-7567] Add schema evolution to the filegroup reader [hudi]

2024-06-05 Thread via GitHub
yihua commented on code in PR #10957: URL: https://github.com/apache/hudi/pull/10957#discussion_r1628612267 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieHadoopFsRelationFactory.scala: ## @@ -161,15 +167,14 @@ abstract class

(hudi) branch asf-site updated: [HUDI-4967][HUDI-4834] Improve docs for hive sync and glue sync (#11402)

2024-06-05 Thread xushiyan
This is an automated email from the ASF dual-hosted git repository. xushiyan pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 41d1021f8a7 [HUDI-4967][HUDI-4834] Improve

Re: [PR] [HUDI-4967][HUDI-4834] Improve docs for hive sync and glue sync [hudi]

2024-06-05 Thread via GitHub
xushiyan merged PR #11402: URL: https://github.com/apache/hudi/pull/11402 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[jira] [Closed] (HUDI-6633) Add hms based sync to hudi website

2024-06-05 Thread Shiyan Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shiyan Xu closed HUDI-6633. --- Resolution: Fixed > Add hms based sync to hudi website > -- > >

[jira] [Updated] (HUDI-4967) Improve docs for meta sync with TimestampBasedKeyGenerator

2024-06-05 Thread Shiyan Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shiyan Xu updated HUDI-4967: Status: Open (was: Patch Available) > Improve docs for meta sync with TimestampBasedKeyGenerator >

[jira] [Closed] (HUDI-4834) Update AWSGlueCatalog syncing page to add spark datasource example

2024-06-05 Thread Shiyan Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shiyan Xu closed HUDI-4834. --- Fix Version/s: 1.0.0 Resolution: Fixed > Update AWSGlueCatalog syncing page to add spark datasource

[jira] [Closed] (HUDI-4967) Improve docs for meta sync with TimestampBasedKeyGenerator

2024-06-05 Thread Shiyan Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shiyan Xu closed HUDI-4967. --- Fix Version/s: 1.0.0 Resolution: Fixed > Improve docs for meta sync with TimestampBasedKeyGenerator >

Re: [PR] [HUDI-7567] Add schema evolution to the filegroup reader [hudi]

2024-06-05 Thread via GitHub
yihua commented on code in PR #10957: URL: https://github.com/apache/hudi/pull/10957#discussion_r1628560599 ## hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieMergedLogRecordReader.java: ## @@ -343,19 +310,19 @@ public Builder

Re: [PR] [HUDI-4967][HUDI-4834] Improve docs for hive sync and glue sync [hudi]

2024-06-05 Thread via GitHub
xushiyan commented on PR #11402: URL: https://github.com/apache/hudi/pull/11402#issuecomment-2151191188 ![screencapture-localhost-3000-docs-next-syncing-aws-glue-data-catalog-2024-06-05-19_51_01](https://github.com/apache/hudi/assets/2701446/ca644d33-870c-4a0e-9515-e4c647fb3646) --

Re: [PR] [HUDI-4967][HUDI-4834] Improve docs for hive sync and glue sync [hudi]

2024-06-05 Thread via GitHub
xushiyan commented on PR #11402: URL: https://github.com/apache/hudi/pull/11402#issuecomment-2151190860 ![screencapture-localhost-3000-docs-next-syncing-metastore-2024-06-05-19_52_22](https://github.com/apache/hudi/assets/2701446/81929e63-3831-45d1-8303-07f0139840b9) -- This is an

Re: [PR] [HUDI-7567] Add schema evolution to the filegroup reader [hudi]

2024-06-05 Thread via GitHub
jonvex commented on code in PR #10957: URL: https://github.com/apache/hudi/pull/10957#discussion_r1628598510 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/SparkFileFormatInternalRowReaderContext.scala: ## @@ -46,21 +49,27 @@ import scala.collection.mutable *

[jira] [Created] (HUDI-7833) Validate that fg reader works with nested column as record key

2024-06-05 Thread Jonathan Vexler (Jira)
Jonathan Vexler created HUDI-7833: - Summary: Validate that fg reader works with nested column as record key Key: HUDI-7833 URL: https://issues.apache.org/jira/browse/HUDI-7833 Project: Apache Hudi

[jira] [Updated] (HUDI-6230) Make hive sync aws support partition indexes

2024-06-05 Thread Shiyan Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shiyan Xu updated HUDI-6230: Fix Version/s: 0.15.0 > Make hive sync aws support partition indexes >

Re: [PR] [HUDI-1234] DO NOT MERGE use fg reader in cdc test [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11401: URL: https://github.com/apache/hudi/pull/11401#issuecomment-2151112254 ## CI report: * a4f3d9a64cc59f67bda1b9f9e045774b29213d2c Azure:

Re: [PR] [HUDI-7567] Add schema evolution to the filegroup reader [hudi]

2024-06-05 Thread via GitHub
yihua commented on code in PR #10957: URL: https://github.com/apache/hudi/pull/10957#discussion_r1628543356 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/SparkFileFormatInternalRowReaderContext.scala: ## @@ -101,46 +121,150 @@ class

[jira] [Closed] (HUDI-1964) Update guide around hive metastore and hive sync for hudi tables

2024-06-05 Thread Shiyan Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shiyan Xu closed HUDI-1964. --- Resolution: Duplicate > Update guide around hive metastore and hive sync for hudi tables >

[jira] [Updated] (HUDI-1964) Update guide around hive metastore and hive sync for hudi tables

2024-06-05 Thread Shiyan Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shiyan Xu updated HUDI-1964: Fix Version/s: 1.0.0 > Update guide around hive metastore and hive sync for hudi tables >

[jira] [Updated] (HUDI-6633) Add hms based sync to hudi website

2024-06-05 Thread Shiyan Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shiyan Xu updated HUDI-6633: Fix Version/s: 1.0.0 > Add hms based sync to hudi website > -- > >

[jira] [Closed] (HUDI-851) Add Documentation on partitioning data with examples and details on how to sync to Hive

2024-06-05 Thread Shiyan Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shiyan Xu closed HUDI-851. -- Fix Version/s: 1.0.0 (was: 0.15.0) Resolution: Duplicate > Add Documentation on

Re: [PR] [HUDI-1234] DO NOT MERGE use fg reader in cdc test [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11401: URL: https://github.com/apache/hudi/pull/11401#issuecomment-2151067173 ## CI report: * a4f3d9a64cc59f67bda1b9f9e045774b29213d2c Azure:

Re: [PR] [HUDI-1234] DO NOT MERGE use fg reader in cdc test [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11401: URL: https://github.com/apache/hudi/pull/11401#issuecomment-2151058358 ## CI report: * a4f3d9a64cc59f67bda1b9f9e045774b29213d2c UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

(hudi) branch master updated (a1ba9728310 -> 44922f160bd)

2024-06-05 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from a1ba9728310 [HUDI-7414] Remove redundant base path config in BQ sync (#11395) add 44922f160bd [MINOR] Allow

Re: [PR] [MINOR] Allow recreation of metrics instance for base path [hudi]

2024-06-05 Thread via GitHub
yihua merged PR #11400: URL: https://github.com/apache/hudi/pull/11400 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [PR] [MINOR] Allow recreation of metrics instance for base path [hudi]

2024-06-05 Thread via GitHub
yihua commented on PR #11400: URL: https://github.com/apache/hudi/pull/11400#issuecomment-2151045404 Azure CI is green. https://github.com/apache/hudi/assets/2497195/8e77a102-fefa-44d3-9c8d-366546204d28 -- This is an automated message from the Apache Git Service. To respond to

Re: [PR] [HUDI-7567] Add schema evolution to the filegroup reader [hudi]

2024-06-05 Thread via GitHub
yihua commented on code in PR #10957: URL: https://github.com/apache/hudi/pull/10957#discussion_r1628436290 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/SparkFileFormatInternalRowReaderContext.scala: ## @@ -46,21 +49,27 @@ import scala.collection.mutable *

[PR] [HUDI-1234] DO NOT MERGE use fg reader in cdc test [hudi]

2024-06-05 Thread via GitHub
jonvex opened a new pull request, #11401: URL: https://github.com/apache/hudi/pull/11401 ### Change Logs use fg reader in cdc run ci ### Impact step in making cdc reading engine agnostic ### Risk level (write none, low medium or high below) low ###

[jira] [Assigned] (HUDI-7832) Refactor Deltastreamer S3/GCP Events Source to allow adding auxiliary columns from upstream.

2024-06-05 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan reassigned HUDI-7832: Assignee: Balaji Varadarajan > Refactor Deltastreamer S3/GCP Events Source to

[jira] [Created] (HUDI-7832) Refactor Deltastreamer S3/GCP Events Source to allow adding auxiliary columns from upstream.

2024-06-05 Thread Balaji Varadarajan (Jira)
Balaji Varadarajan created HUDI-7832: Summary: Refactor Deltastreamer S3/GCP Events Source to allow adding auxiliary columns from upstream. Key: HUDI-7832 URL: https://issues.apache.org/jira/browse/HUDI-7832

Re: [PR] [HUDI-7567] Add schema evolution to the filegroup reader [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #10957: URL: https://github.com/apache/hudi/pull/10957#issuecomment-2150997634 ## CI report: * c98242b22fb2518c0cc93c037df558037030500f UNKNOWN * e710020df011ae0e9aac4284126dbc226533e6d5 Azure:

Re: [PR] [MINOR] Allow recreation of metrics instance for base path [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11400: URL: https://github.com/apache/hudi/pull/11400#issuecomment-2150977762 ## CI report: * 8f7123807feaa88d95dcc289364e2b8f15b43553 Azure:

[jira] [Closed] (HUDI-7414) Remove hoodie.gcp.bigquery.sync.base_path reference in the gcp docs

2024-06-05 Thread Shiyan Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shiyan Xu closed HUDI-7414. --- Resolution: Fixed > Remove hoodie.gcp.bigquery.sync.base_path reference in the gcp docs >

[jira] [Updated] (HUDI-7414) Remove hoodie.gcp.bigquery.sync.base_path reference in the gcp docs

2024-06-05 Thread Shiyan Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shiyan Xu updated HUDI-7414: Fix Version/s: 1.0.0 (was: 0.15.0) > Remove hoodie.gcp.bigquery.sync.base_path

Re: [PR] [MINOR] Allow recreation of metrics instance for base path [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11400: URL: https://github.com/apache/hudi/pull/11400#issuecomment-2150916443 ## CI report: * 8f7123807feaa88d95dcc289364e2b8f15b43553 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] [HUDI-6787] DO NOT MERGE. Test hive3 in ghactions bootstrap [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11399: URL: https://github.com/apache/hudi/pull/11399#issuecomment-2150916361 ## CI report: * 7bc15adec04d8b680ed83b532803ceef350d51a6 Azure:

Re: [PR] [HUDI-7567] Add schema evolution to the filegroup reader [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #10957: URL: https://github.com/apache/hudi/pull/10957#issuecomment-2150915206 ## CI report: * c98242b22fb2518c0cc93c037df558037030500f UNKNOWN * 11862a3bd3b84cb12b0abcf8a399d2bfb56870b3 Azure:

Re: [PR] [HUDI-7567] Add schema evolution to the filegroup reader [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #10957: URL: https://github.com/apache/hudi/pull/10957#issuecomment-2150904022 ## CI report: * c98242b22fb2518c0cc93c037df558037030500f UNKNOWN * 11862a3bd3b84cb12b0abcf8a399d2bfb56870b3 Azure:

[PR] [MINOR] Allow recreation of metrics instance for base path [hudi]

2024-06-05 Thread via GitHub
the-other-tim-brown opened a new pull request, #11400: URL: https://github.com/apache/hudi/pull/11400 ### Change Logs - Removes metrics entry from map when it is shutdown ### Impact Allows proper recreation of metrics instance if it was previously shutdown. This can be

Re: [PR] [HUDI-6787] DO NOT MERGE. Test hive3 in ghactions bootstrap [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11399: URL: https://github.com/apache/hudi/pull/11399#issuecomment-2150893311 ## CI report: * 2e1f5f9da800d048b39b5f119038191b9f277396 Azure:

Re: [PR] [HUDI-6787] DO NOT MERGE. Test hive3 in ghactions [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11398: URL: https://github.com/apache/hudi/pull/11398#issuecomment-2150893255 ## CI report: * da8e1320dc7b7e18a35319a32342f96eff646518 UNKNOWN * ea23061e800c02c8814d50efddf303edad448be2 Azure:

Re: [PR] [HUDI-7567] Add schema evolution to the filegroup reader [hudi]

2024-06-05 Thread via GitHub
jonvex commented on code in PR #10957: URL: https://github.com/apache/hudi/pull/10957#discussion_r1628362032 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/SparkFileFormatInternalRowReaderContext.scala: ## @@ -101,46 +121,150 @@ class

Re: [PR] [HUDI-6787] DO NOT MERGE. Test hive3 in ghactions bootstrap [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11399: URL: https://github.com/apache/hudi/pull/11399#issuecomment-2150809888 ## CI report: * 2e1f5f9da800d048b39b5f119038191b9f277396 Azure:

Re: [PR] [HUDI-6787] DO NOT MERGE. Test hive3 in ghactions [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11398: URL: https://github.com/apache/hudi/pull/11398#issuecomment-2150809795 ## CI report: * da8e1320dc7b7e18a35319a32342f96eff646518 UNKNOWN * 723b5a29eb4a7f872bb4436f8d6c612edf97a4d4 Azure:

Re: [PR] [HUDI-6787] DO NOT MERGE. Test hive3 in ghactions [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11398: URL: https://github.com/apache/hudi/pull/11398#issuecomment-2150796622 ## CI report: * da8e1320dc7b7e18a35319a32342f96eff646518 UNKNOWN * 723b5a29eb4a7f872bb4436f8d6c612edf97a4d4 Azure:

Re: [PR] [HUDI-6787] DO NOT MERGE. Test hive3 in ghactions bootstrap [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11399: URL: https://github.com/apache/hudi/pull/11399#issuecomment-2150796674 ## CI report: * 2e1f5f9da800d048b39b5f119038191b9f277396 Azure:

Re: [PR] [HUDI-7146] Integrate secondary index on reader path [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11162: URL: https://github.com/apache/hudi/pull/11162#issuecomment-2150795998 ## CI report: * b342d8f8e10f77419bf1bd0bc9f626a596ad65f9 UNKNOWN * 8a9986ae4b8712c0e2e700aeb40a1e4c041fde0e Azure:

Re: [PR] [HUDI-6787] DO NOT MERGE. Test hive3 in ghactions [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11398: URL: https://github.com/apache/hudi/pull/11398#issuecomment-2150783095 ## CI report: * da8e1320dc7b7e18a35319a32342f96eff646518 UNKNOWN * 723b5a29eb4a7f872bb4436f8d6c612edf97a4d4 Azure:

Re: [PR] [HUDI-7146] Integrate secondary index on reader path [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11162: URL: https://github.com/apache/hudi/pull/11162#issuecomment-2150782235 ## CI report: * b342d8f8e10f77419bf1bd0bc9f626a596ad65f9 UNKNOWN * 7a0a21f67d6cfc5a17cd1e04abec99dfb6fd53f1 Azure:

Re: [PR] [HUDI-7567] Add schema evolution to the filegroup reader [hudi]

2024-06-05 Thread via GitHub
linliu-code commented on code in PR #10957: URL: https://github.com/apache/hudi/pull/10957#discussion_r1628272787 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/SparkFileFormatInternalRowReaderContext.scala: ## @@ -101,46 +121,150 @@ class

Re: [PR] [HUDI-6787] DO NOT MERGE. Test hive3 in ghactions [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11398: URL: https://github.com/apache/hudi/pull/11398#issuecomment-2150703254 ## CI report: * da8e1320dc7b7e18a35319a32342f96eff646518 UNKNOWN * 723b5a29eb4a7f872bb4436f8d6c612edf97a4d4 UNKNOWN Bot commands @hudi-bot supports the

Re: [PR] [HUDI-6787] DO NOT MERGE. Test hive3 in ghactions bootstrap [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11399: URL: https://github.com/apache/hudi/pull/11399#issuecomment-2150703293 ## CI report: * 2e1f5f9da800d048b39b5f119038191b9f277396 Azure:

Re: [PR] [HUDI-7146] Integrate secondary index on reader path [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11162: URL: https://github.com/apache/hudi/pull/11162#issuecomment-2150702446 ## CI report: * b342d8f8e10f77419bf1bd0bc9f626a596ad65f9 UNKNOWN * 7a0a21f67d6cfc5a17cd1e04abec99dfb6fd53f1 Azure:

Re: [PR] [HUDI-6787] DO NOT MERGE. Test hive3 in ghactions bootstrap [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11399: URL: https://github.com/apache/hudi/pull/11399#issuecomment-2150688904 ## CI report: * 2e1f5f9da800d048b39b5f119038191b9f277396 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] [HUDI-6787] DO NOT MERGE. Test hive3 in ghactions [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11398: URL: https://github.com/apache/hudi/pull/11398#issuecomment-2150688823 ## CI report: * da8e1320dc7b7e18a35319a32342f96eff646518 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] [HUDI-7146] Integrate secondary index on reader path [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11162: URL: https://github.com/apache/hudi/pull/11162#issuecomment-2150688074 ## CI report: * b342d8f8e10f77419bf1bd0bc9f626a596ad65f9 UNKNOWN * c6d07ea56ebf1c7eaeb9306df8fe0dd366d72abe Azure:

[jira] [Commented] (HUDI-7829) storage partition stats index can not take effect in data skipping

2024-06-05 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17852528#comment-17852528 ] Sagar Sumit commented on HUDI-7829: --- Thanks for creating the issue. Will take a look. > storage

[jira] [Assigned] (HUDI-7829) storage partition stats index can not take effect in data skipping

2024-06-05 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit reassigned HUDI-7829: - Assignee: Sagar Sumit > storage partition stats index can not take effect in data skipping >

[jira] [Updated] (HUDI-7829) storage partition stats index can not take effect in data skipping

2024-06-05 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-7829: -- Fix Version/s: 1.0.0 > storage partition stats index can not take effect in data skipping >

Re: [PR] [HUDI-7146] Integrate secondary index on reader path [hudi]

2024-06-05 Thread via GitHub
codope commented on code in PR #11162: URL: https://github.com/apache/hudi/pull/11162#discussion_r1628207181 ## hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataLogRecordReader.java: ## @@ -253,7 +253,7 @@ public HoodieMetadataLogRecordReader build() { }

Re: [PR] [HUDI-7146] Integrate secondary index on reader path [hudi]

2024-06-05 Thread via GitHub
codope commented on code in PR #11162: URL: https://github.com/apache/hudi/pull/11162#discussion_r1628193853 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/common/HoodieSparkEngineContext.java: ## @@ -229,6 +231,13 @@ public void cancelAllJobs() {

Re: [PR] [HUDI-7146] Integrate secondary index on reader path [hudi]

2024-06-05 Thread via GitHub
codope commented on code in PR #11162: URL: https://github.com/apache/hudi/pull/11162#discussion_r1628190224 ## hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestSecondaryIndexWithSql.scala: ## @@ -95,4 +97,39 @@ class TestSecondaryIndexWithSql

[PR] [HUDI-6787] DO NOT MERGE. Test hive3 in ghactions bootstrap [hudi]

2024-06-05 Thread via GitHub
jonvex opened a new pull request, #11399: URL: https://github.com/apache/hudi/pull/11399 ### Change Logs Testing hive 3 bootstrap read using the bundle validation setup ### Impact see if hive 3 works as expected ### Risk level (write none, low medium or high

Re: [I] [SUPPORT] using spark's observe feature on dataframes saved by hudi is stuck [hudi]

2024-06-05 Thread via GitHub
szingerpeter commented on issue #11367: URL: https://github.com/apache/hudi/issues/11367#issuecomment-2150610226 @ad1happy2go , thank you! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[PR] [HUDI-6787] DO NOT MERGE. Test hive3 in ghactions [hudi]

2024-06-05 Thread via GitHub
jonvex opened a new pull request, #11398: URL: https://github.com/apache/hudi/pull/11398 ### Change Logs Testing hive 3 using the bundle validation setup ### Impact see if hive 3 works as expected ### Risk level (write none, low medium or high below) none

[jira] [Created] (HUDI-7831) Support secondary index reads using native HFile reader

2024-06-05 Thread Sagar Sumit (Jira)
Sagar Sumit created HUDI-7831: - Summary: Support secondary index reads using native HFile reader Key: HUDI-7831 URL: https://issues.apache.org/jira/browse/HUDI-7831 Project: Apache Hudi Issue

Re: [PR] [HUDI-7747] In MetaClient remove getBasePathV2() and return StoragePath from getBasePath() [hudi]

2024-06-05 Thread via GitHub
wombatu-kun commented on PR #11385: URL: https://github.com/apache/hudi/pull/11385#issuecomment-2150016743 > Let me know if you prefer to address the `toString()` calls in this PR. Also, could you raise another PR against `branch-0.x` with the same changes? hi, @yihua ! I've made it

Re: [PR] [HUDI-7747] In MetaClient remove getBasePathV2() and return StoragePath from getBasePath() [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11385: URL: https://github.com/apache/hudi/pull/11385#issuecomment-2149867032 ## CI report: * 0b9134e14a349ac70defc972dd67e464c0506ae1 Azure:

Re: [I] [SUPPORT] Unable to Use DynamoDB Based Lock with Hudi PySpark Job Locally [hudi]

2024-06-05 Thread via GitHub
soumilshah1995 commented on issue #11391: URL: https://github.com/apache/hudi/issues/11391#issuecomment-2149868601 Added following packages ``` HUDI_VERSION = '0.14.0' SPARK_VERSION = '3.4' os.environ["JAVA_HOME"] = "/opt/homebrew/opt/openjdk@11" SUBMIT_ARGS =

Re: [PR] [HUDI-7830] Add predicate filter pruning for snapshot queries in hudi related sources [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11396: URL: https://github.com/apache/hudi/pull/11396#issuecomment-2149867215 ## CI report: * d39943c1608d0a18e25e8b13f9bf6900c684253f Azure:

Re: [I] [SUPPORT] Unable to Use DynamoDB Based Lock with Hudi PySpark Job Locally [hudi]

2024-06-05 Thread via GitHub
soumilshah1995 commented on issue #11391: URL: https://github.com/apache/hudi/issues/11391#issuecomment-2149831661 # Code ``` HUDI_VERSION = '0.14.0' SPARK_VERSION = '3.4' os.environ["JAVA_HOME"] = "/opt/homebrew/opt/openjdk@11" AWS_JAR_FILES =

Re: [I] [SUPPORT] Serde properties missing after migrate from hivesync to gluesync [hudi]

2024-06-05 Thread via GitHub
prathit06 commented on issue #11397: URL: https://github.com/apache/hudi/issues/11397#issuecomment-2149827641 I have fixed this for our internal use & would like to contribute the same. Kindly access & let me know if any other information is required on the same. -- This is an automated

[I] [SUPPORT] Serde properties missing after migrate from hivesync to gluesync [hudi]

2024-06-05 Thread via GitHub
prathit06 opened a new issue, #11397: URL: https://github.com/apache/hudi/issues/11397 **Describe the problem you faced** - We used hive sync to sync tables to glue for hudi version 0.8, 0.10.0, 0.11.1. After some time we started using glue sync in hudi version 0.11.1 & have recently

Re: [I] [SUPPORT] Unable to Use DynamoDB Based Lock with Hudi PySpark Job Locally [hudi]

2024-06-05 Thread via GitHub
soumilshah1995 closed issue #11391: [SUPPORT] Unable to Use DynamoDB Based Lock with Hudi PySpark Job Locally URL: https://github.com/apache/hudi/issues/11391 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [I] [SUPPORT] Unable to Use DynamoDB Based Lock with Hudi PySpark Job Locally [hudi]

2024-06-05 Thread via GitHub
soumilshah1995 commented on issue #11391: URL: https://github.com/apache/hudi/issues/11391#issuecomment-2149620227 oh let me try this and update the thread shortly -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] [HUDI-7830] Add predicate filter pruning for snapshot queries in hudi related sources [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11396: URL: https://github.com/apache/hudi/pull/11396#issuecomment-2149598890 ## CI report: * 5dc3a94d9c3acb593b0c993e7ffa3b415e917774 Azure:

Re: [PR] [HUDI-7747] In MetaClient remove getBasePathV2() and return StoragePath from getBasePath() [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11385: URL: https://github.com/apache/hudi/pull/11385#issuecomment-2149598779 ## CI report: * 064b5310f709e5886dd7e278d1ebf9cdcfbe70c7 Azure:

Re: [PR] [HUDI-7747] In MetaClient remove getBasePathV2() and return StoragePath from getBasePath() [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11385: URL: https://github.com/apache/hudi/pull/11385#issuecomment-2149582706 ## CI report: * 064b5310f709e5886dd7e278d1ebf9cdcfbe70c7 Azure:

Re: [PR] [HUDI-7830] Add predicate filter pruning for snapshot queries in hudi related sources [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11396: URL: https://github.com/apache/hudi/pull/11396#issuecomment-2149582843 ## CI report: * 5dc3a94d9c3acb593b0c993e7ffa3b415e917774 Azure:

Re: [PR] [MINOR] Fix operation total io should not exceed the target io limit [hudi]

2024-06-05 Thread via GitHub
KnightChess commented on code in PR #11174: URL: https://github.com/apache/hudi/pull/11174#discussion_r1627542400 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/strategy/BoundedIOCompactionStrategy.java: ## @@ -44,10 +44,10 @@ public List

Re: [PR] [MINOR] Fix operation total io should not exceed the target io limit [hudi]

2024-06-05 Thread via GitHub
KnightChess commented on code in PR #11174: URL: https://github.com/apache/hudi/pull/11174#discussion_r1627542176 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/strategy/BoundedIOCompactionStrategy.java: ## @@ -44,10 +44,10 @@ public List

Re: [PR] [HUDI-7830] Add predicate filter pruning for snapshot queries in hudi related sources [hudi]

2024-06-05 Thread via GitHub
hudi-bot commented on PR #11396: URL: https://github.com/apache/hudi/pull/11396#issuecomment-2149449688 ## CI report: * 5dc3a94d9c3acb593b0c993e7ffa3b415e917774 Azure:
