Re: [I] [SUPPORT]hive_sync.skip_ro_suffix looks not working as expected [hudi]

2024-09-25 Thread via GitHub
ad1happy2go commented on issue #12011: URL: https://github.com/apache/hudi/issues/12011#issuecomment-2376082279 Thanks for raising this, @bithw1. I confirmed this is a regression in release 0.15.0. With 0.14.X versions it works fine. I will create a JIRA for this. -- This is an automa

Re: [PR] Fix for CVE-2023-39410 and CVE-2020-13956 [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12010: URL: https://github.com/apache/hudi/pull/12010#issuecomment-2376062800 ## CI report: * 33da3523bba3d0ccb40ef343bd664501f351f126 Azure: [SUCCESS](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=796)

Re: [PR] [HUDI-8188] Add validation for partition stats index in HoodieMetadataTableValidator [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #11921: URL: https://github.com/apache/hudi/pull/11921#issuecomment-2376014374 ## CI report: * faed95611cba8f5b510a37a0aff0f9f1d2f287f2 Azure: [SUCCESS](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=795)

[I] [SUPPORT]hive_sync.skip_ro_suffix looks not working as expected [hudi]

2024-09-25 Thread via GitHub
bithw1 opened a new issue, #12011: URL: https://github.com/apache/hudi/issues/12011 Hi, I am using Hudi 0.15.0, and I find that the option `hoodie.datasource.hive_sync.skip_ro_suffix` does not seem to work as expected. I first created a MOR table and set hoodie.datasource.hive_sync.s
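For readers unfamiliar with the option, the following is a minimal sketch of the naming behavior the reporter appears to expect, based on the documented intent of `hoodie.datasource.hive_sync.skip_ro_suffix` (skip the `_ro` suffix when registering the read-optimized view of a merge-on-read table). The function `expected_hive_table_names` is hypothetical and is not part of Hudi; it models the expectation only, not what 0.15.0 actually does, and other sync settings may change the names.

```python
def expected_hive_table_names(base_name, table_type, skip_ro_suffix=False):
    """Model the Hive table names a user would expect Hive sync to register.

    Hypothetical helper for illustration; not Hudi code.
    """
    kind = table_type.upper()
    if kind == "COPY_ON_WRITE":
        # A copy-on-write table is registered under its own name.
        return [base_name]
    if kind == "MERGE_ON_READ":
        # A merge-on-read table gets a read-optimized and a real-time view.
        ro_name = base_name if skip_ro_suffix else base_name + "_ro"
        return [ro_name, base_name + "_rt"]
    raise ValueError("unknown table type: " + table_type)


if __name__ == "__main__":
    print(expected_hive_table_names("orders", "MERGE_ON_READ"))
    print(expected_hive_table_names("orders", "MERGE_ON_READ", skip_ro_suffix=True))
```

Under this model, a regression of the kind confirmed in the thread would show up as the read-optimized view still being registered with the `_ro` suffix even though the option is set to true.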

Re: [PR] Fix for CVE-2023-39410 and CVE-2020-13956 [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12010: URL: https://github.com/apache/hudi/pull/12010#issuecomment-2375948486 ## CI report: * 33da3523bba3d0ccb40ef343bd664501f351f126 Azure: [PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=796)

Re: [PR] Fix for CVE-2023-39410 and CVE-2020-13956 [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12010: URL: https://github.com/apache/hudi/pull/12010#issuecomment-2375935242 ## CI report: * 33da3523bba3d0ccb40ef343bd664501f351f126 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run th

[PR] Fix for CVE-2023-39410 and CVE-2020-13956 [hudi]

2024-09-25 Thread via GitHub
mehradpk opened a new pull request, #12010: URL: https://github.com/apache/hudi/pull/12010 Upgrade httpclient version to 4.5.13 Upgrade avro version to 1.11.3 **Reference PR** - https://github.com/apache/hudi/pull/11964 ### Change Logs This issue will address the belo

Re: [PR] [HUDI-8188] Add validation for partition stats index in HoodieMetadataTableValidator [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #11921: URL: https://github.com/apache/hudi/pull/11921#issuecomment-2375874096 ## CI report: * c4403271f24e75fcb2b5d3c06d6eb84267c7c351 Azure: [FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=787)

Re: [PR] [HUDI-8188] Add validation for partition stats index in HoodieMetadataTableValidator [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #11921: URL: https://github.com/apache/hudi/pull/11921#issuecomment-2375875355 ## CI report: * c4403271f24e75fcb2b5d3c06d6eb84267c7c351 Azure: [FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=787)

Re: [I] [SUPPORT] Hoodie Metadata exception while upserting using Spark Structured Streaming [hudi]

2024-09-25 Thread via GitHub
dataproblems commented on issue #11997: URL: https://github.com/apache/hudi/issues/11997#issuecomment-2375829985 Let me try beta2 and report back. @yihua - I tried using `s3a` and it immediately gave me a null pointer exception ``` Caused by: org.apache.hudi.exception.H

Re: [I] [SUPPORT] Hoodie Metadata exception while upserting using Spark Structured Streaming [hudi]

2024-09-25 Thread via GitHub
danny0405 commented on issue #11997: URL: https://github.com/apache/hudi/issues/11997#issuecomment-2375787302 yeah, the fix should be included in beta2 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to g

Re: [I] [SUPPORT] Hoodie Metadata exception while upserting using Spark Structured Streaming [hudi]

2024-09-25 Thread via GitHub
dataproblems commented on issue #11997: URL: https://github.com/apache/hudi/issues/11997#issuecomment-2375782170 @danny0405 I'm using 1.0.0-beta1, are you saying that I would need 1.0.0-beta2?

[I] [SUPPORT]When saveAsTable: java.lang.IllegalArgumentException: Partition-path field has to be non-empty! [hudi]

2024-09-25 Thread via GitHub
bithw1 opened a new issue, #12009: URL: https://github.com/apache/hudi/issues/12009 Hi, I am using 0.15.0. I am using the following code snippet in the spark-shell, trying to save the Spark DataFrame as a Hudi table. When I run the code, an exception, the full exception stack trace is pa

Re: [PR] [HUDI-6909] Use isDelete instead of isDeleted function for Spark merger [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12007: URL: https://github.com/apache/hudi/pull/12007#issuecomment-2375635896 ## CI report: * c08c32127aad68da6114fbf407616cd7acba2791 Azure: [SUCCESS](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=794)

[I] [SUPPORT]Why ComplexKeyGenerator doesn't work consistently when generating record key [hudi]

2024-09-25 Thread via GitHub
bithw1 opened a new issue, #12008: URL: https://github.com/apache/hudi/issues/12008 Hi, I am using Hudi 0.15 and trying ComplexKeyGenerator 1. record key has one field, and use ComplexKeyGenerator as the key generator. In this case, it will use the value of the record key f

Re: [I] [SUPPORT] two insert into operation works like upsert [hudi]

2024-09-25 Thread via GitHub
bithw1 closed issue #11996: [SUPPORT] two insert into operation works like upsert URL: https://github.com/apache/hudi/issues/11996

Re: [I] [SUPPORT] two insert into operation works like upsert [hudi]

2024-09-25 Thread via GitHub
bithw1 commented on issue #11996: URL: https://github.com/apache/hudi/issues/11996#issuecomment-2375605864 Thanks @KnightChess

[jira] [Updated] (HUDI-7507) ongoing concurrent writers with smaller timestamp can cause issues with table services

2024-09-25 Thread Kate Huber (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kate Huber updated HUDI-7507: - Reviewers: Danny Chen, Ethan Guo (this is the old account; please use "yihua") (was: Ethan Guo (this is t

[jira] [Updated] (HUDI-8260) Fix col stats metadata validation so that log files are also validated

2024-09-25 Thread Kate Huber (Jira)
[ https://issues.apache.org/jira/browse/HUDI-8260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kate Huber updated HUDI-8260: - Story Points: 4 > Fix col stats metadata validation so that log files are also validated > ---

Re: [PR] [HUDI-8160] Verify the consistency of the user-defined schema and the existing hoodie scheme when creating the hoodie table [hudi]

2024-09-25 Thread via GitHub
danny0405 commented on PR #11869: URL: https://github.com/apache/hudi/pull/11869#issuecomment-2375558709 @huangxiaopingRD Hi, we still have Jenkins CI failures.

Re: [PR] [HUDI-7848] Prevent ordering value cast exceptions in filegroup reader [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12006: URL: https://github.com/apache/hudi/pull/12006#issuecomment-2375556279 ## CI report: * 5470f7aa1fae582baf89740c9c889d87e620dd39 UNKNOWN * 4a619de91574abee820734d942ea238fec645e13 UNKNOWN * 0e0b31531f37a69f591db588f039ac40ac8e0cfb UNKNOWN *

Re: [PR] [HUDI-6909] Use isDelete instead of isDeleted function for Spark merger [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12007: URL: https://github.com/apache/hudi/pull/12007#issuecomment-2375547379 ## CI report: * b416e962b65eed32d8c2d219fa8caf7310a5e951 Azure: [CANCELED](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=793)

Re: [PR] [HUDI-6909] Use isDelete instead of isDeleted function for Spark merger [hudi]

2024-09-25 Thread via GitHub
linliu-code commented on code in PR #12007: URL: https://github.com/apache/hudi/pull/12007#discussion_r1776165910 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/DefaultSparkRecordMerger.java: ## @@ -50,7 +50,7 @@ public Option> merge(HoodieRecord older, Schema o

Re: [PR] [HUDI-6909] Use isDelete instead of isDeleted function for Spark merger [hudi]

2024-09-25 Thread via GitHub
danny0405 commented on code in PR #12007: URL: https://github.com/apache/hudi/pull/12007#discussion_r1776159980 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/DefaultSparkRecordMerger.java: ## @@ -50,7 +50,7 @@ public Option> merge(HoodieRecord older, Schema old

Re: [I] [SUPPORT] two insert into operation works like upsert [hudi]

2024-09-25 Thread via GitHub
KnightChess commented on issue #11996: URL: https://github.com/apache/hudi/issues/11996#issuecomment-2375529288 @bithw1 My mistake: `hoodie.combine.before.insert` controls the incoming records, and your inserts are two separate SQL statements. You can set `set hoodie.spark.sql.insert.into.operation = insert`
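The distinction in this reply can be illustrated with a toy model, sketched here under stated assumptions: `apply_batches` is a hypothetical function, records are plain `(key, value)` pairs, and nothing below calls Hudi. It only shows why two separate INSERT INTO statements with the same record key look like an upsert under one write operation and like plain appends under the other.

```python
def apply_batches(batches, operation):
    """Apply write batches to an in-memory 'table' of (key, value) pairs.

    Toy model for illustration; not Hudi code.
    """
    if operation not in ("insert", "upsert"):
        raise ValueError("unsupported operation: " + operation)
    table = []
    for batch in batches:
        if operation == "upsert":
            # Upsert: rows already in the table with a matching key are replaced.
            incoming_keys = {key for key, _ in batch}
            table = [row for row in table if row[0] not in incoming_keys]
        # Insert: rows are appended without looking at existing keys.
        table.extend(batch)
    return table


if __name__ == "__main__":
    two_statements = [[(1, "first")], [(1, "second")]]
    print(apply_batches(two_statements, "upsert"))
    print(apply_batches(two_statements, "insert"))
```

In this model, deduplicating within a single incoming batch (which is what the reply attributes to `hoodie.combine.before.insert`) would not change the outcome, because each statement carries only one record for the key.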

Re: [PR] [HUDI-8179] Upgrade hudi flink connector to 1.20.0 [hudi]

2024-09-25 Thread via GitHub
danny0405 commented on PR #11966: URL: https://github.com/apache/hudi/pull/11966#issuecomment-2375528233 @guptashailesh92 Thanks for the contribution, have you already referenced https://github.com/apache/hudi/pull/11779 for the changes? Especially we need to upload the docker images for bu

Re: [I] [SUPPORT] Hoodie Metadata exception while upserting using Spark Structured Streaming [hudi]

2024-09-25 Thread via GitHub
danny0405 commented on issue #11997: URL: https://github.com/apache/hudi/issues/11997#issuecomment-2375523192 You might need this fix: https://github.com/apache/hudi/pull/10883

Re: [PR] [HUDI-7507] Adding timestamp ordering validation before creating requested instant [hudi]

2024-09-25 Thread via GitHub
danny0405 commented on PR #11580: URL: https://github.com/apache/hudi/pull/11580#issuecomment-2375508787 > hey @danny0405 : I was chasing some test failures in this patch and realized that flink might have an issue. In [this](https://github.com/apache/hudi/blob/ed65de1460468ad33a374a66606c0

Re: [PR] [HUDI-6909] Use isDelete instead of isDeleted function for merger [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12007: URL: https://github.com/apache/hudi/pull/12007#issuecomment-2375487248 ## CI report: * b416e962b65eed32d8c2d219fa8caf7310a5e951 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run th

[jira] [Commented] (HUDI-7276) Fix IOException on the File group reader path

2024-09-25 Thread Lin Liu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17884815#comment-17884815 ] Lin Liu commented on HUDI-7276: --- Revisited the terraform setup. Searching for the dataset to

[jira] [Updated] (HUDI-6909) Handle `_hoodie_operation` field in the new HoodieFileGroupReader

2024-09-25 Thread Lin Liu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Liu updated HUDI-6909: -- Status: Patch Available (was: In Progress) > Handle `_hoodie_operation` field in the new HoodieFileGroupReader

Re: [PR] [HUDI-7848] Prevent ordering value cast exceptions in filegroup reader [hudi]

2024-09-25 Thread via GitHub
jonvex commented on code in PR #12006: URL: https://github.com/apache/hudi/pull/12006#discussion_r1776139780 ## hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieBaseFileGroupRecordBuffer.java: ## @@ -91,6 +94,13 @@ public HoodieBaseFileGroupRecordBuffer(HoodieR

Re: [PR] [HUDI-6909] Use isDelete instead of isDeleted function for Spark merger [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12007: URL: https://github.com/apache/hudi/pull/12007#issuecomment-2375496872 ## CI report: * b416e962b65eed32d8c2d219fa8caf7310a5e951 Azure: [CANCELED](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=793)

[jira] [Updated] (HUDI-6909) Handle `_hoodie_operation` field in the new HoodieFileGroupReader

2024-09-25 Thread Lin Liu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Liu updated HUDI-6909: -- Reviewers: Danny Chen, Y Ethan Guo > Handle `_hoodie_operation` field in the new HoodieFileGroupReader > ---

Re: [PR] [HUDI-6909] Use isDelete instead of isDeleted function for Spark merger [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12007: URL: https://github.com/apache/hudi/pull/12007#issuecomment-2375495163 ## CI report: * b416e962b65eed32d8c2d219fa8caf7310a5e951 Azure: [PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=793)

Re: [PR] [HUDI-6909] Use isDelete instead of isDeleted function for Spark merger [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12007: URL: https://github.com/apache/hudi/pull/12007#issuecomment-2375488751 ## CI report: * b416e962b65eed32d8c2d219fa8caf7310a5e951 Azure: [PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=793)

[PR] [HUDI-6909] Use isDelete instead of isDeleted function for merger [hudi]

2024-09-25 Thread via GitHub
linliu-code opened a new pull request, #12007: URL: https://github.com/apache/hudi/pull/12007 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any perfor

Re: [PR] [HUDI-7848] Prevent ordering value cast exceptions in filegroup reader [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12006: URL: https://github.com/apache/hudi/pull/12006#issuecomment-2375483363 ## CI report: * 5470f7aa1fae582baf89740c9c889d87e620dd39 UNKNOWN * 4a619de91574abee820734d942ea238fec645e13 UNKNOWN * 0e0b31531f37a69f591db588f039ac40ac8e0cfb UNKNOWN *

[jira] [Commented] (HUDI-6909) Handle `_hoodie_operation` field in the new HoodieFileGroupReader

2024-09-25 Thread Lin Liu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17884812#comment-17884812 ] Lin Liu commented on HUDI-6909: --- Compared the logic between Flink and Spark on the usage of

[jira] [Updated] (HUDI-6909) Handle `_hoodie_operation` field in the new HoodieFileGroupReader

2024-09-25 Thread Lin Liu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Liu updated HUDI-6909: -- Status: In Progress (was: Reopened) > Handle `_hoodie_operation` field in the new HoodieFileGroupReader > -

Re: [PR] [HUDI-7848] Prevent ordering value cast exceptions in filegroup reader [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12006: URL: https://github.com/apache/hudi/pull/12006#issuecomment-2375459816 ## CI report: * 5470f7aa1fae582baf89740c9c889d87e620dd39 UNKNOWN * 4a619de91574abee820734d942ea238fec645e13 UNKNOWN * 0e0b31531f37a69f591db588f039ac40ac8e0cfb UNKNOWN *

Re: [PR] [HUDI-7848] Prevent ordering value cast exceptions in filegroup reader [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12006: URL: https://github.com/apache/hudi/pull/12006#issuecomment-2375435724 ## CI report: * 5470f7aa1fae582baf89740c9c889d87e620dd39 UNKNOWN * 4a619de91574abee820734d942ea238fec645e13 UNKNOWN * 0e0b31531f37a69f591db588f039ac40ac8e0cfb UNKNOWN *

Re: [PR] [HUDI-7848] Prevent ordering value cast exceptions in filegroup reader [hudi]

2024-09-25 Thread via GitHub
linliu-code commented on code in PR #12006: URL: https://github.com/apache/hudi/pull/12006#discussion_r1776083862 ## hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieBaseFileGroupRecordBuffer.java: ## @@ -91,6 +94,13 @@ public HoodieBaseFileGroupRecordBuffer(Ho

[jira] [Commented] (HUDI-6910) Handle schema evolution across base and log files in HoodieFileGroupReader

2024-09-25 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17884801#comment-17884801 ] Jonathan Vexler commented on HUDI-6910: --- This already works. We test extensively in

[jira] [Updated] (HUDI-7848) Fix the Comparable type of the ordering field value stored in delete record

2024-09-25 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Vexler updated HUDI-7848: -- Remaining Estimate: 1h (was: 4h) Original Estimate: 1h (was: 4h) > Fix the Comparable typ

Re: [PR] [HUDI-7848] Prevent ordering value cast exceptions in filegroup reader [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12006: URL: https://github.com/apache/hudi/pull/12006#issuecomment-2375355028 ## CI report: * d7ee3ab0b4ca29f92117d9a5318d0def43aeb849 Azure: [CANCELED](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=790)

Re: [PR] [HUDI-7848] Prevent ordering value cast exceptions in filegroup reader [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12006: URL: https://github.com/apache/hudi/pull/12006#issuecomment-2375348621 ## CI report: * d7ee3ab0b4ca29f92117d9a5318d0def43aeb849 Azure: [CANCELED](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=790)

Re: [PR] [HUDI-7848] Fix ordering value schema evolution differences [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12006: URL: https://github.com/apache/hudi/pull/12006#issuecomment-2375277180 ## CI report: * d7ee3ab0b4ca29f92117d9a5318d0def43aeb849 Azure: [CANCELED](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=790)

Re: [PR] [HUDI-7848] Fix ordering value schema evolution differences [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12006: URL: https://github.com/apache/hudi/pull/12006#issuecomment-2375272603 ## CI report: * d7ee3ab0b4ca29f92117d9a5318d0def43aeb849 Azure: [CANCELED](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=790)

Re: [PR] [HUDI-7848] Fix ordering value schema evolution differences [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12006: URL: https://github.com/apache/hudi/pull/12006#issuecomment-2375270273 ## CI report: * d7ee3ab0b4ca29f92117d9a5318d0def43aeb849 Azure: [CANCELED](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=790)

Re: [PR] [HUDI-7848] Fix ordering value schema evolution differences [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12006: URL: https://github.com/apache/hudi/pull/12006#issuecomment-2375267234 ## CI report: * d7ee3ab0b4ca29f92117d9a5318d0def43aeb849 Azure: [PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=790)

Re: [PR] [HUDI-7848] Fix ordering value schema evolution differences [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12006: URL: https://github.com/apache/hudi/pull/12006#issuecomment-2375265141 ## CI report: * d7ee3ab0b4ca29f92117d9a5318d0def43aeb849 Azure: [PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=790)

Re: [PR] [HUDI-7848] Fix ordering value schema evolution differences [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12006: URL: https://github.com/apache/hudi/pull/12006#issuecomment-2375262962 ## CI report: * d7ee3ab0b4ca29f92117d9a5318d0def43aeb849 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run th

[jira] [Created] (HUDI-8261) Implement casting for hive fg reader

2024-09-25 Thread Jonathan Vexler (Jira)
Jonathan Vexler created HUDI-8261: - Summary: Implement casting for hive fg reader Key: HUDI-8261 URL: https://issues.apache.org/jira/browse/HUDI-8261 Project: Apache Hudi Issue Type: Improvem

[PR] [HUDI-7848] Fix ordering value schema evolution differences [hudi]

2024-09-25 Thread via GitHub
jonvex opened a new pull request, #12006: URL: https://github.com/apache/hudi/pull/12006 ### Change Logs Due to schema evolution, some edge cases for MIT, and spark using UTF8String, the ordering values of records can be different types. Now, we cast the ordering type into the reader

[jira] [Assigned] (HUDI-7848) Fix the Comparable type of the ordering field value stored in delete record

2024-09-25 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Vexler reassigned HUDI-7848: - Assignee: Jonathan Vexler (was: Lin Liu) > Fix the Comparable type of the ordering field

Re: [PR] [HUDI-8188] Add validation for partition stats index in HoodieMetadataTableValidator [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #11921: URL: https://github.com/apache/hudi/pull/11921#issuecomment-2374979882 ## CI report: * c4403271f24e75fcb2b5d3c06d6eb84267c7c351 Azure: [FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=787)

Re: [PR] [HUDI-5829] Optimize conversion from json to row format when sanitizing field names [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #11941: URL: https://github.com/apache/hudi/pull/11941#issuecomment-2374980244 ## CI report: * 857f45b9871c3457a3d454c72b71ba389e29083b Azure: [FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=581)

Re: [PR] [HUDI-5829] Optimize conversion from json to row format when sanitizing field names [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #11941: URL: https://github.com/apache/hudi/pull/11941#issuecomment-2374971926 ## CI report: * 3107f9e1e252b483c0b0511a9c73a75ff9bf755e Azure: [FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=491) Azur

Re: [PR] [HUDI-5829] Optimize conversion from json to row format when sanitizing field names [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #11941: URL: https://github.com/apache/hudi/pull/11941#issuecomment-2374962528 ## CI report: * 3107f9e1e252b483c0b0511a9c73a75ff9bf755e Azure: [FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=491) Azur

Re: [PR] [HUDI-5829] Optimize conversion from json to row format when sanitizing field names [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #11941: URL: https://github.com/apache/hudi/pull/11941#issuecomment-2374954816 ## CI report: * 3107f9e1e252b483c0b0511a9c73a75ff9bf755e Azure: [FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=491) Azur

Re: [I] [SUPPORT] Hoodie Metadata exception while upserting using Spark Structured Streaming [hudi]

2024-09-25 Thread via GitHub
dataproblems commented on issue #11997: URL: https://github.com/apache/hudi/issues/11997#issuecomment-2374933378 Hi @yihua - I've not tried using `s3a`, what would be the difference between that and S3? What would be the intuition of trying it with `s3a`?

Re: [PR] [Hudi-8221][RFC-82] Concurrent schema evolution detection [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12005: URL: https://github.com/apache/hudi/pull/12005#issuecomment-2374906621 ## CI report: * fbb7d0154b59120c1fa1d91094b0cffca73c2fb9 Azure: [FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=786)

Re: [PR] [HUDI-5240] Reset the content during error handling in the log block reading [hudi]

2024-09-25 Thread via GitHub
kishoreraj05 commented on PR #7434: URL: https://github.com/apache/hudi/pull/7434#issuecomment-2374898682 @Yihua, could you let me know in which version this bug fix will be released?

[jira] [Updated] (HUDI-8192) Support merge mode table config in Spark Structured Streaming

2024-09-25 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-8192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Vexler updated HUDI-8192: -- Remaining Estimate: 1h (was: 4h) Original Estimate: 1h (was: 4h) > Support merge mode tab

[jira] [Commented] (HUDI-8192) Support merge mode table config in Spark Structured Streaming

2024-09-25 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-8192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17884696#comment-17884696 ] Jonathan Vexler commented on HUDI-8192: --- [https://github.com/jonvex/hudi/pull/6] her

[jira] [Updated] (HUDI-8192) Support merge mode table config in Spark Structured Streaming

2024-09-25 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-8192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Vexler updated HUDI-8192: -- Status: Patch Available (was: In Progress) > Support merge mode table config in Spark Structure

[PR] [HUDI-8221] RFC placeholder [hudi]

2024-09-25 Thread via GitHub
Davis-Zhang-Onehouse opened a new pull request, #12003: URL: https://github.com/apache/hudi/pull/12003 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or a

Re: [PR] [HUDI-8188] Add validation for partition stats index in HoodieMetadataTableValidator [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #11921: URL: https://github.com/apache/hudi/pull/11921#issuecomment-2374792864 ## CI report: * 94a0979f0d5093dc443420999362367793859ee6 Azure: [SUCCESS](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=742)

Re: [PR] [HUDI-8188] Add validation for partition stats index in HoodieMetadataTableValidator [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #11921: URL: https://github.com/apache/hudi/pull/11921#issuecomment-2374788198 ## CI report: * 94a0979f0d5093dc443420999362367793859ee6 Azure: [SUCCESS](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=742)

Re: [PR] [HUDI-8188] Add validation for partition stats index in HoodieMetadataTableValidator [hudi]

2024-09-25 Thread via GitHub
lokeshj1703 commented on code in PR #11921: URL: https://github.com/apache/hudi/pull/11921#discussion_r1775706138 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieMetadataTableValidator.java: ## @@ -914,6 +950,97 @@ private void validateAllColumnStats( valida

Re: [PR] [HUDI-8188] Add validation for partition stats index in HoodieMetadataTableValidator [hudi]

2024-09-25 Thread via GitHub
lokeshj1703 commented on code in PR #11921: URL: https://github.com/apache/hudi/pull/11921#discussion_r1775705839 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieMetadataTableValidator.java: ## @@ -914,6 +950,97 @@ private void validateAllColumnStats( valida

Re: [PR] [HUDI-8188] Add validation for partition stats index in HoodieMetadataTableValidator [hudi]

2024-09-25 Thread via GitHub
lokeshj1703 commented on code in PR #11921: URL: https://github.com/apache/hudi/pull/11921#discussion_r1775705064 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieMetadataTableValidator.java: ## @@ -914,6 +950,97 @@ private void validateAllColumnStats( valida

[jira] [Created] (HUDI-8260) Fix col stats metadata validation so that log files are also validated

2024-09-25 Thread Lokesh Jain (Jira)
Lokesh Jain created HUDI-8260: - Summary: Fix col stats metadata validation so that log files are also validated Key: HUDI-8260 URL: https://issues.apache.org/jira/browse/HUDI-8260 Project: Apache Hudi

Re: [PR] [HUDI-8188] Add validation for partition stats index in HoodieMetadataTableValidator [hudi]

2024-09-25 Thread via GitHub
lokeshj1703 commented on code in PR #11921: URL: https://github.com/apache/hudi/pull/11921#discussion_r1775700225 ## hudi-utilities/src/test/java/org/apache/hudi/utilities/TestHoodieMetadataTableValidator.java: ## @@ -139,6 +139,50 @@ public void testMetadataTableValidation(Stri

[jira] [Updated] (HUDI-8208) Fix partition stats with compaction or clustering

2024-09-25 Thread Lokesh Jain (Jira)
[ https://issues.apache.org/jira/browse/HUDI-8208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lokesh Jain updated HUDI-8208: -- Description: Consider a partition with 10 file slices. If compaction triggered for 1 file slice fs1_1,

Re: [I] [SUPPORT] Hoodie Metadata exception while upserting using Spark Structured Streaming [hudi]

2024-09-25 Thread via GitHub
yihua commented on issue #11997: URL: https://github.com/apache/hudi/issues/11997#issuecomment-2374769722 @dataproblems Thanks for reporting the problem. It looks like you're running Spark structured streaming job on EMR using EMR file system and `S3NativeFileSystem`. Have you tried using

Re: [PR] [HUDI-8221] RFC placeholder [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12004: URL: https://github.com/apache/hudi/pull/12004#issuecomment-2374732021 ## CI report: * 04934a7d69d377c77c5f245f428d44b747a51cf6 Azure: [CANCELED](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=785)

Re: [PR] [Hudi-8221] RFC for review [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12005: URL: https://github.com/apache/hudi/pull/12005#issuecomment-2374742429 ## CI report: * fbb7d0154b59120c1fa1d91094b0cffca73c2fb9 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run th

Re: [PR] [Hudi-8221] RFC for review [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12005: URL: https://github.com/apache/hudi/pull/12005#issuecomment-2374745091 ## CI report: * fbb7d0154b59120c1fa1d91094b0cffca73c2fb9 Azure: [PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=786)

Re: [PR] [HUDI-8221] RFC placeholder [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12004: URL: https://github.com/apache/hudi/pull/12004#issuecomment-2374720514 ## CI report: * 04934a7d69d377c77c5f245f428d44b747a51cf6 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run th

Re: [PR] [HUDI-8221] Claim RFC-82 for Concurrent schema evolution detection [hudi]

2024-09-25 Thread via GitHub
yihua merged PR #12004: URL: https://github.com/apache/hudi/pull/12004

(hudi) branch master updated: [HUDI-8221] Claim RFC-82 for Concurrent schema evolution detection (#12004)

2024-09-25 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 30f077d0ded [HUDI-8221] Claim RFC-82 for Concurrent

[PR] [Hudi-8221] RFC for review [hudi]

2024-09-25 Thread via GitHub
Davis-Zhang-Onehouse opened a new pull request, #12005: URL: https://github.com/apache/hudi/pull/12005 ### Change Logs RFC doc ### Impact No ### Risk level (write none, low medium or high below) No ### Documentation Update Check RFC content ###

Re: [PR] [HUDI-8221] RFC placeholder [hudi]

2024-09-25 Thread via GitHub
Davis-Zhang-Onehouse closed pull request #12003: [HUDI-8221] RFC placeholder URL: https://github.com/apache/hudi/pull/12003 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To u

[PR] [HUDI-8221] RFC placeholder [hudi]

2024-09-25 Thread via GitHub
Davis-Zhang-Onehouse opened a new pull request, #12004: URL: https://github.com/apache/hudi/pull/12004 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or a

Re: [PR] [HUDI-8221] RFC placeholder [hudi]

2024-09-25 Thread via GitHub
Davis-Zhang-Onehouse closed pull request #11977: [HUDI-8221] RFC placeholder URL: https://github.com/apache/hudi/pull/11977 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To u

Re: [PR] [HUDI-8221] RFC placeholder [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12004: URL: https://github.com/apache/hudi/pull/12004#issuecomment-2374728986 ## CI report: * 04934a7d69d377c77c5f245f428d44b747a51cf6 Azure: [PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=785)

Re: [PR] [HUDI-8221] RFC placeholder [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12004: URL: https://github.com/apache/hudi/pull/12004#issuecomment-2374723245 ## CI report: * 04934a7d69d377c77c5f245f428d44b747a51cf6 Azure: [PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=785)

Re: [PR] [HUDI-7662] Add a metadata config to enable or disable functional index [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12001: URL: https://github.com/apache/hudi/pull/12001#issuecomment-2374597240 ## CI report: * a9ecc9a8fc6e28cd2ebdcfc6e1773625673f940a Azure: [SUCCESS](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=784)

Re: [PR] [HUDI-7662] Add a metadata config to enable or disable functional index [hudi]

2024-09-25 Thread via GitHub
codope commented on code in PR #12001: URL: https://github.com/apache/hudi/pull/12001#discussion_r1775531824 ## hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/command/index/TestFunctionalIndex.scala: ## @@ -424,6 +424,78 @@ class TestFunctionalIndex ex

Re: [PR] [HUDI-7662] Add a metadata config to enable or disable functional index [hudi]

2024-09-25 Thread via GitHub
codope commented on code in PR #12001: URL: https://github.com/apache/hudi/pull/12001#discussion_r1775530964 ## hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetadataConfig.java: ## @@ -316,6 +316,13 @@ public final class HoodieMetadataConfig extends HoodieConfi

[jira] [Updated] (HUDI-7484) Fix partitioning style when partition is inferred from partitionBy

2024-09-25 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-7484: -- Remaining Estimate: 4h Original Estimate: 4h > Fix partitioning style when partition is inferred fr

Re: [PR] [HUDI-7662] Add a metadata config to enable or disable functional index [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12001: URL: https://github.com/apache/hudi/pull/12001#issuecomment-2374426357 ## CI report: * f328dade37789a968eed244f9f0aac52c114aa5c Azure: [SUCCESS](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=779)

Re: [PR] [HUDI-7662] Add a metadata config to enable or disable functional index [hudi]

2024-09-25 Thread via GitHub
jonvex commented on PR #12001: URL: https://github.com/apache/hudi/pull/12001#issuecomment-2374425472 > > > uber config > > > > > > What do you mean? Not the company, right? > > No, I meant like a super config for func indexes. Sorry for the confusion. I have rephrased.

Re: [PR] [HUDI-7662] Add a metadata config to enable or disable functional index [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #12001: URL: https://github.com/apache/hudi/pull/12001#issuecomment-2374422922 ## CI report: * f328dade37789a968eed244f9f0aac52c114aa5c Azure: [SUCCESS](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=779)

Re: [PR] [HUDI-7662] Add a metadata config to enable or disable functional index [hudi]

2024-09-25 Thread via GitHub
jonvex commented on code in PR #12001: URL: https://github.com/apache/hudi/pull/12001#discussion_r1775418456 ## hudi-common/src/main/java/org/apache/hudi/common/config/HoodieMetadataConfig.java: ## @@ -316,6 +316,13 @@ public final class HoodieMetadataConfig extends HoodieConfi

Re: [PR] [HUDI-7662] Add a metadata config to enable or disable functional index [hudi]

2024-09-25 Thread via GitHub
codope commented on PR #12001: URL: https://github.com/apache/hudi/pull/12001#issuecomment-2374421053 > > uber config > > What do you mean? Not the company, right? No, I meant like a super config for func indexes. Sorry for the confusion. I have rephrased. -- This is an automa

Re: [PR] [HUDI-8185] Fix SPARK record for Colstats [hudi]

2024-09-25 Thread via GitHub
hudi-bot commented on PR #11969: URL: https://github.com/apache/hudi/pull/11969#issuecomment-2374336621 ## CI report: * 8646dbab4b0570b7e38ce7c9abcb877165b16b4a UNKNOWN * a890523a4a7da69a9063adf1648223223022 UNKNOWN * 4a9cc20cc60a71abcedae1c3ec11500f15a67dce Azure: [SUCC

Re: [PR] [HUDI-7662] Add a metadata config to enable or disable functional index [hudi]

2024-09-25 Thread via GitHub
jonvex commented on PR #12001: URL: https://github.com/apache/hudi/pull/12001#issuecomment-2374327687 > uber config What do you mean? Not the company, right? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the UR
