Re: [I] [SUPPORT] PySpark reading hudi partition column for hudi table incorrectly [hudi]

2023-10-19 Thread via GitHub
ad1happy2go commented on issue #9890: URL: https://github.com/apache/hudi/issues/9890#issuecomment-1772123321 @bradleybonitatibus If we have passed `keygen.timebased.output.dateformat='/MM/dd'` then we would expect the partition column value should be converted to that format. While

Re: [PR] [HUDI-6952] Skip reading the uncommitted log files for log reader [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9879: URL: https://github.com/apache/hudi/pull/9879#issuecomment-1772120236 ## CI report: * 94301a15c9f75355c0ebf5bab3baf6226820ac42 UNKNOWN * 5da881a1f377e366bd6317cb4844fa791b8cadd5 Azure:

Re: [PR] Row writer optimization for bulk insert [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9852: URL: https://github.com/apache/hudi/pull/9852#issuecomment-1772120159 ## CI report: * c9d09be26e354d8fb4efdc51e1e1e9a6da7d2031 Azure:

Re: [I] [SUPPORT] Compaction error [hudi]

2023-10-19 Thread via GitHub
ad1happy2go commented on issue #9885: URL: https://github.com/apache/hudi/issues/9885#issuecomment-1772113589 @fearlsgroove Can you try to read the data (without compaction) and see if you are able to and if data looks correct? -- This is an automated message from the Apache Git Service.

Re: [I] [SUPPORT]Index Bootstrap deleted some snapshot data that has been batch-inserted into Hudi ? [hudi]

2023-10-19 Thread via GitHub
imrewang commented on issue #9513: URL: https://github.com/apache/hudi/issues/9513#issuecomment-1772114275 > @imrewang Were you able to resolve this out? Please let us know if you need any help on this. It seems that I don’t need help at the moment, thank u again! ! ! -- This

Re: [PR] Row writer optimization for bulk insert [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9852: URL: https://github.com/apache/hudi/pull/9852#issuecomment-1772113242 ## CI report: * 50733026e5dae6744f6793b204f29a8c09e673e8 Azure:

Re: [PR] [HUDI-6962] Correct the behavior of bulk insert for NB-CC [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9896: URL: https://github.com/apache/hudi/pull/9896#issuecomment-1772079757 ## CI report: * f4fa15f3e401363be00607940dd5f8b6a4af3215 Azure:

Re: [PR] [HUDI-6962] Correct the behavior of bulk insert for NB-CC [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9896: URL: https://github.com/apache/hudi/pull/9896#issuecomment-1772072695 ## CI report: * f4fa15f3e401363be00607940dd5f8b6a4af3215 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9894: URL: https://github.com/apache/hudi/pull/9894#issuecomment-1772066552 ## CI report: * 74dab9f4a045822aef5565ff24cb8bbf15ef0f65 Azure:

[jira] [Updated] (HUDI-6962) Correct the behavior of bulk insert for NB-CC

2023-10-19 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6962: - Labels: pull-request-available (was: ) > Correct the behavior of bulk insert for NB-CC >

[PR] [HUDI-6962] Correct the behavior of bulk insert for NB-CC [hudi]

2023-10-19 Thread via GitHub
beyond1920 opened a new pull request, #9896: URL: https://github.com/apache/hudi/pull/9896 ### Change Logs In the previous [PR#9850](https://github.com/apache/hudi/pull/9850), I forget to take `bulk insert` into consideration. How to handle the case if the multiple writer contains

Re: [PR] [HUDI-6963] Fix class conflict of CreateIndex from Spark3.3 [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9895: URL: https://github.com/apache/hudi/pull/9895#issuecomment-1772035508 ## CI report: * 6eb38c5b78e4f53a7d21c15f5bb63cd528bc5132 Azure:

Re: [PR] Row writer optimization for bulk insert [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9852: URL: https://github.com/apache/hudi/pull/9852#issuecomment-1772035423 ## CI report: * 1a8581386b9147cd9b3a76d4a1664a17bf2e59fa Azure:

Re: [PR] [HUDI-6963] Fix class conflict of CreateIndex from Spark3.3 [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9895: URL: https://github.com/apache/hudi/pull/9895#issuecomment-1772029929 ## CI report: * 6eb38c5b78e4f53a7d21c15f5bb63cd528bc5132 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

Re: [PR] Row writer optimization for bulk insert [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9852: URL: https://github.com/apache/hudi/pull/9852#issuecomment-1772029826 ## CI report: * 1a8581386b9147cd9b3a76d4a1664a17bf2e59fa Azure:

Re: [PR] [HUDI-6878] Fix Overwrite error when ingest multiple tables [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9749: URL: https://github.com/apache/hudi/pull/9749#issuecomment-1772023277 ## CI report: * 149dfda8469d598e3098c418ce1e7bf99a4a177f UNKNOWN * 66ea14a95621e003cbf81773c78f0ad2147bbbf6 UNKNOWN * 1597dfa2436c2789e5a5e8dbecfe4f900383c35d Azure:

Re: [PR] [HUDI-6952] Skip reading the uncommitted log files for log reader [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9879: URL: https://github.com/apache/hudi/pull/9879#issuecomment-1772023429 ## CI report: * 94301a15c9f75355c0ebf5bab3baf6226820ac42 UNKNOWN * 5da881a1f377e366bd6317cb4844fa791b8cadd5 Azure:

Re: [PR] Row writer optimization for bulk insert [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9852: URL: https://github.com/apache/hudi/pull/9852#issuecomment-1772023369 ## CI report: * 1a8581386b9147cd9b3a76d4a1664a17bf2e59fa Azure:

[jira] [Updated] (HUDI-6963) Fix class conflict of CreateIndex from Spark3.3

2023-10-19 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6963: - Labels: pull-request-available (was: ) > Fix class conflict of CreateIndex from Spark3.3 >

[PR] [HUDI-6963] Fix class conflict of CreateIndex from Spark3.3 [hudi]

2023-10-19 Thread via GitHub
boneanxs opened a new pull request, #9895: URL: https://github.com/apache/hudi/pull/9895 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ CreateIndex is added in [HUDI-4165](https://github.com/apache/hudi/pull/5761/files), and

[jira] [Updated] (HUDI-6962) Correct the behavior of bulk insert for NB-CC

2023-10-19 Thread Jing Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhang updated HUDI-6962: - Description: How to handle the case if the multiple writer contains a job with bulk insert operation? 1.

[jira] [Updated] (HUDI-6962) Correct the behavior of bulk insert for NB-CC

2023-10-19 Thread Jing Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhang updated HUDI-6962: - Description: How to handle the case if the multiple writer contains a job with bulk insert operation? 1.

[jira] [Created] (HUDI-6963) Fix class conflict of CreateIndex from Spark3.3

2023-10-19 Thread Hui An (Jira)
Hui An created HUDI-6963: Summary: Fix class conflict of CreateIndex from Spark3.3 Key: HUDI-6963 URL: https://issues.apache.org/jira/browse/HUDI-6963 Project: Apache Hudi Issue Type: Bug

[jira] [Updated] (HUDI-6962) Correct the behavior of bulk insert for NB-CC

2023-10-19 Thread Jing Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhang updated HUDI-6962: - Description: How to handle the case if the multiple writer contains a job with bulk insert operation? 1.

Re: [PR] Row writer optimization for bulk insert [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9852: URL: https://github.com/apache/hudi/pull/9852#issuecomment-1771994205 ## CI report: * 1a8581386b9147cd9b3a76d4a1664a17bf2e59fa Azure:

Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9894: URL: https://github.com/apache/hudi/pull/9894#issuecomment-1771987971 ## CI report: * 74dab9f4a045822aef5565ff24cb8bbf15ef0f65 Azure:

Re: [PR] [HUDI-6946] Data Duplicates with range pruning while using hoodie.blo… [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9886: URL: https://github.com/apache/hudi/pull/9886#issuecomment-1771987931 ## CI report: * c3e2763136ca7108adc32162d093abb58b143363 Azure:

Re: [PR] [HUDI-6878] Fix Overwrite error when ingest multiple tables [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9749: URL: https://github.com/apache/hudi/pull/9749#issuecomment-1771987757 ## CI report: * 149dfda8469d598e3098c418ce1e7bf99a4a177f UNKNOWN * 66ea14a95621e003cbf81773c78f0ad2147bbbf6 UNKNOWN * 1597dfa2436c2789e5a5e8dbecfe4f900383c35d Azure:

[jira] [Assigned] (HUDI-6962) Correct the behavior of bulk insert for NB-CC

2023-10-19 Thread Jing Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhang reassigned HUDI-6962: Assignee: Jing Zhang > Correct the behavior of bulk insert for NB-CC >

[jira] [Created] (HUDI-6962) Correct the behavior of bulk insert for NB-CC

2023-10-19 Thread Jing Zhang (Jira)
Jing Zhang created HUDI-6962: Summary: Correct the behavior of bulk insert for NB-CC Key: HUDI-6962 URL: https://issues.apache.org/jira/browse/HUDI-6962 Project: Apache Hudi Issue Type: New

Re: [PR] [HUDI-6946] Data Duplicates with range pruning while using hoodie.blo… [hudi]

2023-10-19 Thread via GitHub
xicm commented on PR #9886: URL: https://github.com/apache/hudi/pull/9886#issuecomment-1771984196 > Is the `_hoodie_record_key` field always there, I see many test failures. I run the failed test locally, it is successful. I think another thing, if the col-stats field is user

Re: [PR] Row writer optimization for bulk insert [hudi]

2023-10-19 Thread via GitHub
lokesh-lingarajan-0310 commented on code in PR #9852: URL: https://github.com/apache/hudi/pull/9852#discussion_r1366375237 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java: ## @@ -553,17 +546,13 @@ private Pair>> fetchFromSourc

Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9894: URL: https://github.com/apache/hudi/pull/9894#issuecomment-1771981041 ## CI report: * 74dab9f4a045822aef5565ff24cb8bbf15ef0f65 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

Re: [PR] [HUDI-6946] Data Duplicates with range pruning while using hoodie.blo… [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9886: URL: https://github.com/apache/hudi/pull/9886#issuecomment-1771980996 ## CI report: * c3e2763136ca7108adc32162d093abb58b143363 Azure:

Re: [PR] [HUDI-6946] Data Duplicates with range pruning while using hoodie.blo… [hudi]

2023-10-19 Thread via GitHub
xicm commented on PR #9886: URL: https://github.com/apache/hudi/pull/9886#issuecomment-1771980280 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[jira] [Updated] (HUDI-6798) Implement event-time-based merging mode in FileGroupReader

2023-10-19 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6798: Status: In Progress (was: Open) > Implement event-time-based merging mode in FileGroupReader >

Re: [PR] [HUDI-6952] Skip reading the uncommitted log files for log reader [hudi]

2023-10-19 Thread via GitHub
danny0405 commented on code in PR #9879: URL: https://github.com/apache/hudi/pull/9879#discussion_r1365496418 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/CompletionTimeQueryView.java: ## @@ -88,7 +88,8 @@ public

Re: [PR] [HUDI-6959] Bulk insert V2 do not rollback failed instant on abort [hudi]

2023-10-19 Thread via GitHub
stream2000 commented on PR #9887: URL: https://github.com/apache/hudi/pull/9887#issuecomment-1771950583 > Makes sense, because in #9776, we changed the basic rollback logic for log files, all the log files are rolled back the same style with base files. The dirty data left is

[jira] [Updated] (HUDI-6798) Implement event-time-based merging mode in FileGroupReader

2023-10-19 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6798: - Labels: pull-request-available (was: ) > Implement event-time-based merging mode in

[PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2023-10-19 Thread via GitHub
yihua opened a new pull request, #9894: URL: https://github.com/apache/hudi/pull/9894 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performance

Re: [PR] [HUDI-6952] Skip reading the uncommitted log files for log reader [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9879: URL: https://github.com/apache/hudi/pull/9879#issuecomment-1771946413 ## CI report: * 94301a15c9f75355c0ebf5bab3baf6226820ac42 UNKNOWN * 881a74b6a3f85f2014e7eb211fe2a9041be418ed Azure:

Re: [PR] [HUDI-6959] Bulk insert V2 do not rollback failed instant on abort [hudi]

2023-10-19 Thread via GitHub
stream2000 commented on code in PR #9887: URL: https://github.com/apache/hudi/pull/9887#discussion_r1366346017 ## hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/internal/DataSourceInternalWriterHelper.java: ## @@ -97,7 +97,6 @@ public void commit(List

Re: [PR] [HUDI-6959] Bulk insert V2 do not rollback failed instant on abort [hudi]

2023-10-19 Thread via GitHub
danny0405 commented on code in PR #9887: URL: https://github.com/apache/hudi/pull/9887#discussion_r1366335823 ## hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/internal/DataSourceInternalWriterHelper.java: ## @@ -97,7 +97,6 @@ public void commit(List

Re: [PR] [HUDI-6959] Bulk insert V2 do not rollback failed instant on abort [hudi]

2023-10-19 Thread via GitHub
danny0405 commented on PR #9887: URL: https://github.com/apache/hudi/pull/9887#issuecomment-1771930875 > This bug exists before https://github.com/apache/hudi/pull/9776, but only after https://github.com/apache/hudi/pull/9776 the test will fail Makes sense, because in

Re: [PR] [HUDI-6959] Bulk insert V2 do not rollback failed instant on abort [hudi]

2023-10-19 Thread via GitHub
danny0405 commented on code in PR #9887: URL: https://github.com/apache/hudi/pull/9887#discussion_r1366329742 ## hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/internal/DataSourceInternalWriterHelper.java: ## @@ -97,7 +97,6 @@ public void commit(List

Re: [PR] [HUDI-6946] Data Duplicates with range pruning while using hoodie.blo… [hudi]

2023-10-19 Thread via GitHub
danny0405 commented on PR #9886: URL: https://github.com/apache/hudi/pull/9886#issuecomment-1771927902 Is the `_hoodie_record_key` field always there, I see many test failures. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] [HUDI-6952] Skip reading the uncommitted log files for log reader [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9879: URL: https://github.com/apache/hudi/pull/9879#issuecomment-1771902294 ## CI report: * 94301a15c9f75355c0ebf5bab3baf6226820ac42 UNKNOWN * 881a74b6a3f85f2014e7eb211fe2a9041be418ed Azure:

Re: [PR] Row writer optimization for bulk insert [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9852: URL: https://github.com/apache/hudi/pull/9852#issuecomment-1771902240 ## CI report: * 1a8581386b9147cd9b3a76d4a1664a17bf2e59fa Azure:

Re: [PR] Row writer optimization for bulk insert [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9852: URL: https://github.com/apache/hudi/pull/9852#issuecomment-1771890253 ## CI report: * 962ca40a77f2d32f2addec9e57f05f86256afedb Azure:

Re: [PR] Row writer optimization for bulk insert [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9852: URL: https://github.com/apache/hudi/pull/9852#issuecomment-1771846755 ## CI report: * 00714535a8643da2ff4fadf9d4044f9db6671d2d Azure:

Re: [PR] Row writer optimization for bulk insert [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9852: URL: https://github.com/apache/hudi/pull/9852#issuecomment-1771840700 ## CI report: * 00714535a8643da2ff4fadf9d4044f9db6671d2d Azure:

[jira] [Updated] (HUDI-6961) Deletes with custom delete field not working with DefaultHoodieRecordPayload

2023-10-19 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6961: Story Points: 3 (was: 2) > Deletes with custom delete field not working with DefaultHoodieRecordPayload >

Re: [PR] [HUDI-6961] Fix deletes with custom delete field in DefaultHoodieRecordPayload [hudi]

2023-10-19 Thread via GitHub
yihua commented on code in PR #9892: URL: https://github.com/apache/hudi/pull/9892#discussion_r1366228172 ## hudi-common/src/main/java/org/apache/hudi/common/model/DefaultHoodieRecordPayload.java: ## @@ -86,30 +86,27 @@ public Option getInsertValue(Schema schema, Properties

Re: [PR] Row writer optimization for bulk insert [hudi]

2023-10-19 Thread via GitHub
nsivabalan commented on code in PR #9852: URL: https://github.com/apache/hudi/pull/9852#discussion_r1366227106 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java: ## @@ -584,16 +573,36 @@ private Pair>> fetchFromSourc

Re: [PR] [DOCS] Remove /next reference from already published doc versions [hudi]

2023-10-19 Thread via GitHub
bhasudha commented on PR #9893: URL: https://github.com/apache/hudi/pull/9893#issuecomment-1771827351 > is this tested and the links are good? Yes. tested locally. looks okay. Waiting for CI to see if there are any other broken links. -- This is an automated message from the

Re: [PR] [HUDI-6961] Fix deletes with custom delete field in DefaultHoodieRecordPayload [hudi]

2023-10-19 Thread via GitHub
nsivabalan commented on code in PR #9892: URL: https://github.com/apache/hudi/pull/9892#discussion_r1366218260 ## hudi-common/src/main/java/org/apache/hudi/common/model/BaseAvroPayload.java: ## @@ -83,7 +89,7 @@ public boolean canProduceSentinel() { * @param genericRecord

Re: [PR] [DOCS] Remove /next reference from already published doc versions [hudi]

2023-10-19 Thread via GitHub
nsivabalan commented on PR #9893: URL: https://github.com/apache/hudi/pull/9893#issuecomment-1771814016 is this tested and the links are good? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] Row writer optimization for bulk insert [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9852: URL: https://github.com/apache/hudi/pull/9852#issuecomment-1771807983 ## CI report: * 00714535a8643da2ff4fadf9d4044f9db6671d2d Azure:

[PR] [DOCS] Remove /next reference from already published doc versions [hudi]

2023-10-19 Thread via GitHub
bhasudha opened a new pull request, #9893: URL: https://github.com/apache/hudi/pull/9893 ### Change Logs Docs clean up. No new changes. Remove all references to /docs/next/... and replace with /docs/... for links sanity. Otherwise some older versions were pointing to /docs/next/...

Re: [PR] Row writer optimization for bulk insert [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9852: URL: https://github.com/apache/hudi/pull/9852#issuecomment-1771801002 ## CI report: * 00714535a8643da2ff4fadf9d4044f9db6671d2d Azure:

Re: [PR] [HUDI-6872] Simplify Out Of Box Schema Evolution Functionality [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9743: URL: https://github.com/apache/hudi/pull/9743#issuecomment-1771800794 ## CI report: * 097ef6176650413eef2a4c3581ca6e48ea43788f UNKNOWN * e32b58f7ce1880568566be0c8a6940ae2f3a1016 UNKNOWN * 0fe4d74eb04601d878a44c6d8892168e1e321d1a Azure:

Re: [PR] [HUDI-6961] Fix deletes with custom delete field in DefaultHoodieRecordPayload [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9892: URL: https://github.com/apache/hudi/pull/9892#issuecomment-1771793685 ## CI report: * d2fd9c42f3994b5b5ba49e9c160f91add1f4aa96 Azure:

Re: [PR] [HUDI-6872] Simplify Out Of Box Schema Evolution Functionality [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9743: URL: https://github.com/apache/hudi/pull/9743#issuecomment-1771793484 ## CI report: * 097ef6176650413eef2a4c3581ca6e48ea43788f UNKNOWN * e32b58f7ce1880568566be0c8a6940ae2f3a1016 UNKNOWN * d3697955819bc90d5a136077472729ed1e3b57c9 Azure:

Re: [I] [SUPPORT] Facing java.util.NoSuchElementException on EMR 6.12 (Hudi 0.13) with inline compaction and cleaning on MoR tables [hudi]

2023-10-19 Thread via GitHub
arunvasudevan commented on issue #9861: URL: https://github.com/apache/hudi/issues/9861#issuecomment-1771769666 @ad1happy2go Let me know if you need any more info. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] [HUDI-6961] Fix deletes with custom delete field in DefaultHoodieRecordPayload [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9892: URL: https://github.com/apache/hudi/pull/9892#issuecomment-1771751288 ## CI report: * d2fd9c42f3994b5b5ba49e9c160f91add1f4aa96 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[jira] [Updated] (HUDI-6961) Deletes with custom delete field not working with DefaultHoodieRecordPayload

2023-10-19 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6961: - Labels: pull-request-available (was: ) > Deletes with custom delete field not working with

[PR] [HUDI-6961] Fix deletes with custom delete field in DefaultHoodieRecordPayload [hudi]

2023-10-19 Thread via GitHub
yihua opened a new pull request, #9892: URL: https://github.com/apache/hudi/pull/9892 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performance

Re: [PR] Row writer optimization for bulk insert [hudi]

2023-10-19 Thread via GitHub
lokesh-lingarajan-0310 commented on code in PR #9852: URL: https://github.com/apache/hudi/pull/9852#discussion_r1366111650 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java: ## @@ -316,13 +314,15 @@ public StreamSync(HoodieStreamer.Config cfg,

Re: [PR] Row writer optimization for bulk insert [hudi]

2023-10-19 Thread via GitHub
lokesh-lingarajan-0310 commented on code in PR #9852: URL: https://github.com/apache/hudi/pull/9852#discussion_r1366100184 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java: ## @@ -597,16 +578,56 @@ private Pair>> fetchFromSourc

Re: [PR] Row writer optimization for bulk insert [hudi]

2023-10-19 Thread via GitHub
nsivabalan commented on code in PR #9852: URL: https://github.com/apache/hudi/pull/9852#discussion_r1365976614 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java: ## @@ -316,13 +314,15 @@ public StreamSync(HoodieStreamer.Config cfg, SparkSession

[I] [SUPPORT] KryoException while using spark-hudi [hudi]

2023-10-19 Thread via GitHub
akshayakp97 opened a new issue, #9891: URL: https://github.com/apache/hudi/issues/9891 **Describe the problem you faced** Running into below error while using emr 6-10.1. Want to understand if this was fixed by newer Hudi versions. Found a similar issue for Flink -

Re: [PR] [HUDI-6960] Support read partition values from path when schema evolution enabled [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9889: URL: https://github.com/apache/hudi/pull/9889#issuecomment-1771591577 ## CI report: * b1eb4c3b0ce0a167318cf9e863ec4223233cd556 Azure:

Re: [PR] [DOCS]Improve videos page with tagging capability [hudi]

2023-10-19 Thread via GitHub
bhasudha merged PR #9882: URL: https://github.com/apache/hudi/pull/9882 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [PR] Row writer optimization for bulk insert [hudi]

2023-10-19 Thread via GitHub
nsivabalan commented on PR #9852: URL: https://github.com/apache/hudi/pull/9852#issuecomment-1771533852 Can you add PR description w/ sufficient details -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [PR] [HUDI-2461] Support out of order commits in MDT with completion time view [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9871: URL: https://github.com/apache/hudi/pull/9871#issuecomment-1771511294 ## CI report: * f0a1258092388ff7d2ac67b8de7180be25a2137e Azure:

[jira] [Updated] (HUDI-6961) Deletes with custom delete field not working with DefaultHoodieRecordPayload

2023-10-19 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6961: Description: When configuring custom delete key and delete marker with DefaultHoodieRecordPayload, writing

[jira] [Updated] (HUDI-6961) Deletes with custom delete field not working with DefaultHoodieRecordPayload

2023-10-19 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6961: Description: When configuring custom delete key and delete marker with  {code:java} Error for key:HoodieKey

[jira] [Created] (HUDI-6961) Deletes with custom delete field not working with DefaultHoodieRecordPayload

2023-10-19 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-6961: --- Summary: Deletes with custom delete field not working with DefaultHoodieRecordPayload Key: HUDI-6961 URL: https://issues.apache.org/jira/browse/HUDI-6961 Project: Apache Hudi

[jira] [Updated] (HUDI-6961) Deletes with custom delete field not working with DefaultHoodieRecordPayload

2023-10-19 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6961: Fix Version/s: 0.14.1 > Deletes with custom delete field not working with DefaultHoodieRecordPayload >

[jira] [Updated] (HUDI-6961) Deletes with custom delete field not working with DefaultHoodieRecordPayload

2023-10-19 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6961: Affects Version/s: 0.14.0 > Deletes with custom delete field not working with DefaultHoodieRecordPayload >

[jira] [Assigned] (HUDI-6961) Deletes with custom delete field not working with DefaultHoodieRecordPayload

2023-10-19 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo reassigned HUDI-6961: --- Assignee: Ethan Guo > Deletes with custom delete field not working with DefaultHoodieRecordPayload >

[jira] [Updated] (HUDI-6961) Deletes with custom delete field not working with DefaultHoodieRecordPayload

2023-10-19 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6961: Story Points: 2 > Deletes with custom delete field not working with DefaultHoodieRecordPayload >

Re: [I] [SUPPORT] Compaction error [hudi]

2023-10-19 Thread via GitHub
fearlsgroove commented on issue #9885: URL: https://github.com/apache/hudi/issues/9885#issuecomment-1771473502 Yea I was thinking that as well but I don't see how it could be the case. We did evolve the schema in the source topic recently, but the only decimal field in the data did not

[jira] [Updated] (HUDI-6722) Performance and API improvement on record merging

2023-10-19 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6722: Priority: Blocker (was: Major) > Performance and API improvement on record merging >

[I] [SUPPORT] PySpark reading hudi partition column for hudi table incorrectly [hudi]

2023-10-19 Thread via GitHub
bradleybonitatibus opened a new issue, #9890: URL: https://github.com/apache/hudi/issues/9890 **_Tips before filing an issue_** - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? - Join the mailing list to engage in conversations and get faster support at

Re: [PR] [HUDI-6872] Simplify Out Of Box Schema Evolution Functionality [hudi]

2023-10-19 Thread via GitHub
lokesh-lingarajan-0310 commented on code in PR #9743: URL: https://github.com/apache/hudi/pull/9743#discussion_r1365855918 ## hudi-common/src/main/java/org/apache/hudi/internal/schema/utils/AvroSchemaEvolutionUtils.java: ## @@ -111,17 +111,21 @@ public static InternalSchema

Re: [PR] [HUDI-6960] Support read partition values from path when schema evolution enabled [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9889: URL: https://github.com/apache/hudi/pull/9889#issuecomment-1771376027 ## CI report: * b1eb4c3b0ce0a167318cf9e863ec4223233cd556 Azure:

Re: [PR] [HUDI-2461] Support out of order commits in MDT with completion time view [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9871: URL: https://github.com/apache/hudi/pull/9871#issuecomment-1771375858 ## CI report: * 6c99a1f414f82b85331fd1b256325641b0417f2e Azure:

Re: [PR] [HUDI-6960] Support read partition values from path when schema evolution enabled [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9889: URL: https://github.com/apache/hudi/pull/9889#issuecomment-1771361230 ## CI report: * b1eb4c3b0ce0a167318cf9e863ec4223233cd556 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

Re: [PR] [HUDI-6952] Skip reading the uncommitted log files for log reader [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9879: URL: https://github.com/apache/hudi/pull/9879#issuecomment-1771361109 ## CI report: * 94301a15c9f75355c0ebf5bab3baf6226820ac42 UNKNOWN * 881a74b6a3f85f2014e7eb211fe2a9041be418ed Azure:

Re: [PR] [HUDI-2461] Support out of order commits in MDT with completion time view [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9871: URL: https://github.com/apache/hudi/pull/9871#issuecomment-1771360969 ## CI report: * 6c99a1f414f82b85331fd1b256325641b0417f2e Azure:

Re: [PR] [HUDI-6872] Simplify Out Of Box Schema Evolution Functionality [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9743: URL: https://github.com/apache/hudi/pull/9743#issuecomment-1771360597 ## CI report: * 097ef6176650413eef2a4c3581ca6e48ea43788f UNKNOWN * e32b58f7ce1880568566be0c8a6940ae2f3a1016 UNKNOWN * d3697955819bc90d5a136077472729ed1e3b57c9 Azure:

Re: [PR] [HUDI-6790] Support incremental queries using HoodieFileGroupReader [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9888: URL: https://github.com/apache/hudi/pull/9888#issuecomment-1771346556 ## CI report: * 512aabead021aed3817215d1c6aecf567cd0a575 Azure:

Re: [PR] [HUDI-6952] Skip reading the uncommitted log files for log reader [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9879: URL: https://github.com/apache/hudi/pull/9879#issuecomment-1771346450 ## CI report: * 94301a15c9f75355c0ebf5bab3baf6226820ac42 UNKNOWN * bca8fc00528c9e2b789fd235847d00bb29d09618 Azure:

Re: [PR] [HUDI-2461] Support out of order commits in MDT with completion time view [hudi]

2023-10-19 Thread via GitHub
codope commented on code in PR #9871: URL: https://github.com/apache/hudi/pull/9871#discussion_r1365814203 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java: ## @@ -1038,7 +1038,7 @@ protected void validateRollback(

[jira] [Updated] (HUDI-6960) Support read partition values from path when schema evolution enabled

2023-10-19 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6960: - Labels: pull-request-available (was: ) > Support read partition values from path when schema

[PR] [HUDI-6960] Support read partition values from path when schema evolution enabled [hudi]

2023-10-19 Thread via GitHub
wecharyu opened a new pull request, #9889: URL: https://github.com/apache/hudi/pull/9889 ### Change Logs 1. val `shouldExtractPartitionValuesFromPartitionPath` in class `BaseFileOnlyRelation` get parent value if schema evolution enabled. 2. `toHadoopFsRelation()` should always read

[jira] [Created] (HUDI-6960) Support read partition values from path when schema evolution enabled

2023-10-19 Thread Wechar (Jira)
Wechar created HUDI-6960: Summary: Support read partition values from path when schema evolution enabled Key: HUDI-6960 URL: https://issues.apache.org/jira/browse/HUDI-6960 Project: Apache Hudi

Re: [PR] [HUDI-6952] Skip reading the uncommitted log files for log reader [hudi]

2023-10-19 Thread via GitHub
hudi-bot commented on PR #9879: URL: https://github.com/apache/hudi/pull/9879#issuecomment-1771232182 ## CI report: * 94301a15c9f75355c0ebf5bab3baf6226820ac42 UNKNOWN * bca8fc00528c9e2b789fd235847d00bb29d09618 Azure:
