Re: [PR] [HUDI-6757] Fix compaction execution terminated in async threads in flink bounded… [hudi]

2024-04-01 Thread via GitHub


flashJd commented on PR #9544:
URL: https://github.com/apache/hudi/pull/9544#issuecomment-2031185874

   > yeah, @flashJd is there any possibility you can rebase with the latest 
master?
   
   @danny0405 you can collaborate; I haven't paid attention to Hudi for several months.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6854] Change default payload type to HOODIE_AVRO_DEFAULT [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10949:
URL: https://github.com/apache/hudi/pull/10949#issuecomment-2031142489

   
   ## CI report:
   
   * c344e38bfcfea10fb1556a4d335af1b5b92da6ee Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23077)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-6806] Support Spark 3.5.0 [hudi]

2024-04-01 Thread via GitHub


melin commented on PR #9717:
URL: https://github.com/apache/hudi/pull/9717#issuecomment-2031122042

   > The 0.15.0 release branch is planned to be cut this month once we verify 
engine integrations.
   
   When will it be released?





Re: [PR] [HUDI-6854] Change default payload type to HOODIE_AVRO_DEFAULT [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10949:
URL: https://github.com/apache/hudi/pull/10949#issuecomment-2031095571

   
   ## CI report:
   
   * 1984e34cf984ca5088cd921e26cd3d74421afb03 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23070)
 
   * c344e38bfcfea10fb1556a4d335af1b5b92da6ee Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23077)
 
   
   



Re: [PR] [HUDI-6854] Change default payload type to HOODIE_AVRO_DEFAULT [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10949:
URL: https://github.com/apache/hudi/pull/10949#issuecomment-2031089981

   
   ## CI report:
   
   * 1984e34cf984ca5088cd921e26cd3d74421afb03 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23070)
 
   * c344e38bfcfea10fb1556a4d335af1b5b92da6ee UNKNOWN
   
   



Re: [PR] [HUDI-7522] Support find out the conflict instants in bucket partition when bucket id multiple [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10898:
URL: https://github.com/apache/hudi/pull/10898#issuecomment-2031039502

   
   ## CI report:
   
   * e9fc630d3a8999c7ef0db7bd94da910b1f77df7d UNKNOWN
   * b7011691a07deb288ce0341dcd55bb6feeb4101d UNKNOWN
   * 208249c7f8164e434a8760d64678ab86295a26fc Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23076)
 
   
   



Re: [PR] [HUDI-7522] Support find out the conflict instants in bucket partition when bucket id multiple [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10898:
URL: https://github.com/apache/hudi/pull/10898#issuecomment-2031034219

   
   ## CI report:
   
   * e9fc630d3a8999c7ef0db7bd94da910b1f77df7d UNKNOWN
   * b7011691a07deb288ce0341dcd55bb6feeb4101d UNKNOWN
   * 029a6466f51d1ad0103521c45639aaf2e47240c9 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23075)
 
   * 208249c7f8164e434a8760d64678ab86295a26fc UNKNOWN
   
   



Re: [PR] [HUDI-7522] Support find out the conflict instants in bucket partition when bucket id multiple [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10898:
URL: https://github.com/apache/hudi/pull/10898#issuecomment-2031028647

   
   ## CI report:
   
   * e9fc630d3a8999c7ef0db7bd94da910b1f77df7d UNKNOWN
   * b7011691a07deb288ce0341dcd55bb6feeb4101d UNKNOWN
   * 029a6466f51d1ad0103521c45639aaf2e47240c9 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23075)
 
   
   



Re: [PR] [HUDI-7522] Support find out the conflict instants in bucket partition when bucket id multiple [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10898:
URL: https://github.com/apache/hudi/pull/10898#issuecomment-2030997720

   
   ## CI report:
   
   * e9fc630d3a8999c7ef0db7bd94da910b1f77df7d UNKNOWN
   * b7011691a07deb288ce0341dcd55bb6feeb4101d UNKNOWN
   * 3dc06097b480a32194508bb1d1edd6f4806feeec Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23074)
 
   * 029a6466f51d1ad0103521c45639aaf2e47240c9 UNKNOWN
   
   



[jira] [Resolved] (HUDI-4699) Primary key-less data model

2024-04-01 Thread sivabalan narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan resolved HUDI-4699.
---

> Primary key-less data model
> ---
>
> Key: HUDI-4699
> URL: https://issues.apache.org/jira/browse/HUDI-4699
> Project: Apache Hudi
>  Issue Type: Epic
>  Components: writer-core
>Reporter: Sagar Sumit
>Priority: Major
>  Labels: pull-request-available
>
> Hudi requires users to specify a primary key field. Can we do away with this 
> requirement? This epic tracks the work to support use cases which do not 
> require primary key based data modelling.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (HUDI-4699) Primary key-less data model

2024-04-01 Thread sivabalan narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan closed HUDI-4699.
-
Fix Version/s: 0.14.0
   Resolution: Fixed

> Primary key-less data model
> ---
>
> Key: HUDI-4699
> URL: https://issues.apache.org/jira/browse/HUDI-4699
> Project: Apache Hudi
>  Issue Type: Epic
>  Components: writer-core
>Reporter: Sagar Sumit
>Assignee: sivabalan narayanan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.14.0
>
>
> Hudi requires users to specify a primary key field. Can we do away with this 
> requirement? This epic tracks the work to support use cases which do not 
> require primary key based data modelling.





[jira] [Reopened] (HUDI-4699) Primary key-less data model

2024-04-01 Thread sivabalan narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan reopened HUDI-4699:
---
Assignee: sivabalan narayanan

> Primary key-less data model
> ---
>
> Key: HUDI-4699
> URL: https://issues.apache.org/jira/browse/HUDI-4699
> Project: Apache Hudi
>  Issue Type: Epic
>  Components: writer-core
>Reporter: Sagar Sumit
>Assignee: sivabalan narayanan
>Priority: Major
>  Labels: pull-request-available
>
> Hudi requires users to specify a primary key field. Can we do away with this 
> requirement? This epic tracks the work to support use cases which do not 
> require primary key based data modelling.





Re: [I] [SUGGEST] Can the community version be updated regularly and faster? The roadmap should also be updated regularly and synchronized. [hudi]

2024-04-01 Thread via GitHub


danny0405 commented on issue #10944:
URL: https://github.com/apache/hudi/issues/10944#issuecomment-2030996714

   Thanks for the note. We are working hard to prepare a GA release for 
1.0 and want it to be in good shape, which is why the waiting period is 
long compared to other releases.
   
   Will update the roadmap soon, thanks for the reminder again.





Re: [PR] [MINOR] Handle cases of malformed records when converting to json [hudi]

2024-04-01 Thread via GitHub


danny0405 commented on code in PR #10943:
URL: https://github.com/apache/hudi/pull/10943#discussion_r1547081375


##
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##
@@ -209,10 +210,18 @@ public static byte[] avroToJson(GenericRecord record, 
boolean pretty) throws IOE
   private static ByteArrayOutputStream avroToJsonHelper(GenericRecord record, 
boolean pretty) throws IOException {
 DatumWriter writer = new GenericDatumWriter<>(record.getSchema());
 ByteArrayOutputStream out = new ByteArrayOutputStream();
-JsonEncoder jsonEncoder = 
EncoderFactory.get().jsonEncoder(record.getSchema(), out, pretty);
-writer.write(record, jsonEncoder);
-jsonEncoder.flush();
-return out;
+try {
+  JsonEncoder jsonEncoder = 
EncoderFactory.get().jsonEncoder(record.getSchema(), out, pretty);
+  writer.write(record, jsonEncoder);
+  jsonEncoder.flush();
+  return out;
+} catch (ClassCastException | NullPointerException ex) {
+  // NullPointerException will be thrown in cases where the field values 
are missing
+  // ClassCastException will be thrown in cases where the field values do 
not match the schema type
+  // Fallback to using `toString` which also returns json but without a 
pretty-print option
+  out.write(record.toString().getBytes(StandardCharsets.UTF_8));

Review Comment:
   > I think I've convinced myself there should just be a new method like 
"safeToJson" that does not throw an exception that we use in the error 
table/writer cases since those are not as critical to Hudi.
   
   +1
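
The "safeToJson" idea quoted above could be sketched roughly as follows. This is a minimal illustration of the pattern only, not the code in the PR: `SafeJson`, `toJson`, and the `strictEncoder` parameter are hypothetical names, and a plain `Function` stands in for the Avro `JsonEncoder` path so that the example runs without Avro on the classpath.

```java
import java.util.function.Function;

/** Hypothetical helper; the names here are illustrative and not part of Hudi. */
final class SafeJson {
  private SafeJson() {}

  /** Strict conversion: any exception from the encoder reaches the caller. */
  static <T> String toJson(T record, Function<T, String> strictEncoder) {
    return strictEncoder.apply(record);
  }

  /**
   * Lenient conversion for non-critical paths such as an error table:
   * on NullPointerException or ClassCastException, fall back to toString().
   */
  static <T> String safeToJson(T record, Function<T, String> strictEncoder) {
    try {
      return strictEncoder.apply(record);
    } catch (ClassCastException | NullPointerException ex) {
      // Assumption: the record's toString() is an acceptable best-effort rendering.
      return String.valueOf(record);
    }
  }

  public static void main(String[] args) {
    // Stand-in for an encoder that hits a missing field value.
    Function<Object, String> failing = r -> {
      throw new NullPointerException("missing field value");
    };
    System.out.println(safeToJson("malformed-record", failing));
  }
}
```

Keeping the strict and lenient variants as separate methods leaves existing callers failing fast, while only the error-table path opts in to the fallback.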






Re: [PR] [HUDI-7522] Support find out the conflict instants in bucket partition when bucket id multiple [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10898:
URL: https://github.com/apache/hudi/pull/10898#issuecomment-2030992946

   
   ## CI report:
   
   * e9fc630d3a8999c7ef0db7bd94da910b1f77df7d UNKNOWN
   * b7011691a07deb288ce0341dcd55bb6feeb4101d UNKNOWN
   * 627ddbeabf3e1886f64f1432499003f39ddba49c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23073)
 
   * 3dc06097b480a32194508bb1d1edd6f4806feeec UNKNOWN
   
   



Re: [PR] [HUDI-7522] Support find out the conflict instants in bucket partition when bucket id multiple [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10898:
URL: https://github.com/apache/hudi/pull/10898#issuecomment-2030985254

   
   ## CI report:
   
   * e9fc630d3a8999c7ef0db7bd94da910b1f77df7d UNKNOWN
   * b7011691a07deb288ce0341dcd55bb6feeb4101d UNKNOWN
   * 627ddbeabf3e1886f64f1432499003f39ddba49c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23073)
 
   
   



[jira] [Updated] (HUDI-7559) Fix functional index (on column stats): Handle NPE in filterQueriesWithRecordKey(...)

2024-04-01 Thread Vinaykumar Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinaykumar Bhat updated HUDI-7559:
--
Description: `RecordLevelIndexSupport::filterQueryWithRecordKey(...)` 
throws NPE which is then subsequently ignored by 
`lookupCandidateFilesInMetadataTable()` rendering every other index (like 
FunctionalIndex, ColStat Index) to not be used for data skipping (i.e pruning 
files)  (was: `RecordLevelIndexSupport::filterQueryWithRecordKey(...)` throws 
NPE which is then subsequently `lookupCandidateFilesInMetadataTable()` 
rendering every other index (like FunctionalIndex, ColStat Index) to not be 
used for data skipping (i.e pruning files))
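
The failure mode in this description can be illustrated with a toy model. This is only a sketch under assumptions: none of the names below are Hudi APIs, the real control flow in `lookupCandidateFilesInMetadataTable()` may differ, and per-index isolation is shown purely for contrast; the ticket itself is about handling the NPE at its source.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

/** Toy model of chained index lookups; none of these names are Hudi APIs. */
final class IndexLookupSketch {
  private IndexLookupSketch() {}

  /** One try/catch around the whole chain: the first exception abandons every later index. */
  static Optional<List<String>> lookupAllOrNothing(List<Supplier<Optional<List<String>>>> indexes) {
    try {
      for (Supplier<Optional<List<String>>> index : indexes) {
        Optional<List<String>> candidates = index.get();
        if (candidates.isPresent()) {
          return candidates;
        }
      }
    } catch (RuntimeException e) {
      // Swallowed: the caller sees "no candidates" and cannot prune any files.
    }
    return Optional.empty();
  }

  /** Failure contained per index: a broken index is skipped and the next one is tried. */
  static Optional<List<String>> lookupIsolated(List<Supplier<Optional<List<String>>>> indexes) {
    for (Supplier<Optional<List<String>>> index : indexes) {
      try {
        Optional<List<String>> candidates = index.get();
        if (candidates.isPresent()) {
          return candidates;
        }
      } catch (RuntimeException e) {
        // Skip only the failing index.
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    Supplier<Optional<List<String>>> recordKeyIndex = () -> {
      throw new NullPointerException("record key filter");
    };
    Supplier<Optional<List<String>>> columnStatsIndex = () -> Optional.of(List.of("file-1.parquet"));
    List<Supplier<Optional<List<String>>>> indexes = List.of(recordKeyIndex, columnStatsIndex);

    System.out.println(lookupAllOrNothing(indexes).isPresent());
    System.out.println(lookupIsolated(indexes).isPresent());
  }
}
```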

> Fix functional index (on column stats): Handle NPE in 
> filterQueriesWithRecordKey(...)
> -
>
> Key: HUDI-7559
> URL: https://issues.apache.org/jira/browse/HUDI-7559
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Vinaykumar Bhat
>Assignee: Vinaykumar Bhat
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>
> `RecordLevelIndexSupport::filterQueryWithRecordKey(...)` throws NPE which is 
> then subsequently ignored by `lookupCandidateFilesInMetadataTable()` 
> rendering every other index (like FunctionalIndex, ColStat Index) to not be 
> used for data skipping (i.e pruning files)





Re: [PR] [HUDI-7522] Support find out the conflict instants in bucket partition when bucket id multiple [hudi]

2024-04-01 Thread via GitHub


xuzifu666 closed pull request #10898: [HUDI-7522] Support find out the conflict 
instants in bucket partition when bucket id multiple
URL: https://github.com/apache/hudi/pull/10898





Re: [PR] [HUDI-7522] Support find out the conflict instants in bucket partition when bucket id multiple [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10898:
URL: https://github.com/apache/hudi/pull/10898#issuecomment-2030954340

   
   ## CI report:
   
   * e9fc630d3a8999c7ef0db7bd94da910b1f77df7d UNKNOWN
   * b7011691a07deb288ce0341dcd55bb6feeb4101d UNKNOWN
   * de9c573008c76367234cd859ca80ee165556e954 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23072)
 
   * 627ddbeabf3e1886f64f1432499003f39ddba49c UNKNOWN
   
   



Re: [PR] [HUDI-7552] Remove the suffix for MDT table service instants [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10945:
URL: https://github.com/apache/hudi/pull/10945#issuecomment-2030948488

   
   ## CI report:
   
   * 6c3830bb4de1887f41aebc139b3fc837e446ead5 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23071)
 
   
   



Re: [PR] [HUDI-7522] Support find out the conflict instants in bucket partition when bucket id multiple [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10898:
URL: https://github.com/apache/hudi/pull/10898#issuecomment-2030948345

   
   ## CI report:
   
   * e9fc630d3a8999c7ef0db7bd94da910b1f77df7d UNKNOWN
   * b7011691a07deb288ce0341dcd55bb6feeb4101d UNKNOWN
   * bc4fe83062daefe310b394a0d9b698a8c950c068 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23049)
 
   * de9c573008c76367234cd859ca80ee165556e954 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23072)
 
   
   



Re: [PR] [HUDI-7522] Support find out the conflict instants in bucket partition when bucket id multiple [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10898:
URL: https://github.com/apache/hudi/pull/10898#issuecomment-2030941589

   
   ## CI report:
   
   * e9fc630d3a8999c7ef0db7bd94da910b1f77df7d UNKNOWN
   * b7011691a07deb288ce0341dcd55bb6feeb4101d UNKNOWN
   * bc4fe83062daefe310b394a0d9b698a8c950c068 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23049)
 
   * de9c573008c76367234cd859ca80ee165556e954 UNKNOWN
   
   



[jira] [Commented] (HUDI-7559) Fix functional index (on column stats): Handle NPE in filterQueriesWithRecordKey(...)

2024-04-01 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-7559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17832988#comment-17832988
 ] 

Vinoth Chandar commented on HUDI-7559:
--

[~codope] How's this different from what we tested for beta1?

 

> Fix functional index (on column stats): Handle NPE in 
> filterQueriesWithRecordKey(...)
> -
>
> Key: HUDI-7559
> URL: https://issues.apache.org/jira/browse/HUDI-7559
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Vinaykumar Bhat
>Assignee: Vinaykumar Bhat
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>
> `RecordLevelIndexSupport::filterQueryWithRecordKey(...)` throws NPE which is 
> then subsequently `lookupCandidateFilesInMetadataTable()` rendering every 
> other index (like FunctionalIndex, ColStat Index) to not be used for data 
> skipping (i.e pruning files)





[jira] [Updated] (HUDI-7559) Fix functional index (on column stats): Handle NPE in filterQueriesWithRecordKey(...)

2024-04-01 Thread Vinoth Chandar (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar updated HUDI-7559:
-
Sprint: Sprint 2024-03-25

> Fix functional index (on column stats): Handle NPE in 
> filterQueriesWithRecordKey(...)
> -
>
> Key: HUDI-7559
> URL: https://issues.apache.org/jira/browse/HUDI-7559
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Vinaykumar Bhat
>Assignee: Vinaykumar Bhat
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>
> `RecordLevelIndexSupport::filterQueryWithRecordKey(...)` throws NPE which is 
> then subsequently `lookupCandidateFilesInMetadataTable()` rendering every 
> other index (like FunctionalIndex, ColStat Index) to not be used for data 
> skipping (i.e pruning files)





Re: [PR] [HUDI-7522] Support find out the conflict instants in bucket partition when bucket id multiple [hudi]

2024-04-01 Thread via GitHub


xuzifu666 commented on PR #10898:
URL: https://github.com/apache/hudi/pull/10898#issuecomment-2030921118

   > @xuzifu666 @danny0405 @beyond1920 I think we should solve the root cause of bucket duplication. There are currently three situations where bucket file duplication occurs:
   > 
   > 1. Spark speculative execution. Turning off speculative execution solves this problem.
   > 2. The Hoodie archiver deleting the completed timeline in parallel. 1.0 has solved this problem.
   > 3. Concurrent insert overwrite by multiple Spark writers. This is a bug that needs to be fixed.
   > 
   > Now focus on scenario 3, concurrent insert overwrite by multiple Spark writers. When Hudi builds a file slice, it calls isFileSliceCommitted to determine whether the current file is committed.
   > 
   > ```
   >   /**
   >    * A FileSlice is considered committed, if one of the following is true - There is a committed data file - There are
   >    * some log files, that are based off a commit or delta commit.
   >    */
   >   private boolean isFileSliceCommitted(FileSlice slice) {
   >     if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
   >       return false;
   >     }
   > 
   >     return timeline.containsOrBeforeTimelineStarts(slice.getBaseInstantTime());
   >   }
   > ```
   > 
   > This is fine for the single-writer scenario, but with multiple writers the logic of isFileSliceCommitted has a bug. If a file has a commit time smaller than the smallest completed commit, Hudi directly determines that the file is committed, even if it is a garbage file (a file generated by a failed write).
   > 
   > For example, two Spark apps insert overwrite the same partition of a Hudi BUCKET table:
   > 
   > - app1 starts its write commit at 0001 and writes file 0--uuid1.parquet
   > - app2 starts its write commit at 0002 and writes file 0--uuid2.parquet
   > 
   > app1 may fail to write due to OCC, cancellation, or OOM, but 0--uuid1.parquet is already written. When Hudi builds the file slice, 0--uuid1.parquet is considered committed because its commit time 0001 < the smallest completed commit 0002. This is wrong: commit time 0001 is not committed. Maybe we can modify isFileSliceCommitted like this:
   > 
   > ```
   >   private boolean isFileSliceCommitted(FileSlice slice) {
   >     if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
   >       return false;
   >     }
   > 
   >     return timeline.containsOrBeforeTimelineStarts(slice.getBaseInstantTime())
   >         && UncompleteTimelineNotContains(slice.getBaseInstantTime());
   >   }
   > ```
   > 
   > Finally, I think Hudi's file slices should be managed uniformly, as in Iceberg and Delta Lake, rather than being obtained through list operations.
   
   Thanks for your advice. I tested it in multiple-writer scenarios and it works as expected, @xiarixiaoyao.
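
The two-writer scenario in the quoted comment can be reproduced with a toy model. This is a sketch under assumptions, not Hudi code: `ToyTimeline` and its methods are hypothetical stand-ins for the timeline, instants are compared as plain strings, and the failed instant 0001 is assumed to still be recorded as pending.

```java
import java.util.Set;
import java.util.TreeSet;

/** Toy stand-in for a timeline; not Hudi's real classes. */
final class ToyTimeline {
  private final TreeSet<String> completed = new TreeSet<>();
  private final Set<String> pending = new TreeSet<>();

  ToyTimeline complete(String instant) {
    completed.add(instant);
    return this;
  }

  ToyTimeline pending(String instant) {
    pending.add(instant);
    return this;
  }

  /** True if the instant is completed, or is older than the first completed instant. */
  boolean containsOrBeforeTimelineStarts(String instant) {
    return completed.contains(instant)
        || (!completed.isEmpty() && instant.compareTo(completed.first()) < 0);
  }

  /** Mirrors the quoted check: not later than the last instant, and contained or before the start. */
  boolean isCommittedCurrent(String baseInstant) {
    boolean notAfterLast = !completed.isEmpty() && baseInstant.compareTo(completed.last()) <= 0;
    return notAfterLast && containsOrBeforeTimelineStarts(baseInstant);
  }

  /** Mirrors the proposed change: additionally reject instants known to be incomplete. */
  boolean isCommittedProposed(String baseInstant) {
    return isCommittedCurrent(baseInstant) && !pending.contains(baseInstant);
  }

  public static void main(String[] args) {
    // app1 started at 0001 and failed; app2 started at 0002 and completed.
    ToyTimeline timeline = new ToyTimeline().pending("0001").complete("0002");
    System.out.println("current check, 0001:  " + timeline.isCommittedCurrent("0001"));
    System.out.println("proposed check, 0001: " + timeline.isCommittedProposed("0001"));
    System.out.println("proposed check, 0002: " + timeline.isCommittedProposed("0002"));
  }
}
```

In this model the current check accepts the file from the failed instant because 0001 sorts before the first completed instant, while the extra pending lookup rejects it and still accepts 0002.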





Re: [PR] [HUDI-7552] Remove the suffix for MDT table service instants [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10945:
URL: https://github.com/apache/hudi/pull/10945#issuecomment-2030897587

   
   ## CI report:
   
   * bf8eba5011f8ff4762e4da92aa57057873bafeab Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23063)
 
   * 6c3830bb4de1887f41aebc139b3fc837e446ead5 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23071)
 
   
   



Re: [PR] [HUDI-7552] Remove the suffix for MDT table service instants [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10945:
URL: https://github.com/apache/hudi/pull/10945#issuecomment-2030892213

   
   ## CI report:
   
   * bf8eba5011f8ff4762e4da92aa57057873bafeab Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23063)
 
   * 6c3830bb4de1887f41aebc139b3fc837e446ead5 UNKNOWN
   
   



[jira] [Commented] (HUDI-1455) Hudi integration with project nessie

2024-04-01 Thread Wenrui Meng (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17832970#comment-17832970
 ] 

Wenrui Meng commented on HUDI-1455:
---

Is there any plan for this issue? 

> Hudi integration with project nessie
> 
>
> Key: HUDI-1455
> URL: https://issues.apache.org/jira/browse/HUDI-1455
> Project: Apache Hudi
>  Issue Type: New Feature
>Reporter: Vinoth Chandar
>Priority: Major
>
> [https://github.com/apache/hudi/issues/2330#issuecomment-743423398] 
> Follow up from this. 





Re: [PR] [HUDI-7466] Add tests to AWSGlueCatalogSyncClient [hudi]

2024-04-01 Thread via GitHub


parisni commented on PR #10897:
URL: https://github.com/apache/hudi/pull/10897#issuecomment-2030589638

   > Can we make repairing tests a separate effort?
   
   makes sense. thanks for your insight





[PR] [DOCS] Update roadmap [hudi]

2024-04-01 Thread via GitHub


xushiyan opened a new pull request, #10950:
URL: https://github.com/apache/hudi/pull/10950

   Update roadmap.





Re: [PR] [HUDI-7235] Fix checkpoint bug for S3/GCS Incremental Source [hudi]

2024-04-01 Thread via GitHub


bvaradar commented on PR #10336:
URL: https://github.com/apache/hudi/pull/10336#issuecomment-2030335892

   @vinishjail97: Can you address these comments and land it?





Re: [PR] [MINOR] use Temurin jdk [hudi]

2024-04-01 Thread via GitHub


bvaradar commented on PR #10948:
URL: https://github.com/apache/hudi/pull/10948#issuecomment-2030325960

   Will land once the CI tests succeed.





[jira] [Updated] (HUDI-3431) Certify Hudi against Spark3 Hive3 Hadoop3

2024-04-01 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-3431:
-
Fix Version/s: 0.15.0

> Certify Hudi against Spark3 Hive3 Hadoop3
> -
>
> Key: HUDI-3431
> URL: https://issues.apache.org/jira/browse/HUDI-3431
> Project: Apache Hudi
>  Issue Type: Epic
>  Components: dependencies
>Reporter: Raymond Xu
>Assignee: Rahil Chertara
>Priority: Blocker
> Fix For: 0.15.0
>
>






Re: [PR] [MINOR] Handle cases of malformed records when converting to json [hudi]

2024-04-01 Thread via GitHub


the-other-tim-brown commented on code in PR #10943:
URL: https://github.com/apache/hudi/pull/10943#discussion_r1546657481


##
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##
@@ -209,10 +210,18 @@ public static byte[] avroToJson(GenericRecord record, 
boolean pretty) throws IOE
   private static ByteArrayOutputStream avroToJsonHelper(GenericRecord record, 
boolean pretty) throws IOException {
 DatumWriter writer = new GenericDatumWriter<>(record.getSchema());
 ByteArrayOutputStream out = new ByteArrayOutputStream();
-JsonEncoder jsonEncoder = 
EncoderFactory.get().jsonEncoder(record.getSchema(), out, pretty);
-writer.write(record, jsonEncoder);
-jsonEncoder.flush();
-return out;
+try {
+  JsonEncoder jsonEncoder = 
EncoderFactory.get().jsonEncoder(record.getSchema(), out, pretty);
+  writer.write(record, jsonEncoder);
+  jsonEncoder.flush();
+  return out;
+} catch (ClassCastException | NullPointerException ex) {
+  // NullPointerException will be thrown in cases where the field values 
are missing
+  // ClassCastException will be thrown in cases where the field values do 
not match the schema type
+  // Fallback to using `toString` which also returns json but without a 
pretty-print option
+  out.write(record.toString().getBytes(StandardCharsets.UTF_8));

Review Comment:
   One concern I have is that this could hide an exception, so that we fail to 
catch a problem during initial testing of a more critical, timeline-related 
flow. I think I've convinced myself there should just be a new method, 
something like "safeToJson", that does not throw an exception and that we use 
only in the error table/writer cases, since those are not as critical to Hudi.
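
To make the suggestion concrete, here is a minimal sketch of the split between a strict method and a lenient one. It is only an illustration under stated assumptions: `SafeToJsonSketch`, `toJson`, and the `strictEncoder` parameter are hypothetical names, and a plain `Function` stands in for the Avro `JsonEncoder` path so the example runs without Avro on the classpath.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

// Hypothetical sketch; none of these names exist in Hudi.
final class SafeToJsonSketch {

  // Strict variant: encoder failures propagate, so critical flows still fail loudly.
  static <T> byte[] toJson(T record, Function<T, String> strictEncoder) {
    return strictEncoder.apply(record).getBytes(StandardCharsets.UTF_8);
  }

  // Lenient variant: intended only for non-critical callers such as an error table writer.
  // Falls back to the record's string form when the strict encoder rejects the record.
  static <T> byte[] safeToJson(T record, Function<T, String> strictEncoder) {
    try {
      return toJson(record, strictEncoder);
    } catch (ClassCastException | NullPointerException ex) {
      return String.valueOf(record).getBytes(StandardCharsets.UTF_8);
    }
  }

  public static void main(String[] args) {
    // Simulates an encoder that fails on a malformed record.
    Function<Object, String> failing = r -> {
      throw new NullPointerException("missing field value");
    };
    byte[] out = safeToJson("rec-1", failing);
    System.out.println(new String(out, StandardCharsets.UTF_8));
  }
}
```

With this shape, the existing callers keep the throwing behaviour, and only the callers that opt in to the lenient method get the fallback.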






(hudi) branch master updated: [HUDI-7557] Fix incremental cleaner when commit for savepoint removed (#10946)

2024-04-01 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 9efced37f81 [HUDI-7557] Fix incremental cleaner when commit for 
savepoint removed (#10946)
9efced37f81 is described below

commit 9efced37f819ae59b51099ee43dc75e1a876a855
Author: Sagar Sumit 
AuthorDate: Mon Apr 1 23:00:19 2024 +0530

[HUDI-7557] Fix incremental cleaner when commit for savepoint removed 
(#10946)
---
 .../hudi/table/action/clean/CleanPlanner.java  |  1 +
 .../apache/hudi/table/action/TestCleanPlanner.java | 89 --
 2 files changed, 51 insertions(+), 39 deletions(-)

diff --git 
a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java
 
b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java
index 48ec8f9baa1..753f8c8253d 100644
--- 
a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java
+++ 
b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java
@@ -245,6 +245,7 @@ public class CleanPlanner implements 
Serializable {
   Option instantOption = 
hoodieTable.getCompletedCommitsTimeline().filter(instant -> 
instant.getTimestamp().equals(savepointCommit)).firstInstant();
   if (!instantOption.isPresent()) {
 LOG.warn("Skipping to process a commit for which savepoint was removed 
as the instant moved to archived timeline already");
+return Stream.empty();
   }
   HoodieInstant instant = instantOption.get();
   return getPartitionsForInstants(instant);
diff --git 
a/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/TestCleanPlanner.java
 
b/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/TestCleanPlanner.java
index 8052572fcea..9989273b723 100644
--- 
a/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/TestCleanPlanner.java
+++ 
b/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/TestCleanPlanner.java
@@ -138,14 +138,14 @@ public class TestCleanPlanner {
   void testPartitionsForIncrCleaning(HoodieWriteConfig config, String 
earliestInstant,
  String lastCompletedTimeInLastClean, 
String lastCleanInstant, String earliestInstantsInLastClean, List 
partitionsInLastClean,
  Map> 
savepointsTrackedInLastClean, Map> 
activeInstantsPartitions,
- Map> savepoints, 
List expectedPartitions) throws IOException {
+ Map> savepoints, 
List expectedPartitions, boolean areCommitsForSavepointsRemoved) throws 
IOException {
 HoodieActiveTimeline activeTimeline = mock(HoodieActiveTimeline.class);
 when(mockHoodieTable.getActiveTimeline()).thenReturn(activeTimeline);
 // setup savepoint mocks
 Set savepointTimestamps = 
savepoints.keySet().stream().collect(Collectors.toSet());
 
when(mockHoodieTable.getSavepointTimestamps()).thenReturn(savepointTimestamps);
 if (!savepoints.isEmpty()) {
-  for (Map.Entry> entry: savepoints.entrySet()) {
+  for (Map.Entry> entry : savepoints.entrySet()) {
 Pair> 
savepointMetadataOptionPair = getSavepointMetadata(entry.getValue());
 HoodieInstant instant = new HoodieInstant(false, 
HoodieTimeline.SAVEPOINT_ACTION, entry.getKey());
 
when(activeTimeline.getInstantDetails(instant)).thenReturn(savepointMetadataOptionPair.getRight());
@@ -156,7 +156,7 @@ public class TestCleanPlanner {
 Pair> cleanMetadataOptionPair =
 getCleanCommitMetadata(partitionsInLastClean, lastCleanInstant, 
earliestInstantsInLastClean, lastCompletedTimeInLastClean, 
savepointsTrackedInLastClean.keySet());
 mockLastCleanCommit(mockHoodieTable, lastCleanInstant, 
earliestInstantsInLastClean, activeTimeline, cleanMetadataOptionPair);
-mockFewActiveInstants(mockHoodieTable, activeInstantsPartitions, 
savepointsTrackedInLastClean);
+mockFewActiveInstants(mockHoodieTable, activeInstantsPartitions, 
savepointsTrackedInLastClean, areCommitsForSavepointsRemoved);
 
 // Trigger clean and validate partitions to clean.
 CleanPlanner cleanPlanner = new CleanPlanner<>(context, 
mockHoodieTable, config);
@@ -332,7 +332,7 @@ public class TestCleanPlanner {
 
   static Stream keepLatestByHoursOrCommitsArgsIncrCleanPartitions() 
{
 String earliestInstant = "20231204194919610";
-String earliestInstantPlusTwoDays =  "20231206194919610";
+String earliestInstantPlusTwoDays = "20231206194919610";
 String lastCleanInstant = earliestInstantPlusTwoDays;
 String earliestInstantMinusThreeDays = "20231201194919610";
 String earliestInstantMinusFourDays = "20231130194919610";
@@ -340,9 +340,9 @@ public class T

Re: [PR] [HUDI-7557] Fix incremental cleaner when commit for savepoint removed [hudi]

2024-04-01 Thread via GitHub


nsivabalan merged PR #10946:
URL: https://github.com/apache/hudi/pull/10946
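
For readers skimming the archive, the one-line CleanPlanner change in this PR can be illustrated with a small sketch. It assumes a simplified model: `java.util.Optional` stands in for Hudi's `Option`, a `Map` from instant time to partition list stands in for the completed-commits timeline, and `SavepointLookupSketch` and `partitionsForSavepoint` are hypothetical names. The point is only that logging a warning is not enough; without the early return, the code goes on to call `get()` on an empty option.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; a Map stands in for the active timeline.
final class SavepointLookupSketch {

  static Stream<String> partitionsForSavepoint(Map<String, List<String>> activeCommits,
                                               String savepointCommit) {
    Optional<List<String>> instant = Optional.ofNullable(activeCommits.get(savepointCommit));
    if (!instant.isPresent()) {
      // The commit was already archived. Without this early return,
      // instant.get() below would throw NoSuchElementException.
      return Stream.empty();
    }
    return instant.get().stream();
  }

  public static void main(String[] args) {
    Map<String, List<String>> active =
        Collections.singletonMap("20231204194919610", Arrays.asList("p1", "p2"));
    // Commit still on the active timeline: its partitions are returned.
    System.out.println(partitionsForSavepoint(active, "20231204194919610")
        .collect(Collectors.toList()));
    // Commit archived: an empty stream instead of an exception.
    System.out.println(partitionsForSavepoint(active, "20231201194919610").count());
  }
}
```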





Re: [PR] [HUDI-7486] Classify schema exceptions when converting from avro to spark row representation [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10778:
URL: https://github.com/apache/hudi/pull/10778#issuecomment-2030174858

   
   ## CI report:
   
   * 51380200fafd1b3917658c549ab3caa3e5a408f5 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23069)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-6854] Change default payload type to HOODIE_AVRO_DEFAULT [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10949:
URL: https://github.com/apache/hudi/pull/10949#issuecomment-2030152748

   
   ## CI report:
   
   * 1984e34cf984ca5088cd921e26cd3d74421afb03 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23070)
 
   
   



Re: [I] [SUPPORT] Column comments not syncing to AWS Glue Catalog [hudi]

2024-04-01 Thread via GitHub


TrustOkoroego commented on issue #8857:
URL: https://github.com/apache/hudi/issues/8857#issuecomment-2030087508

   @cbts-alec-johnson I need to implement this. Could you please tell me your 
configuration to sync the comments?





Re: [I] [SUPPORT] Incremental query not working on COW table [hudi]

2024-04-01 Thread via GitHub


NishantBaheti commented on issue #10850:
URL: https://github.com/apache/hudi/issues/10850#issuecomment-2030086053

   @ad1happy2go We moved to the MOR table. The COW configurations felt a little 
unstable, and we had to rush the project to production quickly.





Re: [PR] [MINOR] use Temurin jdk [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10948:
URL: https://github.com/apache/hudi/pull/10948#issuecomment-2030071024

   
   ## CI report:
   
   * 3109fe81b4d356316fb2b2837270c226a36ccf50 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23067)
 
   
   



Re: [PR] [HUDI-6854] Change default payload type to HOODIE_AVRO_DEFAULT [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10949:
URL: https://github.com/apache/hudi/pull/10949#issuecomment-2030045443

   
   ## CI report:
   
   * 685ba9e778377eb4c1a72016c1c8a745e965551e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23068)
 
   * 1984e34cf984ca5088cd921e26cd3d74421afb03 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23070)
 
   
   



Re: [PR] [HUDI-7486] Classify schema exceptions when converting from avro to spark row representation [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10778:
URL: https://github.com/apache/hudi/pull/10778#issuecomment-2030044819

   
   ## CI report:
   
   * 0e2e1d8ea5829905db3464a97593bb81231bbc08 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23066)
 
   * 51380200fafd1b3917658c549ab3caa3e5a408f5 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23069)
 
   
   



[jira] [Created] (HUDI-7562) using -DTest=TestClass will still run scala tests

2024-04-01 Thread Jonathan Vexler (Jira)
Jonathan Vexler created HUDI-7562:
-

 Summary: using -DTest=TestClass will still run scala tests
 Key: HUDI-7562
 URL: https://issues.apache.org/jira/browse/HUDI-7562
 Project: Apache Hudi
  Issue Type: Bug
Reporter: Jonathan Vexler


As a workaround for now, you can set -DwildcardSuites="abdcd" so that all scala 
tests are filtered out.





Re: [PR] [HUDI-6854] Change default payload type to HOODIE_AVRO_DEFAULT [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10949:
URL: https://github.com/apache/hudi/pull/10949#issuecomment-2029917956

   
   ## CI report:
   
   * 685ba9e778377eb4c1a72016c1c8a745e965551e Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23068)
 
   * 1984e34cf984ca5088cd921e26cd3d74421afb03 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23070)
 
   
   



Re: [PR] [HUDI-6854] Change default payload type to HOODIE_AVRO_DEFAULT [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10949:
URL: https://github.com/apache/hudi/pull/10949#issuecomment-2029905910

   
   ## CI report:
   
   * 685ba9e778377eb4c1a72016c1c8a745e965551e Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23068)
 
   * 1984e34cf984ca5088cd921e26cd3d74421afb03 UNKNOWN
   
   



Re: [PR] [HUDI-7486] Classify schema exceptions when converting from avro to spark row representation [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10778:
URL: https://github.com/apache/hudi/pull/10778#issuecomment-2029905218

   
   ## CI report:
   
   * 521ae79c05782ff553c945bc84c27afe33f8e52a Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23053)
 
   * 0e2e1d8ea5829905db3464a97593bb81231bbc08 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23066)
 
   * 51380200fafd1b3917658c549ab3caa3e5a408f5 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23069)
 
   
   



Re: [PR] [HUDI-7486] Classify schema exceptions when converting from avro to spark row representation [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10778:
URL: https://github.com/apache/hudi/pull/10778#issuecomment-2029891276

   
   ## CI report:
   
   * 521ae79c05782ff553c945bc84c27afe33f8e52a Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23053)
 
   * 0e2e1d8ea5829905db3464a97593bb81231bbc08 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23066)
 
   * 51380200fafd1b3917658c549ab3caa3e5a408f5 UNKNOWN
   
   



Re: [PR] [MINOR] use Temurin jdk [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10948:
URL: https://github.com/apache/hudi/pull/10948#issuecomment-2029822794

   
   ## CI report:
   
   * 3109fe81b4d356316fb2b2837270c226a36ccf50 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23067)
 
   
   



Re: [PR] [HUDI-6854] Change default payload type to HOODIE_AVRO_DEFAULT [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10949:
URL: https://github.com/apache/hudi/pull/10949#issuecomment-2029822835

   
   ## CI report:
   
   * 685ba9e778377eb4c1a72016c1c8a745e965551e Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23068)
 
   
   



Re: [PR] [HUDI-7486] Classify schema exceptions when converting from avro to spark row representation [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10778:
URL: https://github.com/apache/hudi/pull/10778#issuecomment-2029822297

   
   ## CI report:
   
   * 521ae79c05782ff553c945bc84c27afe33f8e52a Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23053)
 
   * 0e2e1d8ea5829905db3464a97593bb81231bbc08 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23066)
 
   
   



Re: [PR] [HUDI-7486] Classify schema exceptions when converting from avro to spark row representation [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10778:
URL: https://github.com/apache/hudi/pull/10778#issuecomment-2029810492

   
   ## CI report:
   
   * 521ae79c05782ff553c945bc84c27afe33f8e52a Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23053)
 
   * 0e2e1d8ea5829905db3464a97593bb81231bbc08 UNKNOWN
   
   



Re: [PR] [HUDI-6854] Change default payload type to HOODIE_AVRO_DEFAULT [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10949:
URL: https://github.com/apache/hudi/pull/10949#issuecomment-2029811194

   
   ## CI report:
   
   * 685ba9e778377eb4c1a72016c1c8a745e965551e UNKNOWN
   
   



Re: [PR] [MINOR] use Temurin jdk [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10948:
URL: https://github.com/apache/hudi/pull/10948#issuecomment-2029811129

   
   ## CI report:
   
   * 3109fe81b4d356316fb2b2837270c226a36ccf50 UNKNOWN
   
   



Re: [I] [SUPPORT] hudi0.14.0: Insert data into hudi with spark or create a new table exception [hudi]

2024-04-01 Thread via GitHub


ad1happy2go commented on issue #10838:
URL: https://github.com/apache/hudi/issues/10838#issuecomment-2029795850

   @SmyxBug Were you able to get it working with the suggestion @CTTY provided? 
Feel free to close the issue if you are all good here.





[jira] [Updated] (HUDI-6854) Change default keygen type to HOODIE_AVRO_DEFAULT

2024-04-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-6854:
-
Labels: pull-request-available  (was: )

> Change default keygen type to HOODIE_AVRO_DEFAULT
> -
>
> Key: HUDI-6854
> URL: https://issues.apache.org/jira/browse/HUDI-6854
> Project: Apache Hudi
>  Issue Type: Task
>Reporter: Sagar Sumit
>Assignee: Vova Kolmakov
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>
> Current default is OVERWRITE_LATEST which instantiates 
> OverwriteWithLatestAvroPayload but it's not intuitive when latest gets 
> written and user sets some precombine field and expects to merge records 
> based on that field.





[PR] [HUDI-6854] Change default payload type to HOODIE_AVRO_DEFAULT [hudi]

2024-04-01 Thread via GitHub


wombatu-kun opened a new pull request, #10949:
URL: https://github.com/apache/hudi/pull/10949

   ### Change Logs
   
   Changed default payload type to HOODIE_AVRO_DEFAULT.  
   Current default is OVERWRITE_LATEST which instantiates 
OverwriteWithLatestAvroPayload but it's not intuitive when latest gets written 
and user sets some precombine field and expects to merge records based on that 
field.
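
   The behaviour difference described above can be sketched with a deliberately simplified model. This is not Hudi's payload API: `PayloadMergeSketch`, `Rec`, `overwriteWithLatest`, and `mergeByOrdering` are hypothetical names, and `orderingValue` plays the role of the precombine field. The sketch only contrasts "the incoming write always wins" with "the record with the higher ordering value wins", which is the behaviour a user who sets a precombine field would typically expect.

```java
// Hypothetical sketch; these are not Hudi payload classes.
final class PayloadMergeSketch {

  static final class Rec {
    final String key;
    final long orderingValue; // stands in for the precombine field
    final String value;

    Rec(String key, long orderingValue, String value) {
      this.key = key;
      this.orderingValue = orderingValue;
      this.value = value;
    }
  }

  // "Overwrite with latest": the incoming record replaces the stored one,
  // regardless of the ordering value.
  static Rec overwriteWithLatest(Rec stored, Rec incoming) {
    return incoming;
  }

  // Ordering-aware merge: keep the record with the larger ordering value;
  // ties go to the incoming record.
  static Rec mergeByOrdering(Rec stored, Rec incoming) {
    return incoming.orderingValue >= stored.orderingValue ? incoming : stored;
  }

  public static void main(String[] args) {
    Rec stored = new Rec("k1", 10L, "fresh");
    Rec lateArrival = new Rec("k1", 5L, "stale"); // written later, but older by ordering value
    System.out.println(overwriteWithLatest(stored, lateArrival).value); // prints "stale"
    System.out.println(mergeByOrdering(stored, lateArrival).value);     // prints "fresh"
  }
}
```

   In this model a late-arriving record with a lower ordering value silently replaces newer data under the first policy, which is the unintuitive case the change log refers to.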
   
   ### Impact
   
   none
   
   ### Risk level (write none, low medium or high below)
   
   none
   
   ### Documentation Update
   
   The default value needs to be updated in the documentation.
   
   - _The config description must be updated if new configs are added or the 
default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. 
Please create a Jira ticket, attach the
 ticket number here and follow the 
[instruction](https://hudi.apache.org/contribute/developer-setup#website) to 
make
 changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





Re: [I] [SUPPORT] Incremental query not working on COW table [hudi]

2024-04-01 Thread via GitHub


ad1happy2go commented on issue #10850:
URL: https://github.com/apache/hudi/issues/10850#issuecomment-2029789548

   @NishantBaheti Were you able to get it resolved? Can you share the full 
stack trace? It looks like "Unable to load class" means there are some library 
conflicts.
   





[jira] [Assigned] (HUDI-6854) Change default keygen type to HOODIE_AVRO_DEFAULT

2024-04-01 Thread Vova Kolmakov (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vova Kolmakov reassigned HUDI-6854:
---

Assignee: Vova Kolmakov

> Change default keygen type to HOODIE_AVRO_DEFAULT
> -
>
> Key: HUDI-6854
> URL: https://issues.apache.org/jira/browse/HUDI-6854
> Project: Apache Hudi
>  Issue Type: Task
>Reporter: Sagar Sumit
>Assignee: Vova Kolmakov
>Priority: Major
> Fix For: 1.0.0
>
>
> Current default is OVERWRITE_LATEST which instantiates 
> OverwriteWithLatestAvroPayload but it's not intuitive when latest gets 
> written and user sets some precombine field and expects to merge records 
> based on that field.





[jira] [Updated] (HUDI-6854) Change default keygen type to HOODIE_AVRO_DEFAULT

2024-04-01 Thread Vova Kolmakov (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vova Kolmakov updated HUDI-6854:

Status: In Progress  (was: Open)

> Change default keygen type to HOODIE_AVRO_DEFAULT
> -
>
> Key: HUDI-6854
> URL: https://issues.apache.org/jira/browse/HUDI-6854
> Project: Apache Hudi
>  Issue Type: Task
>Reporter: Sagar Sumit
>Assignee: Vova Kolmakov
>Priority: Major
> Fix For: 1.0.0
>
>
> Current default is OVERWRITE_LATEST which instantiates 
> OverwriteWithLatestAvroPayload but it's not intuitive when latest gets 
> written and user sets some precombine field and expects to merge records 
> based on that field.





Re: [I] [SUPPORT] could hudi skip shuffle in SortMergeJoin, like what bucketby does in Spark? [hudi]

2024-04-01 Thread via GitHub


ad1happy2go commented on issue #10704:
URL: https://github.com/apache/hudi/issues/10704#issuecomment-2029776148

   @boneanxs @ziudu Created a JIRA - 
https://issues.apache.org/jira/browse/HUDI-7561





[jira] [Created] (HUDI-7561) Skip shuffling entire data in SortMergeJoin while upserting

2024-04-01 Thread Aditya Goenka (Jira)
Aditya Goenka created HUDI-7561:
---

 Summary: Skip shuffling entire data in SortMergeJoin while 
upserting
 Key: HUDI-7561
 URL: https://issues.apache.org/jira/browse/HUDI-7561
 Project: Apache Hudi
  Issue Type: Improvement
  Components: writer-core
Reporter: Aditya Goenka
 Fix For: 1.1.0


[https://github.com/apache/hudi/issues/10704]

 





[PR] [MINOR] use Temurin jdk [hudi]

2024-04-01 Thread via GitHub


sullis opened a new pull request, #10948:
URL: https://github.com/apache/hudi/pull/10948

   ### Change Logs
   
   [MINOR] replace AdoptOpenJDK with Temurin jdk
   
   ### Impact
   
   n/a
   
   ### Risk level (write none, low medium or high below)
   
   low
   
   ### Documentation Update
   
   n/a
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





Re: [I] [SUPPORT] Data duplicated in base file on updating record partition [hudi]

2024-04-01 Thread via GitHub


codope closed issue #10932: [SUPPORT] Data duplicated in base file on updating 
record partition
URL: https://github.com/apache/hudi/issues/10932





Re: [PR] [HUDI-7559] [1/n] Fix RecordLevelIndexSupport::filterQueryWithRecordKey [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10947:
URL: https://github.com/apache/hudi/pull/10947#issuecomment-2029728713

   
   ## CI report:
   
   * 85cbde75f0f652274dc28f940cd0a159096b6aad Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23065)
 
   
   
   





Re: [I] Nested object support in Hudi Table using Flink [hudi]

2024-04-01 Thread via GitHub


ad1happy2go commented on issue #10895:
URL: https://github.com/apache/hudi/issues/10895#issuecomment-2029718076

   @waytoharish Did you get a chance to try out GenericRowData? Are you still 
facing the issue?





Re: [I] [SUPPORT] Async Clustering failing for MoR in 0.13.0 [hudi]

2024-04-01 Thread via GitHub


ad1happy2go commented on issue #8153:
URL: https://github.com/apache/hudi/issues/8153#issuecomment-2029711805

   @haripriyarhp I tried with a 0.14.x version and it works fine; I couldn't 
reproduce the problem. I know I am late. Let me know whether you were able to 
resolve this issue or need any other help with it.





Re: [I] [SUPPORT] Archival not working for hudi & corresponding hudi metadata table [hudi]

2024-04-01 Thread via GitHub


codope closed issue #9478: [SUPPORT] Archival not working for hudi & 
corresponding hudi metadata table
URL: https://github.com/apache/hudi/issues/9478





Re: [I] [SUPPORT] Archival not working for hudi & corresponding hudi metadata table [hudi]

2024-04-01 Thread via GitHub


ad1happy2go commented on issue #9478:
URL: https://github.com/apache/hudi/issues/9478#issuecomment-2029704490

   @PankajKaushal Closing this out. Please reopen or create a new one in case 
of any more issues.





Re: [I] [SUPPORT] spark stuctrued streaming failed to update MDT metadata [hudi]

2024-04-01 Thread via GitHub


ad1happy2go commented on issue #10891:
URL: https://github.com/apache/hudi/issues/10891#issuecomment-2029680639

   @xicm I will try to reproduce it. Can you provide more details on the steps 
I can follow?





Re: [I] [SUPPORT] Hudi deltastreamer fails due to Clean [hudi]

2024-04-01 Thread via GitHub


codope closed issue #7209: [SUPPORT] Hudi deltastreamer fails due to Clean
URL: https://github.com/apache/hudi/issues/7209





Re: [I] [SUPPORT] Hudi deltastreamer fails due to Clean [hudi]

2024-04-01 Thread via GitHub


ad1happy2go commented on issue #7209:
URL: https://github.com/apache/hudi/issues/7209#issuecomment-2029660841

   @koldic Sorry we missed it. You can use multi-writer concurrency control to 
handle that. 
https://hudi.apache.org/docs/concurrency_control/#enabling-multi-writing
   
   Closing this issue as it was caused by multiple writers. Thanks. Feel free to 
open a new one in case of any new issues.
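
A minimal sketch of the writer properties that the linked multi-writing section describes, assuming a ZooKeeper-based lock provider. The host, lock key, and base path are placeholders, and the class name `MultiWriterConfigSketch` is only for illustration; verify the keys against the docs for the Hudi version in use.

```java
import java.util.Properties;

class MultiWriterConfigSketch {

    /** Builds the properties each concurrent writer would share (placeholder values marked). */
    static Properties multiWriterProps() {
        Properties props = new Properties();
        // Switch from the default single-writer mode to optimistic concurrency control.
        props.setProperty("hoodie.write.concurrency.mode", "optimistic_concurrency_control");
        // Clean up failed writes lazily so one writer does not roll back another writer's in-flight commit.
        props.setProperty("hoodie.cleaner.policy.failed.writes", "LAZY");
        // All writers must agree on one external lock provider.
        props.setProperty("hoodie.write.lock.provider",
            "org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider");
        props.setProperty("hoodie.write.lock.zookeeper.url", "zk-host.example.com"); // placeholder
        props.setProperty("hoodie.write.lock.zookeeper.port", "2181");               // placeholder
        props.setProperty("hoodie.write.lock.zookeeper.lock_key", "my_table");       // placeholder
        props.setProperty("hoodie.write.lock.zookeeper.base_path", "/hudi/locks");   // placeholder
        return props;
    }

    public static void main(String[] args) {
        // Print in key=value form, e.g. for pasting into a properties file.
        multiWriterProps().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The same key/value pairs can be passed as writer options or through a properties file; every writer touching the table, including a separately running cleaner, would need the same lock settings.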





Re: [I] [SUPPORT] Historical Clean and RollBack commits are not archived [hudi]

2024-04-01 Thread via GitHub


codope closed issue #9084: [SUPPORT] Historical Clean and RollBack commits are 
not archived
URL: https://github.com/apache/hudi/issues/9084





Re: [I] [SUPPORT] Historical Clean and RollBack commits are not archived [hudi]

2024-04-01 Thread via GitHub


ad1happy2go commented on issue #9084:
URL: https://github.com/apache/hudi/issues/9084#issuecomment-2029654012

   @thomasg19930417 Closing this issue. Please reopen in case you still have 
any doubts on this. Thanks.





Re: [I] [SUPPORT] AWS Athena query fail when compaction is scheduled for MOR table [hudi]

2024-04-01 Thread via GitHub


ad1happy2go commented on issue #9907:
URL: https://github.com/apache/hudi/issues/9907#issuecomment-2029651546

   @brightwon Were you able to identify the root cause of the issue? Do let us 
know in case you still need help here.





Re: [PR] [HUDI-7559] [1/n] Fix RecordLevelIndexSupport::filterQueryWithRecordKey [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10947:
URL: https://github.com/apache/hudi/pull/10947#issuecomment-2029649541

   
   ## CI report:
   
   * 85cbde75f0f652274dc28f940cd0a159096b6aad Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23065)
 
   
   
   





Re: [I] [SUPPORT] Occur bucketid multiple cannot write data to the wrong partition [hudi]

2024-04-01 Thread via GitHub


ad1happy2go commented on issue #10899:
URL: https://github.com/apache/hudi/issues/10899#issuecomment-2029642719

   Also, are you using Spark Structured Streaming or HudiStreamer?





Re: [I] [SUPPORT] Occur bucketid multiple cannot write data to the wrong partition [hudi]

2024-04-01 Thread via GitHub


ad1happy2go commented on issue #10899:
URL: https://github.com/apache/hudi/issues/10899#issuecomment-2029642198

   @xuzifu666 Can you please post the table/writer configuration you are using?





Re: [PR] [HUDI-7559] [1/n] Fix RecordLevelIndexSupport::filterQueryWithRecordKey [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10947:
URL: https://github.com/apache/hudi/pull/10947#issuecomment-2029642273

   
   ## CI report:
   
   * 85cbde75f0f652274dc28f940cd0a159096b6aad UNKNOWN
   
   
   





[jira] [Updated] (HUDI-7559) Fix functional index (on column stats): Handle NPE in filterQueriesWithRecordKey(...)

2024-04-01 Thread Vinaykumar Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinaykumar Bhat updated HUDI-7559:
--
Description: `RecordLevelIndexSupport::filterQueryWithRecordKey(...)` 
throws NPE which is then subsequently `lookupCandidateFilesInMetadataTable()` 
rendering every other index (like FunctionalIndex, ColStat Index) to not be 
used for data skipping (i.e pruning files)
Summary: Fix functional index (on column stats): Handle NPE in 
filterQueriesWithRecordKey(...)  (was: Fix issues with functional index (on 
column stats) based pruning)

> Fix functional index (on column stats): Handle NPE in 
> filterQueriesWithRecordKey(...)
> -
>
> Key: HUDI-7559
> URL: https://issues.apache.org/jira/browse/HUDI-7559
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Vinaykumar Bhat
>Assignee: Vinaykumar Bhat
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>
> `RecordLevelIndexSupport::filterQueryWithRecordKey(...)` throws NPE which is 
> then subsequently `lookupCandidateFilesInMetadataTable()` rendering every 
> other index (like FunctionalIndex, ColStat Index) to not be used for data 
> skipping (i.e pruning files)





Re: [I] [SUPPORT] IllegalArgumentException at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:33) [hudi]

2024-04-01 Thread via GitHub


ad1happy2go commented on issue #10906:
URL: https://github.com/apache/hudi/issues/10906#issuecomment-2029636299

   @michael1991 Thanks for identifying the root cause. Do you have a fix in 
mind? I created a tracking JIRA for this - 
https://issues.apache.org/jira/browse/HUDI-7560
   
   Are you using Spark Structured Streaming to write, or HudiStreamer? 
   
   





[jira] [Created] (HUDI-7560) Rollback with async cleaning creating deadlocks and failing the subsequent write

2024-04-01 Thread Aditya Goenka (Jira)
Aditya Goenka created HUDI-7560:
---

 Summary: Rollback with async cleaning creating deadlocks and 
failing the subsequent write
 Key: HUDI-7560
 URL: https://issues.apache.org/jira/browse/HUDI-7560
 Project: Apache Hudi
  Issue Type: Bug
  Components: writer-core
Reporter: Aditya Goenka
 Fix For: 1.1.0


Github Issue - [https://github.com/apache/hudi/issues/10906]





[jira] [Updated] (HUDI-7559) Fix issues with functional index (on column stats) based pruning

2024-04-01 Thread Vinaykumar Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinaykumar Bhat updated HUDI-7559:
--
Status: In Progress  (was: Open)

> Fix issues with functional index (on column stats) based pruning
> 
>
> Key: HUDI-7559
> URL: https://issues.apache.org/jira/browse/HUDI-7559
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Vinaykumar Bhat
>Assignee: Vinaykumar Bhat
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>






[jira] [Updated] (HUDI-7559) Fix issues with functional index (on column stats) based pruning

2024-04-01 Thread Vinaykumar Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinaykumar Bhat updated HUDI-7559:
--
Epic Link: HUDI-512

> Fix issues with functional index (on column stats) based pruning
> 
>
> Key: HUDI-7559
> URL: https://issues.apache.org/jira/browse/HUDI-7559
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Vinaykumar Bhat
>Assignee: Vinaykumar Bhat
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>






[jira] [Updated] (HUDI-7559) Fix issues with functional index (on column stats) based pruning

2024-04-01 Thread Vinaykumar Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinaykumar Bhat updated HUDI-7559:
--
Fix Version/s: 1.0.0

> Fix issues with functional index (on column stats) based pruning
> 
>
> Key: HUDI-7559
> URL: https://issues.apache.org/jira/browse/HUDI-7559
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Vinaykumar Bhat
>Assignee: Vinaykumar Bhat
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>






Re: [PR] [HUDI-7557] Fix incremental cleaner when commit for savepoint removed [hudi]

2024-04-01 Thread via GitHub


danny0405 commented on code in PR #10946:
URL: https://github.com/apache/hudi/pull/10946#discussion_r1546222049


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java:
##
@@ -245,6 +245,7 @@ private List 
getPartitionsFromDeletedSavepoint(HoodieCleanMetadata clean
   Option instantOption = 
hoodieTable.getCompletedCommitsTimeline().filter(instant -> 
instant.getTimestamp().equals(savepointCommit)).firstInstant();
   if (!instantOption.isPresent()) {
 LOG.warn("Skipping to process a commit for which savepoint was removed 
as the instant moved to archived timeline already");
+return Stream.empty();

Review Comment:
   Does this mean the archived savepoint partition never got cleaned?






Re: [PR] [HUDI-7559] [1/n] Fix RecordLevelIndexSupport::filterQueryWithRecordKey [hudi]

2024-04-01 Thread via GitHub


bhat-vinay commented on PR #10947:
URL: https://github.com/apache/hudi/pull/10947#issuecomment-2029589170

   cc: @codope Please review. This is the first PR in a series of fixes 
required to prune files (and enable data skipping) using functional index based 
on column stats.





[jira] [Updated] (HUDI-7559) Fix issues with functional index (on column stats) based pruning

2024-04-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-7559:
-
Labels: pull-request-available  (was: )

> Fix issues with functional index (on column stats) based pruning
> 
>
> Key: HUDI-7559
> URL: https://issues.apache.org/jira/browse/HUDI-7559
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Vinaykumar Bhat
>Assignee: Vinaykumar Bhat
>Priority: Major
>  Labels: pull-request-available
>






[PR] [HUDI-7559] [1/n] Fix RecordLevelIndexSupport::filterQueryWithRecordKey [hudi]

2024-04-01 Thread via GitHub


bhat-vinay opened a new pull request, #10947:
URL: https://github.com/apache/hudi/pull/10947

   RecordLevelIndexSupport::filterQueryWithRecordKey() throws an NPE if the 
EqualTo query predicate is not of the form `AttributeReference = Literal`. This 
is because RecordLevelIndexSupport::getAttributeLiteralTuple() returns null in 
such cases, and the result is then dereferenced unconditionally.
   
   This bug prevented the functional index from being used even when the 
query predicate contained Spark functions on which a functional index is built. 
Hence the column-stats-based functional index was not pruning files.
   
   This PR makes the following minor changes.
   1. Move some methods in RecordLevelIndexSupport into an object to make them 
static (to aid in unit testing)
   2. Fix filterQueryWithRecordKey() by checking for null return values from 
the call to getAttributeLiteralTuple
   3. Add unit tests in TestRecordLevelIndexSupport.scala
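
The null check in item 2 can be illustrated with a small, self-contained Java sketch. This is not the Scala code from the PR: `Expr`, `Kind`, and `recordKeysFromEqualTo` are hypothetical stand-ins for Spark's expression types and the Hudi method, and accepting the literal on either side of the equality is an assumption made only for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

class NullSafeRecordKeyFilter {

    enum Kind { ATTRIBUTE, LITERAL, FUNCTION }

    /** Hypothetical, simplified expression node. */
    static final class Expr {
        final Kind kind;
        final String text;

        Expr(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    /** Returns {attributeName, literalValue} for `attribute = literal`, or null for any other shape. */
    static String[] getAttributeLiteralTuple(Expr left, Expr right) {
        if (left.kind == Kind.ATTRIBUTE && right.kind == Kind.LITERAL) {
            return new String[] {left.text, right.text};
        }
        if (left.kind == Kind.LITERAL && right.kind == Kind.ATTRIBUTE) {
            return new String[] {right.text, left.text};
        }
        return null; // e.g. upper(name) = 'ABC'
    }

    /** Collects record-key literals from EqualTo predicates, skipping shapes that yield no tuple. */
    static List<String> recordKeysFromEqualTo(List<Expr[]> equalToPredicates, String recordKeyField) {
        List<String> keys = new ArrayList<>();
        for (Expr[] predicate : equalToPredicates) {
            String[] tuple = getAttributeLiteralTuple(predicate[0], predicate[1]);
            if (tuple == null) {
                continue; // the guard: without it, tuple[0] below would throw an NPE
            }
            if (tuple[0].equals(recordKeyField)) {
                keys.add(tuple[1]);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        List<Expr[]> predicates = new ArrayList<>();
        predicates.add(new Expr[] {new Expr(Kind.ATTRIBUTE, "id"), new Expr(Kind.LITERAL, "42")});
        predicates.add(new Expr[] {new Expr(Kind.FUNCTION, "upper(name)"), new Expr(Kind.LITERAL, "ABC")});
        System.out.println(recordKeysFromEqualTo(predicates, "id"));
    }
}
```

With the guard in place, a predicate built on a function is simply ignored by the record-key filter, leaving it available to other indexes.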
   
   ### Change Logs
   
   This PR makes the following minor changes.
   1. Move some methods in RecordLevelIndexSupport into an object to make it 
static (to aid in unit testing)
   2. Fix filterQueryWithRecordKey() by checking for null return values from 
the call to getAttributeLiteralTuple
   3. Add unit tests in TestRecordLevelIndexSupport.scala
   
   ### Impact
   
   Bug fix.
   
   ### Risk level (write none, low medium or high below)
   
   None
   
   ### Documentation Update
   
   None
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





Re: [PR] [HUDI-7522] Support find out the conflict instants in bucket partition when bucket id multiple [hudi]

2024-04-01 Thread via GitHub


xiarixiaoyao commented on PR #10898:
URL: https://github.com/apache/hudi/pull/10898#issuecomment-2029573829

   @xuzifu666 @danny0405 @beyond1920 
   I think we should solve the root cause of bucket duplication.
   There are currently three situations in which bucket file duplication occurs:
   1. Spark speculative execution. Turning off speculative execution solves 
this problem.
   2. The Hoodie archiver deleting the completed timeline in parallel. 1.0 has 
solved this problem.
   3. Concurrent insert overwrite by multiple Spark writers. This is a bug that 
needs to be fixed.
   
   Now focus on scenario 3: concurrent insert overwrite by multiple Spark writers.
   When Hudi builds a file slice, it calls isFileSliceCommitted to determine 
whether the current file is committed.
   ```
   /**
    * A FileSlice is considered committed, if one of the following is true - There is a committed data file - There are
    * some log files, that are based off a commit or delta commit.
    */
   private boolean isFileSliceCommitted(FileSlice slice) {
     if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
       return false;
     }
   
     return timeline.containsOrBeforeTimelineStarts(slice.getBaseInstantTime());
   }
   ```
   This is fine for the single-writer scenario, but with multiple writers the 
logic of isFileSliceCommitted has a bug.
   If a file has a commit time smaller than the smallest completed commit, Hudi 
directly treats the file as committed, even if it is a garbage file (a file 
left behind by a failed write).
   
   For example, two Spark apps insert overwrite a Hudi BUCKET table with the 
same partition:
   app1: starts write commit at 0001, writes files: 0--uuid1.parquet
   app2: starts write commit at 0002, writes files: 0--uuid2.parquet
   app1 may fail to write due to OCC/cancel/OOM, but 0--uuid1.parquet has 
already been written.
   When Hudi builds the file slice, 0--uuid1.parquet is considered committed, 
since its commit time 0001 < smallest completed commit 0002. This is wrong: 
commit time 0001 is not committed.
   Maybe we can modify isFileSliceCommitted like this:
   ```
   private boolean isFileSliceCommitted(FileSlice slice) {
     if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, lastInstant.get().getTimestamp())) {
       return false;
     }
   
     return timeline.containsOrBeforeTimelineStarts(slice.getBaseInstantTime())
         && UncompleteTimelineNotContains(slice.getBaseInstantTime());
   }
   ```
   Finally, I think Hudi's file slices should be managed uniformly, just like 
Iceberg/Delta Lake, rather than being obtained through list operations.
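
The app1/app2 scenario above can be replayed with a small, self-contained Java model. This is only a sketch: `FileSliceCommitCheck` and its methods are hypothetical stand-ins, not Hudi's timeline API, and instants are modeled as zero-padded strings so that lexicographic order matches time order.

```java
import java.util.Set;
import java.util.TreeSet;

class FileSliceCommitCheck {
    private final TreeSet<String> completed; // instants that finished successfully
    private final Set<String> pending;       // instants that started but never completed

    FileSliceCommitCheck(Set<String> completed, Set<String> pending) {
        this.completed = new TreeSet<>(completed);
        this.pending = pending;
    }

    /** Models the idea behind containsOrBeforeTimelineStarts: a known instant, or one older than the first completed instant. */
    boolean containsOrBeforeTimelineStarts(String instant) {
        return completed.contains(instant)
            || (!completed.isEmpty() && instant.compareTo(completed.first()) < 0);
    }

    /** Models the current check quoted above. */
    boolean isCommittedCurrent(String baseInstant) {
        if (completed.isEmpty() || baseInstant.compareTo(completed.last()) > 0) {
            return false;
        }
        return containsOrBeforeTimelineStarts(baseInstant);
    }

    /** Models the proposed check: additionally reject instants that are known to be incomplete. */
    boolean isCommittedProposed(String baseInstant) {
        return isCommittedCurrent(baseInstant) && !pending.contains(baseInstant);
    }

    public static void main(String[] args) {
        // app1 started commit 0001 and failed; app2 completed commit 0002.
        FileSliceCommitCheck timeline = new FileSliceCommitCheck(Set.of("0002"), Set.of("0001"));
        System.out.println("current(0001)  = " + timeline.isCommittedCurrent("0001"));
        System.out.println("proposed(0001) = " + timeline.isCommittedProposed("0001"));
        System.out.println("proposed(0002) = " + timeline.isCommittedProposed("0002"));
    }
}
```

In this model the current check accepts the leftover file from 0001 because it predates the first completed instant, while the proposed check rejects it and still accepts 0002. Whether the incomplete instants remain visible to the reader at that point is an assumption the real fix would have to confirm.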
   
   





Re: [PR] [HUDI-7526] Fix constructors for bulkinsert sort partitioners to ensure we could use it as user defined partitioners [hudi]

2024-04-01 Thread via GitHub


wombatu-kun commented on code in PR #10942:
URL: https://github.com/apache/hudi/pull/10942#discussion_r1546190914


##
hudi-client/hudi-java-client/src/main/java/org/apache/hudi/execution/bulkinsert/JavaGlobalSortPartitioner.java:
##
@@ -31,12 +32,21 @@
  *
  * @param  HoodieRecordPayload type
  */
-public class JavaGlobalSortPartitioner
-implements BulkInsertPartitioner>> {
+public class JavaGlobalSortPartitioner implements 
BulkInsertPartitioner>> {
+
+  public JavaGlobalSortPartitioner() {
+  }
+
+  /**
+   * Constructor to create as UserDefinedBulkInsertPartitioner class via 
reflection
+   * @param config HoodieWriteConfig

Review Comment:
   Before this fix:  
   if a user wants to use JavaGlobalSortPartitioner and sets 
`hoodie.bulkinsert.user.defined.partitioner.class=org.apache.hudi.execution.bulkinsert.JavaGlobalSortPartitioner`,
 it will not work, because this partitioner cannot be instantiated via 
reflection (it has no constructor with a writeConfig parameter).
   We add this constructor so that JavaGlobalSortPartitioner can be used 
as a user-defined partitioner just by setting its class name in writeConfig.  
   
   I don't know how to explain it more clearly. Let's wait for the author's reply.
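
A minimal, self-contained Java sketch of the failure mode described above, assuming the loader looks up a constructor that takes the write config. `WriteConfig`, `Partitioner`, and `load` are hypothetical stand-ins, not Hudi classes.

```java
import java.lang.reflect.Constructor;

class ReflectiveCreation {

    /** Hypothetical stand-in for HoodieWriteConfig. */
    static class WriteConfig { }

    interface Partitioner {
        String name();
    }

    /** Has only a no-arg constructor, like the partitioner before the fix. */
    public static class NoArgOnly implements Partitioner {
        public NoArgOnly() { }

        public String name() {
            return "no-arg-only";
        }
    }

    /** Adds a constructor taking the config, which it is free to ignore. */
    public static class WithConfigCtor implements Partitioner {
        public WithConfigCtor() { }

        public WithConfigCtor(WriteConfig config) { }

        public String name() {
            return "with-config";
        }
    }

    /** Instantiates the class by name through its (WriteConfig) constructor. */
    static Partitioner load(String className, WriteConfig config) {
        try {
            Class<?> clazz = Class.forName(className);
            Constructor<?> ctor = clazz.getConstructor(WriteConfig.class);
            return (Partitioner) ctor.newInstance(config);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        WriteConfig config = new WriteConfig();
        System.out.println(load(WithConfigCtor.class.getName(), config).name());
        try {
            load(NoArgOnly.class.getName(), config);
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getCause().getClass().getSimpleName());
        }
    }
}
```

The second class loads even though it never reads the config, which is the point made in the review thread: the extra constructor exists for the loader, not for configurability.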






Re: [PR] [HUDI-7526] Fix constructors for bulkinsert sort partitioners to ensure we could use it as user defined partitioners [hudi]

2024-04-01 Thread via GitHub


danny0405 commented on code in PR #10942:
URL: https://github.com/apache/hudi/pull/10942#discussion_r1546174541


##
hudi-client/hudi-java-client/src/main/java/org/apache/hudi/execution/bulkinsert/JavaGlobalSortPartitioner.java:
##
@@ -31,12 +32,21 @@
  *
  * @param  HoodieRecordPayload type
  */
-public class JavaGlobalSortPartitioner
-implements BulkInsertPartitioner>> {
+public class JavaGlobalSortPartitioner implements 
BulkInsertPartitioner>> {
+
+  public JavaGlobalSortPartitioner() {
+  }
+
+  /**
+   * Constructor to create as UserDefinedBulkInsertPartitioner class via 
reflection
+   * @param config HoodieWriteConfig

Review Comment:
   > Yes, in this case HoodieWriteConfig is ignored just because this 
Partitioner is not configurable at all, but it does not mean that it should not 
be used as UserDefinedBulkInsertPartitioner
   
   That does not make sense to me.






Re: [PR] [HUDI-7557] Fix incremental cleaner when commit for savepoint removed [hudi]

2024-04-01 Thread via GitHub


hudi-bot commented on PR #10946:
URL: https://github.com/apache/hudi/pull/10946#issuecomment-2029433733

   
   ## CI report:
   
   * cbcbc5182f524886946fdefec86faf75110f35c5 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23064)
 
   
   
   





Re: [I] [SUPPORT] Hudi cdc upserts stopped working after migrating from hudi 13.1 to 14.0 [hudi]

2024-04-01 Thread via GitHub


ROOBALJINDAL closed issue #10884: [SUPPORT] Hudi cdc upserts stopped working 
after migrating from hudi 13.1 to 14.0
URL: https://github.com/apache/hudi/issues/10884





Re: [I] [SUPPORT] Hudi cdc upserts stopped working after migrating from hudi 13.1 to 14.0 [hudi]

2024-04-01 Thread via GitHub


ROOBALJINDAL commented on issue #10884:
URL: https://github.com/apache/hudi/issues/10884#issuecomment-2029401141

   I have found the issue. We were using a custom MssqlDebeziumSource class as 
the Debezium source, and its constructor was using `HoodieStreamerMetrics` 
instead of `HoodieIngestionMetrics` (which was introduced in Hudi 14.0).
   
   Once we corrected the class, it started working. We can close this issue.





Re: [I] [SUPPORT] The parquet files for the MOR table have been generated, but the RO table in Hive still cannot query the latest data in the parquet files. [hudi]

2024-04-01 Thread via GitHub


ad1happy2go commented on issue #10907:
URL: https://github.com/apache/hudi/issues/10907#issuecomment-2029384654

   @Toroidals Did you get a chance to check it? Were you able to identify the 
root cause of the issue?





[jira] [Created] (HUDI-7559) Fix issues with functional index (on column stats) based pruning

2024-04-01 Thread Vinaykumar Bhat (Jira)
Vinaykumar Bhat created HUDI-7559:
-

 Summary: Fix issues with functional index (on column stats) based 
pruning
 Key: HUDI-7559
 URL: https://issues.apache.org/jira/browse/HUDI-7559
 Project: Apache Hudi
  Issue Type: Bug
Reporter: Vinaykumar Bhat








[jira] [Assigned] (HUDI-7559) Fix issues with functional index (on column stats) based pruning

2024-04-01 Thread Vinaykumar Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinaykumar Bhat reassigned HUDI-7559:
-

Assignee: Vinaykumar Bhat

> Fix issues with functional index (on column stats) based pruning
> 
>
> Key: HUDI-7559
> URL: https://issues.apache.org/jira/browse/HUDI-7559
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Vinaykumar Bhat
>Assignee: Vinaykumar Bhat
>Priority: Major
>






Re: [I] [SUPPORT] Requesting Support for insert_overwrite in Delta Streamer [hudi]

2024-04-01 Thread via GitHub


ad1happy2go commented on issue #10896:
URL: https://github.com/apache/hudi/issues/10896#issuecomment-2029348950

   As Sudha suggested, can you also send a mail to the dev list and point to 
the conversation here? It would be good to hear thoughts on this from others.





Re: [I] [SUPPORT] Requesting Support for insert_overwrite in Delta Streamer [hudi]

2024-04-01 Thread via GitHub


ad1happy2go commented on issue #10896:
URL: https://github.com/apache/hudi/issues/10896#issuecomment-2029348293

   @soumilshah1995 This makes sense. Created a JIRA to track it as well - 
https://issues.apache.org/jira/browse/HUDI-7558




