Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


yihua commented on code in PR #9894:
URL: https://github.com/apache/hudi/pull/9894#discussion_r1632554676


##
hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieBaseFileGroupRecordBuffer.java:
##
@@ -147,6 +153,37 @@ public void close() {
 records.clear();
   }
 
+  /**
+   * Compares two {@link Comparable}s.  If both are numbers, converts them to 
{@link Long} for comparison.
+   * If one of the {@link Comparable}s is a String, assumes that both are 
String values for comparison.
+   *
+   * @param o1 {@link Comparable} object.
+   * @param o2 other {@link Comparable} object to compare to.
+   * @return comparison result.
+   */
+  @VisibleForTesting
+  static int compareTo(Comparable o1, Comparable o2) {
+// TODO(HUDI-7848): fix the delete records to contain the correct ordering 
value type
+//  so this util with the number comparison is not necessary.
+try {
+  return o1.compareTo(o2);
+} catch (ClassCastException e) {
+  if (o1 instanceof Number && o2 instanceof Number) {
+Long o1LongValue = ((Number) o1).longValue();
+Long o2LongValue = ((Number) o2).longValue();
+return o1LongValue.compareTo(o2LongValue);

Review Comment:
   HUDI-7852 to track.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


yihua commented on code in PR #9894:
URL: https://github.com/apache/hudi/pull/9894#discussion_r1632553603


##
hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieBaseFileGroupRecordBuffer.java:
##
@@ -147,6 +153,37 @@ public void close() {
 records.clear();
   }
 
+  /**
+   * Compares two {@link Comparable}s.  If both are numbers, converts them to 
{@link Long} for comparison.
+   * If one of the {@link Comparable}s is a String, assumes that both are 
String values for comparison.
+   *
+   * @param o1 {@link Comparable} object.
+   * @param o2 other {@link Comparable} object to compare to.
+   * @return comparison result.
+   */
+  @VisibleForTesting
+  static int compareTo(Comparable o1, Comparable o2) {
+// TODO(HUDI-7848): fix the delete records to contain the correct ordering 
value type
+//  so this util with the number comparison is not necessary.
+try {
+  return o1.compareTo(o2);
+} catch (ClassCastException e) {
+  if (o1 instanceof Number && o2 instanceof Number) {
+Long o1LongValue = ((Number) o1).longValue();
+Long o2LongValue = ((Number) o2).longValue();
+return o1LongValue.compareTo(o2LongValue);

Review Comment:
   We can constrain the comparison to Long and Integer only to limit the 
possibility of wrong results.  I'll create a follow-up PR to fix this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


yihua commented on code in PR #9894:
URL: https://github.com/apache/hudi/pull/9894#discussion_r1632551875


##
hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieBaseFileGroupRecordBuffer.java:
##
@@ -147,6 +153,37 @@ public void close() {
 records.clear();
   }
 
+  /**
+   * Compares two {@link Comparable}s.  If both are numbers, converts them to 
{@link Long} for comparison.
+   * If one of the {@link Comparable}s is a String, assumes that both are 
String values for comparison.
+   *
+   * @param o1 {@link Comparable} object.
+   * @param o2 other {@link Comparable} object to compare to.
+   * @return comparison result.
+   */
+  @VisibleForTesting
+  static int compareTo(Comparable o1, Comparable o2) {
+// TODO(HUDI-7848): fix the delete records to contain the correct ordering 
value type

Review Comment:
   Yes, based on the test cases this only happens when the ordering field value 
is deserialized from the delete records.  We need to check if the existing 
Avro-based merging logic has done schema handling to make this work (which may 
also incur additional overhead).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


yihua commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2157165840

   CI is green.
   https://github.com/apache/hudi/assets/2497195/6d8f4fa9-3e64-4914-9a46-05e8783cd458;>
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


codope merged PR #9894:
URL: https://github.com/apache/hudi/pull/9894


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2157141857

   
   ## CI report:
   
   * 3a1ec4524a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24313)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2157136586

   
   ## CI report:
   
   * ca01c48cd352583dbf024006de57c9f6827b237b Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24324)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24323)
 
   * 3a1ec4524a UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


codope commented on code in PR #9894:
URL: https://github.com/apache/hudi/pull/9894#discussion_r1632524037


##
hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieBaseFileGroupRecordBuffer.java:
##
@@ -147,6 +153,37 @@ public void close() {
 records.clear();
   }
 
+  /**
+   * Compares two {@link Comparable}s.  If both are numbers, converts them to 
{@link Long} for comparison.
+   * If one of the {@link Comparable}s is a String, assumes that both are 
String values for comparison.
+   *
+   * @param o1 {@link Comparable} object.
+   * @param o2 other {@link Comparable} object to compare to.
+   * @return comparison result.
+   */
+  @VisibleForTesting
+  static int compareTo(Comparable o1, Comparable o2) {
+// TODO(HUDI-7848): fix the delete records to contain the correct ordering 
value type

Review Comment:
   does this happen only for delete records?



##
hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieBaseFileGroupRecordBuffer.java:
##
@@ -147,6 +153,37 @@ public void close() {
 records.clear();
   }
 
+  /**
+   * Compares two {@link Comparable}s.  If both are numbers, converts them to 
{@link Long} for comparison.
+   * If one of the {@link Comparable}s is a String, assumes that both are 
String values for comparison.
+   *
+   * @param o1 {@link Comparable} object.
+   * @param o2 other {@link Comparable} object to compare to.
+   * @return comparison result.
+   */
+  @VisibleForTesting
+  static int compareTo(Comparable o1, Comparable o2) {
+// TODO(HUDI-7848): fix the delete records to contain the correct ordering 
value type
+//  so this util with the number comparison is not necessary.
+try {
+  return o1.compareTo(o2);
+} catch (ClassCastException e) {
+  if (o1 instanceof Number && o2 instanceof Number) {
+Long o1LongValue = ((Number) o1).longValue();
+Long o2LongValue = ((Number) o2).longValue();
+return o1LongValue.compareTo(o2LongValue);

Review Comment:
   can possibly lead to wrong result with float/double?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2157082849

   
   ## CI report:
   
   * ca01c48cd352583dbf024006de57c9f6827b237b Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24324)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24323)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2157034573

   
   ## CI report:
   
   * 7b6c9d86accaf976f4db0185fa1a203c82f04446 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24322)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24321)
 
   * ca01c48cd352583dbf024006de57c9f6827b237b UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2157025978

   
   ## CI report:
   
   * a6ffe1240055d6135a517dfcada59edc95383423 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24318)
 
   * 7b6c9d86accaf976f4db0185fa1a203c82f04446 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2157018634

   
   ## CI report:
   
   * a6ffe1240055d6135a517dfcada59edc95383423 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156933007

   
   ## CI report:
   
   * 3a1ec4524a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24313)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156929070

   
   ## CI report:
   
   * a6ffe1240055d6135a517dfcada59edc95383423 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24319)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24318)
 
   * 3a1ec4524a UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156876724

   
   ## CI report:
   
   * a6ffe1240055d6135a517dfcada59edc95383423 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24319)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24318)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156823221

   
   ## CI report:
   
   * 2a190ee68d Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24314)
 
   * a6ffe1240055d6135a517dfcada59edc95383423 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156416934

   
   ## CI report:
   
   * 2a190ee68d Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24314)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156414384

   
   ## CI report:
   
   * a558a9ee98fe91fabd30987a69cdc3b28c5f18b0 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24316)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24315)
 
   * 2a190ee68d UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156391522

   
   ## CI report:
   
   * a558a9ee98fe91fabd30987a69cdc3b28c5f18b0 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24316)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24315)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156369924

   
   ## CI report:
   
   * 57a91012d2a3c11d5fbe5dad55a6392d95953b0e Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24312)
 
   * a558a9ee98fe91fabd30987a69cdc3b28c5f18b0 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24316)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156367284

   
   ## CI report:
   
   * 57a91012d2a3c11d5fbe5dad55a6392d95953b0e Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24312)
 
   * a558a9ee98fe91fabd30987a69cdc3b28c5f18b0 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-09 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156353139

   
   ## CI report:
   
   * 04e8b0a67a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24308)
 
   * 57a91012d2a3c11d5fbe5dad55a6392d95953b0e UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156280208

   
   ## CI report:
   
   * 04e8b0a67a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24308)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156278203

   
   ## CI report:
   
   * 04e8b0a67a675dc34bede7fb3c8f72c3b137cd60 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24308)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24307)
 
   * 04e8b0a67a UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156259327

   
   ## CI report:
   
   * 04e8b0a67a675dc34bede7fb3c8f72c3b137cd60 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24308)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24307)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156246947

   
   ## CI report:
   
   * f5503b5c92 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24305)
 
   * 04e8b0a67a675dc34bede7fb3c8f72c3b137cd60 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24308)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156244799

   
   ## CI report:
   
   * f5503b5c92 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24305)
 
   * 04e8b0a67a675dc34bede7fb3c8f72c3b137cd60 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156242265

   
   ## CI report:
   
   * f5503b5c92aa9899ee55447cd415467a255caa82 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24305)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24303)
 
   * f5503b5c92 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156224903

   
   ## CI report:
   
   * f5503b5c92aa9899ee55447cd415467a255caa82 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24305)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24303)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156214340

   
   ## CI report:
   
   * 79b7c1f744fe13e094f245b38e131c63d801ea1a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24302)
 
   * f5503b5c92aa9899ee55447cd415467a255caa82 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24305)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156212175

   
   ## CI report:
   
   * 79b7c1f744fe13e094f245b38e131c63d801ea1a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24302)
 
   * f5503b5c92aa9899ee55447cd415467a255caa82 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156210177

   
   ## CI report:
   
   * 79b7c1f744fe13e094f245b38e131c63d801ea1a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24302)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


yihua commented on code in PR #9894:
URL: https://github.com/apache/hudi/pull/9894#discussion_r1372348212


##
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##
@@ -1276,5 +1291,35 @@ public HoodieTableMetaClient initTable(Configuration 
configuration, String baseP
 throws IOException {
   return HoodieTableMetaClient.initTableAndGetMetaClient(configuration, 
basePath, build());
 }
+
+private void validateMergeConfigs() {
+  boolean payloadClassNameSet = null != payloadClassName;
+  boolean payloadTypeSet = null != payloadType;
+  boolean recordMergerStrategySet = null != recordMergerStrategy;
+  boolean recordMergeModeSet = null != recordMergeMode;
+
+  checkArgument(recordMergeModeSet,
+  "Record merge mode " + HoodieTableConfig.RECORD_MERGE_MODE.key() + " 
should be set");

Review Comment:
   This is mandatory in the table config and during table upgrade, the merge 
mode should be inferred from either the payload class name / type or record 
merger strategy.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


yihua commented on code in PR #9894:
URL: https://github.com/apache/hudi/pull/9894#discussion_r1632119102


##
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##
@@ -1276,5 +1291,35 @@ public HoodieTableMetaClient initTable(Configuration 
configuration, String baseP
 throws IOException {
   return HoodieTableMetaClient.initTableAndGetMetaClient(configuration, 
basePath, build());
 }
+
+private void validateMergeConfigs() {
+  boolean payloadClassNameSet = null != payloadClassName;
+  boolean payloadTypeSet = null != payloadType;
+  boolean recordMergerStrategySet = null != recordMergerStrategy;
+  boolean recordMergeModeSet = null != recordMergeMode;
+
+  checkArgument(recordMergeModeSet,
+  "Record merge mode " + HoodieTableConfig.RECORD_MERGE_MODE.key() + " 
should be set");

Review Comment:
   The PR is updated and this is done in 
`HoodieTableMetaClient$PropertyBuilder#build`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


yihua commented on code in PR #9894:
URL: https://github.com/apache/hudi/pull/9894#discussion_r1632119003


##
hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieFileGroupReader.java:
##
@@ -242,6 +249,11 @@ public HoodieFileGroupReaderIterator 
getClosableIterator() {
 return new HoodieFileGroupReaderIterator<>(this);
   }
 
+  public static RecordMergeMode getRecordMergeMode(Properties props) {
+String mergeMode = getStringWithAltKeys(props, 
HoodieCommonConfig.RECORD_MERGE_MODE, true).toUpperCase();

Review Comment:
   Right now, since there is only placeholder upgrade and downgrade methods 
from between table version 6 and 8, I added the inference of record merge mode 
inside `HoodieTableMetaClient$PropertyBuilder#build`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


yihua commented on code in PR #9894:
URL: https://github.com/apache/hudi/pull/9894#discussion_r1632118953


##
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##
@@ -1382,5 +1398,35 @@ public HoodieTableMetaClient 
initTable(StorageConfiguration configuration, St
 throws IOException {
   return HoodieTableMetaClient.initTableAndGetMetaClient(configuration, 
basePath, build());
 }
+
+private void validateMergeConfigs() {

Review Comment:
   I invoke this method after inferring the record merge mode in 
`HoodieTableMetaClient$PropertyBuilder#build`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156197076

   
   ## CI report:
   
   * 54ba973a7ee35d5daf2ccc802aac5ec699cdf4af Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24299)
 
   * 79b7c1f744fe13e094f245b38e131c63d801ea1a Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24302)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156193662

   
   ## CI report:
   
   * 54ba973a7ee35d5daf2ccc802aac5ec699cdf4af Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24299)
 
   * 79b7c1f744fe13e094f245b38e131c63d801ea1a UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


yihua commented on code in PR #9894:
URL: https://github.com/apache/hudi/pull/9894#discussion_r1632102563


##
hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieFileGroupReader.java:
##
@@ -242,6 +249,11 @@ public HoodieFileGroupReaderIterator 
getClosableIterator() {
 return new HoodieFileGroupReaderIterator<>(this);
   }
 
+  public static RecordMergeMode getRecordMergeMode(Properties props) {
+String mergeMode = getStringWithAltKeys(props, 
HoodieCommonConfig.RECORD_MERGE_MODE, true).toUpperCase();

Review Comment:
   Sounds good.  The record merge mode is required to dictate the merging 
behavior in release 1.x, playing the same role as the payload class config in 
the release 0.x.  During table upgrade, we need to infer the record merge mode 
based on the payload class so it's correctly set.  HUDI-7847 to track the work.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156135926

   
   ## CI report:
   
   * 54ba973a7ee35d5daf2ccc802aac5ec699cdf4af Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24299)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156131130

   
   ## CI report:
   
   * 84698763395160c09cac2c1529615a900cbb4625 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24296)
 
   * 54ba973a7ee35d5daf2ccc802aac5ec699cdf4af Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24299)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156117209

   
   ## CI report:
   
   * 84698763395160c09cac2c1529615a900cbb4625 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24296)
 
   * 54ba973a7ee35d5daf2ccc802aac5ec699cdf4af UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156058771

   
   ## CI report:
   
   * 84698763395160c09cac2c1529615a900cbb4625 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24296)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156041218

   
   ## CI report:
   
   * a70cfc6db41a781bb3b6c9c8a9138892f7a12687 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24291)
 
   * 84698763395160c09cac2c1529615a900cbb4625 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24296)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-08 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2156038868

   
   ## CI report:
   
   * a70cfc6db41a781bb3b6c9c8a9138892f7a12687 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24291)
 
   * 84698763395160c09cac2c1529615a900cbb4625 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-07 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2155818031

   
   ## CI report:
   
   * a70cfc6db41a781bb3b6c9c8a9138892f7a12687 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24291)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-07 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2155806588

   
   ## CI report:
   
   * c2dec94b442920784b3914cc13b87294e734a477 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24272)
 
   * a70cfc6db41a781bb3b6c9c8a9138892f7a12687 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24291)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-07 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2155804768

   
   ## CI report:
   
   * c2dec94b442920784b3914cc13b87294e734a477 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24272)
 
   * a70cfc6db41a781bb3b6c9c8a9138892f7a12687 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-07 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2154949218

   
   ## CI report:
   
   * c2dec94b442920784b3914cc13b87294e734a477 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24272)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-07 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2154856344

   
   ## CI report:
   
   * 3e8bdc41e97141b94a9f60a3450f41ad342fa45e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24271)
 
   * c2dec94b442920784b3914cc13b87294e734a477 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24272)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-07 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2154841313

   
   ## CI report:
   
   * 3e8bdc41e97141b94a9f60a3450f41ad342fa45e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24271)
 
   * c2dec94b442920784b3914cc13b87294e734a477 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-07 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2154824717

   
   ## CI report:
   
   * 3e8bdc41e97141b94a9f60a3450f41ad342fa45e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24271)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-07 Thread via GitHub


codope commented on code in PR #9894:
URL: https://github.com/apache/hudi/pull/9894#discussion_r1631121419


##
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##
@@ -1382,5 +1398,35 @@ public HoodieTableMetaClient 
initTable(StorageConfiguration configuration, St
 throws IOException {
   return HoodieTableMetaClient.initTableAndGetMetaClient(configuration, 
basePath, build());
 }
+
+private void validateMergeConfigs() {

Review Comment:
   Where is this method used? 



##
hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieFileGroupReader.java:
##
@@ -242,6 +249,11 @@ public HoodieFileGroupReaderIterator 
getClosableIterator() {
 return new HoodieFileGroupReaderIterator<>(this);
   }
 
+  public static RecordMergeMode getRecordMergeMode(Properties props) {
+String mergeMode = getStringWithAltKeys(props, 
HoodieCommonConfig.RECORD_MERGE_MODE, true).toUpperCase();

Review Comment:
   note: Setting `useDefaultValue` to true as many tests don't set record merge 
mode.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-07 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2154747202

   
   ## CI report:
   
   * 083ea7ec0e0cb2f14fc47faff5d781a64cca3874 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24269)
 
   * 3e8bdc41e97141b94a9f60a3450f41ad342fa45e Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24271)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-07 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2154733863

   
   ## CI report:
   
   * 083ea7ec0e0cb2f14fc47faff5d781a64cca3874 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24269)
 
   * 3e8bdc41e97141b94a9f60a3450f41ad342fa45e UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-07 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2154721676

   
   ## CI report:
   
   * 083ea7ec0e0cb2f14fc47faff5d781a64cca3874 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24269)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-07 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2154654455

   
   ## CI report:
   
   * f1ad4786aad397d5bad19d3cf68cbbb90c92d9ac Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24267)
 
   * 083ea7ec0e0cb2f14fc47faff5d781a64cca3874 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24269)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-07 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2154641685

   
   ## CI report:
   
   * f1ad4786aad397d5bad19d3cf68cbbb90c92d9ac Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24267)
 
   * 083ea7ec0e0cb2f14fc47faff5d781a64cca3874 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-07 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2154551949

   
   ## CI report:
   
   * f1ad4786aad397d5bad19d3cf68cbbb90c92d9ac Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24267)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2024-06-07 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-2154479656

   
   ## CI report:
   
   * be208c2f40cdf7e82abc2d1627bf21f7ad509f71 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20489)
 
   * f1ad4786aad397d5bad19d3cf68cbbb90c92d9ac UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2023-10-25 Thread via GitHub


yihua commented on code in PR #9894:
URL: https://github.com/apache/hudi/pull/9894#discussion_r1372349415


##
hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieBaseFileGroupRecordBuffer.java:
##
@@ -119,21 +124,36 @@ protected Option doProcessNextDataRecord(T record,
   Map metadata,
   Pair, Map> existingRecordMetadataPair) throws IOException {
 if (existingRecordMetadataPair != null) {
-  // Merge and store the combined record
-  // Note that the incoming `record` is from an older commit, so it should 
be put as
-  // the `older` in the merge API
-  HoodieRecord combinedRecord = (HoodieRecord) recordMerger.merge(
-  readerContext.constructHoodieRecord(Option.of(record), metadata, 
readerSchema),
-  readerSchema,
-  readerContext.constructHoodieRecord(
-  existingRecordMetadataPair.getLeft(), 
existingRecordMetadataPair.getRight(), readerSchema),
-  readerSchema,
-  payloadProps).get().getLeft();
-  // If pre-combine returns existing record, no need to update it
-  if (combinedRecord.getData() != 
existingRecordMetadataPair.getLeft().get()) {
-return Option.of(combinedRecord.getData());
+  switch (recordMergeMode) {
+case OVERWRITE_WITH_LATEST:
+  return Option.empty();
+case EVENT_TIME_ORDERING:
+  Comparable incomingOrderingValue = readerContext.getOrderingValue(
+  Option.of(record), metadata, readerSchema, payloadProps);
+  Comparable existingOrderingValue = readerContext.getOrderingValue(
+  existingRecordMetadataPair.getLeft(), 
existingRecordMetadataPair.getRight(), readerSchema, payloadProps);
+  if (incomingOrderingValue.compareTo(existingOrderingValue) > 0) {
+return Option.of(record);
+  }
+  return Option.empty();

Review Comment:
   Yes, `existingRecordMetadataPair` should be in the log record mapping.  The 
convention here is that, if `Option.empty()` is returned from this method, the 
log record of the same record key in the mapping should not be updated, to 
avoid the `readerContext.seal`:
   ```
   @Override
 public void processNextDataRecord(T record, Map metadata, 
Object recordKey) throws IOException {
   Pair, Map> existingRecordMetadataPair = 
records.get(recordKey);
   Option mergedRecord = doProcessNextDataRecord(record, metadata, 
existingRecordMetadataPair);
   if (mergedRecord.isPresent()) {
 records.put(recordKey, 
Pair.of(Option.ofNullable(readerContext.seal(mergedRecord.get())), metadata));
   }
 }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2023-10-25 Thread via GitHub


yihua commented on code in PR #9894:
URL: https://github.com/apache/hudi/pull/9894#discussion_r1372348212


##
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##
@@ -1276,5 +1291,35 @@ public HoodieTableMetaClient initTable(Configuration 
configuration, String baseP
 throws IOException {
   return HoodieTableMetaClient.initTableAndGetMetaClient(configuration, 
basePath, build());
 }
+
+private void validateMergeConfigs() {
+  boolean payloadClassNameSet = null != payloadClassName;
+  boolean payloadTypeSet = null != payloadType;
+  boolean recordMergerStrategySet = null != recordMergerStrategy;
+  boolean recordMergeModeSet = null != recordMergeMode;
+
+  checkArgument(recordMergeModeSet,
+  "Record merge mode " + HoodieTableConfig.RECORD_MERGE_MODE.key() + " 
should be set");

Review Comment:
   This is mandatory in the table config and during table upgrade, the merge 
mode should inferred from either the payload class name / type or record merger 
strategy.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2023-10-25 Thread via GitHub


codope commented on code in PR #9894:
URL: https://github.com/apache/hudi/pull/9894#discussion_r1372317061


##
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##
@@ -1276,5 +1291,35 @@ public HoodieTableMetaClient initTable(Configuration 
configuration, String baseP
 throws IOException {
   return HoodieTableMetaClient.initTableAndGetMetaClient(configuration, 
basePath, build());
 }
+
+private void validateMergeConfigs() {
+  boolean payloadClassNameSet = null != payloadClassName;
+  boolean payloadTypeSet = null != payloadType;
+  boolean recordMergerStrategySet = null != recordMergerStrategy;
+  boolean recordMergeModeSet = null != recordMergeMode;
+
+  checkArgument(recordMergeModeSet,
+  "Record merge mode " + HoodieTableConfig.RECORD_MERGE_MODE.key() + " 
should be set");

Review Comment:
   Is it a mandatory config? How will it affect users upgrading to new version?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2023-10-25 Thread via GitHub


codope commented on code in PR #9894:
URL: https://github.com/apache/hudi/pull/9894#discussion_r1372317061


##
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##
@@ -1276,5 +1291,35 @@ public HoodieTableMetaClient initTable(Configuration 
configuration, String baseP
 throws IOException {
   return HoodieTableMetaClient.initTableAndGetMetaClient(configuration, 
basePath, build());
 }
+
+private void validateMergeConfigs() {
+  boolean payloadClassNameSet = null != payloadClassName;
+  boolean payloadTypeSet = null != payloadType;
+  boolean recordMergerStrategySet = null != recordMergerStrategy;
+  boolean recordMergeModeSet = null != recordMergeMode;
+
+  checkArgument(recordMergeModeSet,
+  "Record merge mode " + HoodieTableConfig.RECORD_MERGE_MODE.key() + " 
should be set");

Review Comment:
   Is it a mandatry config? How will it affect users upgrading to new version?



##
hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieBaseFileGroupRecordBuffer.java:
##
@@ -119,21 +124,36 @@ protected Option doProcessNextDataRecord(T record,
   Map metadata,
   Pair, Map> existingRecordMetadataPair) throws IOException {
 if (existingRecordMetadataPair != null) {
-  // Merge and store the combined record
-  // Note that the incoming `record` is from an older commit, so it should 
be put as
-  // the `older` in the merge API
-  HoodieRecord combinedRecord = (HoodieRecord) recordMerger.merge(
-  readerContext.constructHoodieRecord(Option.of(record), metadata, 
readerSchema),
-  readerSchema,
-  readerContext.constructHoodieRecord(
-  existingRecordMetadataPair.getLeft(), 
existingRecordMetadataPair.getRight(), readerSchema),
-  readerSchema,
-  payloadProps).get().getLeft();
-  // If pre-combine returns existing record, no need to update it
-  if (combinedRecord.getData() != 
existingRecordMetadataPair.getLeft().get()) {
-return Option.of(combinedRecord.getData());
+  switch (recordMergeMode) {
+case OVERWRITE_WITH_LATEST:
+  return Option.empty();
+case EVENT_TIME_ORDERING:
+  Comparable incomingOrderingValue = readerContext.getOrderingValue(
+  Option.of(record), metadata, readerSchema, payloadProps);
+  Comparable existingOrderingValue = readerContext.getOrderingValue(
+  existingRecordMetadataPair.getLeft(), 
existingRecordMetadataPair.getRight(), readerSchema, payloadProps);
+  if (incomingOrderingValue.compareTo(existingOrderingValue) > 0) {
+return Option.of(record);
+  }
+  return Option.empty();

Review Comment:
   Why empty? Should it not be from `existingRecordMetadataPair`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2023-10-25 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-1779951472

   
   ## CI report:
   
   * be208c2f40cdf7e82abc2d1627bf21f7ad509f71 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20489)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2023-10-25 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-1779872587

   
   ## CI report:
   
   * 75e98fe81be61e02f30d41d798ea86b733a26e2a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20448)
 
   * be208c2f40cdf7e82abc2d1627bf21f7ad509f71 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20489)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2023-10-25 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-1779858774

   
   ## CI report:
   
   * 75e98fe81be61e02f30d41d798ea86b733a26e2a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20448)
 
   * be208c2f40cdf7e82abc2d1627bf21f7ad509f71 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2023-10-23 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-1776409405

   
   ## CI report:
   
   * 75e98fe81be61e02f30d41d798ea86b733a26e2a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20448)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2023-10-23 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-1776362033

   
   ## CI report:
   
   * 74dab9f4a045822aef5565ff24cb8bbf15ef0f65 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20410)
 
   * 75e98fe81be61e02f30d41d798ea86b733a26e2a UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2023-10-19 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-1772066552

   
   ## CI report:
   
   * 74dab9f4a045822aef5565ff24cb8bbf15ef0f65 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20410)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2023-10-19 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-1771987971

   
   ## CI report:
   
   * 74dab9f4a045822aef5565ff24cb8bbf15ef0f65 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20410)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6798] Add record merging mode and implement event-time ordering in the new file group reader [hudi]

2023-10-19 Thread via GitHub


hudi-bot commented on PR #9894:
URL: https://github.com/apache/hudi/pull/9894#issuecomment-1771981041

   
   ## CI report:
   
   * 74dab9f4a045822aef5565ff24cb8bbf15ef0f65 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org