Re: [PR] [HUDI-9713] Payload deprecation [hudi]

via GitHub Fri, 22 Aug 2025 16:17:24 -0700


yihua commented on code in PR #13726:
URL: https://github.com/apache/hudi/pull/13726#discussion_r2294908100



##########
hudi-common/src/test/java/org/apache/hudi/common/table/read/TestHoodieFileGroupReaderBase.java:
##########
@@ -668,7 +668,7 @@ private void 
validateOutputFromFileGroupReaderWithNativeRecords(StorageConfigura
         .map(r -> HoodieAvroUtils.removeFields(r, metaCols))
         .collect(Collectors.toSet());
     Set<GenericRecord> expectedRecordSet = expectedRecords.stream()
-        .map(r -> (GenericRecord) r.getRight())
+        .map(r -> resetByteBufferPosition((GenericRecord) r.getRight()))

Review Comment:
   Is this needed for production code?



##########
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/table/action/commit/TestCopyOnWriteActionExecutor.java:
##########
@@ -144,7 +142,7 @@ private HoodieWriteConfig makeHoodieClientConfig() {
 
   private HoodieWriteConfig.Builder makeHoodieClientConfigBuilder() {
     // Prepare the AvroParquetIO
-    return 
HoodieWriteConfig.newBuilder().withPath(basePath).withSchema(SCHEMA.toString())

Review Comment:
   Should `SCHEMA` be deleted?



##########
hudi-common/src/test/java/org/apache/hudi/common/table/read/TestHoodieFileGroupReaderBase.java:
##########
@@ -217,7 +217,7 @@ public void testReadFileGroupWithMultipleOrderingFields() 
throws Exception {
       commitToTable(initialRecords, INSERT.value(), true, writeConfigs);
       validateOutputFromFileGroupReader(
           getStorageConf(), getBasePath(), true, 0, recordMergeMode,
-          initialRecords, initialRecords);
+          initialRecords, initialRecords, orderingValues.split(","));

Review Comment:
   So `orderingValues` are actually ordering field names.



##########
hudi-common/src/test/java/org/apache/hudi/common/table/read/TestHoodieFileGroupReaderBase.java:
##########
@@ -217,7 +217,7 @@ public void testReadFileGroupWithMultipleOrderingFields() 
throws Exception {
       commitToTable(initialRecords, INSERT.value(), true, writeConfigs);
       validateOutputFromFileGroupReader(
           getStorageConf(), getBasePath(), true, 0, recordMergeMode,
-          initialRecords, initialRecords);
+          initialRecords, initialRecords, orderingValues.split(","));

Review Comment:
   Maybe keep that as is in this PR if the renaming balloons the scope of 
changes.



##########
hudi-common/src/test/java/org/apache/hudi/common/table/read/TestHoodieFileGroupReaderBase.java:
##########
@@ -969,4 +964,30 @@ private static String removeHiveStylePartition(String 
partitionPath) {
     return partitionPath;
   }
 
+  private static IndexedRecord resetByteBufferPosition(IndexedRecord record) {
+    for (Schema.Field field : record.getSchema().getFields()) {
+      Object value = record.get(field.pos());
+      resetByteBufferField(value, field.schema());
+    }
+    return record;
+  }
+
+  private static void resetByteBufferField(Object value, Schema fieldSchema) {
+    if (value == null) {
+      return;
+    }
+    Schema.Type fieldType = 
HoodieAvroUtils.unwrapNullable(fieldSchema).getType();
+    if (fieldType == Schema.Type.BYTES || fieldType == Schema.Type.FIXED) {
+      // Reset position of ByteBuffer or Fixed type fields
+      if (value instanceof ByteBuffer) {
+        ((ByteBuffer) value).rewind();
+      }
+    } else if (fieldType == Schema.Type.RECORD) {
+      resetByteBufferPosition((IndexedRecord) value);
+    } else if (fieldType == Schema.Type.ARRAY) {
+      ((List<Object>) value).forEach(element -> resetByteBufferField(element, 
fieldSchema.getElementType()));
+    } else if (fieldType == Schema.Type.MAP) {
+      ((Map<Object, Object>) value).values().forEach(element -> 
resetByteBufferField(element, fieldSchema.getValueType()));
+    }
+  }

Review Comment:
   Is this because of certain testing artifacts that are reusing the record 
instances?



##########
hudi-common/src/test/java/org/apache/hudi/common/table/read/TestHoodieFileGroupReaderBase.java:
##########
@@ -969,4 +964,30 @@ private static String removeHiveStylePartition(String 
partitionPath) {
     return partitionPath;
   }
 
+  private static IndexedRecord resetByteBufferPosition(IndexedRecord record) {
+    for (Schema.Field field : record.getSchema().getFields()) {
+      Object value = record.get(field.pos());
+      resetByteBufferField(value, field.schema());
+    }
+    return record;
+  }
+
+  private static void resetByteBufferField(Object value, Schema fieldSchema) {
+    if (value == null) {
+      return;
+    }
+    Schema.Type fieldType = 
HoodieAvroUtils.unwrapNullable(fieldSchema).getType();
+    if (fieldType == Schema.Type.BYTES || fieldType == Schema.Type.FIXED) {
+      // Reset position of ByteBuffer or Fixed type fields
+      if (value instanceof ByteBuffer) {
+        ((ByteBuffer) value).rewind();
+      }
+    } else if (fieldType == Schema.Type.RECORD) {
+      resetByteBufferPosition((IndexedRecord) value);
+    } else if (fieldType == Schema.Type.ARRAY) {
+      ((List<Object>) value).forEach(element -> resetByteBufferField(element, 
fieldSchema.getElementType()));
+    } else if (fieldType == Schema.Type.MAP) {
+      ((Map<Object, Object>) value).values().forEach(element -> 
resetByteBufferField(element, fieldSchema.getValueType()));
+    }
+  }

Review Comment:
   I'm wondering if the testing logic deviates from actual read/write logic.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] [HUDI-9713] Payload deprecation [hudi]

Reply via email to