yihua commented on code in PR #11373: URL: https://github.com/apache/hudi/pull/11373#discussion_r1685274595
##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/ProtoConversionUtil.java:
##########
@@ -157,7 +156,7 @@ private static class AvroSupport {
     private static final String OVERFLOW_BYTES_FIELD_NAME = "proto_bytes";
     private static final Schema RECURSION_OVERFLOW_SCHEMA = Schema.createRecord("recursion_overflow", null, "org.apache.hudi.proto", false,
         Arrays.asList(new Schema.Field(OVERFLOW_DESCRIPTOR_FIELD_NAME, STRING_SCHEMA, null, ""),
-            new Schema.Field(OVERFLOW_BYTES_FIELD_NAME, Schema.create(Schema.Type.BYTES), null, getUTF8Bytes(""))));
+            new Schema.Field(OVERFLOW_BYTES_FIELD_NAME, Schema.create(Schema.Type.BYTES), null, "".getBytes())));

Review Comment:
   This change seems unintended?

##########
hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/helpers/TestProtoConversionUtil.java:
##########
@@ -206,7 +233,7 @@ private Pair<Sample, GenericRecord> createInputOutputSampleWithRandomValues(Sche
     long primitiveFixedSignedLong = RANDOM.nextLong();
     boolean primitiveBoolean = RANDOM.nextBoolean();
     String primitiveString = randomString(10);
-    byte[] primitiveBytes = getUTF8Bytes(randomString(10));
+    byte[] primitiveBytes = randomString(10).getBytes();

Review Comment:
   The same applies here: this change looks unintended. In OSS, we explicitly enforce UTF-8. `.getBytes()` without an argument falls back to the platform default charset, which is typically (but not guaranteed to be) `UTF-8` on UNIX-like systems.

##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/ProtoConversionUtil.java:
##########
@@ -348,17 +348,17 @@ private Object getDefault(Descriptors.FieldDescriptor f) {
         case SFIXED64:
           return 0;
         case UINT64:
-          return "\u0000"; // requires bytes for decimal type
+          return DECIMAL_CONVERSION.toFixed(new BigDecimal(BigInteger.ZERO), fieldSchema, fieldSchema.getLogicalType()).bytes();

Review Comment:
   I assume this does not cause a backwards-compatibility issue?

-- 
This is an automated message from the Apache Git Service.
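On the UTF-8 point raised in the review comments above, the following is a minimal sketch of why an explicit charset is preferred over the no-argument `getBytes()`. The class name `Utf8BytesDemo` and the helper `utf8Bytes` are hypothetical stand-ins for illustration; the actual `getUTF8Bytes` helper referenced in the diff is not shown in this thread, so its implementation is only assumed to behave like this.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8BytesDemo {

  // Hypothetical stand-in for the getUTF8Bytes helper referenced in the diff.
  // Assumption: the real helper encodes with an explicit UTF-8 charset.
  public static byte[] utf8Bytes(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // "h\u00e9llo" contains one non-ASCII character, so encodings can differ.
    String sample = "h\u00e9llo";

    byte[] explicit = utf8Bytes(sample);
    // The no-argument overload uses Charset.defaultCharset(), which depends
    // on the JVM version and platform configuration.
    byte[] platformDefault = sample.getBytes();
    byte[] latin1 = sample.getBytes(StandardCharsets.ISO_8859_1);

    System.out.println("default charset: " + Charset.defaultCharset());
    System.out.println("explicit UTF-8 length: " + explicit.length);
    System.out.println("ISO-8859-1 length: " + latin1.length);
    System.out.println("default matches UTF-8: " + Arrays.equals(explicit, platformDefault));
  }
}
```

For ASCII-only input such as the empty string in the first hunk, both calls yield the same bytes, so the difference only surfaces with non-ASCII data on a JVM whose default charset is not UTF-8.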
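On the `UINT64` default in the last hunk, here is a rough, stdlib-only sketch of the byte layout a fixed-size decimal encoding produces for zero: the unscaled value in big-endian two's complement, sign-extended to the fixed size. The helper `toFixedBytes` and the size of 9 bytes are assumptions made for illustration. This does not call Avro's `DecimalConversion`, and the fixed size actually used by `ProtoConversionUtil` is not stated in this thread.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Arrays;

public class DecimalZeroBytesDemo {

  // Hypothetical helper: encodes the unscaled value as big-endian two's
  // complement, sign-extended to exactly `size` bytes.
  public static byte[] toFixedBytes(BigDecimal value, int size) {
    byte[] unscaled = value.unscaledValue().toByteArray();
    if (unscaled.length > size) {
      throw new IllegalArgumentException("value does not fit in " + size + " bytes");
    }
    byte[] out = new byte[size];
    // Pad with 0xFF for negative values and 0x00 otherwise.
    Arrays.fill(out, (byte) (value.signum() < 0 ? 0xFF : 0x00));
    System.arraycopy(unscaled, 0, out, size - unscaled.length, unscaled.length);
    return out;
  }

  public static void main(String[] args) {
    int assumedFixedSize = 9; // assumption, not taken from the PR
    byte[] zero = toFixedBytes(new BigDecimal(BigInteger.ZERO), assumedFixedSize);
    System.out.println("zero as fixed bytes: " + Arrays.toString(zero));
    // The previous default literal was the single character "\u0000".
    System.out.println("old literal length in chars: " + "\u0000".length());
  }
}
```

The sketch only shows that zero becomes a run of `0x00` bytes as long as the fixed size, whereas the old literal was a single NUL character. Whether existing readers treat the two defaults equivalently depends on Avro's schema handling, which is the open question in the review comment.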