hudi-agent commented on code in PR #18539:
URL: https://github.com/apache/hudi/pull/18539#discussion_r3183330011
##########
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/util/RowDataToAvroConverters.java:
##########
@@ -273,6 +278,48 @@ public Object convert(HoodieSchema schema, Object object) {
};
}
+ /**
+ * Creates a converter for Flink 2.1+ VARIANT LogicalType. The converter
receives a Flink
+ * {@code Variant} object at runtime and extracts the raw metadata/value
byte arrays via
+ * reflection, then packs them into an Avro GenericRecord with the Variant
schema.
+ *
+ * <p>Reflection is required because the {@code Variant} interface and
{@code BinaryVariant}
+ * class only exist in Flink 2.1+, while this module compiles against Flink
1.20.
+ *
+ * <p>No shredded-variant check is needed here: {@code
HoodieSchemaConverter.convertVariant()}
+ * already rejects shredded variants before a Flink type or converter is
ever constructed,
+ * and Flink 2.1 itself only supports unshredded variants (FLIP-521).
+ */
+ private static RowDataToAvroConverter createVariantConverter() {
+ return new RowDataToAvroConverter() {
+ private static final long serialVersionUID = 1L;
+ private transient java.lang.reflect.Method metadataMethod;
+ private transient java.lang.reflect.Method valueMethod;
+
+ @Override
+ public Object convert(HoodieSchema schema, Object object) {
+ try {
+ if (metadataMethod == null) {
+ metadataMethod = object.getClass().getMethod("getMetadata");
+ valueMethod = object.getClass().getMethod("getValue");
+ }
Review Comment:
🤖 The lazy init of `metadataMethod` / `valueMethod` here isn't thread-safe —
if two threads race in `convert()`, one could see `metadataMethod != null`
after the first assignment but before `valueMethod` is set, and then NPE on
`valueMethod.invoke(...)`. In normal Flink (single-threaded operator task) this
can't fire, but since the sibling
`AvroToRowDataConverters.createVariantConverter()` already resolves its
`Constructor` upfront against a known class, would it make sense to do the same
here against `BinaryVariant.class`? That would also remove the per-call
`getMethod` lookup on the first invocation.
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
##########
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/util/HoodieSchemaConverter.java:
##########
@@ -222,6 +253,10 @@ public static HoodieSchema convertToSchema(LogicalType
logicalType, String rowNa
case RAW:
default:
+ if ("VARIANT".equals(logicalType.getTypeRoot().name())) {
Review Comment:
🤖 nit: the `"VARIANT".equals(logicalType.getTypeRoot().name())` check is
duplicated across `HoodieSchemaConverter`, `ParquetSchemaConverter`,
`AvroToRowDataConverters`, and `RowDataToAvroConverters`. Could you extract a
small helper (e.g. `HoodieSchemaConverter.isVariantLogicalType(LogicalType)`)
so the intent is obvious in one place and future tweaks (e.g. once a hard
dependency on Flink 2.1+ lands) only touch one site?
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
##########
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/util/RowDataToAvroConverters.java:
##########
@@ -273,6 +278,48 @@ public Object convert(HoodieSchema schema, Object object) {
};
}
+ /**
+ * Creates a converter for Flink 2.1+ VARIANT LogicalType. The converter
receives a Flink
+ * {@code Variant} object at runtime and extracts the raw metadata/value
byte arrays via
+ * reflection, then packs them into an Avro GenericRecord with the Variant
schema.
+ *
+ * <p>Reflection is required because the {@code Variant} interface and
{@code BinaryVariant}
+ * class only exist in Flink 2.1+, while this module compiles against Flink
1.20.
+ *
+ * <p>No shredded-variant check is needed here: {@code
HoodieSchemaConverter.convertVariant()}
+ * already rejects shredded variants before a Flink type or converter is
ever constructed,
+ * and Flink 2.1 itself only supports unshredded variants (FLIP-521).
+ */
+ private static RowDataToAvroConverter createVariantConverter() {
Review Comment:
🤖 nit: for consistency with
`AvroToRowDataConverters.createVariantConverter()` (which resolves the
`BinaryVariant` constructor once at converter-creation time), could we resolve
`getMetadata`/`getValue` Methods eagerly here too? The current per-instance
lazy resolution adds a null check on every call and diverges from the sibling
helper without an obvious reason.
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]