hudi-agent commented on code in PR #18923:
URL: https://github.com/apache/hudi/pull/18923#discussion_r3459573449
##########
hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/SparkFileFormatInternalRowReaderContext.scala:
##########
@@ -86,6 +87,14 @@ class
SparkFileFormatInternalRowReaderContext(baseFileReader: SparkColumnarFileR
// For each field of `target`, replace its dataType with the matching
field's projected
// variant struct from `source` (when present). Non-matching fields pass
through.
+ //
Review Comment:
🤖 nit: this design rationale for `sparkRequiredSchema` is really explaining
why that *field* must exist as a parallel construct, but it's tucked above
`overlayVariantProjections`. A developer first encountering
`sparkRequiredSchema` in the field declaration will look there for context, not
here. Could you consider moving (or mirroring) this explanation as a comment on
the field declaration itself?
<sub><i>⚠️ AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/io/storage/HoodieSparkParquetReader.java:
##########
@@ -144,16 +144,28 @@ public ClosableIterator<UnsafeRow>
getUnsafeRowIterator(HoodieSchema requestedSc
*/
public ClosableIterator<UnsafeRow> getUnsafeRowIterator(HoodieSchema
requestedSchema, List<Filter> readFilters) throws IOException {
HoodieSchema nonNullSchema = requestedSchema.getNonNullType();
- StructType structSchema =
HoodieInternalRowUtils.getCachedSchema(nonNullSchema);
+ return getUnsafeRowIterator(nonNullSchema,
HoodieInternalRowUtils.getCachedSchema(nonNullSchema), readFilters);
+ }
+
+ /**
+ * Variant overload. {@code projectedStructSchema} is the requested Spark
schema, which may carry
Review Comment:
🤖 nit: "Variant overload." is ambiguous here — in a codebase full of Spark
Variant-type discussions, "variant" reads as the data type rather than
"alternate version." Could you open with something like "Overload that also
accepts an explicit Spark struct schema carrying variant projections..." to
avoid the double meaning?
<sub><i>⚠️ AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]