hudi-agent commented on code in PR #18923:
URL: https://github.com/apache/hudi/pull/18923#discussion_r3459573449


##########
hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/SparkFileFormatInternalRowReaderContext.scala:
##########
@@ -86,6 +87,14 @@ class 
SparkFileFormatInternalRowReaderContext(baseFileReader: SparkColumnarFileR
 
   // For each field of `target`, replace its dataType with the matching 
field's projected
   // variant struct from `source` (when present). Non-matching fields pass 
through.
+  //

Review Comment:
   🤖 nit: this design rationale for `sparkRequiredSchema` is really explaining 
why that *field* must exist as a parallel construct, but it's tucked above 
`overlayVariantProjections`. A developer first encountering 
`sparkRequiredSchema` in the field declaration will look there for context, not 
here. Could you consider moving (or mirroring) this explanation as a comment on 
the field declaration itself?
   
   <sub><i>⚠️ AI-generated; verify before applying. React 👍/👎 to flag 
quality.</i></sub>



##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/io/storage/HoodieSparkParquetReader.java:
##########
@@ -144,16 +144,28 @@ public ClosableIterator<UnsafeRow> 
getUnsafeRowIterator(HoodieSchema requestedSc
    */
   public ClosableIterator<UnsafeRow> getUnsafeRowIterator(HoodieSchema 
requestedSchema, List<Filter> readFilters) throws IOException {
     HoodieSchema nonNullSchema = requestedSchema.getNonNullType();
-    StructType structSchema = 
HoodieInternalRowUtils.getCachedSchema(nonNullSchema);
+    return getUnsafeRowIterator(nonNullSchema, 
HoodieInternalRowUtils.getCachedSchema(nonNullSchema), readFilters);
+  }
+
+  /**
+   * Variant overload. {@code projectedStructSchema} is the requested Spark 
schema, which may carry

Review Comment:
   🤖 nit: "Variant overload." is ambiguous here — in a codebase full of Spark 
Variant-type discussions, "variant" reads as the data type rather than 
"alternate version." Could you open with something like "Overload that also 
accepts an explicit Spark struct schema carrying variant projections..." to 
avoid the double meaning?
   
   <sub><i>⚠️ AI-generated; verify before applying. React 👍/👎 to flag 
quality.</i></sub>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to