ramitg254 commented on code in PR #6565:
URL: https://github.com/apache/hive/pull/6565#discussion_r3504073739


##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergRecordReader.java:
##########
@@ -173,7 +174,22 @@ private CloseableIterable openGeneric(FileScanTask task, Schema readSchema) {
       default -> throw new UnsupportedOperationException(
           String.format("Cannot read %s file: %s", file.format().name(), file.location()));
     };
-    return applyResidualFiltering(iterable, residual, readSchema);
+    return applyResidualFiltering(withStructInitialDefaultBackfill(iterable, readSchema), residual, readSchema);
+  }
+
+  private CloseableIterable<T> withStructInitialDefaultBackfill(CloseableIterable<T> iterable, Schema readSchema) {
+    boolean needsBackfill = readSchema.columns().stream()
+        .filter(field -> field.type().isStructType())
+        .anyMatch(field -> !HiveSchemaUtil.getStructInitialDefaults(field.type().asStructType()).isEmpty());
+    if (!needsBackfill) {
+      return iterable;
+    }
+    return CloseableIterable.transform(iterable, row -> {
+      if (row instanceof Record curIceRecord) {
+        HiveSchemaUtil.backfillStructInitialDefaults(curIceRecord, readSchema.columns());
+      }
+      return row;
+    });

Review Comment:
   I had introduced a couple of redundancies there. I've optimized it now, please check.



##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergRecordReader.java:
##########
@@ -173,7 +174,22 @@ private CloseableIterable openGeneric(FileScanTask task, Schema readSchema) {
       default -> throw new UnsupportedOperationException(
           String.format("Cannot read %s file: %s", file.format().name(), file.location()));
     };
-    return applyResidualFiltering(iterable, residual, readSchema);
+    return applyResidualFiltering(withStructInitialDefaultBackfill(iterable, readSchema), residual, readSchema);
+  }
+
+  private CloseableIterable<T> withStructInitialDefaultBackfill(CloseableIterable<T> iterable, Schema readSchema) {
+    boolean needsBackfill = readSchema.columns().stream()
+        .filter(field -> field.type().isStructType())
+        .anyMatch(field -> !HiveSchemaUtil.getStructInitialDefaults(field.type().asStructType()).isEmpty());
+    if (!needsBackfill) {
+      return iterable;
+    }
+    return CloseableIterable.transform(iterable, row -> {

Review Comment:
   My bad. I've optimized it to build the default struct only once per struct 
field and then set that field on each row record.
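
   A rough sketch of that idea, using plain java.util maps as stand-ins for 
Iceberg records. This is not the code in the PR: StructColumn, backfill, and 
the "only fill when the value is null" rule are illustrative assumptions. It 
only shows the shape of the optimization, with default structs built once 
up front and the per-row lambda doing nothing but assignments.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class StructDefaultBackfillSketch {

  // Hypothetical stand-in for a struct column: its name plus the
  // initial-default values of its nested fields (empty if there are none).
  record StructColumn(String name, Map<String, Object> initialDefaults) {}

  // Returns a row transform. All default structs are built here, once,
  // so applying the transform to a row never rebuilds them.
  static Function<Map<String, Object>, Map<String, Object>> backfill(List<StructColumn> columns) {
    Map<String, Map<String, Object>> defaultsByColumn = new HashMap<>();
    for (StructColumn column : columns) {
      if (!column.initialDefaults().isEmpty()) {
        // Immutable copy, so sharing one instance across rows is safe.
        defaultsByColumn.put(column.name(), Map.copyOf(column.initialDefaults()));
      }
    }
    if (defaultsByColumn.isEmpty()) {
      // Nothing to backfill: skip the per-row work entirely.
      return Function.identity();
    }
    return row -> {
      // Assumption: only a missing (null) struct value gets the default.
      defaultsByColumn.forEach((name, defaults) -> {
        if (row.get(name) == null) {
          row.put(name, defaults);
        }
      });
      return row;
    };
  }
}
```

   The early Function.identity() return plays the same role as the 
needsBackfill check in the hunk above.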



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

