szehon-ho commented on code in PR #10547:
URL: https://github.com/apache/iceberg/pull/10547#discussion_r1664595395


##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java:
##########
@@ -366,15 +371,56 @@ public void pruneColumns(StructType requestedSchema) {
 
   private Schema schemaWithMetadataColumns() {
     // metadata columns
-    List<Types.NestedField> fields =
+    List<Types.NestedField> metadataFields =
         metaColumns.stream()
             .distinct()
             .map(name -> MetadataColumns.metadataColumn(table, name))
             .collect(Collectors.toList());
-    Schema meta = new Schema(fields);
+    Schema metadataSchema = calculateMetadataSchema(metadataFields);
 
     // schema or rows returned by readers
-    return TypeUtil.join(schema, meta);
+    return TypeUtil.join(schema, metadataSchema);
+  }
+
+  private Schema calculateMetadataSchema(List<Types.NestedField> 
metaColumnFields) {
+    Optional<Types.NestedField> partitionField =
+        metaColumnFields.stream()
+            .filter(f -> MetadataColumns.PARTITION_COLUMN_ID == f.fieldId())
+            .findFirst();
+
+    // only calculate potential column id collision if partition metadata 
column was requested
+    if (!partitionField.isPresent()) {
+      return new Schema(metaColumnFields);
+    }
+
+    Set<Integer> idsToReassign =

Review Comment:
   Style comment:  I do love functional but Java does make it a bit ugly.  
Maybe we can just do the following to reduce the unnecessary emptySet line here 
.
   
   ```
       Set<Integer> idsToReassign =
           
TypeUtil.indexById(partitionField.get().type().asStructType()).keySet();
   ```
   
   Probably the way to do this is 
   ```
   partitionField.map{
      ... 
       return new Schema(...)
   }.orElse{
      return new Schema(metaColumnFields);
   }
   ```
   
   but it looks a bit complex



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to