dramaticlly commented on code in PR #10547: URL: https://github.com/apache/iceberg/pull/10547#discussion_r1664606896
##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java:
##########
@@ -366,15 +371,56 @@ public void pruneColumns(StructType requestedSchema) {
   private Schema schemaWithMetadataColumns() {
     // metadata columns
-    List<Types.NestedField> fields =
+    List<Types.NestedField> metadataFields =
         metaColumns.stream()
             .distinct()
             .map(name -> MetadataColumns.metadataColumn(table, name))
             .collect(Collectors.toList());
-    Schema meta = new Schema(fields);
+    Schema metadataSchema = calculateMetadataSchema(metadataFields);

     // schema or rows returned by readers
-    return TypeUtil.join(schema, meta);
+    return TypeUtil.join(schema, metadataSchema);
+  }
+
+  private Schema calculateMetadataSchema(List<Types.NestedField> metaColumnFields) {
+    Optional<Types.NestedField> partitionField =
+        metaColumnFields.stream()
+            .filter(f -> MetadataColumns.PARTITION_COLUMN_ID == f.fieldId())
+            .findFirst();
+
+    // only calculate potential column id collision if partition metadata column was requested
+    if (!partitionField.isPresent()) {
+      return new Schema(metaColumnFields);
+    }
+
+    Set<Integer> idsToReassign =

Review Comment:
   Yeah, I agree. It seems like too big a block to put inside `map`, where we would just be assuming the partition column is found. I updated it to use `.get()` as suggested.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: issues-h...@iceberg.apache.org
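
For readers following the thread, the trade-off discussed in the review comment above (wrapping the logic in `Optional.map` versus returning early and then calling `.get()`) can be sketched roughly as follows. This is only an illustration, not the code from the PR: the class `OptionalGetSketch`, the constant `PARTITION_ID`, and both `describe...` methods are hypothetical stand-ins, and plain integers are used in place of Iceberg's `Types.NestedField`.

```java
import java.util.List;
import java.util.Optional;

public class OptionalGetSketch {
  // Hypothetical stand-in for the partition metadata column id; not Iceberg's actual constant.
  static final int PARTITION_ID = -1;

  // Style 1: the whole branch lives inside map(). Fine for a one-liner,
  // but awkward once the body grows into a large block.
  static String describeWithMap(List<Integer> fieldIds) {
    Optional<Integer> partition =
        fieldIds.stream().filter(id -> id == PARTITION_ID).findFirst();
    return partition
        .map(id -> "reassign ids around partition field " + id)
        .orElse("no partition field");
  }

  // Style 2: return early when the value is absent, then call get()
  // once presence has been established by the guard above.
  static String describeWithGet(List<Integer> fieldIds) {
    Optional<Integer> partition =
        fieldIds.stream().filter(id -> id == PARTITION_ID).findFirst();
    if (!partition.isPresent()) {
      return "no partition field";
    }
    int id = partition.get();
    return "reassign ids around partition field " + id;
  }

  public static void main(String[] args) {
    System.out.println(describeWithMap(List.of(1, 2, PARTITION_ID)));
    System.out.println(describeWithGet(List.of(1, 2)));
  }
}
```

Both styles return the same result for the same input. The second keeps a long block of follow-up logic at the top level of the method instead of nesting it inside a lambda, which appears to be the motivation in this thread.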