pvary commented on code in PR #15633:
URL: https://github.com/apache/iceberg/pull/15633#discussion_r2939106074


##########
data/src/test/java/org/apache/iceberg/data/BaseFormatModelTests.java:
##########
@@ -317,6 +314,306 @@ void testPositionDeleteWriterEngineWriteGenericRead(FileFormat fileFormat) throw
     DataTestHelpers.assertEquals(positionDeleteSchema.asStruct(), records, readRecords);
   }
 
+  @ParameterizedTest
+  @FieldSource("FORMAT_AND_GENERATOR")
+  /** Write with Generic Record, read with projected engine type T (narrow schema) */
+  void testReaderBuilderProjection(FileFormat fileFormat, DataGenerator dataGenerator)
+      throws IOException {
+    Schema fullSchema = dataGenerator.schema();
+
+    List<Types.NestedField> columns = fullSchema.columns();
+    Schema projectedSchema = new Schema(columns.get(columns.size() - 1));
+
+    List<Record> genericRecords = dataGenerator.generateRecords();
+    writeGenericRecords(fileFormat, fullSchema, genericRecords);
+
+    List<Record> projectedGenericRecords = projectRecords(genericRecords, projectedSchema);
+    List<T> expectedEngineRecords =
+        convertToEngineRecords(projectedGenericRecords, projectedSchema);
+
+    InputFile inputFile = encryptedFile.encryptingOutputFile().toInputFile();
+    List<T> readRecords;
+    try (CloseableIterable<T> reader =
+        FormatModelRegistry.readBuilder(fileFormat, engineType(), inputFile)
+            .project(projectedSchema)
+            .engineProjection(engineSchema(projectedSchema))
+            .build()) {
+      readRecords = ImmutableList.copyOf(reader);
+    }
+
+    assertEquals(projectedSchema, expectedEngineRecords, readRecords);
+  }
+
+  @ParameterizedTest
+  @FieldSource("FORMAT_AND_GENERATOR")
+  void testReaderBuilderFilter(FileFormat fileFormat, DataGenerator dataGenerator)
+      throws IOException {
+
+    // Avro does not support filter push down
+    // Skip this test for Avro to avoid false failures.
+    assumeThat(fileFormat != FileFormat.AVRO).isTrue();
+
+    Schema schema = dataGenerator.schema();
+
+    List<Record> genericRecords = dataGenerator.generateRecords();
+    writeGenericRecords(fileFormat, schema, genericRecords);
+
+    // Construct a filter condition that is smaller than the minimum value to achieve file-level
+    // filtering.
+    Types.NestedField firstField = schema.columns().get(0);

Review Comment:
   This test depends heavily on the schema of the records produced by the generator.
   Maybe run it for all of the FileFormats, but with only a single generator.
   
   So the test should look like:
   ```
     @ParameterizedTest
     @FieldSource("FORMATS")
     void testReaderBuilderFilter(FileFormat fileFormat) {
       DataGenerator dataGenerator = new StructOfPrimitive(); // or DataGenerator.structOfPrimitive()
   ```
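
To illustrate why a single generator helps, here is a minimal, self-contained sketch in plain Java. It does not use the Iceberg or JUnit APIs: `Format`, `generateIds`, and the other names are hypothetical stand-ins. It assumes the fixed generator's first column is a `long`, so a bound below the minimum value can be computed without inspecting the schema.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class SingleGeneratorFilterSketch {
  // Hypothetical stand-in for org.apache.iceberg.FileFormat, reduced to three formats.
  enum Format { AVRO, PARQUET, ORC }

  // Hypothetical fixed generator: the first column is assumed to always be a long id.
  static List<Long> generateIds() {
    return Arrays.asList(3L, 1L, 2L);
  }

  // Formats the filter test runs for; Avro is skipped as in the test under review.
  static List<Format> formatsWithFilterPushDown() {
    return Arrays.stream(Format.values())
        .filter(format -> format != Format.AVRO)
        .collect(Collectors.toList());
  }

  // With a known column type, a bound below the minimum value is a one-liner.
  static long boundBelowMinimum(List<Long> ids) {
    return Collections.min(ids) - 1;
  }

  // Counts the rows that a "less than bound" filter would keep.
  static long countMatching(List<Long> ids, long bound) {
    return ids.stream().filter(id -> id < bound).count();
  }

  public static void main(String[] args) {
    List<Long> ids = generateIds();
    long bound = boundBelowMinimum(ids);
    for (Format format : formatsWithFilterPushDown()) {
      System.out.println(format + ": rows matching id < " + bound + " = " + countMatching(ids, bound));
    }
  }
}
```

In the real test the loop over formats would be the `@FieldSource("FORMATS")` parameterization, and the bound would feed the filter expression passed to the reader builder.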



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

