Copilot commented on code in PR #3726:
URL: https://github.com/apache/avro/pull/3726#discussion_r3066273679


##########
lang/java/avro/src/test/java/org/apache/avro/TestDataFileReader.java:
##########
@@ -238,4 +244,36 @@ void invalidMagicBytes() throws IOException {
           () -> DataFileReader.openReader(fileInput, new 
GenericDatumReader<>()));
     }
   }
+
+  @Test
+  void missingSchemaMetadataDoesNotThrowNullPointerException() throws 
IOException {
+    byte[] malformedFile = buildContainerHeaderWithoutSchema();
+
+    IOException streamException = assertThrows(IOException.class,
+        () -> new DataFileStream<>(new ByteArrayInputStream(malformedFile), 
new GenericDatumReader<>()));
+    assertNotNull(streamException.getMessage());
+    
assertTrue(streamException.getMessage().contains(DataFileConstants.SCHEMA));
+
+    IOException readerException = assertThrows(IOException.class,
+        () -> new DataFileReader<>(new SeekableByteArrayInput(malformedFile), 
new GenericDatumReader<>()));
+    assertNotNull(readerException.getMessage());
+    
assertTrue(readerException.getMessage().contains(DataFileConstants.SCHEMA));
+  }
+
+  private static byte[] buildContainerHeaderWithoutSchema() throws IOException 
{
+    ByteArrayOutputStream output = new ByteArrayOutputStream();
+    output.write(DataFileConstants.MAGIC);
+
+    BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(output, null);
+    encoder.writeMapStart();
+    encoder.setItemCount(1);
+    encoder.startItem();
+    encoder.writeString(DataFileConstants.CODEC);
+    encoder.writeBytes("null".getBytes());
+    encoder.writeMapEnd();

Review Comment:
   `"null".getBytes()` uses the platform default charset, which can make this 
regression test platform-dependent. Use an explicit charset (e.g., UTF-8) to 
match how metadata is decoded in `DataFileStream#getMetaString` and keep the 
test deterministic.



##########
lang/java/avro/src/main/java/org/apache/avro/file/DataFileReader12.java:
##########
@@ -115,6 +115,10 @@ public synchronized long getMetaLong(String key) {
     return Long.parseLong(getMetaString(key));
   }
 
+  private Schema parseSchema() throws IOException {
+    return DataFileStream.parseSchemaFromMetadata(getMetaString(SCHEMA), 
SCHEMA, new Schema.Parser());

Review Comment:
   The PR changes `DataFileReader12` to throw a descriptive `IOException` when 
the 1.2-format schema metadata is missing, but the added regression test only 
covers the current container format (`DataFileStream`/`DataFileReader`). Please 
add a unit test that exercises `DataFileReader.openReader` on a minimal 
1.2-format container missing the `schema` metadata to prevent regressions in 
this compatibility path.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to