Copilot commented on code in PR #3725:
URL: https://github.com/apache/avro/pull/3725#discussion_r3066275698
##########
lang/java/avro/src/main/java/org/apache/avro/generic/GenericDatumReader.java:
##########
@@ -361,11 +381,40 @@ protected Object readMap(Object old, Schema expected,
ResolvingDecoder in) throw
addToMap(map, readMapKey(null, expected, in),
readWithoutConversion(null, eValue, in));
}
}
- } while ((l = in.mapNext()) > 0);
+ } while ((l = mapNext(in, eValue)) > 0);
}
return map;
}
+ /**
+ * Reads the next map block count and validates remaining bytes before the
+ * caller allocates storage.
+ */
+ private long mapNext(ResolvingDecoder in, Schema valueType) throws
IOException {
+ long l = in.mapNext();
+ if (l > 0) {
+ ensureAvailableMapBytes(in, l, valueType);
+ }
+ return l;
+ }
+
+ /**
+ * Validates remaining bytes for a map block. Each map entry has a string key
+ * (at least 1 byte for the length varint) plus a value, so the minimum bytes
+ * per entry is {@code 1 + minBytesPerElement(valueSchema)}.
+ */
+ private static void ensureAvailableMapBytes(Decoder decoder, long count,
Schema valueSchema) throws EOFException {
+ // Map keys are always strings: at least 1 byte for the length varint
+ int minBytesPerEntry = 1 + minBytesPerElement(valueSchema);
+ if (count > 0) {
+ int remaining = decoder.remainingBytes();
+ if (remaining >= 0 && count * (long) minBytesPerEntry > remaining) {
Review Comment:
`minBytesPerEntry` is computed as `1 + minBytesPerElement(valueSchema)`
using an `int`. Since `minBytesPerElement` can return `Integer.MAX_VALUE` (it
explicitly clamps record sums), this addition can overflow to a negative value,
which would prevent the byte-availability guard from triggering. Consider using
a `long` for `minBytesPerEntry` (with clamping) and performing the comparison
via division to avoid overflow.
##########
lang/java/avro/src/main/java/org/apache/avro/io/BinaryDecoder.java:
##########
@@ -910,6 +956,19 @@ public boolean isEof() {
return isEof;
}
+ @Override
+ protected int remainingBytes() {
+ int buffered = ba.getLim() - ba.getPos();
+ try {
+ if (in.getClass() == ByteArrayInputStream.class || in.getClass() ==
ByteBufferInputStream.class) {
+ return buffered + in.available();
Review Comment:
`remainingBytes()` can overflow when summing `buffered + in.available()`. If
`in.available()` is near `Integer.MAX_VALUE`, adding the buffered count can
wrap negative and cause callers to treat the remaining size as unknown/small,
defeating the early-EOF checks. Consider computing the sum in a `long` and
clamping to `Integer.MAX_VALUE` (or returning `-1` on overflow) for safety.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]