Gabriel Feyer created AVRO-4228:
-----------------------------------

             Summary: BinaryDecoder::arrayNext() does not handle negative block 
counts
                 Key: AVRO-4228
                 URL: https://issues.apache.org/jira/browse/AVRO-4228
             Project: Apache Avro
          Issue Type: Bug
            Reporter: Gabriel Feyer


BinaryDecoder::arrayNext() in lang/c++/impl/BinaryDecoder.cc calls 
doDecodeLong() directly instead of doDecodeItemCount(), causing it to mishandle 
negative array block counts.

Per the Avro spec, a negative block count indicates the absolute value is the 
item count, followed by an additional long representing the byte-size of the 
block. The helper method doDecodeItemCount() handles this correctly, and is 
already used by arrayStart(), mapStart(), and mapNext(). Only arrayNext() 
bypasses it.

When arrayNext() reads a negative count (e.g., -100), static_cast<size_t>(-100) 
produces a huge value, the byte-size long is left unconsumed, and the stream 
position becomes corrupted. Subsequent reads produce garbage, typically 
resulting in a vector::_M_range_check crash or (with the validating decoder) a 
"Wrong number of items" error.

This affects any array large enough to be encoded in multiple blocks. Arrays 
that fit in a single block are unaffected because arrayNext() only sees the 
terminal zero.

Fix:
{code:java}
  // BinaryDecoder.cc — from:
  size_t BinaryDecoder::arrayNext() {
      return static_cast<size_t>(doDecodeLong());
  }
  // to:
  size_t BinaryDecoder::arrayNext() {
      return doDecodeItemCount();
  } {code}

This is consistent with how arrayStart(), mapStart(), and mapNext() are 
implemented.

Note: ClickHouse independently encountered this bug 
(https://github.com/ClickHouse/ClickHouse/issues/60438) and fixed it in their 
fork (https://github.com/ClickHouse/avro/pull/23) by inlining the 
negative-count logic. The fix above is simpler as it reuses the existing helper.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to