The context is that Message-typed fields (and string-typed fields) are
encoded with a length prefix: an int32 meaning "the number of bytes that
follow make up this message". It's not possible to encode a nested message
larger than that from a binary wire format point of view, and that's an
ecosystem-wide implication.

This leaves some wiggle room, though: notably, top-level messages are not
encoded with a length prefix, which means they don't have any such
technical constraint. More notably still, if you just construct a
message in memory and then call some setters, it will build up an
arbitrarily large message.
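
As a rough sketch of the framing described above (this is not the library's
actual code, and `AppendVarint` and `EncodeLengthDelimited` are hypothetical
names), a nested message or string field is written as a tag, then a length,
then the payload. On the wire the length is a varint; as far as I know the 2GB
ceiling comes from implementations holding that length in a signed 32-bit int.

```cpp
#include <cstdint>
#include <string>

// Appends `value` as a base-128 varint: 7 bits per byte, low group first,
// with the high bit set on every byte except the last.
void AppendVarint(std::uint64_t value, std::string* out) {
  while (value >= 0x80) {
    out->push_back(static_cast<char>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out->push_back(static_cast<char>(value));
}

// Hypothetical helper: encodes one length-delimited field (wire type 2),
// i.e. tag, then byte length of the payload, then the payload itself.
std::string EncodeLengthDelimited(std::uint32_t field_number,
                                  const std::string& payload) {
  std::string out;
  AppendVarint((static_cast<std::uint64_t>(field_number) << 3) | 2, &out);
  AppendVarint(payload.size(), &out);  // the length prefix
  out += payload;
  return out;
}
```

A top-level message is just the concatenation of such fields with no outer
length in front of it, which is why the prefix constraint does not apply there.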

> May be a bit of context here would help, I am coming from the point of
> view https://groups.google.com/g/protobuf/c/vvP4uajRE60
> If the potential fix for it was to set limit to 2g in message_lite.c,

Without other context, and without doing more archeology, I actually suspect
the 'attack' was more that a sufficiently smart attacker could send a
string of length "2GB minus one byte", knowing that the service boxes up
that input in a protobuf message (adding a few bytes of overhead) and then
encodes that message to send to the next backend server.
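
To make the arithmetic concrete, here is a small sketch under some
assumptions: the wrapper message has a single string field with a field number
below 16 (so a 1-byte tag), "2GB" means 2^31 bytes, and `VarintSize`,
`WrappedSize` and `FitsInInt32` are hypothetical names rather than protobuf
APIs. It only computes sizes and allocates nothing.

```cpp
#include <cstdint>
#include <limits>

// Number of bytes a base-128 varint needs to represent `value`.
int VarintSize(std::uint64_t value) {
  int n = 1;
  while (value >= 0x80) {
    value >>= 7;
    ++n;
  }
  return n;
}

// Hypothetical helper: encoded size of a wrapper message that holds one
// string field of `payload_len` bytes (1-byte tag + length varint + payload).
std::uint64_t WrappedSize(std::uint64_t payload_len) {
  return 1 + VarintSize(payload_len) + payload_len;
}

// True if `size` can be represented in a signed 32-bit int.
bool FitsInInt32(std::uint64_t size) {
  return size <=
         static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
}
```

With a payload of 2147483647 bytes, `FitsInInt32(payload)` holds but
`FitsInInt32(WrappedSize(payload))` does not: the six bytes of framing are
enough to push the re-encoded message over the limit.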

And the fix for the C++ issue back then was not simply to try to enforce a
conceptual limit of 2GB; it instead required changing the C++ API to use a
`long` (int64) for the encoded size of messages instead of an int32 (the
size getter is called `ByteSizeLong()`). That made it much easier to write
correct behavior against the 2GB limit: when you have an `int
EncodedLength()` function, once you have e.g. 10 strings that are each
512MB, set them all as separate fields on the same parent message, and then
try to see what the `int` serialized size should be, there's no way to handle
it gracefully. Making it a `long` lets it return the actual size without a
2GB limit, and then if you try to serialize a message whose size is too
large, it will fail to serialize (serialize has a bool return value).
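
A minimal sketch of that difference, ignoring tag and length overhead and
using hypothetical names (`TotalBytes`, `Low32`, `FitsSerializedLimit`) rather
than the real protobuf API: summing in 64 bits gives the true size, while a
32-bit counter would silently keep only the low 32 bits.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Sums field payload sizes in 64 bits, so the total cannot wrap at 2 GiB.
std::uint64_t TotalBytes(const std::vector<std::uint64_t>& field_sizes) {
  std::uint64_t total = 0;
  for (std::uint64_t size : field_sizes) total += size;
  return total;
}

// What a 32-bit size counter would retain: only the low 32 bits of the total.
std::uint32_t Low32(std::uint64_t total) {
  return static_cast<std::uint32_t>(total);
}

// Hypothetical stand-in for the serialize-time check: with the full 64-bit
// size in hand, the caller can report failure instead of writing bad data.
bool FitsSerializedLimit(std::uint64_t total) {
  return total <=
         static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
}
```

For ten 512MB fields the 64-bit total is 5368709120 bytes, whereas the low 32
bits are 1073741824, a plausible-looking but wrong 1 GiB. With the 64-bit total
the check can simply return false.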

On Fri, Jul 4, 2025 at 10:40 AM 'Somak Dutta' via Protocol Buffers <
[email protected]> wrote:

> Exactly, could not agree more. There is currently a limit set to
> Integer.MAX_VALUE in CodedInputStream.
>
> Maybe a bit of context here would help: I am coming from the point of
> view of https://groups.google.com/g/protobuf/c/vvP4uajRE60
>
> If the potential fix for it was to set the limit to 2g in message_lite.c,
> then in a memory-safe language like Java it defaults to 2g anyway. I wonder
> whether the vulnerability data that marks Java as impacted by the
> vulnerability is really overestimating.
>
> ```
>
> Somak Dutta
> Jul 3, 2025, 1:51:24 PM (yesterday)
> to Protocol Buffers
>
> Hi,
>
> I am writing to ask about the vulnerability reported as GHSA-jwvw-v7c5-m82h
> <https://github.com/advisories/GHSA-jwvw-v7c5-m82h> for protobuf-java
> <https://mvnrepository.com/artifact/com.google.protobuf/protobuf-java> which
> specifically talks about "*protobuf allows remote authenticated attackers
> to cause a heap-based buffer overflow.*"
>
> Specifically, I want to ask about earlier versions (< 3.4.0).
> Take version 2.5.0 for example. Based on all the code I see for
> CodedInputStream
> <https://github.com/protocolbuffers/protobuf/blob/v2.5.0/java/src/main/java/com/google/protobuf/CodedInputStream.java>:
> - methods such as readRawBytes/refillBuffer, which either copy to/from
> or resize buffers, are all pretty safe from integer overflows.
> - there is also a slow path, where we read the buffer in chunks to
> potentially prevent out-of-memory issues.
>
> First Question:
> However, I am not seeing any evidence that the package can be vulnerable
> to buffer overflow issues.
> Additionally, given that Java is a memory-safe language, I fail to see how
> the Java ecosystem is susceptible to the aforementioned vulnerability.
>
> Second Question:
> There is a related question, in the same vein, here:
> https://github.com/protocolbuffers/protobuf/issues/760?reload=1#issuecomment-847162817
> The potential fix also suggests the issue might be present only in the
> C/C++ ecosystems.
>
> ```
>
> Regards,
> Somak
>
> On Friday, July 4, 2025 at 3:27:21 PM UTC+5:30 Cassondra Foesch wrote:
>
>> I’m pretty sure that since 2 GiB is the maximum value an int32 could
>> carry, that is where the requirement is coming from. It’s entirely possible
>> that it is not actually enforced across the whole ecosystem, but is
>> essentially enforced by “if you exceed this boundary, some code will not
>> work with your protobuf.”
>>
>> Like, for instance, it is impossible for a 32-bit Golang implementation
>> to deal with more than 2 GiB of data in a single slice. (Since the length
>> of the slice is stored as a 32-bit signed integer.)
>>
>> On Thu, Jul 3, 2025 at 08:21, 'Somak Dutta' via Protocol
>> Buffers <[email protected]> wrote:
>>
>>> Hello,
>>>
>>> From https://protobuf.dev/programming-guides/proto-limits/ I understand
>>> that, across all ecosystems:
>>>
>>> Any proto in serialized form must be <2GiB, as that is the maximum size
>>> supported by all implementations. It’s recommended to bound request and
>>> response sizes.
>>>
>>> However, I wanted to check where exactly the limitation is set up,
>>> specifically in the protobuf-java library.
>>>
>>> I can see safety checks only in the message_lite.cc files, but I don't
>>> think this would be reflected across ecosystems?
>>>
>>> if (size > INT_MAX) {
>>>   GOOGLE_LOG(ERROR) << "Exceeded maximum protobuf size of 2GB: " << size;
>>>   return false;
>>> }
>>>
>>> Regards
>>>
>>> *Confidentiality Notice: This email and any attachments are confidential
>>> and intended solely for the use of the individual or entity to whom they
>>> are addressed. If you have received this email in error, please notify the
>>> sender immediately and delete it from your system. Unauthorized use,
>>> disclosure, or copying of this email or its contents is strictly
>>> prohibited.*
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Protocol Buffers" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion visit
>>> https://groups.google.com/d/msgid/protobuf/e0d724d8-2a45-4ef1-aaac-c3e6d1077306n%40googlegroups.com
>>> <https://groups.google.com/d/msgid/protobuf/e0d724d8-2a45-4ef1-aaac-c3e6d1077306n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
