[ 
https://issues.apache.org/jira/browse/THRIFT-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antoine Pitrou updated THRIFT-5464:
-----------------------------------
    Description: 
First: apologies if this is a false alarm, since I'm going by my reading of the 
C++ library source code.

To try to understand whether the new MaxMessageSize setting is important for 
our (Apache Parquet) use case, I tried to go through the C++ library source 
code to understand how it's used exactly. (see the message I posted in 
THRIFT-5237)

My understanding is that there are two main facilities for checking against the 
max message size:
* {{TTransport::countConsumedMessageBytes(numBytes)}} raises if {{numBytes}} is 
greater than the remaining message size, otherwise decrements the remaining 
message size by {{numBytes}}
* {{TTransport::checkReadBytesAvailable(numBytes)}} also raises if {{numBytes}} 
is greater than the remaining message size, but _doesn't_ otherwise update the 
remaining message size

In {{TBufferBase::read}}, the internal buffer pointer is bumped by {{len}} 
bytes; _however_, {{checkReadBytesAvailable}} is called and not 
{{countConsumedMessageBytes}}. This means that multiple calls to 
{{TBufferBase::read}} will iterate through buffer memory but never update the 
remaining message size. In the end, the max message size limit is never 
upheld, except if a single read is larger than that size.
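
A minimal sketch of the behaviour described above, as a self-contained model. 
{{MessageBudget}} and {{bytesReadBeforeLimit}} are hypothetical names introduced 
for illustration only; the two methods merely mirror the described semantics and 
this is not the actual Thrift source:

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for the message-size accounting described above.
// This is NOT the actual TTransport code, only a model of the two checks.
class MessageBudget {
public:
  explicit MessageBudget(std::int64_t maxMessageSize)
      : remaining_(maxMessageSize) {}

  // Raises if numBytes exceeds the remaining budget, otherwise decrements it.
  void countConsumedMessageBytes(std::int64_t numBytes) {
    if (numBytes > remaining_) throw std::length_error("MaxMessageSize reached");
    remaining_ -= numBytes;
  }

  // Raises if numBytes exceeds the remaining budget, but never updates it.
  void checkReadBytesAvailable(std::int64_t numBytes) const {
    if (numBytes > remaining_) throw std::length_error("MaxMessageSize reached");
  }

private:
  std::int64_t remaining_;
};

// Simulates `reads` consecutive reads of `len` bytes each and returns how many
// bytes were handed out before the limit check raised (or all of them if it
// never did). `updateBudget` selects which of the two checks each read uses.
std::int64_t bytesReadBeforeLimit(std::int64_t maxMessageSize, std::int64_t len,
                                  int reads, bool updateBudget) {
  MessageBudget budget(maxMessageSize);
  std::int64_t total = 0;
  try {
    for (int i = 0; i < reads; ++i) {
      if (updateBudget) {
        budget.countConsumedMessageBytes(len);
      } else {
        budget.checkReadBytesAvailable(len);
      }
      total += len;  // models bumping the internal buffer pointer by len
    }
  } catch (const std::length_error&) {
    // Limit reached: stop reading and report what was consumed so far.
  }
  return total;
}
```

In this model, with a limit of 100 bytes and five reads of 60 bytes each, the 
check-only path hands out all 300 bytes, while the counting path stops after 
the first 60. Only a single read larger than the limit is rejected by the 
check-only path.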

As a side note, a quick grep through the {{lib/cpp/test}} directory seems to 
suggest that the max message size limits are not tested anywhere, but I 
may be mistaken.

  was:
First: apologies if this is a false alarm, since I'm going by my reading of the 
C++ library source code.

To try to understand whether the new MaxMessageSize setting is important for 
our (Apache Parquet) use case, I tried to go through the C++ library source 
code to understand how it's used exactly. (see the message I posted in 
THRIFT-5237)

My understanding is that there are two main facilities for checking against the 
max message size:
* {{TTransport::countConsumedMessageBytes(numBytes)}} raises if {{numBytes}} is 
greater than the remaining message size, otherwise decrements the remaining 
message size by {{numBytes}}
* {{TTransport::checkReadBytesAvailable}} also raises if {{numBytes}} is 
greater than the remaining message size, but _doesn't_ otherwise update the 
remaining message size




> [C++] maxMessageSize possibly not correctly observed in TBufferBase
> -------------------------------------------------------------------
>
>                 Key: THRIFT-5464
>                 URL: https://issues.apache.org/jira/browse/THRIFT-5464
>             Project: Thrift
>          Issue Type: Bug
>          Components: C++ - Library
>    Affects Versions: 0.14.2
>            Reporter: Antoine Pitrou
>            Priority: Major
>
> First: apologies if this is a false alarm, since I'm going by my reading of 
> the C++ library source code.
> To try to understand whether the new MaxMessageSize setting is important for 
> our (Apache Parquet) use case, I tried to go through the C++ library source 
> code to understand how it's used exactly. (see the message I posted in 
> THRIFT-5237)
> My understanding is that there are two main facilities for checking against 
> the max message size:
> * {{TTransport::countConsumedMessageBytes(numBytes)}} raises if {{numBytes}} 
> is greater than the remaining message size, otherwise decrements the 
> remaining message size by {{numBytes}}
> * {{TTransport::checkReadBytesAvailable(numBytes)}} also raises if 
> {{numBytes}} is greater than the remaining message size, but _doesn't_ 
> otherwise update the remaining message size
> In {{TBufferBase::read}}, the internal buffer pointer is bumped by {{len}} 
> bytes; _however_, {{checkReadBytesAvailable}} is called and not 
> {{countConsumedMessageBytes}}. This means that multiple calls to 
> {{TBufferBase::read}} will iterate through buffer memory but never update the 
> remaining message size. In the end, the max message size limit is never 
> upheld, except if a single read is larger than that size.
> As a side note, a quick grep through the {{lib/cpp/test}} directory seems to 
> suggest that the max message size limits are not tested anywhere, but I 
> may be mistaken.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
