[ 
https://issues.apache.org/jira/browse/THRIFT-4887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16870555#comment-16870555
 ] 

aqingsir commented on THRIFT-4887:
----------------------------------

Thanks James, The missing guide is great enough to worth the name.

I also agree with the solution of preventing replacement of previously assigned 
slot numbers, but could it be done on Thrift protocol level instead of a best 
practice, cause I doubt whether one fifth of Thrift users have ever read or 
understoodd the missing guide.

> Thrift will OOM at a low concurrency if fields added and old client requests 
> new server
> ---------------------------------------------------------------------------------------
>
>                 Key: THRIFT-4887
>                 URL: https://issues.apache.org/jira/browse/THRIFT-4887
>             Project: Thrift
>          Issue Type: Bug
>         Environment: Almost all versions from 0.8.0 to the newest 
> 0.13.0-snapshot, and verify on 0.9.3/0.11.0.
> Almost all languages, and verified on Go/Java
>            Reporter: aqingsir
>            Assignee: James E. King III
>            Priority: Major
>         Attachments: readI32.jpg
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> (Could see  more on [https://github.com/aqingsao/thrift-oom])
> //background
> A serious issue occured in our prod env and finally it came out to be the 
> changement of some fields in an IDL file, old client still requested new 
> server and crashed due to OOM.
> IDL changement could be stated as: Return value of the interface is a list, 
> element of which is a struct object and has 5 fields. A new field is added to 
> the middle of the struct.
> // to reproduce
> In this case a low concurency of 10 will reproduce this issue, you could find 
> a demo project on: [https://github.com/aqingsao/thrift-oom]
> // reason
> Thrift tries to consume all data in inputstream by skipping fields that are 
> redundant or have a type mismatch. But it won't consume subsequent fields if 
> there's an exception.
> In such a case Thrift does nothing the underlying inputstream, so trouble 
> comes to the next request who reuses this connection, as the cursor still 
> points to some middle position of the inputstream.
> As Thrift always starts with a readI32() method for any response, which means 
> the length of the method's name. Unbelievable the length could be as large as 
> 184549632, which is about 176M. This explains why OOM occurs even at a 
> concurrency of 10
> // how to fix
> Always clear inputstream in TSocket if there are any redundant data at the 
> end of a method call.
> I'll submit a PR soon for Java version.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to