[ 
https://issues.apache.org/jira/browse/THRIFT-1559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933843#comment-14933843
 ] 

James E. King, III commented on THRIFT-1559:
--------------------------------------------

There are other thread-caching heap allocators, such as tcmalloc.

> Provide memory pool for TBinaryProtocol to eliminate memory fragmentation
> -------------------------------------------------------------------------
>
>                 Key: THRIFT-1559
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1559
>             Project: Thrift
>          Issue Type: Improvement
>          Components: C++ - Library
>    Affects Versions: 0.8
>         Environment: Linux
>            Reporter: Yingfeng Zhang
>              Labels: memory
>             Fix For: 0.9.3
>
>
> We use the Thrift C++ client library (0.7/0.8) to communicate with Apache
> Cassandra (1.0), and we frequently need to fetch large amounts of data from
> Cassandra. The data returned (by multiget_slice) has the following type:
> std::map<std::string, std::vector<ColumnOrSuperColumn> >, where
> ColumnOrSuperColumn is a struct composed of several std::map with std::string
> keys.
> Suppose we have 1M records and fetch 1k per call: each call fills a
> "std::map<std::string, std::vector<ColumnOrSuperColumn> >" with 1k records,
> and we need 1k Thrift RPC calls in total. We destroy that object immediately
> after each RPC, so the process does nothing but perform the RPCs. During
> that period, memory consumption keeps growing, even when we attach jemalloc
> to the process to reduce fragmentation.
> No matter how we tune the batch size (the 1k above), from 10 to 20k, the
> degree of memory fragmentation stays high. With more data, say 10M records,
> these RPC operations alone exhaust memory: in fact, our process was killed
> by the OS for consuming too much memory.
> We believe that the way the Thrift C++ client currently uses memory causes
> excessive fragmentation, and that the problem gets worse with more data and
> with more complicated structs such as those defined by Cassandra.
> I suggest providing a memory pool for the Thrift C++ library.
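
The pooling idea in the description above can be illustrated with a minimal
sketch. It is not Thrift API: Thrift-generated types use std::string and
std::map with the default allocator, so the aliases and functions below
(Columns, SliceResult, decode_batch, run_batches, CountingResource) are
hypothetical stand-ins, and the sketch assumes a C++17 standard library with
<memory_resource>. Each simulated batch is built inside one preallocated
arena that is reset after the batch is destroyed, so the general-purpose
heap is not touched per record.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory_resource>
#include <string>
#include <vector>

// Hypothetical pool-backed stand-ins for the generated result type
// std::map<std::string, std::vector<ColumnOrSuperColumn> >.
using Columns = std::pmr::vector<std::pmr::string>;
using SliceResult = std::pmr::map<std::pmr::string, Columns>;

// Upstream resource that counts how often the arena falls back to the heap.
class CountingResource : public std::pmr::memory_resource {
public:
    std::size_t allocations = 0;

private:
    void* do_allocate(std::size_t bytes, std::size_t align) override {
        ++allocations;
        return std::pmr::new_delete_resource()->allocate(bytes, align);
    }
    void do_deallocate(void* p, std::size_t bytes, std::size_t align) override {
        std::pmr::new_delete_resource()->deallocate(p, bytes, align);
    }
    bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
        return this == &other;
    }
};

// Simulates decoding one RPC reply into pool-backed containers and returns
// the number of column values stored. The result is destroyed on return,
// mirroring "destroy the object immediately after the RPC".
std::size_t decode_batch(std::pmr::memory_resource* pool,
                         std::size_t rows, std::size_t cols_per_row) {
    SliceResult result(pool);
    for (std::size_t r = 0; r < rows; ++r) {
        const std::string num = std::to_string(r);
        std::pmr::string key("row-key-number-", pool);
        key.append(num.c_str(), num.size());
        Columns& cols = result[key];  // key and vector both use the pool
        for (std::size_t c = 0; c < cols_per_row; ++c) {
            // Long enough to defeat the small-string optimisation.
            cols.emplace_back("column-value-padded-beyond-small-string-optimisation");
        }
    }
    std::size_t stored = 0;
    for (const auto& entry : result) {
        stored += entry.second.size();
    }
    return stored;
}

// Runs several batches against one reusable arena and returns how many
// times the arena had to ask the upstream (heap) resource for memory.
std::size_t run_batches(std::size_t batches, std::size_t rows,
                        std::size_t cols_per_row) {
    CountingResource upstream;
    std::vector<std::byte> arena(4 * 1024 * 1024);  // assumed batch budget
    std::pmr::monotonic_buffer_resource pool(arena.data(), arena.size(), &upstream);
    for (std::size_t b = 0; b < batches; ++b) {
        const std::size_t stored = decode_batch(&pool, rows, cols_per_row);
        assert(stored == rows * cols_per_row);
        (void)stored;
        pool.release();  // rewind to the start of the arena for the next batch
    }
    return upstream.allocations;
}
```

Whether this pattern would help in the reported case is untested here; it
only shows the shape of a per-batch arena, and the 4 MiB budget is an
arbitrary assumption that a real client would have to size from its own
batch sizes.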



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
