Provide memory pool for TBinaryProtocol to eliminate memory fragmentation
-------------------------------------------------------------------------

                 Key: THRIFT-1559
                 URL: https://issues.apache.org/jira/browse/THRIFT-1559
             Project: Thrift
          Issue Type: Improvement
          Components: C++ - Library
    Affects Versions: 0.8
         Environment: Linux
            Reporter: Yingfeng Zhang


We use the Thrift C++ client library (0.7/0.8) to communicate with Apache 
Cassandra (1.0), and we frequently need to fetch large amounts of data from 
Cassandra. The data returned by multiget_slice has the following type:
std::map<std::string, std::vector<ColumnOrSuperColumn> >, where 
ColumnOrSuperColumn is a struct composed of several std::map with std::string 
keys.
Suppose we have 1M records and fetch 1k per call. Each call returns 1k records 
in a "std::map<std::string, std::vector<ColumnOrSuperColumn> >" object, so we 
need to call the Thrift RPC 1K times. We destroy that object immediately after 
each RPC, which means we do nothing but perform the RPC itself. Even so, we 
found that memory consumption keeps growing, even if we attach jemalloc to the 
process to reduce fragmentation. 

No matter how we tune the batch size (the 1k above), from 10 to 20k, memory 
fragmentation stays at a high percentage. This means that with more data, say 
10M records, the RPC operation alone will eat up the memory. In fact, our 
process was killed by the OS due to excessive memory consumption. 

We believe that the way the Thrift C++ client currently uses memory causes too 
much memory fragmentation, and that the issue becomes more serious with more 
data and with more complicated structs such as those defined in Cassandra.

I suggest providing a memory pool for the Thrift C++ library.
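
As a rough illustration of what pooled allocation could look like, here is a 
minimal sketch using the C++17 std::pmr facilities. It is not an existing Thrift 
API, and Thrift-generated types use the default allocator, so this is not a 
drop-in change. CountingResource, PooledBatch and fill_and_release_pooled_batch 
are hypothetical names, and the container type is a simplified stand-in for the 
multiget_slice result.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory_resource>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: counts how often the pool asks its upstream
// resource (here the regular heap) for a new chunk.
class CountingResource : public std::pmr::memory_resource {
public:
    std::size_t allocations = 0;

private:
    void* do_allocate(std::size_t bytes, std::size_t align) override {
        ++allocations;
        return std::pmr::new_delete_resource()->allocate(bytes, align);
    }
    void do_deallocate(void* p, std::size_t bytes, std::size_t align) override {
        std::pmr::new_delete_resource()->deallocate(p, bytes, align);
    }
    bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
        return this == &other;
    }
};

// Simplified stand-in for the RPC result: row key -> list of column values.
using PooledBatch = std::pmr::map<std::pmr::string, std::pmr::vector<std::pmr::string> >;

// Builds one batch inside a per-batch pool and returns the number of column
// values stored. The batch and then the pool are destroyed on return, so the
// pool's chunks go back to `upstream` together.
std::size_t fill_and_release_pooled_batch(std::size_t rows, std::size_t cols,
                                          std::pmr::memory_resource* upstream) {
    assert(upstream != nullptr);
    std::pmr::monotonic_buffer_resource pool(64 * 1024, upstream);
    PooledBatch batch(&pool);
    std::size_t stored = 0;
    for (std::size_t r = 0; r < rows; ++r) {
        // Keys and values are long enough to need heap storage.
        std::pmr::string key("row-key-long-enough-to-need-an-allocation-", &pool);
        const std::string suffix = std::to_string(r);
        key.append(suffix.data(), suffix.size());
        std::pmr::vector<std::pmr::string>& columns = batch[std::move(key)];
        for (std::size_t c = 0; c < cols; ++c) {
            columns.emplace_back("column-value-long-enough-to-need-an-allocation");
            ++stored;
        }
    }
    return stored;
}
```

With a CountingResource passed as upstream, a batch of 100 rows with 10 columns 
each stores 1000 values while requesting far fewer than 1000 chunks from the 
heap. Whether this would actually reduce the fragmentation reported here has not 
been measured.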



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
