[ https://issues.apache.org/jira/browse/THRIFT-1559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jake Farrell closed THRIFT-1559.
--------------------------------
    Resolution: Fixed
    Fix Version/s: 0.9.3

Agree, this can always be reopened if someone has the time and does not want to use JEMALLOC for whatever reason.

> Provide memory pool for TBinaryProtocol to eliminate memory fragmentation
> -------------------------------------------------------------------------
>
>                 Key: THRIFT-1559
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1559
>             Project: Thrift
>          Issue Type: Improvement
>          Components: C++ - Library
>    Affects Versions: 0.8
>         Environment: Linux
>            Reporter: Yingfeng Zhang
>              Labels: memory
>             Fix For: 0.9.3
>
>
> We use the Thrift C++ client library (0.7/0.8) to communicate with Apache
> Cassandra (1.0), and we frequently need to fetch large amounts of data from
> Cassandra. The data returned by multiget_slice has the following type:
> std::map<std::string, std::vector<ColumnOrSuperColumn> >, where
> ColumnOrSuperColumn is a struct composed of several std::map members with
> std::string keys.
> Suppose we have 1M records and fetch 1k per call: each call fills one
> "std::map<std::string, std::vector<ColumnOrSuperColumn> >" with 1k records,
> so we need to call the Thrift RPC 1K times. We destroy that
> "std::map<std::string, std::vector<ColumnOrSuperColumn> >" object
> immediately after each RPC, which means we do nothing but perform the RPC
> operation. During that period, we found that memory consumption keeps
> growing, even if we attach jemalloc to the process for memory
> defragmentation.
> No matter how we tune the batch size (the 1k above), anywhere from 10 to
> 20k, memory fragmentation stays at a high percentage. This means that with
> more data, say 10M records, the RPC operation alone will eat up the memory.
> In fact, our process was killed by the OS due to excessive memory
> consumption.
> We believe that the current memory usage design of the Thrift C++ client
> causes too much memory fragmentation, and the issue becomes more serious
> with more data and with more complicated structs such as those defined in
> Cassandra.
> I suggest providing a memory pool for the Thrift C++ library.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
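
For illustration only, the memory-pool idea requested in the issue can be sketched with the C++17 `std::pmr` facilities. This is not Thrift's API and not the change made for this ticket. `Batch` is a simplified stand-in for `std::map<std::string, std::vector<ColumnOrSuperColumn> >`, and `CountingResource`, `Stats` and `run_batches` are hypothetical names. The sketch assumes each batch fits in a 1 MiB buffer that is allocated once and reused for every simulated RPC result.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory_resource>
#include <string>
#include <utility>
#include <vector>

// Hypothetical upstream resource: counts how often the arena has to fall
// back to the general-purpose heap because its buffer was too small.
class CountingResource : public std::pmr::memory_resource {
public:
    std::size_t allocations = 0;

private:
    void* do_allocate(std::size_t bytes, std::size_t align) override {
        ++allocations;
        return std::pmr::new_delete_resource()->allocate(bytes, align);
    }
    void do_deallocate(void* p, std::size_t bytes, std::size_t align) override {
        std::pmr::new_delete_resource()->deallocate(p, bytes, align);
    }
    bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
        return this == &other;
    }
};

// Simplified stand-in for map<string, vector<ColumnOrSuperColumn>>.
using Batch = std::pmr::map<std::pmr::string, std::pmr::vector<std::pmr::string>>;

struct Stats {
    std::size_t columns;         // columns built across all batches
    std::size_t heap_fallbacks;  // upstream allocations made by the arena
};

// Builds and destroys `batches` result objects, reusing one buffer for all.
Stats run_batches(std::size_t batches, std::size_t rows_per_batch) {
    CountingResource heap;
    std::vector<std::byte> buffer(1 << 20);  // 1 MiB, allocated once
    std::pmr::monotonic_buffer_resource arena(buffer.data(), buffer.size(), &heap);

    std::size_t columns = 0;
    for (std::size_t b = 0; b < batches; ++b) {
        {
            Batch batch(&arena);  // every node and string comes from the arena
            for (std::size_t r = 0; r < rows_per_batch; ++r) {
                const std::string id = std::to_string(r);
                std::pmr::string key("row-", &arena);
                key.append(id.data(), id.size());
                auto& cols = batch[std::move(key)];
                for (int c = 0; c < 4; ++c) {
                    // 64-character value, long enough to need an allocation
                    cols.emplace_back(std::size_t{64}, 'x');
                }
                columns += cols.size();
            }
            assert(batch.size() == rows_per_batch);
        }  // batch is destroyed here, before the arena is reset
        arena.release();  // rewind to the start of the same buffer
    }
    return {columns, heap.allocations};
}
```

Calling `run_batches(10, 100)` builds ten batches of 100 rows each. Because every batch is carved out of the same buffer and released in one step, the general-purpose heap sees no per-object allocate/free traffic for those batches, which is the pattern the reporter suspects of causing fragmentation. Applying this to real Thrift-generated types would require those types to accept a custom allocator, which the issue does not indicate they do.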