Hi Xiangdong,

first, thanks for bringing it back to the list, and for the excellent design 
document.
I have only skimmed it so far; I need some more time to read it in detail and 
give, if I can, some sensible feedback.

Julian

On 28.04.19, 15:41, "Xiangdong Huang" <saint...@gmail.com> wrote:

    Hi,
    
    Tian Jiang and I discussed this issue and have proposed a new design to
    control the memory usage of Overflow.
    
    I have put the design document at:
    
https://cwiki.apache.org/confluence/display/IOTDB/New+Design+of+Overflow+and+the+Mergence+Process
    
    
    Please leave your comment at
    https://issues.apache.org/jira/projects/IOTDB/issues/IOTDB-84 or this
    mailing list.
    
    Best,
    
    -----------------------------------
    Xiangdong Huang
    School of Software, Tsinghua University
    
     黄向东
    清华大学 软件学院
    
    
    Xiangdong Huang <saint...@gmail.com> wrote on Mon, Apr 22, 2019 at 12:50 PM:
    
    > Hi,
    >
    > I think we can split tasks 1~3 into sub-tasks in JIRA.
    >
    > Also, I recommend learning how Cassandra manages memory (in the package
    > org.apache.cassandra.utils.memory) before we design our own strategy.
    >
    > Best,
    > -----------------------------------
    > Xiangdong Huang
    > School of Software, Tsinghua University
    >
    >  黄向东
    > 清华大学 软件学院
    >
    >
    > kangrong (JIRA) <j...@apache.org> wrote on Mon, Apr 22, 2019 at 12:42 PM:
    >
    >> kangrong created IOTDB-84:
    >> -----------------------------
    >>
    >>              Summary: Out-of-Memory bug
    >>                  Key: IOTDB-84
    >>                  URL: https://issues.apache.org/jira/browse/IOTDB-84
    >>              Project: Apache IoTDB
    >>           Issue Type: Bug
    >>             Reporter: kangrong
    >>          Attachments: image-2019-04-22-12-38-04-903.png
    >>
    >> An out-of-memory problem occurred in the last long-term test of the branch
    >> "add_disabled_mem_control":
    >>
    >> !image-2019-04-22-12-38-04-903.png!
    >>
    >> We analyzed the causes and propose to address them as follows:
    >>  # *Flushing to disk may double the memory cost*: A storage group
    >> maintains a list of ChunkGroups in memory, which is flushed to disk when
    >> its occupied memory exceeds the threshold (128MB by default).
    >>  ## In the current implementation, when a flush starts, each ChunkGroup is
    >> encoded in memory, so a new byte array is kept in memory for it. Only
    >> after all ChunkGroups have been encoded can their byte arrays be released
    >> together. Since a byte array has a size comparable to the original data
    >> (0.5× to 1×), this strategy may double the memory usage in the worst case.
    >>  ## Solution: We need to redesign the flush strategy. In TsFile, a Page is
    >> the minimal flush unit: a ChunkGroup contains several Chunks and a Chunk
    >> contains several Pages. Once a Page is encoded into a byte array, we can
    >> flush that byte array to disk and release it immediately, so the extra
    >> memory is at most one page size (64KB by default); see Sketch 1 below.
    >> This modification involves a series of cascading changes, including the
    >> metadata format and the writing process.
    >>  # *Memory control strategy*: We need to redesign the memory control
    >> strategy. For example, assign 60% of the memory to the writing process and
    >> 30% to the querying process. The writing memory includes the memory table
    >> and the flush process. When an Insert arrives, if its required memory
    >> exceeds TotalMem * 0.6 - MemTableUsage - FlushUsage, the Insert will be
    >> rejected; see Sketch 2 below.
    >>  # *Are the memory statistics accurate?* In the current code, the memory
    >> usage of a TSRecord Java object, which corresponds to an Insert SQL
    >> statement, is calculated by summing its DataPoints. E.g., for "insert into
    >> root.a.b.c(timestamp, v1, v2) values(1L, true, 1.2f)", the usage is
    >> 8 + 1 + 4 = 13 bytes, which ignores the size of object headers and other
    >> overhead. The memory statistics need to be redesigned carefully; see
    >> Sketch 3 below.
    >>  # *Is there still a memory leak?* In the log of the last crash caused by
    >> the out-of-memory exception, the actual JVM memory was 18G, whereas our
    >> memory statistics module only counted 8G. Besides the inaccuracy mentioned
    >> in Q3, we suspect there are still memory leaks or other potential
    >> problems. We will continue to debug this.
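
Sketch 1 (for point 1): a minimal Java sketch of the page-granularity flush
idea. The class and method names (PageGranularityFlusher, encodePage) are
hypothetical illustrations, not IoTDB's actual TsFile writer API; the point is
only that at most one encoded page needs to be kept alive at a time.

    import java.io.IOException;
    import java.io.OutputStream;
    import java.util.List;

    class PageGranularityFlusher {

        // Assumed default page size (64KB), as mentioned in the issue.
        private static final int PAGE_SIZE_BYTES = 64 * 1024;

        /**
         * Flush one Chunk page by page: encode a single page, write it out, and
         * drop the reference, so roughly one page of encoded bytes is held at a
         * time instead of keeping every encoded ChunkGroup until the flush ends.
         */
        void flushChunk(List<long[]> pageTimestamps, List<float[]> pageValues,
                        OutputStream out) throws IOException {
            for (int i = 0; i < pageTimestamps.size(); i++) {
                byte[] encodedPage = encodePage(pageTimestamps.get(i), pageValues.get(i));
                out.write(encodedPage);
                // encodedPage becomes unreachable after this iteration, so the GC
                // can reclaim it while the next page is being encoded.
            }
            out.flush();
        }

        // Placeholder for the real TsFile page encoding (encoder + compressor).
        private byte[] encodePage(long[] timestamps, float[] values) {
            return new byte[Math.min(PAGE_SIZE_BYTES, timestamps.length * 12)];
        }
    }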
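
Sketch 2 (for point 2): a minimal sketch of the admission rule, assuming a
hypothetical WriteMemoryController that tracks memtable and flush usage; the
60% ratio and the counters mirror the TotalMem * 0.6 - MemTableUsage -
FlushUsage formula above.

    import java.util.concurrent.atomic.AtomicLong;

    class WriteMemoryController {

        private final long totalMemory = Runtime.getRuntime().maxMemory();
        private final double writeRatio = 0.6;   // 60% of memory for the write path
        private final AtomicLong memTableUsage = new AtomicLong();
        private final AtomicLong flushUsage = new AtomicLong();

        /** Returns true if the insert may proceed; false means it is rejected. */
        boolean tryAcceptInsert(long requiredBytes) {
            long writeBudget = (long) (totalMemory * writeRatio);
            long available = writeBudget - memTableUsage.get() - flushUsage.get();
            if (requiredBytes > available) {
                return false;   // TotalMem * 0.6 - MemTableUsage - FlushUsage exceeded
            }
            memTableUsage.addAndGet(requiredBytes);
            return true;
        }

        /** Memory moves from the memtable to the flush process when a flush begins. */
        void onFlushStart(long bytes) {
            memTableUsage.addAndGet(-bytes);
            flushUsage.addAndGet(bytes);
        }

        /** Flushed memory is released once the flush finishes. */
        void onFlushEnd(long bytes) {
            flushUsage.addAndGet(-bytes);
        }
    }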
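
Sketch 3 (for point 3): a small sketch contrasting the current per-DataPoint
sum with an estimate that also charges JVM object overhead. The overhead
constants are rough assumptions (a typical 64-bit JVM with compressed oops),
not measured values, and RecordSizeEstimator is a hypothetical name.

    class RecordSizeEstimator {

        // Assumed JVM overheads (64-bit HotSpot, compressed oops); rough values.
        private static final int OBJECT_HEADER_BYTES = 16;
        private static final int REFERENCE_BYTES = 4;

        /**
         * Current approach: raw primitive widths only. For
         * "insert into root.a.b.c(timestamp, v1, v2) values(1L, true, 1.2f)"
         * this yields 8 + 1 + 4 = 13 bytes.
         */
        static long rawDataSize(int longs, int booleans, int floats) {
            return 8L * longs + 1L * booleans + 4L * floats;
        }

        /**
         * Adjusted estimate: additionally charge each DataPoint object for its
         * header and for the reference held by the containing TSRecord's list.
         */
        static long estimatedHeapSize(int dataPointCount, long rawDataSize) {
            return rawDataSize
                + (long) dataPointCount * (OBJECT_HEADER_BYTES + REFERENCE_BYTES);
        }
    }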
    >>
    >>
    >>
    >>
    >>
    >
    
