[ https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15167345#comment-15167345 ]
Anastasia Braginsky commented on HBASE-14921:
---------------------------------------------

Hi Guys,

So nice that HBASE-14918 gets so much attention :) Let's move some of that attention to HBASE-14921. I have moved the discussion here from the umbrella issue; after all, it is all about task #4. Newcomers, please take a look at HBASE-14918 for the previous discussion of this topic.

bq. One question; what happens when a CellBlockSegment runs into a HFileBlock? How will the marshalling from CBS to HFB run?

A CellBlocksSegment can be transferred to an HFile (at snapshot time) exactly as it is done now, using the special KeyValueScanner for the MemStoreSnapshot object. As a side note, because CellBlocksSegment inherits from ImmutableSegment, it supports both KeyValueScanner (for the MemStoreObject) and SegmentScanner (a scanner for the given read point).

{quote}
Just striving to use an array of arrays, instead of list of arrays, in order to enjoy the binary search and less memory overhead.
A noble goal.
{quote}

Thanks! :)

bq. So, an array of CellBlocks? You'd allocate CellBlocks with MSLAB?

Good question! So how does the MemStore currently work? MSLAB deals with 90% of the memory and it is already flat. MSLAB allocates ByteRanges, on top of which the Cell data (Key/Value/Tags/etc.) is laid out. After that, the Cell object is created and a reference to this Cell is inserted into the skip-list to accelerate the search. In CellBlocksSegment the allocation from MSLAB is exactly the same (!!!), meaning we are not going to change maybeCloneWithAllocator(). Whether MSLAB's Chunks are on- or off-heap depends on the MSLAB implementation and is actually a separate issue.

So CellBlocksSegment replaces the skip-list with an array. We want a faster search by using binary search, and binary search implies fixed-size array elements. The Cell's internals, however, are written as variable-length elements in MSLAB's Chunks.
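As an illustration of why binary search wants fixed-size elements, here is a minimal Java sketch. It is not HBase code: the class name `FlatCellIndexSketch`, the one-byte length prefix, and the plain `byte[]` standing in for an MSLAB Chunk are all assumptions made for this example. Variable-length keys sit back to back in the chunk, while a sorted `int[]` of offsets plays the role of the fixed-length Cell references, so the search can jump straight to the middle entry.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: a flat, sorted index of fixed-size references into a chunk. */
final class FlatCellIndexSketch {
    private final byte[] chunk;  // stand-in for a Chunk: variable-length keys, 1-byte length prefix each
    private final int[] offsets; // fixed-size "references" (chunk offsets), ordered by key

    /** Builds the index; the caller must pass keys already sorted, each shorter than 128 bytes. */
    FlatCellIndexSketch(String... sortedKeys) {
        int total = 0;
        for (String k : sortedKeys) {
            total += 1 + k.getBytes(StandardCharsets.UTF_8).length;
        }
        chunk = new byte[total];
        offsets = new int[sortedKeys.length];
        int pos = 0;
        for (int i = 0; i < sortedKeys.length; i++) {
            byte[] key = sortedKeys[i].getBytes(StandardCharsets.UTF_8);
            offsets[i] = pos;
            chunk[pos++] = (byte) key.length;
            System.arraycopy(key, 0, chunk, pos, key.length);
            pos += key.length;
        }
    }

    /** Binary search over the offsets array; returns the index position or -1 if absent. */
    int find(String key) {
        byte[] target = key.getBytes(StandardCharsets.UTF_8);
        int lo = 0;
        int hi = offsets.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = compareAt(offsets[mid], target);
            if (cmp < 0) {
                lo = mid + 1;
            } else if (cmp > 0) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }

    /** Unsigned byte-wise comparison of the key stored at the given offset against target. */
    private int compareAt(int offset, byte[] target) {
        int len = chunk[offset];
        int n = Math.min(len, target.length);
        for (int i = 0; i < n; i++) {
            int a = chunk[offset + 1 + i] & 0xff;
            int b = target[i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return len - target.length;
    }

    public static void main(String[] args) {
        FlatCellIndexSketch index = new FlatCellIndexSketch("row1", "row20", "row3", "row42");
        System.out.println(index.find("row3")); // prints 2
        System.out.println(index.find("row9")); // prints -1
    }
}
```

In this sketch each entry costs one 4-byte int in the index, whereas a skip-list would typically pay for a node object and several pointers per entry to get the same logarithmic lookup.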
In the array/block of CellBlocksSegment we keep fixed-length references to Cells. This array/block may or may not be allocated from MSLAB. It can be a ByteRange[]/ByteBuffer[]/ByteBuff[]/MultipleByteBuf/BoundedByteBufferPool or any of the “plethora” of types that [~stack] has already mentioned. Note that a Cell reference is small: on a 32-bit JVM it is 32 bits; on a 64-bit JVM it can be 32 or 64 bits, depending on configuration. The array is very space-efficient: we can squeeze up to a million Cell references into 4MB, so not much memory will be required for CellBlocksSegment.

What are the pros of CellBlocks vs. SkipList?
1. Space efficient: an array uses less space than a skip-list
2. GC friendly: less fragmentation
3. Locality: a sequential scan (like when downloading a snapshot to disk) should be faster
4. Can be taken off-heap

However, the memory used for the SkipList should be less than 10% of the 128MB allowed for a MemStore. So it is unclear whether we are going to see, de facto, any performance boost from flattening the navigation structure. Therefore, we want to separate this effort from “general MSLAB off-heaping” and to see benchmark results. Maybe we will see almost no improvement...

bq. Yeah, lets align what you are doing here with the offheaping of the write path work @anastasia.

Please note that CellBlocksSegment and off-heaping are really separate issues, and we plan to turn to the second after we finish with the first. The initial plan (also supported by [~anoop.hbase] and [~ram_krish]) was that MSLAB should be switched from ByteRange to ByteBuffer and support both on- and off-heap. We need to pre-allocate the Chunks with MemStoreChunkPool and to create a convenient interface. We also need to see how to do it with less pain, because MSLAB is referenced from many places in the code. What do you think?
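As a rough check of the "million Cell references in 4MB" estimate above, the arithmetic can be sketched as follows. The class and method names are hypothetical, and the calculation ignores the array header and any alignment padding. The 4-byte case corresponds to a 32-bit JVM or a 64-bit JVM with compressed references (on HotSpot, the `-XX:+UseCompressedOops` option); the 8-byte case corresponds to uncompressed 64-bit references.

```java
/** Hypothetical back-of-the-envelope estimate of a flat reference array's size. */
final class ReferenceArrayFootprint {
    /** Bytes taken by the reference slots alone (array header and padding ignored). */
    static long referenceArrayBytes(long cellCount, int bytesPerReference) {
        return cellCount * bytesPerReference;
    }

    public static void main(String[] args) {
        long cells = 1_000_000L;
        System.out.println(referenceArrayBytes(cells, 4)); // prints 4000000
        System.out.println(referenceArrayBytes(cells, 8)); // prints 8000000
    }
}
```

With 4-byte references a million Cells need about 4MB of index, and about 8MB with uncompressed 64-bit references; both are small next to the 128MB MemStore limit mentioned above.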
Thanks to all previous repliers,
Anastasia

> Memory optimizations
> --------------------
>
>                 Key: HBASE-14921
>                 URL: https://issues.apache.org/jira/browse/HBASE-14921
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.0.0
>            Reporter: Eshcar Hillel
>
> Memory optimizations including compressed format representation and offheap
> allocations

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)