[ 
https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15212838#comment-15212838
 ] 

stack commented on HBASE-14921:
-------------------------------

Reviewed again after another read of the attached diagram.

Does the 'CellBlock' Map have to be a Concurrent Map? This is a readonly 
datastructure once made? No need of the 'Concurrent'?

If so, you made CellSet (or Eshcar) made CellSet which implements NaviableSet. 
Here we are slinging Cells and implement NavigableMap. Call it CellMap rather 
than CellBock? I was trying to come up with a name that made reference to how 
it is actually implemented -- e.g. like ConcurrentSkipListMap -- which would 
seem to say all it CellArrayMap rather than CellBlock but when I see 
CellBlockObjectArray later..  may be CellBinSearchMap....

It is odd that comparator is passed on construction and when we call 
createCellBlocks. Should constructor be protected so folks can't construct one 
w/o there being data in it? Should we save the comparator when createCellBlocks 
rather than pass on construction?  Then we have to make sure the comparators 
agree? Or do we? Seems like you need it on construction so can propagate 
instances with the createCellBlocks call. Seems like allowing passing different 
comparators would make for trouble so don't allow passing comparator on 
createCellBlocks?

Why name the method getCellFromIndex(int i)? Should it be get(i) because we are 
getting the ith Cell in Map? (The comment says that this is what it is doing)

It'd be cool to compare a CSLM to one of these CAMs at different scales to see 
difference in perf in a test bench.

For CellBlockObjectArray, it should be called CellArrayMap?

What does "CellBlockObjectArray is a simple array of Cells allocated using 
JVM." mean? All our stuff is on a JVM (smile)

Below should be final

  Cell[] block;


Is the below total Cels?

  private int numOfCellsInsideChunk;

It looks like it is Cells per chunk? This is constant for all Chunks in a 
CellBlockSerialized?

(BTW CellBlockSerialized could be ChunkCellMap or CellMLABMap

This seems to be the 'key', or 'index key' size, not bytes in a cell?: private 
static final int BYTES_IN_CELL = 3*(Integer.SIZE / Byte.SIZE); // each Cell 
requires 3 integers

I love the way this works. Very nice. Simple. Only bit is how you ensure all 
Chunks have same amount of Cells?

Remove this stuff rather than just comment it out:

     //c = (chunkPool != null) ? chunkPool.getChunk() : new Chunk(chunkSize, 
5); //14921
197     

The 14921 everywhere should be stripped. Its HBASE-14921? 

Is this a hardcoding?


201             c = new Chunk(chunkSize, 5);

The Chunk Pool needs to be on for this all to work? Is that in this patch?

For...

   * Only used in testing

we usually mark with @VisibleForTestiing annotation.

Patch looks great.




> Memory optimizations
> --------------------
>
>                 Key: HBASE-14921
>                 URL: https://issues.apache.org/jira/browse/HBASE-14921
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.0.0
>            Reporter: Eshcar Hillel
>            Assignee: Anastasia Braginsky
>         Attachments: CellBlocksSegmentInMemStore.pdf, 
> CellBlocksSegmentinthecontextofMemStore(1).pdf, HBASE-14921-V01.patch, 
> HBASE-14921-V02.patch
>
>
> Memory optimizations including compressed format representation and offheap 
> allocations



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to