[jira] [Commented] (HBASE-13448) New Cell implementation with cached component offsets/lengths

Anoop Sam John (JIRA) Sun, 31 May 2015 00:24:07 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566404#comment-14566404
 ]


Anoop Sam John commented on HBASE-13448:
----------------------------------------

@larsh thanks for the comments

I was trying to explain why we won't see any improvement as such in the test, 
especially in 0.98. Sorry if I did not say that clearly.
The test has 1 CF with a single file in it. Under the StoreScanner KVHeap we 
always have only a single file, so no comparison happens there and there are no 
calls to getXXXOffset/Length.  There are get calls in StoreScanner (at most 2), 
and SQM also needs the component offsets/lengths.  But SQM does not make get 
calls on the KeyValue to obtain them.  Instead it computes them itself by 
parsing the KV buffer (see code below).  SQM then skips these cells, so there 
are no further get calls on them.  So in effect there are 2 get calls on 
rowLength and just one on the others.  This makes it clear why there is no 
advantage.
In a real case where Cells are not skipped (and in trunk especially), these 
calls happen many times, mainly on rowLength.  When ExplicitColumnTracker is in 
use, there are also many calls for the qualifier offset/length.  For the other 
component lengths/offsets, the keyLength is parsed frequently.  If you look at 
the table in the comments above, you can see how many times each call happens 
on a single Cell.  Those numbers are for cells that are written back to the 
client side, so they pass through all layers.  But in that test too I had only 
1 CF and one HFile.  When there are more of these, comparisons will also happen 
in the 2 KVHeaps, so there will be even more calls. (We no longer pass the 
byte[], offset, length into the Comparators; we pass the Cell alone.)

So in the case of trunk we would see an advantage.  If you can give us your 
test, I will run it on trunk.

{code}
    byte [] bytes = kv.getBuffer();
    int offset = kv.getOffset();

    int keyLength = Bytes.toInt(bytes, offset, Bytes.SIZEOF_INT);
    offset += KeyValue.ROW_OFFSET;

    int initialOffset = offset;

    short rowLength = Bytes.toShort(bytes, offset, Bytes.SIZEOF_SHORT);
    offset += Bytes.SIZEOF_SHORT;

    int ret = this.rowComparator.compareRows(row, this.rowOffset, this.rowLength,
        bytes, offset, rowLength);
    ...
    ...

    //Passing rowLength
    offset += rowLength;

    //Skipping family
    byte familyLength = bytes [offset];
    offset += familyLength + 1;

    int qualLength = keyLength -
      (offset - initialOffset) - KeyValue.TIMESTAMP_TYPE_SIZE;

    long timestamp = Bytes.toLong(bytes,
        initialOffset + keyLength - KeyValue.TIMESTAMP_TYPE_SIZE);
    ...
    ...
    byte type = bytes[initialOffset + keyLength - 1];
    ...
    MatchCode colChecker = columns.checkColumn(bytes, offset, qualLength, type);
    if (colChecker == MatchCode.INCLUDE) {
      ReturnCode filterResponse = ReturnCode.SKIP;
      // STEP 2: Yes, the column is part of the requested columns. Check if
      // filter is present
      if (filter != null) {
        // STEP 3: Filter the key value and return if it filters out
        filterResponse = filter.filterKeyValue(kv);

{code}
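
For contrast with the per-call parsing above, here is a minimal standalone sketch of the caching idea: the component offsets/lengths are parsed once in the constructor, using the same arithmetic as the SQM snippet, and the getters only return stored values. CachedOffsetsCell and encode are hypothetical names used for illustration; this is not the class from the attached patches, and it uses java.nio.ByteBuffer rather than HBase's Bytes so that it runs on its own. It assumes the KeyValue layout of 4-byte key length, 4-byte value length, 2-byte row length, row, 1-byte family length, family, qualifier, 8-byte timestamp, 1-byte type, value.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration only; not the implementation from the HBASE-13448 patches. */
final class CachedOffsetsCell {
    // Mirrors KeyValue.ROW_OFFSET: 4-byte key length + 4-byte value length.
    static final int ROW_OFFSET = 8;
    // Mirrors KeyValue.TIMESTAMP_TYPE_SIZE: 8-byte timestamp + 1-byte type.
    static final int TIMESTAMP_TYPE_SIZE = 9;

    private final int rowOffset;
    private final short rowLength;
    private final int familyOffset;
    private final byte familyLength;
    private final int qualifierOffset;
    private final int qualifierLength;

    /** Parses the serialized cell once and stores every component offset/length. */
    CachedOffsetsCell(byte[] bytes, int offset) {
        ByteBuffer buf = ByteBuffer.wrap(bytes); // big-endian by default
        int keyLength = buf.getInt(offset);
        int keyStart = offset + ROW_OFFSET;
        this.rowLength = buf.getShort(keyStart);
        this.rowOffset = keyStart + Short.BYTES;
        int familyLengthPos = rowOffset + rowLength;
        this.familyLength = bytes[familyLengthPos];
        this.familyOffset = familyLengthPos + 1;
        this.qualifierOffset = familyOffset + familyLength;
        this.qualifierLength =
            keyLength - (qualifierOffset - keyStart) - TIMESTAMP_TYPE_SIZE;
    }

    // Getters return the cached values; no buffer parsing happens per call.
    int getRowOffset() { return rowOffset; }
    short getRowLength() { return rowLength; }
    int getFamilyOffset() { return familyOffset; }
    byte getFamilyLength() { return familyLength; }
    int getQualifierOffset() { return qualifierOffset; }
    int getQualifierLength() { return qualifierLength; }

    /** Hypothetical helper that serializes one cell in the assumed layout. */
    static byte[] encode(String row, String family, String qualifier,
                         long timestamp, byte type, String value) {
        byte[] r = row.getBytes(StandardCharsets.UTF_8);
        byte[] f = family.getBytes(StandardCharsets.UTF_8);
        byte[] q = qualifier.getBytes(StandardCharsets.UTF_8);
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        int keyLength = Short.BYTES + r.length + 1 + f.length + q.length
            + TIMESTAMP_TYPE_SIZE;
        ByteBuffer buf = ByteBuffer.allocate(ROW_OFFSET + keyLength + v.length);
        buf.putInt(keyLength).putInt(v.length);
        buf.putShort((short) r.length).put(r);
        buf.put((byte) f.length).put(f);
        buf.put(q);
        buf.putLong(timestamp).put(type);
        buf.put(v);
        return buf.array();
    }
}
```

With something like this, a cell that is compared many times in the KVHeaps pays the parsing cost once, at the price of a few extra fields per cell object.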


> New Cell implementation with cached component offsets/lengths
> -------------------------------------------------------------
>
>                 Key: HBASE-13448
>                 URL: https://issues.apache.org/jira/browse/HBASE-13448
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Scanners
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>             Fix For: 2.0.0
>
>         Attachments: 13448-0.98.txt, HBASE-13448.patch, HBASE-13448_V2.patch, 
> HBASE-13448_V3.patch, gc.png, hits.png
>
>
> This can be an extension to KeyValue and can be instantiated and used in the 
> read path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
