[ 
https://issues.apache.org/jira/browse/PHOENIX-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024029#comment-16024029
 ] 

Karan Mehta edited comment on PHOENIX-3884 at 5/25/17 2:01 AM:
---------------------------------------------------------------

bq. Or just use CellUtil#estimatedSerializedSizeOf(Cell) ? It's in branch-1.1 
and up.
This computes the value every time. Can we use the following?

bq. + byteSize += c.getRowLength() + c.getFamilyLength() + 
c.getQualifierLength() + c.getValueLength() + KeyValue.KEY_INFRASTRUCTURE_SIZE 
+ KeyValue.KEYVALUE_WITH_TAGS_INFRASTRUCTURE_SIZE;
I think Cell will always be an instance of KeyValue here, so 
KeyValue#getLength() can be used? The length of the KV is computed during its 
creation and cached, so the call will be quick.

This is how HBase does it for its own quota calculation. You might want to use 
that function as well.
{code}
    for (Map.Entry<byte[], List<Cell>> entry : mutation.getFamilyCellMap().entrySet()) {
      for (Cell cell : entry.getValue()) {
        size += KeyValueUtil.length(cell);
      }
    }
{code}
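
For reference, as far as I can tell KeyValueUtil.length(cell) boils down to the fixed KeyValue framing plus the row, family, qualifier, value and tag lengths. Below is a minimal standalone sketch of that arithmetic with no HBase dependency. The class name, the CellLengths holder and the sample lengths are hypothetical, and the constants are my assumption of what the HBase 1.x KeyValue layout uses, so please check them against KeyValue.java before relying on the numbers.

```java
import java.util.ArrayList;
import java.util.List;

public class KeyValueSizeSketch {
    // Assumed framing sizes, meant to mirror the HBase 1.x KeyValue layout.
    // Key length (int) + value length (int).
    public static final int KEYVALUE_INFRASTRUCTURE_SIZE = 4 + 4;
    // Row length (short) + family length (byte) + timestamp (long) + type (byte).
    public static final int KEY_INFRASTRUCTURE_SIZE = 2 + 1 + 8 + 1;
    // Tags length (short), assumed to be written only when tags are present.
    public static final int TAGS_LENGTH_SIZE = 2;

    /** Hypothetical stand-in for the lengths a Cell exposes. */
    public static final class CellLengths {
        final int row, family, qualifier, value, tags;

        public CellLengths(int row, int family, int qualifier, int value, int tags) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
            this.tags = tags;
        }
    }

    /** Serialized size of one cell: fixed framing plus the variable-length parts. */
    public static long serializedSize(CellLengths c) {
        long size = KEYVALUE_INFRASTRUCTURE_SIZE + KEY_INFRASTRUCTURE_SIZE
                + c.row + c.family + c.qualifier + c.value;
        if (c.tags > 0) {
            size += TAGS_LENGTH_SIZE + c.tags;
        }
        return size;
    }

    /** Sum over all cells of a mutation, like the loop over getFamilyCellMap(). */
    public static long estimate(List<CellLengths> cells) {
        long size = 0;
        for (CellLengths c : cells) {
            size += serializedSize(c);
        }
        return size;
    }

    public static void main(String[] args) {
        // Hypothetical row: 20-byte key, 10 columns, 1-byte family,
        // 8-byte qualifiers and 8-byte values, no tags.
        List<CellLengths> cells = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            cells.add(new CellLengths(20, 1, 8, 8, 0));
        }
        System.out.println("estimated bytes: " + estimate(cells));
    }
}
```

Under these assumptions the fixed overhead is 20 bytes per cell without tags, so the estimate tracks the bytes actually written instead of the Java object overhead that Mutation.heapSize() includes.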


> Correct MutationState size estimation
> -------------------------------------
>
>                 Key: PHOENIX-3884
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3884
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.10.0
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>         Attachments: 3884.txt
>
>
> Currently the Mutation size is estimated by calling Mutation.heapSize(), which adds 
> all the overhead needed to store the Mutation on the Java heap and has little 
> to do with the actual size on the wire or the size on disk.
> With a sample row with a 20 byte key and 10 columns with a qualifier length 
> and value length of this reports 1800 bytes, whereas the actual size is closer to 
> 600-700 bytes.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
