[ 
https://issues.apache.org/jira/browse/HBASE-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13733941#comment-13733941
 ] 

Jean-Daniel Cryans commented on HBASE-7391:
-------------------------------------------

Why isn't this call pulled out of the if-else statement?

bq. rowDict.init(Short.MAX_VALUE);

Is the HLog change necessary?

The new blank line in Dictionary contains trailing whitespace. Same in
ProtobufLogWriter and ReaderBase.

ReaderBase.isRecoveredEdits() should be a single line that returns the
boolean expression directly, not "if () return true else false".

I also see the same method in WriterBase. Can we have a single, centralized
way of telling whether a path contains recovered edits?
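
To illustrate the two points above, here is a minimal sketch of a shared
helper that both ReaderBase and WriterBase could delegate to. The class name
RecoveredEditsPaths, the String-based signature, and the component-matching
rule are assumptions made for illustration only; they are not taken from the
patch, and the real code would presumably work on Hadoop Path objects and the
existing HBase constant for the directory name.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the patch: one place that decides
// whether a path points into a recovered-edits directory.
final class RecoveredEditsPaths {
    // Assumed directory name, taken from the "recovered.edits" wording
    // in the issue description.
    static final String RECOVERED_EDITS_DIR = "recovered.edits";

    private RecoveredEditsPaths() {
    }

    // Single-line body: return the boolean expression directly instead of
    // "if (...) return true; else return false;".
    static boolean isRecoveredEdits(String path) {
        return Arrays.asList(path.split("/")).contains(RECOVERED_EDITS_DIR);
    }
}
```

With something like this in place, both base classes would call
RecoveredEditsPaths.isRecoveredEdits(...) rather than each carrying its own
copy of the check.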
                
> Review/improve HLog compression's memory consumption
> ----------------------------------------------------
>
>                 Key: HBASE-7391
>                 URL: https://issues.apache.org/jira/browse/HBASE-7391
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Jean-Daniel Cryans
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.95.2
>
>         Attachments: HBASE-7391_1.patch
>
>
> From Ram in 
> http://mail-archives.apache.org/mod_mbox/hbase-dev/201205.mbox/%3C00bc01cd31e6$7caf1320$760d3960$%25vasude...@huawei.com%3E:
> {quote}
> One small observation after giving +1 on the RC.
> The WAL compression feature causes OOME and causes Full GC.
> The problem is, if we have 1500 regions and I need to create recovered.edits
> for each region (I don’t have much data in the regions (~300MB)).
> Now when I try to build the dictionary there is a Node object getting
> created.
> Each node object occupies 32 bytes.
> We have 5 such dictionaries.
> Initially we create indexToNodes array and its size is 32767.
> So now we have 32*5*32767 = ~5MB.
> Now I have 1500 regions.
> So 5MB*1500 = ~7GB (excluding actual data). This seems to be a very high
> initial memory footprint; it never allows me to split the logs, and I
> am not able to bring the cluster up at all.
> Our configured heap size was 8GB, tested in a 3-node cluster with 5000
> regions and very little data (1GB in the HDFS cluster including replication),
> with a small amount of data spread evenly across all regions.
> The formula is 32 (Node object size) * 5 (no. of dictionaries) * 32767 (no. of
> node objects) * no. of regions.
> {quote}
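
The estimate quoted above can be checked with a short calculation. The sketch
below only restates the formula from the description (32 bytes per Node, 5
dictionaries, 32767 nodes each, times the number of regions). The class and
method names are made up for illustration, and the 32-byte Node size is the
figure reported in the quote, not something measured here.

```java
// Hypothetical calculator for the footprint formula in the description.
final class DictionaryFootprint {
    static final long NODE_BYTES = 32L;                       // per the quote
    static final long DICTIONARIES = 5L;                      // per the quote
    static final long NODES_PER_DICTIONARY = Short.MAX_VALUE; // 32767

    private DictionaryFootprint() {
    }

    // long arithmetic is required: the product for 1500 regions
    // does not fit in a 32-bit int.
    static long bytesFor(long regions) {
        return NODE_BYTES * DICTIONARIES * NODES_PER_DICTIONARY * regions;
    }

    public static void main(String[] args) {
        System.out.println("per region:   " + bytesFor(1) + " bytes");
        System.out.println("1500 regions: " + bytesFor(1500) + " bytes");
    }
}
```

This gives 5,242,720 bytes (about 5MB) per region and 7,864,080,000 bytes
for 1500 regions, which matches the ~7GB figure and is close to the whole
8GB heap before any actual data is loaded.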
