[ 
https://issues.apache.org/jira/browse/HBASE-15525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15209775#comment-15209775
 ] 

Anoop Sam John commented on HBASE-15525:
----------------------------------------

We run out of off-heap memory for two main reasons:
1. The pool has already reached its max capacity of #BBs, and at a given point
in time all of them are in use. Other Calls still ask the pool for BBs for
their cell block creation, and the pool happily makes new off-heap BBs, each
sized at the running average length. All these cell blocks stay tied to the
Call until the Responder writes them to the socket. True, we won't keep them
in the pool, but they are held as-is for a longer time, especially when the
response Q is growing.
2. Even when the response cell block needs very little, say 12 KB, we waste
512 MB per response. Waste in the sense that nearly all of that 512 MB is not
usable at all. And each new BB which the pool creates on demand (these might
not be pooled at all, as we reach the max #BBs in the pool) also takes 512 MB
per BB.
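To put point 2 in numbers, here is a tiny helper (illustrative only; PerResponseWaste and wastedFraction are made-up names, not HBase code) showing how much of a fixed-size pooled buffer goes unused when the response needs just 12 KB of a 512 MB buffer:

```java
// Illustrates point 2 above: the pool hands out buffers sized at the running
// average, so a small cell block pins a much larger direct buffer.
// Figures mirror the comment (12 KB needed, 512 MB buffer); names are made up.
public class PerResponseWaste {
  /** Fraction of the buffer that the response cannot use. */
  public static double wastedFraction(long bufferBytes, long neededBytes) {
    return 1.0 - (double) neededBytes / bufferBytes;
  }

  public static void main(String[] args) {
    double w = wastedFraction(512L << 20, 12L << 10);
    System.out.printf("wasted: %.4f%%%n", w * 100); // over 99.99% wasted
  }
}
```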
So, simply put, it is really difficult for the user to predict how much max
off-heap size to give. In deepankar's case, he applies a calculation based on
the max #BBs in the pool and the max BB size, plus some additional GBs, and
sets the max off-heap size to 5 GB. But this goes wrong.
To explain it with an example: suppose the max #BBs in the pool is configured
as 100, and the max per-item size as 1 MB. That means this pool can consume at
most 100 MB off-heap. Now consider there are lots of requests and the response
Q is big. Say the first 100 responses use all the BBs from the pool. Requests
keep coming, and say another 100 are added to the Q. Each one asks the pool,
which makes an off-heap BB for it. That means outside the pool we have
allocated double the total max size we thought it would take. I agree we won't
store all those BBs in the pool, and the GC may be able to clean them up too,
but for some time (until we clear this response Q) the usage is higher.
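The arithmetic in that example can be sketched like this (assumed figures from the comment; PoolOvershoot and peakOffHeapBytes are illustrative names, not HBase code):

```java
// Worked numbers for the example above: pool capped at 100 buffers of 1 MB,
// but queued responses beyond the cap trigger fresh off-heap allocations
// outside the pool's accounting, so peak usage can exceed the cap.
public class PoolOvershoot {
  public static long peakOffHeapBytes(int maxBuffers, int perBufferBytes,
                                      int queuedResponses) {
    // First maxBuffers responses drain the pool; every queued response beyond
    // that gets a brand-new off-heap buffer the pool never counted.
    int extra = Math.max(0, queuedResponses - maxBuffers);
    return (long) (maxBuffers + extra) * perBufferBytes;
  }

  public static void main(String[] args) {
    long peak = peakOffHeapBytes(100, 1 << 20, 200);
    System.out.println(peak / (1 << 20) + " MB"); // 200 MB: double the 100 MB cap
  }
}
```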
And one more thing about GC: is it only a full GC that can clean the off-heap
area? In other words, does this cause more full GCs (if we run out of space in
the off-heap area)?
So that is why I am thinking of changing how these temporary BBs are created:
when that happens, they should be heap BBs (HBBs).
We need to make the pool such that it gives back a BB if it has a free one.
When it has no free one and capacity is not reached, it makes a new DBB and
returns that. If neither is the case, it won't return anything. The BBBPool
will make and take back off-heap BBs only. If it cannot give one, let the
caller do what they want (make an on-heap BB and make sure not to give it
back to the pool).
About fixing the size of the BBs from the pool: I will write that in another
comment. This one is too big already.
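A minimal sketch of that pool behaviour (not the actual patch; CappedDirectBufferPool, tryGet, and onHeapFallback are illustrative names): the pool deals in direct BBs only, never allocates past its cap, and returns null when exhausted, so the caller falls back to a temporary on-heap BB that is never handed back.

```java
import java.nio.ByteBuffer;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch of the proposed behaviour: cap off-heap allocation, heap fallback. */
public class CappedDirectBufferPool {
  private final Queue<ByteBuffer> free = new ConcurrentLinkedQueue<>();
  private final AtomicInteger created = new AtomicInteger();
  private final int maxBuffers;
  private final int bufferSize;

  public CappedDirectBufferPool(int maxBuffers, int bufferSize) {
    this.maxBuffers = maxBuffers;
    this.bufferSize = bufferSize;
  }

  /** Pooled or newly created direct BB; null once the cap is reached. */
  public ByteBuffer tryGet() {
    ByteBuffer bb = free.poll();
    if (bb != null) return bb;
    while (true) {
      int n = created.get();
      if (n >= maxBuffers) return null;          // cap hit: caller goes on-heap
      if (created.compareAndSet(n, n + 1)) {
        return ByteBuffer.allocateDirect(bufferSize);
      }
    }
  }

  /** Only direct BBs that originated here come back to the pool. */
  public void putBack(ByteBuffer bb) {
    bb.clear();
    free.offer(bb);
  }

  /** Caller-side fallback: temporary on-heap BB, never returned to the pool. */
  public static ByteBuffer onHeapFallback(int size) {
    return ByteBuffer.allocate(size);
  }
}
```

With this shape, peak off-heap usage is strictly bounded by maxBuffers * bufferSize; the burst overflow lands on-heap, where a normal GC can reclaim it.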

> OutOfMemory could occur when using BoundedByteBufferPool during RPC bursts
> --------------------------------------------------------------------------
>
>                 Key: HBASE-15525
>                 URL: https://issues.apache.org/jira/browse/HBASE-15525
>             Project: HBase
>          Issue Type: Bug
>          Components: IPC/RPC
>            Reporter: deepankar
>            Assignee: Anoop Sam John
>            Priority: Critical
>
> After HBASE-13819 the system sometimes runs out of direct memory whenever 
> there is network congestion or some client-side issue.
> This is because of pending RPCs in the RPCServer$Connection.responseQueue: 
> since all the responses in this queue hold a buffer for the cellblock from 
> BoundedByteBufferPool, this can take up a lot of memory if the 
> BoundedByteBufferPool's moving average settles towards a higher value. 
> See the discussion here: 
> [HBASE-13819-comment|https://issues.apache.org/jira/browse/HBASE-13819?focusedCommentId=15207822&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15207822]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
