[ https://issues.apache.org/jira/browse/HBASE-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814278#comment-13814278 ]
churro morales commented on HBASE-9865:
---------------------------------------

One thing I noticed in WALEdit: we should be accounting for the ArrayList object itself as well. Instead of:

{code}
public long heapSize() {
  long ret = 0;
{code}

this would be correct, although it doesn't matter very much:

{code}
public long heapSize() {
  long ret = ClassSize.ARRAYLIST;
{code}

If you didn't want to bleed the ArrayList implementation that WALEdit uses, maybe something like this might work. For WALEdit:

{code}
public void removeIf(Predicate<KeyValue> predicate) {
  // Remember the size before removal so we can tell how much was dropped.
  int originalSize = kvs.size();
  // Iterate backwards so removals don't shift the indices still to be visited.
  for (int i = kvs.size() - 1; i >= 0; i--) {
    KeyValue kv = kvs.get(i);
    if (predicate.apply(kv)) {
      kvs.remove(i);
    }
  }
  // If more than half the entries were removed, shrink the backing array.
  if (kvs.size() < originalSize / 2) {
    kvs.trimToSize();
  }
}
{code}

And ReplicationSource would change to:

{code}
protected void removeNonReplicableEdits(WALEdit edit) {
  final NavigableMap<byte[], Integer> scopes = edit.getScopes();
  edit.removeIf(new Predicate<KeyValue>() {
    @Override
    public boolean apply(KeyValue keyValue) {
      return scopes == null || !scopes.containsKey(keyValue.getFamily());
    }
  });
}
{code}

I don't think it adds much by doing this, but it is an alternative if we don't want to bleed that WALEdit uses an ArrayList.

> WALEdit.heapSize() is incorrect in certain replication scenarios which may
> cause RegionServers to go OOM
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-9865
>                 URL: https://issues.apache.org/jira/browse/HBASE-9865
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.5, 0.95.0
>            Reporter: churro morales
>            Assignee: Lars Hofhansl
>         Attachments: 9865-0.94-v2.txt, 9865-sample-1.txt, 9865-sample.txt, 9865-trunk-v2.txt, 9865-trunk-v3.txt, 9865-trunk.txt
>
>
> WALEdit.heapSize() is incorrect in certain replication scenarios, which may cause RegionServers to go OOM.
> A little background on this issue: we noticed that our source replication regionservers would get into GC storms and sometimes even OOM.
> We noticed a case where there were around 25k WALEdits to replicate, each one with an ArrayList of KeyValues. The ArrayList had a capacity of around 90k (using 350KB of heap memory) but only around 6 non-null entries.
> When ReplicationSource.readAllEntriesToReplicateOrNextFile() gets a WALEdit, it removes all KeyValues that are scoped other than local. But in doing so we don't account for the capacity of the ArrayList when determining heapSize() for a WALEdit. The logic for shipping a batch is whether you have hit a size capacity or a number-of-entries capacity.
> Therefore, if you have a WALEdit with 25k entries and suppose all are removed: the size of the ArrayList is 0 (we don't even count the collection's heap size currently) and the capacity is ignored. This will yield a heapSize() of 0 bytes, while in the best case it would be at least 100000 bytes (provided you pass initialCapacity and you have a 32-bit JVM).
> I have some ideas on how to address this problem and want to know everyone's thoughts:
> 1. We use a probabilistic counter such as HyperLogLog and create something like:
>    * class CapacityEstimateArrayList implements ArrayList
>      ** this class overrides all additive methods to update the probabilistic counts
>      ** it includes one additional method called estimateCapacity (we would take estimateCapacity - size() and fill in sizes for all references)
>    * Then we can do something like this in WALEdit.heapSize():
>
> {code}
> public long heapSize() {
>   long ret = ClassSize.ARRAYLIST;
>   for (KeyValue kv : kvs) {
>     ret += kv.heapSize();
>   }
>   long nullEntriesEstimate = kvs.getCapacityEstimate() - kvs.size();
>   ret += ClassSize.align(nullEntriesEstimate * ClassSize.REFERENCE);
>   if (scopes != null) {
>     ret += ClassSize.TREEMAP;
>     ret += ClassSize.align(scopes.size() * ClassSize.MAP_ENTRY);
>     // TODO this isn't quite right, need help here
>   }
>   return ret;
> }
> {code}
> 2. In ReplicationSource.removeNonReplicableEdits() we know the size of the array originally, and we provide some percentage threshold. When that threshold is met (say, 50% of the entries have been removed) we can call kvs.trimToSize().
> 3. In the heapSize() method for WALEdit we could use reflection (please don't shoot me for this) to grab the actual capacity of the list, doing something like this:
>
> {code}
> public int getArrayListCapacity() {
>   try {
>     Field f = ArrayList.class.getDeclaredField("elementData");
>     f.setAccessible(true);
>     return ((Object[]) f.get(kvs)).length;
>   } catch (Exception e) {
>     log.warn("Exception in trying to get capacity on ArrayList", e);
>     return kvs.size();
>   }
> }
> {code}
> I am partial to (1), using HyperLogLog and creating a CapacityEstimateArrayList; this is reusable throughout the code for other classes that implement HeapSize and contain ArrayLists. The memory footprint is very small and it is very fast. The issue is that this is an estimate, although we can configure the precision; we will most likely always be conservative. The estimateCapacity will always be less than the actual capacity, but it will be close. I think that putting the logic in removeNonReplicableEdits() will work, but that only solves the heapSize problem in this particular scenario. Solution 3 is slow and horrible, but it gives us the exact answer.
> I would love to hear if anyone else has other ideas on how to remedy this problem. I have code for trunk and 0.94 for all 3 ideas and can provide a patch if the community thinks any of these approaches is viable.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
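To make the under-counting in the description concrete, here is a minimal standalone sketch (not HBase code: the class name, the `REFERENCE` and `ARRAYLIST` constants, and the per-KeyValue heap cost are illustrative assumptions, and `ClassSize.align()` rounding is ignored) contrasting a heapSize() that counts only live entries with one that also charges for the backing array's unused slots:

```java
public class HeapSizeDemo {
    // Assumed sizes: one object reference on a 64-bit JVM without
    // compressed oops, and a rough ArrayList shell overhead.
    static final long REFERENCE = 8;
    static final long ARRAYLIST = 48;

    // heapSize as currently computed: only the live KeyValues are counted,
    // so a huge-capacity list with few entries looks nearly free.
    static long currentHeapSize(int liveEntries, long avgKvHeap) {
        return liveEntries * avgKvHeap;
    }

    // heapSize that also charges one reference per unused (null) slot
    // in the backing array, as sketched in option (1) above.
    static long capacityAwareHeapSize(int liveEntries, long avgKvHeap, int capacity) {
        long ret = ARRAYLIST + liveEntries * avgKvHeap;
        long nullEntries = capacity - liveEntries;
        ret += nullEntries * REFERENCE;
        return ret;
    }

    public static void main(String[] args) {
        // The scenario from the report: capacity ~90k, only 6 live entries,
        // assuming ~40 bytes per KeyValue for illustration.
        System.out.println(currentHeapSize(6, 40));                    // 240
        System.out.println(capacityAwareHeapSize(6, 40, 90_000));      // 720240
    }
}
```

The gap between the two numbers (a few hundred bytes versus roughly 700KB per edit) is what lets tens of thousands of mostly-empty WALEdits accumulate in a batch without ever tripping the size threshold.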