[
https://issues.apache.org/jira/browse/HBASE-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664343#action_12664343
]
stack commented on HBASE-1127:
------------------------------
Been looking at this more. I can watch the GC doing Full, Full, Full, but our
executor thread checking SoftValueMap reference queues is not clearing
anything. Then we OOME. I tried various things, including a thread per
instance of BlockFSInputStream that just blocks on the reference queue waiting
for the GC to add stuff. Oddly, even in this case we OOME, though we get a
bit further. Changing the interval between executor runs from 10 seconds to
1 second makes the executor actually clear the reference queues, but again
it's not enough: we OOME at about the same place as we do with a thread per
BlockFSInputStream instance. (A thread per instance won't fly anyway, so this
is just as well.)
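The thread-per-instance experiment above can be sketched roughly as follows. This is a minimal illustration, not HBase's actual code: the class and names (BlockCache, KeyedSoftReference, the reaper thread) are assumptions. A daemon thread blocks on ReferenceQueue.remove() and evicts an entry as soon as the GC enqueues its cleared SoftReference; a SoftReference subclass carries the cache key so the reaper knows which entry to drop.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only: one reaper thread per cache instance,
// blocked on the reference queue, evicting entries the GC has cleared.
class BlockCache {

  // SoftReference that remembers its cache key, so the reaper can
  // remove the right map entry once the referent has been collected.
  static final class KeyedSoftReference<K, V> extends SoftReference<V> {
    final K key;
    KeyedSoftReference(K key, V value, ReferenceQueue<V> q) {
      super(value, q);
      this.key = key;
    }
  }

  private final Map<String, KeyedSoftReference<String, byte[]>> map =
      new ConcurrentHashMap<>();
  private final ReferenceQueue<byte[]> queue = new ReferenceQueue<>();

  BlockCache() {
    Thread reaper = new Thread(() -> {
      try {
        while (true) {
          // Blocks until the GC enqueues a cleared reference.
          KeyedSoftReference<?, ?> ref =
              (KeyedSoftReference<?, ?>) queue.remove();
          map.remove(ref.key);
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "blockcache-reaper");
    reaper.setDaemon(true);
    reaper.start();
  }

  void put(String key, byte[] block) {
    map.put(key, new KeyedSoftReference<>(key, block, queue));
  }

  byte[] get(String key) {
    KeyedSoftReference<String, byte[]> ref = map.get(key);
    return ref == null ? null : ref.get();
  }

  int size() {
    return map.size();
  }
}
```

As the comment notes, one thread per BlockFSInputStream doesn't scale, which is why the shared-executor approach is the one worth keeping.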
I'm going to look at this a little more. In times of high memory pressure, it's
as though the GC gives up adding items to the reference queues, which wouldn't
seem to make sense. Given that we're up against the RC, I am currently thinking
that I'll revert to having blockcache off by default and instead let users
enable it explicitly (with the checker running every second). I'll leave it on
for catalog tables so meta content has block cache on.
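The shared-checker alternative, with the drain running on a fixed interval rather than a blocking thread per stream, could look something like the sketch below. Names (SoftValueCache, Entry, drain) are hypothetical, not HBase's SoftValueMap API; the point is the non-blocking poll() loop run every second, which is the configuration the comment settles on.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only: a soft-valued map whose reference queue is
// drained by a scheduled checker instead of a dedicated blocking thread.
class SoftValueCache<K, V> {

  // Soft reference that carries its key so drain() can evict the entry.
  private static final class Entry<K, V> extends SoftReference<V> {
    final K key;
    Entry(K key, V value, ReferenceQueue<V> q) {
      super(value, q);
      this.key = key;
    }
  }

  private final Map<K, Entry<K, V>> map = new ConcurrentHashMap<>();
  private final ReferenceQueue<V> queue = new ReferenceQueue<>();
  private final ScheduledExecutorService checker =
      Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "soft-value-checker");
        t.setDaemon(true);
        return t;
      });

  SoftValueCache(long periodMillis) {
    // With a 10s period, cleared-but-unevicted entries can pile up under
    // heavy allocation; a 1s period keeps the queue drained more often.
    checker.scheduleAtFixedRate(this::drain, periodMillis, periodMillis,
        TimeUnit.MILLISECONDS);
  }

  // Non-blocking: evict every entry the GC has cleared and enqueued
  // since the last pass. Returns the number of entries evicted.
  @SuppressWarnings("unchecked")
  int drain() {
    int evicted = 0;
    Entry<K, V> e;
    while ((e = (Entry<K, V>) queue.poll()) != null) {
      map.remove(e.key, e);
      evicted++;
    }
    return evicted;
  }

  void put(K key, V value) {
    map.put(key, new Entry<>(key, value, queue));
  }

  V get(K key) {
    Entry<K, V> e = map.get(key);
    return e == null ? null : e.get();
  }

  int size() {
    return map.size();
  }
}
```

Note that even with frequent draining, soft references are only guaranteed to be cleared before an OutOfMemoryError is thrown, so a drain that runs too late relative to allocation pressure can still lose the race, consistent with the behavior described above.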
> OOME running randomRead PE
> --------------------------
>
> Key: HBASE-1127
> URL: https://issues.apache.org/jira/browse/HBASE-1127
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
> Priority: Blocker
> Fix For: 0.19.0
>
>
> Blockcache is misbehaving on TRUNK. Something is broken. We OOME about 20%
> into the randomRead test. Looking at the heap, it's all soft references.
> Instrumenting the reference queue, we see it's never cleared even though
> we're full gc'ing. Something is off.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.