[
https://issues.apache.org/jira/browse/HADOOP-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12518867
]
stack commented on HADOOP-1644:
-------------------------------
Jim, let me try your suggestion of not having compactions disable flushes.
Another thing I'd like to try: rather than flushing memory to a new
file, flush by merging with an existing file. I expect it will
take about the same elapsed time, but we'll have put off a full compaction
by not producing an additional file.
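To make the file-count argument concrete, here is a minimal sketch of the two flush strategies. It is a toy model, not HBase code: the class FlushSketch, both method names, and the choice of merging into the most recently written file are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical toy model, not HBase code: a store is a list of sorted "files",
// each modelled as a sorted map of row key -> value.
class FlushSketch {
    final List<TreeMap<String, String>> files = new ArrayList<>();

    // Flush to a new file: every flush grows the file count by one.
    void flushToNewFile(TreeMap<String, String> memcache) {
        files.add(new TreeMap<>(memcache));
        memcache.clear();
    }

    // Flush by merging: fold the memcache into an existing file so the
    // file count stays the same. Merging into the newest file is an assumption.
    void flushByMerging(TreeMap<String, String> memcache) {
        if (files.isEmpty()) {
            // Nothing to merge with yet, so the first flush must create a file.
            flushToNewFile(memcache);
            return;
        }
        int last = files.size() - 1;
        TreeMap<String, String> merged = new TreeMap<>(files.get(last));
        merged.putAll(memcache); // newer memcache values overwrite older ones
        files.set(last, merged);
        memcache.clear();
    }
}
```

After three flushes the first strategy leaves three files and the second leaves one, which is the sense in which merging puts off a full compaction. The sketch says nothing about elapsed time; that would have to be measured.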
Another element to consider is that compactions are the means by which
HStoreFile references are cleaned up in a region (a region that still holds
references cannot be split), so compaction should do its best to clean up
reference files.
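The dependency between references and splits can be sketched as a simple guard. This is an illustration only: SplitGuardSketch, StoreFile, canSplit and compact are hypothetical names, and the assumption that a compaction rewrites everything into one plain file is a simplification.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not HBase code.
class SplitGuardSketch {
    // Stand-in for an HStoreFile; isReference marks a reference file.
    static final class StoreFile {
        final String name;
        final boolean isReference;

        StoreFile(String name, boolean isReference) {
            this.name = name;
            this.isReference = isReference;
        }
    }

    // A region may split only when none of its files is a reference.
    static boolean canSplit(List<StoreFile> files) {
        for (StoreFile f : files) {
            if (f.isReference) {
                return false;
            }
        }
        return true;
    }

    // Assumed simplification: compaction rewrites all inputs, references
    // included, into a single non-reference file.
    static List<StoreFile> compact(List<StoreFile> files) {
        return Collections.singletonList(new StoreFile("compacted", false));
    }
}
```

In this model a region holding even one reference file stays unsplittable until a compaction has run over it, which is why a compaction policy that skips reference files would leave the region stuck.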
> [hbase] Compactions should not block updates
> --------------------------------------------
>
> Key: HADOOP-1644
> URL: https://issues.apache.org/jira/browse/HADOOP-1644
> Project: Hadoop
> Issue Type: Improvement
> Components: contrib/hbase
> Affects Versions: 0.15.0
> Reporter: stack
> Assignee: stack
> Fix For: 0.15.0
>
>
> Currently, compactions take a long time. During compaction, updates are
> carried by the HRegions' memcache (+ backing HLog). The memcache is unable
> to flush to disk until compaction completes.
> Under sustained, substantial updates by multiple concurrent clients (10 in
> this case), where rows contain multiple columns, one of which is a web
> page -- a common hbase usage scenario -- the memcache grows fast, often to
> orders of magnitude in excess of the configured 'flush-to-disk' threshold.
> This throws the whole system out of kilter. When the memcache flush does get
> to run after compaction completes -- assuming you have sufficient RAM and the
> region server doesn't OOME -- the resulting on-disk file will be far larger
> than any other on-disk HStoreFile, bringing on a region split ... but the
> resulting split will produce regions that themselves need to be split
> immediately because each half is beyond the configured limit, and so on.
> In another issue yet to be posted, tuning and some pointed memcache flushes
> make the above condition less extreme, but until compaction durations come
> close to the memcache flush threshold, compactions will remain disruptive.
> It's allowed that compactions may never be fast enough, as per the bigtable
> paper (this is a 'wish' issue).
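The split cascade in the description can be put in rough numbers. The sketch below is a back-of-envelope model under stated assumptions: each split halves a file evenly, sizes are in arbitrary units, and SplitCascadeSketch and its methods are hypothetical names, not HBase APIs.

```java
// Hypothetical arithmetic sketch, not HBase code.
class SplitCascadeSketch {
    // Rounds of halving needed before every piece is at or under the limit.
    static int splitRounds(long fileSize, long maxSize) {
        if (fileSize <= 0 || maxSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        int rounds = 0;
        long piece = fileSize;
        while (piece > maxSize) {
            piece = (piece + 1) / 2; // larger half of an even split
            rounds++;
        }
        return rounds;
    }

    // Number of regions one oversized region turns into after that many rounds.
    static long regionsProduced(int rounds) {
        return 1L << rounds;
    }

    public static void main(String[] args) {
        long limit = 64; // assumed split limit, arbitrary units
        long flushed = 640; // assumed flushed file, ten times the limit
        int rounds = splitRounds(flushed, limit);
        System.out.println(rounds + " rounds, " + regionsProduced(rounds) + " regions");
    }
}
```

With a flushed file ten times the limit, this model needs four rounds of splitting and ends with sixteen regions; at a hundred times the limit it needs seven rounds. Real splits are not perfectly even, so treat the figures as illustrative only.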