[
https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565594#action_12565594
]
Billy Pearson commented on HBASE-69:
------------------------------------
Currently, flushes and compactions work well from what I can tell on my setup
and tests.
There are two areas I have concerns about and have not had a chance to test.
1. hlogs: If I have a column family that receives only, say, 1 out of every
100-250 updates, will that region hold up the removal of old hlogs while
waiting for a flush from that column? If so, one column family could make a
recovery take a long time if the region server fails. This is one of the
reasons, besides memory usage, that I think we need to leave in (or add back)
the optional flusher that flushes every 30-60 minutes.
2. Splits: If I have a large region split in two, compaction starts when the
new splits reload. But say the columns take 50 minutes to compact; if during
those 50 minutes I get enough updates to cause another split, will that split
fail if the region has not finished compacting all of its reference files from
the original split?
Outside of the above concerns I have not noticed any bugs in the patch while
flushing or compacting; all seems OK in that area.
> [hbase] Make cache flush triggering less simplistic
> ---------------------------------------------------
>
> Key: HBASE-69
> URL: https://issues.apache.org/jira/browse/HBASE-69
> Project: Hadoop HBase
> Issue Type: Improvement
> Components: regionserver
> Reporter: stack
> Assignee: Jim Kellerman
> Attachments: patch.txt, patch.txt, patch.txt, patch.txt, patch.txt,
> patch.txt, patch.txt, patch.txt, patch.txt
>
>
> When the flusher runs -- it's triggered when the sum of all Stores in a Region
> exceeds a configurable max size -- we flush all Stores, even though a Store
> memcache might hold but a few bytes.
> I would think Stores should only dump their memcache to disk if they have
> some substance.
> The problem becomes more acute, the more families you have in a Region.
> Possible behaviors would be to dump the biggest Store only, or only those
> Stores > 50% of max memcache size. Behavior would vary depending on the
> prompt that provoked the flush. We would also log why the flush is running:
> optional or > max size.
> This issue comes out of HADOOP-2621.
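The selective-flush idea described in the issue above could be sketched roughly as follows. This is a minimal illustration, not HBase's actual implementation: the class and method names (`SelectiveFlush`, `storesToFlush`) are hypothetical, and store memcache sizes are modeled as a simple map of store name to byte count.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the policy proposed in HBASE-69: instead of
// flushing every Store when a Region crosses the max memcache size,
// flush only the Stores that have "some substance".
public class SelectiveFlush {

    // Returns the names of the stores to flush: those whose memcache
    // exceeds half the configured max size; if none qualify, fall back
    // to flushing just the single biggest store.
    static List<String> storesToFlush(Map<String, Long> memcacheSizes,
                                      long maxMemcacheSize) {
        List<String> toFlush = new ArrayList<>();
        String biggest = null;
        long biggestSize = -1;
        for (Map.Entry<String, Long> e : memcacheSizes.entrySet()) {
            if (e.getValue() > maxMemcacheSize / 2) {
                toFlush.add(e.getKey());
            }
            if (e.getValue() > biggestSize) {
                biggestSize = e.getValue();
                biggest = e.getKey();
            }
        }
        if (toFlush.isEmpty() && biggest != null) {
            toFlush.add(biggest);   // nothing substantial: flush biggest only
        }
        return toFlush;
    }

    public static void main(String[] args) {
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("info", 40L);      // below the 50% threshold
        sizes.put("anchor", 3L);     // a few bytes: not worth a store file
        sizes.put("contents", 60L);  // above the threshold
        System.out.println(storesToFlush(sizes, 100L)); // only "contents"
    }
}
```

Under this policy a Store holding only a few bytes (like "anchor" above) never produces a tiny store file, which is the waste the issue describes; the fallback keeps at least one flush happening so memory is still reclaimed.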
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.