HBASE-4241 solves part of the problem. It avoids flushing to disk those memstore cells that the next compaction would collect anyway. Unfortunately, it does not reduce the number of memstore flushes; it just leads to smaller HFiles.
There's HBASE-5311 to discuss ways to address the latter problem.
Note that in any case *all* edits need to be written to the WAL, as you cannot anticipate future edits.

-- Lars

----- Original Message -----
From: Igal Shilman <[email protected]>
To: [email protected]
Cc:
Sent: Tuesday, May 1, 2012 10:11 PM
Subject: Re: Understanding compacting memstore/HLog before flush

Hi Alex,

Have you seen: https://issues.apache.org/jira/browse/HBASE-4241 ?

Igal.

On May 2, 2012 7:01 AM, "Alex Baranau" <[email protected]> wrote:

> Hello,
>
> Could you please tell me if I correctly understand this problem...
>
> Example behavior 1:
> * create table
> * do 10 operations: insert cell, override (given that versions # configured
> to 1) it, override, ... override.
> * after flushing memstore with these edits, all of them getting written to
> hfiles
>
> Ideally, in this situation one edit should be performed (resulting value of
> cell). I.e. only "current visible state" of memstore should be flushed as
> opposed to flushing all the edits from HLog. This will have a lot of
> benefits (e.g. reducing data amount to flush -> may be less frequent
> flushing needing -> less freq compactions, etc. operations), esp in
> particular use-cases (like using counters, or updating some "aggregated
> values").
>
> The problem, as I understand (correct me here, please if I'm wrong) is that
> it is not an easy thing to do, mainly because
> 1) additional resource management burden (flushing large memstore isn't
> cheap)
> 2) compaction may add a lot of unnecessary overhead (so that in some cases
> there will be no actual benefit from it), may make flushing much slower,
> which can bring a lot of issues
> 3) edits flushed from memstore and HLog edits should be kept in sync,
> because we want the flush process to be reliable. I.e. if it fails in the
> middle we should be able to restore the state from HLog. Keeping memstore
> and HLog in sync during compaction (and we would need partial compaction of
> some older data of the memstore) is difficult.
> 4) anything else?
>
> Esp. 3rd point - am I getting it right?
>
> Thanx,
> Alex Baranau
>
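
To make the idea discussed in this thread concrete (flush only the cells that a later compaction would keep, given the configured number of versions), here is a toy sketch in Python. It is not HBase code and does not use any HBase API: the function name `cells_to_flush`, the `(row, column, timestamp, value)` tuple layout, and the `max_versions` parameter are all assumptions made up for illustration. It models Alex's example of one insert followed by nine overrides with versions set to 1.

```python
from collections import defaultdict


def cells_to_flush(edits, max_versions=1):
    """Toy model: return the edits that survive version pruning at flush time.

    edits: iterable of (row, column, timestamp, value) tuples (hypothetical layout).
    max_versions: how many of the newest versions to keep per cell.
    """
    # Group every edit by the cell it targets.
    by_cell = defaultdict(list)
    for row, column, ts, value in edits:
        by_cell[(row, column)].append((ts, value))

    survivors = []
    for (row, column), versions in sorted(by_cell.items()):
        # Newest timestamp first, then keep only the allowed number of versions.
        versions.sort(key=lambda v: v[0], reverse=True)
        for ts, value in versions[:max_versions]:
            survivors.append((row, column, ts, value))
    return survivors


# One insert plus nine overrides of the same cell, as in the example above.
edits = [("row1", "cf:counter", ts, ts * 10) for ts in range(1, 11)]
flushed = cells_to_flush(edits, max_versions=1)
print(len(edits), "edits in memstore,", len(flushed), "cell(s) flushed")
```

In this model the flush is triggered independently of the pruning, so the output file gets smaller but the flush count stays the same, which matches the limitation noted at the top of the thread. The sketch also says nothing about the WAL, where every one of the ten edits would still have to be written.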
