HBASE-4465 is not needed for correctness.
Personally I'd rather release 0.94 sooner than backport non-trivial
patches.

I realize I am guilty of this myself (see HBASE-4838... although that was an
important correctness fix).

-- Lars

________________________________
From: Ted Yu <yuzhih...@gmail.com>
To: dev@hbase.apache.org 
Cc: Mikael Sitruk <mikael.sit...@gmail.com> 
Sent: Thursday, January 12, 2012 2:09 PM
Subject: Re: Major Compaction Concerns

Thanks for the tips, Nicolas.

About lazy seek: if you were referring to HBASE-4465, that was only
integrated into TRUNK and 0.89-fb.
I was thinking about backporting it to 0.92.

Cheers

On Thu, Jan 12, 2012 at 1:44 PM, Nicolas Spiegelberg <nspiegelb...@fb.com> wrote:

> Mikael,
>
> >The system is an OLTP system with strict latency and throughput
> >requirements; regions are pre-split and throughput is controlled.
> >
> >The system has a heavy-load period for a few hours; by heavy load I
> >mean a high proportion of inserts/updates and a small proportion of
> >reads.
>
> I'm not sure about the production status of your system, but it sounds
> like you have a critical need for dozens of optimization features coming
> out in 0.92 and even some trunk patches.  In particular, update speed
> has been drastically improved by lazy seek.  Although you can get
> incremental wins from different compaction features, you will get
> exponential wins from looking at those other features right now.
>
> >we fall into the memstore flush throttling ("will wait 90000 ms before
> >flushing the memstore"), retaining more logs and triggering more
> >flushes that can't complete... adding pressure on the system memory
> >(the memstore is not flushed on time)
>
> Filling up the logs faster than you can flush normally indicates that you
> have disk or network saturation.  If you have an increment workload, I
> know there are a number of patches in 0.92 that will drastically reduce
> your flush size (1: read memstore before going to disk, 2: don't flush all
> versions).  You don't have a compaction problem, you have a write/read
> problem.
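> As a sketch of the knobs behind that "will wait 90000 ms" message
> (property names from the 0.90/0.92-era defaults; verify against your
> version's hbase-default.xml), the flush throttle lives in
> hbase-site.xml:
>
> ```xml
> <!-- Delay flushes when a store has more than this many StoreFiles -->
> <property>
>   <name>hbase.hstore.blockingStoreFiles</name>
>   <value>7</value>
> </property>
> <!-- How long a flush waits on compactions before proceeding anyway;
>      the 90000 ms default is the delay quoted above -->
> <property>
>   <name>hbase.hstore.blockingWaitTime</name>
>   <value>90000</value>
> </property>
> ```
>
> Raising blockingStoreFiles trades higher read latency for fewer blocked
> flushes; it papers over, rather than fixes, the underlying IO saturation.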
>
> In 0.92, you can try setting your compaction.ratio down (0.25 is a good
> start) to increase the StoreFile count, which slows reads but saves
> network IO on writes.  This setting is very similar to the defaults
> suggested in the BigTable paper.  However, this is only going to cut
> your network IO in half.  The LevelDB or BigTable algorithm can reduce
> your outlier StoreFile count, but they wouldn't be able to cut this IO
> volume down much either.
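> A minimal sketch of lowering the ratio (assuming the 0.92 property name
> hbase.hstore.compaction.ratio) in hbase-site.xml:
>
> ```xml
> <property>
>   <name>hbase.hstore.compaction.ratio</name>
>   <value>0.25</value>
> </property>
> ```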
>
> >Please remember I'm on 0.90.1, so when a major compaction is running,
> >minor compactions are blocked, and when a memstore for one column
> >family is flushed, all the other column families' memstores are also
> >flushed, no matter whether they are smaller or not.
> >As you already wrote, the best way is to manage compaction, and that is
> >what I tried to do.
>
> Per-storefile compactions & multi-threaded compactions were added in 0.92 to
> address this problem.  However, a high StoreFile count is not necessarily
> a bad thing.  For an update workload, you only have to read the newest
> StoreFile and lazy seek optimizes your situation a lot (again 0.92).
>
> >Regarding the compaction pluggability needs:
> >Suppose the data you are inserting into different column families has
> >different patterns.  For example, in CF1 (column family #1) you update
> >fields in the same row key, while in CF2 you add new fields each time,
> >or CF2 gets new rows and older rows are never updated.  Wouldn't you
> >use different algorithms for compacting these CFs?
>
> There are roughly 3 different workloads that require different
> optimizations (not necessarily compaction-related):
> 1. Read old data.  Should properly use bloom filters to filter out
> StoreFiles.
> 2. R+W.  Will really benefit from lazy seek & cache-on-write (0.92).
> Far more than from a compaction algorithm.
> 3. Write mostly.  Don't really care about compactions here.  Just don't
> want them to suck too much IO.
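> For case 1, blooms are configured per column family.  A minimal sketch
> with the HBase shell (0.90/0.92-era syntax; the table and CF names here
> are placeholders):
>
> ```
> disable 'mytable'
> alter 'mytable', {NAME => 'cf1', BLOOMFILTER => 'ROW'}
> enable 'mytable'
> ```
>
> ROW blooms filter by row key; ROWCOL also keys on column, at the cost of
> a larger bloom.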
>
> >Finally, the schema design is guided by the ACID property of a row.  We
> >have only 2 CFs; both CFs hold different volumes of data even though
> >they are updated with approximately the same amount of data (cells
> >updated vs. cells created).
>
> Note that 0.90 only had row-based write atomicity.  HBASE-2856 is
> necessary for row-based read atomicity across column families.
>
>
