Hi Dhruba,

> another bottleneck that I am seeing is that all transactions need to come to
> a halt when rolling hlogs, the reason being that all transactions need to be
> drained before we can close the hlog

I didn't measure the rate, but I'd expect it happens quite often given a
constant as-many-writes-as-we-can-push workload. Even so, the performance
limitation in the memstore, suggested by the impact of flushes, was the
dominant factor here.
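If I follow, the roll is a drain-and-swap gate, something like the sketch
below. The names and the read-write lock are my guess at the pattern, not
the actual HLog code:

  import java.io.IOException;
  import java.util.concurrent.locks.ReentrantReadWriteLock;

  // Sketch of the drain-and-swap pattern described above. Writer and
  // openNewWriter() are illustrative stand-ins, not HLog internals.
  abstract class RollableLog {
    interface Writer {
      void append(byte[] edit) throws IOException;
      void close() throws IOException;
    }

    private final ReentrantReadWriteLock rollLock = new ReentrantReadWriteLock();
    private Writer current;

    abstract Writer openNewWriter() throws IOException;

    void append(byte[] edit) throws IOException {
      rollLock.readLock().lock();   // many transactions append concurrently
      try {
        current.append(edit);
      } finally {
        rollLock.readLock().unlock();
      }
    }

    void roll() throws IOException {
      // Acquiring the write lock blocks until every in-flight append has
      // drained; that wait is exactly the stall you describe.
      rollLock.writeLock().lock();
      try {
        if (current != null) {
          current.close();          // safe: no appender holds the old writer
        }
        current = openNewWriter();  // swap in the new hlog
      } finally {
        rollLock.writeLock().unlock();
      }
    }
  }

If that's roughly right, then with writers saturated, every roll serializes
the whole write path through that gate.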
> InitialOccupancyFactor
> what is the size of ur NewGen?

This is what I'm testing with:

  -Xmx4000m -Xms4000m -Xmn400m \
  -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 \
  -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseParNewGC \
  -XX:+CMSParallelRemarkEnabled -XX:MaxGCPauseMillis=100 \
  -XX:+UseMembar

> how many client threads

20 single-threaded mapreduce clients

> and how many region server handler threads are u using?

100 per rs

> For increment operation, I introduced the concept of a
> ModifyableKeyValue whereby every increment actually updates
> the same KeyValue record if found in the MemStore (instead
> of creating a new KeyValue record and re-inserting
> it into memstore).

Patch! Patch! Patch! :-) :-)

(I'd ... consider ... trying it.)
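Until the patch shows up, here is how I read the idea, a hypothetical sketch
with string keys standing in for real KeyValues, not Dhruba's actual code:

  import java.util.concurrent.ConcurrentSkipListMap;
  import java.util.concurrent.atomic.AtomicLong;

  // Sketch of the update-in-place idea: mutate the value of an existing
  // memstore entry instead of inserting a new KeyValue per increment.
  class InPlaceIncrementStore {
    private final ConcurrentSkipListMap<String, AtomicLong> memstore =
        new ConcurrentSkipListMap<>();

    long increment(String row, long amount) {
      AtomicLong cell = memstore.get(row);
      if (cell == null) {
        // First increment for this row: insert once; racing inserts
        // converge on whichever entry won via putIfAbsent.
        AtomicLong fresh = new AtomicLong(0);
        AtomicLong prior = memstore.putIfAbsent(row, fresh);
        cell = (prior != null) ? prior : fresh;
      }
      // Later increments mutate the existing entry in place: no remove and
      // re-insert, so the skip list stops growing for hot rows.
      return cell.addAndGet(amount);
    }
  }

The win would be that hot counters stop growing the skip list, so n stays
bounded and O(log n) stays cheap.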
- Andy

--- On Sat, 3/19/11, Dhruba Borthakur <[email protected]> wrote:

> From: Dhruba Borthakur <[email protected]>
> Subject: Re: minor compaction bug (was: upsert case performance problem)
> To: [email protected], [email protected]
> Date: Saturday, March 19, 2011, 10:24 PM
>
> Hi andrew,
>
> I have been doing a set of experiments for the last one month on a workload
> that is purely "increments". I too have seen that the performance drops when
> the memstore fills up. My guess is that although the complexity is O(logn),
> still when n is large the time needed to insert/lookup could be large. It
> would have been nice if it were a hashMap instead of a tree, but the
> tradeoff is that we would have to sort it while writing to hfile.
>
> another bottleneck that I am seeing is that all transactions need to come to
> a halt when rolling hlogs, the reason being that all transactions need to be
> drained before we can close the hlog. how frequently is this occurring in ur
> case?
>
> how much GC are u seeing and what is the InitialOccupancyFactor for the JVM,
> I have set InitialOccupancyFactor to 40 in my case. what is the size of ur
> NewGen?
>
> how many client threads and how many region server handler threads are u
> using?
>
> For increment operation, I introduced the concept of a ModifyableKeyValue
> whereby every increment actually updates the same KeyValue record if found
> in the MemStore (instead of creating a new KeyValue record and re-inserting
> it into memstore).
>
> I am very interested in exchanging notes and what else u find,
> thanks,
> dhruba
>
> On Sat, Mar 19, 2011 at 11:15 AM, Andrew Purtell <[email protected]> wrote:
>
> > See below.
> >
> > Doing some testing on that I let the mapreduce program and an hbase shell
> > flushing every 60 seconds run overnight. The result on two tables was:
> >
> > 562 store files!
> >
> >   ip-10-170-34-18.us-west-1.compute.internal:60020 1300494200158
> >     requests=51, regions=1, usedHeap=1980, maxHeap=3960
> >     akamai.ip,,1300494562755.1b0614eaecca0d232d7315ff4a3ebb87.
> >       stores=1, storefiles=562, storefileSizeMB=310,
> >       memstoreSizeMB=1, storefileIndexSizeMB=2
> >
> > 528 store files!
> >
> >   ip-10-170-49-35.us-west-1.compute.internal:60020 1300494214101
> >     requests=79, regions=1, usedHeap=1830, maxHeap=3960
> >     akamai.domain,,1300494560898.af85225ae650574dbc4caa34df8b6a35.
> >       stores=1, storefiles=528, storefileSizeMB=460,
> >       memstoreSizeMB=3, storefileIndexSizeMB=3
> >
> > ... so that killed performance after a while ...
> >
> > Here's something else.
> >
> > - Andy
> >
> > --- On Sat, 3/19/11, Andrew Purtell <[email protected]> wrote:
> >
> > From: Andrew Purtell <[email protected]>
> > Subject: upsert case performance problem (doubts about
> > ConcurrentSkipListMap)
> > To: [email protected]
> > Date: Saturday, March 19, 2011, 11:10 AM
> >
> > I have a mapreduce task put together for experimentation which does a lot
> > of Increments over three tables and Puts to another. I set writeToWAL to
> > false. My HBase includes the patch that fixes serialization of writeToWAL
> > for Increments. MemstoreLAB is enabled but is probably not a factor, but
> > still need to test to exclude it.
> >
> > After starting a job up on a test cluster on EC2 with 20 mappers over 10
> > slaves I see initially 10-15K/ops/sec/server. This performance drops over
> > a short time to stabilize around 1K/ops/sec/server. So I flush the tables
> > with the shell. Immediately after flushing the tables, performance is back
> > up to 10-15K/ops/sec/server. If I don't flush, performance remains low
> > indefinitely. If I flush only the table receiving the Gets, performance
> > remains low.
> >
> > If I set the shell to flush in a loop every 60 seconds, performance
> > repeatedly drops during that interval, then recovers after flushing.
> >
> > When Gary and I went to NCHC in Taiwan, we saw a guy from PhiCloud present
> > something similar to this regarding 0.89DR. He measured the performance of
> > the memstore for a get-and-put use case over time and graphed it, looked
> > like time increased on a staircase with a trend to O(n). This was a
> > surprising result. ConcurrentSkipListMap#put is supposed to run in
> > O(log n). His workaround was to flush after some fixed number of
> > gets+puts, 1000 I think. At the time we weren't sure what was going on
> > given the language barrier.
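That staircase is what I want to reproduce in isolation first. A standalone
probe along these lines should show it, a sketch that assumes each upsert
adds a fresh entry (as new KeyValues with new timestamps would), not the
memstore code itself:

  import java.util.Random;
  import java.util.concurrent.ConcurrentSkipListMap;

  // Probe for the staircase effect: time fixed-size batches of get-then-put
  // as the map grows. Every put uses a fresh key, mimicking how each upsert
  // inserts a new KeyValue into the memstore.
  public class UpsertProbe {
    public static void main(String[] args) {
      ConcurrentSkipListMap<Long, Long> map = new ConcurrentSkipListMap<>();
      Random rnd = new Random(42);
      long next = 0;
      final int batch = 50_000;
      for (int b = 0; b < 30; b++) {
        long start = System.nanoTime();
        for (int i = 0; i < batch; i++) {
          long probe = (next == 0) ? 0 : (long) (rnd.nextDouble() * next);
          map.get(probe);        // the "get" half of get-and-put
          map.put(next++, 1L);   // the "put" half always adds a new entry
        }
        long ms = (System.nanoTime() - start) / 1_000_000;
        // O(log n) predicts a slow, flattening rise per batch; a staircase
        // toward O(n) would match the PhiCloud graph.
        System.out.printf("batch %d: size=%d, %d ms%n", b, map.size(), ms);
      }
    }
  }

If per-batch time climbs like that graph even though each operation should be
O(log n), the problem is something beyond the skip list's asymptotics.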
> > Sound familiar?
> >
> > I don't claim to really understand what is going on, but need to get to
> > the bottom of this. Going to look at it in depth starting Monday.
> >
> > - Andy
>
> --
> Connect to me at http://www.facebook.com/dhruba