Let's start with...

Are you using HTable directly or are you going through TableOutputFormat?

If the former, do you use the write buffer?

Are you inserting into multiple families?

Are you using compression?

Did you take a look at the region server logs?

If so, do you see a lot of messages along the lines of "Blocking ..."?

Are you monitoring the GCs?

If so, do you see some pauses longer than a second?

Thx!

J-D

On Wed, Nov 24, 2010 at 11:00 AM, Tim Robertson
<timrobertson...@gmail.com> wrote:
> Hi all,
>
> I am running an MR job that is loading an HBase table in the reduce,
> and I am seeing hopeless performance - 10 million records of <1Kb in 2
> hours so far.
>
> Please bear in mind I am a software guy, so go easy ;) but here is what
> I know so far:
>
> (http://code.google.com/p/gbif-occurrencestore/wiki/ClusterConfig
> describes the cluster, and currently 40 reducers are running, all on
> CDH3)
>
> - RS and TT all have load averages way down at 1-2 max
> - RS and TT CPUs are 398% idle on quad cores, 1598% idle on hyper
> threading dual quads
> - RS heap is 4G
> - there seems no iowait anywhere
> - free -m shows "swap used 0" on all machines if I am reading it correctly
>
> Can anyone please suggest where I can go digging?  Please don't assume
> I have looked at the basics - I'm learning as much as I can as I go.
>
> Thanks,
> Tim
>
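For scale, the figures quoted above can be turned into rough rates. This is only a back-of-the-envelope sketch: it assumes "<1Kb" means at most 1024 bytes per record and that the load is spread evenly across the 40 reducers, neither of which the message confirms.

```python
# Figures taken from the quoted message; variable names are illustrative.
records = 10_000_000          # records loaded so far
elapsed_seconds = 2 * 60 * 60 # "2 hours so far"
reducers = 40                 # reducers currently running
max_record_bytes = 1024       # assumption: "<1Kb" read as an upper bound of 1024 bytes

# Overall insert rate across the whole job.
total_rate = records / elapsed_seconds

# Per-reducer rate, assuming an even split (an assumption, not a measurement).
per_reducer_rate = total_rate / reducers

# Upper bound on payload throughput, since every record is under 1024 bytes.
max_bytes_per_second = total_rate * max_record_bytes

print(f"overall:     {total_rate:.0f} records/s")
print(f"per reducer: {per_reducer_rate:.1f} records/s")
print(f"payload:     under {max_bytes_per_second / 1e6:.2f} MB/s")
```

Under those assumptions the whole job is moving less than 1.5 MB/s of payload, which is consistent with the idle CPUs and absent iowait reported above.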
