Re: Problems with write performance (25kb rows)

Dmitriy Lyfar Thu, 14 Jan 2010 05:21:32 -0800

Hi,

> Speed still the same (about 1K rows per second).
> >
>
> This seems low for your 6 node cluster.
>
> If you look at the servers, are they cpu or io bound-up in any way?
>
> How many clients you have running now?
>


Now I'm running 1-2 clients in parallel. If I run more, the timings grow.
Also, I do not use the namenode machine as a datanode or as a regionserver.
It runs only namenode/secondarynn/master/zk.


>
> This is not a new table right?  (I see there is an existing table in your
> cluster looking at the regionserver log).   Its an existing table of many
> regions?
>

Yes. I have 7 test tables. Each client randomly selects the table it will use
at start.
Now, after some tests, I have about 800 regions per region server and 7
tables.


>
> You have upped the handlers in hbase.  Have you done same for datanodes (In
> case we are bottlenecking here).
>

I've updated this setting for hadoop also. As I understand it, if something
is wrong with the number of handles, I will get a TooManyOpenFiles exception
and the datanode will stop working.
All works fine for now. I've attached metrics from one of the datanodes. On
the other nodes we have almost the same picture. Please look at the throughput
picture. It seems illogical to me that a node has almost equal inbound and
outbound traffic (render.png). These pictures were taken while running two
clients, and then, after a break, I ran one client.
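
One possible reading of the near-equal inbound and outbound traffic is the
HDFS write pipeline: each datanode that receives a block forwards it to the
next replica. The rough per-node model below is only a sketch. It assumes
the default replication factor of 3, a regionserver colocated with each
datanode (so the first replica is local), evenly spread load, and no reads.
The function name and the hdfs_writes parameter (how many times each client
byte ends up written to HDFS, e.g. once for the log and once for the flush)
are hypothetical, not taken from HBase or Hadoop.

```python
def node_traffic(client_bytes, replication=3, hdfs_writes=2):
    """Rough per-node network traffic for a write-only load.

    Assumptions (not measured on the cluster in this thread):
      - load is spread evenly, so every node looks the same;
      - the first HDFS replica is written locally (no network hop);
      - each of the remaining (replication - 1) replicas costs one
        network send and one network receive somewhere in the cluster;
      - hdfs_writes is how many times each client byte is written to
        HDFS (log, flush, compactions); the value 2 is a guess.
    """
    forwarded = client_bytes * hdfs_writes * (replication - 1)
    inbound = client_bytes + forwarded   # from clients + from other datanodes
    outbound = forwarded                 # to other datanodes
    return inbound, outbound


if __name__ == "__main__":
    inbound, outbound = node_traffic(1.0)
    print(f"inbound={inbound}, outbound={outbound}, "
          f"ratio={inbound / outbound:.2f}")
```

Under these assumptions, outbound traffic approaches inbound traffic as
hdfs_writes grows, because replication traffic dominates the client writes.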


> > Random ints plays a role of row keys now (i.e. uniform random
> > distribution on (0, 100 * 1000)).
> > What do you think is 5GB for hbase and 2GB for hdfs enough?
> >
> Yes, that should be good.  Writing you are not using that memory in
> regionserver though, maybe you should go with bigger regions if you have
> 25k cells.  You using compression?
>

Yes, 25Kb is important, but I think in the production system we will have
70-80% 5-10Kb rows, about 20% 25Kb rows and 10% > 25Kb rows. I'm not using
any compression for columns because I was concerned about throughput. But I
plan to use compression once I can achieve 80-90 Mb/sec for this test.
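
To put the numbers in this thread side by side, here is a small arithmetic
sketch. It assumes that "Kb" and "Mb" above mean kilobytes and megabytes
(1024-based) and that every row is 25 KB; the function names are only
illustrative.

```python
KB = 1024
MB = 1024 * KB


def throughput_mb_per_sec(rows_per_sec, row_size_kb):
    """Payload throughput in MB/s for a given row rate and row size."""
    return rows_per_sec * row_size_kb * KB / MB


def rows_per_sec_needed(target_mb_per_sec, row_size_kb):
    """Row rate required to reach a target payload throughput."""
    return target_mb_per_sec * MB / (row_size_kb * KB)


if __name__ == "__main__":
    # Current rate reported in this thread: about 1K rows/s of 25 KB rows.
    print(f"current: {throughput_mb_per_sec(1000, 25):.1f} MB/s")
    # Target mentioned above: 80-90 MB/s.
    for target in (80, 90):
        print(f"{target} MB/s needs "
              f"{rows_per_sec_needed(target, 25):.0f} rows/s")
```

With these assumptions, 1K rows per second is roughly 24 MB/s of payload,
so the 80-90 MB/s target needs between three and four times the current
row rate.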


>
> I took a look at your regionserver log.  Its just after an open of the
> regionserver.  I see no activity other than the opening of a few regions.
>  These regions do happen to have alot of store files so we're starting up
> compactions but that all should be fine.  I'd be interested in seeing a log
> snippet from a regionserver under load.
>

OK, there are some tests running now which I think will be interesting. I'll
provide regionserver logs a bit later.
Thank you for your help!

-- 
Regards, Lyfar Dmitriy
