Try iostat, or if you are running it, try Ganglia.
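If iostat or Ganglia aren't handy, the raw counters are also in /proc/diskstats; below is a minimal sketch of reading them (field layout per the kernel's diskstats documentation; the device name `sda` and the sample line are assumptions — in practice you would sample the counter twice and diff to get throughput):

```python
# Sketch: read the "sectors read" counter for a device from /proc/diskstats.
# After the major/minor/name columns, field 3 is sectors read (1 sector = 512 bytes).
# Sample twice, subtract, divide by the interval to get bytes/sec from disk.

def sectors_read(diskstats_text, device):
    """Return the cumulative sectors-read counter for `device`."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # fields: major, minor, name, reads completed, reads merged, sectors read, ...
        if len(fields) >= 6 and fields[2] == device:
            return int(fields[5])
    raise ValueError("device not found: %s" % device)

# Hypothetical sample line; on a live box use open("/proc/diskstats").read().
SAMPLE = "   8       0 sda 4000 120 81234 950 200 10 5120 40 0 700 990"
print(sectors_read(SAMPLE, "sda") * 512)  # -> 41591808 bytes read so far
```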
On Mon, May 30, 2011 at 10:07 PM, Harold Lim wrote:
> How do I know how much data is moving from the disk?
>
It may make it better.
We should have an update shortly that will allow multiple machines to
participate in generating load. A single YCSB is sufficient to stress a few
nodes but once you get to 10 or more (especially with MapR underneath) you
really need a cluster to generate the load.
The sync
My setup seems to have a lot of regions with no data that just keep
accumulating over time. Here are some details:
I have time-series data (created by opentsdb) being inserted into hbase
every minute. Since the data has little value after say 15 days, I go
ahead and delete all old data.
When I lo
Hi Ted,
I haven't tried with bigger instances yet. Those are my next steps.
I also see that you have a forked version of YCSB, will that make my
performance better?
Thanks,
Harold
--- On Tue, 5/31/11, Ted Dunning wrote:
> From: Ted Dunning
> Subject: Re: How to improve HBase throughput
Hi Ted,
I read all fields in the record. I was trying to get performance similar to
what's reported in the YCSB paper.
How do I know how much data is moving from the disk?
-Harold
--- On Tue, 5/31/11, Ted Dunning wrote:
> From: Ted Dunning
> Subject: Re: How to improve HBase throughput with YCSB?
> To: user@
What happens if you increase heap space to 8GB on an m1.xlarge or
m2.2xlarge?
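For reference, the heap bump being suggested is set in hbase-env.sh (the 8000 MB value mirrors the 8GB suggestion; everything else stock):

```
# conf/hbase-env.sh
export HBASE_HEAPSIZE=8000
```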
On Mon, May 30, 2011 at 8:50 PM, Harold Lim wrote:
> Hi Lohit,
>
> I'm running HBase 0.90.2. 10 x ec2 m1.large instances. I set the heap size
> to 4GB and handler count for hbase, and dfs to 100. I also set the dfs max
> xcievers to 4096
How large are the reads?
Have you tried this on a better instance type, as was suggested a bit
ago?
How much data is moving from the disks?
On Mon, May 30, 2011 at 8:46 PM, Harold Lim wrote:
> Hi Ted,
>
> It's a pure random read operation.
>
>
> -Harold
> --- On Mon, 5/30/11, Ted Dunning
Hi Andrew,
Is this a normal behavior in m1.large instances? Does m1.xlarge work?
I am using the local storage of the instances (ephemeral disk in EC2
terminology).
I picked m1.large because that was the "smallest" instance type listed with
high I/O performance.
Thanks,
Harold
---
Hi Lohit,
I'm running HBase 0.90.2 on 10 x EC2 m1.large instances. I set the heap size
to 4GB and the handler counts for HBase and DFS to 100. I also set the DFS max
xcievers to 4096.
I'm running a pure random read YCSB workload.
I also tried running multiple clients from multiple ec2 instances, but
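For anyone reproducing this, the settings Harold describes map to configuration entries roughly like the following (property names are the standard 0.90-era ones; the values are his):

```
<!-- hbase-site.xml -->
<property>
  <name>hbase.regionserver.handler.count</name>
  <value>100</value>
</property>

<!-- hdfs-site.xml -->
<property>
  <name>dfs.datanode.handler.count</name>
  <value>100</value>
</property>
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>
```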
Hi Ted,
It's a pure random read operation.
-Harold
--- On Mon, 5/30/11, Ted Dunning wrote:
> From: Ted Dunning
> Subject: Re: How to improve HBase throughput with YCSB?
> To: user@hbase.apache.org
> Date: Monday, May 30, 2011, 3:07 PM
> What kind of operations?
>
> On Mon, May 30, 2011 at 9:
I have filed an issue, and I'll commit a patch soon (I still need to do some
testing on the patch first).
Issue Address: https://issues.apache.org/jira/browse/HBASE-3937
It is indeed related to HBASE-3789.
I'm still looking into this issue.
I'll add any further discussion to the comments there.
Th
Sounds like -c is a little flaky. Glad you figured it out, Geoff (eventually).
St.Ack
On Sat, May 28, 2011 at 12:01 PM, Geoff Hendrey wrote:
> Never got the "-c" argument to work, but when I setup the following
> environment vars, it was happy:
>
> export HBASE_HOME
> added hbase conf dir to CLASSP
Thanks for digging in, Jean. Your diagnosis below looks right to me --
the bit about the master trying to reset OFFLINE before reassigning. It
will fail if a regionserver has set it OPENING in the meantime. How
do you propose to handle the case where we fail setting it to OFFLINE
because RS1 has alre
Have you looked at deferred flushing? It's an attribute you set on
your table. You then say how often to run sync using
'hbase.regionserver.optionallogflushinterval'. Default is sync every
second.
St.Ack
On Sat, May 28, 2011 at 6:47 AM, Qing Yan wrote:
> Well, I realized myself RS flush to HDF
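A sketch of the two knobs St.Ack mentions above (the table name 'mytable' and the 1000 ms value are illustrative; DEFERRED_LOG_FLUSH is the table attribute, and the property controls how often the sync runs):

```
hbase> disable 'mytable'
hbase> alter 'mytable', METHOD => 'table_att', DEFERRED_LOG_FLUSH => 'true'
hbase> enable 'mytable'
```

and in hbase-site.xml:

```
<property>
  <name>hbase.regionserver.optionallogflushinterval</name>
  <value>1000</value>
</property>
```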
See http://hbase.apache.org/book.html#performance and the notes over
in the other thread, "How to improve HBase throughput with YCSB?"
St.Ack
On Sun, May 29, 2011 at 2:28 PM, Sean Bigdatafun
wrote:
> For pure random read, I do not think there exists a good way to improve
> latency. Essentially, e
Odd. I don't see the regionserver checking into the master (maybe
that's the way it is in pseudo-distributed and I just forgot). Can you
paste more of the master log? I don't see the regionserver coming in in the
snippet you've pasted, so not sure how it's registering itself (I see
the timeout when we tr
Xiyun:
Take a look at https://issues.apache.org/jira/browse/HBASE-3871 for parallel
HFile splitting.
On Mon, May 30, 2011 at 6:31 PM, Gan, Xiyun wrote:
> I used BulkLoad to import data. The step of writing HFiles using m/r is
> fast, but the step of loading HFiles to hbase takes lots of time. It
Thanks a lot
Is there any suggestion about the "Region is not online" exception?
On Tue, May 31, 2011 at 9:36 AM, Joey Echeverria wrote:
> If you have a well defined key space, you'll get better performance if
> you pre-split your table and use the TotalOrderPartitioner with your
> MapReduce job.
>
If you have a well defined key space, you'll get better performance if
you pre-split your table and use the TotalOrderPartitioner with your
MapReduce job.
You can see an example of pre-splitting here:
http://hbase.apache.org/book.html#precreate.regions.
-Joey
On Mon, May 30, 2011 at 9:31 PM, Gan
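As a small sketch of computing split points for pre-creation (the fixed-width hex key format is an assumption about the row-key scheme; the resulting keys would be handed to HBaseAdmin.createTable or written to the TotalOrderPartitioner's partition file):

```python
# Sketch: compute N-1 evenly spaced boundary keys for N regions over a
# fixed-width hex key space. The 8-hex-digit key format is an assumption;
# adapt the width and key space to your actual row-key scheme.

def split_keys(num_regions, key_width=8, space=16 ** 8):
    """Boundary keys between regions; len(result) == num_regions - 1."""
    step = space // num_regions
    return ["%0*x" % (key_width, i * step) for i in range(1, num_regions)]

print(split_keys(4))  # -> ['40000000', '80000000', 'c0000000']
```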
I used BulkLoad to import data. The step of writing HFiles using m/r is
fast, but the step of loading HFiles to hbase takes lots of time. It
says HFile at ** no longer fits inside a single region. Splitting
Even worse, sometimes it throws a "Region is not online" exception.
Thanks
On Fri, Ma
The hypervisor steals a lot of CPU time from m1.large instances. You should be
using c1.xlarge instances.
Are you using local storage or EBS?
Be aware that I/O performance on EC2 for any system is lower than if you are
using real hardware, significantly so if not using one of the instance type
Hello Harold,
Can you share with us what kind of throughput you are seeing?
Number of ops/sec and read latency, that is.
Also, what version of HBase are you running?
Thanks,
Lohit
2011/5/30 Harold Lim
> Hi All,
>
> I have an HBase cluster on ec2 m1.large instance (10 region servers). I'm
Hi Hari,
I am experiencing the same problem as you, I think. (My system is also
Ubuntu 11.04.)
Please take a look at my thread and see if it is the same problem you are
experiencing: topic "0.90.1 HMaster malfunction in pseudo-distributed mode".
Hopefully this question gets answered after the
What kind of operations?
On Mon, May 30, 2011 at 9:43 AM, Harold Lim wrote:
> Hi All,
>
> I have an HBase cluster on ec2 m1.large instance (10 region servers). I'm
> trying to run a read-only YCSB workload. It seems that I can't get a good
> throughput. It saturates to around 600+ operations per
Hi All,
I have an HBase cluster on ec2 m1.large instance (10 region servers). I'm
trying to run a read-only YCSB workload. It seems that I can't get a good
throughput. It saturates to around 600+ operations per second.
My dataset is around 200GB (~1k+ regions). Running major compaction and als
Hi,
I am trying to set up HBase in pseudo-distributed mode on one of the
machines, and I am getting this error when I try to use the HBase shell. If I try
the list command, it just hangs.
hbase(main):002:0> create 't1', 'f1'
ERROR: org.apache.hadoop.hbase.NotAllMetaRegionsOnlineException:
org.apache.