Thank you for the pointer. I'm not sure if this is the bug I was encountering.
This particular bug points to a problem with how load was calculated. The
problem I was experiencing seemed to be a real issue that affected performance,
not just reporting.
They published a fix on 20100827, but
Ah, that explains a lot.
Thanks for the tips JGray! I shall do that ASAP.
On Thu, Sep 2, 2010 at 12:10 PM, Andrew Purtell wrote:
> From: Bradford Stephens
> A small improvement, but nowhere near what I'm used to,
> even from vague memories of old clusters on EC2.
Those days are gone.
Used to be m1.small provided reasonable performance for some apps. Now the
common comment is that the platform is simply too oversubscribed.

> -----Original Message-----
> From: Bradford Stephens [mailto:bradfordsteph...@gmail.com]
> Sent: Wednesday, September 01, 2010 6:58 PM
> To: user@hbase.apache.org
> Subject: Re: Slow Inserts on EC2 Cluster
On the full data set (10 reducers), speeds are about 100k/minute (WAL
Disabled). Still much slower than I'd like, but I'll take it over the
former :)
On Wed, Sep 1, 2010 at 5:59 PM, Ryan Rawson wrote:
Yes exactly, column families have the same performance profile as
tables. 12 CF = 12 tables.
-ryan
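Ryan's point can be sketched as a toy data-layout example (Python, not HBase client code; the single family `d` and the qualifier scheme are hypothetical, not anything from the thread): since each column family gets its own memstore and store files, a 12-family table flushes like 12 tables, and one common workaround is to keep a single family and fold the old family names into the column qualifiers.

```python
# Sketch of collapsing many column families into one (hypothetical
# row layout, not real HBase code). Each HBase family has its own
# memstore and store files, so 12 families flush like 12 tables.

def collapse_families(row):
    """Map a {family: {qualifier: value}} row onto a single family 'd',
    encoding the old family name into the qualifier."""
    flat = {}
    for family, columns in row.items():
        for qualifier, value in columns.items():
            flat[f"{family}:{qualifier}"] = value
    return {"d": flat}

row = {"meta": {"name": "x"}, "stats": {"count": "7"}}
print(collapse_families(row))
# {'d': {'meta:name': 'x', 'stats:count': '7'}}
```

The trade-off is that everything now shares one set of store files, so per-family tuning (compression, TTL) and cheap single-family scans are lost.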
On Wed, Sep 1, 2010 at 5:56 PM, Bradford Stephens wrote:
Good call JD! We've gone from 20k inserts/minute to 200k. Much
better! I still think it's slower than I'd want by about one OOM, but
it's progress.
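In round numbers (the 2M/minute target is only my reading of "one OOM", not a figure from the thread):

```python
# Insert rates from the thread, in inserts/minute.
before = 20_000          # before JD's suggestion
after = 200_000          # after
print(after // before)   # a 10x jump so far

# "slower than I'd want by about one OOM" suggests a target around:
target = after * 10
print(target)
```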
Since we're populating 12 families, I guess we're seeking for 12 files
on each write. Not pretty. I'll look at the customer and see if they
really ha
There are a couple of things happening here, and some solutions:
- don't flush based on region size, only on family/store size.
- do what the Bigtable paper says and merge the smallest file with
memstore while flushing, thus keeping the net number of files low.
The latter would probably benefit fro
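The first suggestion can be sketched as a policy in Python (pseudocode of the idea only; the 16 MB per-store threshold is made up, not an HBase default):

```python
# Sketch of flushing by family/store size rather than flushing every
# family when the aggregate region size crosses a limit. The threshold
# is hypothetical.

STORE_FLUSH_THRESHOLD = 16 * 1024 * 1024  # bytes, made-up figure

def stores_to_flush(memstores):
    """Return only the families whose memstore is big enough to be
    worth writing out -- cold families produce no tiny files."""
    return [family for family, size in memstores.items()
            if size >= STORE_FLUSH_THRESHOLD]

memstores = {
    "cf_hot":  20 * 1024 * 1024,
    "cf_warm": 16 * 1024 * 1024,
    "cf_cold": 1 * 1024 * 1024,
}
print(stores_to_flush(memstores))  # ['cf_hot', 'cf_warm']
```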
Yeah, those families are all needed -- but I didn't realize the files
were so small. That's odd -- and you're right, that'd certainly throw
it off. I'll merge them all and see if that helps.
On Wed, Sep 1, 2010 at 5:24 PM, Jean-Daniel Cryans wrote:
Took a quick look at your RS log, it looks like you are using a lot of
families and loading them pretty much at the same rate. Look at lines
that start with:
INFO org.apache.hadoop.hbase.regionserver.Store: Added ...
And you will see that you are dumping very small files on the
filesystem, on ave
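Back-of-the-envelope for why the flushed files come out so small (the 64 MB region flush trigger, the 0.20-era default for `hbase.hregion.memstore.flush.size`, and the even write distribution across families are both assumptions):

```python
# Rough arithmetic: a region-wide flush triggered at 64 MB, spread
# evenly across 12 families, writes 12 tiny files instead of one
# decent-sized one. Only "12 families" and "22 GB" come from the
# thread; the rest is assumed.

GB = 1024 ** 3
MB = 1024 ** 2

flush_trigger = 64 * MB   # assumed region memstore flush size
families = 12
dataset = 22 * GB         # dataset size from the thread

per_file = flush_trigger // families
flushes = dataset // flush_trigger
total_files = flushes * families

print(per_file // MB)   # ~5 MB per store file
print(flushes)          # 352 region flushes for the full load
print(total_files)      # 4224 small files before compaction
```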
'allo,
I changed the cluster from m1.large to c1.xlarge -- we're getting
about 4k inserts /node / minute instead of 2k. A small improvement,
but nowhere near what I'm used to, even from vague memories of old
clusters on EC2.
I also stripped all the Cascading from my code and have a very basic
raw
> From: Gary Helmling
>
> If you're using AMIs based on the latest Ubuntu (10.04),
> there's a known kernel issue that seems to be causing
> high loads while idle. More info here:
>
> https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/574910
Seems best to avoid using Lucid on EC2 for now, th
> From: Matthew LeMieux
> I'm starting to find that EC2 is not reliable enough to support
> HBase.
[...]
> (I've been using m1.large and m2.xlarge running CDH3)
I personally don't use EC2 for anything more than on-demand ad hoc testing, but
I do know of successful deployments there.
However, I
On Wed, Sep 1, 2010 at 7:24 AM, Matthew LeMieux wrote:
> I'm starting to find that EC2 is not reliable enough to support HBase. I'm
> running into 2 things that might be related:
>
> 1) On idle machines that are apparently doing nothing (reports of <3% CPU
> utilization, no I/O wait) the load i
IO so that it cannot write to its
transaction log and that is what is slowing it down.
JG
> -Original Message-
> From: Matthew LeMieux [mailto:m...@mlogiciels.com]
> Sent: Wednesday, September 01, 2010 7:25 AM
> To: user@hbase.apache.org
> Subject: Re: Slow Inserts on EC2 Cluster
Wow, thanks. I didn't consider that ... I try to avoid the cloud if at
all possible :)
Cheers,
B
On Wed, Sep 1, 2010 at 4:14 AM, Andrew Purtell wrote:
> From: Bradford Stephens
> I'm banging my head against some perf issues on EC2. I'm
> using .20.6 on ASF hadoop .20.2, and tweaked the ec2 hbase
> scripts to handle the new version.
>
> I'm trying to insert about 22G of data across nodes on EC2
> m1.large instances [...]
c1.xlarge provides (bare
Hey guys,
I'm banging my head against some perf issues on EC2. I'm using .20.6
on ASF hadoop .20.2, and tweaked the ec2 hbase scripts to handle the
new version.
I'm trying to insert about 22G of data across nodes on EC2 m1.large
instances. I'm getting speeds of about 1200 rows/minute. It seems li