Are you running with a high enough xciever count? Any failures in your datanode logs?
On Wed, Sep 1, 2010 at 10:51 PM, Stack wrote:
> Vidhya:
>
> Could you use the hadoop 0.20-append branch on your cluster as per
> Todd's suggestion?
>
> St.Ack
>
> On Wed, Sep 1, 2010 at 12:22 PM, Vidhyashankar Venkatarama
Your networking looks borked (where does it get 203.14.166.86 from?).
Figure that out first.
St.Ack
On Wed, Sep 1, 2010 at 12:10 PM, Shuja Rehman wrote:
> Hegner,
>
> If I change the /etc/hosts file and put the global IP there, then the namenode
> of Hadoop does not start and gives the following error
>
> java.net.
Vidhya:
Could you use the hadoop 0.20-append branch on your cluster as per
Todd's suggestion?
St.Ack
On Wed, Sep 1, 2010 at 12:22 PM, Vidhyashankar Venkataraman
wrote:
> The RS log is filled with exceptions like the one I have specified below.
>
> Vidhya
>
> RS log:
>
> 2010-09-01 18:23:55,88
On Wed, Sep 1, 2010 at 5:49 PM, Sharma, Avani wrote:
> That email was just informational. Below are the details on my cluster - let
> me know if more is needed.
>
> I have 2 hbase clusters setup
> - for production, 6 node cluster, 32G, 8 processors
> - for dev, 3 node cluster , 16GRA
Sounds like 2047 is not enough. Up it again. 4k?
St.Ack
2010/9/1 xiujin yang :
>
> Thank you J-D.
>
>
> I've checked two datanode log and found the same error. "exceeds the limit
> of concurrent xcievers 2047"
>
>
> [2010-08-31
> 10:43:26][error][org.apache.hadoop.hdfs.server.datanode.dataxce
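The limit mentioned above lives in the datanodes' hdfs-site.xml. A minimal sketch, assuming the 4k figure suggested by Stack (note the property name really is spelled "xcievers"); the datanodes need a restart for the change to take effect:

<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>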
Been doing lots of importing recently. There are two easy ways to get big
performance boosts.
The first is HFileOutputFormat. It now works with existing tables.
We consistently see 10X+ performance this way versus the API.
If you must use the API, pre-create a bunch of regions for your table. You c
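For anyone wanting a concrete starting point for the HFileOutputFormat route, here is a minimal sketch assuming the 0.89-era configureIncrementalLoad API (class, table, and path names are made up; the mapper is omitted):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BulkImportJob {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "hfile-bulk-import");
    job.setJarByClass(BulkImportJob.class);
    // Your mapper (not shown) emits ImmutableBytesWritable row keys and Put values.
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(Put.class);
    HTable table = new HTable(conf, "mytable");              // existing target table
    HFileOutputFormat.configureIncrementalLoad(job, table);  // wires reducer, partitioner, output format
    FileOutputFormat.setOutputPath(job, new Path("/tmp/hfiles-out"));
    job.waitForCompletion(true);
    // Then load the generated HFiles into the table with the completebulkload
    // tool (org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles).
  }
}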
Thank you J-D.
I've checked two datanode log and found the same error. "exceeds the limit of
concurrent xcievers 2047"
[2010-08-31
10:43:26][error][org.apache.hadoop.hdfs.server.datanode.dataxcei...@5a809419][org.apache.hadoop.hdfs.server.datanode.dataxceiver.run(DataXceiver.java:131)]
Da
On the full data set (10 reducers), speeds are about 100k/minute (WAL
Disabled). Still much slower than I'd like, but I'll take it over the
former :)
On Wed, Sep 1, 2010 at 5:59 PM, Ryan Rawson wrote:
> Yes exactly, column families have the same performance profile as
> tables. 12 CF = 12 tables
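"WAL Disabled" above refers to skipping the write-ahead log on the client Puts; a minimal sketch assuming the 0.89-era client API (table, family, and row names are made up). It trades durability for import speed: edits still in the memstore are lost if a regionserver dies.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class NoWalPut {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(HBaseConfiguration.create(), "mytable");
    Put put = new Put(Bytes.toBytes("row-0001"));
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("value"));
    put.setWriteToWAL(false);  // skip the WAL: faster bulk imports, weaker durability
    table.put(put);
    table.flushCommits();      // push any buffered writes to the regionservers
  }
}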
Yes exactly, column families have the same performance profile as
tables. 12 CF = 12 tables.
-ryan
On Wed, Sep 1, 2010 at 5:56 PM, Bradford Stephens
wrote:
> Good call JD! We've gone from 20k inserts/minute to 200k. Much
> better! I still think it's slower than I'd want by about one OOM, but
>
Better formatting would probably be helpful.
`links http://localhost:60010/`
-Original Message-
From: Sharma, Avani [mailto:agsha...@ebay.com]
Sent: Wednesday, September 01, 2010 5:52 PM
To: user@hbase.apache.org
Subject: RE: regionserver skew
Links http://localhost:60010/ worked.
Good call JD! We've gone from 20k inserts/minute to 200k. Much
better! I still think it's slower than I'd want by about one OOM, but
it's progress.
Since we're populating 12 families, I guess we're seeking for 12 files
on each write. Not pretty. I'll look at the customer and see if they
really ha
There are a couple of things happening here, and some solutions:
- don't flush based on region size, only on family/store size.
- do what the Bigtable paper says and merge the smallest file with the
memstore while flushing, thus keeping the net number of files low.
The latter would probably benefit fro
Links http://localhost:60010/ worked.
My hbase cluster (Solaris machines) is firewalled and this is the best I could
do currently.
-Original Message-
From: Sharma, Avani [mailto:agsha...@ebay.com]
Sent: Monday, August 30, 2010 6:48 PM
To: user@hbase.apache.org
Subject: RE: regionserver
I was able to do the same. Thanks.
-Avani
-Original Message-
From: Matthew LeMieux [mailto:m...@mlogiciels.com]
Sent: Tuesday, August 31, 2010 11:47 AM
To: user@hbase.apache.org
Subject: Re: Initial and max heap size
I've found that the master doesn't need as much memory as the regions
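A sketch of how that split is usually expressed in conf/hbase-env.sh; the sizes are purely illustrative and the per-daemon *_OPTS hooks are an assumption about your hbase-env.sh, so check the file shipped with your version:

# conf/hbase-env.sh (illustrative values, not recommendations)
export HBASE_HEAPSIZE=4000                 # MB; default heap for HBase daemons
# export HBASE_MASTER_OPTS="-Xmx2g"        # master usually gets by with less
# export HBASE_REGIONSERVER_OPTS="-Xmx6g"  # regionservers hold the memstores and block cache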
That email was just informational. Below are the details on my cluster - let me
know if more is needed.
I have 2 hbase clusters setup
- for production, 6 node cluster, 32G, 8 processors
- for dev, 3 node cluster, 16G RAM, 4 processors
1. I installed hadoop 0.20.2 and hbase 0.20.3
Yeah, those families are all needed -- but I didn't realize the files
were so small. That's odd -- and you're right, that'd certainly throw
it off. I'll merge them all and see if that helps.
On Wed, Sep 1, 2010 at 5:24 PM, Jean-Daniel Cryans wrote:
> Took a quick look at your RS log, it looks lik
Took a quick look at your RS log, it looks like you are using a lot of
families and loading them pretty much at the same rate. Look at lines
that start with:
INFO org.apache.hadoop.hbase.regionserver.Store: Added ...
And you will see that you are dumping very small files on the
filesystem, on ave
'allo,
I changed the cluster from m1.large to c1.xlarge -- we're getting
about 4k inserts/node/minute instead of 2k. A small improvement,
but nowhere near what I'm used to, even from vague memories of old
clusters on EC2.
I also stripped all the Cascading from my code and have a very basic
raw
Unfortunately, I tried flushing the table, then disabling it, and then dropping it, and it
doesn't work.
I even wrote a utility to remove all records from the large table and then
do so,
and it doesn't work either, strangely. I looked at the web UI, and still see
many regions;
even the number of rows in the tabl
That version doesn't have the fixes I referred to, and disabling large
tables will likely hit the race condition.
J-D
On Wed, Sep 1, 2010 at 2:47 PM, Jinsong Hu wrote:
> Unfortunately, I tried flushing the table, then disabling it, and then dropping it, and it
> doesn't work.
> I even wrote a utility to remove a
Unfortunately, I tried flushing the table, then disabling it, and then dropping it, and it
doesn't work.
I even wrote a utility to remove all records from the large table and then
do so,
and it doesn't work either, strangely. I looked at the web UI, and still see
many regions;
even the number of rows in the tabl
"be sureto compress your data and set the split size bigger than the default
of 256MB or you'll end up with too many regions."
How many regions are to many? I have a decent sized cluster (~30 nodes) and
started inserting new data, and noticed that after a day, I went from 30
regions on each serve
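One hedged way to raise the split size for a specific table from the client API, assuming the 0.89-era classes (the 1 GB figure, table name, and family are only examples; hbase.hregion.max.filesize in hbase-site.xml changes the same default cluster-wide):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CreateBigSplitTable {
  public static void main(String[] args) throws Exception {
    HTableDescriptor desc = new HTableDescriptor("mytable");
    desc.setMaxFileSize(1024L * 1024 * 1024);     // split regions at ~1 GB instead of the 256 MB default
    desc.addFamily(new HColumnDescriptor("cf"));  // also turn on compression here if LZO/GZ is available
    HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
    admin.createTable(desc);
  }
}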
The RS log is filled with exceptions like the one I have specified below.
Vidhya
RS log:
2010-09-01 18:23:55,883 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner
java.io.IOException: Could not seek StoreFileScanner[HFileScanner for reader
reader=hdfs://b3130080.ys
Again, the read/write load has much more to do with cluster sizing than the
dataset (total capacity aside).
To give you an idea of how widely it varies, I had a client who put several
hundred GBs of data onto a single node setup of HBase. I've also seen clusters
of 20-100 nodes with only 10s o
Yes, I am indeed testing the sustained rate. The channel I/O exception shows
that the I/O killed the regionserver.
the data node side shows:
2010-08-28 23:46:27,854 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock
for block blk_7209586757797236713_2442298
java.io.
Hegner,
If I change the /etc/hosts file and put the global IP there, then the namenode of Hadoop
does not start and gives the following error:
java.net.BindException: Problem binding to
myserver.mycompany.com/203.14.166.86:8020 : Cannot assign requested address
So how do I resolve it?
On Wed, Sep 1, 2010 at 5:34
Hi Vidhya,
Problems like this used to be more frequent, but then we did a bunch
of DFS bug fixes in the hadoop-0.20-append branch that resolved a lot
of them. I imagine you're using YDH which doesn't have all the fixes,
but I couldn't say exactly what issue this is.
Could you grep both the NN and
I have been trying to run my scanner jobs and sometimes they fail due to DFS
errors in one of the storefiles:
I looked at the namenode logs and the file that caused the problem was in the
process of getting fixed by the namenode but by then the scanner failed.. (I
tried copying the file after t
One trick is to force flush the table beforehand. Also try out the new 0.89,
it has 2 fixes regarding a race condition between the BaseScanner and
the closing of regions. The release candidate is here
http://people.apache.org/~jdcryans/hbase-0.89.20100830-candidate-1
J-D
On Wed, Sep 1, 2010 at 11:28 AM
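A minimal sketch of that "flush first, then disable and drop" sequence through HBaseAdmin, assuming the 0.89-era client API (the table name is made up):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class DropLargeTable {
  public static void main(String[] args) throws Exception {
    HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
    admin.flush("mytable");         // force flush so less is in flight while regions close
    admin.disableTable("mytable");  // the step that can hit the BaseScanner race on older versions
    admin.deleteTable("mytable");
  }
}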
Is that really a good test? Unless you are planning to write about 1TB
of new data per day into HBase I don't see how you are testing
capacity, you're more likely testing how HBase can sustain a constant
import of a lot of data. Regarding that, I'd be interested in knowing
exactly the circumstances
Hi, Team:
I have noticed that truncating/dropping a table with a large amount of data
fails and actually corrupts HBase. In the worst case, we can't even
create a table with the same name any more, and I was forced to dump the
whole set of HBase records and recreate all tables again.
I noticed there is
I did a test with a 6-regionserver cluster, with a key design that spreads
the incoming data to all regions.
I noticed that after pumping data for 3-4 days, for about 3 TB of data, one of the
regionservers shuts down because
of a channel I/O error. On a 3-regionserver cluster with the same key design, the
regions
> From: Gary Helmling
>
> If you're using AMIs based on the latest Ubuntu (10.04),
> there's a known kernel issue that seems to be causing
> high loads while idle. More info here:
>
> https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/574910
Seems best to avoid using Lucid on EC2 for now, th
> From: Matthew LeMieux
> I'm starting to find that EC2 is not reliable enough to support
> HBase.
[...]
> (I've been using m1.large and m2.xlarge running CDH3)
I personally don't use EC2 for anything more than on demand ad hoc testing, but
I do know of successful deployments there.
However, I
These are errors coming from HDFS; I would start looking at the datanode
log on the same machine for any exceptions thrown at the same time.
Also make sure your cluster is properly configured according to the
last bullet point in the requirements
http://hbase.apache.org/docs/r0.20.6/api/overview-summ
Very hard to tell how it got there by just looking at the end result,
but you could try using the shell tools like disable_region then
close_region, and then enable_region on
user_name_index,,1282242158507.8c9a40b89ee92e4b2f285b306a2d30ed.
Also you could give a spin to the latest 0.89 releas
On Wed, Sep 1, 2010 at 7:24 AM, Matthew LeMieux wrote:
> I'm starting to find that EC2 is not reliable enough to support HBase. I'm
> running into 2 things that might be related:
>
> 1) On idle machines that are apparently doing nothing (reports of <3% CPU
> utilization, no I/O wait) the load i
I think it's mostly a matter of cost-efficiency -- HBase *runs* just
fine on EC2, and is built to be in a transient environment. It's just
not always cost-effective because you have to use pricey instances.
As far as my issue -- it didn't seem to be ZK. I like Andrew's point,
I'll knock it up to b
While I completely agree with much of what you're saying, and am usually one of
the first to encourage people to not use virtual machines w/ HBase, I know of
several successful deployments of HBase on EC2. In most instances there was
some pain encountered, but it does work for some.
I've not s
Wow, thanks. I didn't consider that ... I try to avoid the cloud if at
all possible :)
Cheers,
B
On Wed, Sep 1, 2010 at 4:14 AM, Andrew Purtell wrote:
>> From: Bradford Stephens
>> I'm banging my head against some perf issues on EC2. I'm
>> using .20.6 on ASF hadoop .20.2, and tweaked the ec2 hb
Shuja,
If you are not running any type of DNS/rDNS service, then make sure the
/etc/hosts file on each of your nodes maps each node to the IP address you want
it to resolve to.
Thanks,
Travis Hegner
http://www.travishegner.com/
-Original Message-
From: Shuja Rehman [mailto:shujamug...@
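A sketch of what such a mapping might look like (hostnames and addresses are made up; the point is that every node resolves the others, and itself, to the routable address the daemons should use rather than 127.0.0.1):

# /etc/hosts on every node (illustrative entries)
192.168.10.11   master.mycompany.com   master
192.168.10.12   slave1.mycompany.com   slave1
192.168.10.13   slave2.mycompany.com   slave2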
My message might be a bit late, but for others seeking the answer to this
quite frequently asked question, I'd add the following link:
http://search-hadoop.com/m/o0hih24P4L71
Alex Baranau
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
On Sun, Aug 22, 2010 at 9:57 AM, Imran M Yo
> From: Bradford Stephens
> I'm banging my head against some perf issues on EC2. I'm
> using .20.6 on ASF hadoop .20.2, and tweaked the ec2 hbase
> scripts to handle the new version.
>
> I'm trying to insert about 22G of data across nodes on EC2
> m1.large instances [...]
c1.xlarge provides (bare
I think generally people are building their own HBase AMIs for use up on EC2,
but I'd like to announce there are new public AMIs available in all of the AWS
regions:
HBase 0.20.6
us-east-1
ami-2469834d
apache-hbase-images-us-east-1/hbase-0.20.6-i386.manifest.xml
ami-2c698345
Hi All
I have used these configuration settings to access hbase server from java
client
HBaseConfiguration config = new HBaseConfiguration();
config.clear();
config.set("hbase.zookeeper.quorum", "myserver.mycompany.com:2181");
config.set("hbase.zookeeper.property.clientPort","2181");
The p
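For comparison, a hedged sketch of the usual client-side setup: hbase.zookeeper.quorum conventionally lists host names only, with the port carried separately in hbase.zookeeper.property.clientPort (the host name here is taken from the message above):

HBaseConfiguration config = new HBaseConfiguration();
config.set("hbase.zookeeper.quorum", "myserver.mycompany.com");    // hosts only, no port here
config.set("hbase.zookeeper.property.clientPort", "2181");         // ZooKeeper client port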
Kelvin,
yeah, it will help me a lot if you put up an example. When you are done with the
example, then kindly forward it to shujamug...@gmail.com as well
Thanks
On Wed, Sep 1, 2010 at 8:48 AM, Kelvin Rawls wrote:
> Shuja
>
> No real magic code here, Google JMX Tutorial and take any hello world JMX
> example
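In the spirit of "any hello world JMX example", a minimal sketch (all names are arbitrary); run it, then attach with jconsole and the MBean shows up under the example domain:

import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxHello {
  // Standard MBean contract: the interface must be named <ImplClass>MBean.
  public interface HelloMBean {
    String getMessage();
  }

  public static class Hello implements HelloMBean {
    public String getMessage() { return "hello from JMX"; }
  }

  public static void main(String[] args) throws Exception {
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    server.registerMBean(new Hello(), new ObjectName("example:type=Hello"));
    System.out.println("MBean registered; attach with jconsole to browse it");
    Thread.sleep(Long.MAX_VALUE);  // keep the JVM alive so the MBean stays visible
  }
}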
Stack,
This problem is already resolved; now can you check the new problem of
connecting to the local IP, as explained earlier
Thanks
On Wed, Sep 1, 2010 at 8:27 AM, Stack wrote:
> On Tue, Aug 31, 2010 at 5:30 PM, Shuja Rehman
> wrote:
> > HBaseConfiguration#create() to construct a plain Configurati
Hey guys,
I'm banging my head against some perf issues on EC2. I'm using .20.6
on ASF hadoop .20.2, and tweaked the ec2 hbase scripts to handle the
new version.
I'm trying to insert about 22G of data across nodes on EC2 m1.large
instances. I'm getting speeds of about 1200 rows/minute. It seems li
> From: Bradford Stephens
> [...] I'm trying to do gets by using JSONP, which
> embeds/retrieves requests in