Re: Scanner job failures due to bad blocks in storefiles...

2010-09-01 Thread Ryan Rawson
Are you running enough xciever counts? Any failures in your datanode logs? On Wed, Sep 1, 2010 at 10:51 PM, Stack wrote: > Vidhya: > > Could you use the hadoop 0.20-append branch on your cluster as per > Todd's suggestion? > > St.Ack > > On Wed, Sep 1, 2010 at 12:22 PM, Vidhyashankar Venkatarama

Re: JAVA CLIENT==10/08/31 20:27:54 INFO ipc.HbaseRPC: Problem connecting to server: /10.0.3.85:60020

2010-09-01 Thread Stack
Your networking looks borked (Where does it get 203.14.166.86 from?). Figure that out first. St.Ack On Wed, Sep 1, 2010 at 12:10 PM, Shuja Rehman wrote: > Hagner, > > If i change etc/hosts file and give global ip theres then namenode of hadoop > did not start and give the following error > > java.net.

Re: Scanner job failures due to bad blocks in storefiles...

2010-09-01 Thread Stack
Vidhya: Could you use the hadoop 0.20-append branch on your cluster as per Todd's suggestion? St.Ack On Wed, Sep 1, 2010 at 12:22 PM, Vidhyashankar Venkataraman wrote: > The RS logs is filled with exceptions like the one I have specified below.. > > Vidhya > > RS log: > > 2010-09-01 18:23:55,88

Re: HBase table lost on upgrade

2010-09-01 Thread Stack
On Wed, Sep 1, 2010 at 5:49 PM, Sharma, Avani wrote: > That email was just informational. Below are the details on my cluster - let > me know if more is needed. > > I have 2 hbase clusters setup > -       for production, 6 node cluster,  32G, 8 processors > -       for dev, 3 node cluster , 16GRA

Re: Region servers down...

2010-09-01 Thread Stack
Sounds like 2047 is not enough. Up it again. 4k? St.Ack 2010/9/1 xiujin yang : > > Thank you J-D. > > > I've checked two datanode log and found the same error.  "exceeds the limit > of concurrent xcievers 2047" > > > [2010-08-31 > 10:43:26][error][org.apache.hadoop.hdfs.server.datanode.dataxce

RE: Slow Inserts on EC2 Cluster

2010-09-01 Thread Jonathan Gray
Been doing lots of importing recently. There are two easy ways to get big performance boosts. The first is HFileOutputFormat. It works into existing tables now. Consistently see 10X+ performance this way versus API. If you must use the API, pre-create a bunch of regions for your table. You c
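A minimal sketch of the "pre-create a bunch of regions" advice above, assuming row keys whose first byte is roughly uniformly distributed (for example, hashed keys). The class and method names are hypothetical and nothing here calls HBase; it only computes the split keys one would hand to whatever table-creation call your HBase version offers for pre-splitting, which this listing does not confirm.

```java
// Hypothetical helper: computes evenly spaced one-byte split keys.
// Assumption (not from the thread): row keys are uniform in their first byte.
final class SplitKeys {

    /**
     * Returns numRegions - 1 split keys that divide the first-byte range
     * 0x00..0xFF into numRegions roughly equal slices.
     */
    static byte[][] evenSplits(int numRegions) {
        if (numRegions < 2 || numRegions > 256) {
            throw new IllegalArgumentException("numRegions must be in [2, 256]");
        }
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            int boundary = (i * 256) / numRegions; // start of the i-th slice
            splits[i - 1] = new byte[] { (byte) boundary };
        }
        return splits;
    }

    public static void main(String[] args) {
        // Four regions need three boundaries: prints 40, 80, c0.
        for (byte[] key : evenSplits(4)) {
            System.out.printf("%02x%n", key[0] & 0xFF);
        }
    }
}
```

With real, non-uniform keys the boundaries would have to come from a sample of the data instead; the even spacing is only the simplest case.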

RE: Region servers down...

2010-09-01 Thread xiujin yang
Thank you J-D. I've checked two datanode logs and found the same error. "exceeds the limit of concurrent xcievers 2047" [2010-08-31 10:43:26][error][org.apache.hadoop.hdfs.server.datanode.dataxcei...@5a809419][org.apache.hadoop.hdfs.server.datanode.dataxceiver.run(DataXceiver.java:131)] Da

Re: Slow Inserts on EC2 Cluster

2010-09-01 Thread Bradford Stephens
On the full data set (10 reducers), speeds are about 100k/minute (WAL Disabled). Still much slower than I'd like, but I'll take it over the former :) On Wed, Sep 1, 2010 at 5:59 PM, Ryan Rawson wrote: > Yes exactly, column families have the same performance profile as > tables.  12 CF = 12 tables

Re: Slow Inserts on EC2 Cluster

2010-09-01 Thread Ryan Rawson
Yes exactly, column families have the same performance profile as tables. 12 CF = 12 tables. -ryan On Wed, Sep 1, 2010 at 5:56 PM, Bradford Stephens wrote: > Good call JD!  We've gone from 20k inserts/minute to 200k. Much > better! I still think it's slower than I'd want by about one OOM, but >

RE: regionserver skew

2010-09-01 Thread Sharma, Avani
Better formatting would probably be helpful. `links http://localhost:60010/ ` -Original Message- From: Sharma, Avani [mailto:agsha...@ebay.com] Sent: Wednesday, September 01, 2010 5:52 PM To: user@hbase.apache.org Subject: RE: regionserver skew Links http://localhost:60010/ worked.

Re: Slow Inserts on EC2 Cluster

2010-09-01 Thread Bradford Stephens
Good call JD! We've gone from 20k inserts/minute to 200k. Much better! I still think it's slower than I'd want by about one OOM, but it's progress. Since we're populating 12 families, I guess we're seeking for 12 files on each write. Not pretty. I'll look at the customer and see if they really ha

Re: Slow Inserts on EC2 Cluster

2010-09-01 Thread Ryan Rawson
There are a couple of things here happening, and some solutions: - dont flush based on region size, only on family/store size. - do what the bigtable paper says and merge the smallest file with memstore while flushing thus keeping the net number of files low. The latter would probably benefit fro

RE: regionserver skew

2010-09-01 Thread Sharma, Avani
Links http://localhost:60010/ worked. My hbase cluster (Solaris machines) is firewalled and this is the best I could do currently. -Original Message- From: Sharma, Avani [mailto:agsha...@ebay.com] Sent: Monday, August 30, 2010 6:48 PM To: user@hbase.apache.org Subject: RE: regionserver

RE: Initial and max heap size

2010-09-01 Thread Sharma, Avani
I was able to do the same. Thanks. -Avani -Original Message- From: Matthew LeMieux [mailto:m...@mlogiciels.com] Sent: Tuesday, August 31, 2010 11:47 AM To: user@hbase.apache.org Subject: Re: Initial and max heap size I've found that the master doesn't need as much memory as the regions

RE: HBase table lost on upgrade

2010-09-01 Thread Sharma, Avani
That email was just informational. Below are the details on my cluster - let me know if more is needed. I have 2 hbase clusters setup - for production, 6 node cluster, 32G, 8 processors - for dev, 3 node cluster , 16GRAM , 4 processors 1. I installed hadoop0.20.2 and hbase0.20.3

Re: Slow Inserts on EC2 Cluster

2010-09-01 Thread Bradford Stephens
Yeah, those families are all needed -- but I didn't realize the files were so small. That's odd -- and you're right, that'd certainly throw it off. I'll merge them all and see if that helps. On Wed, Sep 1, 2010 at 5:24 PM, Jean-Daniel Cryans wrote: > Took a quick look at your RS log, it looks lik

Re: Slow Inserts on EC2 Cluster

2010-09-01 Thread Jean-Daniel Cryans
Took a quick look at your RS log, it looks like you are using a lot of families and loading them pretty much at the same rate. Look at lines that start with: INFO org.apache.hadoop.hbase.regionserver.Store: Added ... And you will see that you are dumping very small files on the filesystem, on ave
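A rough worked example of why many families loaded at the same rate produce small files. It assumes a flush is triggered by the region's total memstore size and writes one file per family, which is what Ryan's "dont flush based on region size" remark elsewhere in this listing implies. The 64 MB threshold is an assumed value, not one taken from the thread; 12 is the family count mentioned in the thread. Names are hypothetical.

```java
// Hypothetical arithmetic sketch; no HBase calls are made.
final class FlushEstimate {

    /**
     * Approximate size of each flushed file when `families` column families
     * fill at the same rate and the region flushes at `regionFlushBytes`.
     */
    static long perFamilyFlushBytes(long regionFlushBytes, int families) {
        if (families < 1) {
            throw new IllegalArgumentException("families must be >= 1");
        }
        return regionFlushBytes / families;
    }

    public static void main(String[] args) {
        long flushBytes = 64L * 1024 * 1024; // assumed region flush threshold
        int families = 12;                   // family count from the thread
        System.out.println(perFamilyFlushBytes(flushBytes, families) + " bytes per family file");
    }
}
```

Under these assumptions each flush leaves twelve files of a little over 5 MB rather than one 64 MB file, which is consistent with the small files J-D points out in the log.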

Re: Slow Inserts on EC2 Cluster

2010-09-01 Thread Bradford Stephens
'allo, I changed the cluster from m1.large to c1.xlarge -- we're getting about 4k inserts /node / minute instead of 2k. A small improvement, but nowhere near what I'm used to, even from vague memories of old clusters on EC2. I also stripped all the Cascading from my code and have a very basic raw

Re: truncate large table suggestion

2010-09-01 Thread Jinsong Hu
Unfortunately, I tried flushing the table, then disabling and dropping it, and it doesn't work. I even wrote a utility to remove all records from the large table before doing so, and that doesn't work either. Strangely, I looked at the web UI and still see many regions even the number of rows in the tabl

Re: truncate large table suggestion

2010-09-01 Thread Jean-Daniel Cryans
That version doesn't have the fixes I referred to, and disabling large tables will likely hit the race condition. J-D On Wed, Sep 1, 2010 at 2:47 PM, Jinsong Hu wrote: > unfortunately. I tried flush the table and disable, and then drop, and it > doesn't work. > I even wrote a utility to remove a

Re: how many regions a regionserver can support

2010-09-01 Thread Scott Whitecross
"be sure to compress your data and set the split size bigger than the default of 256MB or you'll end up with too many regions." How many regions are too many? I have a decent sized cluster (~30 nodes) and started inserting new data, and noticed that after a day, I went from 30 regions on each serve
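To make the split-size advice concrete, here is a back-of-the-envelope sketch. It only divides stored data size by the split size, so it gives a rough lower bound on region count: it ignores compression ratios and the fact that freshly split regions are smaller than the threshold. The 1 TiB data size is an illustrative assumption; 256 MB and roughly 30 nodes come from the message above. Names are hypothetical.

```java
// Hypothetical arithmetic sketch; no HBase calls are made.
final class RegionEstimate {

    /** Ceiling division: the fewest regions that can hold dataBytes. */
    static long minRegions(long dataBytes, long splitBytes) {
        if (dataBytes < 0 || splitBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return (dataBytes + splitBytes - 1) / splitBytes;
    }

    public static void main(String[] args) {
        long oneTiB = 1L << 40; // assumed stored data size
        int nodes = 30;         // cluster size from the message
        long[] splitSizes = { 256L << 20, 1L << 30 }; // 256 MiB, 1 GiB
        for (long split : splitSizes) {
            long regions = minRegions(oneTiB, split);
            long perNode = (regions + nodes - 1) / nodes;
            System.out.println((split >> 20) + " MiB split: at least " + regions
                    + " regions, about " + perNode + " per node");
        }
    }
}
```

Quadrupling the split size cuts the estimate by a factor of four, which is the effect the quoted advice is after.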

Re: Scanner job failures due to bad blocks in storefiles...

2010-09-01 Thread Vidhyashankar Venkataraman
The RS logs are filled with exceptions like the one I have specified below. Vidhya RS log: 2010-09-01 18:23:55,883 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner java.io.IOException: Could not seek StoreFileScanner[HFileScanner for reader reader=hdfs://b3130080.ys

RE: how many regions a regionserver can support

2010-09-01 Thread Jonathan Gray
Again, the read/write load has much more to do with cluster sizing than the dataset (total capacity aside). To give you an idea of how widely it varies, I had a client who put several hundred GBs of data onto a single node setup of HBase. I've also seen clusters of 20-100 nodes with only 10s o

Re: how many regions a regionserver can support

2010-09-01 Thread Jinsong Hu
Yes, I am indeed testing the sustained rate. the channel I/O exception shows the I/O killed the regionserver. the data node side shows: 2010-08-28 23:46:27,854 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Ex ception in receiveBlock for block blk_7209586757797236713_2442298 java.io.

Re: JAVA CLIENT==10/08/31 20:27:54 INFO ipc.HbaseRPC: Problem connecting to server: /10.0.3.85:60020

2010-09-01 Thread Shuja Rehman
Hagner, If I change the /etc/hosts file and give the global IP there, then the Hadoop namenode does not start and gives the following error: java.net.BindException: Problem binding to myserver.mycompany.com/203.14.166.86:8020 : Cannot assign requested address. So how do I resolve it? On Wed, Sep 1, 2010 at 5:34

Re: Scanner job failures due to bad blocks in storefiles...

2010-09-01 Thread Todd Lipcon
Hi Vidhya, Problems like this used to be more frequent, but then we did a bunch of DFS bug fixes in the hadoop-0.20-append branch that resolved a lot of them. I imagine you're using YDH which doesn't have all the fixes, but I couldn't say exactly what issue this is. Could you grep both the NN and

Scanner job failures due to bad blocks in storefiles...

2010-09-01 Thread Vidhyashankar Venkataraman
I have been trying to run my scanner jobs and sometimes they fail due to DFS errors in one of the storefiles: I looked at the namenode logs and the file that caused the problem was in the process of getting fixed by the namenode but by then the scanner failed.. (I tried copying the file after t

Re: truncate large table suggestion

2010-09-01 Thread Jean-Daniel Cryans
One trick is to force-flush the table first. Also try out the new 0.89, it has 2 fixes regarding a race condition between the BaseScanner and the closing of regions. The release candidate is here http://people.apache.org/~jdcryans/hbase-0.89.20100830-candidate-1 J-D On Wed, Sep 1, 2010 at 11:28 AM

Re: how many regions a regionserver can support

2010-09-01 Thread Jean-Daniel Cryans
Is that really a good test? Unless you are planning to write about 1TB of new data per day into HBase I don't see how you are testing capacity, you're more likely testing how HBase can sustain a constant import of a lot of data. Regarding that, I'd be interested in knowing exactly the circumstances

truncate large table suggestion

2010-09-01 Thread Jinsong Hu
Hi, Team: I have noticed that truncating/dropping a table with a large amount of data fails and actually corrupts HBase. In the worst case, we can't even create a table with the same name any more, and I was forced to dump all the HBase records and recreate all tables again. I noticed there is

Re: how many regions a regionserver can support

2010-09-01 Thread Jinsong Hu
I ran a test on a 6-regionserver cluster with a key design that spreads the incoming data across all regions. I noticed that after pumping in data for 3-4 days, about 3 TB in total, one of the regionservers shut down because of a channel IO error. On a 3-regionserver cluster with the same key design, the regions

Re: Slow Inserts on EC2 Cluster

2010-09-01 Thread Andrew Purtell
> From: Gary Helmling > > If you're using AMIs based on the latest Ubuntu (10.04), > there's a known kernel issue that seems to be causing > high loads while idle.  More info here: > > https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/574910 Seems best to avoid using Lucid on EC2 for now, th

Re: Slow Inserts on EC2 Cluster

2010-09-01 Thread Andrew Purtell
> From: Matthew LeMieux > I'm starting to find that EC2 is not reliable enough to support > HBase. [...] > (I've been using m1.large and m2.xlarge running CDH3) I personally don't use EC2 for anything more than on demand ad hoc testing, but I do know of successful deployments there. However, I

Re: Region servers down...

2010-09-01 Thread Jean-Daniel Cryans
These are errors coming from HDFS; I would start by looking at the datanode log on the same machine for any exceptions thrown at the same time. Also make sure your cluster is properly configured according to the last bullet point in the requirements http://hbase.apache.org/docs/r0.20.6/api/overview-summ

Re: Trying to restart HMaster

2010-09-01 Thread Jean-Daniel Cryans
Very hard to tell how it got there by just looking at the end result, but you could try using the shell tools like disable_region then close_region, and then enable_region on user_name_index,,1282242158507.8c9a40b89ee92e4b2f285b306a2d30ed. Also you could give a spin to the latest 0.89 releas

Re: Slow Inserts on EC2 Cluster

2010-09-01 Thread Gary Helmling
On Wed, Sep 1, 2010 at 7:24 AM, Matthew LeMieux wrote: > I'm starting to find that EC2 is not reliable enough to support HBase. I'm > running into 2 things that might be related: > > 1) On idle machines that are apparently doing nothing (reports of <3% CPU > utilization, no I/O wait) the load i

Re: Slow Inserts on EC2 Cluster

2010-09-01 Thread Bradford Stephens
I think it's mostly a matter of cost-efficiency -- HBase *runs* just fine on EC2, and is built to be in a transient environment. It's just not always cost-effective because you have to use pricey instances. As far as my issue -- it didn't seem to be ZK. I like Andrew's point, I'll knock it up to b

RE: Slow Inserts on EC2 Cluster

2010-09-01 Thread Jonathan Gray
While I completely agree with much of what you're saying, and am usually one of the first to encourage people to not use virtual machines w/ HBase, I know of several successful deployments of HBase on EC2. In most instances there was some pain encountered, but it does work for some. I've not s

Re: Slow Inserts on EC2 Cluster

2010-09-01 Thread Bradford Stephens
Wow, thanks. I didn't consider that ... I try to avoid the cloud if at all possible :) Cheers, B On Wed, Sep 1, 2010 at 4:14 AM, Andrew Purtell wrote: >> From: Bradford Stephens >> I'm banging my head against some perf issues on EC2. I'm >> using .20.6 on ASF hadoop .20.2, and tweaked the ec2 hb

RE: JAVA CLIENT==10/08/31 20:27:54 INFO ipc.HbaseRPC: Problem connecting to server: /10.0.3.85:60020

2010-09-01 Thread Hegner, Travis
Shuja, If you are not running any type of DNS/rDNS service, then make sure the /etc/hosts file on each of your nodes maps each node to the IP address you want it to resolve to. Thanks, Travis Hegner http://www.travishegner.com/ -Original Message- From: Shuja Rehman [mailto:shujamug...@
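One way to check what Travis describes is to ask the JVM itself how a hostname resolves on each node, since that is the lookup Hadoop and HBase processes will perform. This is a minimal sketch using only the standard library; the class name is hypothetical, and it says nothing about which address is the correct one for your network.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic: prints the address a hostname resolves to
// from this machine's point of view (via /etc/hosts, DNS, etc.).
final class ResolveCheck {

    /** Returns the resolved IP as text, or "unresolved" if lookup fails. */
    static String resolve(String host) {
        try {
            return InetAddress.getByName(host).getHostAddress();
        } catch (UnknownHostException e) {
            return "unresolved";
        }
    }

    public static void main(String[] args) throws Exception {
        // Default to this machine's own hostname when no argument is given.
        String host = args.length > 0
                ? args[0]
                : InetAddress.getLocalHost().getHostName();
        System.out.println(host + " -> " + resolve(host));
    }
}
```

Running it on every node with each node's hostname as the argument shows whether all machines agree on the mapping; a node that resolves its own name to an address it does not own would be consistent with the "Cannot assign requested address" bind error reported in this thread.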

Re: Query about best practice regarding opening a HTable

2010-09-01 Thread Alex Baranau
My message might be a bit late, but for others seeking the answer to this quite frequently asked question I'd add the following link: http://search-hadoop.com/m/o0hih24P4L71 Alex Baranau Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch On Sun, Aug 22, 2010 at 9:57 AM, Imran M Yo

Re: Slow Inserts on EC2 Cluster

2010-09-01 Thread Andrew Purtell
> From: Bradford Stephens > I'm banging my head against some perf issues on EC2. I'm > using .20.6 on ASF hadoop .20.2, and tweaked the ec2 hbase > scripts to handle the new version. > > I'm trying to insert about 22G of data across nodes on EC2 > m1.large instances [...] c1.xlarge provides (bare

new HBase AMIs available

2010-09-01 Thread Andrew Purtell
I think generally people are building their own HBase AMIs for use up on EC2, but I'd like to announce there are new public AMIs available in all of the AWS regions: HBase 0.20.6 us-east-1 ami-2469834d apache-hbase-images-us-east-1/hbase-0.20.6-i386.manifest.xml ami-2c698345

JAVA CLIENT==10/08/31 20:27:54 INFO ipc.HbaseRPC: Problem connecting to server: /10.0.3.85:60020

2010-09-01 Thread Shuja Rehman
Hi All I have used these configuration settings to access hbase server from java client HBaseConfiguration config = new HBaseConfiguration(); config.clear(); config.set("hbase.zookeeper.quorum", "myserver.mycompany.com:2181"); config.set("hbase.zookeeper.property.clientPort","2181"); The p

Re: Getting data from Hbase from client/remote computer

2010-09-01 Thread Shuja Rehman
kelvin, yeah, it will help me a lot if u put an example. When u done with the example then kindly forward it to shujamug...@gmail.com also Thanks On Wed, Sep 1, 2010 at 8:48 AM, Kelvin Rawls wrote: > Shuja > > No real magic code here, Google JMX Tutorial and take any hello world JMX > example

Re: Getting data from Hbase from client/remote computer

2010-09-01 Thread Shuja Rehman
Stack, This problem is already resolved, now can u check the new problem of connecting to local ip as explained earlier Thanks On Wed, Sep 1, 2010 at 8:27 AM, Stack wrote: > On Tue, Aug 31, 2010 at 5:30 PM, Shuja Rehman > wrote: > > HBaseConfiguration#create() to construct a plain Configurati

Slow Inserts on EC2 Cluster

2010-09-01 Thread Bradford Stephens
Hey guys, I'm banging my head against some perf issues on EC2. I'm using .20.6 on ASF hadoop .20.2, and tweaked the ec2 hbase scripts to handle the new version. I'm trying to insert about 22G of data across nodes on EC2 m1.large instances. I'm getting speeds of about 1200 rows/minute. It seems li

Re: JSONP and Stargate

2010-09-01 Thread Andrew Purtell
> From: Bradford Stephens > [...] I'm trying to do gets by using JSONP, which > embeds/retrieves requests in