Exception on startup -ClassNotFound-

2009-12-22 Thread alaa.nobani
Hi, could anyone help me find out what causes the following exception? It says that the regionserver class I am using in my hbase-site.xml config is missing. The config part I am using is: ... hbase.regionserver.impl org.apache.hadoop.hbase.regionserver.tableindexed.IndexedRegionServer

Re: Exception on startup -ClassNotFound-

2009-12-22 Thread Lars George
Hi Alaa, Yes, sounds like it. The IndexedTable is part of contrib. Add the lib you find there to the classpath. Lars On Dec 22, 2009, at 12:02, "alaa.nobani" wrote: Hi, Could anyone help me know what cause the following exception. it says that the regionserver class am using in my habs
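
A quick way to confirm whether the contrib jar actually made it onto the classpath is to ask the JVM to locate the class. The following is only a minimal sketch: the helper class name ClasspathCheck is made up for illustration, and it has to be run with the same classpath the regionserver uses for the result to mean anything.

```java
// Hypothetical helper (not part of HBase): reports whether a class name
// can be resolved on the classpath this program was started with.
final class ClasspathCheck {

    /** Returns true if the named class can be located by this class's loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing class, or the class was found but one of its dependencies was not
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the hbase.regionserver.impl setting above
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.hbase.regionserver.tableindexed.IndexedRegionServer";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

If this prints "loadable: false" when run with the regionserver's classpath, the contrib lib is still missing from it.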

How to set up 2-server hbase+zookeeper cluster

2009-12-22 Thread folderty
Hi, I would like to set up hbase and zookeeper (I know hbase uses zookeeper, but I also need to use it in my app) on 2 servers (for evaluation purposes): one will act as a server and a client, and the other will act only as a client. I would like to know how to config and setup hadoop common/hba

RE: Smaller Region Size?

2009-12-22 Thread Mark Vigeant
J-D, I noticed that performance for uploading data into tables got a lot better as I lowered the max file size -- but only up to a certain point, beyond which performance began slowing down again. Is there a rule of thumb/formula/notion to rely on when setting this parameter for optimal performance

Updated HBASE RPMS

2009-12-22 Thread Edward Capriolo
All, I got my hbase jumpstart with the Cloudera RPMs. Cloudera told me the HBase guys created them (I'm assuming those guys are on the list). I have not been able to find the RPMs anywhere besides Cloudera. Cloudera did provide me the source RPMs. I have noticed, however, that CE was still at v 0.20.0,

Re: Updated HBASE RPMS

2009-12-22 Thread Lars George
Hi Edward, Andrew, Jeff H and I were discussing this a while ago. I had someone else ask for a .deb of the current release. Andrew created the older RPMs for Cloudera (I think) back then but cannot really maintain them. So I personally would appreciate someone hosting those on their own infrastru

Re: Updated HBASE RPMS

2009-12-22 Thread Andrew Purtell
I have been updating the RPMs and have mailed Chad @ Cloudera with their locations. We have RPMs from 0.20.1 and 0.20.2. HBase 0.20.1: http://iridiant.s3.amazonaws.com/hbase-0.20-0.20.1-2.cloudera.src.rpm HBase 0.20.2: http://iridiant.s3.amazonaws.com/hbase-0.20-0.20.2-1.cloudera.src.rpm S

Re: Updated HBASE RPMS

2009-12-22 Thread Andrew Purtell
Unless you do init.d integration and service dependency -- both of which are complicated by distro-specific details -- an RPM or DEB is basically just a tarball in another form. I suppose registering it as a package has some small benefit for inventory and version tracking of what is installed

Re: Updated HBASE RPMS

2009-12-22 Thread Lars George
Hi Andy, I do not want to discourage you. What do you think of it all? Making any sense? Lars On Dec 22, 2009, at 18:53, Andrew Purtell wrote: I have been updating the RPMs and have mailed Chad @ Cloudera with their locations. We have RPMs from 0.20.1 and 0.20.2. HBase 0.20.1: http://

Re: Updated HBASE RPMS

2009-12-22 Thread Edward Capriolo
> So should I stop rolling these? I do not think so. Cloudera is managing their own patch level and they are more likely to choose stability over latest and greatest. I liked the layout and the init scripts, but I needed the more bleeding edge features. I did some searching but was not able to fi

Re: Updated HBASE RPMS

2009-12-22 Thread Andrew Purtell
Edward, This S3 URL doesn't work for you? http://iridiant.s3.amazonaws.com/hbase-0.20-0.20.2-1.cloudera.src.rpm That's the HBase 0.20.2 release. - Andy - Original Message > From: Edward Capriolo > To: hbase-user@hadoop.apache.org > Sent: Tue, December 22, 2009 10:22:29 AM > Subj

Re: Updated HBASE RPMS

2009-12-22 Thread Lars George
Should those links be up on wiki? Are they "public"? On Dec 22, 2009, at 19:25, Andrew Purtell wrote: Edward, This S3 URLs doesn't work for you? http://iridiant.s3.amazonaws.com/hbase-0.20-0.20.2-1.cloudera.src.rpm That's HBase 0.20.2 release. - Andy - Original Message From

Re: Updated HBASE RPMS

2009-12-22 Thread Andrew Purtell
They are public in the sense that the ACL is world-readable. They are not public in the sense that this is my personal EC2 account. If I close it, those files will go away. - Andy - Original Message > From: Lars George > To: "hbase-user@hadoop.apache.org" > Sent: Tue, December 22,

Re: Updated HBASE RPMS

2009-12-22 Thread Andrew Purtell
Likewise with these public HBase AMIs:
ami-c45dbfad iridiant-bundles/hbase-0.20.2-i386.manifest.xml
ami-ce5dbfa7 iridiant-bundles/hbase-0.20.2-x86_64.manifest.xml
ami-a65ab8cf iridiant-bundles/hbase-0.20.2-0.18.3-i386.manifest.xml
ami-965ab8ff iridiant-bundles/hbase-0.20.2-0.18.3-x86_6

Re: How to set up 2-server hbase+zookeeper cluster

2009-12-22 Thread Jean-Daniel Cryans
The getting started documentation covers most of your questions: http://hadoop.apache.org/hbase/docs/r0.20.2/api/overview-summary.html#overview_description Since your second machine only acts as client, I guess you don't want any process from the hadoop/hbase stack on it? Then you will only need t

Re: Updated HBASE RPMS

2009-12-22 Thread Edward Capriolo
Yes, those source RPMs would work for me, but is there a repository that I can configure in my yum.repos.d? That is what I was looking to set up.

Re: Smaller Region Size?

2009-12-22 Thread stack
On Tue, Dec 22, 2009 at 8:57 AM, Mark Vigeant wrote: > J-D, > > I noticed that performance for uploading data into tables got a lot better > as I lowered the max file size -- but up until a certain point, where the > performance began slowing down again. > > Tell us more. What kinda size changes

Re: Updated HBASE RPMS

2009-12-22 Thread Andrew Purtell
> From: Edward Capriolo > is there a repository that I can configure in my yum.repos.d? No, sorry. - Andy

Re: starting thrift generates TTransportException

2009-12-22 Thread stack
Here's a successful thrift server start: ... 09/12/22 14:04:32 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=6 watcher=org.apache.hadoop.hbase.client.hconnectionmanager$clientzkwatc...@a39ab89 09/12/22 14:04:32 INFO zookeeper.ClientCnxn: zo

Re: Smaller Region Size?

2009-12-22 Thread Ryan Rawson
The biggest legitimate reason to run a smaller region size is if your data set is small (let's say 400 MB) but highly accessed, so you want a good spread of regions across your cluster. Another is to run a larger region size if you have a huge table and you want to keep the absolute region count low. I a
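
To put rough numbers on the 400 MB example: the region count is on the order of the data size divided by the max file size. The sketch below is back-of-the-envelope arithmetic only, not how HBase decides to split. The class name RegionEstimate and the three max file sizes tried (256, 64 and 32 MB) are assumptions picked for illustration; because a split produces two regions of roughly half the size, the real count can be up to about twice this estimate.

```java
// Hypothetical helper for rough sizing; not part of HBase.
final class RegionEstimate {
    static final long MB = 1024L * 1024L;

    /**
     * Rough region count: data size divided by the max file size, rounded up.
     * Treat the result as an order-of-magnitude figure only.
     */
    static long estimateRegions(long dataBytes, long maxFileSizeBytes) {
        if (dataBytes < 0 || maxFileSizeBytes <= 0) {
            throw new IllegalArgumentException("dataBytes must be >= 0 and maxFileSizeBytes > 0");
        }
        // A table always has at least one region, even when it is empty.
        return Math.max(1L, (dataBytes + maxFileSizeBytes - 1) / maxFileSizeBytes);
    }

    public static void main(String[] args) {
        long data = 400 * MB; // the small, highly accessed data set from the example
        for (long maxMb : new long[] {256, 64, 32}) { // assumed settings to compare
            System.out.println(maxMb + " MB max file size -> about "
                    + estimateRegions(data, maxMb * MB) + " regions");
        }
    }
}
```

With only a couple of regions, a hot 400 MB table can be served by at most a couple of regionservers, which is the spread problem described above.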

startRow and endRow doesn't work when use HBase mapreduce

2009-12-22 Thread Sandy_Yin
Hi, The startRow and endRow of a Scan don't work when using HBase MapReduce. The job always scans the entire table. Is there a reason for this, or am I misusing it? Example code: Scan scan = new Scan(); scan.addFamily(...); scan.setStartRow(startkey); scan.setStopRow(endkey); TableMapReduceUt