Re: How to create config object using HBase 0.90.3 ?

2011-06-08 Thread Stack
Well, our example should have had that for an import. If you get the example working, paste it to a JIRA and I'll update our doc. St.Ack On Wed, Jun 8, 2011 at 11:32 PM, praveenesh kumar wrote: > Oh..!!! > Sorry Sorry.. My mistake.. > I was searching org.apache.hadoop.conf.Configuration in the H

Re: HBase Backups

2011-06-08 Thread Stack
On Wed, Jun 8, 2011 at 11:33 PM, Ted Dunning wrote: > Otis, > > We should talk some time about MapR.  We did a test with Stack where we had > an hbase instance with very active writes going on.  We did successive > snapshots with no interruption or pause in hbase operations and were able to > demo

Re: HBase Backups

2011-06-08 Thread Ted Dunning
Otis, We should talk some time about MapR. We did a test with Stack where we had an hbase instance with very active writes going on. We did successive snapshots with no interruption or pause in hbase operations and were able to demonstrate that each snapshot was usable to restore hbase to the sta

Re: How to create config object using HBase 0.90.3 ?

2011-06-08 Thread praveenesh kumar
Oh..!!! Sorry Sorry.. My mistake.. I was searching for org.apache.hadoop.conf.Configuration in the HBase API.. It's in the hadoop-core jar file.. My mistake.. Extremely Sorry.. :-) On Thu, Jun 9, 2011 at 12:00 PM, praveenesh kumar wrote: > The link you sent me shows the HBase 0.91.0-SNAPSHOT API >

Re: increasing hbase get latencies

2011-06-08 Thread Stack
On Wed, Jun 8, 2011 at 4:39 PM, Abhijit Pol wrote: > Recently we observed that our "get" latencies keep increasing over the > period (and eventually flatten out at higher value) and if we restart hbase > server, latencies go back to good state (low values) and start increasing > again. > What hap

Re: How to create config object using HBase 0.90.3 ?

2011-06-08 Thread praveenesh kumar
The link you sent me shows the HBase 0.91.0-SNAPSHOT API, and at http://hbase.apache.org/apidocs/overview-summary.html#overview_description I am not able to see the org.apache.hadoop.conf.Configuration class. If I am following the given example -- http://hbase.apache.org/apidocs/org/apach

Re: a question about log level

2011-06-08 Thread Stack
We leave it at DEBUG level. Good for figuring out issues. Is there client tracing in your datanode logs? You might want to disable this if it's on. Here is what I add to the hadoop conf/log4j.properties: log4j.logger.org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace=WARN St.Ack 2011/6/8

Re: How to create config object using HBase 0.90.3 ?

2011-06-08 Thread Stack
Where are you reading? I just checked the javadoc, http://hbase.apache.org/apidocs/overview-summary.html#overview_description, and it seems to be current. St.Ack On Wed, Jun 8, 2011 at 11:06 PM, praveenesh kumar wrote: > Hello guys, > > I just started doing HBase programming. I am using HBase 0

Re: a question about log level

2011-06-08 Thread Gaojinchao
Stack, Thanks for your reply. In your production cluster, do you set the log level to info or warning? -Original Message- From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of Stack Sent: June 9, 2011 12:36 To: user@hbase.apache.org Subject: Re: a question about log level In the conf/log4j.properties St.Ack On We

How to create config object using HBase 0.90.3 ?

2011-06-08 Thread praveenesh kumar
Hello guys, I just started doing HBase programming. I am using the HBase 0.90.3 API. All the tutorials I can find are based on previous versions. I am not able to create a conf object using the HBase 0.90.3 API. In the HBase 0.90.3 API link, it says HBaseConfiguration is using org.apache.hadoop.conf.Conf

RE: Best practices for HBase in EC2?

2011-06-08 Thread Gaurav Kohli
Can anyone comment on the performance of EC2's "Cluster Compute Instances", which were released recently and provide 10 Gigabit Ethernet? Networking was the main issue with the previous instances. They have customized these instances for low-latency inter-node communication. We are planning to s

Re: Hbase hbck showing status as INCONSISTENT

2011-06-08 Thread Stack
On Wed, Jun 8, 2011 at 9:37 PM, praveenesh kumar wrote: > But my problem is I want to keep the entry of localhost in my /etc/hosts > file.. > Is there any parameter that we can put in hbase-site.xml so that RPC starts > listening on regionserver's actual IP rather than default localhost. ?? > No.

Re: Adding HQuorum dynamically.

2011-06-08 Thread Stack
On Wed, Jun 8, 2011 at 9:45 PM, James Ram wrote: > Is there anyway to add a new HQuorum to the cluster dynamically? > If HQuorum == HRegionServer, then yes. Just make sure it has same config. as other members of the cluster and start it. St.Ack

Re: HQuorum failures

2011-06-08 Thread Stack
On Wed, Jun 8, 2011 at 10:10 PM, James Ram wrote: > Hi, > Thanks for your reply. So does HBase automatically reassign to another > regionserver or do we have to do it manually. > It does it automatically. St.Ack

Re: HQuorum failures

2011-06-08 Thread James Ram
Hi, Thanks for your reply. So does HBase automatically reassign to another regionserver or do we have to do it manually? On Thu, Jun 9, 2011 at 10:18 AM, Chris Tarnas wrote: > What is an HQuorum? > > If you mean a regionserver then possibly your application is attempting to > get data that was on

Re: HQuorum failures

2011-06-08 Thread Chris Tarnas
What is an HQuorum? If you mean a regionserver, then possibly your application is attempting to get data that was on a region hosted by the failed regionserver, and in that case you need to make sure your application can deal with the connection failure and wait for the regions to be reassigned to

Does Put support "don't put if row exists"?

2011-06-08 Thread Ma, Ming
Hi, Maybe this has been asked before. I couldn't find much information on this. We have an application where multiple instances across different machines could try to insert a new row with the same row key into a global HBase table at the same time. If the row has been inserted by one instance
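The "insert only if the row does not exist yet" behaviour asked about here is an atomic check-then-put. The following is only a toy, in-memory sketch of that semantics in Python, not HBase client code; `ToyTable` and `put_if_absent` are hypothetical names invented for illustration. On the HBase side, the closest client call is most likely `HTable.checkAndPut` with a null expected value, but that mapping is an assumption to verify against the 0.90 javadoc.

```python
import threading


class ToyTable:
    """Hypothetical in-memory stand-in for a table; not an HBase API."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def put_if_absent(self, row_key, value):
        """Store value only if row_key is not present; return True on success."""
        # The check and the write happen under one lock, so two writers
        # racing on the same row key cannot both succeed.
        with self._lock:
            if row_key in self._rows:
                return False
            self._rows[row_key] = value
            return True

    def get(self, row_key):
        """Return the stored value, or None if the row does not exist."""
        with self._lock:
            return self._rows.get(row_key)
```

The point of the sketch is that the existence check and the write must be a single atomic step on the server side; a client-side "get, then put" leaves a window in which another instance can insert the same row.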

Adding HQuorum dynamically.

2011-06-08 Thread James Ram
Is there anyway to add a new HQuorum to the cluster dynamically? -- With Regards, Jr.

HQuorum failures

2011-06-08 Thread James Ram
Hi, We are running a 5 machine Hbase cluster. We have noticed that whenever an HQuorum fails in one machine, the entire application that is running on HBase crashes. Is there anything to do about this? -- With Regards, Jr.

Re: Hbase hbck showing status as INCONSISTENT

2011-06-08 Thread praveenesh kumar
Hi.. I guess the problem is that one of my regionservers has an entry for localhost in its /etc/hosts file. My log says: 2011-06-08 15:24:27,588 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Serving as ub8,60020,1307526863668, RPC listening on /127.0.0.1:60020, sessionid=0x306eacb5

Re: a question about log level

2011-06-08 Thread Stack
In the conf/log4j.properties St.Ack On Wed, Jun 8, 2011 at 9:02 PM, Gaojinchao wrote: > How should we set the log level for production ? > Do anyone have some experience? > I want to use information. > > >

Re: HBase Backups

2011-06-08 Thread Manoj Murumkar
Thanks, I have seen it. Once I verify a viable solution, I will update this thread. On Jun 8, 2011 5:57 PM, "Otis Gospodnetic" wrote: > There is this post about HBase backup options > http://blog.sematext.com/2011/03/11/hbase-backup-options/ . I hope it helps. > > Otis > > Sematext :: htt

a question about log level

2011-06-08 Thread Gaojinchao
How should we set the log level for production? Does anyone have experience with this? I want to use the info level.

RE: How to efficiently join HBase tables?

2011-06-08 Thread Doug Meil
Hi there- Summary comment: 1) Preference Several people in this thread have suggested approaches (map-side memory join, multi-get, temp files), all of which have merit and have advantages in certain situations. Kudos to the dist-list for chiming in. The "right" approach depends on the spec

Re: distribution of regions to servers

2011-06-08 Thread Ted Yu
The assumption was that regions were not evenly distributed prior to restarting. If they were, the user wouldn't select this policy. We can make this policy effective only once - retain assignment is selected following this new policy. Of course the dynamic portion of the load balancer needs to select the unde

Re: Best practices for HBase in EC2?

2011-06-08 Thread George P. Stathis
Jim, I'd be interested in hearing your experience with Whirr when you try it. I've been testing it the last couple of days and I haven't been able to get the out-of-the-box hadoop recipe to work when it comes up (the namenode doesn't have any datanodes configured although they are all up and runnin

Re: hbase hashing algorithm and schema design

2011-06-08 Thread Otis Gospodnetic
Sam, would HBaseWD help you here? See http://search-hadoop.com/m/AQ7CG2GkiO/hbasewd&subj=+ANN+HBaseWD+Distribute+Sequential+Writes+in+HBase Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop - HBase Hadoop ecosystem search :: http://search-hadoop.com/ - Original Message
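As a rough illustration of how sequential writes can be spread across regions, here is a minimal Python sketch of key salting: a one-byte bucket prefix is derived from a hash of the original row key. This is not HBaseWD's actual code or API; `salted_key`, `original_key`, and the default bucket count of 16 are assumptions made up for the example.

```python
import hashlib


def salted_key(row_key: bytes, buckets: int = 16) -> bytes:
    """Prefix row_key with a one-byte bucket id derived from its MD5 hash."""
    # buckets must be between 1 and 256 so the id fits in one byte.
    bucket = hashlib.md5(row_key).digest()[0] % buckets
    return bytes([bucket]) + row_key


def original_key(salted: bytes) -> bytes:
    """Strip the one-byte bucket prefix added by salted_key."""
    return salted[1:]
```

Because the prefix is computed from the key itself, a point lookup can recompute it. A range scan over the original key order, however, has to be issued once per bucket and the results merged.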

Re: 0.92.0 availability

2011-06-08 Thread Otis Gospodnetic
I wouldn't rely on any dates. :) I'd look at the number of remaining open JIRA issues with that target version. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop - HBase Hadoop ecosystem search :: http://search-hadoop.com/ - Original Message > From: "Ma, Ming" > T

Re: HBase Backups

2011-06-08 Thread Otis Gospodnetic
There is this post about HBase backup options http://blog.sematext.com/2011/03/11/hbase-backup-options/ . I hope it helps. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message > From: Manoj Mur

Re: HBase Backups

2011-06-08 Thread Manoj Murumkar
We are trying to do this online as downtime is not an option. Good point, nonetheless. On Jun 8, 2011 3:48 PM, "Joey Echeverria" wrote: > Can you afford some down time? If so, you could minor compact, disable > the table, distcp, and then enable the table. > > -Joey > > On Wed, Jun 8, 2011 at 1:22

0.92.0 availability

2011-06-08 Thread Ma, Ming
Hi, Where can I find the targeted release date of 0.92.0? Thanks. Ming

increasing hbase get latencies

2011-06-08 Thread Abhijit Pol
We are on hbase 0.90 and using hbase for a while to perform high volume data lookup using hbase client (no map-reduce involved). Recently we observed that our "get" latencies keep increasing over the period (and eventually flatten out at higher value) and if we restart hbase server, latencies go b

RE: How to efficiently join HBase tables?

2011-06-08 Thread Buttler, David
Thank you for the explanation, I think I understand the suggestion now. I completely agree with you that this would be effective for cases that you can do the join of the sorted values in memory. A small tweak would make this more generic and effective for any size. If you had two separate Map

Re: HBase Backups

2011-06-08 Thread Joey Echeverria
Can you afford some down time? If so, you could minor compact, disable the table, distcp, and then enable the table. -Joey On Wed, Jun 8, 2011 at 1:22 PM, Manoj Murumkar wrote: > Hi, > > We're trying to come up with right strategy for backing up HBase tables. > Assumption is that sizes of tables

Re: How to efficiently join HBase tables?

2011-06-08 Thread Dave Latham
I believe this is what Eran is suggesting: Table A --- Row1 (has joinVal_1) Row2 (has joinVal_2) Row3 (has joinVal_1) Table B --- Row4 (has joinVal_1) Row5 (has joinVal_3) Row6 (has joinVal_2) Mapper receives a list of input rows (union of both input tables in any order), and produces (=
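The toy tables above can be run through a small reduce-side join sketch in plain Python. The message is cut off before it says what the mapper emits, so this assumes the common pattern of keying each row by its join value and tagging it with its source table; `map_phase`, `reduce_phase`, and `join_tables` are hypothetical names, not part of any MapReduce API.

```python
from collections import defaultdict

# Toy rows from the message: (row key, join value).
TABLE_A = [("Row1", "joinVal_1"), ("Row2", "joinVal_2"), ("Row3", "joinVal_1")]
TABLE_B = [("Row4", "joinVal_1"), ("Row5", "joinVal_3"), ("Row6", "joinVal_2")]


def map_phase(rows, source):
    """Emit (join value, (source table, row key)) for each input row."""
    for row_key, join_val in rows:
        yield join_val, (source, row_key)


def reduce_phase(pairs):
    """Group by join value, then pair every A row with every B row."""
    groups = defaultdict(lambda: {"A": [], "B": []})
    for join_val, (source, row_key) in pairs:
        groups[join_val][source].append(row_key)
    joined = []
    for join_val in sorted(groups):
        for a_key in groups[join_val]["A"]:
            for b_key in groups[join_val]["B"]:
                joined.append((join_val, a_key, b_key))
    return joined


def join_tables(table_a, table_b):
    """Run the map step over both tables, then the reduce step."""
    pairs = list(map_phase(table_a, "A")) + list(map_phase(table_b, "B"))
    return reduce_phase(pairs)
```

With the toy data, Row5 (joinVal_3) has no partner in Table A and drops out, while Row4 joins with both Row1 and Row3.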

Re: How to efficiently join HBase tables?

2011-06-08 Thread Michel Segel
Unless I am mistaken... get() requires a row key, right? And you can join tables on column data which isn't in the row key, right? So how do you do a get()? :-) Sure there is more than one way to skin a cat. But if you want to be efficient... You will create a set of unique keys based on the col

RE: distribution of regions to servers

2011-06-08 Thread Doug Meil
If I understand the history correctly, round-robin was used in .89, but "retains" is the policy for .90+. My 2-cents is that if/when region-shuffling is required, I'd rather do that with another utility and keep that out of cluster startup. -Original Message- From: saint@gmail.com [

Re: hbase hashing algorithm and schema design

2011-06-08 Thread Sam Seigal
On Wed, Jun 8, 2011 at 12:40 AM, tsuna wrote: > On Tue, Jun 7, 2011 at 7:56 PM, Kjew Jned wrote: > > I was studying the OpenTSDB example, where they also prefix the row keys > with > > event id. > > > > I further modified my row keys to have this -> > > > > > > > > The uuid is fairly unique

RE: How to efficiently join HBase tables?

2011-06-08 Thread Buttler, David
Let's make a toy example to see if we can capture all of the edge conditions: Table A --- Key1 joinVal_1 Key2 joinVal_2 Key3 joinVal_1 Table B --- Key4 joinVal_1 Key5 joinVal_3 Key6 joinVal_2 Now, assume that we have a mapper that takes two values, one row from A, and one row from B. Ar

Re: distribution of regions to servers

2011-06-08 Thread Stack
On Wed, Jun 8, 2011 at 12:50 PM, Ted Yu wrote: > I am thinking of creating a new policy for region assignment at cluster > startup which assigns regions from each table in round-robin fashion. > Don't we want to retain assignments on startup since that will ensure greatest locality of data? Roun
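The per-table round-robin idea quoted above could look roughly like the following Python sketch. It is only an illustration of the policy under discussion, not HBase's balancer code; `assign_round_robin` and the sample table, region, and server names are hypothetical.

```python
from collections import defaultdict
from itertools import cycle


def assign_round_robin(regions_by_table, servers):
    """Deal each table's regions out to servers in rotation.

    The rotation continues across tables, so consecutive regions of one
    table land on different servers and the overall count stays balanced.
    """
    assignment = defaultdict(list)
    server_cycle = cycle(servers)
    for table in sorted(regions_by_table):
        for region in regions_by_table[table]:
            assignment[next(server_cycle)].append(region)
    return dict(assignment)
```

Note that this deliberately ignores where each region's data currently lives, which is the locality concern raised in the reply above.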

Re: distribution of regions to servers

2011-06-08 Thread Ted Yu
In trunk this behavior has been improved. Load balancer would move the youngest region off heavily loaded region server. See http://zhihongyu.blogspot.com/2011/04/load-balancer-in-hbase-090.html I am thinking of creating a new policy for region assignment at cluster startup which assigns regions

Re: Delete whole table HBase

2011-06-08 Thread Azshara
Yes, thanks it worked! Have no idea how I didn't come across the method! Thank you for the tip!

Re: How to efficiently join HBase tables?

2011-06-08 Thread Eran Kutner
I'd like to clarify again what I'm trying to do and why I still think it's the best way to do it. I want to join two large tables. I'm assuming, and this is the key to the efficiency of this method, that: 1) I'm getting a lot of data from table A, something which is close enough to a full table s

Re: What the optimization method of when to delete Zk connection?

2011-06-08 Thread Stack
On Wed, Jun 8, 2011 at 1:44 AM, bijieshan wrote: > Thanks Suraj. > Yes, It's a better method. For I haven't test on that. > So use HTablePool, it seems we haven't need to delete Zk connections > manually? Is that correct? > Yes. St.Ack

Re: A question about LeaseExpiredException

2011-06-08 Thread Stack
Grep for the missing file in the namenode log and see if you can figure out from mentions therein what happened to this file. Had the master taken it from you because it was processing a server crash? St.Ack 2011/6/8 Gaojinchao : > Two regionservers(My cluster is 7 regionsever / datanode) crashed, saying

Re: in-memory data grid vs. ehcache + hbase

2011-06-08 Thread Stack
On Wed, Jun 8, 2011 at 9:00 AM, Hiller, Dean x66079 wrote: > We have certain tables with under 10 rows, one under 200 rows and one with > 1,000,000 rows.  We have found out that having a copy/cache on each node is > EXTREMELY fast for our batch processing since these copies of data are local >

Re: EXT :Re: Failure to Launch: hbase-0.90.3 with hadoop-0.20.203.0

2011-06-08 Thread Stack
Looks like you need to copy to hbase a commons config jar; this version of hadoop seems to depend on it: java.lang.NoClassDefFoundError: org/apache/commons/configuration/Configuration And you are clear that this version of hadoop does not have sync/append so hbase will lose data on crash. St.Ack

RE: EXT :Re: Failure to Launch: hbase-0.90.3 with hadoop-0.20.203.0

2011-06-08 Thread Ratner, Alan S (IS)
J-D, Thanks for the info. I copied the appropriate hadoop jar file to the lib directory (and renamed the original one). I wasn't able to figure out why zookeeper wasn't running on my master server so I launched zookeeper directly and set HBASE_MANAGES_ZK to false. (And since I am running

HBase Backups

2011-06-08 Thread Manoj Murumkar
Hi, We're trying to come up with right strategy for backing up HBase tables. Assumption is that sizes of tables will not grow beyond few hundred GB. Currently, we're employing exports (writing onto HDFS of another cluster directly), but is taking too long (~5 hours to export ~5GB of data). Are the

Re: Hbase hbck showing status as INCONSISTENT

2011-06-08 Thread Stack
A problem that will be fixed in 0.90.4 is that once hbck finds one issue, all checks that follow emit 'INCONSISTENCY'. A quick perusal of the below has it that hbck is not able to reach a server. Can you check into that? It's using an IP, rather than a hostname. Why is that? ips in the regionserv

Re: How to find encoded name for a region?

2011-06-08 Thread Stack
On Wed, Jun 8, 2011 at 10:01 AM, James Hammerton wrote: > Thanks Stack, > > I take it you mean get hold of check_meta.rb from a recent version and alter > it to find the HRIs? > Yes. Alter it to run in 0.20.6. St.Ack

Re: How to find encoded name for a region?

2011-06-08 Thread James Hammerton
Scratch that. You mean alter the find_overlapping_regions script to use .META. to find the overlapping regions don't you? James On Wed, Jun 8, 2011 at 6:01 PM, James Hammerton < james.hammer...@mendeley.com> wrote: > Thanks Stack, > > I take it you mean get hold of check_meta.rb from a recent ve

Re: How to find encoded name for a region?

2011-06-08 Thread James Hammerton
Thanks Stack, I take it you mean get hold of check_meta.rb from a recent version and alter it to find the HRIs? Regards, James On Wed, Jun 8, 2011 at 5:55 PM, Stack wrote: > Pull it in. You'll have to massage a little but rather than do the > indirect HTable.getStartKeys (which turns around

Re: How to find encoded name for a region?

2011-06-08 Thread Stack
Pull it in. You'll have to massage a little but rather than do the indirect HTable.getStartKeys (which turns around and reads meta), read .META. directly and get the HRIs yourself. St.Ack On Wed, Jun 8, 2011 at 9:51 AM, James Hammerton wrote: > Hi, > > I've checked /usr/lib/hbase/bin and it doe

Re: Hadoop/HBase Upgrade Suggestion

2011-06-08 Thread Stack
On Wed, Jun 8, 2011 at 9:25 AM, Zhong, Sheng wrote: > Could anyone give me suggestion for Hadoop/HBase upgrade? We're > currently using  apache hadoop 0.20.2 + hbase 0.20.3 + zookeeper-3.2.2. > Has anyone done with latest stable version of hadoop-0.20.203.0rc1 + > Hbase 0.90.2, and will Hbase 0.90

Re: How to find encoded name for a region?

2011-06-08 Thread James Hammerton
Hi, I've checked /usr/lib/hbase/bin and it doesn't have check_meta.rb. Also, HTable doesn't have getHRegionInfos in 0.20.6. Regards, James On Wed, Jun 8, 2011 at 5:46 PM, Stack wrote: > Do you have check_meta.rb in 0.20.6 (I don't remember? I think you > do). Start with that? > > Otherwise

Re: How to find encoded name for a region?

2011-06-08 Thread Stack
Do you have check_meta.rb in 0.20.6 (I don't remember? I think you do). Start with that? Otherwise, here: keys = wanted_table.getStartEndKeys In 0.20.6 can you get HRegionInfos instead of start keys? That'd be more useful. They would have the encoded name. > We'd ideally like to feed the r

Re: How to find encoded name for a region?

2011-06-08 Thread James Hammerton
Thanks, Stack. The context is that we have a script, find_overlapping_regions.rb at: https://github.com/Mendeley/hbase-scripts/blob/master/find_overlapping_regions.rb We'd ideally like to feed the results into another script (to be written) that will call org.apache.hbase.util.Merge. I've been lo

Re: How to find encoded name for a region?

2011-06-08 Thread Stack
On Wed, Jun 8, 2011 at 9:22 AM, James Hammerton wrote: > Given the tableName, startKey and endKey for a region how do I get hold of > the encodedName? > I suppose it depends on the context. If reading .META., then if you deserialize the info:regioninfo into an HRegionInfo instance, then you can

Hadoop/HBase Upgrade Suggestion

2011-06-08 Thread Zhong, Sheng
Hey, Could anyone give me suggestions for a Hadoop/HBase upgrade? We're currently using apache hadoop 0.20.2 + hbase 0.20.3 + zookeeper-3.2.2. Has anyone upgraded to the latest stable version of hadoop-0.20.203.0rc1 + HBase 0.90.2, and will HBase 0.90.2 have compatibility issues with hadoop-0.20.203.0rc1? I

How to find encoded name for a region?

2011-06-08 Thread James Hammerton
Hi, Given the tableName, startKey and endKey for a region how do I get hold of the encodedName? We have code for identifying overlapping regions that outputs triples of the form tableName, startKey and endKey for each region, but it looks like the Merge command (we're using 0.20.6) requires the t

in-memory data grid vs. ehcache + hbase

2011-06-08 Thread Hiller, Dean x66079
We have certain tables with under 10 rows, one under 200 rows and one with 1,000,000 rows. We have found out that having a copy/cache on each node is EXTREMELY fast for our batch processing since these copies of data are local AND in-memory. The issue I am struggling with is the best way to ev

Re: Hbase Hardware requirement

2011-06-08 Thread Andrew Purtell
> From: Ted Dunning > Lots of people are moving towards more spindles per box to > increase IOP/s > > This is particular important for cases where the working > set gets pushed out of memory. Indeed. Our spec is more like 12x 500 GB SATA disks, to push IOPS and more evenly balance CPUs (fast du

Re: tech. talk at imageshack/yfrog

2011-06-08 Thread Himanshu Vashishtha
+1 to Matt's opinion (if possible?). I am interested in your use case, sounds very impressive by the stats you gave. You said 1000 tables? Looking forward to see what optimizations/config tweaks you had to do to cope up with your read/write requirements. Thanks, Himanshu On Wed, Jun 8, 2011 at 8

Re: tech. talk at imageshack/yfrog

2011-06-08 Thread Matt Davies
If it is possible I think any slides or even a video would be very interesting to some of us that can't travel. I, for one, would love to hear how you do it. Thanks! On Tue, Jun 7, 2011 at 6:07 PM, Jack Levin wrote: > Hey Guys, I plan to do a tech talk here at ImageShack, on how we store > and

RE: How to efficiently join HBase tables?

2011-06-08 Thread Doug Meil
Re: " With respect to Doug's posts, you can't do a multi-get off the bat" That's an assumption, but you're entitled to your opinion. -Original Message- From: Michael Segel [mailto:michael_se...@hotmail.com] Sent: Monday, June 06, 2011 10:08 PM To: user@hbase.apache.org Subject: RE: How t

Hbase hbck showing status as INCONSISTENT

2011-06-08 Thread praveenesh kumar
Hello guys, Well.. I am using 12 node hbase cluster. I can see all the nodes running on Hbase Web-UI. But when I am running hbase hbck , I am getting the following output : hadoop@ub13:/usr/local/hadoop$ hbase hbck 11/06/08 15:30:52 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version

RE: distribution of regions to servers

2011-06-08 Thread Kleegrewe, Christian
Hi geoff, Since hbase balances not at table but at cluster basis it may happen that all the regions for one table are located at the same region server. The reason for this may be the way hbase does table splits. If a region exceeds the configured maximum size the region is split into two, but

A question about LeaseExpiredException

2011-06-08 Thread Gaojinchao
Two regionservers (my cluster is 7 regionservers / datanodes) crashed, saying that a file did not exist and that a lease had expired (log detail below). I searched this mailing list, but the cases I found seem different: HBase version: 0.90.3 HDFS version: cloudera 0.20.2+320 OS: swappiness: 0 and ulimit

Re: What the optimization method of when to delete Zk connection?

2011-06-08 Thread bijieshan
Thanks Suraj. Yes, it's a better method, though I haven't tested it yet. So if we use HTablePool, it seems we don't need to delete ZK connections manually? Is that correct? Thanks! Jieshan Bean -- How about using HTablePool - doesn't that work for you? http://hbase.apache.org/apidocs/o

Re: hbase hashing algorithm and schema design

2011-06-08 Thread tsuna
On Tue, Jun 7, 2011 at 7:56 PM, Kjew Jned wrote: > I was studying the OpenTSDB example, where they also prefix the row keys with > event id. > > I further modified my row keys to have this -> > >   > > The uuid is fairly unique and random. > Is appending a uuid to the event id help the distribut