Re: Disk Seeks and Column families

2012-01-21 Thread M. C. Srivas
Praveen, basically you are correct on all counts. If there are too many columns, HBase will have to issue more disk-seeks to extract only the particular columns you need ... and since the data is laid out horizontally there are fewer common substrings in a single HBase-block and compression qua

where does hbase expect hbase-default.xml in webapp inside a servlet container ?

2012-01-21 Thread kim young ill
Hi, I am trying to use HBase 0.90.5 as a DB backend to store data from a servlet. I created the table manually via the HBase shell, but calling Configuration config = HBaseConfiguration.create(); from the servlet fails: java.lang.RuntimeException: hbase-default.xml file seems to be for and old version of HBase (nul

Re: Fresh setup hadoop 1.0 and hbase 0.90.5 unable to start master

2012-01-21 Thread kim young ill
In fact it starts on my local setup (one node), but there are complaints in the log file that not supporting append could cause data loss. But the release notes of Hadoop 1 say it supports this. Can someone explain? Thanks. On Sun, Jan 22, 2012 at 12:41 AM, kim young ill wrote:

Re: Fresh setup hadoop 1.0 and hbase 0.90.5 unable to start master

2012-01-21 Thread kim young ill
I guess just copying hadoop.jar wouldn't be enough; there are some commons-*** packages which will be needed for Hadoop 1 to work. HTH. On Sat, Jan 21, 2012 at 11:58 PM, Stack wrote: > On Fri, Jan 20, 2012 at 10:01 PM, Invisible.Trust < > invisible.tr...@gmail.com > > wrote: > > > I can't start HMas

HBql

2012-01-21 Thread Dalia Sobhy
Hi all, does anyone have any information about HBQL? Good or bad? Performance?

Re: Fresh setup hadoop 1.0 and hbase 0.90.5 unable to start master

2012-01-21 Thread Stack
On Fri, Jan 20, 2012 at 10:01 PM, Invisible.Trust wrote: > I can't start HMaster > Please help me.. second day about this error > Exception in thread "main" java.lang.RuntimeException: Failed construction > of Regionserver: class org.apache.hadoop.hbase.regionserver.HRegionServer > > Thanks

Re: Hbase 0.90.5 and Hadoop 1.0.0 beta working?

2012-01-21 Thread Stack
On Fri, Jan 20, 2012 at 10:25 PM, Invisible.Trust wrote: > Hi, i have Debian 6.03 and problem with best friends hbase and hadoop > step by step, I want working configuration hbase (standalone for the first > step) and hadoop : > > wget http://www.sai.msu.su/apache//**hbase/hbase-0.90.5/hbase-0.90

Re: 0.92 Max Row Size

2012-01-21 Thread Stack
On Sat, Jan 21, 2012 at 5:34 AM, Wayne wrote: > Sorry but it would be too hard for us to be able to provide enough info in > a Jira to accurately reproduce. Our read problem is through thrift and has > everything to do with the row just being too big to bring back in its > entirety (13 million co

Interpret metrics

2012-01-21 Thread Yves Langisch
Hi, I have some performance issues and I am trying to interpret the HBase metrics I collect through Ganglia. I've noticed that the metric 'dfs.datanode.writeBlockOp_avg_time' is sometimes quite high (~3'500'000) before dropping to near zero again. How should I interpret this behavior? Ano

Re: RegionServer dying every two or three days

2012-01-21 Thread Matt Corgan
We actually don't run map/reduce on the same machines (most of our jobs are on an old message based system), so don't have much experience there. We run only HDFS (1G heap) and HBase (5.5G heap) with 12 * 100GB EBS volumes per regionserver, and ~350 regions/server at the moment. 5.5G is already a

Re: Disk Seeks and Column families

2012-01-21 Thread Andrey Stepachev
On 21 January 2012 at 19:16, Doug Meil wrote: > > One other "big picture" comment: Hbase scales by having lots of servers, > and servers with multiple drives. While single-read performance is > obviously important, there is more to Hbase than a single-server RDBMS > drag-race comparison

Re: Disk Seeks and Column families

2012-01-21 Thread Doug Meil
Compression is at the block level within the StoreFile (HFile), so yes, they can take advantage of compression. On 1/21/12 12:49 PM, "Praveen Sripati" wrote: >Thanks for the response. > >> The contents of a row stay together like a regular row-oriented >>database. > >> K: row-550/colfam1:50/1

Re: Disk Seeks and Column families

2012-01-21 Thread Praveen Sripati
Thanks for the response. > The contents of a row stay together like a regular row-oriented database. > K: row-550/colfam1:50/1309813948188/Put/vlen=2 V: 50 > K: row-550/colfam1:50/1309812287166/Put/vlen=2 V: 50 > K: row-551/colfam1:51/1309813948222/Put/vlen=2 V: 51 > K: row-551/colfam1:51/1309812
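The key lines quoted above follow the layout row/family:qualifier/timestamp/type/vlen=N. As a rough illustration only, the Python sketch below parses the three complete keys from the dump and orders them the way they appear in it: by row, then column, with the newest timestamp first. `CellKey`, `parse_key` and `sort_key` are hypothetical names, not HBase APIs. The sketch assumes row keys contain no "/" and compares strings, whereas HBase compares raw bytes.

```python
from typing import NamedTuple


class CellKey(NamedTuple):
    row: str
    family: str
    qualifier: str
    timestamp: int
    op: str
    vlen: int


def parse_key(text: str) -> CellKey:
    """Parse a key printed as row/family:qualifier/timestamp/type/vlen=N."""
    row, column, ts, op, vlen = text.split("/")
    family, qualifier = column.split(":", 1)
    return CellKey(row, family, qualifier, int(ts), op, int(vlen.split("=", 1)[1]))


def sort_key(key: CellKey):
    # Row, family and qualifier ascending; timestamp descending so that
    # the newest version of a cell sorts first.
    return (key.row, key.family, key.qualifier, -key.timestamp)


# The three complete keys from the quoted dump, deliberately shuffled.
dump = [
    "row-551/colfam1:51/1309813948222/Put/vlen=2",
    "row-550/colfam1:50/1309812287166/Put/vlen=2",
    "row-550/colfam1:50/1309813948188/Put/vlen=2",
]

ordered = sorted((parse_key(line) for line in dump), key=sort_key)
for key in ordered:
    print(key.row, f"{key.family}:{key.qualifier}", key.timestamp)
```

Because all versions of a cell and all cells of a row end up adjacent, neighbouring keys in a block share long prefixes, which is what block-level compression can exploit.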

Re: Hbase out of memory error

2012-01-21 Thread Royston Sellman
Hmmm, when I look at the source that I *think* I am building HBase from I can see Benoit's patch is there. But when I look in the hbase-0.92.0-sources.jar that gets built at the same time the patch is not there. I better check my build process... Cheers, Royston On 21 Jan 2012, at 17:13, yuzhi

Re: Hbase out of memory error

2012-01-21 Thread yuzhihong
Benoit's patches are already in 0.92 Thanks On Jan 21, 2012, at 9:11 AM, Royston Sellman wrote: > So should I try applying Benoit Sigoure's patch for HBASE-5204? Will this > patch be in the 0.92 branch soon? > > Cheers, > Royston > > > > On 21 Jan 2012, at 16:58, yuzhih...@gmail.com wrot

Re: Hbase out of memory error

2012-01-21 Thread Royston Sellman
So should I try applying Benoit Sigoure's patch for HBASE-5204? Will this patch be in the 0.92 branch soon? Cheers, Royston On 21 Jan 2012, at 16:58, yuzhih...@gmail.com wrote: > That is the correct branch. > > Thanks > > > > On Jan 21, 2012, at 8:50 AM, Royston Sellman > wrote: > >>

Re: Hbase out of memory error

2012-01-21 Thread yuzhihong
That is the correct branch. Thanks On Jan 21, 2012, at 8:50 AM, Royston Sellman wrote: > Hi Ted, > > Yes, I am compiling with the same HBase jars. I wasn't aware of HBASE-5204, > thanks, it sounds possible this is my problem. Can you think of anything else > I should check? > > Just to

Re: Hbase out of memory error

2012-01-21 Thread Royston Sellman
Hi Ted, Yes, I am compiling with the same HBase jars. I wasn't aware of HBASE-5204, thanks, it sounds possible this is my problem. Can you think of anything else I should check? Just to make sure: I am checking out the code from svn.apache.org/repos/asf/hbase/branches/0.92 Is this the correc

Re: hbase won't startup after ipaddress change

2012-01-21 Thread Harsh J
What error do you run into specifically? Do you run bound to localhost or external address? If latter, do you have a hostname also being dynamically updated to resolve to the same? Try wiping your ZK clean, before starting HBase. Depending on your ZK version, try over zkCli.sh: "rmr /hbase" or "de

Re: Disk Seeks and Column families

2012-01-21 Thread yuzhihong
Have you considered using AggregationProtocol to perform aggregation ? Thanks On Jan 20, 2012, at 11:08 PM, Praveen Sripati wrote: > Hi, > > 1) According to the this url (1), HBase performs well for two or three > column families. Why is it so? > > 2) Dump of a HFile, looks like below. The

Re: Disk Seeks and Column families

2012-01-21 Thread Doug Meil
One other "big picture" comment: Hbase scales by having lots of servers, and servers with multiple drives. While single-read performance is obviously important, there is more to Hbase than a single-server RDBMS drag-race comparison. It's a distributed architecture (as with MapReduce). re: "hba

Re: Is HBase.Client.Result.getValue(...) and Result.getColumn(...) fetch actual value from TABLE everytime

2012-01-21 Thread Doug Meil
I think I understand what you're asking now. For background see "scan attribute selection" in here... http://hbase.apache.org/book.html#perf.reading The attributes that are specified in a Scan or Get are transferred to the client regardless of whether they are actually used by the client. On a relate
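To make the point above concrete, here is a small Python model, not HBase client code. `FakeTable`, `FakeResult`, `rpc_count` and `transferred_bytes` are all hypothetical names invented for this sketch. It assumes one round trip per get and treats the returned result as a client-side snapshot, so reading values from it costs no further trips to the table, while every requested column is shipped whether or not it is read.

```python
class FakeResult:
    """Client-side snapshot of the cells returned by one get()."""

    def __init__(self, cells):
        self._cells = cells

    def get_value(self, column):
        # Pure local lookup: no further round trip to the table.
        return self._cells.get(column)

    def transferred_bytes(self):
        return sum(len(value) for value in self._cells.values())


class FakeTable:
    """Hypothetical in-memory stand-in for a table; counts simulated round trips."""

    def __init__(self, rows):
        self._rows = rows
        self.rpc_count = 0

    def get(self, row_key, columns=None):
        # One simulated round trip per get(); only requested columns are copied out.
        self.rpc_count += 1
        cells = self._rows[row_key]
        if columns is not None:
            cells = {c: v for c, v in cells.items() if c in columns}
        return FakeResult(dict(cells))


table = FakeTable({"row-1": {"cf:a": b"x" * 10, "cf:b": b"y" * 1000}})

everything = table.get("row-1")               # ships both columns
only_a = table.get("row-1", columns={"cf:a"})  # ships just cf:a

for _ in range(5):
    only_a.get_value("cf:a")                   # local reads, no new round trips

print(table.rpc_count, everything.transferred_bytes(), only_a.transferred_bytes())
```

In this model the unselective get moves 1010 bytes against 10 for the selective one, and the five value reads add nothing to the round-trip count.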

Re: 0.92 Max Row Size

2012-01-21 Thread yuzhihong
Thrift has been upgraded to 0.8 in trunk. 0.92 still uses 0.7. Can you provide the Jira number that deals with the memory leak? Thanks On Jan 21, 2012, at 5:34 AM, Wayne wrote: > Sorry but it would be too hard for us to be able to provide enough info in > a Jira to accurately reproduce. Our read pr

Re: Is HBase.Client.Result.getValue(...) and Result.getColumn(...) fetch actual value from TABLE everytime

2012-01-21 Thread Alok Kumar
I could not get any other way/test to quickly know about it.. :) On Sat, Jan 21, 2012 at 7:48 PM, Alok Kumar wrote: > Hi Lars, > > Thanks for reply.. > I wanted to know ... Is Column Values are also cached in Result object > (ie, less number of calls to Hbase table for values) > or > It has bee

Re: Is HBase.Client.Result.getValue(...) and Result.getColumn(...) fetch actual value from TABLE everytime

2012-01-21 Thread Alok Kumar
Hi Lars, thanks for the reply. I wanted to know: are column values also cached in the Result object (i.e., fewer calls to the HBase table for values), or are they fetched at the time I loop through it with 'Col-Family:Col-Name' using getValue(..) or getColumn(...)? I understand, it is

hbase won't startup after ipaddress change

2012-01-21 Thread Ben Cuthbert
All, we have HBase on our laptops at work. When we go home with the same laptop and get a new IP address, HBase won't start up. Is there a solution for this?

Re: Disk Seeks and Column families

2012-01-21 Thread Doug Meil
Also, for #2, Hbase supports large-scale aggregation through MapReduce. On 1/21/12 7:47 AM, "Andrey Stepachev" wrote: >2012/1/21 Praveen Sripati : >> Hi, >> >> 1) According to the this url (1), HBase performs well for two or three >> column families. Why is it so? > >First, each column family

Re: HBase schema question

2012-01-21 Thread Doug Meil
Hi there- re: "Based on what I have read it looks like HBase is really good for scans or row key lookup." Yes. It is also a good MR source and sink. re: "how I can do joins" You either need to denormalize it on the way in to Hbase or do a lookup. re: "Will that also be fast?" Hbase

Re: 0.92 Max Row Size

2012-01-21 Thread Wayne
Sorry but it would be too hard for us to be able to provide enough info in a Jira to accurately reproduce. Our read problem is through thrift and has everything to do with the row just being too big to bring back in its entirety (13 million col row times out 1/3 of the time). Filters in .92 and thr

Re: RegionServer dying every two or three days

2012-01-21 Thread Leonardo Gamas
Thanks Matt for this insightful article, I will run my cluster with c1.xlarge to test its performance. But I'm concerned about this machine because of the amount of RAM available, only 7GB. How many map/reduce slots do you configure? And the amount of heap for HBase? How many regions per RegionServer

Re: HBase schema question

2012-01-21 Thread Michael Segel
You don't do joins. Sorry, but you need to put this in perspective... You need to get really drunk and with the next morning's hangover you need to look at HBASE as HBASE and not think in terms of a relational schema. Having said that, you can do joins, however they are tricky to do right

Re: Disk Seeks and Column families

2012-01-21 Thread Andrey Stepachev
2012/1/21 Praveen Sripati : > Hi, > > 1) According to the this url (1), HBase performs well for two or three > column families. Why is it so? First, each column family is stored in a separate location, so, as stated in '6.2.1. Cardinality of ColumnFamilies', such a schema design can lead to many small pi
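As a rough way to picture the "separate location" point, the Python sketch below groups cells into one store per column family and counts how many stores a read of a single row has to consult. `build_stores` and `stores_touched` are hypothetical helpers, and the model is deliberately crude: it ignores the memstore, multiple store files per family, and block indexes.

```python
from collections import defaultdict


def build_stores(cells):
    """Group (row, family, qualifier, value) cells into one store per column family."""
    stores = defaultdict(dict)
    for row, family, qualifier, value in cells:
        stores[family][(row, qualifier)] = value
    return dict(stores)


def stores_touched(stores, row, families):
    """Count the stores that must be consulted to read `row` for the given families."""
    return sum(
        1
        for family in families
        if any(r == row for r, _ in stores.get(family, {}))
    )


cells = [
    ("row-1", "cf1", "a", b"1"),
    ("row-1", "cf2", "b", b"2"),
    ("row-1", "cf3", "c", b"3"),
    ("row-2", "cf1", "a", b"4"),
]
stores = build_stores(cells)

print(
    len(stores),
    stores_touched(stores, "row-1", ["cf1"]),
    stores_touched(stores, "row-1", ["cf1", "cf2", "cf3"]),
)
```

In this toy model, reading one family of row-1 consults one store, while reading all three consults three, which is the cost the reply is pointing at when a row is spread over many families.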