date:20101011

Re: Question regarding data location in hdfs after hbase restarts

2010-10-11 Thread Ryan Rawson

We don't attempt to optimize region placement with hdfs locations yet. One reason is that on a long-lived cluster, compactions create the locality you are looking for. Furthermore, in the old master such an optimization was really hard to do. The new master should make it easier to write such 1

Re: Number of column families vs Number of column family qualifiers

2010-10-11 Thread Ryan Rawson

Yes, this is spot on. When hbase scans, we read a block, iterate through the keys in the block, then go to the next block. We try to be as efficient as possible, but the inescapable fact remains that we must read all the intervening data. We can do tricks (in 0.90) to use the block index to skip some blo

Increase region server throughput

2010-10-11 Thread Venkatesh

I would like to tune the region server to increase throughput. On a 10 node cluster, I'm getting 5 sec per put (this is unbatched/unbuffered). Other than the region server handler count property, is there anything else I can tune to increase throughput? (this operation I can't use buffered write wit

Question regarding data location in hdfs after hbase restarts

2010-10-11 Thread Tao Xie

Hi all, I set hdfs replica=1 when running hbase. And DN and RS co-exist on each slave node. So the data in the regions managed by an RS will be stored on its local data node, right? But when I restart hbase and the hbase client does gets on the RS, the datanode will read data from remote data nodes. Does that mea

Re: Bulk import tools for HBase

2010-10-11 Thread Sean Bigdatafun

Another potential "problem" of the incremental bulk loader is that the number of reducers (for the bulk loading process) needs to be equal to the number of existing regions -- this seems to be infeasible for a very large table, say with 2000 regions. Any comment on this? Thanks. Sean On Fri, Oct 8, 2010 at 9:03

Re: HBase cluster with heterogeneous resources

2010-10-11 Thread Sean Bigdatafun

On Sun, Oct 10, 2010 at 12:28 PM, Abhijit Pol wrote: > Thanks Stack. > > I think we have GC under control. We have CMS tuned to start early and > don't see slept x longer y in logs anymore. We also have higher zk timeout > (150 seconds), guess can bump that up a bit. > > I was able to point to s

HLog and durability question --0.90 and 0.20

2010-10-11 Thread Sean Bigdatafun

Can someone give me a detailed look at the HLog mechanism for 0.90 durability? I recall that HBase committers claim that data will be truly durable in 0.90 after the client gets an 'ok' acknowledgement from the server, while this was not true in 0.20 (i.e., HBase may have the chance to lose the data even it

Re: Number of column families vs Number of column family qualifiers

2010-10-11 Thread Sean Bigdatafun

I think this is a good suggestion too. HBase linearly scans through the 64KB that is brought into memory. If a big data payload (yet unused in a query/scan) is mixed with a small data payload, it will be rather ineffective, I think? On Mon, Oct 11, 2010 at 9:43 AM, Ryan Rawson wrote: > The reason I tal

Re: Hbase rollback..

2010-10-11 Thread Ryan Rawson

That is correct. But we are confident with the new durability changes and other things 0.90 will be safer and faster than 0.20.6. On Oct 11, 2010 4:51 PM, "Sean Bigdatafun" wrote: > Thanks for clarifying this. > > But on the other hand, wow... that means that even I like the consistency > enhance

Re: Hbase rollback..

2010-10-11 Thread Sean Bigdatafun

Thanks for clarifying this. But on the other hand, wow... that means that even if I like the consistency enhancement in 0.90, I cannot enjoy it if I have started using HBase 0.20 in production? On Thu, Sep 16, 2010 at 10:49 PM, Stack wrote: > On Thu, Sep 16, 2010 at 10:22 PM, Todd Lipcon w

HBase 0.89.20100726 with unmanaged zookeeper fails to start

2010-10-11 Thread Charles Thayer

We're using a pre-existing zookeeper cluster (HBASE_MANAGES_ZK=false), and trying to port some code from 0.20 to 0.89, but hbase fails to start with Couldnt start ZK at requested address of 2181 [..blah..] 2182 (from ./src/main/java/org/apache/hadoop/hbase/master/HMaster.java) Because port

Re: hbase.client.retries.number

2010-10-11 Thread Venkatesh

BTW..get this exception while trying a new put..& Also, get this exception on gets on some region servers org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server, retryOnlyOne=true, index=0, islastrow=true, tries=9, numtries=10, i=0, listsize=1

hbase.client.retries.number

2010-10-11 Thread Venkatesh

HBase was seamless for first couple of weeks..now all kinds of issues in production :) fun fun.. Curious ..does this property have to match up on "hbase client side" & region server side.. I've this number set to 0 on region server side & default on client side.. I can't do any put (new) t

Re: Number of column families vs Number of column family qualifiers

2010-10-11 Thread Ryan Rawson

The reason I talk about value size is that one area where multiple families are good is when you have really large values in one column and smaller values in different columns. So if you want to just read the small values without scanning through the big values you can use separate column families. -ry
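
The point about large and small values can be illustrated with a toy model. This is only a sketch in plain Python, not HBase code: the function `bytes_scanned`, the column names `meta` and `blob`, and the assumption that a scan reads every byte stored in the family it touches are all hypothetical simplifications.

```python
def bytes_scanned(rows, families, wanted_family):
    """Return how many bytes a scan of `wanted_family` must read.

    `rows` is a list of dicts mapping column name -> value (bytes).
    `families` maps a family name -> list of column names stored together.
    Toy assumption: a scan reads every byte stored in the wanted family.
    """
    columns = families[wanted_family]
    return sum(len(row[col]) for row in rows for col in columns)


# 100 rows, each with a 10-byte small value and a 1000-byte large value.
rows = [{"meta": b"x" * 10, "blob": b"y" * 1000} for _ in range(100)]

# Layout 1: both columns share one family, so they are stored together.
one_family = {"f": ["meta", "blob"]}
# Layout 2: the large column lives in its own family.
two_families = {"small": ["meta"], "big": ["blob"]}

single = bytes_scanned(rows, one_family, "f")
split = bytes_scanned(rows, two_families, "small")
```

In this model, reading only the small values costs 101000 bytes with one shared family and 1000 bytes with separate families, which is the effect described above.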

RE: question about region files

2010-10-11 Thread Gibbon, Robert, VF-Group

This smells of garbage and low memory. See for ref a similar problem report here - http://kr.forums.oracle.com/forums/thread.jspa?messageID=2146733 How many rest servers do you have loading all of that data? AFAIK they're stateless and loadbalancable # "Gang worker#0 (Parallel GC Threads)" pr

Re: Number of column families vs Number of column family qualifiers

2010-10-11 Thread Jean-Daniel Cryans

> Yes. I agree. OOME unlikely. I misinterpreted my current problem. > I found, that this (gc timeout) on my 0.89-stumbleupon hbase occurs > only if writeToWAL=false. My RS eats all available memory (5GB), but > don't get OOME. I try to figure out what is going on. Long GC pauses happens for many

Re: Number of column families vs Number of column family qualifiers

2010-10-11 Thread Andrey Stepachev

2010/10/11 Jean-Daniel Cryans : > On Mon, Oct 11, 2010 at 4:20 AM, Andrey Stepachev wrote: >> Hi. >> Yes. I agree. OOME unlikely. I misinterpreted my current problem. I found, that this (gc timeout) on my 0.89-stumbleupon hbase occurs only if writeToWAL=false. My RS eats all available memory (5G

Re: Region servers suddenly disappearing

2010-10-11 Thread Jean-Daniel Cryans

No idea, the reason it died is higher in the log. Look for a message like "Dumping metrics" and the reason should be just a few lines higher than that. J-D On Sun, Oct 10, 2010 at 5:13 PM, Venkatesh wrote: > > Some of the region servers suddenly dying..I've pasted relevant log lines..I > don't

Re: Hbase internally row location mechanism

2010-10-11 Thread Jean-Daniel Cryans

Section 5.1 of the Bigtable paper gives a pretty good explanation: http://labs.google.com/papers/bigtable.html In HBase, Chubby is replaced by ZooKeeper, root tablet by the -ROOT- table, and METADATA tablets by the .META. table. J-D On Sun, Oct 10, 2010 at 10:54 PM, William Kang wrote: > Hi, >
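
The three-level lookup mentioned above (ZooKeeper points to -ROOT-, -ROOT- points to .META., .META. points to the region holding the row) can be sketched as nested sorted indexes. This is a toy model in plain Python, not HBase code: the names `ROOT`, `META_1`, `META_2`, `find_region`, `locate`, and the server labels are hypothetical, and the ZooKeeper step is reduced to simply knowing where `ROOT` is.

```python
import bisect


def find_region(index, row_key):
    """Return the target whose key range covers `row_key`.

    `index` is a list of (start_key, target) pairs sorted by start_key;
    each entry covers keys from its start_key up to the next start_key.
    """
    starts = [start for start, _ in index]
    pos = bisect.bisect_right(starts, row_key) - 1
    return index[pos][1]


# Toy stand-ins for .META. regions: start key -> region server.
META_1 = [("", "server-a"), ("g", "server-b")]
META_2 = [("n", "server-c"), ("t", "server-d")]
# Toy stand-in for -ROOT-: start key -> .META. region.
ROOT = [("", META_1), ("n", META_2)]


def locate(row_key):
    """Walk ROOT, then the chosen META region, to find the serving host."""
    meta = find_region(ROOT, row_key)
    return find_region(meta, row_key)
```

For example, `locate("hbase")` walks `ROOT` to `META_1` and then resolves to `server-b`, while `locate("zoo")` resolves through `META_2` to `server-d`.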

Re: Number of column families vs Number of column family qualifiers

2010-10-11 Thread Jean-Daniel Cryans

On Mon, Oct 11, 2010 at 4:20 AM, Andrey Stepachev wrote: > Hi. > > One additional issue with column families: number of memstores. Each > family on insert utilizes > one memstore. If you'll write in several memstores at once you get > more memstores and more > memory will be used by you region s

Re: StarGate HTTP ERROR: 404

2010-10-11 Thread Andrew Purtell

Hi Fleming, First, Sanel is correct, whatever you are attempting to use is not Stargate. Kindly follow the rest of the advice. > HBase 20.2 You should be using HBase 0.20.6. We can't help much with problems with 0.20.2 any more -- in just about all cases the first advice will be to upgrade to 0.

Re: StarGate HTTP ERROR: 404

2010-10-11 Thread Sanel Zukan

Actually, this is not Stargate, but older REST service that was deprecated. To activate Stargate, copy $HBASE_HOME/contrib/stargate/* and $HBASE_HOME/contrib/stargate/lib/* to hbase lib directory ($HBASE_HOME/lib) and start it with: $HBASE_HOME/bin/hbase org.apache.hadoop.hbase.stargate.Main Now

Re: Number of column families vs Number of column family qualifiers

2010-10-11 Thread Andrey Stepachev

Hi. One additional issue with column families: number of memstores. Each family on insert utilizes one memstore. If you write to several memstores at once you get more memstores, and more memory will be used by your region server. Especially with random inserts you can easily get gc timeouts or O

23 matches
