Re: hbase standalone cannot start master, cannot assign requested address at port 60000

2010-09-14 Thread Michael Scott
Hi again, IPV6 was enabled. I shut it off, rebooted to be sure, verified it was still off, and encountered the same problem once again. I also tried to open port 60000 by hand with a small php file. I can do this (as any user) for localhost. I can NOT do this (not even as root) for the IP addr
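
The manual check described in this message (binding on localhost versus on the machine's own IP address) can be reproduced with a few lines of Python instead of PHP. This is only a sketch: the helper name `can_bind` is made up for illustration, and port 0 is used so the OS picks any free port. On Linux, "Cannot assign requested address" corresponds to the errno `EADDRNOTAVAIL`, which usually means the address is not configured on any local interface.

```python
import errno
import socket


def can_bind(host, port=0):
    """Try to bind a TCP socket to host:port.

    Returns (True, None) on success, or (False, errno_name) on failure.
    `can_bind` is a hypothetical helper, not part of HBase or Hadoop.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True, None
    except OSError as exc:
        # Map the numeric errno to its symbolic name, e.g. EADDRNOTAVAIL.
        return False, errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()


if __name__ == "__main__":
    print("localhost:", can_bind("127.0.0.1"))
    # Replace with the machine's own IP address to compare the two cases:
    # print("host ip:", can_bind("<your-ip-here>", 60000))
```

If the localhost bind succeeds and the bind on the host IP fails with `EADDRNOTAVAIL`, the problem is more likely in the interface or hostname configuration than in HBase itself.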

RE: Region server shutdown after putting millions of rows

2010-09-14 Thread Zhou Shuaifeng
> Saying "perhaps temporarily unavailable" is not exactly right. There are retries and timeouts. Files do not just disappear for no reason. Yes, given enough time or issues, the RS will eventually kill itself. We can do better here but in the end any database that loses its file system is goin

Re: question on Scan.setStopRow

2010-09-14 Thread John Sichi
Thanks guys! We'll certainly need InclusiveStopFilter once we support range scans with closed endpoints. JVS From: Jonathan Gray <jg...@facebook.com> Date: September 13, 2010 10:34:23 AM PDT To: "user@hbase.apache.org" <user@hbase.apache.org> Subject
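
For readers following this thread: the stop row of a scan is exclusive, which is why a closed endpoint needs something like InclusiveStopFilter. The following is a plain-Python model of that behaviour, not the HBase client API; `scan` and `closed_to_half_open` are hypothetical names, and row keys are compared as unsigned byte strings in lexicographic order.

```python
def scan(rows, start, stop, inclusive_stop=False):
    """Yield (key, value) pairs in key order over [start, stop) or [start, stop].

    `rows` is a dict mapping bytes keys to values; this only models a scan.
    """
    for key in sorted(rows):
        if key < start:
            continue
        if key > stop or (key == stop and not inclusive_stop):
            break
        yield key, rows[key]


def closed_to_half_open(stop):
    """Return the smallest byte string strictly greater than `stop`.

    Using it as an exclusive stop row makes the original `stop` inclusive.
    """
    return stop + b"\x00"


if __name__ == "__main__":
    table = {b"a": 1, b"b": 2, b"c": 3}
    print(list(scan(table, b"a", b"c")))                       # excludes b"c"
    print(list(scan(table, b"a", closed_to_half_open(b"c"))))  # includes b"c"
```

Appending a zero byte works because no key can sort strictly between `stop` and `stop + b"\x00"`.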

RE: Failure in truncating table

2010-09-14 Thread Sharma, Avani
Thanks, Jimmy. The way I achieved step 2 is by running add_table.rb. That script essentially deletes the regions from .META for a given user table and then adds them again from their HDFS location. Since we have already deleted all data from hdfs for that table (keeping just the directory name)

RE: Region server shutdown after putting millions of rows

2010-09-14 Thread Jonathan Gray
Hey Matthew, Thanks for having the patience to stick with it and work through the issues. If you had issues that you got solutions to, it's always helpful to make sure it's documented somewhere on the wiki. Robustness is, of course, a primary goal for HBase. I agree with many of your points

Re: Region server shutdown after putting millions of rows

2010-09-14 Thread Matthew LeMieux
Hello Jonathan, Please don't take my post the wrong way. I am very happy with HBase. Thank you for being so detailed in your response. I will try to do the same. You are right that I originally started posting on this list because of problems I encountered with an EC2 based cluster,

Re: hbase and hdfs block sizes

2010-09-14 Thread Abhijit Pol
Thanks Ryan. Got the point on 64k HBase block size. Can you add more on the negative impact of smaller HDFS block sizes? Larger HDFS blocks are great for batch ops, but for random reads, wouldn't making the HDFS block size closer to the HBase block size help, so that any block cache miss fetches around 64k rather than

RE: Region server shutdown after putting millions of rows

2010-09-14 Thread Jonathan Gray
> * The most common reason I've had for Region Server Suicide is > zookeeper. The region server thinks zookeeper is down. I thought this > had to do with heavy load, but this also happens for me even when there > is nothing running. I haven't been able to find a quantifiable cause. > This is jus

Re: hbase standalone cannot start master, cannot assign requested address at port 60000

2010-09-14 Thread Todd Lipcon
Hi Michael, It might be related to IPV6. Do you have IPV6 enabled on this machine? Check out this hadoop JIRA that might be related for some tips: https://issues.apache.org/jira/browse/HADOOP-6056 -Todd On Tue, Sep 14, 2010 at 10:17 AM, Michael

Re: hbase standalone cannot start master, cannot assign requested address at port 60000

2010-09-14 Thread Michael Scott
That's correct. I tried a number of different ports to see if there was something weird, and then I shut down the hadoop server and tried to connect to 50010 (which of course should have been free at that point) but got the same "cannot assign to requested address" error. If I start hadoop, netst

Re: HBase MR: run more map tasks than regions

2010-09-14 Thread Stack
On Tue, Sep 14, 2010 at 10:10 AM, Alex Baranau wrote: > Is the only way for me to enhance TableInputFormat? > Currently, yes, you must enhance TIF or use an alternate TIF. St.Ack

HBase MR: run more map tasks than regions

2010-09-14 Thread Alex Baranau
Hello, As far as I know, the number of map tasks for a "scan-based" mapreduce job is equal to (not more than) the number of underlying regions (for the scan), provided the max map task capacity is big enough. I have a situation where map-side processing is very heavy but uses quite a small number of records

Re: hbase standalone cannot start master, cannot assign requested address at port 60000

2010-09-14 Thread Stack
On Tue, Sep 14, 2010 at 9:33 AM, Michael Scott wrote: > I don't see why hadoop binds > to a port but hbase does not (I even tried starting hbase with hadoop off > and binding to 50010, which hadoop uses). > Using 50010 worked for hadoop but not for hbase? (Odd. We use their mechanism essenti

Re: hbase standalone cannot start master, cannot assign requested address at port 60000

2010-09-14 Thread Michael Scott
Thanks again. Don't worry, we're not exposing to the outside world, I was just clarifying that the IP address exists and takes connections, both internal and external, on other ports. I will see if I can figure out why it is choking on the 60000 port. I'm not much of an expert on this, I know to

Re: Region server shutdown after putting millions of rows

2010-09-14 Thread Matthew LeMieux
I've dealt with dozens of spontaneous shutdowns in recent weeks. (We call them Region Server Suicides.) The files problem is where the OS (i.e. Linux) limits the number of files a user can open at one time. A common default of 1024 isn't enough for HBase. Based purely on empirical evidence, y
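
The open-files limit mentioned here can be inspected from Python's standard `resource` module (Unix only). This is a minimal sketch: `nofile_limits` and `is_too_low` are hypothetical helper names, and the 1024 threshold is simply the default quoted in the message, not a recommended value. Run it as the same user that starts the region server, since limits are per user and per process.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def is_too_low(soft_limit, threshold=1024):
    """Report whether the soft limit is at or below `threshold`.

    The threshold default mirrors the 1024 figure from the message above;
    it is an illustrative assumption, not an official HBase setting.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return False  # unlimited
    return soft_limit <= threshold


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft limit:", soft, "hard limit:", hard)
    if is_too_low(soft):
        print("open-files limit is at or below the common 1024 default")
```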

RE: how about zookeeper overhead?

2010-09-14 Thread Buttler, David
I think the standard advice is to use only one zk node for clusters of size < 10, and to collocate it with the namenode. So, I would suggest changing your config to: 1 master + NN + ZK; 1 client (doing heavy put & get); 6 RS+DN. The reason you want to have an odd number of zk nodes is because zoo
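
The snippet is cut off before the explanation. The usual reasoning, offered here as background rather than as the author's words, is that a ZooKeeper ensemble stays available only while a strict majority of its nodes is up, so an even-sized ensemble tolerates no more failures than the next smaller odd one. A small sketch of that arithmetic (function names are illustrative):

```python
def quorum_size(ensemble_size):
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many nodes can fail while a strict majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, "nodes: quorum", quorum_size(n),
              "tolerates", tolerated_failures(n), "failure(s)")
```

Under this rule, 3 and 4 nodes both tolerate a single failure, so the fourth node adds cost without adding fault tolerance.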

Re: Hbase read performance with increasing number of client threads

2010-09-14 Thread Ryan Rawson
Yes, it's all about the block cache. IN_MEMORY is a useful tool as well, but be careful: you can choke out other regions/tables. -ryan On Tue, Sep 14, 2010 at 12:07 AM, Abhijit Pol wrote: > @Ryan > when you mentioned caching and lots of RAM, were you referring to giving it to the block > cache or memstore? > >

Re: Hbase read performance with increasing number of client threads

2010-09-14 Thread Abhijit Pol
@Ryan when you mentioned caching and lots of RAM, were you referring to giving it to the block cache or memstore? We have a table with two column families A and B. For column family A we have set "IN_MEMORY" ==> true and we have multiple 64GB RAM machines where we would like to hold this column family in RAM fo