Yeah, it looks like after I fixed the servers to work with ZooKeeper, HDFS got hosed! Restarting HDFS fixed everything!
Thanks.

Ananth T Sarathy

On Wed, Jan 13, 2010 at 3:34 PM, Jean-Daniel Cryans <[email protected]> wrote:
> Oh I see something, it seems that the master is waiting on the file
> system in the main thread. Is HDFS running? Can you create a file?
>
> J-D
>
> On Wed, Jan 13, 2010 at 12:27 PM, Ananth T. Sarathy <[email protected]> wrote:
>> Here's what I get:
>>
>> http://pastebin.com/m60c1864b
>>
>> Ananth T Sarathy
>>
>> On Wed, Jan 13, 2010 at 2:57 PM, Jean-Daniel Cryans <[email protected]> wrote:
>>> Do "jps" then "jstack pid" with the master's pid given by jps.
>>>
>>> J-D
>>>
>>> On Wed, Jan 13, 2010 at 11:41 AM, Ananth T. Sarathy <[email protected]> wrote:
>>>> Well, when I do a ps -ef|grep hbase I have 3 processes running.
>>>> I have killed them all, reinstalled HBase, formatted my name node,
>>>> and still the master.log is the same when I restart. What could be
>>>> causing it to hang?
>>>>
>>>> Ananth T Sarathy
>>>>
>>>> On Wed, Jan 13, 2010 at 2:26 PM, Jean-Daniel Cryans <[email protected]> wrote:
>>>>> Well it's just weird that your master would just "refuse" to
>>>>> start. Is the process still there? If you jstack it, is there any
>>>>> thread running?
>>>>>
>>>>> You could also clean up everything and retry, but that's just the
>>>>> easy way out :P
>>>>>
>>>>> J-D
>>>>>
>>>>> On Wed, Jan 13, 2010 at 11:23 AM, Ananth T. Sarathy <[email protected]> wrote:
>>>>>> master.out is empty.... Could something have cludged up from the
>>>>>> previous issues? Are there files I should delete, or should I
>>>>>> reformat my namenode?
>>>>>>
>>>>>> I don't have any data yet in these, so I can afford to blow
>>>>>> things away, but I cleaned out the tmp dir already so I am not
>>>>>> sure what else I need to do.
>>>>>>
>>>>>> Ananth T Sarathy
>>>>>>
>>>>>> On Wed, Jan 13, 2010 at 2:14 PM, Jean-Daniel Cryans <[email protected]> wrote:
>>>>>>> If that's everything from your master log, then I would suggest
>>>>>>> you take a look at the .out file (instead of .log) since it
>>>>>>> might be a problem on startup.
>>>>>>>
>>>>>>> J-D
>>>>>>>
>>>>>>> On Wed, Jan 13, 2010 at 11:09 AM, Ananth T. Sarathy <[email protected]> wrote:
>>>>>>>> Master log
>>>>>>>> http://pastebin.com/m469d1b39
>>>>>>>>
>>>>>>>> Zookeeper log
>>>>>>>> http://pastebin.com/m47f0503
>>>>>>>>
>>>>>>>> Region server
>>>>>>>> http://pastebin.com/m305fab14
>>>>>>>>
>>>>>>>> Ananth T Sarathy
>>>>>>>>
>>>>>>>> On Wed, Jan 13, 2010 at 2:02 PM, Jean-Daniel Cryans <[email protected]> wrote:
>>>>>>>>> Looks like your master didn't register itself in zookeeper,
>>>>>>>>> you should look in its log.
>>>>>>>>>
>>>>>>>>> J-D
>>>>>>>>>
>>>>>>>>> On Wed, Jan 13, 2010 at 10:59 AM, Ananth T. Sarathy <[email protected]> wrote:
>>>>>>>>>> OK, we got that to work and zookeeper is coming up, but now
>>>>>>>>>> I am getting something else... the regionservers aren't
>>>>>>>>>> connecting because of
>>>>>>>>>>
>>>>>>>>>> 2010-01-13 13:57:56,029 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to read master address from ZooKeeper. Retrying. Error was:
>>>>>>>>>> java.io.IOException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/master
>>>>>>>>>>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.readAddressOrThrow(ZooKeeperWrapper.java:332)
>>>>>>>>>>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.readMasterAddressOrThrow(ZooKeeperWrapper.java:240)
>>>>>>>>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.getMaster(HRegionServer.java:1339)
>>>>>>>>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:1371)
>>>>>>>>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:427)
>>>>>>>>>>     at java.lang.Thread.run(Thread.java:636)
>>>>>>>>>> Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/master
>>>>>>>>>>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>>>>>>>>>>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>>>>>>>>>>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:892)
>>>>>>>>>>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.readAddressOrThrow(ZooKeeperWrapper.java:328)
>>>>>>>>>>     ... 5 more
>>>>>>>>>>
>>>>>>>>>> Any ideas?
>>>>>>>>>>
>>>>>>>>>> Ananth T Sarathy
>>>>>>>>>>
>>>>>>>>>> On Wed, Jan 13, 2010 at 12:52 PM, Jean-Daniel Cryans <[email protected]> wrote:
>>>>>>>>>>> HBase 0.20.2 and previous only checked one address against
>>>>>>>>>>> the list that is provided, the one returned was the default
>>>>>>>>>>> Java knew of. It seems that in your case your /etc/hosts
>>>>>>>>>>> makes it that this machine resolves itself only as
>>>>>>>>>>> localhost. You can:
>>>>>>>>>>>
>>>>>>>>>>> 1) Try to fix your network configuration to have your
>>>>>>>>>>> machine always resolve by its hostname first, or
>>>>>>>>>>>
>>>>>>>>>>> 2) Use HBase 0.20.3RC1 which contains a fix that tries
>>>>>>>>>>> harder to match the address. You can get it here:
>>>>>>>>>>> http://people.apache.org/~jdcryans/hbase-0.20.3-candidate-1/
>>>>>>>>>>>
>>>>>>>>>>> Sorry for that,
>>>>>>>>>>>
>>>>>>>>>>> J-D
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jan 13, 2010 at 9:43 AM, Ananth T. Sarathy <[email protected]> wrote:
>>>>>>>>>>>> I have Hbase.env set to manage Zookeeper. When I try to
>>>>>>>>>>>> start hbase, the zookeeper out says
>>>>>>>>>>>>
>>>>>>>>>>>> java.io.IOException: Could not find my address: localhost in list of ZooKeeper quorum servers
>>>>>>>>>>>>     at org.apache.hadoop.hbase.zookeeper.HQuorumPeer.writeMyID(HQuorumPeer.java:128)
>>>>>>>>>>>>     at org.apache.hadoop.hbase.zookeeper.HQuorumPeer.main(HQuorumPeer.java:67)
>>>>>>>>>>>>
>>>>>>>>>>>> in my hbase-site.xml
>>>>>>>>>>>>
>>>>>>>>>>>> <property>
>>>>>>>>>>>>   <name>hbase.zookeeper.quorum</name>
>>>>>>>>>>>>   <value>gs2,gs3,gs4</value>
>>>>>>>>>>>>   <description>Comma separated list of servers in the ZooKeeper Quorum.
>>>>>>>>>>>>   For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
>>>>>>>>>>>>   By default this is set to localhost for local and pseudo-distributed modes
>>>>>>>>>>>>   of operation. For a fully-distributed setup, this should be set to a full
>>>>>>>>>>>>   list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
>>>>>>>>>>>>   this is the list of servers which we will start/stop ZooKeeper on.
>>>>>>>>>>>>   </description>
>>>>>>>>>>>> </property>
>>>>>>>>>>>>
>>>>>>>>>>>> in my /etc/hosts
>>>>>>>>>>>>
>>>>>>>>>>>> # hostname gs2 added to /etc/hosts by anaconda
>>>>>>>>>>>> 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 gs2
>>>>>>>>>>>> ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 gs2
>>>>>>>>>>>>
>>>>>>>>>>>> 192.168.20.101 gs1
>>>>>>>>>>>> 192.168.20.102 gs2
>>>>>>>>>>>> 192.168.20.103 gs3
>>>>>>>>>>>> 192.168.20.104 gs4
>>>>>>>>>>>> 192.168.20.105 gs5
>>>>>>>>>>>> 192.168.20.106 gs6
>>>>>>>>>>>> 192.168.20.107 gs7
>>>>>>>>>>>> 192.168.20.108 gs8
>>>>>>>>>>>> 192.168.20.110 gs10
>>>>>>>>>>>> 192.168.20.111 gs11
>>>>>>>>>>>> 192.168.20.112 gs12
>>>>>>>>>>>> 192.168.20.113 gs13
>>>>>>>>>>>> 192.168.20.114 gs14
>>>>>>>>>>>> 192.168.20.115 gs15
>>>>>>>>>>>> 192.168.20.116 gs16
>>>>>>>>>>>> 192.168.20.117 gs17
>>>>>>>>>>>>
>>>>>>>>>>>> Am I missing something here? Why does it insist on
>>>>>>>>>>>> localhost in the quorum list? What do I need to do to
>>>>>>>>>>>> unconfuse it?
>>>>>>>>>>>>
>>>>>>>>>>>> Ananth T Sarathy
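
For anyone who lands on this thread with the same "Could not find my address: localhost" error: one rough way to spot the /etc/hosts situation described above (a quorum hostname such as gs2 also listed on the 127.0.0.1 and ::1 lines) is to scan the hosts file for quorum members that alias a loopback address. This is a minimal Python sketch and not HBase's actual check. The function names and the trimmed HOSTS sample are illustrative assumptions, and it only parses text rather than doing real name resolution.

```python
import ipaddress


def loopback_aliases(hosts_text):
    """Return the set of names that a hosts file maps to a loopback address."""
    names = set()
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not an IP address, skip the line
        if addr.is_loopback:  # true for 127.0.0.0/8 and ::1
            names.update(fields[1:])
    return names


def quorum_hosts_on_loopback(hosts_text, quorum):
    """List quorum members (comma-separated string) that also alias loopback."""
    members = [h.strip() for h in quorum.split(",") if h.strip()]
    aliases = loopback_aliases(hosts_text)
    return [h for h in members if h in aliases]


# Trimmed sample modelled on the /etc/hosts quoted in the thread (assumption:
# the two loopback entries were each a single line before mail wrapping).
HOSTS = """\
# hostname gs2 added to /etc/hosts by anaconda
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 gs2
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 gs2
192.168.20.102 gs2
192.168.20.103 gs3
192.168.20.104 gs4
"""

print(quorum_hosts_on_loopback(HOSTS, "gs2,gs3,gs4"))  # prints ['gs2']
```

On a real machine you could pass open("/etc/hosts").read() instead of the sample string. Any name it reports is one the machine may resolve to localhost, which is the case option 1 in J-D's reply suggests fixing.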
