Thanks, everyone.
Yes, I removed the root directory after the HBase instance was shut
down. The cluster topology now looks like this:
hmaster : 192.168.0.178
regionserver: 192.168.1.98/100/104
zookeepers: 192.168.1.98/100/104
The instance now boots, though with a few exceptions.
Immediately after the start-hbase.sh command there was an
exception from ZooKeeper; it looks like this:
KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
for /hbase/master
at org.apache......
I checked the master log and the master .out log. It seems the
Master has a problem closing the session after initializing the
ZooKeeper clients: there is always an IOException about
"Read error rc" each time a client session ends.
And that is just the start: after that, a
KeeperException$ConnectionLossException is thrown.
After all these exceptions the cluster does start.
Following is the master log:
2010-01-13 10:38:25,650 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=192.168.1.104:36963,192.168.1.100:36963,192.168.1.98:36963 sessionTimeout=60000 watcher=Thread[Thread-1,5,main]
2010-01-13 10:38:25,651 INFO org.apache.zookeeper.ClientCnxn: zookeeper.disableAutoWatchReset is false
2010-01-13 10:38:25,657 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server /192.168.1.100:36963
2010-01-13 10:38:25,658 INFO org.apache.zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/192.168.0.178:56731 remote=/192.168.1.100:36963]
2010-01-13 10:38:25,667 INFO org.apache.zookeeper.ClientCnxn: Server connection successful
2010-01-13 10:38:25,669 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x0 to sun.nio.ch.selectionkeyi...@ba5bdb
java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
    at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:701)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:945)
2010-01-13 10:38:25,671 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown input
java.net.SocketException: Transport endpoint is not connected
    at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
    at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
    at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
    at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:999)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
2010-01-13 10:38:25,671 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown output
java.net.SocketException: Transport endpoint is not connected
    at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
    at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
    at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
    at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1004)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
2010-01-13 10:38:25,673 INFO org.apache.hadoop.hbase.master.RegionManager: -ROOT- region unset (but not set to be reassigned)
2010-01-13 10:38:25,674 INFO org.apache.hadoop.hbase.master.RegionManager: ROOT inserted into regionsInTransition
2010-01-13 10:38:26,243 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server /192.168.1.104:36963
2010-01-13 10:38:26,244 INFO org.apache.zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/192.168.0.178:45372 remote=/192.168.1.104:36963]
2010-01-13 10:38:26,244 INFO org.apache.zookeeper.ClientCnxn: Server connection successful
2010-01-13 10:38:26,248 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x0 to sun.nio.ch.selectionkeyi...@1e7c5cb
java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
    at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:701)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:945)
2010-01-13 10:38:26,251 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown input
java.net.SocketException: Transport endpoint is not connected
......
2010-01-13 10:38:26,251 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown output
java.net.SocketException: Transport endpoint is not connected
......
2010-01-13 10:38:26,353 WARN org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Failed to create /hbase -- check quorum servers, currently=192.168.1.104:36963,192.168.1.100:36963,192.168.1.98:36963
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
    at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:608)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.ensureExists(ZooKeeperWrapper.java:343)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.ensureParentExists(ZooKeeperWrapper.java:366)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.writeMasterAddress(ZooKeeperWrapper.java:454)
    at org.apache.hadoop.hbase.master.HMaster.writeAddressToZooKeeper(HMaster.java:272)
    at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:254)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.hbase.master.HMaster.doMain(HMaster.java:1218)
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1259)
2010-01-13 10:38:26,383 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server /192.168.1.98:36963
2010-01-13 10:38:26,384 INFO org.apache.zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/192.168.0.178:53192 remote=/192.168.1.98:36963]
2010-01-13 10:38:26,384 INFO org.apache.zookeeper.ClientCnxn: Server connection successful
2010-01-13 10:38:26,419 DEBUG org.apache.hadoop.hbase.master.HMaster: Got event None with path null
2010-01-13 10:38:26,423 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode /hbase/master got 192.168.0.178:60000
2010-01-13 10:38:26,423 DEBUG org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Waiting for master address ZNode to be deleted and watching the cluster state node
2010-01-13 10:39:05,032 DEBUG org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event NodeDeleted with path /hbase/master
2010-01-13 10:39:05,032 DEBUG org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Master address ZNode deleted, notifying waiting masters
2010-01-13 10:39:05,092 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Wrote master address 192.168.0.178:60000 to ZooKeeper
2010-01-13 10:39:05,096 WARN org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Failed to set state node in ZooKeeper
org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /hbase/shutdown
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
    at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:608)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.setClusterState(ZooKeeperWrapper.java:279)
    at org.apache.hadoop.hbase.master.HMaster.writeAddressToZooKeeper(HMaster.java:273)
    at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:254)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.hbase.master.HMaster.doMain(HMaster.java:1218)
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1259)
2010-01-13 10:39:05,097 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode /hbase/master got 192.168.0.178:60000
2010-01-13 10:39:05,097 INFO org.apache.hadoop.hbase.master.HMaster: HMaster initialized on 192.168.0.178:60000
2010-01-13 10:39:05,097 DEBUG org.apache.hadoop.hbase.master.HMaster: Checking cluster state...
2010-01-13 10:39:05,098 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode /hbase/root-region-server got 192.168.1.104:60020
2010-01-13 10:39:05,102 DEBUG org.apache.hadoop.hbase.master.HMaster: This is a fresh start, proceeding with normal startup
2010-01-13 10:39:05,104 DEBUG org.apache.hadoop.hbase.master.HMaster: No log files to split, proceeding...
2010-01-13 10:39:05,106 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=Master, sessionId=HMaster
2010-01-13 10:39:05,106 INFO org.apache.hadoop.hbase.master.metrics.MasterMetrics: Initialized
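
The "check quorum servers" warning in the log above can be followed up by
probing each ZooKeeper server directly from the master machine. What follows
is only a minimal sketch in Python and is not part of HBase: the function
names parse_connect_string, probe and check_quorum are made up for
illustration. It assumes the servers answer ZooKeeper's "ruok" four-letter
command (a healthy server replies "imok"); some newer ZooKeeper releases only
answer it when the command has been enabled explicitly.

```python
import socket


def parse_connect_string(connect_string):
    """Split a ZooKeeper connect string into (host, port) pairs.

    Assumes every entry has the form host:port, as in the master log.
    """
    servers = []
    for entry in connect_string.split(","):
        entry = entry.strip()  # tolerate spaces or line wraps after commas
        if not entry:
            continue
        host, _, port = entry.rpartition(":")
        servers.append((host, int(port)))
    return servers


def probe(host, port, timeout=3.0):
    """Send 'ruok' to one server and return its reply or an error string."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = sock.recv(16)
    except OSError as exc:
        # Covers refused connections, timeouts and unreachable hosts.
        return "error: %s" % exc
    return reply.decode("ascii", "replace") or "(closed without reply)"


def check_quorum(connect_string):
    """Probe every server in the connect string; return {"host:port": result}."""
    results = {}
    for host, port in parse_connect_string(connect_string):
        results["%s:%d" % (host, port)] = probe(host, port)
    return results
```

Run on the master, check_quorum("192.168.1.104:36963,192.168.1.100:36963,192.168.1.98:36963")
returns one entry per quorum member. A server that accepts the connection but
gives no reply or an error would be consistent with the "Read error rc = -1"
warnings, though this check alone does not identify the cause.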
On Thu, Jan 14, 2010 at 3:57 AM, Andrew Purtell <[email protected]> wrote:
> Yep, I did that same thing once by accident.
>
>
>
> ----- Original Message ----
> > From: Jean-Daniel Cryans <[email protected]>
> > To: [email protected]
> > Sent: Wed, January 13, 2010 9:49:06 AM
> > Subject: Re: cannot build a fully distributed mode hbase instance.
> >
> > Don't feel bad, I think we all messed up our first HBase setup.
> >
> > Did you delete /hbase while HBase was running? If so, first shut it
> > down/kill -9, clear out the folder and then the Master will take care
> > of recreating the ROOT and META on restart.
> >
> > J-D
> >
> > On Tue, Jan 12, 2010 at 6:03 PM, steven zhuang wrote:
> > > hi, Jean.
> > > Thanks a lot.
> > > I am really an idiot with HBase.
> > > I removed the /hbase root directory from HDFS once, hoping it
> > > would rebuild the whole META-regions thing. Then I found the exception
> > > is still there every time I use the shell command.
> > > Before anything else, I have one question: is it OK to run the
> > > hbase shell command on any slave/region server?
> > > I have checked the log; it seems the master asks the wrong
> > > regionserver for a region it is not serving:
> > >
> > > 2010-01-12 20:25:11,996 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020, call getRegionInfo([...@dc9766) from 192.168.1.98:55351: error: org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
> > > org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
> > >     at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2309)
> > >
> > > I am still analyzing the master log; for the most recent start
> > > there seem to be no exception records in the log.
> > >
> > >
> > >
> > >
> > > On Wed, Jan 13, 2010 at 9:39 AM, Jean-Daniel Cryans wrote:
> > >
> > >> It seems it found the ROOT region but META wasn't assigned. Either you
> > >> didn't wait long enough after starting HBase, or you should look at the
> > >> master's log for the reason why that region wasn't assigned.
> > >>
> > >> J-D
> > >>
> > >> On Tue, Jan 12, 2010 at 5:36 PM, steven zhuang wrote:
> > >> > That's done, thanks, Jean.
> > >> >
> > >> > But now there is another problem. I can start the cluster
> > >> > without any exception (good!), but on any node, when I run
> > >> > list/create, I always get this exception, although when I check
> > >> > afterwards the table has been created.
> > >> >
> > >> > 10/01/12 20:25:16 DEBUG client.HConnectionManager$TableServers: Found ROOT at 192.168.1.104:60020
> > >> > 10/01/12 20:25:16 DEBUG client.HConnectionManager$TableServers: locateRegionInMeta attempt 0 of 5 failed; retrying after sleep of 2000
> > >> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in -ROOT- for region .META.,,1
> > >> >     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:668)
> > >> >     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:590)
> > >> >     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:563)
> > >> >     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionLocation(HConnectionManager.java:407)
> > >> >
> > >> >
> > >> >
> > >> > On Wed, Jan 13, 2010 at 8:57 AM, Jean-Daniel Cryans wrote:
> > >> >
> > >> >> Just make sure your OS doesn't resolve itself as 127.0.0.1. The usual
> > >> >> suspect, if you are using Ubuntu, is /etc/hosts: look at it and make
> > >> >> sure your hostname resolves to your IP.
> > >> >>
> > >> >> J-D
> > >> >>
> > >> >> On Tue, Jan 12, 2010 at 4:52 PM, steven zhuang wrote:
> > >> >> > thanks, Jean,
> > >> >> > I figured that out: in the netstat output I can see
> > >> >> > 127.0.0.1:60000. I don't know whether this means it only listens
> > >> >> > for connection requests from the same machine.
> > >> >> > About the hbase.master configuration, is there anything I
> > >> >> > can use to replace it?
> > >> >> >
> > >> >> >
> > >> >> > On Wed, Jan 13, 2010 at 1:36 AM, Jean-Daniel Cryans <[email protected]> wrote:
> > >> >> >
> > >> >> >> > 10/01/11 21:16:46 DEBUG zookeeper.ZooKeeperWrapper: Read ZNode /hbase/master got 127.0.1.1:60000
> > >> >> >>
> > >> >> >> This means that your master registered itself in ZooKeeper as
> > >> >> >> 127.0.1.1, a loopback address; you seem to have a network
> > >> >> >> configuration problem.
> > >> >> >>
> > >> >> >> Also the hbase.master configuration is deprecated and unused.
> > >> >> >>
> > >> >> >> J-D
> > >> >> >>
> > >> >> >> On Tue, Jan 12, 2010 at 6:16 AM, steven zhuang <[email protected]> wrote:
> > >> >> >> > hello, list,
> > >> >> >> >
> > >> >> >> > I am now setting up an HBase cluster using HBase version
> > >> >> >> > 0.20.2, but I have run into some problems that I googled a lot
> > >> >> >> > for without finding an answer.
> > >> >> >> > Please help me.
> > >> >> >> >
> > >> >> >> > I modified hbase-site.xml and copied the whole directory to
> > >> >> >> > another machine.
> > >> >> >> > Using one machine as the master, after I started the HBase
> > >> >> >> > server I CAN see HMaster / HQuorumPeer / HRegionServer running
> > >> >> >> > on the master, and HQuorumPeer / HRegionServer running on the
> > >> >> >> > slave node.
> > >> >> >> > Here is what's weird:
> > >> >> >> > I can enter the hbase shell on the master node, but on the
> > >> >> >> > other region server I cannot execute any command; a "list"
> > >> >> >> > command causes a series of exceptions.
> > >> >> >> >
> > >> >> >> > 10/01/11 21:16:46 DEBUG client.HConnectionManager$ClientZKWatcher: Got ZooKeeper event, state: SyncConnected, type: None, path: null
> > >> >> >> > 10/01/11 21:16:46 DEBUG zookeeper.ZooKeeperWrapper: Read ZNode /hbase/master got 127.0.1.1:60000
> > >> >> >> > 10/01/11 21:16:46 INFO client.HConnectionManager$TableServers: getMaster attempt 0 of 5 failed; retrying after sleep of 2000
> > >> >> >> > java.net.ConnectException: Connection refused
> > >> >> >> >     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> > >> >> >> >     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
> > >> >> >> >     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
> > >> >> >> >     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
> > >> >> >> >
> > >> >> >> > I can create a table in the master node's HBase shell, but
> > >> >> >> > sometimes there is an exception like:
> > >> >> >> > 10/01/12 06:08:15 DEBUG client.HConnectionManager$TableServers: locateRegionInMeta attempt 2 of 5 failed; retrying after sleep of 2000
> > >> >> >> > org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region t3,,1263305290760
> > >> >> >> >     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:668)
> > >> >> >> >     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:594)
> > >> >> >> >     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:557)
> > >> >> >> >
> > >> >> >> > But after this I can use list to see that the table HAS BEEN
> > >> >> >> > BUILT inside HDFS.
> > >> >> >> >
> > >> >> >> > The hbase-site.xml I used:
> > >> >> >> >
> > >> >> >> > ------------------------------------------------------------
> > >> >> >> > <configuration>
> > >> >> >> >   <property>
> > >> >> >> >     <name>hbase.rootdir</name>
> > >> >> >> >     <value>hdfs://sz:8998/hbase</value>
> > >> >> >> >   </property>
> > >> >> >> >
> > >> >> >> >   <property>
> > >> >> >> >     <name>hbase.cluster.distributed</name>
> > >> >> >> >     <value>true</value>
> > >> >> >> >   </property>
> > >> >> >> >
> > >> >> >> >   <property>
> > >> >> >> >     <name>hbase.master</name>
> > >> >> >> >     <value>sz:60000</value>
> > >> >> >> >   </property>
> > >> >> >> >
> > >> >> >> >   <property>
> > >> >> >> >     <name>hbase.tmp.dir</name>
> > >> >> >> >     <value>/home/steven/data/hbase-${user.name}</value>
> > >> >> >> >   </property>
> > >> >> >> >
> > >> >> >> >   <property>
> > >> >> >> >     <name>hbase.zookeeper.property.dataDir</name>
> > >> >> >> >     <value>${hbase.tmp.dir}/zookeeper</value>
> > >> >> >> >   </property>
> > >> >> >> >
> > >> >> >> >   <property>
> > >> >> >> >     <name>hbase.zookeeper.quorum</name>
> > >> >> >> >     <value>sz,hadoop3</value>
> > >> >> >> >   </property>
> > >> >> >> >
> > >> >> >> >   <property>
> > >> >> >> >     <name>hbase.zookeeper.peerport</name>
> > >> >> >> >     <value>2888</value>
> > >> >> >> >   </property>
> > >> >> >> >
> > >> >> >> >   <property>
> > >> >> >> >     <name>hbase.zookeeper.leaderport</name>
> > >> >> >> >     <value>3888</value>
> > >> >> >> >   </property>
> > >> >> >> > </configuration>
> > >> >> >> > ------------------------------------------------------------
> > >> >> >> >
> > >> >> >> >
> > >> >> >> > --
> > >> >> >> > best wishes.
> > >> >> >> > steven
> > >> >> >> >
--
best wishes.
steven