Hi Benoit,

I think you need to move the directory out of "/tmp" and give it a shot. /tmp/hbase-${user.name}/zk will get cleaned up during restart.
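For illustration, moving both directories out of /tmp might look roughly like this in hbase-site.xml. This is only a sketch: /Users/${user.name}/hbase-data is a made-up location (assuming an OSX home directory), and any directory that is not cleared on reboot should work the same way.

```xml
<configuration>
  <!-- Hypothetical path: any directory that survives reboots will do. -->
  <property>
    <name>hbase.rootdir</name>
    <value>file:///Users/${user.name}/hbase-data</value>
  </property>
  <!-- Keep the ZooKeeper data next to it, also outside /tmp (assumed layout). -->
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/Users/${user.name}/hbase-data/zk</value>
  </property>
</configuration>
```

Data already sitting under the old /tmp location would not be carried over automatically, so this effectively starts from a clean state unless the old directories are copied first.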
~Anil

On Mon, May 18, 2015 at 9:45 PM, tsuna <tsuna...@gmail.com> wrote:

> I added this to hbase-site.xml:
>
> <property>
>   <name>hbase.zookeeper.property.dataDir</name>
>   <value>/tmp/hbase-${user.name}/zk</value>
> </property>
>
> Didn’t change anything. Once I kill/shutdown HBase, it won’t come back up.
>
> On Mon, May 18, 2015 at 1:14 AM, Viral Bajaria <viral.baja...@gmail.com> wrote:
> > Same for me, I had faced similar issues especially on my virtual machines
> > since I would restart them more often than my host machine.
> >
> > Moving ZK from /tmp which could get cleared on reboots fixed the issue for
> > me.
> >
> > Thanks,
> > Viral
> >
> >
> > On Sun, May 17, 2015 at 10:39 PM, Lars George <lars.geo...@gmail.com> wrote:
> >
> >> I noticed similar ZK related issues but those went away after changing the
> >> ZK directory to a permanent directory along with the HBase root directory.
> >> Both point now to a location in my home folder and restarts work fine now.
> >> Not much help but wanted to at least state that.
> >>
> >> Lars
> >>
> >> Sent from my iPhone
> >>
> >> > On 18 May 2015, at 05:55, tsuna <tsuna...@gmail.com> wrote:
> >> >
> >> > Hi all,
> >> > For testing on my laptop (OSX with JDK 1.7.0_45) I usually build the
> >> > latest version from branch-1.0 and use the following config:
> >> >
> >> > <configuration>
> >> >   <property>
> >> >     <name>hbase.rootdir</name>
> >> >     <value>file:///tmp/hbase-${user.name}</value>
> >> >   </property>
> >> >   <property>
> >> >     <name>hbase.online.schema.update.enable</name>
> >> >     <value>true</value>
> >> >   </property>
> >> >   <property>
> >> >     <name>zookeeper.session.timeout</name>
> >> >     <value>300000</value>
> >> >   </property>
> >> >   <property>
> >> >     <name>hbase.zookeeper.property.tickTime</name>
> >> >     <value>2000000</value>
> >> >   </property>
> >> >   <property>
> >> >     <name>hbase.zookeeper.dns.interface</name>
> >> >     <value>lo0</value>
> >> >   </property>
> >> >   <property>
> >> >     <name>hbase.regionserver.dns.interface</name>
> >> >     <value>lo0</value>
> >> >   </property>
> >> >   <property>
> >> >     <name>hbase.master.dns.interface</name>
> >> >     <value>lo0</value>
> >> >   </property>
> >> > </configuration>
> >> >
> >> > Since at least a month ago (perhaps longer, I don’t remember exactly)
> >> > I can’t restart HBase.
> >> > The very first time it starts up fine, but
> >> > subsequent startup attempts all fail with:
> >> >
> >> > 2015-05-17 20:39:19,024 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: starting
> >> > 2015-05-17 20:39:19,024 INFO [RpcServer.listener,port=49809] ipc.RpcServer: RpcServer.listener,port=49809: starting
> >> > 2015-05-17 20:39:19,029 INFO [main] http.HttpRequestLog: Http request log for http.requests.regionserver is not defined
> >> > 2015-05-17 20:39:19,030 INFO [main] http.HttpServer: Added global filter 'safety' (class=org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter)
> >> > 2015-05-17 20:39:19,031 INFO [main] http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context regionserver
> >> > 2015-05-17 20:39:19,031 INFO [main] http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
> >> > 2015-05-17 20:39:19,031 INFO [main] http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
> >> > 2015-05-17 20:39:19,033 INFO [main] http.HttpServer: Jetty bound to port 49811
> >> > 2015-05-17 20:39:19,033 INFO [main] mortbay.log: jetty-6.1.26
> >> > 2015-05-17 20:39:19,157 INFO [main] mortbay.log: Started SelectChannelConnector@0.0.0.0:49811
> >> > 2015-05-17 20:39:19,222 INFO [M:0;localhost:49807] zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x4f708099 connecting to ZooKeeper ensemble=localhost:2181
> >> > 2015-05-17 20:39:19,222 INFO [M:0;localhost:49807] zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=10000 watcher=hconnection-0x4f7080990x0, quorum=localhost:2181, baseZNode=/hbase
> >> > 2015-05-17 20:39:19,223 INFO [M:0;localhost:49807-SendThread(localhost:2181)] zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
> >> > 2015-05-17 20:39:19,223 INFO [M:0;localhost:49807-SendThread(localhost:2181)] zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
> >> > 2015-05-17 20:39:19,223 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] server.NIOServerCnxnFactory: Accepted socket connection from /127.0.0.1:49812
> >> > 2015-05-17 20:39:19,223 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] server.ZooKeeperServer: Client attempting to establish new session at /127.0.0.1:49812
> >> > 2015-05-17 20:39:19,224 INFO [SyncThread:0] server.ZooKeeperServer: Established session 0x14d651aaec00002 with negotiated timeout 4000000 for client /127.0.0.1:49812
> >> > 2015-05-17 20:39:19,224 INFO [M:0;localhost:49807-SendThread(localhost:2181)] zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x14d651aaec00002, negotiated timeout = 4000000
> >> > 2015-05-17 20:39:19,249 INFO [M:0;localhost:49807] regionserver.HRegionServer: ClusterId : 6ad7eddd-2886-4ff0-b377-a2ff42c8632f
> >> > 2015-05-17 20:39:49,208 ERROR [main] master.HMasterCommandLine: Master exiting
> >> > java.lang.RuntimeException: Master not active after 30 seconds
> >> >     at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:194)
> >> >     at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:445)
> >> >     at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:197)
> >> >     at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:139)
> >> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> >> >     at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
> >> >     at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2002)
> >> >
> >> >
> >> > I noticed that this has something to do with the ZooKeeper data. If I
> >> > rm -rf $TMPDIR/hbase-tsuna/zookeeper then I can start HBase again.
> >> > But of course HBase won’t work properly because while some tables
> >> > exist on the filesystem, they no longer exist in ZK, etc.
> >> >
> >> > Does anybody know what could be left behind in ZK that could make it
> >> > hang during startup? I looked at a jstack output while it was paused
> >> > during 30s and didn’t find anything noteworthy.
> >> >
> >> > --
> >> > Benoit "tsuna" Sigoure
> >> >
> >
> --
> Benoit "tsuna" Sigoure
>

--
Thanks & Regards,
Anil Gupta