Your computer might be swapping, the JVM might be stuck in a long garbage collection, or another application might be hogging the machine. Any of these can make the ZK session time out. If this is a local standalone test server, there is no real harm in increasing the ZK timeout.
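To check whether GC is the culprit, one option is to scan the GC log for stop-the-world pauses longer than the session timeout. Below is a minimal sketch, not a polished tool. The function name `long_pauses` and the 40 second threshold are my own assumptions (use whatever timeout your setup actually negotiates), and it expects the raw GC log file with one event per line, not the wrapped copy in the quoted mail below. Concurrent CMS phases also print a `real=` time but do not stop the application, so they are skipped.

```python
import re

# Wall-clock time reported at the end of a GC event, e.g. "real=61.68 secs".
REAL_TIME = re.compile(r"real=(\d+(?:\.\d+)?)\s+secs")


def long_pauses(gc_log_text, threshold_secs):
    """Return the 'real' times (seconds) of GC events at or above threshold_secs.

    Lines describing concurrent CMS phases are ignored because the
    application keeps running while they execute.
    """
    pauses = []
    for line in gc_log_text.splitlines():
        if "CMS-concurrent" in line:
            continue  # concurrent phase, not a stop-the-world pause
        for match in REAL_TIME.finditer(line):
            seconds = float(match.group(1))
            if seconds >= threshold_secs:
                pauses.append(seconds)
    return pauses


# Three events copied from the quoted GC log, abbreviated to the relevant parts.
sample = "\n".join([
    "[ParNew (promotion failed): 2456933K->2456933K(2457600K), 61.6810250 secs] "
    "[Times: user=130.74 sys=18.43, real=61.68 secs]",
    "[CMS: 5460058K->2294509K(5461376K), 3.2691420 secs] "
    "[Times: user=3.12 sys=0.12, real=3.27 secs]",
    "[CMS-concurrent-abortable-preclean: 0.575/5.057 secs] "
    "[Times: user=3.51 sys=1.79, real=5.06 secs]",
])

# Assumed session timeout for illustration only.
ASSUMED_TIMEOUT_SECS = 40.0

if __name__ == "__main__":
    print(long_pauses(sample, ASSUMED_TIMEOUT_SECS))
```

In the log quoted below, the ParNew promotion failure at 17:21:53 took about 61.68 seconds of wall-clock time, and the first "Session expired" warning shows up at 17:23:10, shortly after that pause ended.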
JM

2013/11/11 polyimide <[email protected]>

> No, I haven't been able to resolve this issue.
> This is a standalone hbase instance backed by local file system. Is this an
> indication that the load exceeded a single node capacity, a cluster should
> be used instead?
>
> Below are last part of the hbase log and gc log.
>
> ---- hbase log ----------
> 2013-11-06 17:23:10,554 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/localhost,48464,1383757658143
> 2013-11-06 17:23:10,554 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 4000ms before retry #2...
> 2013-11-06 17:23:14,554 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/localhost,48464,1383757658143
> 2013-11-06 17:23:14,555 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 8000ms before retry #3...
> 2013-11-06 17:23:22,555 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/localhost,48464,1383757658143
> 2013-11-06 17:23:22,555 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper delete failed after 3 retries
> 2013-11-06 17:23:22,555 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Failed deleting my ephemeral node
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/localhost,48464,1383757658143
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>         at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
>         at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.delete(RecoverableZooKeeper.java:133)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1195)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1184)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.deleteMyEphemeralNode(HRegionServer.java:1128)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:893)
>         at java.lang.Thread.run(Thread.java:744)
> 2013-11-06 17:23:22,556 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server localhost,48464,1383757658143; zookeeper connection closed.
> 2013-11-06 17:23:22,556 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: RegionServer:0;localhost,48464,1383757658143 exiting
> 2013-11-06 17:27:58,548 DEBUG org.apache.hadoop.hbase.master.HMaster: Master has not been initialized, don't run balancer.
> 2013-11-06 17:27:58,549 DEBUG org.apache.hadoop.hbase.client.MetaScanner: Scanning .META. starting at row= for max=2147483647 rows using org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@cd6cecb
> 2013-11-06 17:27:58,550 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Removed all cached region locations that map to localhost:48464
> 2013-11-06 17:27:59,578 FATAL org.apache.hadoop.hbase.master.HMaster: master:50631-0x1422e6235f60000 master:50631-0x1422e6235f60000 received expired from ZooKeeper, aborting
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:384)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> 2013-11-06 17:27:59,579 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
>
> ------------- gc log ---------------------
> 2013-11-06T17:21:53.597-0600: 22456.490: [GC2013-11-06T17:21:53.597-0600: 22456.490: [ParNew (promotion failed): 2456933K->2456933K(2457600K), 61.6810250 secs] 7336934K->7916992K(7918976K), 61.6813330 secs] [Times: user=130.74 sys=18.43, real=61.68 secs]
> GC locker: Trying a full collection because scavenge failed
> 2013-11-06T17:22:55.278-0600: 22518.172: [Full GC2013-11-06T17:22:55.279-0600: 22518.172: [CMS: 5460058K->2294509K(5461376K), 3.2691420 secs] 7916992K->2294509K(7918976K), [CMS Perm : 25868K->25868K(43432K)], 3.2693260 secs] [Times: user=3.12 sys=0.12, real=3.27 secs]
> 2013-11-06T17:22:58.549-0600: 22521.442: [GC [1 CMS-initial-mark: 2294509K(5461376K)] 2294524K(7918976K), 0.0018870 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
> 2013-11-06T17:22:58.551-0600: 22521.444: [CMS-concurrent-mark-start]
> 2013-11-06T17:22:58.722-0600: 22521.615: [CMS-concurrent-mark: 0.164/0.171 secs] [Times: user=2.09 sys=0.03, real=0.18 secs]
> 2013-11-06T17:22:58.722-0600: 22521.615: [CMS-concurrent-preclean-start]
> 2013-11-06T17:22:58.739-0600: 22521.633: [CMS-concurrent-preclean: 0.017/0.017 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
> 2013-11-06T17:22:58.740-0600: 22521.633: [CMS-concurrent-abortable-preclean-start]
> CMS: abort preclean due to time 2013-11-06T17:23:03.797-0600: 22526.690: [CMS-concurrent-abortable-preclean: 0.575/5.057 secs] [Times: user=3.51 sys=1.79, real=5.06 secs]
> 2013-11-06T17:23:03.797-0600: 22526.691: [GC[YG occupancy: 642355 K (2457600 K)]2013-11-06T17:23:03.797-0600: 22526.691: [Rescan (parallel) , 0.0688310 secs]2013-11-06T17:23:03.866-0600: 22526.760: [weak refs processing, 0.0000560 secs]2013-11-06T17:23:03.866-0600: 22526.760: [scrub string table, 0.0007120 secs] [1 CMS-remark: 2294509K(5461376K)] 2936865K(7918976K), 0.0697900 secs] [Times: user=1.47 sys=0.03, real=0.07 secs]
> 2013-11-06T17:23:03.868-0600: 22526.761: [CMS-concurrent-sweep-start]
> 2013-11-06T17:23:04.248-0600: 22527.141: [CMS-concurrent-sweep: 0.371/0.380 secs] [Times: user=0.64 sys=0.03, real=0.38 secs]
> 2013-11-06T17:23:04.248-0600: 22527.141: [CMS-concurrent-reset-start]
> 2013-11-06T17:23:04.262-0600: 22527.155: [CMS-concurrent-reset: 0.014/0.014 secs] [Times: user=0.03 sys=0.01, real=0.02 secs]
> 2013-11-06T17:23:07.979-0600: 22530.873: [GC2013-11-06T17:23:07.979-0600: 22530.873: [ParNew: 2184576K->273024K(2457600K), 0.0995280 secs] 4477717K->2649442K(7918976K), 0.0997390 secs] [Times: user=3.40 sys=0.10, real=0.10 secs]
> Heap
>  par new generation total 2457600K, used 801119K [0x0000000601a00000, 0x00000006a84a0000, 0x00000006a84a0000)
>   eden space 2184576K, 24% used [0x0000000601a00000, 0x0000000621db7c20, 0x0000000686f60000)
>   from space 273024K, 100% used [0x0000000697a00000, 0x00000006a84a0000, 0x00000006a84a0000)
>   to space 273024K, 0% used [0x0000000686f60000, 0x0000000686f60000, 0x0000000697a00000)
>  concurrent mark-sweep generation total 5461376K, used 2376418K [0x00000006a84a0000, 0x00000007f5a00000, 0x00000007f5a00000)
>  concurrent-mark-sweep perm gen total 43432K, used 26151K [0x00000007f5a00000, 0x00000007f846a000, 0x0000000800000000)
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/Unable-to-override-zookeeper-server-maxSessionTimeout-property-tp4052554p4052677.html
> Sent from the HBase User mailing list archive at Nabble.com.
