Oops, I made a mistake. Edward is right: each node has 192 GB of RAM.

Thanks,
Minho Kim

2015-07-07 19:50 GMT+09:00 Edward J. Yoon:

> > - 8 GB RAM
>
> I guess that's a typo, Minho. :-) AFAIK, each node has 192 GB of memory.
>
> +1, we need to increase the default maxClientCnxns since modern
> machines have enough RAM.
>
> On Tue, Jul 7, 2015 at 7:13 PM, 김민호 wrote:
> > Hi all,
> >
> > Recently, I set up a Hama cluster using 2 machines.
> >
> > Their specification is as follows:
> >
> > - 8 GB RAM
> > - 12 TB HDD
> > - (I don't remember the CPU spec.)
> >
> > To run Hama jobs, I set bsp.tasks.maximum=40 and
> > bsp.child.java.opts=-Xmx4096m in hama-site.xml. (The rest of the
> > settings are omitted here.)
> >
> > I then ran the Pi Estimator and FastGraphGen examples, but I got the
> > errors below.
> >
> > attempt_201507071627_0001_000023_0: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp/job_201507071627_0001/peers/cluster-0:61029
> > attempt_201507071627_0001_000023_0:     at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> > attempt_201507071627_0001_000023_0:     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> > attempt_201507071627_0001_000023_0:     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
> > attempt_201507071627_0001_000023_0:     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.sync.ZKSyncClient.isExists(ZKSyncClient.java:108)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.sync.ZKSyncClient.writeNode(ZKSyncClient.java:261)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.registerTask(ZooKeeperSyncClientImpl.java:279)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.register(ZooKeeperSyncClientImpl.java:261)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.BSPPeerImpl.initializeSyncService(BSPPeerImpl.java:305)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:185)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1251)
> > attempt_201507071627_0001_000023_0: 15/07/07 16:27:40 ERROR sync.ZKSyncClient: Error creating zk path /bsp/job_201507071627_0001/peers/cluster-0:61029
> > attempt_201507071627_0001_000023_0: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp
> > attempt_201507071627_0001_000023_0:     at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> > attempt_201507071627_0001_000023_0:     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> > attempt_201507071627_0001_000023_0:     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
> > attempt_201507071627_0001_000023_0:     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.sync.ZKSyncClient.createZnode(ZKSyncClient.java:135)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.sync.ZKSyncClient.writeNode(ZKSyncClient.java:281)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.registerTask(ZooKeeperSyncClientImpl.java:279)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.register(ZooKeeperSyncClientImpl.java:261)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.BSPPeerImpl.initializeSyncService(BSPPeerImpl.java:305)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:185)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1251)
> > attempt_201507071627_0001_000023_0: 15/07/07 16:27:42 ERROR sync.ZKSyncClient: Error checking zk path /bsp/job_201507071627_0001/sync/-1
> > attempt_201507071627_0001_000023_0: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp/job_201507071627_0001/sync/-1
> > attempt_201507071627_0001_000023_0:     at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> > attempt_201507071627_0001_000023_0:     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> > attempt_201507071627_0001_000023_0:     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
> > attempt_201507071627_0001_000023_0:     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.sync.ZKSyncClient.isExists(ZKSyncClient.java:108)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.sync.ZKSyncClient.writeNode(ZKSyncClient.java:261)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.enterBarrier(ZooKeeperSyncClientImpl.java:100)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.BSPPeerImpl.doFirstSync(BSPPeerImpl.java:312)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:238)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1251)
> > attempt_201507071627_0001_000023_0: 15/07/07 16:27:44 ERROR sync.ZKSyncClient: Error creating zk path /bsp/job_201507071627_0001/sync/-1
> > attempt_201507071627_0001_000023_0: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp
> > attempt_201507071627_0001_000023_0:     at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> > attempt_201507071627_0001_000023_0:     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> > attempt_201507071627_0001_000023_0:     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
> > attempt_201507071627_0001_000023_0:     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.sync.ZKSyncClient.createZnode(ZKSyncClient.java:135)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.sync.ZKSyncClient.writeNode(ZKSyncClient.java:281)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.enterBarrier(ZooKeeperSyncClientImpl.java:100)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.BSPPeerImpl.doFirstSync(BSPPeerImpl.java:312)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:238)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1251)
> > attempt_201507071627_0001_000023_0: 15/07/07 16:27:46 FATAL bsp.GroomServer: SyncError from child
> > attempt_201507071627_0001_000023_0: org.apache.hama.bsp.sync.SyncException
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.enterBarrier(ZooKeeperSyncClientImpl.java:138)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.BSPPeerImpl.doFirstSync(BSPPeerImpl.java:312)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:238)
> > attempt_201507071627_0001_000023_0:     at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1251)
> > 15/07/07 16:27:48 INFO bsp.BSPJobClient: Job failed.
> >
> > This is a ZK error. Hama tasks try to get the /bsp node from
> > ZooKeeper and fail.
> >
> > This happens because hama.zookeeper.property.maxClientCnxns is 30 in
> > hama-default.xml.
> >
> > The problem occurs when the maximum number of tasks is larger than
> > that value.
> >
> > To solve the problem, Hama has a setting to increase the number of
> > connections to ZK.
> >
> > <property>
> >   <name>hama.zookeeper.property.maxClientCnxns</name>
> >   <value>100</value>
> > </property>
> >
> > So I think we should raise the default number of connections to more
> > than 100, since server hardware is much more capable than it used to
> > be.
> >
> > If you agree, I will change the default value to 300.
> >
> > Best regards,
> > Minho Kim
>
> --
> Best Regards, Edward J. Yoon
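
For reference, the settings discussed in this thread could be combined in hama-site.xml roughly as follows. This is only a sketch: the property names and values are the ones quoted in the messages above (100 is the example value from the original post, not a tested recommendation), and the enclosing <configuration> element is assumed from the usual Hadoop-style site file layout.

```xml
<?xml version="1.0"?>
<!-- Sketch of hama-site.xml; the <configuration> wrapper is an assumption. -->
<configuration>
  <!-- Maximum number of BSP tasks per node, as set in the original post. -->
  <property>
    <name>bsp.tasks.maximum</name>
    <value>40</value>
  </property>

  <!-- Heap size for each child task JVM, as set in the original post. -->
  <property>
    <name>bsp.child.java.opts</name>
    <value>-Xmx4096m</value>
  </property>

  <!-- ZooKeeper connection limit; the thread reports a default of 30,
       which is lower than the 40 tasks configured above. -->
  <property>
    <name>hama.zookeeper.property.maxClientCnxns</name>
    <value>100</value>
  </property>
</configuration>
```

With bsp.tasks.maximum=40, the reported default limit of 30 would be exceeded, which is consistent with the ConnectionLoss errors in the log. Assuming each task opens its own ZooKeeper connection from the same host, any limit comfortably above the per-node task count should avoid the failure.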
