Hi,
I am trying to run Hama in distributed mode against an already running
ZooKeeper quorum. Whenever I run start-bspd.sh, it does not use the
existing ZooKeeper and instead tries to start a new ZooKeeper instance,
which fails with an "Address already in use" error.
Here is the stdout:
-bash-3.2$ bin/start-bspd.sh
Enter passphrase for key '/home/hgahlot/.ssh/id_dsa': Enter passphrase
for key '/home/hgahlot/.ssh/id_dsa': Enter passphrase for key
'/home/hgahlot/.ssh/id_dsa':
<worker1>.cnet.com: starting zookeeper, logging to /home/hgahlot/graph/hama/hama-trunk/bin/../logs/<worker1>.cnet.com.out
Nothing happens after this. The hama-zookeeper log is as follows:
-bash-3.2$ cat logs/hama-hgahlot-<worker1>.cnet.com.log
2011-07-11 16:47:27,935 INFO org.apache.zookeeper.server.quorum.QuorumPeerConfig: Defaulting to majority quorums
2011-07-11 16:47:28,198 INFO org.apache.zookeeper.server.quorum.QuorumPeerMain: Starting quorum peer
2011-07-11 16:47:28,468 INFO org.apache.zookeeper.server.NIOServerCnxn: binding to port 0.0.0.0/0.0.0.0:2181
2011-07-11 16:47:28,469 ERROR org.apache.hama.zookeeper.QuorumPeer: Exception during ZooKeeper startup - exiting...
java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
    at org.apache.zookeeper.server.NIOServerCnxn$Factory.<init>(NIOServerCnxn.java:144)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:121)
    at org.apache.hama.zookeeper.QuorumPeer.runZKServer(QuorumPeer.java:80)
    at org.apache.hama.zookeeper.QuorumPeer.run(QuorumPeer.java:70)
    at org.apache.hama.ZooKeeperRunner.run(ZooKeeperRunner.java:36)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hama.ZooKeeperRunner.main(ZooKeeperRunner.java:41)
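The BindException above means something was already listening on port 2181 when the new instance tried to bind. To confirm that from the worker itself, a check along these lines could be used. This is only a minimal sketch: the helper name `port_in_use` is made up for illustration, and it only tells you that some process accepts TCP connections on the port, not that the process is ZooKeeper.

```python
import socket


def port_in_use(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (e.g. connection refused, timeout)
        # when nothing is accepting connections on that address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `port_in_use("localhost", 2181)` on the worker should return True while the existing ZooKeeper is up, which would match the bind failure in the log.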
The hama-site.xml config is as follows:
<configuration>
  <property>
    <name>bsp.master.address</name>
    <value>localhost</value>
    <description>The address of the bsp master server. Either the
    literal string "local" or a host:port for distributed mode
    </description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value><hdfs-host:port>/</value>
    <description>
    The name of the default file system. Either the literal string
    "local" or a host:port for HDFS.
    </description>
  </property>
  <property>
    <name>hama.zookeeper.quorum</name>
    <value><worker1>.cnet.com,<worker2>.cnet.com,<worker3>.cnet.com</value>
    <description>Comma separated list of servers in the ZooKeeper Quorum.
    For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
    By default this is set to localhost for local and pseudo-distributed modes
    of operation. For a fully-distributed setup, this should be set to a full
    list of ZooKeeper quorum servers. If HAMA_MANAGES_ZK is set in hama-env.sh
    this is the list of servers which we will start/stop zookeeper on.
    </description>
  </property>
  <property>
    <name>hama.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>
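For clarity, the two ZooKeeper properties above are meant to describe a set of host:port endpoints. The sketch below reads them from a Hadoop-style configuration file and prints the resulting list. It rests on several assumptions: the hostnames are hypothetical stand-ins for the redacted ones, the function names are made up for illustration, the 2181 fallback is simply ZooKeeper's usual client port, and joining each host with the client port is my reading of the settings rather than something confirmed from the Hama source.

```python
import xml.etree.ElementTree as ET

# Hypothetical hostnames; the real ones are redacted in the config above.
SAMPLE = """<configuration>
  <property>
    <name>hama.zookeeper.quorum</name>
    <value>host1.example.com,host2.example.com,host3.example.com</value>
  </property>
  <property>
    <name>hama.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>"""


def read_properties(xml_text):
    """Collect <name>/<value> pairs from a Hadoop-style configuration."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def quorum_endpoints(props):
    """Pair every quorum host with the client port (assumed semantics)."""
    # Fallbacks: "localhost" per the property description, 2181 as an assumption.
    port = props.get("hama.zookeeper.property.clientPort", "2181")
    hosts = props.get("hama.zookeeper.quorum", "localhost").split(",")
    return [f"{h.strip()}:{port}" for h in hosts if h.strip()]


endpoints = quorum_endpoints(read_properties(SAMPLE))
print(",".join(endpoints))
```

With my settings this gives three endpoints on port 2181, the same port the new instance fails to bind on <worker1>.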
Why is Hama trying to start its own ZooKeeper instance instead of using
the one that is already running?
Thanks,
Himanshu