Hello,
Over the last few days I have been testing Hama, and today I found a
strange error in the framework. If you run a simple job with more than a
few supersteps, the following error occurs:
2011-02-15 15:13:55,934 ERROR org.apache.hama.bsp.BSPPeer:
2011-02-15 15:13:56,525 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server cl5/127.0.1.1:2181
2011-02-15 15:13:56,526 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078)
2011-02-15 15:13:56,626 ERROR org.apache.hama.bsp.BSPPeer:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp
You can reproduce this by running PiEstimator (the latest source code from
svn) with one small change: wrap the whole body of the bsp() method in a
for loop, like this:
for (int j = 0; j < 100; j++) {
    // original bsp() code
}
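To make the shape of the change concrete, here is a rough, self-contained
sketch. It does not use the Hama API at all: Barrier and runSupersteps are
names made up for illustration, and I am assuming that each pass through
the original body ends in exactly one barrier synchronization, so 100
iterations mean 100 supersteps.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RepeatedSuperstepSketch {

    /** Hypothetical stand-in for the barrier a BSP peer waits on at the end of a superstep. */
    interface Barrier {
        void sync();
    }

    /**
     * Runs the given body `iterations` times, synchronizing after each pass.
     * Returns the number of supersteps that completed.
     */
    static int runSupersteps(int iterations, Runnable originalBody, Barrier barrier) {
        int completed = 0;
        for (int j = 0; j < iterations; j++) {
            originalBody.run(); // placeholder for the original bsp() code
            barrier.sync();     // assumed: one barrier per pass
            completed++;
        }
        return completed;
    }

    public static void main(String[] args) {
        AtomicInteger syncs = new AtomicInteger();
        // The body is empty here; only the loop and barrier structure matter.
        int done = runSupersteps(100, () -> {}, syncs::incrementAndGet);
        System.out.println(done + " supersteps, " + syncs.get() + " syncs");
    }
}
```

In this stand-in the barrier is just a counter, so it always finishes. In
the real job every one of those barrier calls goes through ZooKeeper, which
is where the ConnectionLoss above shows up.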
When I run the modified PiEstimator, the framework hangs and the error
shown above occurs.
Any help would be appreciated.
Cheers,
--
Pawel Brach