This looks like a synchronization problem. Could you try again after adding a Thread.sleep(100); line?
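
A minimal sketch of that change, assuming the pause goes at the end of each iteration of the loop added around the bsp() body. ThrottledLoop and runWithPause are made-up names for illustration, and the Runnable only stands in for the original bsp() code; nothing here is Hama API.

```java
public class ThrottledLoop {

    /**
     * Runs body the given number of times, sleeping pauseMillis after
     * each iteration. Returns the number of iterations that completed.
     */
    static int runWithPause(int iterations, long pauseMillis, Runnable body) {
        int completed = 0;
        for (int j = 0; j < iterations; j++) {
            body.run(); // placeholder for the original bsp() code
            completed++;
            try {
                // Give the peers time to catch up before the next iteration.
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop looping.
                Thread.currentThread().interrupt();
                break;
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        // Short demo run; the loop in the report uses 100 iterations.
        int done = runWithPause(3, 100, () -> { /* original bsp() code */ });
        System.out.println("completed iterations: " + done);
    }
}
```

If the errors go away with the pause in place, that would point to a timing issue between supersteps rather than a configuration problem.
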

Sent from my iPhone

On 2011. 2. 16., at 3:24 PM, Paweł Brach <[email protected]> wrote:

> Yes, of course I have. My cluster is configured, and both examples,
> PiEstimator and SerializePrinting, work (there is communication between
> 3 nodes). I modified your PiEstimator example (put everything in a
> loop). It works for a few iterations (there is communication), and then
> the connection is lost. The connection is later re-established, but some
> messages are missing. It looks like the Hama framework is very unstable
> under load, when many messages are being sent between nodes.
> On the same cluster I have configured Apache Hadoop, and it is very stable.
> If you have your own cluster configured, could you run my example on it?
> Have you ever run anything more complicated than PiEstimator and
> SerializePrinting on it?
>
> Cheers,
> Pawel
>
> 2011/2/16 Chia-Hung Lin <[email protected]>
>
>> Have you configured zookeeper in hama-site.xml? Hama makes use of
>> zookeeper for node communication, IIRC.
>>
>> Opening socket connection to server cl5/127.0.1.1:2181
>>
>> suggests that only localhost is up. If this is the case, you can
>> point the hama.zookeeper.quorum property at your nodes, e.g.
>>
>> <property>
>>   <name>hama.zookeeper.quorum</name>
>>   <value>node1,node2,node3,node4,node5</value>
>> </property>
>>
>> Hope it helps
>>
>> 2011/2/15 Paweł Brach <[email protected]>:
>>> Hello,
>>>
>>> Over the last few days I have been testing Hama, and today I found a
>>> strange error in the Hama framework. If you run a simple job with
>>> more than a few supersteps, the following error occurs:
>>>
>>> 2011-02-15 15:13:55,934 ERROR org.apache.hama.bsp.BSPPeer:
>>> 2011-02-15 15:13:56,525 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server cl5/127.0.1.1:2181
>>> 2011-02-15 15:13:56,526 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>>> java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078)
>>> 2011-02-15 15:13:56,626 ERROR org.apache.hama.bsp.BSPPeer:
>>> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp
>>>
>>> You can reproduce this by running PiEstimator (the newest source code
>>> from svn) with one small change: put the whole body of the bsp()
>>> method in a for loop, like this:
>>>
>>> for (int j = 0; j < 100; j++) {
>>>   // original bsp() code
>>> }
>>>
>>> When I try to run it, the framework hangs and the error mentioned
>>> above occurs.
>>>
>>> Your help will be appreciated.
>>>
>>> Cheers,
>>>
>>> --
>>> Pawel Brach
>>
>> --
>> ChiaHung Lin @ nuk, tw.
>
> --
> Paweł Brach
