I am having similar problems, and I'm not sure what's causing them. I've tried specifying giraph.zkList, as well as adding various directories to the classpath, but neither worked. The worker machines keep getting a connection exception for localhost, which doesn't make sense because they should be connecting to the master. The only way I could get it to work was by putting the IP address of the master machine next to localhost in /etc/hosts, which is a horrible solution, but it works.
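One way to check whether this is a name-resolution problem is to run a small diagnostic on each worker. The following is only a sketch, not anything from Giraph itself: the helper names `resolve` and `can_connect` are made up for illustration, and port 22181 is assumed from the "base port" reported in the quoted log. It prints what `localhost` and the machine's own hostname resolve to, and whether anything accepts TCP connections on that port.

```python
import socket


def resolve(host):
    """Return the sorted, de-duplicated IP addresses that `host` resolves to here."""
    infos = socket.getaddrinfo(host, None)
    return sorted({info[4][0] for info in infos})


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and failed name lookups.
        return False


if __name__ == "__main__":
    zk_port = 22181  # assumption: the base port shown in the quoted log
    for host in ("localhost", socket.gethostname()):
        try:
            addrs = resolve(host)
        except socket.gaierror as err:
            addrs = ["<unresolved: %s>" % err]
        print(host, "->", addrs, "| port", zk_port, "open:", can_connect(host, zk_port))
```

If `localhost` resolves only to a loopback address on a worker and the port is closed there, that would be consistent with the worker looking for ZooKeeper on itself rather than on the master.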

On Wed, Jul 31, 2013 at 5:33 PM, Claudio Martella <claudio.marte...@gmail.com> wrote:
> as a follow up to this, I think there might be some problem with the
> classpath and the jar files:
>
> 2013-07-31 19:51:53,270 INFO org.apache.giraph.graph.GraphTaskManager:
> setup: classpath @
> /Users/hammer/Dev/hadoop/data/hadoop/mapred/taskTracker/hammer/distcache/3182842244657980528_-2099658804_903563705/localhost/tmp/hadoop-hammer/mapred/staging/hammer/.staging/job_201307302017_0010/libjars/SOCCPartitioning-1.0-SNAPSHOT.jar
> for job Giraph: org.apache.giraph.benchmark.WeightedPageRankComputation
> 2013-07-31 19:51:53,367 INFO org.apache.giraph.zk.ZooKeeperManager:
> createCandidateStamp: Made the directory /user/martella/zk
> 2013-07-31 19:51:53,369 INFO org.apache.giraph.zk.ZooKeeperManager:
> createCandidateStamp: Creating my filestamp
> /user/martella/zk/_task/localhost. 0
> 2013-07-31 19:51:53,378 INFO org.apache.giraph.zk.ZooKeeperManager:
> getZooKeeperServerList: Got [localhost.] 1 hosts from 1 candidates when 1
> required (polling period is 3000) on attempt 0
> 2013-07-31 19:51:53,379 INFO org.apache.giraph.zk.ZooKeeperManager:
> createZooKeeperServerList: Creating the final ZooKeeper file
> '/user/martella/zk/zkServerList_localhost. 0 '
> 2013-07-31 19:51:53,383 INFO org.apache.giraph.zk.ZooKeeperManager:
> getZooKeeperServerList: For task 0, got file 'zkServerList_localhost. 0 '
> (polling period is 3000)
> 2013-07-31 19:51:53,383 INFO org.apache.giraph.zk.ZooKeeperManager:
> getZooKeeperServerList: Found [localhost., 0] 2 hosts in filename
> 'zkServerList_localhost. 0 '
> 2013-07-31 19:51:53,385 INFO org.apache.giraph.zk.ZooKeeperManager:
> onlineZooKeeperServers: Trying to delete old directory
> /Users/hammer/Dev/hadoop/data/hadoop/mapred/taskTracker/hammer/jobcache/job_201307302017_0010/work/_bspZooKeeper
> 2013-07-31 19:51:53,390 INFO org.apache.giraph.zk.ZooKeeperManager:
> generateZooKeeperConfigFile: Creating file
> /Users/hammer/Dev/hadoop/data/hadoop/mapred/taskTracker/hammer/jobcache/job_201307302017_0010/work/_bspZooKeeper/zoo.cfg
> in
> /Users/hammer/Dev/hadoop/data/hadoop/mapred/taskTracker/hammer/jobcache/job_201307302017_0010/work/_bspZooKeeper
> with base port 22181
> 2013-07-31 19:51:53,390 INFO org.apache.giraph.zk.ZooKeeperManager:
> generateZooKeeperConfigFile: Make directory of _bspZooKeeper = true
> 2013-07-31 19:51:53,390 INFO org.apache.giraph.zk.ZooKeeperManager:
> generateZooKeeperConfigFile: Delete of zoo.cfg = false
> 2013-07-31 19:51:53,392 INFO org.apache.giraph.zk.ZooKeeperManager:
> onlineZooKeeperServers: Attempting to start ZooKeeper server with command
> [/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home/bin/java,
> -Xmx512m, -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC,
> -XX:CMSInitiatingOccupancyFraction=70, -XX:MaxGCPauseMillis=100, -cp,
> /Users/hammer/Dev/hadoop/data/hadoop/mapred/taskTracker/hammer/distcache/3182842244657980528_-2099658804_903563705/localhost/tmp/hadoop-hammer/mapred/staging/hammer/.staging/job_201307302017_0010/libjars/SOCCPartitioning-1.0-SNAPSHOT.jar,
> org.apache.zookeeper.server.quorum.QuorumPeerMain,
> /Users/hammer/Dev/hadoop/data/hadoop/mapred/taskTracker/hammer/jobcache/job_201307302017_0010/work/_bspZooKeeper/zoo.cfg]
> in directory
> /Users/hammer/Dev/hadoop/data/hadoop/mapred/taskTracker/hammer/jobcache/job_201307302017_0010/work/_bspZooKeeper
> 2013-07-31 19:51:53,401 INFO org.apache.giraph.zk.ZooKeeperManager:
> onlineZooKeeperServers: Shutdown hook added.
> 2013-07-31 19:51:53,401 INFO org.apache.giraph.zk.ZooKeeperManager:
> onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to
> localhost.:22181 with poll msecs = 3000
> 2013-07-31 19:51:53,409 WARN org.apache.giraph.zk.ZooKeeperManager:
> onlineZooKeeperServers: Got ConnectException
> java.net.ConnectException: Connection refused
>
> the jar used to run ZK is actually the jar with my application (used with
> -libjars and put in HADOOP_CLASSPATH). The question is why suddenly this is
> creating a problem...
>
> On Tue, Jul 30, 2013 at 8:43 PM, Claudio Martella
> <claudio.marte...@gmail.com> wrote:
>>
>> Am I the only one that recently is experiencing problems with zookeeper? I
>> get the workers failing to connect to zookeeper. I presume it is not
>> starting at all. I'm using trunk and hadoop 1.0.3. Used to work smoothly.
>>
>> --
>> Claudio Martella
>> claudio.marte...@gmail.com
>
> --
> Claudio Martella
> claudio.marte...@gmail.com

--
Kyle Orlando
Computer Engineering Major
University of Maryland