Can you try defining the ZooKeeper manager directory from the command line? Like this: -D giraph.zkManagerDirectory=/path/in/hdfs/foobar
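A minimal sketch of what that invocation might look like, reusing the jar path and example arguments from the original report quoted below. It assumes that GiraphRunner accepts generic Hadoop `-D` options placed right after the runner class name, and that `hadoop fs -rm -r` is available in this Hadoop version. The `run` helper and the `DRY_RUN` switch are hypothetical additions, so the commands are only printed until you opt in with `DRY_RUN=0`.

```shell
#!/bin/sh
# Placeholder HDFS path taken from the suggestion above; adjust for your cluster.
ZK_DIR="/path/in/hdfs/foobar"
# Jar path as it appears in the original report.
JAR="/usr/local/giraph/giraph-examples/target/giraph-examples-1.0.0-for-hadoop-2.0.0-alpha-jar-with-dependencies.jar"

# Hypothetical safety switch: print commands by default, execute with DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Remove the directory left behind by the previous job (ignore a missing directory).
run hadoop fs -rm -r "$ZK_DIR" || true

# Assumption: the -D property is parsed as a generic Hadoop option by GiraphRunner.
run hadoop jar "$JAR" \
    org.apache.giraph.GiraphRunner \
    -D "giraph.zkManagerDirectory=$ZK_DIR" \
    org.apache.giraph.examples.SimpleShortestPathsVertex \
    -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
    -vip /user/root/input/tiny_graph.txt \
    -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
    -op /user/root/output/shortestpaths \
    -w 1
```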
You'll have to delete this directory by hand before each job. This is just to see whether it solves the problem; if it does, I'll know how to fix it.

On Tue, Sep 3, 2013 at 12:32 PM, Ken Williams <zoo9...@hotmail.com> wrote:

> Hi Pradeep,
>
> Yes, the zookeeper server is definitely running; I can connect to it with
> the command-line client:
>
> [root@localhost giraph]# zkCli.sh -server 127.0.0.1:2181
> Connecting to 127.0.0.1:2181
> 2013-09-03 11:15:45,987 [myid:] - INFO [main:Environment@100] - Client environment:zookeeper.version=3.4.3-cdh4.1.1--1, built on 10/16/2012 17:34 GMT
> 2013-09-03 11:15:45,990 [myid:] - INFO [main:Environment@100] - Client environment:host.name=localhost.localdomain
> 2013-09-03 11:15:45,990 [myid:] - INFO [main:Environment@100] - Client environment:java.version=1.6.0_31
> ......
> WatchedEvent state:SyncConnected type:None path:null
> [zk: 127.0.0.1:2181(CONNECTED) 0] ls /
> [hbase, zookeeper]
> [zk: 127.0.0.1:2181(CONNECTED) 1]
>
> However, I am a bit confused.
> If I look in the zookeeper log-file I see this port 2181 'Address already
> in use' error,
>
> 2013-09-03 10:52:24,412 [myid:] - INFO [main:ZooKeeperServer@735] - minSessionTimeout set to -1
> 2013-09-03 10:52:24,413 [myid:] - INFO [main:ZooKeeperServer@744] - maxSessionTimeout set to -1
> 2013-09-03 10:52:24,436 [myid:] - INFO [main:NIOServerCnxnFactory@99] - binding to port 0.0.0.0/0.0.0.0:2181
> 2013-09-03 10:52:24,447 [myid:] - ERROR [main:ZooKeeperServerMain@68] - Unexpected exception, exiting abnormally
> java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind(Native Method)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
>     at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:100)
>     at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:115)
>     at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:91)
>
> The process listening on port 2181 is 2892, which turns out to be HBase.
>
> [root@localhost giraph]# fuser 2181/tcp
> 2181/tcp: 2892
> [root@localhost giraph]# ps aux | grep 2892
> hbase 2892 0.1 3.2 719592 119624 ? Sl Aug29 7:35 /usr/java/jdk1.6.0_31/bin/java -XX:OnOutOfMemoryError=kill -9 %p -Xmx500m -XX:+UseConcMarkSweepGC -Dhbase.log.dir=/var/log/hbase -Dhbase.log.file=hbase-hbase-master-localhost.localdomain.log -Dhbase.home.dir=/usr/lib/hbase/bin/..
> ......
>
> So I am not sure what my zookeeper client is connecting to.
> It seems to be connecting to a zookeeper server but when I do 'ps' I
> cannot see a zookeeper server running.
> Here is my zoo.cfg file,
>
> maxClientCnxns=50
> # The number of milliseconds of each tick
> tickTime=2000
> # The number of ticks that the initial synchronization phase can take
> initLimit=10
> # The number of ticks that can pass between
> # sending a request and getting an acknowledgement
> syncLimit=5
> # the directory where the snapshot is stored.
> dataDir=/var/lib/zookeeper
> # the port at which the clients will connect
> clientPort=2181
> server.1=localhost:2888:3888
>
> Thanks for any help,
>
> Ken
>
> > Date: Mon, 2 Sep 2013 22:48:29 +0530
> > Subject: Re: FileNotFoundException: File _bsp/_defaultZkManagerDir/job_201308291126_0029/_zkServer does not exist.
> > From: pradeep0...@gmail.com
> > To: user@giraph.apache.org
> >
> > Can you check if zookeeper is running properly?
> >
> > On 9/2/13, Ken Williams <zoo9...@hotmail.com> wrote:
> > > Hi,
> > > I am trying to run one of the example programs included with Giraph
> > > 1.0.0 but whatever I do I always get this same error:
> > > FileNotFoundException: File _bsp/_defaultZkManagerDir/<job_number>/_zkServer
> > > does not exist.
> > > I am running Giraph 1.0.0 on hadoop-2.0.0-cdh4.1.1. I successfully ran
> > > 'mvn clean install -Phadoop_2.0.0' with no problems.
> > > This is my input file,
> > > [root@localhost giraph]# hadoop fs -cat /user/root/input/tiny_graph.txt
> > > [0,0,[[1,1],[3,3]]]
> > > [1,0,[[0,1],[2,2],[3,1]]]
> > > [2,0,[[1,2],[4,4]]]
> > > [3,0,[[0,3],[1,1],[4,4]]]
> > > [4,0,[[3,4],[2,4]]]
> > > When I try to run an example program this is the output,
> > > [root@localhost giraph]# hadoop jar /usr/local/giraph/giraph-examples/target/giraph-examples-1.0.0-for-hadoop-2.0.0-alpha-jar-with-dependencies.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsVertex -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip /user/root/input/tiny_graph.txt -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op /user/root/output/shortestpaths -w 1
> > > 13/09/02 17:06:36 INFO utils.ConfigurationUtils: No edge input format specified. Ensure your InputFormat does not require one.
> > > 13/09/02 17:06:36 WARN job.GiraphConfigurationValidator: Output format vertex index type is not known
> > > 13/09/02 17:06:36 WARN job.GiraphConfigurationValidator: Output format vertex value type is not known
> > > 13/09/02 17:06:36 WARN job.GiraphConfigurationValidator: Output format edge value type is not known
> > > 13/09/02 17:06:36 INFO job.GiraphJob: run: Since checkpointing is disabled (default), do not allow any task retries (setting mapred.map.max.attempts = 0, old value = 4)
> > > 13/09/02 17:06:37 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> > > 13/09/02 17:06:40 INFO mapred.JobClient: Running job: job_201308291126_0029
> > > 13/09/02 17:06:41 INFO mapred.JobClient: map 0% reduce 0%
> > > 13/09/02 17:06:51 INFO mapred.JobClient: Job complete: job_201308291126_0029
> > > 13/09/02 17:06:51 INFO mapred.JobClient: Counters: 6
> > > 13/09/02 17:06:51 INFO mapred.JobClient: Job Counters
> > > 13/09/02 17:06:51 INFO mapred.JobClient: Failed map tasks=1
> > > 13/09/02 17:06:51 INFO mapred.JobClient: Launched map tasks=2
> > > 13/09/02 17:06:51 INFO mapred.JobClient: Total time spent by all maps in occupied slots (ms)=16515
> > > 13/09/02 17:06:51 INFO mapred.JobClient: Total time spent by all reduces in occupied slots (ms)=0
> > > 13/09/02 17:06:51 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
> > > 13/09/02 17:06:51 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
> > > [root@localhost giraph]#
> > >
> > > There are no errors but no output is produced, and in the Web UI I can see
> > > the 2 map tasks have both failed. When I look in the log files this is the
> > > exception I see thrown,
> > > java.lang.IllegalStateException: run: Caught an unrecoverable exception java.io.FileNotFoundException: File _bsp/_defaultZkManagerDir/job_201308291126_0029/_zkServer does not exist.
> > >     at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:102)
> > >     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
> > >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
> > >     at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
> > >     at java.security.AccessController.doPrivileged(Native Method)
> > >     at javax.security.auth.Subject.doAs(Subject.java:396)
> > >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
> > >     at org.apache.hadoop.mapred.Child.main(Child.java:262)
> > > Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File _bsp/_defaultZkManagerDir/job_201308291126_0029/_zkServer does not exist.
> > >     at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:790)
> > >     at org.apache.giraph.graph.GraphTaskManager.startZooKeeperManager(GraphTaskManager.java
> > > Every time I run a new job, it throws this same error.
> > > I have a copy of Zookeeper installed here,
> > > [root@localhost giraph]# /usr/lib/zookeeper/bin/zkServer.sh status
> > > JMX enabled by default
> > > Using config: /usr/lib/zookeeper/bin/../conf/zoo.cfg
> > > Mode: standalone
> > > [root@localhost giraph]#
> > > Any help would be greatly appreciated.
> > > Thank you,
> > > Ken
> >
> > --
> > Pradeep Kumar

--
Claudio Martella
claudio.marte...@gmail.com