Hi Claudio,
The patch worked! :-)
Just to be clear, I am running the Giraph 1.0.0 release (not a git clone) and Hadoop 2.0.0-cdh4.1.1.
I applied your patch and rebuilt the Giraph source code with this command:

    mvn -Phadoop_2.0.0 clean compile package test install verify

The build completed correctly, with no exceptions and no failed tests.
I then ran the Giraph example, which ran successfully, with this command:

[root@localhost giraph]# hadoop jar \
    /usr/local/giraph/giraph-examples/target/giraph-examples-1.0.0-for-hadoop-2.0.0-alpha-jar-with-dependencies.jar \
    org.apache.giraph.GiraphRunner \
    org.apache.giraph.examples.SimpleShortestPathsVertex \
    -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
    -vip /user/root/input/tiny_graph.txt \
    -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
    -op /user/root/output/shortestpaths \
    -w 1

I then deleted the output:

    hadoop fs -rm -R /user/root/output/shortestpaths
I then restarted my HBase daemons and ran the Giraph example again. It worked
again: no errors, no exceptions, no failed tasks, and the output was produced
correctly.
Using 'netstat -an | grep 22181' I can see that ZooKeeper is listening on port
22181.
Thank you very much for your help :-)
Ken

From: claudio.marte...@gmail.com
Date: Wed, 4 Sep 2013 19:21:37 +0200
Subject: Re: FileNotFoundException: File 
_bsp/_defaultZkManagerDir/job_201308291126_0029/_zkServer does not exist.
To: user@giraph.apache.org

Giraph ships with ZooKeeper 3.3.3. Unless you point it at an existing ZooKeeper
through the giraph.zkServerList parameter, Giraph starts its own instance, with
its own configuration, listening on port 22181.
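
The alternative mentioned here, pointing Giraph at an existing ZooKeeper, could be sketched as follows. This is only an illustration: the helper build_giraph_command, its parameters, and the address localhost:2181 are assumptions, while the jar path, class names and job options are copied from the commands quoted in this thread. The -D option is placed right after GiraphRunner, mirroring the -Dgiraph.zkManagerDirectory command quoted further down.

```python
import shlex
from typing import List, Optional

# Jar path copied from the commands quoted in this thread.
EXAMPLES_JAR = (
    "/usr/local/giraph/giraph-examples/target/"
    "giraph-examples-1.0.0-for-hadoop-2.0.0-alpha-jar-with-dependencies.jar"
)


def build_giraph_command(zk_server_list: Optional[str] = None,
                         workers: int = 1) -> List[str]:
    """Build the argv for the SimpleShortestPathsVertex example job.

    zk_server_list is a hypothetical parameter of this helper. When set,
    it is passed as -Dgiraph.zkServerList=... so the job is told about an
    existing ZooKeeper instead of starting its own on port 22181.
    """
    cmd = ["hadoop", "jar", EXAMPLES_JAR, "org.apache.giraph.GiraphRunner"]
    if zk_server_list:
        cmd.append("-Dgiraph.zkServerList=" + zk_server_list)
    cmd += [
        "org.apache.giraph.examples.SimpleShortestPathsVertex",
        "-vif", "org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat",
        "-vip", "/user/root/input/tiny_graph.txt",
        "-of", "org.apache.giraph.io.formats.IdWithValueTextOutputFormat",
        "-op", "/user/root/output/shortestpaths",
        "-w", str(workers),
    ]
    return cmd


if __name__ == "__main__":
    # Print a shell-quoted command line; nothing is executed here.
    print(" ".join(shlex.quote(arg) for arg in build_giraph_command("localhost:2181")))
```

The helper only assembles the command, so it can be inspected before anything is run against a cluster.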



On Wed, Sep 4, 2013 at 7:11 PM, Ken Williams <zoo9...@hotmail.com> wrote:





Hmmmmmmmm. Interesting.
Is Giraph (1.0.0) supposed to come with its own version of ZooKeeper?
The only version of ZooKeeper I have installed is the one that came with HBase,
and the config file it uses, /etc/zookeeper/conf/zoo.cfg, specifies
clientPort=2181. This is the only zoo.cfg file on my machine.



[root@localhost]# cat /etc/zookeeper/conf/zoo.cfg
....
maxClientCnxns=50
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/var/lib/zookeeper
# the port at which the clients will connect
clientPort=2181
server.1=localhost:2888:3888
[root@localhost Downloads]#
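
The setting that matters in the listing above is clientPort. As a minimal sketch of reading it programmatically, assuming the file uses plain key=value lines with # comments as shown, something like the following could be used. The function name parse_zoo_cfg and the SAMPLE text are illustrative and not part of ZooKeeper.

```python
from typing import Dict


def parse_zoo_cfg(text: str) -> Dict[str, str]:
    """Parse key=value lines of a zoo.cfg, skipping blank lines and # comments."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Sample content taken from the zoo.cfg quoted in this thread.
SAMPLE = """\
maxClientCnxns=50
# The number of milliseconds of each tick
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
# the port at which the clients will connect
clientPort=2181
server.1=localhost:2888:3888
"""

if __name__ == "__main__":
    print(parse_zoo_cfg(SAMPLE)["clientPort"])  # prints 2181
```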


From: claudio.marte...@gmail.com
Date: Wed, 4 Sep 2013 12:13:50 +0200
Subject: Re: FileNotFoundException: File 
_bsp/_defaultZkManagerDir/job_201308291126_0029/_zkServer does not exist.
To: user@giraph.apache.org



That should in principle not be the case, as the zookeeper started by Giraph 
listens on a different port than the default. See parameter 
giraph.zkServerPort, which defaults to 22181.
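
To check which of the two ports is actually taken on a given machine, a small bind test can help. The sketch below reproduces the condition behind the 'Address already in use' error quoted later in the thread by trying to bind a TCP socket itself. The function name port_is_free is an assumption made for this example; 2181 and 22181 are the two ports discussed in this thread.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can be bound to (host, port) right now.

    A failed bind usually means another process already holds the port,
    although other OS errors (for example missing permissions) also
    produce False here.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


if __name__ == "__main__":
    for port in (2181, 22181):
        state = "free" if port_is_free(port) else "already in use"
        print(f"port {port}: {state}")
```

The check is only a snapshot: a port reported as free can be taken by another process a moment later.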



On Wed, Sep 4, 2013 at 11:40 AM, Ken Williams <zoo9...@hotmail.com> wrote:







Hi Claudio,
I think I have fixed the problem.

HBase runs its own copy of ZooKeeper, which listens on port 2181. So when I
tried to start ZooKeeper for Giraph, it also tried to listen on port 2181,
found the port already in use, and terminated, which is why Giraph failed.
If I stop the HBase daemons (including HBase's copy of ZooKeeper), then Giraph
runs fine.

Essentially, running ZooKeeper for Giraph conflicts with the ZooKeeper that is
already running for HBase.

I will try the patch and get back to you.

Thanks for all your help,
Ken




From: claudio.marte...@gmail.com
Date: Tue, 3 Sep 2013 17:01:01 +0200
Subject: Re: FileNotFoundException: File 
_bsp/_defaultZkManagerDir/job_201308291126_0029/_zkServer does not exist.
To: user@giraph.apache.org

Try with the attached patch applied to trunk, without the
-D giraph.zkManagerDirectory option mentioned earlier.





On Tue, Sep 3, 2013 at 3:25 PM, Ken Williams <zoo9...@hotmail.com> wrote:





Hi Claudio,
I tried this but it made no difference. The map tasks still fail, there is
still no output, and there is still an exception in the log files:
FileNotFoundException: File /tmp/giraph/_zkServer does not exist.






[root@localhost giraph]# hadoop jar \
    /usr/local/giraph/giraph-examples/target/giraph-examples-1.0.0-for-hadoop-2.0.0-alpha-jar-with-dependencies.jar \
    org.apache.giraph.GiraphRunner \
    -Dgiraph.zkManagerDirectory='/tmp/giraph/' \
    org.apache.giraph.examples.SimpleShortestPathsVertex \
    -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
    -vip /user/root/input/tiny_graph.txt \
    -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
    -op /user/root/output/shortestpaths \
    -w 1





13/09/03 14:19:58 INFO utils.ConfigurationUtils: No edge input format specified. Ensure your InputFormat does not require one.
13/09/03 14:19:58 WARN job.GiraphConfigurationValidator: Output format vertex index type is not known
13/09/03 14:19:58 WARN job.GiraphConfigurationValidator: Output format vertex value type is not known
13/09/03 14:19:58 WARN job.GiraphConfigurationValidator: Output format edge value type is not known
13/09/03 14:19:58 INFO job.GiraphJob: run: Since checkpointing is disabled (default), do not allow any task retries (setting mapred.map.max.attempts = 0, old value = 4)
13/09/03 14:19:58 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/09/03 14:20:01 INFO mapred.JobClient: Running job: job_201308291126_0039
13/09/03 14:20:02 INFO mapred.JobClient:  map 0% reduce 0%
13/09/03 14:20:12 INFO mapred.JobClient: Job complete: job_201308291126_0039
13/09/03 14:20:12 INFO mapred.JobClient: Counters: 6
13/09/03 14:20:12 INFO mapred.JobClient:   Job Counters
13/09/03 14:20:12 INFO mapred.JobClient:     Failed map tasks=1
13/09/03 14:20:12 INFO mapred.JobClient:     Launched map tasks=2
13/09/03 14:20:12 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=16327
13/09/03 14:20:12 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=0
13/09/03 14:20:12 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/09/03 14:20:12 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
[root@localhost giraph]#

When I try to run Zookeeper it still gives me an 'Address already in use' 
exception.
[root@localhost giraph]# /usr/lib/zookeeper/bin/zkServer.sh start-foreground
JMX enabled by default
Using config: /usr/lib/zookeeper/bin/../conf/zoo.cfg
2013-09-03 14:23:37,882 [myid:] - INFO  [main:QuorumPeerConfig@101] - Reading configuration from: /usr/lib/zookeeper/bin/../conf/zoo.cfg
2013-09-03 14:23:37,888 [myid:] - ERROR [main:QuorumPeerConfig@283] - Invalid configuration, only one server specified (ignoring)
2013-09-03 14:23:37,889 [myid:] - INFO  [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
2013-09-03 14:23:37,889 [myid:] - INFO  [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
2013-09-03 14:23:37,890 [myid:] - INFO  [main:DatadirCleanupManager@101] - Purge task is not scheduled.
2013-09-03 14:23:37,890 [myid:] - WARN  [main:QuorumPeerMain@118] - Either no config or no quorum defined in config, running  in standalone mode
2013-09-03 14:23:37,904 [myid:] - INFO  [main:QuorumPeerConfig@101] - Reading configuration from: /usr/lib/zookeeper/bin/../conf/zoo.cfg
2013-09-03 14:23:37,905 [myid:] - ERROR [main:QuorumPeerConfig@283] - Invalid configuration, only one server specified (ignoring)
2013-09-03 14:23:37,905 [myid:] - INFO  [main:ZooKeeperServerMain@100] - Starting server
2013-09-03 14:23:37,920 [myid:] - INFO  [main:Environment@100] - Server environment:zookeeper.version=3.4.3-cdh4.1.1--1, built on 10/16/2012 17:34 GMT
2013-09-03 14:23:37,921 [myid:] - INFO  [main:Environment@100] - Server environment:host.name=localhost.localdomain
2013-09-03 14:23:37,921 [myid:] - INFO  [main:Environment@100] - Server environment:java.version=1.6.0_31
2013-09-03 14:23:37,921 [myid:] - INFO  [main:Environment@100] - Server environment:java.vendor=Sun Microsystems Inc.
2013-09-03 14:23:37,921 [myid:] - INFO  [main:Environment@100] - Server environment:java.home=/usr/java/jdk1.6.0_31/jre
2013-09-03 14:23:37,921 [myid:] - INFO  [main:Environment@100] - Server environment:java.class.path=/usr/lib/zookeeper/bin/../build/classes:/usr/lib/zookeeper/bin/../build/lib/*.jar:/usr/lib/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/lib/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/lib/zookeeper/bin/../lib/netty-3.2.2.Final.jar:/usr/lib/zookeeper/bin/../lib/log4j-1.2.15.jar:/usr/lib/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/lib/zookeeper/bin/../zookeeper-3.4.3-cdh4.1.1.jar:/usr/lib/zookeeper/bin/../src/java/lib/*.jar:/usr/lib/zookeeper/bin/../conf:
2013-09-03 14:23:37,922 [myid:] - INFO  [main:Environment@100] - Server environment:java.library.path=/usr/java/jdk1.6.0_31/jre/lib/i386/client:/usr/java/jdk1.6.0_31/jre/lib/i386:/usr/java/jdk1.6.0_31/jre/../lib/i386:/usr/java/packages/lib/i386:/lib:/usr/lib
2013-09-03 14:23:37,922 [myid:] - INFO  [main:Environment@100] - Server environment:java.io.tmpdir=/tmp
2013-09-03 14:23:37,922 [myid:] - INFO  [main:Environment@100] - Server environment:java.compiler=<NA>
2013-09-03 14:23:37,922 [myid:] - INFO  [main:Environment@100] - Server environment:os.name=Linux
2013-09-03 14:23:37,922 [myid:] - INFO  [main:Environment@100] - Server environment:os.arch=i386
2013-09-03 14:23:37,923 [myid:] - INFO  [main:Environment@100] - Server environment:os.version=2.6.32-279.14.1.el6.i686
2013-09-03 14:23:37,923 [myid:] - INFO  [main:Environment@100] - Server environment:user.name=root
2013-09-03 14:23:37,923 [myid:] - INFO  [main:Environment@100] - Server environment:user.home=/root
2013-09-03 14:23:37,923 [myid:] - INFO  [main:Environment@100] - Server environment:user.dir=/usr/local/giraph-1.0.0
2013-09-03 14:23:37,934 [myid:] - INFO  [main:ZooKeeperServer@726] - tickTime set to 2000
2013-09-03 14:23:37,934 [myid:] - INFO  [main:ZooKeeperServer@735] - minSessionTimeout set to -1
2013-09-03 14:23:37,935 [myid:] - INFO  [main:ZooKeeperServer@744] - maxSessionTimeout set to -1
2013-09-03 14:23:37,970 [myid:] - INFO  [main:NIOServerCnxnFactory@99] - binding to port 0.0.0.0/0.0.0.0:2181
2013-09-03 14:23:37,972 [myid:] - ERROR [main:ZooKeeperServerMain@68] - Unexpected exception, exiting abnormally





java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:100)
        at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:115)
        at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:91)
        at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:53)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:121)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:79)
[root@localhost giraph]#

      Thank you for any help,
Ken



From: claudio.marte...@gmail.com
Date: Tue, 3 Sep 2013 12:43:59 +0200
Subject: Re: FileNotFoundException: File 
_bsp/_defaultZkManagerDir/job_201308291126_0029/_zkServer does not exist.
To: user@giraph.apache.org







Can you try defining the ZooKeeper manager directory from the command line?
Like this:

    -D giraph.zkManagerDirectory=/path/in/hdfs/foobar

You'll have to delete this directory by hand before each job. This is just to
see whether it solves the problem; then I would know how to fix it.









On Tue, Sep 3, 2013 at 12:32 PM, Ken Williams <zoo9...@hotmail.com> wrote:







Hi Pradeep,
Yes, the ZooKeeper server is definitely running; I can connect to it with the
command-line client:

[root@localhost giraph]# zkCli.sh  -server 127.0.0.1:2181
Connecting to 127.0.0.1:2181
2013-09-03 11:15:45,987 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.3-cdh4.1.1--1, built on 10/16/2012 17:34 GMT
2013-09-03 11:15:45,990 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=localhost.localdomain
2013-09-03 11:15:45,990 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.6.0_31
......
WatchedEvent state:SyncConnected type:None path:null
[zk: 127.0.0.1:2181(CONNECTED) 0] ls /
[hbase, zookeeper]
[zk: 127.0.0.1:2181(CONNECTED) 1]









However, I am a bit confused. If I look in the ZooKeeper log file, I see this
'Address already in use' error for port 2181:

2013-09-03 10:52:24,412 [myid:] - INFO  [main:ZooKeeperServer@735] - minSessionTimeout set to -1
2013-09-03 10:52:24,413 [myid:] - INFO  [main:ZooKeeperServer@744] - maxSessionTimeout set to -1
2013-09-03 10:52:24,436 [myid:] - INFO  [main:NIOServerCnxnFactory@99] - binding to port 0.0.0.0/0.0.0.0:2181
2013-09-03 10:52:24,447 [myid:] - ERROR [main:ZooKeeperServerMain@68] - Unexpected exception, exiting abnormally
java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:100)
        at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:115)
        at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:91)








The process listening on port 2181 is 2892, which turns out to be HBase.

[root@localhost giraph]# fuser 2181/tcp
2181/tcp:             2892
[root@localhost giraph]# ps aux | grep 2892
hbase     2892  0.1  3.2 719592 119624 ?       Sl   Aug29   7:35 /usr/java/jdk1.6.0_31/bin/java -XX:OnOutOfMemoryError=kill -9 %p -Xmx500m -XX:+UseConcMarkSweepGC -Dhbase.log.dir=/var/log/hbase -Dhbase.log.file=hbase-hbase-master-localhost.localdomain.log -Dhbase.home.dir=/usr/lib/hbase/bin/..
......
So I am not sure what my ZooKeeper client is connecting to. It seems to be
connecting to a ZooKeeper server, but when I run 'ps' I cannot see a ZooKeeper
server running.







Here is my zoo.cfg file:

maxClientCnxns=50
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/var/lib/zookeeper
# the port at which the clients will connect
clientPort=2181
server.1=localhost:2888:3888

Thanks for any help,








Ken


-- 
    Claudio Martella
    claudio.marte...@gmail.com   
                                          

