Hi Claudio,
I think I have fixed the problem.
HBase runs with its own copy of ZooKeeper, which listens on port 2181. So when
I tried to start ZooKeeper for Giraph, it also tried to listen on port 2181,
found the port already in use, and terminated, which is why Giraph failed. If
I stop the HBase daemons (including its copy of ZooKeeper), Giraph runs fine.
Essentially, starting a ZooKeeper for Giraph conflicts with the ZooKeeper that
is already running for HBase, because both want port 2181.
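A quick way to confirm this kind of conflict before launching a job is to check whether something is already listening on the port. The following is a minimal Python sketch, not part of Giraph, HBase or ZooKeeper; the helper name port_in_use and the default host 127.0.0.1 are assumptions made for illustration.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds.

    A successful connect means some process (for example the ZooKeeper
    started by HBase) is already listening on that port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 2181 is the ZooKeeper client port discussed in this thread.
    if port_in_use(2181):
        print("Port 2181 is already in use; a second ZooKeeper cannot bind to it.")
    else:
        print("Port 2181 is free.")
```

This only reports that the port is taken; finding out which process holds it still needs a tool such as fuser or ps.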
I will try the patch and get back to you.
Thanks for all your help,
Ken
From: claudio.marte...@gmail.com
Date: Tue, 3 Sep 2013 17:01:01 +0200
Subject: Re: FileNotFoundException: File 
_bsp/_defaultZkManagerDir/job_201308291126_0029/_zkServer does not exist.
To: user@giraph.apache.org

Try with the attached patch applied to trunk, without the
-D giraph.zkManagerDirectory option mentioned earlier.

On Tue, Sep 3, 2013 at 3:25 PM, Ken Williams <zoo9...@hotmail.com> wrote:

Hi Claudio,
I tried this but it made no difference. The map tasks still fail, there is
still no output, and there is still an exception in the log files:
FileNotFoundException: File /tmp/giraph/_zkServer does not exist.


[root@localhost giraph]# hadoop jar \
    /usr/local/giraph/giraph-examples/target/giraph-examples-1.0.0-for-hadoop-2.0.0-alpha-jar-with-dependencies.jar \
    org.apache.giraph.GiraphRunner \
    -Dgiraph.zkManagerDirectory='/tmp/giraph/' \
    org.apache.giraph.examples.SimpleShortestPathsVertex \
    -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
    -vip /user/root/input/tiny_graph.txt \
    -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
    -op /user/root/output/shortestpaths \
    -w 1

13/09/03 14:19:58 INFO utils.ConfigurationUtils: No edge input format specified. Ensure your InputFormat does not require one.
13/09/03 14:19:58 WARN job.GiraphConfigurationValidator: Output format vertex index type is not known
13/09/03 14:19:58 WARN job.GiraphConfigurationValidator: Output format vertex value type is not known
13/09/03 14:19:58 WARN job.GiraphConfigurationValidator: Output format edge value type is not known
13/09/03 14:19:58 INFO job.GiraphJob: run: Since checkpointing is disabled (default), do not allow any task retries (setting mapred.map.max.attempts = 0, old value = 4)
13/09/03 14:19:58 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/09/03 14:20:01 INFO mapred.JobClient: Running job: job_201308291126_0039
13/09/03 14:20:02 INFO mapred.JobClient:  map 0% reduce 0%
13/09/03 14:20:12 INFO mapred.JobClient: Job complete: job_201308291126_0039
13/09/03 14:20:12 INFO mapred.JobClient: Counters: 6
13/09/03 14:20:12 INFO mapred.JobClient:   Job Counters
13/09/03 14:20:12 INFO mapred.JobClient:     Failed map tasks=1
13/09/03 14:20:12 INFO mapred.JobClient:     Launched map tasks=2
13/09/03 14:20:12 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=16327
13/09/03 14:20:12 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=0
13/09/03 14:20:12 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/09/03 14:20:12 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0

[root@localhost giraph]# 

When I try to run ZooKeeper, it still gives me an 'Address already in use'
exception:
[root@localhost giraph]# /usr/lib/zookeeper/bin/zkServer.sh start-foreground
JMX enabled by default
Using config: /usr/lib/zookeeper/bin/../conf/zoo.cfg
2013-09-03 14:23:37,882 [myid:] - INFO  [main:QuorumPeerConfig@101] - Reading configuration from: /usr/lib/zookeeper/bin/../conf/zoo.cfg
2013-09-03 14:23:37,888 [myid:] - ERROR [main:QuorumPeerConfig@283] - Invalid configuration, only one server specified (ignoring)
2013-09-03 14:23:37,889 [myid:] - INFO  [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
2013-09-03 14:23:37,889 [myid:] - INFO  [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
2013-09-03 14:23:37,890 [myid:] - INFO  [main:DatadirCleanupManager@101] - Purge task is not scheduled.
2013-09-03 14:23:37,890 [myid:] - WARN  [main:QuorumPeerMain@118] - Either no config or no quorum defined in config, running  in standalone mode
2013-09-03 14:23:37,904 [myid:] - INFO  [main:QuorumPeerConfig@101] - Reading configuration from: /usr/lib/zookeeper/bin/../conf/zoo.cfg
2013-09-03 14:23:37,905 [myid:] - ERROR [main:QuorumPeerConfig@283] - Invalid configuration, only one server specified (ignoring)
2013-09-03 14:23:37,905 [myid:] - INFO  [main:ZooKeeperServerMain@100] - Starting server

2013-09-03 14:23:37,920 [myid:] - INFO  [main:Environment@100] - Server environment:zookeeper.version=3.4.3-cdh4.1.1--1, built on 10/16/2012 17:34 GMT
2013-09-03 14:23:37,921 [myid:] - INFO  [main:Environment@100] - Server environment:host.name=localhost.localdomain
2013-09-03 14:23:37,921 [myid:] - INFO  [main:Environment@100] - Server environment:java.version=1.6.0_31
2013-09-03 14:23:37,921 [myid:] - INFO  [main:Environment@100] - Server environment:java.vendor=Sun Microsystems Inc.
2013-09-03 14:23:37,921 [myid:] - INFO  [main:Environment@100] - Server environment:java.home=/usr/java/jdk1.6.0_31/jre
2013-09-03 14:23:37,921 [myid:] - INFO  [main:Environment@100] - Server environment:java.class.path=/usr/lib/zookeeper/bin/../build/classes:/usr/lib/zookeeper/bin/../build/lib/*.jar:/usr/lib/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/lib/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/lib/zookeeper/bin/../lib/netty-3.2.2.Final.jar:/usr/lib/zookeeper/bin/../lib/log4j-1.2.15.jar:/usr/lib/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/lib/zookeeper/bin/../zookeeper-3.4.3-cdh4.1.1.jar:/usr/lib/zookeeper/bin/../src/java/lib/*.jar:/usr/lib/zookeeper/bin/../conf:
2013-09-03 14:23:37,922 [myid:] - INFO  [main:Environment@100] - Server environment:java.library.path=/usr/java/jdk1.6.0_31/jre/lib/i386/client:/usr/java/jdk1.6.0_31/jre/lib/i386:/usr/java/jdk1.6.0_31/jre/../lib/i386:/usr/java/packages/lib/i386:/lib:/usr/lib
2013-09-03 14:23:37,922 [myid:] - INFO  [main:Environment@100] - Server environment:java.io.tmpdir=/tmp
2013-09-03 14:23:37,922 [myid:] - INFO  [main:Environment@100] - Server environment:java.compiler=<NA>
2013-09-03 14:23:37,922 [myid:] - INFO  [main:Environment@100] - Server environment:os.name=Linux
2013-09-03 14:23:37,922 [myid:] - INFO  [main:Environment@100] - Server environment:os.arch=i386
2013-09-03 14:23:37,923 [myid:] - INFO  [main:Environment@100] - Server environment:os.version=2.6.32-279.14.1.el6.i686
2013-09-03 14:23:37,923 [myid:] - INFO  [main:Environment@100] - Server environment:user.name=root
2013-09-03 14:23:37,923 [myid:] - INFO  [main:Environment@100] - Server environment:user.home=/root
2013-09-03 14:23:37,923 [myid:] - INFO  [main:Environment@100] - Server environment:user.dir=/usr/local/giraph-1.0.0
2013-09-03 14:23:37,934 [myid:] - INFO  [main:ZooKeeperServer@726] - tickTime set to 2000
2013-09-03 14:23:37,934 [myid:] - INFO  [main:ZooKeeperServer@735] - minSessionTimeout set to -1
2013-09-03 14:23:37,935 [myid:] - INFO  [main:ZooKeeperServer@744] - maxSessionTimeout set to -1

2013-09-03 14:23:37,970 [myid:] - INFO  [main:NIOServerCnxnFactory@99] - binding to port 0.0.0.0/0.0.0.0:2181
2013-09-03 14:23:37,972 [myid:] - ERROR [main:ZooKeeperServerMain@68] - Unexpected exception, exiting abnormally
java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:100)
        at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:115)
        at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:91)
        at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:53)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:121)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:79)

[root@localhost giraph]# 

Thank you for any help,
Ken

From: claudio.marte...@gmail.com
Date: Tue, 3 Sep 2013 12:43:59 +0200
Subject: Re: FileNotFoundException: File 
_bsp/_defaultZkManagerDir/job_201308291126_0029/_zkServer does not exist.
To: user@giraph.apache.org



Can you try defining the ZooKeeper manager directory from the command line,
like this: -D giraph.zkManagerDirectory=/path/in/hdfs/foobar
You'll have to delete this directory by hand before each job. This is just to
see whether it solves the problem; then I would know how to fix it.





On Tue, Sep 3, 2013 at 12:32 PM, Ken Williams <zoo9...@hotmail.com> wrote:

Hi Pradeep,
Yes, the ZooKeeper server is definitely running; I can connect to it with the
command-line client:

[root@localhost giraph]# zkCli.sh  -server 127.0.0.1:2181
Connecting to 127.0.0.1:2181
2013-09-03 11:15:45,987 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.3-cdh4.1.1--1, built on 10/16/2012 17:34 GMT
2013-09-03 11:15:45,990 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=localhost.localdomain
2013-09-03 11:15:45,990 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.6.0_31
......
WatchedEvent state:SyncConnected type:None path:null
[zk: 127.0.0.1:2181(CONNECTED) 0] ls /
[hbase, zookeeper]
[zk: 127.0.0.1:2181(CONNECTED) 1] 





However, I am a bit confused. If I look in the ZooKeeper log file, I see this
'Address already in use' error for port 2181:

2013-09-03 10:52:24,412 [myid:] - INFO  [main:ZooKeeperServer@735] - minSessionTimeout set to -1
2013-09-03 10:52:24,413 [myid:] - INFO  [main:ZooKeeperServer@744] - maxSessionTimeout set to -1
2013-09-03 10:52:24,436 [myid:] - INFO  [main:NIOServerCnxnFactory@99] - binding to port 0.0.0.0/0.0.0.0:2181
2013-09-03 10:52:24,447 [myid:] - ERROR [main:ZooKeeperServerMain@68] - Unexpected exception, exiting abnormally
java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:100)
        at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:115)
        at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:91)




The process listening on port 2181 is 2892, which turns out to be HBase.

[root@localhost giraph]# fuser 2181/tcp
2181/tcp:             2892
[root@localhost giraph]# ps aux | grep 2892
hbase     2892  0.1  3.2 719592 119624 ?       Sl   Aug29   7:35 /usr/java/jdk1.6.0_31/bin/java -XX:OnOutOfMemoryError=kill -9 %p -Xmx500m -XX:+UseConcMarkSweepGC -Dhbase.log.dir=/var/log/hbase -Dhbase.log.file=hbase-hbase-master-localhost.localdomain.log -Dhbase.home.dir=/usr/lib/hbase/bin/..
......

So I am not sure what my ZooKeeper client is connecting to. It seems to be
connecting to a ZooKeeper server, but when I do 'ps' I cannot see a ZooKeeper
server running.



Here is my zoo.cfg file:

maxClientCnxns=50
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/var/lib/zookeeper
# the port at which the clients will connect
clientPort=2181
server.1=localhost:2888:3888
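The port the standalone server tries to bind is the clientPort value in this file. As a minimal sketch, assuming the plain key=value layout with '#' comments shown in this message, the setting can be read programmatically before starting anything. The function name read_zoo_cfg and the inline SAMPLE_CFG text are illustrative only and are not part of ZooKeeper.

```python
def read_zoo_cfg(text):
    """Parse zoo.cfg-style 'key=value' lines into a dict, skipping comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Ignore blank lines and '#' comment lines.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# Abbreviated copy of the settings quoted in this message.
SAMPLE_CFG = """\
maxClientCnxns=50
# The number of milliseconds of each tick
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
# the port at which the clients will connect
clientPort=2181
server.1=localhost:2888:3888
"""

if __name__ == "__main__":
    cfg = read_zoo_cfg(SAMPLE_CFG)
    print("clientPort =", cfg["clientPort"])
```

Values are returned as strings, so a caller that needs a port number would convert cfg["clientPort"] with int().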
Thanks for any help,
Ken


-- 
    Claudio Martella
    claudio.marte...@gmail.com   
                                          


