[
https://issues.apache.org/jira/browse/GIRAPH-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905393#comment-13905393
]
Hudson commented on GIRAPH-850:
-------------------------------
FAILURE: Integrated in Giraph-trunk-Commit #1420 (See
[https://builds.apache.org/job/Giraph-trunk-Commit/1420/])
GIRAPH-850 (claudio.martella:
http://git-wip-us.apache.org/repos/asf?p=giraph.git&a=commit&h=c1d50bca96043c8a3e1ed1acfec98f92d03f6864)
* giraph-core/src/main/java/org/apache/giraph/graph/GraphTaskManager.java
* giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java
* giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java
* giraph-core/src/main/java/org/apache/giraph/zk/ZooKeeperManager.java
* giraph-core/src/main/java/org/apache/giraph/yarn/GiraphYarnClient.java
> Improve internal zookeeper launching
> ------------------------------------
>
> Key: GIRAPH-850
> URL: https://issues.apache.org/jira/browse/GIRAPH-850
> Project: Giraph
> Issue Type: Bug
> Components: zookeeper
> Reporter: Alexandre Fonseca
> Fix For: 1.1.0
>
> Attachments: GIRAPH-850-2.patch, GIRAPH-850.patch
>
>
> With the most up-to-date trunk, internal zookeeper launching appears to
> work only with Hadoop 1.x.x MR1.
> With Hadoop 2.x.x MR2, trying to run a job without specifying an external
> zookeeper location results in a failed job with the following in the logs:
> {code}
> 2014-02-12 17:30:30,281 INFO [main] org.apache.giraph.zk.ZooKeeperManager:
> onlineZooKeeperServers: Attempting to start ZooKeeper server with command
> [/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.51.x86_64/jre/bin/java, -Xmx512m,
> -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=70,
> -XX:MaxGCPauseMillis=100, -cp,
> /tmp/hadoop-yarn/staging/b.ajf/.staging/job_1392221733726_0002/job.jar,
> org.apache.zookeeper.server.quorum.QuorumPeerMain,
> /tmp/hadoop-b.ajf/nm-local-dir/usercache/b.ajf/appcache/application_1392221733726_0002/work/_bspZooKeeper/zoo.cfg]
> in directory
> /tmp/hadoop-b.ajf/nm-local-dir/usercache/b.ajf/appcache/application_1392221733726_0002/work/_bspZooKeeper
> (...)
> 2014-02-12 17:30:30,285 INFO [main] org.apache.giraph.zk.ZooKeeperManager:
> onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to
> igraph-02.hi.inet:22181 with poll msecs = 3000
> 2014-02-12 17:30:30,289 WARN [main] org.apache.giraph.zk.ZooKeeperManager:
> onlineZooKeeperServers: Got ConnectException
> java.net.ConnectException: Connection refused
> (...)
> 2014-02-12 17:30:30,413 INFO
> [org.apache.giraph.zk.ZooKeeperManager$StreamCollector]
> org.apache.giraph.zk.ZooKeeperManager$StreamCollector: readLines: Error:
> Could not find or load main class
> org.apache.zookeeper.server.quorum.QuorumPeerMain
> (...)
> {code}
> It clearly is unable to launch Zookeeper as it can't find the necessary class
> in the classpath. Looking at the command with which it tries to launch
> Zookeeper, we can see that it has specified a classpath of:
> {code}
> -cp, /tmp/hadoop-yarn/staging/b.ajf/.staging/job_1392221733726_0002/job.jar
> {code}
> which is an HDFS location.
> It seems that with Hadoop 2.x.x, the function Job.getJar() returns an HDFS
> path to the jar instead of the path to the local copy of the jar in the
> DistributedCache. Hadoop 1.x.x appears to return a correct path, as I didn't
> detect any problem there.
> The whole logic of finding the Zookeeper classpath seems extremely convoluted
> to me (not to mention broken as just shown for both MR2 and YARN). Since the
> currently running Java process has to have the zookeeper classes in its
> classpath anyway (because some of the classes in Giraph refer to Zookeeper
> classes), wouldn't it make more sense to just have the child java process
> starting Zookeeper simply inherit the classpath?
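
The suggestion above (let the child JVM inherit the parent's classpath) can be
sketched roughly as follows. This is a minimal illustration, not the change
committed for GIRAPH-850: the class name InheritClasspathSketch and the helper
buildCommand are hypothetical, and it assumes the parent's effective classpath
is fully described by the java.class.path system property, which may not hold
when classes are loaded through a custom class loader.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class InheritClasspathSketch {

    /**
     * Builds a command line that starts mainClass in a child JVM using the
     * same java binary and the same classpath as the current process.
     * (Hypothetical helper, for illustration only.)
     */
    static List<String> buildCommand(String mainClass, String... args) {
        // Locate the java binary of the JVM that is currently running.
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";

        List<String> command = new ArrayList<>();
        command.add(javaBin);
        command.add("-cp");
        // Reuse the parent's classpath instead of a path taken from the job.
        command.add(System.getProperty("java.class.path"));
        command.add(mainClass);
        for (String arg : args) {
            command.add(arg);
        }
        return command;
    }

    public static void main(String[] args) {
        List<String> command = buildCommand(
                "org.apache.zookeeper.server.quorum.QuorumPeerMain", "zoo.cfg");
        // Only print the command here; passing it to
        // new ProcessBuilder(command).start() would launch the child JVM.
        System.out.println(command);
    }
}
```

Because the classpath entries come from the running process, they refer to
files that process can already read locally, which avoids handing the child
JVM a path that only exists on HDFS.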