Re: workers no route to host

2015-04-02 Thread Dean Wampler
It appears you are using a Cloudera Spark build, 1.3.0-cdh5.4.0-SNAPSHOT,
which expects to find the hadoop command:

/data/PlatformDep/cdh5/dist/bin/compute-classpath.sh: line 164: hadoop:
command not found

If you don't want to use Hadoop, download one of the pre-built Spark
releases from spark.apache.org. Even the Hadoop builds there will work
okay, as they don't actually attempt to run Hadoop commands.
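
For example, you can check whether the machine that starts the workers can even
find a hadoop launcher, and either put one on the PATH or switch to a stock
build. The paths and file name below are only illustrations; adjust them to
your own install:

  # Is a 'hadoop' script visible to the user that starts the workers?
  command -v hadoop || echo "hadoop is not on the PATH"

  # If CDH Hadoop is installed, put its bin directory on the PATH first
  # (this parcel location is just an example):
  export PATH=/opt/cloudera/parcels/CDH/bin:$PATH

  # Or use a stock pre-built release instead (pick the build you need from the
  # spark.apache.org downloads page; this file name is one example):
  wget https://archive.apache.org/dist/spark/spark-1.3.0/spark-1.3.0-bin-hadoop2.4.tgz
  tar xzf spark-1.3.0-bin-hadoop2.4.tgz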


Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition
http://shop.oreilly.com/product/0636920033073.do (O'Reilly)
Typesafe http://typesafe.com
@deanwampler http://twitter.com/deanwampler
http://polyglotprogramming.com

On Tue, Mar 31, 2015 at 3:12 AM, ZhuGe t...@outlook.com wrote:

 Hi,
 I set up a standalone cluster of 5 machines (tmaster, tslave1,2,3,4) with
 spark-1.3.0-cdh5.4.0-SNAPSHOT.
 When I execute sbin/start-all.sh, the master is OK, but I can't see the
 web UI. Moreover, the worker logs look something like this:

 Spark assembly has been built with Hive, including Datanucleus jars on
 classpath
 /data/PlatformDep/cdh5/dist/bin/compute-classpath.sh: line 164: hadoop:
 command not found
 Spark Command: java -cp
 :/data/PlatformDep/cdh5/dist/sbin/../conf:/data/PlatformDep/cdh5/dist/lib/spark-assembly-1.3.0-cdh5.4.0-SNAPSHOT-hadoop2.6.0-cdh5.4.0-SNAPSHOT.jar:/data/PlatformDep/cdh5/dist/lib/datanucleus-rdbms-3.2.1.jar:/data/PlatformDep/cdh5/dist/lib/datanucleus-api-jdo-3.2.1.jar:/data/PlatformDep/cdh5/dist/lib/datanucleus-core-3.2.2.jar:
 -XX:MaxPermSize=128m -Dspark.akka.logLifecycleEvents=true -Xms512m -Xmx512m
 org.apache.spark.deploy.worker.Worker spark://192.168.128.16:7071
 --webui-port 8081
 

 Using Spark's default log4j profile:
 org/apache/spark/log4j-defaults.properties
 15/03/31 06:47:22 INFO Worker: Registered signal handlers for [TERM, HUP,
 INT]
 15/03/31 06:47:23 WARN NativeCodeLoader: Unable to load native-hadoop
 library for your platform... using builtin-java classes where applicable
 15/03/31 06:47:23 INFO SecurityManager: Changing view acls to: dcadmin
 15/03/31 06:47:23 INFO SecurityManager: Changing modify acls to: dcadmin
 15/03/31 06:47:23 INFO SecurityManager: SecurityManager: authentication
 disabled; ui acls disabled; users with view permissions: Set(dcadmin);
 users with modify permissions: Set(dcadmin)
 15/03/31 06:47:23 INFO Slf4jLogger: Slf4jLogger started
 15/03/31 06:47:23 INFO Remoting: Starting remoting
 15/03/31 06:47:23 INFO Remoting: Remoting started; listening on addresses
 :[akka.tcp://sparkWorker@tslave2:60815]
 15/03/31 06:47:24 INFO Utils: Successfully started service 'sparkWorker'
 on port 60815.
 15/03/31 06:47:24 INFO Worker: Starting Spark worker tslave2:60815 with 2
 cores, 3.0 GB RAM
 15/03/31 06:47:24 INFO Worker: Running Spark version 1.3.0
 15/03/31 06:47:24 INFO Worker: Spark home: /data/PlatformDep/cdh5/dist
 15/03/31 06:47:24 INFO Server: jetty-8.y.z-SNAPSHOT
 15/03/31 06:47:24 INFO AbstractConnector: Started
 SelectChannelConnector@0.0.0.0:8081
 15/03/31 06:47:24 INFO Utils: Successfully started service 'WorkerUI' on
 port 8081.
 15/03/31 06:47:24 INFO WorkerWebUI: Started WorkerWebUI at
 http://tslave2:8081
 15/03/31 06:47:24 INFO Worker: Connecting to master akka.tcp://
 sparkMaster@192.168.128.16:7071/user/Master...
 15/03/31 06:47:24 ERROR EndpointWriter: AssociationError
 [akka.tcp://sparkWorker@tslave2:60815] - [akka.tcp://
 sparkMaster@192.168.128.16:7071]: Error [Association failed with
 [akka.tcp://sparkMaster@192.168.128.16:7071]] [
 akka.remote.EndpointAssociationException: Association failed with
 [akka.tcp://sparkMaster@192.168.128.16:7071]
 Caused by:
 akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: No
 route to host
 ]
 15/03/31 06:47:24 ERROR EndpointWriter: AssociationError
 [akka.tcp://sparkWorker@tslave2:60815] - [akka.tcp://
 sparkMaster@192.168.128.16:7071]: Error [Association failed with
 [akka.tcp://sparkMaster@192.168.128.16:7071]] [
 akka.remote.EndpointAssociationException: Association failed with
 [akka.tcp://sparkMaster@192.168.128.16:7071]
 Caused by:
 akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: No
 route to host
 ]
 15/03/31 06:47:24 ERROR EndpointWriter: AssociationError
 [akka.tcp://sparkWorker@tslave2:60815] - [akka.tcp://
 sparkMaster@192.168.128.16:7071]: Error [Association failed with
 [akka.tcp://sparkMaster@192.168.128.16:7071]] [
 akka.remote.EndpointAssociationException: Association failed with
 [akka.tcp://sparkMaster@192.168.128.16:7071]
 Caused by:
 akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: No
 route to host
 ]
 15/03/31 06:47:24 ERROR EndpointWriter: AssociationError
 [akka.tcp://sparkWorker@tslave2:60815] - [akka.tcp://
 sparkMaster@192.168.128.16:7071]: Error [Association failed with
 [akka.tcp://sparkMaster@192.168.128.16:7071]] [
 akka.remote.EndpointAssociationException: Association failed with
 [akka.tcp://sparkMaster@192.168.128.16:7071]



 the worker machines can ping the master machine successfully.

workers no route to host

2015-03-31 Thread ZhuGe
Hi,
I set up a standalone cluster of 5 machines (tmaster, tslave1,2,3,4) with
spark-1.3.0-cdh5.4.0-SNAPSHOT. When I execute sbin/start-all.sh, the master
is OK, but I can't see the web UI. Moreover, the worker logs look something
like this:
Spark assembly has been built with Hive, including Datanucleus jars on classpath
/data/PlatformDep/cdh5/dist/bin/compute-classpath.sh: line 164: hadoop: command not found
Spark Command: java -cp :/data/PlatformDep/cdh5/dist/sbin/../conf:/data/PlatformDep/cdh5/dist/lib/spark-assembly-1.3.0-cdh5.4.0-SNAPSHOT-hadoop2.6.0-cdh5.4.0-SNAPSHOT.jar:/data/PlatformDep/cdh5/dist/lib/datanucleus-rdbms-3.2.1.jar:/data/PlatformDep/cdh5/dist/lib/datanucleus-api-jdo-3.2.1.jar:/data/PlatformDep/cdh5/dist/lib/datanucleus-core-3.2.2.jar: -XX:MaxPermSize=128m -Dspark.akka.logLifecycleEvents=true -Xms512m -Xmx512m org.apache.spark.deploy.worker.Worker spark://192.168.128.16:7071 --webui-port 8081

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/03/31 06:47:22 INFO Worker: Registered signal handlers for [TERM, HUP, INT]
15/03/31 06:47:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/03/31 06:47:23 INFO SecurityManager: Changing view acls to: dcadmin
15/03/31 06:47:23 INFO SecurityManager: Changing modify acls to: dcadmin
15/03/31 06:47:23 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(dcadmin); users with modify permissions: Set(dcadmin)
15/03/31 06:47:23 INFO Slf4jLogger: Slf4jLogger started
15/03/31 06:47:23 INFO Remoting: Starting remoting
15/03/31 06:47:23 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkWorker@tslave2:60815]
15/03/31 06:47:24 INFO Utils: Successfully started service 'sparkWorker' on port 60815.
15/03/31 06:47:24 INFO Worker: Starting Spark worker tslave2:60815 with 2 cores, 3.0 GB RAM
15/03/31 06:47:24 INFO Worker: Running Spark version 1.3.0
15/03/31 06:47:24 INFO Worker: Spark home: /data/PlatformDep/cdh5/dist
15/03/31 06:47:24 INFO Server: jetty-8.y.z-SNAPSHOT
15/03/31 06:47:24 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:8081
15/03/31 06:47:24 INFO Utils: Successfully started service 'WorkerUI' on port 8081.
15/03/31 06:47:24 INFO WorkerWebUI: Started WorkerWebUI at http://tslave2:8081
15/03/31 06:47:24 INFO Worker: Connecting to master akka.tcp://sparkMaster@192.168.128.16:7071/user/Master...
15/03/31 06:47:24 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@tslave2:60815] - [akka.tcp://sparkMaster@192.168.128.16:7071]: Error [Association failed with [akka.tcp://sparkMaster@192.168.128.16:7071]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.128.16:7071]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: No route to host
]
15/03/31 06:47:24 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@tslave2:60815] - [akka.tcp://sparkMaster@192.168.128.16:7071]: Error [Association failed with [akka.tcp://sparkMaster@192.168.128.16:7071]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.128.16:7071]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: No route to host
]
15/03/31 06:47:24 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@tslave2:60815] - [akka.tcp://sparkMaster@192.168.128.16:7071]: Error [Association failed with [akka.tcp://sparkMaster@192.168.128.16:7071]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.128.16:7071]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: No route to host
]
15/03/31 06:47:24 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@tslave2:60815] - [akka.tcp://sparkMaster@192.168.128.16:7071]: Error [Association failed with [akka.tcp://sparkMaster@192.168.128.16:7071]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@192.168.128.16:7071]


The worker machines can ping the master machine successfully. The hosts file is like this:

192.168.128.16 tmaster tmaster
192.168.128.17 tslave1 tslave1
192.168.128.18 tslave2 tslave2
192.168.128.19 tslave3 tslave3
192.168.128.20 tslave4 tslave4

Hope someone could help. Thanks
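
Since ping works but the worker still gets "No route to host", the usual
suspect is that the master's TCP port is firewalled or the master is not
actually listening on 192.168.128.16:7071. A few generic checks along these
lines may help (7071 is taken from the logs above; the stock standalone
master port is 7077):

  # From a worker: can we open a TCP connection to the master's port?
  nc -zv 192.168.128.16 7071

  # On the master: is anything listening on that port, and on which interface?
  sudo netstat -tlnp | grep 7071

  # On the master: is a firewall rule rejecting the connection?
  sudo iptables -L -n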