Hi,
I have set up a Spark 0.9.2 standalone cluster using CDH5 and the
pre-built Spark distribution archive for Hadoop 2. I did not use the
spark-ec2 scripts because I am not on EC2.
spark-shell seems to be working properly: I am able to perform simple
RDD operations, and the SparkPi standalone example runs fine via
`run-example`. The web UI shows all workers connected.
However, a standalone Scala application gets "connection refused"
messages. I suspect a configuration problem, because spark-shell and
SparkPi work well. I verified that .setMaster and .setSparkHome are set
correctly within the Scala app.
Is there anything else in the configuration of a standalone Scala app on
Spark that I am missing?
I would very much appreciate any clues.
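For context, the SparkContext setup in the app follows this pattern (a
minimal sketch, not my exact code; the jar path matches the sbt output
below, while /opt/spark is a placeholder for the real Spark home):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object MovieLensALS {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("spark://master:7077")   // same master URL the driver prints
      .setAppName("MovieLensALS")
      .setSparkHome("/opt/spark")         // placeholder for the actual install path
      // the application jar has to be shipped to the executors as well
      .setJars(Seq("target/scala-2.10/movielens-als_2.10-0.0.jar"))
    val sc = new SparkContext(conf)
    // ... load ratings, train ALS model, etc. ...
    sc.stop()
  }
}
```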
Specifically, I am trying to run the MovieLensALS.scala example from the
AMPCamp big data mini course
(http://ampcamp.berkeley.edu/big-data-mini-course/movie-recommendation-with-mllib.html).
Here is the error I get when I try to run the compiled jar:
---------------
root@master:~/machine-learning/scala# sbt/sbt package "run
/movielens/medium"
Launching sbt from sbt/sbt-launch-0.12.4.jar
[info] Loading project definition from
/root/training/machine-learning/scala/project
[info] Set current project to movielens-als (in build
file:/root/training/machine-learning/scala/)
[info] Compiling 1 Scala source to
/root/training/machine-learning/scala/target/scala-2.10/classes...
[warn] there were 2 deprecation warning(s); re-run with -deprecation for
details
[warn] one warning found
[info] Packaging
/root/training/machine-learning/scala/target/scala-2.10/movielens-als_2.10-0.0.jar
...
[info] Done packaging.
[success] Total time: 6 s, completed Oct 2, 2014 1:19:00 PM
[info] Running MovieLensALS /movielens/medium
master = spark://master:7077
log4j:WARN No appenders could be found for logger
(akka.event.slf4j.Slf4jLogger).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
more info.
14/10/02 13:19:01 WARN NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
HERE
THERE
14/10/02 13:19:02 INFO FileInputFormat: Total input paths to process : 1
14/10/02 13:19:03 ERROR TaskSchedulerImpl: Lost executor 0 on host2:
remote Akka client disassociated
14/10/02 13:19:03 WARN TaskSetManager: Lost TID 1 (task 0.0:1)
14/10/02 13:19:03 WARN TaskSetManager: Lost TID 0 (task 0.0:0)
14/10/02 13:19:03 ERROR TaskSchedulerImpl: Lost executor 4 on host5:
remote Akka client disassociated
14/10/02 13:19:03 WARN TaskSetManager: Lost TID 3 (task 0.0:1)
14/10/02 13:19:03 ERROR TaskSchedulerImpl: Lost executor 1 on host4:
remote Akka client disassociated
14/10/02 13:19:03 WARN TaskSetManager: Lost TID 2 (task 0.0:0)
14/10/02 13:19:03 WARN TaskSetManager: Lost TID 4 (task 0.0:1)
14/10/02 13:19:03 ERROR TaskSchedulerImpl: Lost executor 3 on host3:
remote Akka client disassociated
14/10/02 13:19:03 WARN TaskSetManager: Lost TID 6 (task 0.0:0)
14/10/02 13:19:03 ERROR TaskSchedulerImpl: Lost executor 2 on host1:
remote Akka client disassociated
14/10/02 13:19:03 WARN TaskSetManager: Lost TID 5 (task 0.0:1)
14/10/02 13:19:03 WARN TaskSetManager: Lost TID 7 (task 0.0:0)
14/10/02 13:19:04 ERROR TaskSchedulerImpl: Lost executor 6 on host4:
remote Akka client disassociated
14/10/02 13:19:04 WARN TaskSetManager: Lost TID 8 (task 0.0:0)
14/10/02 13:19:04 WARN TaskSetManager: Lost TID 9 (task 0.0:1)
14/10/02 13:19:04 ERROR TaskSchedulerImpl: Lost executor 5 on host2:
remote Akka client disassociated
14/10/02 13:19:04 WARN TaskSetManager: Lost TID 10 (task 0.0:1)
14/10/02 13:19:04 ERROR TaskSchedulerImpl: Lost executor 7 on host5:
remote Akka client disassociated
14/10/02 13:19:04 WARN TaskSetManager: Lost TID 11 (task 0.0:0)
14/10/02 13:19:04 WARN TaskSetManager: Lost TID 12 (task 0.0:1)
14/10/02 13:19:04 ERROR TaskSchedulerImpl: Lost executor 8 on host3:
remote Akka client disassociated
14/10/02 13:19:04 WARN TaskSetManager: Lost TID 13 (task 0.0:1)
14/10/02 13:19:04 ERROR TaskSchedulerImpl: Lost executor 9 on host1:
remote Akka client disassociated
14/10/02 13:19:04 WARN TaskSetManager: Lost TID 14 (task 0.0:0)
14/10/02 13:19:04 WARN TaskSetManager: Lost TID 15 (task 0.0:1)
14/10/02 13:19:05 ERROR AppClient$ClientActor: Master removed our
application: FAILED; stopping client
14/10/02 13:19:05 WARN SparkDeploySchedulerBackend: Disconnected from
Spark cluster! Waiting for reconnection...
14/10/02 13:19:06 ERROR TaskSchedulerImpl: Lost executor 11 on host5:
remote Akka client disassociated
14/10/02 13:19:06 WARN TaskSetManager: Lost TID 17 (task 0.0:0)
14/10/02 13:19:06 WARN TaskSetManager: Lost TID 16 (task 0.0:1)
---------------
And this is the error log from one of the workers:
---------------
14/10/02 13:19:05 INFO worker.Worker: Executor app-20141002131901-0002/9
finished with state FAILED message Command exited with code 1 exitStatus 1
14/10/02 13:19:05 INFO actor.LocalActorRef: Message
[akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying]
from Actor[akka://sparkWorker/deadLetters] to
Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkWorker%40xxx.xx.xx.xx%3A57719-15#1504298502]
was not delivered. [6] dead letters encountered. This logging can be
turned off or adjusted with configuration settings
'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
14/10/02 13:19:05 ERROR remote.EndpointWriter: AssociationError
[akka.tcp://sparkWorker@host1:47421] ->
[akka.tcp://sparkExecutor@host1:45542]: Error [Association failed with
[akka.tcp://sparkExecutor@host1:45542]] [
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://sparkExecutor@host1:45542]
Caused by:
akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
Connection refused: host1/xxx.xx.xx.xx:45542
]
14/10/02 13:19:05 ERROR remote.EndpointWriter: AssociationError
[akka.tcp://sparkWorker@host1:47421] ->
[akka.tcp://sparkExecutor@host1:45542]: Error [Association failed with
[akka.tcp://sparkExecutor@host1:45542]] [
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://sparkExecutor@host1:45542]
Caused by:
akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
Connection refused: host1/xxx.xx.xx.xx:45542
]
14/10/02 13:19:05 ERROR remote.EndpointWriter: AssociationError
[akka.tcp://sparkWorker@host1:47421] ->
[akka.tcp://sparkExecutor@host1:45542]: Error [Association failed with
[akka.tcp://sparkExecutor@host1:45542]] [
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://sparkExecutor@host1:45542]
Caused by:
akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
Connection refused: host1/xxx.xx.xx.xx:45542
]
---------------
Thanks!
Irina
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org