When you run spark-shell and the example, are you actually specifying
--master spark://master:7077, as shown here:
http://spark.apache.org/docs/latest/programming-guide.html#initializing-spark

If you're not, your spark-shell is running in local mode and not actually
connecting to the cluster. Also, if you run spark-shell against the cluster,
you'll see it listed under Running Applications in the master UI. It would be
pretty odd for spark-shell to connect to the cluster successfully but for your
app not to, which is why I suspect you're running spark-shell in local mode.
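
For reference, a minimal sketch of what I mean inside the app (the app name
and master host here are just placeholders, adjust them to your setup):

    import org.apache.spark.{SparkConf, SparkContext}

    // Point the driver at the standalone master instead of running in local mode.
    val conf = new SparkConf()
      .setAppName("MovieLensALS")        // placeholder name
      .setMaster("spark://master:7077")  // the URL shown at the top of the master UI
    val sc = new SparkContext(conf)

If the master URL is right, the app shows up under Running Applications the
same way spark-shell does.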

Another thing to check: the executors need to connect back to your driver,
so it could be that you have to set the driver host or driver port
(spark.driver.host / spark.driver.port). In fact, looking at your executor
log, this seems fairly likely: is host1/xxx.xx.xx.xx:45542 the machine where
your driver is running? Is that host/port reachable from the worker machines?
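
If that's the problem, here's a sketch of how you could pin the driver's
address explicitly (the host name and port below are made up -- use an
address and port that the workers can actually reach, and make sure it isn't
firewalled):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("MovieLensALS")
      .setMaster("spark://master:7077")
      // hostname/IP of the machine running the driver, as the workers see it
      .set("spark.driver.host", "driver-host")
      // fix the driver port instead of letting Spark pick a random one
      .set("spark.driver.port", "51000")
    val sc = new SparkContext(conf)

Fixing the port also means you only have to open that one port between the
workers and the driver machine.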

On Fri, Oct 3, 2014 at 5:32 AM, Irina Fedulova <fedul...@gmail.com> wrote:

> Hi,
>
> I have set up a Spark 0.9.2 standalone cluster using CDH5 and the pre-built
> Spark distribution archive for Hadoop 2. I was not using the spark-ec2
> scripts because I am not on the EC2 cloud.
>
> Spark-shell seems to be working properly -- I am able to perform simple
> RDD operations, and e.g. the SparkPi standalone example works well when run
> via `run-example`. The web UI shows all workers connected.
>
> However, my standalone Scala application gets "connection refused" messages.
> I think this has something to do with configuration, because spark-shell
> and SparkPi work well. I verified that .setMaster and .setSparkHome are
> properly set within the Scala app.
>
> Is there anything else in the configuration of a standalone Scala app on
> Spark that I am missing?
> I would very much appreciate any clues.
>
> Specifically, I am trying to run the MovieLensALS.scala example from the
> AMP Camp big data mini course (http://ampcamp.berkeley.edu/big-data-mini-course/movie-recommendation-with-mllib.html).
>
> Here is the error I get when I try to run the compiled jar:
> ---------------
> root@master:~/machine-learning/scala# sbt/sbt package "run
> /movielens/medium"
> Launching sbt from sbt/sbt-launch-0.12.4.jar
> [info] Loading project definition from /root/training/machine-
> learning/scala/project
> [info] Set current project to movielens-als (in build
> file:/root/training/machine-learning/scala/)
> [info] Compiling 1 Scala source to /root/training/machine-
> learning/scala/target/scala-2.10/classes...
> [warn] there were 2 deprecation warning(s); re-run with -deprecation for
> details
> [warn] one warning found
> [info] Packaging 
> /root/training/machine-learning/scala/target/scala-2.10/movielens-als_2.10-0.0.jar
> ...
> [info] Done packaging.
> [success] Total time: 6 s, completed Oct 2, 2014 1:19:00 PM
> [info] Running MovieLensALS /movielens/medium
> master = spark://master:7077
> log4j:WARN No appenders could be found for logger
> (akka.event.slf4j.Slf4jLogger).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
> more info.
> 14/10/02 13:19:01 WARN NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> HERE
> THERE
> 14/10/02 13:19:02 INFO FileInputFormat: Total input paths to process : 1
> 14/10/02 13:19:03 ERROR TaskSchedulerImpl: Lost executor 0 on host2:
> remote Akka client disassociated
> 14/10/02 13:19:03 WARN TaskSetManager: Lost TID 1 (task 0.0:1)
> 14/10/02 13:19:03 WARN TaskSetManager: Lost TID 0 (task 0.0:0)
> 14/10/02 13:19:03 ERROR TaskSchedulerImpl: Lost executor 4 on host5:
> remote Akka client disassociated
> 14/10/02 13:19:03 WARN TaskSetManager: Lost TID 3 (task 0.0:1)
> 14/10/02 13:19:03 ERROR TaskSchedulerImpl: Lost executor 1 on host4:
> remote Akka client disassociated
> 14/10/02 13:19:03 WARN TaskSetManager: Lost TID 2 (task 0.0:0)
> 14/10/02 13:19:03 WARN TaskSetManager: Lost TID 4 (task 0.0:1)
> 14/10/02 13:19:03 ERROR TaskSchedulerImpl: Lost executor 3 on host3:
> remote Akka client disassociated
> 14/10/02 13:19:03 WARN TaskSetManager: Lost TID 6 (task 0.0:0)
> 14/10/02 13:19:03 ERROR TaskSchedulerImpl: Lost executor 2 on host1:
> remote Akka client disassociated
> 14/10/02 13:19:03 WARN TaskSetManager: Lost TID 5 (task 0.0:1)
> 14/10/02 13:19:03 WARN TaskSetManager: Lost TID 7 (task 0.0:0)
> 14/10/02 13:19:04 ERROR TaskSchedulerImpl: Lost executor 6 on host4:
> remote Akka client disassociated
> 14/10/02 13:19:04 WARN TaskSetManager: Lost TID 8 (task 0.0:0)
> 14/10/02 13:19:04 WARN TaskSetManager: Lost TID 9 (task 0.0:1)
> 14/10/02 13:19:04 ERROR TaskSchedulerImpl: Lost executor 5 on host2:
> remote Akka client disassociated
> 14/10/02 13:19:04 WARN TaskSetManager: Lost TID 10 (task 0.0:1)
> 14/10/02 13:19:04 ERROR TaskSchedulerImpl: Lost executor 7 on host5:
> remote Akka client disassociated
> 14/10/02 13:19:04 WARN TaskSetManager: Lost TID 11 (task 0.0:0)
> 14/10/02 13:19:04 WARN TaskSetManager: Lost TID 12 (task 0.0:1)
> 14/10/02 13:19:04 ERROR TaskSchedulerImpl: Lost executor 8 on host3:
> remote Akka client disassociated
> 14/10/02 13:19:04 WARN TaskSetManager: Lost TID 13 (task 0.0:1)
> 14/10/02 13:19:04 ERROR TaskSchedulerImpl: Lost executor 9 on host1:
> remote Akka client disassociated
> 14/10/02 13:19:04 WARN TaskSetManager: Lost TID 14 (task 0.0:0)
> 14/10/02 13:19:04 WARN TaskSetManager: Lost TID 15 (task 0.0:1)
> 14/10/02 13:19:05 ERROR AppClient$ClientActor: Master removed our
> application: FAILED; stopping client
> 14/10/02 13:19:05 WARN SparkDeploySchedulerBackend: Disconnected from
> Spark cluster! Waiting for reconnection...
> 14/10/02 13:19:06 ERROR TaskSchedulerImpl: Lost executor 11 on host5:
> remote Akka client disassociated
> 14/10/02 13:19:06 WARN TaskSetManager: Lost TID 17 (task 0.0:0)
> 14/10/02 13:19:06 WARN TaskSetManager: Lost TID 16 (task 0.0:1)
> ---------------
>
> And this is the error log on one of the workers:
> ---------------
> 14/10/02 13:19:05 INFO worker.Worker: Executor app-20141002131901-0002/9
> finished with state FAILED message Command exited with code 1 exitStatus 1
> 14/10/02 13:19:05 INFO actor.LocalActorRef: Message [akka.remote.transport.
> ActorTransportAdapter$DisassociateUnderlying] from
> Actor[akka://sparkWorker/deadLetters] to Actor[akka://sparkWorker/
> system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%
> 2FsparkWorker%40xxx.xx.xx.xx%3A57719-15#1504298502] was not delivered.
> [6] dead letters encountered. This logging can be turned off or adjusted
> with configuration settings 'akka.log-dead-letters' and
> 'akka.log-dead-letters-during-shutdown'.
> 14/10/02 13:19:05 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://sparkWorker@host1:47421] -> 
> [akka.tcp://sparkExecutor@host1:45542]:
> Error [Association failed with [akka.tcp://sparkExecutor@host1:45542]] [
> akka.remote.EndpointAssociationException: Association failed with
> [akka.tcp://sparkExecutor@host1:45542]
> Caused by: 
> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
> Connection refused: host1/xxx.xx.xx.xx:45542
> ]
> 14/10/02 13:19:05 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://sparkWorker@host1:47421] -> 
> [akka.tcp://sparkExecutor@host1:45542]:
> Error [Association failed with [akka.tcp://sparkExecutor@host1:45542]] [
> akka.remote.EndpointAssociationException: Association failed with
> [akka.tcp://sparkExecutor@host1:45542]
> Caused by: 
> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
> Connection refused: host1/xxx.xx.xx.xx:45542
> ]
> 14/10/02 13:19:05 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://sparkWorker@host1:47421] -> 
> [akka.tcp://sparkExecutor@host1:45542]:
> Error [Association failed with [akka.tcp://sparkExecutor@host1:45542]] [
> akka.remote.EndpointAssociationException: Association failed with
> [akka.tcp://sparkExecutor@host1:45542]
> Caused by: 
> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
> Connection refused: host1/xxx.xx.xx.xx:45542
> ---------------
>
> Thanks!
> Irina
>
