(Trying to bubble up the issue again...)

Any insights (based on the enclosed logs) into why the standalone client
invocation might fail while jobs issued through the Spark shell succeed?
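
One detail that stands out when comparing the two runs below: in the failing
case the executor tries to connect back to the driver at
akka.tcp://spark@R9FX97h.local:56720 (the laptop's hostname) and Remoting
reports that address as unreachable, while in the successful shell run the
driver address (shell18046) is reachable from the worker. If the worker
containers can't resolve or route to R9FX97h.local, would pinning the driver's
advertised address help? A sketch of what I have in mind, purely on that
assumption (spark.driver.host is a standard Spark property; the bridge IP
below is a placeholder for whatever address the containers can actually
reach):

// Hypothetical workaround: advertise a driver address the worker containers
// can reach, set before the JavaSparkContext is constructed.
// "172.17.42.1" stands in for the docker0 bridge IP on the host.
System.setProperty("spark.driver.host", "172.17.42.1");
JavaSparkContext ctx = new JavaSparkContext(masterAddr, "log_processor",
        sparkHome, jarFileLoc);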

Thanks,
Bharath


On Thu, May 15, 2014 at 5:08 PM, Bharath Ravi Kumar <reachb...@gmail.com> wrote:

> Hi,
>
> I'm running the Spark server with a single worker on a laptop, using the
> docker images. The spark-shell examples run fine with this setup. However,
> when a standalone Java client tries to run wordcount on a local file (1 MB
> in size), the execution fails with the following error on the stdout of the
> worker:
>
> 14/05/15 10:31:21 INFO Slf4jLogger: Slf4jLogger started
> 14/05/15 10:31:21 INFO Remoting: Starting remoting
> 14/05/15 10:31:22 INFO Remoting: Remoting started; listening on addresses
> :[akka.tcp://sparkExecutor@worker1:55924]
> 14/05/15 10:31:22 INFO Remoting: Remoting now listens on addresses:
> [akka.tcp://sparkExecutor@worker1:55924]
> 14/05/15 10:31:22 INFO CoarseGrainedExecutorBackend: Connecting to driver:
> akka.tcp://spark@R9FX97h.local:56720/user/CoarseGrainedScheduler
> 14/05/15 10:31:22 INFO WorkerWatcher: Connecting to worker
> akka.tcp://sparkWorker@worker1:50040/user/Worker
> 14/05/15 10:31:22 WARN Remoting: Tried to associate with unreachable
> remote address [akka.tcp://spark@R9FX97h.local:56720]. Address is now
> gated for 60000 ms, all messages to this address will be delivered to dead
> letters.
> 14/05/15 10:31:22 ERROR CoarseGrainedExecutorBackend: Driver Disassociated
> [akka.tcp://sparkExecutor@worker1:55924] ->
> [akka.tcp://spark@R9FX97h.local:56720] disassociated! Shutting down.
>
> I noticed the following messages on the worker console when I attached
> through docker:
>
> 14/05/15 11:24:33 INFO Worker: Asked to launch executor
> app-20140515112408-0005/7 for billingLogProcessor
> 14/05/15 11:24:33 ERROR EndpointWriter: AssociationError
> [akka.tcp://sparkWorker@worker1:50040] ->
> [akka.tcp://sparkExecutor@worker1:42437]: Error [Association failed with
> [akka.tcp://sparkExecutor@worker1:42437]] [
> akka.remote.EndpointAssociationException: Association failed with
> [akka.tcp://sparkExecutor@worker1:42437]
> Caused by:
> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
> Connection refused: worker1/172.17.0.4:42437
> ]
> 14/05/15 11:24:33 ERROR EndpointWriter: AssociationError
> [akka.tcp://sparkWorker@worker1:50040] ->
> [akka.tcp://sparkExecutor@worker1:42437]: Error [Association failed with
> [akka.tcp://sparkExecutor@worker1:42437]] [
> akka.remote.EndpointAssociationException: Association failed with
> [akka.tcp://sparkExecutor@worker1:42437]
> Caused by:
> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
> Connection refused: worker1/172.17.0.4:42437
> ]
> 14/05/15 11:24:33 INFO ExecutorRunner: Launch command:
> "/usr/lib/jvm/java-7-openjdk-amd64/bin/java" "-cp"
> ":/opt/spark-0.9.0/conf:/opt/spark-0.9.0/assembly/target/scala-2.10/spark-assembly_2.10-0.9.0-incubating-hadoop1.0.4.jar"
> "-Xms512M" "-Xmx512M"
> "org.apache.spark.executor.CoarseGrainedExecutorBackend"
> "akka.tcp://spark@R9FX97h.local:46986/user/CoarseGrainedScheduler" "7"
> "worker1" "1" "akka.tcp://sparkWorker@worker1:50040/user/Worker"
> "app-20140515112408-0005"
> 14/05/15 11:24:35 INFO Worker: Executor app-20140515112408-0005/7 finished
> with state FAILED message Command exited with code 1 exitStatus 1
> 14/05/15 11:24:35 INFO LocalActorRef: Message
> [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying] from
> Actor[akka://sparkWorker/deadLetters] to
> Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkWorker%40172.17.0.4%3A33648-135#310170905]
> was not delivered. [34] dead letters encountered. This logging can be
> turned off or adjusted with configuration settings 'akka.log-dead-letters'
> and 'akka.log-dead-letters-during-shutdown'.
> 14/05/15 11:24:35 ERROR EndpointWriter: AssociationError
> [akka.tcp://sparkWorker@worker1:50040] ->
> [akka.tcp://sparkExecutor@worker1:56594]: Error [Association failed with
> [akka.tcp://sparkExecutor@worker1:56594]] [
> akka.remote.EndpointAssociationException: Association failed with
> [akka.tcp://sparkExecutor@worker1:56594]
> Caused by:
> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
> Connection refused: worker1/172.17.0.4:56594
> ]
> 14/05/15 11:24:35 ERROR EndpointWriter: AssociationError
> [akka.tcp://sparkWorker@worker1:50040] ->
> [akka.tcp://sparkExecutor@worker1:56594]: Error [Association failed with
> [akka.tcp://sparkExecutor@worker1:56594]] [
> akka.remote.EndpointAssociationException: Association failed with
> [akka.tcp://sparkExecutor@worker1:56594]
> Caused by:
> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
> Connection refused: worker1/172.17.0.4:56594
> ]
>
> The relevant code snippets from the standalone Java client are as
> follows:
>
> JavaSparkContext ctx = new JavaSparkContext(masterAddr, "log_processor",
>         sparkHome, jarFileLoc);
> JavaRDD<String> rawLog = ctx.textFile("/tmp/some.log");
> List<Tuple2<String, Long>> topRecords = rawLog.map(fieldSplitter)
>         .map(fieldExtractor)
>         .top(5, tupleComparator);
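>
> For completeness, here is a minimal self-contained sketch of the client;
> the master address, jar path, and the two map functions are illustrative
> placeholders rather than the exact production code:
>
> import java.io.Serializable;
> import java.util.Comparator;
> import java.util.List;
> import org.apache.spark.api.java.JavaRDD;
> import org.apache.spark.api.java.JavaSparkContext;
> import org.apache.spark.api.java.function.Function;
> import scala.Tuple2;
>
> public class LogProcessor {
>     // Comparator for top(); Serializable so it can be shipped to executors.
>     static class TupleComparator
>             implements Comparator<Tuple2<String, Long>>, Serializable {
>         public int compare(Tuple2<String, Long> a, Tuple2<String, Long> b) {
>             return a._2().compareTo(b._2());
>         }
>     }
>
>     public static void main(String[] args) {
>         String masterAddr = "spark://<master-host>:7077"; // placeholder
>         String sparkHome = "/opt/spark-0.9.0";
>         String jarFileLoc = "/path/to/log_processor.jar"; // jar with this class
>
>         JavaSparkContext ctx = new JavaSparkContext(masterAddr, "log_processor",
>                 sparkHome, jarFileLoc);
>         JavaRDD<String> rawLog = ctx.textFile("/tmp/some.log");
>
>         // Stand-in for fieldSplitter: split each line on whitespace.
>         JavaRDD<String[]> fields = rawLog.map(new Function<String, String[]>() {
>             public String[] call(String line) { return line.split("\\s+"); }
>         });
>         // Stand-in for fieldExtractor: pair the first field with a count of 1.
>         JavaRDD<Tuple2<String, Long>> records =
>                 fields.map(new Function<String[], Tuple2<String, Long>>() {
>                     public Tuple2<String, Long> call(String[] f) {
>                         return new Tuple2<String, Long>(f[0], 1L);
>                     }
>                 });
>
>         List<Tuple2<String, Long>> topRecords = records.top(5, new TupleComparator());
>         System.out.println(topRecords);
>         ctx.stop();
>     }
> }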
>
>
> However, running the sample code provided on GitHub (the amplab docker
> page) through the Spark shell went through fine, with the following stdout
> messages:
>
> 14/05/15 10:39:41 INFO Slf4jLogger: Slf4jLogger started
> 14/05/15 10:39:42 INFO Remoting: Starting remoting
> 14/05/15 10:39:42 INFO Remoting: Remoting started; listening on addresses
> :[akka.tcp://sparkExecutor@worker1:33203]
> 14/05/15 10:39:42 INFO Remoting: Remoting now listens on addresses:
> [akka.tcp://sparkExecutor@worker1:33203]
> 14/05/15 10:39:42 INFO CoarseGrainedExecutorBackend: Connecting to driver:
> akka.tcp://spark@shell18046:45505/user/CoarseGrainedScheduler
> 14/05/15 10:39:42 INFO WorkerWatcher: Connecting to worker
> akka.tcp://sparkWorker@worker1:50040/user/Worker
> 14/05/15 10:39:42 INFO WorkerWatcher: Successfully connected to
> akka.tcp://sparkWorker@worker1:50040/user/Worker
> 14/05/15 10:39:42 INFO CoarseGrainedExecutorBackend: Successfully
> registered with driver
> ...
>
> The corresponding output seen on the worker was:
>
> 14/05/15 11:31:31 INFO Worker: Asked to launch executor
> app-20140515113131-0006/0 for Spark shell
> 14/05/15 11:31:31 INFO ExecutorRunner: Launch command:
> "/usr/lib/jvm/java-7-openjdk-amd64/bin/java" "-cp"
> ":/opt/spark-0.9.0/conf:/opt/spark-0.9.0/assembly/target/scala-2.10/spark-assembly_2.10-0.9.0-incubating-hadoop1.0.4.jar"
> "-Xms800M" "-Xmx800M"
> "org.apache.spark.executor.CoarseGrainedExecutorBackend"
> "akka.tcp://spark@shell16722:52142/user/CoarseGrainedScheduler" "0"
> "worker1" "1" "akka.tcp://sparkWorker@worker1:50040/user/Worker"
> "app-20140515113131-0006"
>
> Any pointers toward what might be wrong with the standalone client?
> Apologies for the lengthy log messages. Thanks in advance.
>
> -Bharath
>
