[ https://issues.apache.org/jira/browse/SPARK-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Henry Saputra updated SPARK-2586:
---------------------------------
    Labels: tachyon  (was: )

Lack of information to figure out connection to Tachyon master is inactive/down
-------------------------------------------------------------------------------

         Key: SPARK-2586
         URL: https://issues.apache.org/jira/browse/SPARK-2586
     Project: Spark
  Issue Type: Bug
  Components: Spark Core
    Reporter: Henry Saputra
      Labels: tachyon

When running Spark with Tachyon, if the connection to the Tachyon master goes down (due to a network problem or the master node itself being down), there is no clear log entry or error message to indicate it.

Here is a sample log from running the SparkTachyonPi example with Tachyon reachable:

14/07/15 16:43:10 INFO Utils: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
14/07/15 16:43:10 WARN Utils: Your hostname, henry-pivotal.local resolves to a loopback address: 127.0.0.1; using 10.64.5.148 instead (on interface en5)
14/07/15 16:43:10 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
14/07/15 16:43:11 INFO SecurityManager: Changing view acls to: hsaputra
14/07/15 16:43:11 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hsaputra)
14/07/15 16:43:11 INFO Slf4jLogger: Slf4jLogger started
14/07/15 16:43:11 INFO Remoting: Starting remoting
14/07/15 16:43:11 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sp...@office-5-148.pa.gopivotal.com:53203]
14/07/15 16:43:11 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sp...@office-5-148.pa.gopivotal.com:53203]
14/07/15 16:43:11 INFO SparkEnv: Registering MapOutputTracker
14/07/15 16:43:11 INFO SparkEnv: Registering BlockManagerMaster
14/07/15 16:43:11 INFO DiskBlockManager: Created local directory at /var/folders/nv/nsr_3ysj0wgfq93fqp0rdt3w0000gp/T/spark-local-20140715164311-e63c
14/07/15 16:43:11 INFO ConnectionManager: Bound socket to port 53204 with id = ConnectionManagerId(office-5-148.pa.gopivotal.com,53204)
14/07/15 16:43:11 INFO MemoryStore: MemoryStore started with capacity 2.1 GB
14/07/15 16:43:11 INFO BlockManagerMaster: Trying to register BlockManager
14/07/15 16:43:11 INFO BlockManagerMasterActor: Registering block manager office-5-148.pa.gopivotal.com:53204 with 2.1 GB RAM
14/07/15 16:43:11 INFO BlockManagerMaster: Registered BlockManager
14/07/15 16:43:11 INFO HttpServer: Starting HTTP Server
14/07/15 16:43:11 INFO HttpBroadcast: Broadcast server started at http://10.64.5.148:53205
14/07/15 16:43:11 INFO HttpFileServer: HTTP File server directory is /var/folders/nv/nsr_3ysj0wgfq93fqp0rdt3w0000gp/T/spark-b2fb12ae-4608-4833-87b6-b335da00738e
14/07/15 16:43:11 INFO HttpServer: Starting HTTP Server
14/07/15 16:43:12 INFO SparkUI: Started SparkUI at http://office-5-148.pa.gopivotal.com:4040
2014-07-15 16:43:12.210 java[39068:1903] Unable to load realm info from SCDynamicStore
14/07/15 16:43:12 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/07/15 16:43:12 INFO SparkContext: Added JAR examples/target/scala-2.10/spark-examples-1.1.0-SNAPSHOT-hadoop2.4.0.jar at http://10.64.5.148:53206/jars/spark-examples-1.1.0-SNAPSHOT-hadoop2.4.0.jar with timestamp 1405467792813
14/07/15 16:43:12 INFO AppClient$ClientActor: Connecting to master spark://henry-pivotal.local:7077...
14/07/15 16:43:12 INFO SparkContext: Starting job: reduce at SparkTachyonPi.scala:43
14/07/15 16:43:12 INFO DAGScheduler: Got job 0 (reduce at SparkTachyonPi.scala:43) with 2 output partitions (allowLocal=false)
14/07/15 16:43:12 INFO DAGScheduler: Final stage: Stage 0(reduce at SparkTachyonPi.scala:43)
14/07/15 16:43:12 INFO DAGScheduler: Parents of final stage: List()
14/07/15 16:43:12 INFO DAGScheduler: Missing parents: List()
14/07/15 16:43:12 INFO DAGScheduler: Submitting Stage 0 (MappedRDD[1] at map at SparkTachyonPi.scala:39), which has no missing parents
14/07/15 16:43:13 INFO DAGScheduler: Submitting 2 missing tasks from Stage 0 (MappedRDD[1] at map at SparkTachyonPi.scala:39)
14/07/15 16:43:13 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
14/07/15 16:43:13 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20140715164313-0000
14/07/15 16:43:13 INFO AppClient$ClientActor: Executor added: app-20140715164313-0000/0 on worker-20140715164009-office-5-148.pa.gopivotal.com-52519 (office-5-148.pa.gopivotal.com:52519) with 8 cores
14/07/15 16:43:13 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140715164313-0000/0 on hostPort office-5-148.pa.gopivotal.com:52519 with 8 cores, 512.0 MB RAM
14/07/15 16:43:13 INFO AppClient$ClientActor: Executor updated: app-20140715164313-0000/0 is now RUNNING
14/07/15 16:43:15 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkexecu...@office-5-148.pa.gopivotal.com:53213/user/Executor#-423405256] with ID 0
14/07/15 16:43:15 INFO TaskSetManager: Re-computing pending task lists.
14/07/15 16:43:15 INFO TaskSetManager: Starting task 0.0:0 as TID 0 on executor 0: office-5-148.pa.gopivotal.com (PROCESS_LOCAL)
14/07/15 16:43:15 INFO TaskSetManager: Serialized task 0.0:0 as 1428 bytes in 3 ms
14/07/15 16:43:15 INFO TaskSetManager: Starting task 0.0:1 as TID 1 on executor 0: office-5-148.pa.gopivotal.com (PROCESS_LOCAL)
14/07/15 16:43:15 INFO TaskSetManager: Serialized task 0.0:1 as 1428 bytes in 1 ms
14/07/15 16:43:15 INFO BlockManagerMasterActor: Registering block manager office-5-148.pa.gopivotal.com:53218 with 294.9 MB RAM
14/07/15 16:43:16 INFO BlockManagerInfo: Added rdd_0_1 on tachyon on office-5-148.pa.gopivotal.com:53218 (size: 977.2 KB)
14/07/15 16:43:16 INFO BlockManagerInfo: Added rdd_0_0 on tachyon on office-5-148.pa.gopivotal.com:53218 (size: 977.2 KB)
14/07/15 16:43:16 INFO TaskSetManager: Finished TID 0 in 1307 ms on office-5-148.pa.gopivotal.com (progress: 1/2)
14/07/15 16:43:16 INFO TaskSetManager: Finished TID 1 in 1300 ms on office-5-148.pa.gopivotal.com (progress: 2/2)
14/07/15 16:43:16 INFO DAGScheduler: Completed ResultTask(0, 0)
14/07/15 16:43:16 INFO DAGScheduler: Completed ResultTask(0, 1)
14/07/15 16:43:16 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
14/07/15 16:43:16 INFO DAGScheduler: Stage 0 (reduce at SparkTachyonPi.scala:43) finished in 3.336 s
14/07/15 16:43:16 INFO SparkContext: Job finished: reduce at SparkTachyonPi.scala:43, took 3.413498 s
Pi is roughly 3.14254
14/07/15 16:43:16 INFO SparkUI: Stopped Spark web UI at http://office-5-148.pa.gopivotal.com:4040
14/07/15 16:43:16 INFO DAGScheduler: Stopping DAGScheduler
14/07/15 16:43:16 INFO SparkDeploySchedulerBackend: Shutting down all executors
14/07/15 16:43:16 INFO SparkDeploySchedulerBackend: Asking each executor to shut down
14/07/15 16:43:17 INFO MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
14/07/15 16:43:17 INFO ConnectionManager: Selector thread was interrupted!
14/07/15 16:43:17 INFO ConnectionManager: ConnectionManager stopped
14/07/15 16:43:17 INFO MemoryStore: MemoryStore cleared
14/07/15 16:43:17 INFO BlockManager: BlockManager stopped
14/07/15 16:43:17 INFO BlockManagerMasterActor: Stopping BlockManagerMaster
14/07/15 16:43:17 INFO BlockManagerMaster: BlockManagerMaster stopped
14/07/15 16:43:17 INFO SparkContext: Successfully stopped SparkContext
14/07/15 16:43:17 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
14/07/15 16:43:17 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
Process finished with exit code 0

--------------------------------------------------------------------------------

And here is the log when the Tachyon master cannot be reached:

14/07/15 16:49:17 INFO Utils: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
14/07/15 16:49:17 WARN Utils: Your hostname, henry-pivotal.local resolves to a loopback address: 127.0.0.1; using 10.64.5.148 instead (on interface en5)
14/07/15 16:49:17 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
14/07/15 16:49:17 INFO SecurityManager: Changing view acls to: hsaputra
14/07/15 16:49:17 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hsaputra)
14/07/15 16:49:17 INFO Slf4jLogger: Slf4jLogger started
14/07/15 16:49:17 INFO Remoting: Starting remoting
14/07/15 16:49:17 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sp...@office-5-148.pa.gopivotal.com:54541]
14/07/15 16:49:17 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sp...@office-5-148.pa.gopivotal.com:54541]
14/07/15 16:49:17 INFO SparkEnv: Registering MapOutputTracker
14/07/15 16:49:17 INFO SparkEnv: Registering BlockManagerMaster
14/07/15 16:49:17 INFO DiskBlockManager: Created local directory at /var/folders/nv/nsr_3ysj0wgfq93fqp0rdt3w0000gp/T/spark-local-20140715164917-bf9e
14/07/15 16:49:17 INFO ConnectionManager: Bound socket to port 54542 with id = ConnectionManagerId(office-5-148.pa.gopivotal.com,54542)
14/07/15 16:49:17 INFO MemoryStore: MemoryStore started with capacity 2.1 GB
14/07/15 16:49:17 INFO BlockManagerMaster: Trying to register BlockManager
14/07/15 16:49:17 INFO BlockManagerMasterActor: Registering block manager office-5-148.pa.gopivotal.com:54542 with 2.1 GB RAM
14/07/15 16:49:17 INFO BlockManagerMaster: Registered BlockManager
14/07/15 16:49:17 INFO HttpServer: Starting HTTP Server
14/07/15 16:49:17 INFO HttpBroadcast: Broadcast server started at http://10.64.5.148:54543
14/07/15 16:49:17 INFO HttpFileServer: HTTP File server directory is /var/folders/nv/nsr_3ysj0wgfq93fqp0rdt3w0000gp/T/spark-400178c7-8c6e-4e44-9610-926bd1f84877
14/07/15 16:49:17 INFO HttpServer: Starting HTTP Server
14/07/15 16:49:18 INFO SparkUI: Started SparkUI at http://office-5-148.pa.gopivotal.com:4040
2014-07-15 16:49:18.144 java[39346:1903] Unable to load realm info from SCDynamicStore
14/07/15 16:49:18 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/07/15 16:49:18 INFO SparkContext: Added JAR examples/target/scala-2.10/spark-examples-1.1.0-SNAPSHOT-hadoop2.4.0.jar at http://10.64.5.148:54544/jars/spark-examples-1.1.0-SNAPSHOT-hadoop2.4.0.jar with timestamp 1405468158551
14/07/15 16:49:18 INFO AppClient$ClientActor: Connecting to master spark://henry-pivotal.local:7077...
14/07/15 16:49:18 INFO SparkContext: Starting job: reduce at SparkTachyonPi.scala:43
14/07/15 16:49:18 INFO DAGScheduler: Got job 0 (reduce at SparkTachyonPi.scala:43) with 2 output partitions (allowLocal=false)
14/07/15 16:49:18 INFO DAGScheduler: Final stage: Stage 0(reduce at SparkTachyonPi.scala:43)
14/07/15 16:49:18 INFO DAGScheduler: Parents of final stage: List()
14/07/15 16:49:18 INFO DAGScheduler: Missing parents: List()
14/07/15 16:49:18 INFO DAGScheduler: Submitting Stage 0 (MappedRDD[1] at map at SparkTachyonPi.scala:39), which has no missing parents
14/07/15 16:49:18 INFO DAGScheduler: Submitting 2 missing tasks from Stage 0 (MappedRDD[1] at map at SparkTachyonPi.scala:39)
14/07/15 16:49:18 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
14/07/15 16:49:18 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20140715164918-0001
14/07/15 16:49:18 INFO AppClient$ClientActor: Executor added: app-20140715164918-0001/0 on worker-20140715164009-office-5-148.pa.gopivotal.com-52519 (office-5-148.pa.gopivotal.com:52519) with 8 cores
14/07/15 16:49:18 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140715164918-0001/0 on hostPort office-5-148.pa.gopivotal.com:52519 with 8 cores, 512.0 MB RAM
14/07/15 16:49:18 INFO AppClient$ClientActor: Executor updated: app-20140715164918-0001/0 is now RUNNING
14/07/15 16:49:20 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkexecu...@office-5-148.pa.gopivotal.com:54548/user/Executor#-221675010] with ID 0
14/07/15 16:49:20 INFO TaskSetManager: Re-computing pending task lists.
14/07/15 16:49:20 INFO TaskSetManager: Starting task 0.0:0 as TID 0 on executor 0: office-5-148.pa.gopivotal.com (PROCESS_LOCAL)
14/07/15 16:49:20 INFO TaskSetManager: Serialized task 0.0:0 as 1429 bytes in 1 ms
14/07/15 16:49:20 INFO TaskSetManager: Starting task 0.0:1 as TID 1 on executor 0: office-5-148.pa.gopivotal.com (PROCESS_LOCAL)
14/07/15 16:49:20 INFO TaskSetManager: Serialized task 0.0:1 as 1429 bytes in 0 ms
14/07/15 16:49:20 INFO BlockManagerMasterActor: Registering block manager office-5-148.pa.gopivotal.com:54553 with 294.9 MB RAM
14/07/15 16:49:26 INFO SparkDeploySchedulerBackend: Executor 0 disconnected, so removing it
14/07/15 16:49:26 ERROR TaskSchedulerImpl: Lost executor 0 on office-5-148.pa.gopivotal.com: remote Akka client disassociated
14/07/15 16:49:26 INFO TaskSetManager: Re-queueing tasks for 0 from TaskSet 0.0
14/07/15 16:49:26 INFO AppClient$ClientActor: Executor updated: app-20140715164918-0001/0 is now EXITED (Command exited with code 55)
14/07/15 16:49:26 WARN TaskSetManager: Lost TID 1 (task 0.0:1)
14/07/15 16:49:26 INFO SparkDeploySchedulerBackend: Executor app-20140715164918-0001/0 removed: Command exited with code 55
14/07/15 16:49:26 WARN TaskSetManager: Lost TID 0 (task 0.0:0)
14/07/15 16:49:26 INFO AppClient$ClientActor: Executor added: app-20140715164918-0001/1 on worker-20140715164009-office-5-148.pa.gopivotal.com-52519 (office-5-148.pa.gopivotal.com:52519) with 8 cores
14/07/15 16:49:26 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140715164918-0001/1 on hostPort office-5-148.pa.gopivotal.com:52519 with 8 cores, 512.0 MB RAM
14/07/15 16:49:26 INFO DAGScheduler: Executor lost: 0 (epoch 0)
14/07/15 16:49:26 INFO AppClient$ClientActor: Executor updated: app-20140715164918-0001/1 is now RUNNING
14/07/15 16:49:26 INFO BlockManagerMasterActor: Trying to remove executor 0 from BlockManagerMaster.
14/07/15 16:49:26 INFO BlockManagerMaster: Removed 0 successfully in removeExecutor
14/07/15 16:49:28 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkexecu...@office-5-148.pa.gopivotal.com:54573/user/Executor#1564333236] with ID 1
14/07/15 16:49:28 INFO TaskSetManager: Re-computing pending task lists.
14/07/15 16:49:28 INFO TaskSetManager: Starting task 0.0:0 as TID 2 on executor 1: office-5-148.pa.gopivotal.com (PROCESS_LOCAL)
14/07/15 16:49:28 INFO TaskSetManager: Serialized task 0.0:0 as 1429 bytes in 0 ms
14/07/15 16:49:28 INFO TaskSetManager: Starting task 0.0:1 as TID 3 on executor 1: office-5-148.pa.gopivotal.com (PROCESS_LOCAL)
14/07/15 16:49:28 INFO TaskSetManager: Serialized task 0.0:1 as 1429 bytes in 0 ms
14/07/15 16:49:28 INFO BlockManagerMasterActor: Registering block manager office-5-148.pa.gopivotal.com:54578 with 294.9 MB RAM
14/07/15 16:49:34 INFO SparkDeploySchedulerBackend: Executor 1 disconnected, so removing it
14/07/15 16:49:34 ERROR TaskSchedulerImpl: Lost executor 1 on office-5-148.pa.gopivotal.com: remote Akka client disassociated
14/07/15 16:49:34 INFO TaskSetManager: Re-queueing tasks for 1 from TaskSet 0.0
14/07/15 16:49:34 WARN TaskSetManager: Lost TID 2 (task 0.0:0)
14/07/15 16:49:34 WARN TaskSetManager: Lost TID 3 (task 0.0:1)
14/07/15 16:49:34 INFO DAGScheduler: Executor lost: 1 (epoch 1)
14/07/15 16:49:34 INFO BlockManagerMasterActor: Trying to remove executor 1 from BlockManagerMaster.
14/07/15 16:49:34 INFO BlockManagerMaster: Removed 1 successfully in removeExecutor
14/07/15 16:49:34 INFO AppClient$ClientActor: Executor updated: app-20140715164918-0001/1 is now EXITED (Command exited with code 55)
14/07/15 16:49:34 INFO SparkDeploySchedulerBackend: Executor app-20140715164918-0001/1 removed: Command exited with code 55
14/07/15 16:49:34 INFO AppClient$ClientActor: Executor added: app-20140715164918-0001/2 on worker-20140715164009-office-5-148.pa.gopivotal.com-52519 (office-5-148.pa.gopivotal.com:52519) with 8 cores
14/07/15 16:49:34 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140715164918-0001/2 on hostPort office-5-148.pa.gopivotal.com:52519 with 8 cores, 512.0 MB RAM
14/07/15 16:49:34 INFO AppClient$ClientActor: Executor updated: app-20140715164918-0001/2 is now RUNNING
14/07/15 16:49:37 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkexecu...@office-5-148.pa.gopivotal.com:54599/user/Executor#-557403228] with ID 2
14/07/15 16:49:37 INFO TaskSetManager: Re-computing pending task lists.
14/07/15 16:49:37 INFO TaskSetManager: Starting task 0.0:1 as TID 4 on executor 2: office-5-148.pa.gopivotal.com (PROCESS_LOCAL)
14/07/15 16:49:37 INFO TaskSetManager: Serialized task 0.0:1 as 1429 bytes in 1 ms
14/07/15 16:49:37 INFO TaskSetManager: Starting task 0.0:0 as TID 5 on executor 2: office-5-148.pa.gopivotal.com (PROCESS_LOCAL)
14/07/15 16:49:37 INFO TaskSetManager: Serialized task 0.0:0 as 1429 bytes in 0 ms
14/07/15 16:49:37 INFO BlockManagerMasterActor: Registering block manager office-5-148.pa.gopivotal.com:54604 with 294.9 MB RAM
14/07/15 16:49:43 INFO SparkDeploySchedulerBackend: Executor 2 disconnected, so removing it
14/07/15 16:49:43 ERROR TaskSchedulerImpl: Lost executor 2 on office-5-148.pa.gopivotal.com: remote Akka client disassociated
14/07/15 16:49:43 INFO TaskSetManager: Re-queueing tasks for 2 from TaskSet 0.0
14/07/15 16:49:43 WARN TaskSetManager: Lost TID 5 (task 0.0:0)
14/07/15 16:49:43 WARN TaskSetManager: Lost TID 4 (task 0.0:1)
14/07/15 16:49:43 INFO DAGScheduler: Executor lost: 2 (epoch 2)
14/07/15 16:49:43 INFO BlockManagerMasterActor: Trying to remove executor 2 from BlockManagerMaster.
14/07/15 16:49:43 INFO BlockManagerMaster: Removed 2 successfully in removeExecutor
14/07/15 16:49:43 INFO AppClient$ClientActor: Executor updated: app-20140715164918-0001/2 is now EXITED (Command exited with code 55)
14/07/15 16:49:43 INFO SparkDeploySchedulerBackend: Executor app-20140715164918-0001/2 removed: Command exited with code 55
14/07/15 16:49:43 INFO AppClient$ClientActor: Executor added: app-20140715164918-0001/3 on worker-20140715164009-office-5-148.pa.gopivotal.com-52519 (office-5-148.pa.gopivotal.com:52519) with 8 cores
14/07/15 16:49:43 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140715164918-0001/3 on hostPort office-5-148.pa.gopivotal.com:52519 with 8 cores, 512.0 MB RAM
14/07/15 16:49:43 INFO AppClient$ClientActor: Executor updated: app-20140715164918-0001/3 is now RUNNING
14/07/15 16:49:45 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkexecu...@office-5-148.pa.gopivotal.com:54627/user/Executor#-1697612197] with ID 3
14/07/15 16:49:45 INFO TaskSetManager: Re-computing pending task lists.
14/07/15 16:49:45 INFO TaskSetManager: Starting task 0.0:1 as TID 6 on executor 3: office-5-148.pa.gopivotal.com (PROCESS_LOCAL)
14/07/15 16:49:45 INFO TaskSetManager: Serialized task 0.0:1 as 1429 bytes in 0 ms
14/07/15 16:49:45 INFO TaskSetManager: Starting task 0.0:0 as TID 7 on executor 3: office-5-148.pa.gopivotal.com (PROCESS_LOCAL)
14/07/15 16:49:45 INFO TaskSetManager: Serialized task 0.0:0 as 1429 bytes in 0 ms
14/07/15 16:49:45 INFO BlockManagerMasterActor: Registering block manager office-5-148.pa.gopivotal.com:54634 with 294.9 MB RAM
14/07/15 16:49:51 INFO SparkDeploySchedulerBackend: Executor 3 disconnected, so removing it
14/07/15 16:49:51 ERROR TaskSchedulerImpl: Lost executor 3 on office-5-148.pa.gopivotal.com: remote Akka client disassociated
14/07/15 16:49:51 INFO TaskSetManager: Re-queueing tasks for 3 from TaskSet 0.0
14/07/15 16:49:51 WARN TaskSetManager: Lost TID 7 (task 0.0:0)
14/07/15 16:49:51 ERROR TaskSetManager: Task 0.0:0 failed 4 times; aborting job
14/07/15 16:49:51 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
14/07/15 16:49:51 INFO TaskSchedulerImpl: Cancelling stage 0
14/07/15 16:49:51 INFO AppClient$ClientActor: Executor updated: app-20140715164918-0001/3 is now EXITED (Command exited with code 55)
14/07/15 16:49:51 INFO DAGScheduler: Failed to run reduce at SparkTachyonPi.scala:43
Exception in thread "main" 14/07/15 16:49:51 INFO SparkDeploySchedulerBackend: Executor app-20140715164918-0001/3 removed: Command exited with code 55
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0:0 failed 4 times, most recent failure: TID 7 on host office-5-148.pa.gopivotal.com failed for unknown reason
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1046)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1030)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1028)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1028)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:632)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:632)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:632)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1231)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
14/07/15 16:49:51 INFO AppClient$ClientActor: Executor added: app-20140715164918-0001/4 on worker-20140715164009-office-5-148.pa.gopivotal.com-52519 (office-5-148.pa.gopivotal.com:52519) with 8 cores
14/07/15 16:49:51 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140715164918-0001/4 on hostPort office-5-148.pa.gopivotal.com:52519 with 8 cores, 512.0 MB RAM
14/07/15 16:49:51 INFO AppClient$ClientActor: Executor updated: app-20140715164918-0001/4 is now RUNNING
14/07/15 16:49:51 INFO DAGScheduler: Executor lost: 3 (epoch 3)
14/07/15 16:49:51 INFO BlockManagerMasterActor: Trying to remove executor 3 from BlockManagerMaster.
14/07/15 16:49:51 INFO BlockManagerMaster: Removed 3 successfully in removeExecutor
Process finished with exit code 1

--
This message was sent by Atlassian JIRA
(v6.2#6252)
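Note that nothing in the failing log above names Tachyon as the culprit; executors simply keep dying with "Command exited with code 55" until the job aborts. Until Spark surfaces this failure explicitly, one pragmatic workaround is to probe the Tachyon master's RPC port from the driver before submitting the job, so an unreachable master fails fast with an explicit message. A minimal sketch (the class name, host/port arguments, and timeout are illustrative assumptions, not part of this report; 19998 is Tachyon's default master port):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class TachyonMasterProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        // Tachyon's default master RPC port; override via the second argument.
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 19998;
        if (!isReachable(host, port, 2000)) {
            System.err.println("Tachyon master " + host + ":" + port
                    + " is unreachable; aborting before submitting the Spark job.");
            System.exit(1);
        }
        System.out.println("Tachyon master " + host + ":" + port + " is reachable.");
    }
}
```

A plain TCP connect only confirms the port is open, not that the master is healthy, but it is enough to distinguish this silent-failure case (master host down or unreachable) from a genuine task failure.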