Hi All,

I am having a few issues with stability and scheduling. When I use spark-shell to submit my application, I get the error output below and the shell crashes. I have a small 4-node cluster for a PoC. I tried both a manual and a script-based cluster setup, and I also tried using the FQDN when specifying the master node, but no luck.
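For reference, this is roughly how I launch the shell against the standalone master (same URL as in the log below; exact flags quoted from memory):

    ./bin/spark-shell --master spark://pzxnvm2018:7077

For the script-based setup, I listed the worker hostnames in conf/slaves and started the cluster with sbin/start-all.sh on the master node.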
14/07/07 23:44:35 INFO DAGScheduler: Submitting 2 missing tasks from Stage 1 (MappedRDD[6] at map at JaccardScore.scala:83)
14/07/07 23:44:35 INFO TaskSchedulerImpl: Adding task set 1.0 with 2 tasks
14/07/07 23:44:35 INFO TaskSetManager: Starting task 1.0:0 as TID 1 on executor localhost: localhost (PROCESS_LOCAL)
14/07/07 23:44:35 INFO TaskSetManager: Serialized task 1.0:0 as 2322 bytes in 0 ms
14/07/07 23:44:35 INFO TaskSetManager: Starting task 1.0:1 as TID 2 on executor localhost: localhost (PROCESS_LOCAL)
14/07/07 23:44:35 INFO TaskSetManager: Serialized task 1.0:1 as 2322 bytes in 0 ms
14/07/07 23:44:35 INFO Executor: Running task ID 1
14/07/07 23:44:35 INFO Executor: Running task ID 2
14/07/07 23:44:35 INFO BlockManager: Found block broadcast_1 locally
14/07/07 23:44:35 INFO BlockManager: Found block broadcast_1 locally
14/07/07 23:44:35 INFO HadoopRDD: Input split: hdfs://pzxnvm2018:54310/data/sameer_7-2-2014_3mm_sentences.tsv:0+97239389
14/07/07 23:44:35 INFO HadoopRDD: Input split: hdfs://pzxnvm2018:54310/data/sameer_7-2-2014_3mm_sentences.tsv:97239389+97239390
14/07/07 23:44:54 INFO AppClient$ClientActor: Connecting to master spark://pzxnvm2018:7077...
14/07/07 23:45:14 INFO AppClient$ClientActor: Connecting to master spark://pzxnvm2018:7077...
14/07/07 23:45:35 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
14/07/07 23:45:35 ERROR TaskSchedulerImpl: Exiting due to error from cluster scheduler: All masters are unresponsive! Giving up.
14/07/07 23:45:35 WARN HadoopRDD: Exception in RecordReader.close()
java.io.IOException: Filesystem closed
    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:264)
    at org.apache.hadoop.hdfs.DFSClient.access$1100(DFSClient.java:74)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.close(DFSClient.java:2135)
    at java.io.FilterInputStream.close(FilterInputStream.java:181)
    at org.apache.hadoop.util.LineReader.close(LineReader.java:83)
    at org.apache.hadoop.mapred.LineRecordReader.close(LineRecordReader.java:168)
    at org.apache.spark.rdd.HadoopRDD$$anon$1.close(HadoopRDD.scala:208)
    at org.apache.spark.util.NextIterator.closeIfNeeded(NextIterator.scala:63)
    at org.apache.spark.rdd.HadoopRDD$$anon$1$$anonfun$1.apply$mcV$sp(HadoopRDD.scala:193)
    at org.apache.spark.TaskContext$$anonfun$executeOnCompleteCallbacks$1.apply(TaskContext.scala:63)
    at org.apache.spark.TaskContext$$anonfun$executeOnCompleteCallbacks$1.apply(TaskContext.scala:63)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.TaskContext.executeOnCompleteCallbacks(TaskContext.scala:63)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:113)
    at org.apache.spark.scheduler.Task.run(Task.scala:51)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
14/07/07 23:45:35 ERROR Executor: Exception in task ID 2
java.io.IOException: Filesystem closed
    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:264)
    at org.apache.hadoop.hdfs.DFSClient.access$1100(DFSClient.java:74)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2213)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:133)
    at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:38)
    at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:198)
    at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:181)
    at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
    at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
    at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
    at scala.collection.AbstractIterator.to(Iterator.scala:1157)
    at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
    at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
    at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
    at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
    at org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:717)
    at org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:717)
    at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1080)
    at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1080)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
    at org.apache.spark.scheduler.Task.run(Task.scala:51)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
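In case it helps, here is a simplified sketch of what the job is doing around JaccardScore.scala:83. Only the input path and the map-then-action shape are taken from the log above; the tokenization is a stand-in for the real Jaccard computation:

    // Runs inside spark-shell, so sc already exists.
    // Input path is the one shown in the HadoopRDD lines of the log.
    val sentences = sc.textFile("hdfs://pzxnvm2018:54310/data/sameer_7-2-2014_3mm_sentences.tsv")

    // The "map at JaccardScore.scala:83" is a transformation of this shape
    // (the actual function builds token sets for Jaccard scoring).
    val tokenized = sentences.map(line => line.split("\t").toSet)

    // An action like this is what triggers the two tasks that fail above.
    println(tokenized.count())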