Hi,
I am using Spark 1.5. My Spark application is divided into several
jobs. In the Event Timeline of the Spark History UI, I noticed idle
time between jobs. As shown below, job 1 was submitted at 11:20:49 and
finished at 11:20:52, but job 2 was not submitted until 11:21:08, 16 s
later. What is going on during those 16 s? Any suggestions?
Job Id  Description                                    Submitted            Duration
2       saveAsTextFile at GenerateHistogram.scala:143  2015/09/16 11:21:08  0.7 s
1       collect at GenerateHistogram.scala:132         2015/09/16 11:20:49  2 s
0       count at GenerateHistogram.scala:129           2015/09/16 11:20:41  9 s
Below is the relevant part of the log:
15/09/16 11:20:52 INFO DAGScheduler: Job 1 finished: collect at GenerateHistogram.scala:132, took 2.221756 s
15/09/16 11:21:08 INFO deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id
15/09/16 11:21:08 INFO deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
15/09/16 11:21:08 INFO deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
15/09/16 11:21:08 INFO deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
15/09/16 11:21:08 INFO deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
15/09/16 11:21:08 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
15/09/16 11:21:08 INFO SparkContext: Starting job: saveAsTextFile at GenerateHistogram.scala:143
15/09/16 11:21:08 INFO DAGScheduler: Got job 2 (saveAsTextFile at GenerateHistogram.scala:143) with 1 output partitions
15/09/16 11:21:08 INFO DAGScheduler: Final stage: ResultStage 2(saveAsTextFile at GenerateHistogram.scala:143)
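
To pin down where the quiet period sits, the log timestamps can be compared mechanically. The following is only a rough sketch in Python, assuming every log entry starts with a `yy/MM/dd HH:mm:ss` timestamp as in the excerpt above. The helper names (`parse_ts`, `find_gaps`) and the 5 s threshold are my own choices for illustration and are not part of Spark.

```python
from datetime import datetime

# Timestamp layout seen in the log excerpt, e.g. "15/09/16 11:20:52".
LOG_TS_FORMAT = "%y/%m/%d %H:%M:%S"
TS_LENGTH = 17  # number of characters in the leading timestamp


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    try:
        return datetime.strptime(line[:TS_LENGTH], LOG_TS_FORMAT)
    except ValueError:
        # Wrapped continuation lines and blank lines carry no timestamp.
        return None


def find_gaps(lines, threshold_s=5.0):
    """Return (gap_seconds, line_before, line_after) for each quiet period
    of at least threshold_s seconds between consecutive timestamped lines."""
    gaps = []
    prev_ts, prev_line = None, None
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue
        if prev_ts is not None:
            gap = (ts - prev_ts).total_seconds()
            if gap >= threshold_s:
                gaps.append((gap, prev_line, line))
        prev_ts, prev_line = ts, line
    return gaps


# Three lines copied from the log excerpt above.
log = [
    "15/09/16 11:20:52 INFO DAGScheduler: Job 1 finished: collect at GenerateHistogram.scala:132, took 2.221756 s",
    "15/09/16 11:21:08 INFO deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id",
    "15/09/16 11:21:08 INFO SparkContext: Starting job: saveAsTextFile at GenerateHistogram.scala:143",
]

gaps = find_gaps(log)
for gap, before, after in gaps:
    print(f"{gap:.0f} s with no log output after: {before}")
```

On these lines it reports a single 16 s gap, which starts right after job 1 finishes and ends at the first message logged for the saveAsTextFile call. Nothing is logged in between.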
BR,
Patcharee