Hello!

We're running Spark 2.3.0 on Scala 2.11.  We have a number of Spark
Streaming jobs that use mapWithState (a simplified sketch of the job
shape is at the end of this mail).  We've observed that these jobs will
complete one set of stages and then never schedule the next set.  The
DAGScheduler appears to correctly identify the stages that still need
to run:

19/08/27 15:29:48 INFO YarnClusterScheduler: Removed TaskSet 79.0, whose tasks have all completed, from pool
19/08/27 15:29:48 INFO DAGScheduler: ShuffleMapStage 79 (map at SomeCode.scala:121) finished in 142.985 s
19/08/27 15:29:48 INFO DAGScheduler: looking for newly runnable stages
19/08/27 15:29:48 INFO DAGScheduler: running: Set()
19/08/27 15:29:48 INFO DAGScheduler: waiting: Set(ShuffleMapStage 81, ResultStage 82, ResultStage 83, ShuffleMapStage 54, ResultStage 61, ResultStage 55, ShuffleMapStage 48, ShuffleMapStage 84, ResultStage 49, ShuffleMapStage 85, ShuffleMapStage 56, ResultStage 86, ShuffleMapStage 57, ResultStage 58, ResultStage 80)
19/08/27 15:29:48 INFO DAGScheduler: failed: Set()

However, none of the waiting stages ever begin execution; the job simply
hangs with nothing running.  This happens only every couple of days,
which makes it difficult to reproduce.  I went through the known bugs
fixed in the 2.3.x line and nothing stood out.  Has anyone else seen
this behavior?  Any thoughts on how to debug it?
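
For reference, here is a minimal sketch of the general shape of the
affected jobs.  The class name, state function, source, checkpoint
path, and batch interval are all placeholders rather than our actual
code:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, State, StateSpec, StreamingContext}

object MapWithStateSketch {

  // Placeholder state function: keeps a running count per key.
  val trackCount = (key: String, value: Option[Long], state: State[Long]) => {
    val newCount = state.getOption.getOrElse(0L) + value.getOrElse(0L)
    state.update(newCount)
    (key, newCount)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("map-with-state-sketch")
    val ssc = new StreamingContext(conf, Seconds(30)) // placeholder batch interval
    ssc.checkpoint("/tmp/checkpoints")                // mapWithState requires checkpointing

    // Placeholder source; the real jobs read from an external queue.
    val events = ssc.socketTextStream("localhost", 9999).map(line => (line, 1L))

    // Stateful transformation; this is the mapWithState step the
    // affected jobs all share.
    val counts = events.mapWithState(StateSpec.function(trackCount))

    counts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}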

Regards,

Bryan Jeffrey
