Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10363343 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala --- @@ -30,16 +32,23 @@ import org.apache.spark.scheduler._ * class, since the UI thread and the DAGScheduler event loop may otherwise * be reading/updating the internal data structures concurrently. */ -private[spark] class JobProgressListener(val sc: SparkContext) extends SparkListener { +private[ui] class JobProgressListener(sc: SparkContext, live: Boolean) + extends StorageStatusSparkListener { --- End diff -- This seems a little confusing to me...it looks like this extends StorageStatusSparkListener because it needs access to the list of executors and their addresses through the storageStatusList member variable. But for JobProgressListener, storageStatusList isn't really kept up to date correctly -- because onTaskEnd gets overridden by JobProgressListener, so the RDD info won't get correctly updated when the task ends. I'd propose simplifying this by having JobProgressListener just extend the usual SparkListener. Then, you need a way of getting the executors for the ExecutorTable on the stage status page. But you can get the list of executor Ids / hostnames from the list of tasks (which works because you only care about the executors that are being used by one of the tasks anyway).
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---