Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/246#issuecomment-39011464
It looks like GitHub is just moving slowly today... the commit just got
pulled in. I took another look at this and have a question: what happens for
stages that are used by multiple jobs? Right now, stageIdToJobId in the UI
code you added maps a stage to just a single job id. So, if stage0 is used by
JobA and JobB, the UI code only stores one of these jobs, and then cancelJob()
will only be called for that one job. cancelJob() ultimately calls
DAGScheduler.handleJobCancellation(), which only cancels the stages that are
independent of the job, i.e. not shared with any other job. So, because stage0
is not independent of either of the jobs, it won't get cancelled. Did I
misunderstand this?
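
To make the scenario concrete, here is a toy Python model of it. This is not
Spark code: ToyScheduler, submit_job, cancel_job, and run_scenario are made-up
names, and the cancellation rule is only an assumption based on the behaviour
of handleJobCancellation() as described above (cancel a stage only when the
cancelled job is its sole user).

```python
from collections import defaultdict


class ToyScheduler:
    """Hypothetical stand-in for the scheduler; not Spark's DAGScheduler."""

    def __init__(self):
        # stage id -> ids of all jobs that depend on that stage
        self.stage_to_jobs = defaultdict(set)
        self.cancelled_stages = set()

    def submit_job(self, job_id, stage_ids):
        for stage_id in stage_ids:
            self.stage_to_jobs[stage_id].add(job_id)

    def cancel_job(self, job_id):
        # Assumed rule: only stages used by this job alone are cancelled.
        for stage_id, jobs in self.stage_to_jobs.items():
            if jobs == {job_id}:
                self.cancelled_stages.add(stage_id)


def run_scenario():
    scheduler = ToyScheduler()
    # UI-side map that keeps a single job id per stage, so a later job
    # overwrites an earlier one for any shared stage.
    stage_id_to_job_id = {}
    jobs = [("JobA", ["stage0", "stage1"]), ("JobB", ["stage0", "stage2"])]
    for job_id, stage_ids in jobs:
        scheduler.submit_job(job_id, stage_ids)
        for stage_id in stage_ids:
            stage_id_to_job_id[stage_id] = job_id

    # "Kill stage0" from the UI: look up the one recorded job and cancel it.
    recorded_job = stage_id_to_job_id["stage0"]
    scheduler.cancel_job(recorded_job)
    return recorded_job, scheduler.cancelled_stages
```

In this model the UI only remembers the last job recorded for stage0, cancels
that job, and stage0 itself is left running because the other job still
shares it.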