Thanks, Mark, for the answer! It helps, but it still leaves me with a
few questions. I hope you don't mind me asking them here.

When you said "It can be used, and is used in user code, but it isn't
always as straightforward as you might think", did you mean code
inside Spark or some other user code? Could I have a look at that code
and its use case? The method is `private[spark]` and not even
annotated @DeveloperApi, which makes using it even riskier. I believe
it's a very low-level ingredient of Spark that very few people, if
any, use. Seeing code that actually calls the method would help a lot.
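
For context, here is roughly what the method looks like in
SparkContext, as far as I can tell from the sources (paraphrased from
memory, so the exact shape may differ):

    // org/apache/spark/SparkContext.scala (paraphrased)
    // private[spark] means only code compiled into the
    // org.apache.spark package (or a subpackage) can call it.
    private[spark] def cancelJob(jobId: Int) {
      dagScheduler.cancelJob(jobId)
    }

So as I read it, user code would have to be placed inside the
org.apache.spark package just to call the method, which is exactly the
kind of trick I wouldn't want to recommend to anyone.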

Following up, isn't killing a stage similar to killing a job? Both can
be shared, and I can imagine much the same case for killing a job as
for a stage: an implementation that runs some checks before eventually
killing it. That is already possible for stages, which are in a sense
similar to jobs, so I'm still unsure why the method isn't used by
Spark itself. And if Spark doesn't use it, why would it be useful to
anyone outside Spark?
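
For what it's worth, here is a minimal sketch of the jobGroup-based
cancellation you recommend, as I understand it (assuming sc is the
usual SparkContext, e.g. from spark-shell; the group id "my-jobs" and
the numbers are just placeholders):

    import scala.concurrent.Future
    import scala.concurrent.ExecutionContext.Implicits.global

    // Run the action on a separate thread so the calling thread
    // stays free to cancel it. setJobGroup is per-thread, so it
    // has to be called on the thread that submits the jobs.
    Future {
      sc.setJobGroup("my-jobs", "jobs I may want to cancel",
        interruptOnCancel = true)
      sc.parallelize(1 to 100000000).map(identity).count()
    }

    // Later, from the other thread: cancels every active job in
    // the group, including any "hidden" jobs the action launched.
    sc.cancelJobGroup("my-jobs")

That would also cover the multiple-jobs-per-action case you describe,
which a single cancelJob(jobId) could not.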

Doh, why did I come across the method? It will take some time before I
forget about it :-)

Pozdrawiam,
Jacek

--
Jacek Laskowski | https://medium.com/@jaceklaskowski/ |
http://blog.jaceklaskowski.pl
Mastering Spark https://jaceklaskowski.gitbooks.io/mastering-apache-spark/
Follow me at https://twitter.com/jaceklaskowski
Upvote at http://stackoverflow.com/users/1305344/jacek-laskowski


On Wed, Dec 16, 2015 at 10:55 AM, Mark Hamstra <m...@clearstorydata.com> wrote:
> It can be used, and is used in user code, but it isn't always as
> straightforward as you might think.  This is mostly because a Job often
> isn't a Job -- or rather it is more than one Job.  There are several RDD
> transformations that aren't lazy, so they end up launching "hidden" Jobs
> that you may not anticipate and may expect to be canceled (but won't be) by
> a cancelJob() called on a later action on that transformed RDD.  It is also
> possible for a single DataFrame or Spark SQL query to result in more than
> one running Job.  The upshot of all of this is that getting cancelJob() to
> work as most users would expect all the time is non-trivial, and most of the
> time using a jobGroup is a better way to capture what may be more than one
> Job that the user is thinking of as a single Job.
>
> On Wed, Dec 16, 2015 at 5:34 AM, Sean Owen <so...@cloudera.com> wrote:
>>
>> It does look like it's not actually used. It may simply be there for
>> completeness, to match cancelStage and cancelJobGroup, which are used.
>> I also don't know of a good reason there's no way to kill a whole job.
>>
>> On Wed, Dec 16, 2015 at 1:15 PM, Jacek Laskowski <ja...@japila.pl> wrote:
>> > Hi,
>> >
>> > While reviewing Spark code I came across SparkContext.cancelJob. I
>> > found no part of Spark using it. Is this a leftover after some
>> > refactoring? Why is this part of sc?
>> >
>> > The reason I'm asking is another question I'm having after having
>> > learnt about killing a stage in webUI. I noticed there is a way to
>> > kill/cancel stages, but no corresponding feature to kill/cancel jobs.
>> > Why? Is there a JIRA ticket to have it some day perhaps?
>> >
>> > Pozdrawiam,
>> > Jacek
>> >
>> > --
>> > Jacek Laskowski | https://medium.com/@jaceklaskowski/
>> > Mastering Apache Spark
>> > ==> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/
>> > Follow me at https://twitter.com/jaceklaskowski
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> > For additional commands, e-mail: user-h...@spark.apache.org
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>

