Hi

FYI, the following two tickets currently block (in releases up to 1.5.2) the
pattern of starting and stopping a SparkContext inside the same driver
program:

https://issues.apache.org/jira/browse/SPARK-11700 -> memory leak in SQLContext
https://issues.apache.org/jira/browse/SPARK-11739

In an application we have built, we initially wanted to use the same pattern
(start-stop-start, etc.) in order to make better use of the Spark cluster
resources.
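
For concreteness, here is a minimal sketch of the pattern we had in mind (the
app names and job logic are placeholders, not our actual code, and the master
is assumed to be supplied via spark-submit):

```
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// First batch: start a context, run the jobs, then stop it so the
// cluster resources are released while the driver keeps running.
val sc = new SparkContext(new SparkConf().setAppName("batch-1"))
val sqlContext = SQLContext.getOrCreate(sc)
// ... run Spark / Spark SQL jobs ...
sc.stop()

// Later, in the same driver JVM, start a fresh context for the next batch.
// Before the fixes above (1.6.0), this is where things go wrong: for
// example, SQLContext.getOrCreate can hand back a context that still
// references the old, stopped SparkContext.
val sc2 = new SparkContext(new SparkConf().setAppName("batch-2"))
val sqlContext2 = SQLContext.getOrCreate(sc2)
// ... run more jobs ...
sc2.stop()
```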

I believe that the fixes in the above tickets will make it possible to safely
stop and restart the SparkContext in the driver program as of release 1.6.0.

Kind Regards



2015-12-22 21:00 GMT+02:00 Sean Owen <so...@cloudera.com>:

> I think the original idea is that the life of the driver is the life
> of the SparkContext: the context is stopped when the driver finishes.
> Or: if for some reason the "context" dies or there's an unrecoverable
> error, that's it for the driver.
>
> (There's nothing wrong with stop(), right? You have to call that when
> the driver ends to shut down Spark cleanly. It's re-starting another
> context that's at issue.)
>
> This makes most sense in the context of a resource manager, which can
> conceivably restart a driver if you like, but can't reach into your
> program.
>
> That's probably still the best way to think of it. Still it would be
> nice if SparkContext were friendlier to a restart just as a matter of
> design. AFAIK it is; not sure about SQLContext though. If it's not a
> priority it's just because this isn't a usual usage pattern, which
> doesn't mean it's crazy, just not the primary pattern.
>
> On Tue, Dec 22, 2015 at 5:57 PM, Jerry Lam <chiling...@gmail.com> wrote:
> > Hi Sean,
> >
> > What if the Spark context stops for involuntary reasons (misbehavior of
> > some connections)? Then we need to programmatically handle the failures
> > by recreating the Spark context. Is there something I don't
> > understand/know about the assumptions on how to use the Spark context? I
> > tend to think of it as a resource manager/scheduler for Spark jobs. Are
> > you guys planning to deprecate the stop method from Spark?
> >
> > Best Regards,
> >
> > Jerry
> >
> > Sent from my iPhone
> >
> >> On 22 Dec, 2015, at 3:57 am, Sean Owen <so...@cloudera.com> wrote:
> >>
> >> Although in many cases it does work to stop and then start a second
> >> context, it wasn't how Spark was originally designed, and I still see
> >> gotchas. I'd avoid it. I don't think you should have to release some
> >> resources; just keep the same context alive.
> >>
> >>> On Tue, Dec 22, 2015 at 5:13 AM, Jerry Lam <chiling...@gmail.com>
> wrote:
> >>> Hi Zhan,
> >>>
> >>> I'm illustrating the issue via a simple example. However, it is not
> >>> difficult to imagine use cases that need this behaviour. For example,
> >>> you want to release all of Spark's resources when it has not been used
> >>> for longer than an hour in a job server, like a web service. Unless you
> >>> can prevent people from stopping the Spark context, it is reasonable to
> >>> assume that people can stop it and start it again at a later time.
> >>>
> >>> Best Regards,
> >>>
> >>> Jerry
> >>>
> >>>
> >>>> On Mon, Dec 21, 2015 at 7:20 PM, Zhan Zhang <zzh...@hortonworks.com>
> wrote:
> >>>>
> >>>> This looks to me like a very unusual use case. You stop the
> >>>> SparkContext and start another one. I don’t think it is well
> >>>> supported. As the SparkContext is stopped, all the resources are
> >>>> supposed to be released.
> >>>>
> >>>> Is there any mandatory reason you have to stop and restart another
> >>>> SparkContext?
> >>>>
> >>>> Thanks.
> >>>>
> >>>> Zhan Zhang
> >>>>
> >>>> Note that when sc is stopped, all resources are released (for example
> >>>> in YARN).
> >>>>> On Dec 20, 2015, at 2:59 PM, Jerry Lam <chiling...@gmail.com> wrote:
> >>>>>
> >>>>> Hi Spark developers,
> >>>>>
> >>>>> I found that SQLContext.getOrCreate(sc: SparkContext) does not behave
> >>>>> correctly when a different spark context is provided.
> >>>>>
> >>>>> ```
> >>>>> val sc = new SparkContext
> >>>>> val sqlContext = SQLContext.getOrCreate(sc)
> >>>>> sc.stop
> >>>>> ...
> >>>>>
> >>>>> val sc2 = new SparkContext
> >>>>> val sqlContext2 = SQLContext.getOrCreate(sc2)
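> >>>>> // bug (pre-1.6.0): sqlContext2 still references sc, not sc2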
> >>>>> sc2.stop
> >>>>> ```
> >>>>>
> >>>>> The sqlContext2 will reference sc instead of sc2, and therefore the
> >>>>> program will not work because sc has been stopped.
> >>>>>
> >>>>> Best Regards,
> >>>>>
> >>>>> Jerry
> >>>
>
