Hi,

FYI: the following two tickets currently block (for releases up to 1.5.2) the pattern of starting and stopping a SparkContext inside the same driver program; a minimal sketch of the pattern follows.
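For concreteness, here is a sketch of the start-stop-start pattern being discussed (my illustration only, assuming Spark 1.x; `makeConf` is a hypothetical helper, not something from the thread):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical helper that builds a fresh SparkConf per run.
def makeConf(appName: String): SparkConf =
  new SparkConf().setAppName(appName)

// First phase of work: run a job, then release the executors.
val sc1 = new SparkContext(makeConf("phase-1"))
val sum = sc1.parallelize(1 to 100).reduce(_ + _)
sc1.stop() // frees cluster resources between phases

// Later, in the same driver JVM, start a second context.
// On releases up to 1.5.2, this second lifecycle is where the
// tickets below bite.
val sc2 = new SparkContext(makeConf("phase-2"))
val count = sc2.parallelize(1 to 100).count()
sc2.stop()
```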
https://issues.apache.org/jira/browse/SPARK-11700 -> memory leak in SQLContext
https://issues.apache.org/jira/browse/SPARK-11739

In an application we have built, we initially wanted to use the same pattern (start-stop-start, etc.) in order to make better use of the Spark cluster's resources. I believe that the fixes in the above tickets will make it safe to stop and restart the SparkContext in the driver program as of release 1.6.0.

Kind Regards

2015-12-22 21:00 GMT+02:00 Sean Owen <[email protected]>:
> I think the original idea is that the life of the driver is the life
> of the SparkContext: the context is stopped when the driver finishes.
> Or: if for some reason the "context" dies or there's an unrecoverable
> error, that's it for the driver.
>
> (There's nothing wrong with stop(), right? You have to call that when
> the driver ends to shut down Spark cleanly. It's the re-starting of
> another context that's at issue.)
>
> This makes most sense in the context of a resource manager, which can
> conceivably restart a driver if you like, but can't reach into your
> program.
>
> That's probably still the best way to think of it. Still, it would be
> nice if SparkContext were friendlier to a restart, just as a matter of
> design. AFAIK it is; not sure about SQLContext, though. If it's not a
> priority, that's just because this isn't a usual usage pattern, which
> doesn't mean it's crazy, just not the primary pattern.
>
> On Tue, Dec 22, 2015 at 5:57 PM, Jerry Lam <[email protected]> wrote:
> > Hi Sean,
> >
> > What if the Spark context stops for involuntary reasons (misbehavior
> > of some connections)? Then we need to handle the failure
> > programmatically by recreating the Spark context. Is there something I
> > don't understand/know about the assumptions on how to use the Spark
> > context? I tend to think of it as a resource manager/scheduler for
> > Spark jobs. Are you guys planning to deprecate the stop method from
> > Spark?
> >
> > Best Regards,
> >
> > Jerry
> >
> > Sent from my iPhone
> >
> >> On 22 Dec, 2015, at 3:57 am, Sean Owen <[email protected]> wrote:
> >>
> >> Although in many cases it does work to stop and then start a second
> >> context, it wasn't how Spark was originally designed, and I still see
> >> gotchas. I'd avoid it. I don't think you should have to release
> >> resources; just keep the same context alive.
> >>
> >>> On Tue, Dec 22, 2015 at 5:13 AM, Jerry Lam <[email protected]> wrote:
> >>> Hi Zhan,
> >>>
> >>> I'm illustrating the issue via a simple example. However, it is not
> >>> difficult to imagine use cases that need this behaviour. For example,
> >>> in a job server, much like a web service, you may want to release all
> >>> of Spark's resources when it has not been used for longer than an
> >>> hour. Unless you can prevent people from stopping the Spark context,
> >>> it is reasonable to assume that people can stop it and start it again
> >>> at a later time.
> >>>
> >>> Best Regards,
> >>>
> >>> Jerry
> >>>
> >>>
> >>>> On Mon, Dec 21, 2015 at 7:20 PM, Zhan Zhang <[email protected]> wrote:
> >>>>
> >>>> This looks to me like a very unusual use case: you stop the
> >>>> SparkContext and start another one. I don't think it is well
> >>>> supported. As the SparkContext is stopped, all the resources are
> >>>> supposed to be released.
> >>>>
> >>>> Is there any mandatory reason you have to stop and restart another
> >>>> SparkContext?
> >>>>
> >>>> Thanks.
> >>>>
> >>>> Zhan Zhang
> >>>>
> >>>> Note that when sc is stopped, all resources are released (for
> >>>> example, in YARN).
> >>>>
> >>>>> On Dec 20, 2015, at 2:59 PM, Jerry Lam <[email protected]> wrote:
> >>>>>
> >>>>> Hi Spark developers,
> >>>>>
> >>>>> I found that SQLContext.getOrCreate(sc: SparkContext) does not
> >>>>> behave correctly when a different Spark context is provided.
> >>>>>
> >>>>> ```
> >>>>> val sc = new SparkContext
> >>>>> val sqlContext = SQLContext.getOrCreate(sc)
> >>>>> sc.stop()
> >>>>> ...
> >>>>>
> >>>>> val sc2 = new SparkContext
> >>>>> val sqlContext2 = SQLContext.getOrCreate(sc2)
> >>>>> sc2.stop()
> >>>>> ```
> >>>>>
> >>>>> sqlContext2 will reference sc instead of sc2, and therefore the
> >>>>> program will not work, because sc has been stopped.
> >>>>>
> >>>>> Best Regards,
> >>>>>
> >>>>> Jerry
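For anyone hitting this on 1.5.x before the fixes land, a minimal workaround sketch (my own, assuming Spark 1.x public APIs; the app name and local master are illustrative only): construct the second SQLContext directly instead of going through getOrCreate, so it cannot pick up the cached instance that still points at the stopped context.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Assumption: local master, purely for illustration.
val conf = new SparkConf().setAppName("getOrCreate-demo").setMaster("local[*]")

val sc = new SparkContext(conf)
val sqlContext = SQLContext.getOrCreate(sc) // caches this instance
sc.stop()

val sc2 = new SparkContext(conf)
// Workaround: bypass the cached singleton, which still references the
// stopped `sc`, by constructing a fresh SQLContext explicitly.
val sqlContext2 = new SQLContext(sc2)
sqlContext2.range(0, 10).count() // runs against sc2 as expected
sc2.stop()
```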
