I think that's expected, since Zeppelin keeps the Spark context alive even
when the notebook is not executing (the idea being that you could run more
things). That keeps broadcast data and cached RDDs in memory. You should see
the same behavior if you run the same code from spark-shell and don't exit
the shell.
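
For example, something like this in the shell (a rough sketch; the paths and
names are just illustrative):

    // sc is the long-lived SparkContext that Zeppelin (or spark-shell) keeps open
    val cached = sc.textFile("hdfs:///some/input").cache() // pins blocks in executor memory
    val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2))     // copied out to every executor
    cached.map(_ => lookup.value.size).count()
    // After count() returns, the cached blocks and the broadcast value both
    // stay on the workers, because the context (and its block manager) is
    // still alive.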

On Thu, Dec 3, 2015 at 9:01 AM -0800, "Jakub Liska" <liska.ja...@gmail.com> 
wrote:

Hey,

I mentioned that I'm using broadcast variables, but I'm destroying them at
the end... I'm using Spark 1.7.1... I'll let you know later if the problem
still occurs. So far it seems it stopped after I started destroying them and
calling cachedRdd.unpersist.
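
Roughly this pattern, at the end of the run (a sketch; bigMap and inputRdd
stand in for my real data):

    val bigMap    = Map("a" -> 1, "b" -> 2)      // stand-in for the real lookup data
    val inputRdd  = sc.parallelize(1 to 1000000) // stand-in for the real input
    val bc        = sc.broadcast(bigMap)
    val cachedRdd = inputRdd.cache()
    try {
      cachedRdd.map(_ + bc.value.size).count()   // the actual job goes here
    } finally {
      cachedRdd.unpersist(blocking = true)       // frees the cached blocks on the workers
      bc.destroy()                               // frees the broadcast data and its metadata
    }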

On Thu, Dec 3, 2015 at 5:52 PM, Felix Cheung <felixcheun...@hotmail.com>
wrote:

> Do you know which version of Spark you're running?
>
> On Thu, Dec 3, 2015 at 12:52 AM -0800, "Kevin (Sangwoo) Kim" <
> kevin...@apache.org> wrote:
>
> Do you use broadcast variables? I've seen many problems related to
> broadcast variables that are not destroyed.
> (It's a Spark problem rather than a Zeppelin problem.)
>
> For RDDs, there's no need to unpersist manually; Spark does it automatically.
>
> On Thu, Dec 3, 2015 at 5:28 PM, Jakub Liska <liska.ja...@gmail.com> wrote:
>
> Hi,
>
> No, just running it manually. I think I need to unpersist the cached RDDs
> and destroy the broadcast variables at the end, am I correct? It hasn't
> crashed since I started doing that, though subsequent runs are always a
> little slower.
>
> On Thu, Dec 3, 2015 at 8:08 AM, Felix Cheung <felixcheun...@hotmail.com>
> wrote:
>
> How are you running jobs? Do you schedule a notebook to run from Zeppelin?
>
> ------------------------------
> Date: Mon, 30 Nov 2015 12:42:16 +0100
> Subject: Spark worker memory not freed up after zeppelin run finishes
> From: liska.ja...@gmail.com
> To: users@zeppelin.incubator.apache.org
>
> Hey,
>
> I'm connecting Zeppelin to a remote Spark standalone cluster (2 worker
> nodes), and I noticed that if I run a job from Zeppelin twice without
> restarting the interpreter, it fails with an OutOfMemoryError. After the
> Zeppelin job finishes successfully, I can see all of the executor memory
> still allocated on the workers; restarting the interpreter frees it, but if
> I don't restart, the next run of the task fails.
>
> Any idea how to deal with this problem? Currently I always have to restart
> the interpreter between Spark jobs.
>
> Thanks, Jakub
>