Done:

https://issues.apache.org/jira/browse/SPARK-25837

On Thu, Oct 25, 2018 at 10:21 AM Marcelo Vanzin <van...@cloudera.com> wrote:

> Ah, that makes more sense. Could you file a bug with that information
> so we don't lose track of this?
>
> Thanks
> On Wed, Oct 24, 2018 at 6:13 PM Patrick Brown <patrick.barry.br...@gmail.com> wrote:
> >
> > In my production application I am running ~200 jobs at once, but I
> > continue to submit jobs in this manner for sometimes ~1 hour.
> >
> > The reproduction code above generally has only about 4 jobs running at
> > once, and as you can see it runs through 50k jobs in this manner.
> >
> > I should clarify my statement above: the issue seems to appear when
> > running multiple jobs at once, as well as in sequence for a while, and
> > it may also have something to do with high master CPU usage (hence the
> > collect in the code). My rough guess is that whatever manages clearing
> > out completed jobs gets overwhelmed (my master was a 4-core machine
> > while running this, and htop reported almost full CPU usage across all
> > 4 cores).
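> >
> > To illustrate that guess, here is a minimal sketch in plain Scala (not
> > Spark's actual code, just the failure mode I am imagining): completed
> > jobs are recorded faster than a cleanup pass can trim them back to the
> > retained limit, so the store sits far above the limit in steady state.
> >
> > import java.util.concurrent.ConcurrentLinkedQueue
> >
> > // Hypothetical store: completions arrive on one thread while cleanup
> > // runs on another. Under sustained load the queue can sit well above
> > // `retained` between cleanup passes.
> > class CompletedJobStore(retained: Int) {
> >   private val completed = new ConcurrentLinkedQueue[Int]()
> >
> >   def jobCompleted(id: Int): Unit = completed.add(id)
> >
> >   // Trims to the limit, but if each pass is expensive (size is O(n)
> >   // on this queue) or runs infrequently under high CPU load, the
> >   // backlog grows faster than it drains.
> >   def cleanup(): Unit =
> >     while (completed.size > retained) completed.poll()
> > }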
> >
> > The attached screenshot shows the state of the web UI after running
> > the repro code: you can see the UI is displaying some 43k completed
> > jobs (and takes a long time to load). After a few minutes of
> > inactivity this will clear out; however, as my production application
> > continues to submit jobs every once in a while, the issue persists.
> >
> > On Wed, Oct 24, 2018 at 5:05 PM Marcelo Vanzin <van...@cloudera.com> wrote:
> >>
> >> When you say many jobs at once, what ballpark are you talking about?
> >>
> >> The code in 2.3+ does try to keep data about all running jobs and
> >> stages regardless of the limit. If you're running into issues because
> >> of that we may have to look again at whether that's the right thing to
> >> do.
> >> On Tue, Oct 23, 2018 at 10:02 AM Patrick Brown <patrick.barry.br...@gmail.com> wrote:
> >> >
> >> > I believe I may be able to reproduce this now; it seems to have
> >> > something to do with many jobs at once:
> >> >
> >> > Spark 2.3.1
> >> >
> >> > > spark-shell --conf spark.ui.retainedJobs=1
> >> >
> >> > scala> import scala.concurrent._
> >> > scala> import scala.concurrent.ExecutionContext.Implicits.global
> >> > scala> for (i <- 0 until 50000) { Future { println(sc.parallelize(0 until i).collect.length) } }
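> >> >
> >> > For reference, the same repro as a standalone application rather
> >> > than a shell session (a sketch; the object name and the final Await
> >> > are my own additions so the driver doesn't exit before the jobs
> >> > finish). The global ExecutionContext is sized to the number of
> >> > cores, which is why only ~4 jobs run at once on a 4-core machine:
> >> >
> >> > import scala.concurrent._
> >> > import scala.concurrent.duration._
> >> > import scala.concurrent.ExecutionContext.Implicits.global
> >> > import org.apache.spark.sql.SparkSession
> >> >
> >> > object RetainedJobsRepro {
> >> >   def main(args: Array[String]): Unit = {
> >> >     val spark = SparkSession.builder()
> >> >       .appName("retained-jobs-repro")
> >> >       .config("spark.ui.retainedJobs", "1")
> >> >       .getOrCreate()
> >> >     val sc = spark.sparkContext
> >> >     // Fire 50k small jobs; each Future triggers one collect job.
> >> >     val futures = (0 until 50000).map { i =>
> >> >       Future { sc.parallelize(0 until i).collect().length }
> >> >     }
> >> >     Await.result(Future.sequence(futures), Duration.Inf)
> >> >     spark.stop()
> >> >   }
> >> > }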
> >> >
> >> > On Mon, Oct 22, 2018 at 11:25 AM Marcelo Vanzin <van...@cloudera.com> wrote:
> >> >>
> >> >> Just tried on 2.3.2 and it worked fine for me. The UI had a single
> >> >> job and a single stage (+ the tasks related to that single stage),
> >> >> and the same thing in memory (checked with jvisualvm).
> >> >>
> >> >> On Sat, Oct 20, 2018 at 6:45 PM Marcelo Vanzin <van...@cloudera.com> wrote:
> >> >> >
> >> >> > On Tue, Oct 16, 2018 at 9:34 AM Patrick Brown <patrick.barry.br...@gmail.com> wrote:
> >> >> > > I recently upgraded to Spark 2.3.1. I have had these same
> >> >> > > settings in my spark-submit script, which worked on 2.0.2 and,
> >> >> > > according to the documentation, appear not to have changed:
> >> >> > >
> >> >> > > spark.ui.retainedTasks=1
> >> >> > > spark.ui.retainedStages=1
> >> >> > > spark.ui.retainedJobs=1
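> >> >> > >
> >> >> > > In case it helps reproduce, the same limits can also be set
> >> >> > > programmatically (a sketch of the equivalent; my actual script
> >> >> > > passes them to spark-submit):
> >> >> > >
> >> >> > > import org.apache.spark.SparkConf
> >> >> > > import org.apache.spark.sql.SparkSession
> >> >> > >
> >> >> > > val conf = new SparkConf()
> >> >> > >   .set("spark.ui.retainedJobs", "1")
> >> >> > >   .set("spark.ui.retainedStages", "1")
> >> >> > >   .set("spark.ui.retainedTasks", "1")
> >> >> > > val spark = SparkSession.builder().config(conf).getOrCreate()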
> >> >> >
> >> >> > I tried that locally on the current master and it seems to be
> >> >> > working. I don't have 2.3 easily in front of me right now, but
> >> >> > will take a look Monday.
> >> >> >
> >> >> > --
> >> >> > Marcelo
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Marcelo
> >>
> >>
> >>
> >> --
> >> Marcelo
>
>
>
> --
> Marcelo
>
