Hi Teja,

The only thought I have is maybe to try decreasing the
spark.scheduler.listenerbus.eventqueue.capacity parameter. That should
reduce the driver memory pressure, but of course you'll probably end up
dropping events more frequently, meaning you can't really trust anything
you see in the UI anymore.
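For what it's worth, here's a rough sketch of how that could be passed at
submit time. The 5000 is just an arbitrary example value (the default is
10000, if I remember correctly), and I believe on 2.2.x the key may still
be spark.scheduler.listenerbus.eventqueue.size, so double-check which name
your version actually honours:

  spark-submit \
    --driver-memory 30g \
    --conf spark.scheduler.listenerbus.eventqueue.capacity=5000 \
    ... your usual jars, class and arguments ...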

I'm not sure what other options there are other than trying things like
increasing driver memory and tuning GC.

Have you looked at the GC logs? For example, are both the young and old
generation portions of the heap heavily utilized, or is it just the young
generation? Depending on what you see in the GC logs, your particular
application might simply need a larger young generation, for example.
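If driver GC logging isn't already enabled, something along these lines
would give you that data. The path and the -Xmn value are purely
illustrative (on a 30G heap, -Xmn10g is just an example of carving out a
bigger young generation, not a recommendation), and keep the heap size
itself on --driver-memory rather than in extraJavaOptions:

  spark-submit \
    --driver-memory 30g \
    --conf "spark.driver.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xloggc:/tmp/driver-gc.log -Xmn10g" \
    ...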

Just some ideas for you to consider till you make the move to 2.3 or later.

On Tue, Aug 11, 2020 at 7:14 AM Teja <saiteja.pa...@gmail.com> wrote:

> We have ~120 executors with 5 cores each, for a very long-running job that
> crunches ~2.5 TB of data with a lot of filters in the query. Currently, we
> have ~30k partitions, which works out to ~90 MB per partition.
>
> We are using Spark v2.2.2 as of now. The major problem we are facing is GC
> on the driver. All of the driver memory (30G) is getting filled, and GC is
> very active, with Full GC evacuation taking more than 50% of the runtime.
> The heap dump indicates that 80% of the memory is occupied by
> LiveListenerBus and is not being cleared by GC. Frequent GC runs are only
> clearing newly created objects.
>
> From the Jira tickets, I got to know that memory consumption by
> LiveListenerBus has been addressed in v2.3 (not sure of the specifics). But
> until we evaluate migrating to v2.3, is there any quick fix or workaround,
> either to prevent the various listener events from building up in the
> driver's memory, or to identify and disable the listener that is causing
> the delay in processing events?
>
