There are a bunch of tricks noted in the Tuning
Guide<http://spark.incubator.apache.org/docs/latest/tuning.html#memory-tuning>.
You may have seen them already, but I thought they're still worth mentioning
for the record.

Besides those, if you are concerned about consistent latency (that is, low
variability in job processing times), then using the
concurrent-mark-and-sweep (CMS) GC is recommended. Instead of long
stop-the-world GC pauses, you get many smaller pauses. This reduction in
variability comes at the cost of processing throughput, though, so that's a
tradeoff.
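For example, one way to enable CMS on the executors is via extra JVM options
(a sketch, not the only way; `spark.executor.extraJavaOptions` is an
assumption here, and the class/jar names are placeholders -- older
deployments set the same flags through SPARK_JAVA_OPTS in spark-env.sh):

```shell
# Enable the concurrent-mark-and-sweep collector on Spark executors.
# -XX:+UseConcMarkSweepGC swaps the default throughput collector for CMS;
# -XX:+CMSParallelRemarkEnabled shortens the remark stop-the-world phase.
# MyJob / my-job.jar are placeholders for your own application.
spark-submit \
  --conf "spark.executor.extraJavaOptions=-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled" \
  --class org.example.MyJob \
  my-job.jar
```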

TD


On Thu, Jan 16, 2014 at 11:35 AM, Kay Ousterhout <[email protected]> wrote:

> Hi all,
>
> I'm finding that Java GC can be a major performance bottleneck when running
> Spark at high (>50% or so) memory utilization.  What GC tuning have people
> tried for Spark and how effective has it been?
>
> Thanks!
>
> Kay
>
