Hi,

The other option to consider is using the G1 GC, which should behave better
with large heaps.  But ordinary object pointers are not compressed in heaps
larger than 32 GB, so you may be better off staying under 32 GB.
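
For reference, a minimal sketch of what that could look like in
spark-defaults.conf (the 30g figure is only an assumption chosen to keep
compressed oops enabled; adjust for your workload):

  spark.executor.memory            30g
  spark.executor.extraJavaOptions  -XX:+UseG1GC

The JVM enables compressed ordinary object pointers by default only for heaps
under roughly 32 GB; above that every object reference doubles to 8 bytes, so
part of the extra heap is consumed by larger pointers.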

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/


On Mon, Oct 6, 2014 at 8:08 PM, Mingyu Kim <m...@palantir.com> wrote:

> Ok, cool. This seems to be a general issue in the JVM with very large heaps.
> I agree that the best workaround would be to keep the heap size below 32 GB.
> Thanks guys!
>
> Mingyu
>
> From: Arun Ahuja <aahuj...@gmail.com>
> Date: Monday, October 6, 2014 at 7:50 AM
> To: Andrew Ash <and...@andrewash.com>
> Cc: Mingyu Kim <m...@palantir.com>, "user@spark.apache.org" <
> user@spark.apache.org>, Dennis Lawler <dlaw...@palantir.com>
> Subject: Re: Larger heap leads to perf degradation due to GC
>
> We have used the strategy that you suggested, Andrew - using many workers
> per machine and keeping the heaps small (< 20 GB).
>
> Using a large heap resulted in workers hanging or not responding (leading
> to timeouts).  The same dataset/job will fail for us (most often due to
> Akka disassociation or fetch failure errors) with 10 cores / 100 executors /
> 60 GB per executor, while it succeeds with 1 core / 1000 executors / 6 GB
> per executor.
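>
> For anyone curious to reproduce the comparison, a rough sketch of the two
> submit configurations (assuming a YARN deployment; the flags just mirror the
> numbers above and are illustrative, not a recommendation):
>
>   # few large executors (hung / failed for us)
>   spark-submit --num-executors 100 --executor-cores 10 --executor-memory 60g ...
>
>   # many small executors (succeeded)
>   spark-submit --num-executors 1000 --executor-cores 1 --executor-memory 6g ...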
>
> When the job does succeed with more cores per executor and a larger heap, it
> is usually much slower than with the smaller executors (the same 8-10 min job
> taking 15-20 min to complete).
>
> The unfortunate downside of this has been that some of our large broadcast
> variables may not fit into memory (and are unnecessarily duplicated) when
> using the smaller executors.
>
> Most of this is anecdotal, but overall we have had more success and
> consistency with more executors, each with smaller memory requirements.
>
> On Sun, Oct 5, 2014 at 7:20 PM, Andrew Ash <and...@andrewash.com> wrote:
>
>> Hi Mingyu,
>>
>> Maybe we should be limiting our heaps to 32 GB max and running multiple
>> workers per machine to avoid these large-heap GC issues.
>>
>> For a machine with 128 GB of memory and 32 cores, this could look like:
>>
>> SPARK_WORKER_INSTANCES=4
>> SPARK_WORKER_MEMORY=32g
>> SPARK_WORKER_CORES=8
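>>
>> (That multiplies out to the whole machine: 4 workers x 32 GB = 128 GB of
>> memory and 4 x 8 = 32 cores. In practice you may want to leave a little
>> headroom for the OS and the worker daemons themselves.)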
>>
>> Are people running with large (32GB+) executor heaps in production?  I'd
>> be curious to hear if so.
>>
>> Cheers!
>> Andrew
>>
>> On Thu, Oct 2, 2014 at 1:30 PM, Mingyu Kim <m...@palantir.com> wrote:
>>
>>> This issue definitely needs more investigation, but I just wanted to
>>> quickly check if anyone has run into this problem or has general guidance
>>> around it. We’ve seen a performance degradation with a large heap on a
>>> simple map task (i.e. no shuffle). We’ve seen the slowness starting at
>>> around a 50 GB heap (i.e. spark.executor.memory=50g). And when we checked
>>> the CPU usage, there were just a lot of GCs going on.
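>>>
>>> (One way to confirm it is GC time rather than application CPU is to enable
>>> GC logging on the executors, for example with
>>> spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps,
>>> though the exact flags will depend on your JVM.)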
>>>
>>> Has anyone seen a similar problem?
>>>
>>> Thanks,
>>> Mingyu
>>>
>>
>>
>
