The more cores you use, the less memory each task gets.
512MB is already quite small, and with 4 cores that means
roughly 128MB per task.
Sometimes it is worth using fewer cores with more memory each.
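
For example, something like this (just a rough sketch with made-up names,
assuming you build the SparkConf yourself; in local mode everything runs in
a single JVM, so I believe the actual heap comes from the driver memory
rather than spark.executor.memory):

    import org.apache.spark.{SparkConf, SparkContext}

    // Use a fixed, smaller number of local threads instead of local[*],
    // so each concurrent task gets a larger share of the heap.
    val conf = new SparkConf()
      .setAppName("JoinJob")     // placeholder name
      .setMaster("local[2]")

    val sc = new SparkContext(conf)

and give the JVM a bigger heap when you launch it, e.g. --driver-memory 2g
with spark-submit, or a larger -Xmx if you start it directly.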

How many cores do you have?

André

On 2014-08-05 16:43, Grzegorz Białek wrote:
Hi,

I have a Spark application which computes a join of two RDDs. One contains
around 150MB of data (7 million entries), the second around 1.5MB (80
thousand entries), and
the result of the join contains around 50MB of data (2 million entries).

When I run it on one core (with master=local) it works correctly
(the whole process uses between 600 and 700MB of memory), but when I run it
on all cores (with master=local[*]) it throws:
java.lang.OutOfMemoryError: GC overhead limit exceeded
and sometimes
java.lang.OutOfMemoryError: Java heap space

I have set spark.executor.memory=512m (default value).
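
A simplified sketch of the setup (names, types and the generated data are
placeholders, but the shape of the job is the same):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.SparkContext._

    val conf = new SparkConf()
      .setAppName("JoinRepro")
      .setMaster("local[*]")                  // works fine with "local"
      .set("spark.executor.memory", "512m")

    val sc = new SparkContext(conf)

    // Large side: ~7 million keyed entries; small side: ~80 thousand.
    val big   = sc.parallelize(1 to 7000000).map(i => (i % 80000, i.toLong))
    val small = sc.parallelize(1 to 80000).map(i => (i, "dim-" + i))

    // join shuffles both sides by key and materialises the matching pairs;
    // in the real data this produces about 2 million entries.
    val joined = big.join(small)
    println(joined.count())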

Does anyone know why the above occurs?

Thanks,
Grzegorz


--
André Bois-Crettez

Software Architect
Big Data Developer
http://www.kelkoo.com/


