Hi Calvin,

When you say "until all the memory in the cluster is allocated and the job
gets killed", do you know what's actually happening?  Spark apps shouldn't
be killed just for requesting or using too many resources.  Is there an
associated error message?
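
For what it's worth, that specific error ("unable to create new native
thread") usually means the JVM hit the OS per-user process/thread limit on
the node rather than running out of heap, so the ulimit settings on the
NodeManager hosts are worth a look.  A rough sketch of that check (assuming
Linux nodes, run as the user the containers run as; this is just a
diagnostic, not anything Spark provides):

import getpass
import resource
import subprocess

# Per-user limit on processes/threads; on Linux this is the limit behind
# "unable to create new native thread" (the same number as `ulimit -u`).
soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
print("RLIMIT_NPROC (soft/hard): %s / %s" % (soft, hard))

# Count the lightweight processes (threads) this user currently owns.
user = getpass.getuser()
threads = subprocess.check_output(
    ["ps", "-L", "-u", user, "--no-headers"]).splitlines()
print("threads currently in use by %s: %d" % (user, len(threads)))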

Unfortunately there are currently no tools for tweaking the number of
executors in an automated manner.  An option to use the entire YARN
cluster does seem useful; I just filed a JIRA for it:
https://issues.apache.org/jira/browse/SPARK-3183.
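
In the meantime, one possible workaround is a small wrapper that asks the
ResourceManager how much room the cluster has and derives --num-executors
from that before submitting.  A sketch of the idea is below; the RM
address, the ~384MB per-executor memory overhead, and the one-container
margin left for the application master are all assumptions you'd want to
adjust for your cluster:

import json
import subprocess

try:
    from urllib2 import urlopen          # Python 2
except ImportError:
    from urllib.request import urlopen   # Python 3

# Assumed ResourceManager address; adjust for your cluster.
RM_METRICS_URL = "http://resourcemanager:8088/ws/v1/cluster/metrics"

# Executor shape from the submission below: 2G heap plus an assumed
# ~384MB of YARN memory overhead, 1 core each.
EXECUTOR_MEM_MB = 2048 + 384
EXECUTOR_CORES = 1

metrics = json.load(urlopen(RM_METRICS_URL))["clusterMetrics"]
by_mem = metrics["availableMB"] // EXECUTOR_MEM_MB
by_cores = metrics["availableVirtualCores"] // EXECUTOR_CORES

# Leave one container's worth of room for the application master.
executors = max(1, min(by_mem, by_cores) - 1)

subprocess.check_call([
    "spark-submit",
    "--master", "yarn-client",
    "--num-executors", str(executors),
    "--executor-cores", str(EXECUTOR_CORES),
    "--executor-memory", "2G",
    "--driver-memory", "3G",
    "count.py", "input", "output",
])

This only sizes the request at submit time, so it won't explain the
runaway container requests, but it should at least keep the initial ask
within the cluster's capacity.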

-Sandy


On Tue, Aug 19, 2014 at 12:51 PM, Calvin <iphcal...@gmail.com> wrote:

> I've set up a YARN (Hadoop 2.4.1) cluster with Spark 1.0.1 and I've
> been seeing inconsistent out-of-memory errors
> (java.lang.OutOfMemoryError: unable to create new native thread) when
> increasing the number of executors for a simple job (wordcount).
>
> The general format of my submission is:
>
> spark-submit \
>  --master yarn-client \
>  --num-executors $EXECUTORS \
>  --executor-cores 1 \
>  --executor-memory 2G \
>  --driver-memory 3G \
>  count.py input output
>
> If I run without specifying the number of executors, it defaults to
> two (3 containers: 2 executors, 1 driver). Is there any mechanism to
> let a spark application scale to the capacity of the YARN cluster
> automatically?
>
> Similarly, for low numbers of executors I get what I asked for (e.g.,
> 10 executors results in 11 containers running, 20 executors results in
> 21 containers, etc.) up to a particular threshold: when I specify 50
> executors, Spark seems to start asking for more and more containers
> until all the memory in the cluster is allocated and the job gets
> killed.
>
> I don't understand that particular behavior; if anyone has any thoughts
> or experiences to share, that would be great.
>
> Wouldn't it be preferable to have Spark stop requesting containers once
> the cluster is at capacity, rather than have the job killed or error out?
>
> Does anyone have any recommendations on how to tweak the number of
> executors in an automated manner?
>
> Thanks,
> Calvin
>
