Setting Executor memory

2015-09-14 Thread Thomas Gerber
Hello,

I was looking for guidelines on what value to set executor memory to
(via spark.executor.memory for example).

This seems to be important for avoiding OOMs during tasks, especially in
no-swap environments (such as AWS EMR clusters).

This setting really controls the executor JVM heap. Hence, to come up with
the maximum amount of heap memory for the executor, we need to account for:
1. the memory taken by other processes (Worker in standalone mode, ...)
2. all off-heap allocations in the executor

Fortunately, for #1, we can just look at memory consumption without any
application running.

For #2, it is trickier. What I suspect we should account for:
a. thread stack size
b. akka buffers (via akka framesize & number of akka threads)
c. kryo buffers
d. shuffle buffers
(e. tachyon)

Could anyone shed some light on this? Maybe a formula? Or maybe swap should
actually be turned on, as a safeguard against OOMs?
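
For concreteness, here is my current back-of-envelope accounting - this is
exactly the sketch I would like confirmed or corrected, and every term in it
is an assumption on my part:

executor heap <= node RAM
                 - memory used with no application running (#1 above)
                 - number of executor threads * thread stack size
                 - akka frame size * number of akka threads
                 - kryo buffers
                 - shuffle buffers
                 - (tachyon, if used)

The result would then be applied via spark.executor.memory, e.g.:

# conf/spark-defaults.conf -- 40g below is a placeholder, not a recommendation
spark.executor.memory   40g

or, per job, with the equivalent spark-submit flag:

./bin/spark-submit --executor-memory 40g ...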

Thanks


Re: Setting executor memory when using spark-shell

2014-06-06 Thread Oleg Proudnikov
Thank you, Andrew!


On 5 June 2014 23:14, Andrew Ash and...@andrewash.com wrote:

 Oh, my apologies - that was for 1.0.

 For Spark 0.9 I did it like this:

 MASTER=spark://mymaster:7077 SPARK_MEM=8g ./bin/spark-shell -c
 $CORES_ACROSS_CLUSTER

 The downside of this, though, is that SPARK_MEM also sets the driver's JVM
 heap to 8g, rather than just the executors'.  I think this is the reason why
 SPARK_MEM was deprecated.  See https://github.com/apache/spark/pull/99


 On Thu, Jun 5, 2014 at 2:37 PM, Oleg Proudnikov oleg.proudni...@gmail.com
  wrote:

 Thank you, Andrew,

 I am using Spark 0.9.1 and tried your approach like this:

 bin/spark-shell --driver-java-options
 -Dspark.executor.memory=$MEMORY_PER_EXECUTOR

 I get

 bad option: '--driver-java-options'

 There must be something different in my setup. Any ideas?

 Thank you again,
 Oleg





 On 5 June 2014 22:28, Andrew Ash and...@andrewash.com wrote:

 Hi Oleg,

 I set the size of my executors on a standalone cluster when using the
 shell like this:

 ./bin/spark-shell --master $MASTER --total-executor-cores
 $CORES_ACROSS_CLUSTER --driver-java-options
 -Dspark.executor.memory=$MEMORY_PER_EXECUTOR

 It doesn't seem particularly clean, but it works.

 Andrew


 On Thu, Jun 5, 2014 at 2:15 PM, Oleg Proudnikov 
 oleg.proudni...@gmail.com wrote:

 Hi All,

 Please help me set Executor JVM memory size. I am using Spark shell and
 it appears that the executors are started with a predefined JVM heap of
 512m as soon as Spark shell starts. How can I change this setting? I tried
 setting SPARK_EXECUTOR_MEMORY before launching Spark shell:

 export SPARK_EXECUTOR_MEMORY=1g

 I also tried several other approaches:

 1) setting SPARK_WORKER_MEMORY in conf/spark-env.sh on the worker
 2) passing it as the -m argument and running bin/start-slave.sh 1 -m 1g on
 the worker

 Thank you,
 Oleg





 --
 Kind regards,

 Oleg





-- 
Kind regards,

Oleg


Re: Setting executor memory when using spark-shell

2014-06-06 Thread Oleg Proudnikov
Thank you, Hassan!


On 6 June 2014 03:23, hassan hellfire...@gmail.com wrote:

 just use -Dspark.executor.memory=



 --
 View this message in context:
 http://apache-spark-user-list.1001560.n3.nabble.com/Setting-executor-memory-when-using-spark-shell-tp7082p7103.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.




-- 
Kind regards,

Oleg


Re: Setting executor memory when using spark-shell

2014-06-06 Thread Patrick Wendell
In 1.0+ you can just pass the --executor-memory flag to ./bin/spark-shell.
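
For example (the master URL and memory size here are just placeholders):

./bin/spark-shell --master spark://mymaster:7077 --executor-memory 2g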

On Fri, Jun 6, 2014 at 12:32 AM, Oleg Proudnikov
oleg.proudni...@gmail.com wrote:
 Thank you, Hassan!


 On 6 June 2014 03:23, hassan hellfire...@gmail.com wrote:

 just use -Dspark.executor.memory=



 --
 View this message in context:
 http://apache-spark-user-list.1001560.n3.nabble.com/Setting-executor-memory-when-using-spark-shell-tp7082p7103.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.




 --
 Kind regards,

 Oleg



Re: Setting executor memory when using spark-shell

2014-06-06 Thread Oleg Proudnikov
Thank you, Patrick

I am planning to switch to 1.0 now.

By way of feedback - I used Andrew's suggestion and found that it does
exactly that - it sets the Executor JVM heap and nothing else. The Workers
have already been started, and when the shell starts it is now able to
control the Executor JVM heap.

Thank you again,
Oleg



On 6 June 2014 18:05, Patrick Wendell pwend...@gmail.com wrote:

 In 1.0+ you can just pass the --executor-memory flag to ./bin/spark-shell.

 On Fri, Jun 6, 2014 at 12:32 AM, Oleg Proudnikov
 oleg.proudni...@gmail.com wrote:
  Thank you, Hassan!
 
 
  On 6 June 2014 03:23, hassan hellfire...@gmail.com wrote:
 
  just use -Dspark.executor.memory=
 
 
 
  --
  View this message in context:
 
 http://apache-spark-user-list.1001560.n3.nabble.com/Setting-executor-memory-when-using-spark-shell-tp7082p7103.html
  Sent from the Apache Spark User List mailing list archive at Nabble.com.
 
 
 
 
  --
  Kind regards,
 
  Oleg
 




-- 
Kind regards,

Oleg


Setting executor memory when using spark-shell

2014-06-05 Thread Oleg Proudnikov
Hi All,

Please help me set Executor JVM memory size. I am using Spark shell and it
appears that the executors are started with a predefined JVM heap of 512m
as soon as Spark shell starts. How can I change this setting? I tried
setting SPARK_EXECUTOR_MEMORY before launching Spark shell:

export SPARK_EXECUTOR_MEMORY=1g

I also tried several other approaches:

1) setting SPARK_WORKER_MEMORY in conf/spark-env.sh on the worker
2) passing it as the -m argument and running bin/start-slave.sh 1 -m 1g on
the worker

Thank you,
Oleg


Re: Setting executor memory when using spark-shell

2014-06-05 Thread Andrew Ash
Hi Oleg,

I set the size of my executors on a standalone cluster when using the shell
like this:

./bin/spark-shell --master $MASTER --total-executor-cores
$CORES_ACROSS_CLUSTER --driver-java-options
-Dspark.executor.memory=$MEMORY_PER_EXECUTOR
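
For example, with made-up values (16 cores across the cluster, 4g per
executor):

./bin/spark-shell --master spark://mymaster:7077 --total-executor-cores 16 \
  --driver-java-options -Dspark.executor.memory=4g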

It doesn't seem particularly clean, but it works.

Andrew


On Thu, Jun 5, 2014 at 2:15 PM, Oleg Proudnikov oleg.proudni...@gmail.com
wrote:

 Hi All,

 Please help me set Executor JVM memory size. I am using Spark shell and it
 appears that the executors are started with a predefined JVM heap of 512m
 as soon as Spark shell starts. How can I change this setting? I tried
 setting SPARK_EXECUTOR_MEMORY before launching Spark shell:

 export SPARK_EXECUTOR_MEMORY=1g

 I also tried several other approaches:

 1) setting SPARK_WORKER_MEMORY in conf/spark-env.sh on the worker
 2) passing it as the -m argument and running bin/start-slave.sh 1 -m 1g on
 the worker

 Thank you,
 Oleg




Re: Setting executor memory when using spark-shell

2014-06-05 Thread Oleg Proudnikov
Thank you, Andrew,

I am using Spark 0.9.1 and tried your approach like this:

bin/spark-shell --driver-java-options
-Dspark.executor.memory=$MEMORY_PER_EXECUTOR

I get

bad option: '--driver-java-options'

There must be something different in my setup. Any ideas?

Thank you again,
Oleg





On 5 June 2014 22:28, Andrew Ash and...@andrewash.com wrote:

 Hi Oleg,

 I set the size of my executors on a standalone cluster when using the
 shell like this:

 ./bin/spark-shell --master $MASTER --total-executor-cores
 $CORES_ACROSS_CLUSTER --driver-java-options
 -Dspark.executor.memory=$MEMORY_PER_EXECUTOR

 It doesn't seem particularly clean, but it works.

 Andrew


 On Thu, Jun 5, 2014 at 2:15 PM, Oleg Proudnikov oleg.proudni...@gmail.com
  wrote:

 Hi All,

 Please help me set Executor JVM memory size. I am using Spark shell and
 it appears that the executors are started with a predefined JVM heap of
 512m as soon as Spark shell starts. How can I change this setting? I tried
 setting SPARK_EXECUTOR_MEMORY before launching Spark shell:

 export SPARK_EXECUTOR_MEMORY=1g

 I also tried several other approaches:

 1) setting SPARK_WORKER_MEMORY in conf/spark-env.sh on the worker
 2) passing it as the -m argument and running bin/start-slave.sh 1 -m 1g on
 the worker

 Thank you,
 Oleg





-- 
Kind regards,

Oleg


Re: Setting executor memory when using spark-shell

2014-06-05 Thread Andrew Ash
Oh, my apologies - that was for 1.0.

For Spark 0.9 I did it like this:

MASTER=spark://mymaster:7077 SPARK_MEM=8g ./bin/spark-shell -c
$CORES_ACROSS_CLUSTER
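
Since SPARK_MEM is just an environment variable, you can equivalently export
it once for the session (8g here is only an example value):

export SPARK_MEM=8g
MASTER=spark://mymaster:7077 ./bin/spark-shell -c $CORES_ACROSS_CLUSTER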

The downside of this, though, is that SPARK_MEM also sets the driver's JVM
heap to 8g, rather than just the executors'.  I think this is the reason why
SPARK_MEM was deprecated.  See https://github.com/apache/spark/pull/99


On Thu, Jun 5, 2014 at 2:37 PM, Oleg Proudnikov oleg.proudni...@gmail.com
wrote:

 Thank you, Andrew,

 I am using Spark 0.9.1 and tried your approach like this:

 bin/spark-shell --driver-java-options
 -Dspark.executor.memory=$MEMORY_PER_EXECUTOR

 I get

 bad option: '--driver-java-options'

 There must be something different in my setup. Any ideas?

 Thank you again,
 Oleg





 On 5 June 2014 22:28, Andrew Ash and...@andrewash.com wrote:

 Hi Oleg,

 I set the size of my executors on a standalone cluster when using the
 shell like this:

 ./bin/spark-shell --master $MASTER --total-executor-cores
 $CORES_ACROSS_CLUSTER --driver-java-options
 -Dspark.executor.memory=$MEMORY_PER_EXECUTOR

 It doesn't seem particularly clean, but it works.

 Andrew


 On Thu, Jun 5, 2014 at 2:15 PM, Oleg Proudnikov 
 oleg.proudni...@gmail.com wrote:

 Hi All,

 Please help me set Executor JVM memory size. I am using Spark shell and
 it appears that the executors are started with a predefined JVM heap of
 512m as soon as Spark shell starts. How can I change this setting? I tried
 setting SPARK_EXECUTOR_MEMORY before launching Spark shell:

 export SPARK_EXECUTOR_MEMORY=1g

 I also tried several other approaches:

 1) setting SPARK_WORKER_MEMORY in conf/spark-env.sh on the worker
 2) passing it as the -m argument and running bin/start-slave.sh 1 -m 1g on
 the worker

 Thank you,
 Oleg





 --
 Kind regards,

 Oleg




Re: Setting executor memory when using spark-shell

2014-06-05 Thread hassan
just use -Dspark.executor.memory=



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Setting-executor-memory-when-using-spark-shell-tp7082p7103.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.