Re: How to specify executor memory in EC2 ?

2014-06-13 Thread Aliaksei Litouka
Aaron,
spark.executor.memory is set to 2454m in my spark-defaults.conf, which is a
reasonable value for the EC2 instances I use (they are m3.medium
machines). However, it doesn't help, and each executor still uses only 512 MB
of memory. To figure out why, I examined the spark-submit and spark-class
scripts and found that the Java options and Java memory size are computed in
the spark-class script (see the OUR_JAVA_OPTS and OUR_JAVA_MEM variables in
that script). These values are then used to compose the following string:

JAVA_OPTS="$JAVA_OPTS -Xms$OUR_JAVA_MEM -Xmx$OUR_JAVA_MEM"
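
For context, the surrounding logic in bin/spark-class looks roughly like the
following. This is a paraphrase from memory of the 1.0-era script, so variable
names and defaults may differ slightly in your version:

# Fall back to 512m unless SPARK_MEM is set
DEFAULT_MEM=${SPARK_MEM:-512m}

# For executor processes (CoarseGrainedExecutorBackend):
OUR_JAVA_OPTS="$SPARK_JAVA_OPTS $SPARK_EXECUTOR_OPTS"
OUR_JAVA_MEM=${SPARK_EXECUTOR_MEMORY:-$DEFAULT_MEM}

# ...which is then appended to the JVM options:
JAVA_OPTS="$JAVA_OPTS -Xms$OUR_JAVA_MEM -Xmx$OUR_JAVA_MEM"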

Note that OUR_JAVA_MEM is appended at the end of the string, so it takes
precedence over any heap size set earlier in the options. For some reason
I haven't found yet, OUR_JAVA_MEM is left at its default value of 512 MB.
I was able to fix this only by setting the SPARK_MEM variable in the
spark-env.sh file:

export SPARK_MEM=2g

However, this variable is deprecated, so my solution doesn't seem to be a
good one :)
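
If I'm reading spark-class correctly, a less deprecated workaround would be to
set the executor-specific variable instead. This is an untested guess on my
part, based on the SPARK_EXECUTOR_MEMORY fallback in the sketch above:

# In /root/spark/conf/spark-env.sh on the master and all slaves:
export SPARK_EXECUTOR_MEMORY=2g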


On Thu, Jun 12, 2014 at 10:16 PM, Aaron Davidson ilike...@gmail.com wrote:

 The scripts for Spark 1.0 actually specify this property in
 /root/spark/conf/spark-defaults.conf.

 I didn't know that this would override the --executor-memory flag, though;
 that's pretty odd.
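
 For reference, the property in question is just a whitespace-separated line
 in that file, e.g. (hypothetical value; the deploy scripts fill in their own):

 spark.executor.memory   512m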


 On Thu, Jun 12, 2014 at 6:02 PM, Aliaksei Litouka 
 aliaksei.lito...@gmail.com wrote:

 Yes, I am launching a cluster with the spark_ec2 script. I checked
 /root/spark/conf/spark-env.sh on the master node and on slaves and it looks
 like this:

 #!/usr/bin/env bash
 export SPARK_LOCAL_DIRS=/mnt/spark
 # Standalone cluster options
 export SPARK_MASTER_OPTS=
 export SPARK_WORKER_INSTANCES=1
 export SPARK_WORKER_CORES=1
 export HADOOP_HOME=/root/ephemeral-hdfs
 export SPARK_MASTER_IP=ec2-54-89-95-238.compute-1.amazonaws.com
 export MASTER=`cat /root/spark-ec2/cluster-url`
 export SPARK_SUBMIT_LIBRARY_PATH=$SPARK_SUBMIT_LIBRARY_PATH:/root/ephemeral-hdfs/lib/native/
 export SPARK_SUBMIT_CLASSPATH=$SPARK_CLASSPATH:$SPARK_SUBMIT_CLASSPATH:/root/ephemeral-hdfs/conf
 # Bind Spark's web UIs to this machine's public EC2 hostname:
 export SPARK_PUBLIC_DNS=`wget -q -O - http://169.254.169.254/latest/meta-data/public-hostname`
 # Set a high ulimit for large shuffles
 ulimit -n 1000000


 None of these variables seem to be related to memory size. Let me know if
 I am missing something.


 On Thu, Jun 12, 2014 at 7:17 PM, Matei Zaharia matei.zaha...@gmail.com
 wrote:

 Are you launching this using our EC2 scripts? Or have you set up a
 cluster by hand?

 Matei

 On Jun 12, 2014, at 2:32 PM, Aliaksei Litouka 
 aliaksei.lito...@gmail.com wrote:

 spark-env.sh doesn't seem to contain any settings related to memory size
 :( I will continue searching for a solution and will post it if I find it :)
 Thank you anyway.


 On Wed, Jun 11, 2014 at 12:19 AM, Matei Zaharia matei.zaha...@gmail.com
  wrote:

 It might be that conf/spark-env.sh on EC2 is configured to set it to
 512, and is overriding the application’s settings. Take a look in there and
 delete that line if possible.
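
 A line of either of the following forms would do that (hypothetical examples
 of what to look for):

 export SPARK_MEM=512m
 export SPARK_EXECUTOR_MEMORY=512m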

 Matei

 On Jun 10, 2014, at 2:38 PM, Aliaksei Litouka 
 aliaksei.lito...@gmail.com wrote:

  I am testing my application in an EC2 cluster of m3.medium machines. By
 default, only 512 MB of memory on each machine is used. I want to increase
 this amount, and I'm trying to do so by passing the --executor-memory 2G
 option to the spark-submit script, but it doesn't seem to work: each machine
 uses only 512 MB instead of 2 gigabytes. What am I doing wrong? How do I
 increase the amount of memory?
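
 For reference, the invocation is of this shape (the class and jar names here
 are placeholders, not the real ones):

 ./bin/spark-submit --master $MASTER --executor-memory 2G --class com.example.MyApp my-app.jar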







