Re: How to specify executor memory in EC2 ?
Aaron, spark.executor.memory is set to 2454m in my spark-defaults.conf, which is a reasonable value for the EC2 instances I use (they are m3.medium machines). However, it doesn't help: each executor still uses only 512 MB of memory.

To figure out why, I examined the spark-submit and spark-class scripts and found that the Java options and the Java heap size are computed in the spark-class script (see the OUR_JAVA_OPTS and OUR_JAVA_MEM variables in that script). These values are then used to compose the following string:

    JAVA_OPTS="$JAVA_OPTS -Xms$OUR_JAVA_MEM -Xmx$OUR_JAVA_MEM"

Note that OUR_JAVA_MEM is appended to the end of the string. For some reason which I haven't found yet, OUR_JAVA_MEM is set to its default value of 512 MB. I was able to fix it only by setting the SPARK_MEM variable in the spark-env.sh file:

    export SPARK_MEM=2g

However, this variable is deprecated, so my solution doesn't seem to be a good one :)

On Thu, Jun 12, 2014 at 10:16 PM, Aaron Davidson ilike...@gmail.com wrote:

The scripts for Spark 1.0 actually specify this property in /root/spark/conf/spark-defaults.conf. I didn't know that this would override the --executor-memory flag, though; that's pretty odd.

On Thu, Jun 12, 2014 at 6:02 PM, Aliaksei Litouka aliaksei.lito...@gmail.com wrote:

Yes, I am launching a cluster with the spark_ec2 script.
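The fallback Aliaksei describes can be sketched like this. The variable names match the Spark 1.0-era bin/spark-class script, but the condensed fallback chain below is a reading of that script's behavior, not the script verbatim:

```shell
#!/usr/bin/env bash
# Sketch (not the real spark-class) of why only SPARK_MEM changes the heap:
# OUR_JAVA_MEM falls back to SPARK_MEM, then to a hard-coded 512m default.
# spark.executor.memory from spark-defaults.conf never reaches this code
# path, which is why setting it in the file alone has no visible effect.

SPARK_MEM="2g"                      # what the spark-env.sh workaround exports

OUR_JAVA_MEM="${SPARK_MEM:-512m}"   # stays 512m whenever SPARK_MEM is unset

# The -Xms/-Xmx flags are appended at the END of JAVA_OPTS, so they win
# over any heap flags that appeared earlier in the options string.
JAVA_OPTS="$JAVA_OPTS -Xms$OUR_JAVA_MEM -Xmx$OUR_JAVA_MEM"

echo "$JAVA_OPTS"
```

Commenting out the SPARK_MEM line and re-running the sketch shows the 512m default reappearing, which matches the behavior observed on the cluster.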
I checked /root/spark/conf/spark-env.sh on the master node and on the slaves, and it looks like this:

    #!/usr/bin/env bash
    export SPARK_LOCAL_DIRS=/mnt/spark
    # Standalone cluster options
    export SPARK_MASTER_OPTS=
    export SPARK_WORKER_INSTANCES=1
    export SPARK_WORKER_CORES=1
    export HADOOP_HOME=/root/ephemeral-hdfs
    export SPARK_MASTER_IP=ec2-54-89-95-238.compute-1.amazonaws.com
    export MASTER=`cat /root/spark-ec2/cluster-url`
    export SPARK_SUBMIT_LIBRARY_PATH=$SPARK_SUBMIT_LIBRARY_PATH:/root/ephemeral-hdfs/lib/native/
    export SPARK_SUBMIT_CLASSPATH=$SPARK_CLASSPATH:$SPARK_SUBMIT_CLASSPATH:/root/ephemeral-hdfs/conf
    # Bind Spark's web UIs to this machine's public EC2 hostname:
    export SPARK_PUBLIC_DNS=`wget -q -O - http://169.254.169.254/latest/meta-data/public-hostname`
    # Set a high ulimit for large shuffles
    ulimit -n 100

None of these variables seem to be related to memory size. Let me know if I am missing something.

On Thu, Jun 12, 2014 at 7:17 PM, Matei Zaharia matei.zaha...@gmail.com wrote:

Are you launching this using our EC2 scripts? Or have you set up a cluster by hand?

Matei

On Jun 12, 2014, at 2:32 PM, Aliaksei Litouka aliaksei.lito...@gmail.com wrote:

spark-env.sh doesn't seem to contain any settings related to memory size :( I will continue searching for a solution and will post it if I find it :) Thank you, anyway.

On Wed, Jun 11, 2014 at 12:19 AM, Matei Zaharia matei.zaha...@gmail.com wrote:

It might be that conf/spark-env.sh on EC2 is configured to set it to 512, and is overriding the application's settings. Take a look in there and delete that line if possible.

Matei

On Jun 10, 2014, at 2:38 PM, Aliaksei Litouka aliaksei.lito...@gmail.com wrote:

I am testing my application in an EC2 cluster of m3.medium machines. By default, only 512 MB of memory on each machine is used.
I want to increase this amount, and I'm trying to do so by passing the --executor-memory 2G option to the spark-submit script, but it doesn't seem to work: each machine still uses only 512 MB instead of 2 GB. What am I doing wrong? How do I increase the amount of memory?
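For completeness: once nothing in the launch scripts overrides it, the non-deprecated way to request larger executors is the spark.executor.memory property itself (or the equivalent --executor-memory flag on spark-submit). A minimal spark-defaults.conf fragment, assuming the same 2 GB target discussed in this thread:

```
# /root/spark/conf/spark-defaults.conf (fragment)
# Per-executor JVM heap; equivalent to `spark-submit --executor-memory 2g`.
spark.executor.memory   2g
```

Command-line flags passed to spark-submit normally take precedence over spark-defaults.conf, which is why Aaron found the behavior reported above surprising.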