Re: worker_instances vs worker_cores

2014-10-20 Thread Andrew Ash
Hi Anny,

SPARK_WORKER_INSTANCES is the number of copies of the Spark worker
running on a single box.  Changing it changes how your existing hardware
is split up (useful for breaking large servers into multiple workers with
~32GB heaps each, which perform better), but it doesn't change the amount
of hardware you have.  Because the hardware is the same, you won't see
huge performance improvements unless you were previously in the huge-heap
scenario.

Typically you should configure the parameters so that SPARK_WORKER_CORES *
SPARK_WORKER_INSTANCES equals the number of cores on your machine.  If you
have an 8-core box, then you should lower SPARK_WORKER_CORES as you raise
SPARK_WORKER_INSTANCES, as in the sketch below.
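
As a sketch in conf/spark-env.sh, assuming a hypothetical 8-core, 64GB box
(the values are illustrative; adjust them for your hardware):

  # Two workers of 4 cores each: 2 * 4 = 8, matching the physical cores.
  SPARK_WORKER_INSTANCES=2
  SPARK_WORKER_CORES=4
  # Heap per worker instance; 2 * 28g leaves headroom for the OS.
  SPARK_WORKER_MEMORY=28g

Restart the workers after editing spark-env.sh so the new values take effect.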

Cheers!
Andrew

On Mon, Oct 20, 2014 at 3:21 PM, anny9699 anny9...@gmail.com wrote:

 Hi,

 I have a question about the worker_instances and worker_cores settings in
 an AWS EC2 cluster. The default setting in the cluster is

 SPARK_WORKER_CORES = 8
 SPARK_WORKER_INSTANCES = 1

 However after I changed it to

 SPARK_WORKER_CORES = 8
 SPARK_WORKER_INSTANCES = 8

 It seems the speed doesn't change very much. Could anyone explain this?
 Maybe with more details about worker_cores vs worker_instances?

 Thanks a lot!
 Anny








Re: worker_instances vs worker_cores

2014-10-20 Thread Anny Chen
Thanks a lot Andrew! Yeah I actually realized that later. I made a silly
mistake here.

