Call getNumPartitions() on your RDD to check that it has the right number of 
partitions. You can also specify the number explicitly when calling parallelize, e.g.

rdd = sc.parallelize(xrange(1000), 10)

This should run in parallel if you have multiple partitions and cores, though 
during parts of the job only one node (e.g. the master process) may be doing 
anything.
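
As a minimal sketch (assuming PySpark's standard RDD API, in Python 2 to match 
the code below), you can verify the count directly:

rdd = sc.parallelize(xrange(1000), 10)  # request 10 partitions explicitly
print rdd.getNumPartitions()  # should print 10

One more thing worth checking: configuration set in application code takes 
precedence over spark-submit flags, so constructing the context as 
SparkContext("local", ...) pins the job to single-threaded local mode even if 
you pass --master local[*]. Creating the context without a hard-coded master 
lets the command line decide:

sc = SparkContext(appName="Test App")  # master comes from spark-submit --master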

Matei


> On Nov 9, 2014, at 9:27 AM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
> 
> You can set the following entry inside the conf/spark-defaults.conf file 
> 
> spark.cores.max 16
> 
> If you want to read the default value, you can use the following API call:
> 
> sc.defaultParallelism
> 
> where sc is your SparkContext object.
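> 
> As a small sketch (assuming the standard PySpark SparkConf API; the fallback 
> string is just illustrative), you can also read the raw property from inside 
> the program:
> 
> conf = sc.getConf()  # a copy of the context's SparkConf
> print conf.get("spark.default.parallelism", "not set")  # raw property, may be unset
> print sc.defaultParallelism  # the resolved value Spark will actually use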
> 
> Thanks
> Best Regards
> 
> On Sun, Nov 9, 2014 at 6:48 PM, ReticulatedPython <person.of.b...@gmail.com> wrote:
> So, I'm running this simple program on a 16-core system. I run it by
> issuing the following:
> 
> spark-submit --master local[*] pi.py
> 
> And the code of that program is the following. When I use top to see CPU
> consumption, only one core is being utilized. Why is that? Secondly, the Spark
> documentation says that the default parallelism is contained in the property
> spark.default.parallelism. How can I read this property from within my
> python program?
> 
> #"""pi.py"""
> from pyspark import SparkContext
> import random
> 
> NUM_SAMPLES = 12500000
> 
> def sample(p):
>     x, y = random.random(), random.random()
>     return 1 if x*x + y*y < 1 else 0
> 
> sc = SparkContext("local", "Test App")
> count = sc.parallelize(xrange(0, NUM_SAMPLES)).map(sample).reduce(lambda a, b: a + b)
> print "Pi is roughly %f" % (4.0 * count / NUM_SAMPLES)
> 
> 
> 
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/Why-does-this-siimple-spark-program-uses-only-one-core-tp18434.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
> 
> 
