It's also worth adding that in standalone mode there is only one executor per
spark-submit job.
In standalone cluster mode, Spark allocates resources based on cores. By
default, an application will grab all the cores in the cluster.
You only have one worker that lives within the driver JVM process.
The number of executors is set when you launch the shell or an application
with spark-submit. It's controlled by the --num-executors parameter:
https://databaseline.wordpress.com/2016/03/12/an-overview-of-apache-streaming-technologies/.
It's also important to note that cranking up the number may not improve
performance if the cluster doesn't have the resources to match.
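To illustrate the difference (a sketch, not a definitive recipe): --num-executors applies on YARN, while in standalone mode the executor count falls out of the core settings, as described above. Something like the following, assuming a hypothetical standalone master URL:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch: in standalone mode the number of executors is derived from
// core settings rather than set directly. Capping the app at 8 cores
// with 2 cores per executor yields up to 4 executors.
val conf = new SparkConf()
  .setMaster("spark://master:7077")   // hypothetical master URL
  .setAppName("parallelism-demo")
  .set("spark.cores.max", "8")        // total cores the app may grab
  .set("spark.executor.cores", "2")   // cores per executor
val sc = new SparkContext(conf)
```

Without spark.cores.max, the default behavior mentioned above kicks in and the application grabs all available cores.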
To the original question of parallelism and executors: you can have a
parallelism of 200, even with 2 executors. In the Spark UI, you should see
that the number of _tasks_ is 200 when your job involves shuffling.
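A minimal sketch of that point, assuming a working SQLContext named sqlContext (the DataFrame below is just a hypothetical input): the shuffle stage is split into 200 tasks regardless of how many executors are available to run them.

```scala
// Sketch: tasks vs. executors. After a shuffle (e.g. a groupBy), the
// number of partitions -- and hence tasks -- follows
// spark.sql.shuffle.partitions, not the executor count.
sqlContext.setConf("spark.sql.shuffle.partitions", "200")

val df = sqlContext.range(0, 1000000)    // hypothetical input
val grouped = df.groupBy("id").count()   // triggers a shuffle
println(grouped.rdd.getNumPartitions)    // 200, matching the setting above
```

With only 2 executors, those 200 tasks simply queue up and run a few at a time.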
Executors vs. tasks:
http://spark.apache.org/docs/latest/cluster-overview.html
Xinh
Since the default is 200, I would guess you're only running 2 executors.
Try to verify how many executors you are actually running with the web
interface (port 8080, where the master is running).
On Sat, May 21, 2016 at 11:42 PM Ted Yu wrote:
> Looks like an equal sign is missing between partitions and 200.
Looks like an equal sign is missing between partitions and 200.
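For reference, a sketch of the corrected statement, plus the equivalent programmatic call:

```scala
// With the missing equals sign restored:
sqlContext.sql("SET spark.sql.shuffle.partitions=200")

// Or, equivalently, through the SQLContext API:
sqlContext.setConf("spark.sql.shuffle.partitions", "200")
```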
On Sat, May 21, 2016 at 8:31 PM, SRK wrote:
> Hi,
>
> How to set the degree of parallelism in Spark SQL? I am using the following
> but it somehow seems to allocate only two executors at a time.
>
>
Hi,
How to set the degree of parallelism in Spark SQL? I am using the following
but it somehow seems to allocate only two executors at a time.
sqlContext.sql(" set spark.sql.shuffle.partitions 200 ")
Thanks,
Swetha