Re: How to set the degree of parallelism in Spark SQL?

2016-05-26 Thread Mich Talebzadeh
Also worth adding that in standalone mode there is only one executor per spark-submit job. In standalone cluster mode Spark allocates resources based on cores; by default, an application will grab all the cores in the cluster. You only have one worker, which lives within the driver JVM process.
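A minimal Scala sketch of reining that in (the application name is made up; spark.cores.max and spark.executor.cores are the standard properties that cap cores in standalone mode):

  // Sketch only: cap what the application grabs instead of letting it take every core.
  import org.apache.spark.{SparkConf, SparkContext}

  val conf = new SparkConf()
    .setAppName("parallelism-demo")     // hypothetical application name
    .set("spark.cores.max", "8")        // total cores this app may use cluster-wide
    .set("spark.executor.cores", "2")   // cores per executor
  val sc = new SparkContext(conf)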

Re: How to set the degree of parallelism in Spark SQL?

2016-05-26 Thread Ian
The number of executors is set when you launch the shell or an application with spark-submit; it's controlled by the --num-executors parameter: https://databaseline.wordpress.com/2016/03/12/an-overview-of-apache-streaming-technologies/. It is also important to note that cranking up the number may not help.
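A rough example of passing it at launch time (the class and jar names are placeholders, and --num-executors only takes effect when running on YARN):

  spark-submit \
    --master yarn \
    --num-executors 10 \
    --executor-cores 2 \
    --class com.example.MyApp \
    myapp.jar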

Re: How to set the degree of parallelism in Spark SQL?

2016-05-23 Thread Xinh Huynh
To the original question of parallelism and executors: you can have a parallelism of 200, even with 2 executors. In the Spark UI, you should see that the number of _tasks_ is 200 when your job involves shuffling. Executors vs. tasks: http://spark.apache.org/docs/latest/cluster-overview.html Xinh
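A small Scala shell sketch of that point (df here is a hypothetical DataFrame with a "key" column; sqlContext is the Spark 1.x shell's SQLContext):

  // Each shuffle stage is split into spark.sql.shuffle.partitions tasks,
  // no matter how few executors are available to run them.
  sqlContext.setConf("spark.sql.shuffle.partitions", "200")
  val counts = df.groupBy("key").count()   // groupBy forces a shuffle
  counts.rdd.partitions.length             // 200 partitions -> 200 tasks in that stage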

Re: How to set the degree of parallelism in Spark SQL?

2016-05-23 Thread Mathieu Longtin
Since the default is 200, I would guess you're only running 2 executors. Try to verify how many executors you are actually running with the web interface (port 8080 on the host where the master is running).
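Besides the web UI, a quick shell check of how many executors are actually registered (a sketch; getExecutorMemoryStatus also counts the driver, hence the -1):

  // One entry per block manager known to the driver, including the driver itself.
  val numExecutors = sc.getExecutorMemoryStatus.size - 1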

Re: How to set the degree of parallelism in Spark SQL?

2016-05-21 Thread Ted Yu
Looks like an equal sign is missing between partitions and 200.
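That is, the statement should read (same setting, with the equals sign added):

  sqlContext.sql("SET spark.sql.shuffle.partitions=200")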

How to set the degree of parallelism in Spark SQL?

2016-05-21 Thread SRK
Hi, How to set the degree of parallelism in Spark SQL? I am using the following but it somehow seems to allocate only two executors at a time. sqlContext.sql(" set spark.sql.shuffle.partitions 200 ") Thanks, Swetha
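For completeness, the same property can also be set outside SQL, programmatically or at submit time (a sketch; the flag form assumes you control the spark-submit command):

  // Programmatic equivalent of the SET statement (Spark 1.x SQLContext API):
  sqlContext.setConf("spark.sql.shuffle.partitions", "200")
  // Or at launch: spark-submit --conf spark.sql.shuffle.partitions=200 ...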