Hi,

I tried what you suggested and started the slave using the following
command:

start-slave.sh --cores 1 <master>

But it still starts as many pyspark.daemon processes as there are cores in
the node (1 parent and 3 workers). Limiting it via the spark-env.sh file by
setting SPARK_WORKER_CORES=1 didn't help either. When you said it worked for
you and limited it to 2 processes in your cluster, how many cores did each
machine have?

On Mon, Jul 4, 2016 at 8:22 PM, Mathieu Longtin <math...@closetwork.org>
wrote:

> It depends on what you want to do:
>
> If, on any given server, you don't want Spark to use more than one core,
> use this to start the workers: SPARK_HOME/sbin/start-slave.sh --cores=1
>
> If you have a bunch of servers dedicated to Spark, but you don't want a
> driver to use more than one core per server, then spark.executor.cores=1
> tells it not to use more than 1 core per server. However, it seems it will
> start as many pyspark processes as there are cores, but maybe not use them.
>
> On Mon, Jul 4, 2016 at 10:44 AM Ashwin Raaghav <ashraag...@gmail.com>
> wrote:
>
>> Hi Mathieu,
>>
>> Isn't that the same as setting "spark.executor.cores" to 1? And how can I
>> specify "--cores=1" from the application?
>>
>> On Mon, Jul 4, 2016 at 8:06 PM, Mathieu Longtin <math...@closetwork.org>
>> wrote:
>>
>>> When running the executor, pass --cores=1. We use this, and I only see
>>> 2 pyspark processes; one seems to be the parent of the other and is
>>> idle.
>>>
>>> In your case, are all the pyspark processes working?
>>>
>>> On Mon, Jul 4, 2016 at 3:15 AM ar7 <ashraag...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am currently using PySpark 1.6.1 on my cluster. When a pyspark
>>>> application is run, the load on the workers goes higher than what was
>>>> requested. When I ran top, I noticed there were too many
>>>> pyspark.daemon processes running. There was another mail thread about
>>>> the same issue:
>>>>
>>>> https://mail-archives.apache.org/mod_mbox/spark-user/201606.mbox/%3ccao429hvi3drc-ojemue3x4q1vdzt61htbyeacagtre9yrhs...@mail.gmail.com%3E
>>>>
>>>> I followed what was suggested there, i.e. reduced the number of
>>>> executor cores and the number of executors per node to 1, but the
>>>> number of pyspark.daemon processes still isn't coming down. It looks
>>>> like there is initially one pyspark.daemon process, which in turn
>>>> spawns as many pyspark.daemon processes as there are cores in the
>>>> machine.
>>>>
>>>> Any help is appreciated :)
>>>>
>>>> Thanks,
>>>> Ashwin Raaghav.
>>>>
>>> --
>>> Mathieu Longtin
>>> 1-514-803-8977
>>>
>>
>> --
>> Regards,
>>
>> Ashwin Raaghav
>>
> --
> Mathieu Longtin
> 1-514-803-8977
>

--
Regards,

Ashwin Raaghav
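P.S. To pull together the three approaches discussed in this thread, here is
a minimal sketch for a standalone cluster. <master-url> and my_app.py are
placeholders, not names from this thread:

    # 1. Cap the cores a worker offers to all applications, at start time:
    $SPARK_HOME/sbin/start-slave.sh --cores 1 spark://<master-url>:7077

    # 2. Or set the same worker-level limit in conf/spark-env.sh
    #    before starting the worker:
    export SPARK_WORKER_CORES=1

    # 3. Cap the cores a single application's executors claim per worker,
    #    set at submit time rather than in the cluster config:
    $SPARK_HOME/bin/spark-submit --conf spark.executor.cores=1 my_app.py

Note the difference: --cores and SPARK_WORKER_CORES limit what the worker
offers to every application, while spark.executor.cores only limits what one
application's executors claim on each worker.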