Try to figure out what the env vars and arguments of the worker JVM and Python process are. Maybe you'll get a clue.
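For example, on a Linux worker node, something along these lines should show them (assuming the executor JVM shows up as CoarseGrainedExecutorBackend and the Python workers as pyspark.daemon, which is what I'd expect on a 1.6 standalone cluster; <pid> is whichever process id ps reports):

ps -ef | grep -E 'pyspark.daemon|CoarseGrainedExecutorBackend'
tr '\0' ' ' < /proc/<pid>/cmdline; echo                          # full command line of that process
tr '\0' '\n' < /proc/<pid>/environ | grep -iE 'spark|python'     # its environment variables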
On Mon, Jul 4, 2016 at 11:42 AM Mathieu Longtin <math...@closetwork.org> wrote:

> I started with a download of 1.6.0. These days, we use a self-compiled 1.6.2.
>
> On Mon, Jul 4, 2016 at 11:39 AM Ashwin Raaghav <ashraag...@gmail.com> wrote:
>
>> I am thinking of any possibilities as to why this could be happening. If the cores are multi-threaded, should that affect the daemons? Was your Spark built from source code or downloaded as a binary? Though that should not technically change anything.
>>
>> On Mon, Jul 4, 2016 at 9:03 PM, Mathieu Longtin <math...@closetwork.org> wrote:
>>
>>> 1.6.1.
>>>
>>> I have no idea. SPARK_WORKER_CORES should do the same.
>>>
>>> On Mon, Jul 4, 2016 at 11:24 AM Ashwin Raaghav <ashraag...@gmail.com> wrote:
>>>
>>>> Which version of Spark are you using? 1.6.1?
>>>>
>>>> Any ideas as to why it is not working in ours?
>>>>
>>>> On Mon, Jul 4, 2016 at 8:51 PM, Mathieu Longtin <math...@closetwork.org> wrote:
>>>>
>>>>> 16.
>>>>>
>>>>> On Mon, Jul 4, 2016 at 11:16 AM Ashwin Raaghav <ashraag...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I tried what you suggested and started the slave using the following command:
>>>>>>
>>>>>> start-slave.sh --cores 1 <master>
>>>>>>
>>>>>> But it still seems to start as many pyspark daemons as there are cores in the node (1 parent and 3 workers). Limiting it via the spark-env.sh file by setting SPARK_WORKER_CORES=1 also didn't help.
>>>>>>
>>>>>> When you said it helped you and limited it to 2 processes in your cluster, how many cores did each machine have?
>>>>>>
>>>>>> On Mon, Jul 4, 2016 at 8:22 PM, Mathieu Longtin <math...@closetwork.org> wrote:
>>>>>>
>>>>>>> It depends on what you want to do:
>>>>>>>
>>>>>>> If, on any given server, you don't want Spark to use more than one core, use this to start the workers: SPARK_HOME/sbin/start-slave.sh --cores=1
>>>>>>>
>>>>>>> If you have a bunch of servers dedicated to Spark, but you don't want a driver to use more than one core per server, then spark.executor.cores=1 tells it not to use more than 1 core per server. However, it seems it will start as many pyspark daemons as there are cores, but maybe not use them.
>>>>>>>
>>>>>>> On Mon, Jul 4, 2016 at 10:44 AM Ashwin Raaghav <ashraag...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Mathieu,
>>>>>>>>
>>>>>>>> Isn't that the same as setting "spark.executor.cores" to 1? And how can I specify "--cores=1" from the application?
>>>>>>>>
>>>>>>>> On Mon, Jul 4, 2016 at 8:06 PM, Mathieu Longtin <math...@closetwork.org> wrote:
>>>>>>>>
>>>>>>>>> When running the executor, put --cores=1. We use this and I only see 2 pyspark processes; one seems to be the parent of the other and is idle.
>>>>>>>>>
>>>>>>>>> In your case, are all the pyspark processes working?
>>>>>>>>>
>>>>>>>>> On Mon, Jul 4, 2016 at 3:15 AM ar7 <ashraag...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I am currently using PySpark 1.6.1 in my cluster. When a pyspark application is run, the load on the workers seems to go higher than what was given. When I ran top, I noticed that there were too many pyspark.daemons processes running.
>>>>>>>>>> There was another mail thread regarding the same:
>>>>>>>>>>
>>>>>>>>>> https://mail-archives.apache.org/mod_mbox/spark-user/201606.mbox/%3ccao429hvi3drc-ojemue3x4q1vdzt61htbyeacagtre9yrhs...@mail.gmail.com%3E
>>>>>>>>>>
>>>>>>>>>> I followed what was mentioned there, i.e. reduced the number of executor cores and the number of executors on one node to 1. But the number of pyspark.daemons processes is still not coming down. It looks like initially there is one pyspark.daemons process and this in turn spawns as many pyspark.daemons processes as there are cores in the machine.
>>>>>>>>>>
>>>>>>>>>> Any help is appreciated :)
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Ashwin Raaghav.

--
Mathieu Longtin
1-514-803-8977
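P.S. For the archives, a minimal sketch of the three knobs discussed in this thread, for a 1.6.x standalone cluster (the master URL and application file are placeholders):

# conf/spark-env.sh on each worker: cap the cores the worker advertises
export SPARK_WORKER_CORES=1

# or pass it when starting the worker by hand
$SPARK_HOME/sbin/start-slave.sh --cores 1 spark://<master-host>:7077

# or per application, so each executor uses at most one core per server
spark-submit --conf spark.executor.cores=1 your_app.py

As noted above, even with spark.executor.cores=1 you may still see one pyspark.daemon per core in top; they just sit idle.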