When running the executor, pass --cores=1. We use this, and I only see two pyspark processes; one seems to be the parent of the other and is idle. In your case, are all pyspark processes working?
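If it helps, the same limit can also be set from the application side rather than on the worker command line. Below is a minimal sketch (not from this thread; the app name and master URL are placeholders) that caps the cores with the standard spark.executor.cores and spark.cores.max properties on a standalone cluster, which is roughly the spark-submit equivalent of --conf spark.executor.cores=1:

    # Minimal sketch: cap the cores handed to each executor so pyspark.daemon
    # only forks one worker process instead of one per machine core.
    # "limit-pyspark-daemons" and the master URL are placeholders.
    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .setAppName("limit-pyspark-daemons")
            .setMaster("spark://master-host:7077")
            .set("spark.executor.cores", "1")   # cores per executor
            .set("spark.cores.max", "1"))       # total cores for the app (standalone)

    sc = SparkContext(conf=conf)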
On Mon, Jul 4, 2016 at 3:15 AM ar7 <ashraag...@gmail.com> wrote:

> Hi,
>
> I am currently using PySpark 1.6.1 in my cluster. When a pyspark
> application is run, the load on the workers seems to go higher than what
> was allocated. When I ran top, I noticed that there were too many
> pyspark.daemon processes running. There was another mail thread
> regarding the same:
>
> https://mail-archives.apache.org/mod_mbox/spark-user/201606.mbox/%3ccao429hvi3drc-ojemue3x4q1vdzt61htbyeacagtre9yrhs...@mail.gmail.com%3E
>
> I followed what was mentioned there, i.e. reduced the number of executor
> cores and the number of executors on one node to 1. But the number of
> pyspark.daemon processes is still not coming down. It looks like there is
> initially one pyspark.daemon process, and this in turn spawns as many
> pyspark.daemon processes as the number of cores in the machine.
>
> Any help is appreciated :)
>
> Thanks,
> Ashwin Raaghav.
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Limiting-Pyspark-daemons-tp27272.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org

--
Mathieu Longtin
1-514-803-8977