>>> I am seeing this issue too with pyspark (using Spark 1.6.1). I have set
>>> spark.executor.cores to 1, but whenever a streaming batch starts
>>> processing data, I see the python -m pyspark.daemon processes increase
>>> gradually to about 5 (increasing CPU% on the box about 4-5x, with each
>>> pyspark.daemon taking up around 100% CPU).
>>>
>>> After the processing is done, 4 of the pyspark.daemon processes go away
>>> and we are left with one until the next batch run. Also, the CPU usage
>>> for the executor process sometimes spikes to about 800% even though
>>> spark.executor.cores is set to 1.

To my understanding, each Spark task consumes at most one Python process.
In this case (spark.executor.cores=1), there should be at most one Python
process per executor. Since you see 4 extra Python processes here, I
suspect there are at least 4 executors on this machine. Could you check
that?
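(A quick way to check is to count the executor JVMs and pyspark daemons on
the box. A minimal sketch, assuming a Linux worker in standalone mode with
pgrep on the PATH; the process patterns are illustrative:)

    import subprocess

    def count_procs(pattern):
        """Count processes whose full command line matches the pattern."""
        try:
            out = subprocess.check_output(["pgrep", "-c", "-f", pattern])
            return int(out.strip())
        except subprocess.CalledProcessError:
            return 0  # pgrep exits non-zero when nothing matches

    # Executor JVMs (standalone mode) and pyspark daemons on this box
    print("executors:", count_procs("CoarseGrainedExecutorBackend"))
    print("daemons:  ", count_procs("pyspark.daemon"))

(If this reports one executor while the daemon count still climbs past
spark.executor.cores, that would point at a single executor spawning the
extra daemons, as described above.)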
On Thu, Jun 16, 2016 at 6:50 AM, Sudhir Babu Pothineni <sbpothin...@gmail.com> wrote:

> Hi Ken, could it also be related to Grid Engine job scheduling? If it is
> a 16-core machine (virtual cores?), Grid Engine allocates 16 slots; if
> you use 'max' scheduling, it will send 16 processes sequentially to the
> same machine, and on top of that each Spark job has its own executors.
> Limiting the number of jobs scheduled to the machine to the number of
> physical cores of a single CPU will solve the problem if it is related
> to GE. If you are sure it's related to Spark, please ignore.
>
> -Sudhir
>
> Sent from my iPhone
>
> On Jun 15, 2016, at 8:53 AM, Gene Pang <gene.p...@gmail.com> wrote:
>
> As Sven mentioned, you can use Alluxio to store RDDs in off-heap memory,
> and you can then share that RDD across different jobs. If you would like
> to run Spark on Alluxio, this documentation can help:
> http://www.alluxio.org/documentation/master/en/Running-Spark-on-Alluxio.html
>
> Thanks,
> Gene
>
> On Tue, Jun 14, 2016 at 12:44 AM, agateaaa <agate...@gmail.com> wrote:
>
>> Hi,
>>
>> I am seeing this issue too with pyspark (using Spark 1.6.1). I have set
>> spark.executor.cores to 1, but whenever a streaming batch starts
>> processing data, I see the python -m pyspark.daemon processes increase
>> gradually to about 5 (increasing CPU% on the box about 4-5x, with each
>> pyspark.daemon taking up around 100% CPU).
>>
>> After the processing is done, 4 of the pyspark.daemon processes go away
>> and we are left with one until the next batch run. Also, the CPU usage
>> for the executor process sometimes spikes to about 800% even though
>> spark.executor.cores is set to 1.
>>
>> e.g. top output:
>>
>>   PID USER  PR NI    VIRT    RES   SHR S  %CPU %MEM   TIME+ COMMAND
>> 19634 spark 20  0 8871420 1.790g 32056 S 814.1  2.9 0:39.33 /usr/lib/j+ <-- EXECUTOR
>> 13897 spark 20  0   46576  17916  6720 S 100.0  0.0 0:00.17 python -m + <-- pyspark.daemon
>> 13991 spark 20  0   46524  15572  4124 S  98.0  0.0 0:08.18 python -m + <-- pyspark.daemon
>> 14488 spark 20  0   46524  15636  4188 S  98.0  0.0 0:07.25 python -m + <-- pyspark.daemon
>> 14514 spark 20  0   46524  15636  4188 S  94.0  0.0 0:06.72 python -m + <-- pyspark.daemon
>> 14526 spark 20  0   48200  17172  4092 S   0.0  0.0 0:00.38 python -m + <-- pyspark.daemon
>>
>> Is there any way to control the number of pyspark.daemon processes that
>> get spawned?
>>
>> Thank you
>> Agateaaa
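(For reference, the knobs discussed throughout this thread are ordinary
Spark configuration. A minimal sketch of setting them when building the
context, assuming Spark 1.6-era PySpark in standalone mode; the app name
and values are illustrative:)

    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .setAppName("streaming-test")                # illustrative name
            .set("spark.executor.cores", "1")            # worker threads per executor JVM
            .set("spark.cores.max", "16")                # cap total cores for the app (standalone)
            .set("spark.python.worker.memory", "512m"))  # per-daemon spill threshold, not a hard cap
    sc = SparkContext(conf=conf)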
>> On Sun, Mar 27, 2016 at 1:08 AM, Sven Krasser <kras...@gmail.com> wrote:
>>
>>> Hey Ken,
>>>
>>> 1. You're correct, cached RDDs live on the JVM heap. (There's an
>>> off-heap storage option using Alluxio, formerly Tachyon, with which I
>>> have no experience, however.)
>>>
>>> 2. Unfortunately, the worker memory setting is not a hard maximum.
>>> What happens is that during aggregation the Python daemon will check
>>> its process size, and if the size is larger than this setting, it will
>>> start spilling to disk. I've seen many occasions where my daemons grew
>>> larger. Also, you're relying on Python's memory management to free up
>>> space again once objects are evicted. In practice, leave this setting
>>> reasonably small but make sure there's enough free memory on the
>>> machine so you don't run into OOM conditions. If the lower memory
>>> setting causes strain for your users, make sure they increase the
>>> parallelism of their jobs (smaller partitions mean less data is
>>> processed at a time).
>>>
>>> 3. I believe that is the behavior you can expect when setting
>>> spark.executor.cores. I've not experimented much with it and haven't
>>> looked at that part of the code, but what you describe also reflects
>>> my understanding. Please share your findings here; I'm sure they will
>>> be very helpful to others, too.
>>>
>>> One more suggestion for your users is to move to the PySpark DataFrame
>>> API. Much of the processing will then happen in the JVM, and you will
>>> bump into fewer Python resource contention issues.
>>>
>>> Best,
>>> -Sven
>>>
>>> On Sat, Mar 26, 2016 at 1:38 PM, Carlile, Ken <carli...@janelia.hhmi.org> wrote:
>>>
>>>> This is extremely helpful!
>>>>
>>>> I'll have to talk to my users about how the Python memory limit
>>>> should be adjusted and what their expectations are. I'm fairly
>>>> certain we bumped it up in the dark past when jobs were failing
>>>> because of insufficient memory for the Python processes.
>>>>
>>>> So just to make sure I'm understanding correctly:
>>>>
>>>> - JVM memory (set by SPARK_EXECUTOR_MEMORY and/or
>>>>   SPARK_WORKER_MEMORY?) is where the RDDs are stored. Currently both
>>>>   of those values are set to 90GB.
>>>> - spark.python.worker.memory controls, roughly speaking, the maximum
>>>>   RAM each Python task can take. Currently set to 4GB.
>>>> - spark.task.cpus controls how many Java worker threads will exist
>>>>   and thus indirectly how many pyspark daemon processes will exist.
>>>>
>>>> I'm also looking into fixing my cron jobs so they don't stack up, by
>>>> implementing flock in the jobs and changing how teardowns of the
>>>> Spark cluster work as far as failed workers.
>>>>
>>>> Thanks again,
>>>> —Ken
>>>>
>>>> On Mar 26, 2016, at 4:08 PM, Sven Krasser <kras...@gmail.com> wrote:
>>>>
>>>> My understanding is that the spark.executor.cores setting controls
>>>> the number of worker threads in the executor JVM. Each worker thread
>>>> then communicates with a pyspark daemon process (these are processes,
>>>> not threads) to stream data into Python. There should be one daemon
>>>> process per worker thread (but, as I mentioned, I sometimes see a low
>>>> multiple).
>>>>
>>>> Your 4GB limit for Python is fairly high; that means even for 12
>>>> workers you're looking at a max of 48GB (and it frequently goes
>>>> beyond that). You will be better off using a lower number there and
>>>> instead increasing the parallelism of your job (i.e. dividing the job
>>>> into more and smaller partitions).
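(A sketch of what "more and smaller partitions" looks like in user code,
assuming an existing SparkContext sc; the input path and partition count
are illustrative:)

    # Each pyspark daemon aggregates one partition at a time, so smaller
    # partitions keep its footprint under spark.python.worker.memory.
    rdd = sc.textFile("hdfs:///some/input")   # illustrative path
    counts = (rdd.repartition(512)            # illustrative count: many small partitions
                 .flatMap(lambda line: line.split())
                 .map(lambda w: (w, 1))
                 .reduceByKey(lambda a, b: a + b))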
>>>> On Sat, Mar 26, 2016 at 7:10 AM, Carlile, Ken <carli...@janelia.hhmi.org> wrote:
>>>>
>>>>> Thanks, Sven!
>>>>>
>>>>> I know that I've messed up the memory allocation, but I'm trying not
>>>>> to think too much about that, because I've advertised it to my users
>>>>> as "90GB for Spark works!" and that's how it displays in the Spark
>>>>> UI (totally ignoring the Python processes). So I'll need to deal
>>>>> with that at some point... especially since I've set the max Python
>>>>> memory usage to 4GB to work around other issues!
>>>>>
>>>>> The load issue comes in because we have a lot of background cron
>>>>> jobs (mostly to clean up after Spark...), and those will stack up
>>>>> behind the high load and keep stacking until the whole thing comes
>>>>> crashing down. I will look into how to avoid this stacking, as I
>>>>> think one of my predecessors had a way, but that's why the high load
>>>>> nukes the nodes. I don't have spark.executor.cores set, but will
>>>>> setting that to, say, 12 limit the pyspark daemons, or will it just
>>>>> limit the JVM threads?
>>>>>
>>>>> Thanks!
>>>>> Ken
>>>>>
>>>>> On Mar 25, 2016, at 9:10 PM, Sven Krasser <kras...@gmail.com> wrote:
>>>>>
>>>>> Hey Ken,
>>>>>
>>>>> I also frequently see more pyspark daemons than the configured
>>>>> concurrency; often it's a low multiple. (There was an issue
>>>>> pre-1.3.0 that caused this to be quite a bit higher, so make sure
>>>>> you at least have a recent version; see SPARK-5395.)
>>>>>
>>>>> Each pyspark daemon tries to stay below the configured memory limit
>>>>> during aggregation (which is separate from the JVM heap, as you
>>>>> note). Since the number of daemons can be high and the memory limit
>>>>> is per daemon (each daemon is actually a process, not a thread, and
>>>>> therefore has its own memory that it tracks against the configured
>>>>> per-worker limit), I found memory depletion to be the main source of
>>>>> pyspark problems on larger data sets. Also, as Sea already noted,
>>>>> the memory limit is not firm, and individual daemons can grow
>>>>> larger.
>>>>>
>>>>> With that said, a run queue of 25 on a 16-core machine does not
>>>>> sound great, but also not awful enough to knock it offline. I
>>>>> suspect something else may be going on. If you want to limit the
>>>>> amount of work running concurrently, try reducing
>>>>> spark.executor.cores (under normal circumstances this would leave
>>>>> parts of your resources underutilized).
>>>>>
>>>>> Hope this helps!
>>>>> -Sven
>>>>>
>>>>> On Fri, Mar 25, 2016 at 10:41 AM, Carlile, Ken <carli...@janelia.hhmi.org> wrote:
>>>>>
>>>>>> Further data on this.
>>>>>>
>>>>>> I'm watching another job right now where there are 16
>>>>>> pyspark.daemon processes, all of which are trying to get a full
>>>>>> core (remember, this is a 16-core machine). Unfortunately, the Java
>>>>>> process actually running the Spark worker is trying to take several
>>>>>> cores of its own, driving the load up. I'm hoping someone has seen
>>>>>> something like this.
>>>>>>
>>>>>> —Ken
>>>>>>
>>>>>> On Mar 21, 2016, at 3:07 PM, Carlile, Ken <carli...@janelia.hhmi.org> wrote:
>>>>>>
>>>>>> No further input on this? I discovered today that the
>>>>>> pyspark.daemon count was actually 48, which makes a little more
>>>>>> sense (at least it's a multiple of 16), and it seems to be
>>>>>> happening at the reduce and collect portions of the code.
>>>>>>
>>>>>> —Ken
>>>>>>
>>>>>> On Mar 17, 2016, at 10:51 AM, Carlile, Ken <carli...@janelia.hhmi.org> wrote:
>>>>>>
>>>>>> Thanks! I found that part just after I sent the email... whoops.
>>>>>> I'm guessing that's not an issue for my users, since it's been set
>>>>>> that way for a couple of years now.
>>>>>>
>>>>>> The daemon count is definitely an issue, though, since if enough
>>>>>> nodes go down, they can't schedule their Spark clusters.
>>>>>>
>>>>>> —Ken
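(On the cron-stacking point above: a minimal sketch of the flock approach
Ken mentions, done here with Python's fcntl so that overlapping cleanup
jobs exit instead of piling up behind a loaded node; the lock path is
illustrative:)

    #!/usr/bin/env python
    import fcntl
    import sys

    lock = open("/var/lock/spark-cleanup.lock", "w")  # illustrative lock path
    try:
        # Non-blocking exclusive lock: if another run still holds it, bail out
        fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except IOError:
        sys.exit(0)  # a previous cron invocation is still running

    # ... cleanup work goes here ...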
>>>>> On Mar 17, 2016, at 10:50 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>>
>>>>> I took a look at docs/configuration.md. Though I didn't find an
>>>>> answer for your first question, I think the following pertains to
>>>>> your second question:
>>>>>
>>>>> <tr>
>>>>>   <td><code>spark.python.worker.memory</code></td>
>>>>>   <td>512m</td>
>>>>>   <td>
>>>>>     Amount of memory to use per python worker process during
>>>>>     aggregation, in the same format as JVM memory strings (e.g.
>>>>>     <code>512m</code>, <code>2g</code>). If the memory used during
>>>>>     aggregation goes above this amount, it will spill the data into
>>>>>     disks.
>>>>>   </td>
>>>>> </tr>
>>>>>
>>>>> On Thu, Mar 17, 2016 at 7:43 AM, Carlile, Ken <carli...@janelia.hhmi.org> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> We have an HPC cluster that we run Spark jobs on using standalone
>>>>>> mode, plus a number of scripts I've built up to dynamically
>>>>>> schedule and start Spark clusters within the Grid Engine framework.
>>>>>> Nodes in the cluster have 16 cores and 128GB of RAM.
>>>>>>
>>>>>> My users use pyspark heavily. We've been having a number of
>>>>>> problems with nodes going offline with extraordinarily high load. I
>>>>>> was able to look at one of those nodes today before it went truly
>>>>>> sideways, and I discovered that the user was running 50
>>>>>> pyspark.daemon processes (remember, this is a 16-core box), and the
>>>>>> load was somewhere around 25 or so, with all CPUs maxed out at
>>>>>> 100%.
>>>>>>
>>>>>> So while the Spark worker is aware it's only got 16 cores and
>>>>>> behaves accordingly, pyspark seems to be happy to overrun
>>>>>> everything like crazy. Is there a global parameter I can use to
>>>>>> limit pyspark daemons to a sane number, say 15 or 16? It would also
>>>>>> be interesting to set a memory limit, which leads to another
>>>>>> question.
>>>>>>
>>>>>> How is memory managed when pyspark is used? I have the Spark worker
>>>>>> memory set to 90GB, and there is 8GB of system overhead (GPFS
>>>>>> caching), so if pyspark operates outside of the JVM memory pool,
>>>>>> that leaves it at most 30GB to play with, assuming there is no
>>>>>> overhead outside the JVM's 90GB heap (ha ha).
>>>>>>
>>>>>> Thanks,
>>>>>> Ken Carlile
>>>>>> Sr. Unix Engineer
>>>>>> HHMI/Janelia Research Campus
>>>>>> 571-209-4363

--
Best Regards

Jeff Zhang