> This logic doesn't seem much different than would be for coroutines...
> Just need to wait for larger systems...

> With 100k threads started we're only using 8G memory, there are plenty of
> systems today with more than 80G of RAM

Well, either way, I think it's still a solid argument against imposing the
1M limit on coroutines. Arguing for or against 1M OS threads wasn't the
primary message I was trying to convey; the point was just to demonstrate
that 1M coroutines can be created and awaited concurrently on most current
systems (with roughly 1.3GB+ of available RAM and virtual memory).

But I think there's a reasonable question of practicality when it comes to
running 1M OS threads simultaneously. For especially high volumes of
concurrent tasks, OS threads are generally not the best solution (in
CPython, at least). They work well for handling a decent number of IO-bound
tasks, such as sending out and processing network requests, but coroutine
objects are significantly more efficient when it comes to memory usage.
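
To make the kind of workload I have in mind concrete, here's a minimal
sketch of fanning out a large number of IO-bound tasks as coroutine objects
(the fetch() helper, the host names, and the semaphore bound of 1,000 are
purely illustrative, not taken from any real code):

```
import asyncio

async def fetch(host: str, sem: asyncio.Semaphore) -> str:
    # Stand-in for real network IO (e.g. an HTTP request); the semaphore
    # caps how many requests are "in flight" at any one time.
    async with sem:
        await asyncio.sleep(0)
        return f"response from {host}"

async def main() -> None:
    sem = asyncio.Semaphore(1_000)
    # Each pending request is just a coroutine object, not an OS thread.
    coros = [fetch(f"host-{i}.example", sem) for i in range(100_000)]
    results = await asyncio.gather(*coros)
    print(f"handled {len(results)} requests")

if __name__ == "__main__":
    asyncio.run(main())
```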

For managing child processes, we have watcher implementations that don't
use OS threads at all, such as the recently added PidfdChildWatcher (
https://docs.python.org/3.9/library/asyncio-policy.html#asyncio.PidfdChildWatcher).
There are also others that don't spawn a new thread per process.
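
For instance, here's a minimal sketch of opting into it (this assumes
Python 3.9+ and a Linux kernel with pidfd support, i.e. 5.3 or newer):

```
import asyncio

async def main():
    # With PidfdChildWatcher installed, the child's exit notification is
    # delivered through a pidfd registered with the event loop, rather than
    # through a dedicated waiter thread per child process.
    proc = await asyncio.create_subprocess_exec("true")
    await proc.wait()
    print("exit code:", proc.returncode)

if __name__ == "__main__":
    asyncio.set_child_watcher(asyncio.PidfdChildWatcher())
    asyncio.run(main())
```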

That being said, you are correct that at some point, the memory usage of
running 1M simultaneous OS threads will be perfectly reasonable. I'm just
not sure there's a practical reason to do so, considering that more
memory-efficient means of handling concurrency are available when memory
usage becomes a significant concern.

Of course, the main question here is: "What benefit would imposing this
particular limit on either coroutine objects or OS threads provide?"

Personally, I'm not entirely convinced that placing a hard limit of 1M on
either would result in a significant benefit to performance, efficiency, or
security (the reasons given in the PEP for imposing the limits). I could
see it being more useful in other areas, though, such as lines of code or
bytecode instructions per object.

I just don't think that placing a limit of 1M on coroutine objects, as they
are currently implemented, would be reasonable. But between the two, a
limit of 1M on OS threads is the *more* reasonable one.


On Mon, Dec 9, 2019 at 10:17 PM Khazhismel Kumykov <kha...@gmail.com> wrote:

>
>
> On Mon, Dec 9, 2019, 18:48 Kyle Stanley <aeros...@gmail.com> wrote:
>
>> > (b) Why limit coroutines? It's just another Python object and has no
>> operating resources associated with it. Perhaps your definition of
>> coroutine is different, and you are thinking of OS threads?
>>
>> This was my primary concern with the proposed PEP. At the moment, it's
>> rather trivial to create one million coroutines, and the total memory taken
>> up by each individual coroutine object is very minimal compared to each OS
>> thread.
>>
>> There's also a practical use case for having a large number of coroutine
>> objects, such as for asynchronously:
>>
>> 1) Handling a large number of concurrent clients on a continuously
>> running web server that receives a significant amount of traffic.
>> 2) Sending a large number of concurrent database transactions to run on a
>> cluster of database servers.
>>
>> I don't know that anyone is currently using production code that results
>> in 1 million coroutine objects within the same interpreter at once, but
>> something like this definitely scales over time. Arbitrarily placing a
>> limit on the total number of coroutine objects doesn't make sense to me for
>> that reason.
>>
>> OS threads on the other hand take significantly more memory. From a
>> recent (but entirely unrelated) discussion where the memory usage of
>> threads was brought up, Victor Stinner wrote a program that demonstrated
>> that each OS thread takes up approximately 13.2 kB on Linux, which I
>> verified on kernel version 5.3.8. See https://bugs.python.org/msg356596.
>>
>> For comparison, I just wrote a similar program to compare the memory
>> usage between 1M threads and 1M coroutines:
>>
>> ```
>> import asyncio
>> import threading
>> import sys
>> import os
>>
>> def wait(event):
>>     event.wait()
>>
>> class Thread(threading.Thread):
>>     def __init__(self):
>>         super().__init__()
>>         self.stop_event = threading.Event()
>>         self.started_event = threading.Event()
>>
>>     def run(self):
>>         self.started_event.set()
>>         self.stop_event.wait()
>>
>>     def stop(self):
>>         self.stop_event.set()
>>         self.join()
>>
>> def display_rss():
>>     os.system(f"grep ^VmRSS /proc/{os.getpid()}/status")
>>
>> async def test_mem_coros(count):
>>     print("Coroutine memory usage before:")
>>     display_rss()
>>     coros = tuple(asyncio.sleep(0) for _ in range(count))
>>     print("Coroutine memory usage after creation:")
>>     display_rss()
>>     await asyncio.gather(*coros)
>>     print("Coroutine memory usage after awaiting:")
>>     display_rss()
>>
>> def test_mem_threads(count):
>>     print("Thread memory usage before:")
>>     display_rss()
>>     threads = tuple(Thread() for _ in range(count))
>>     print("Thread memory usage after creation:")
>>     display_rss()
>>     for thread in threads:
>>         thread.start()
>>     print("Thread memory usage after starting:")
>>     display_rss()
>>     for thread in threads:
>>         # Wait until each thread has actually entered run(); calling
>>         # run() from the main thread here would block on stop_event.
>>         thread.started_event.wait()
>>     print("Thread memory usage after running:")
>>     display_rss()
>>     for thread in threads:
>>         thread.stop()
>>     print("Thread memory usage after stopping:")
>>     display_rss()
>>
>> if __name__ == '__main__':
>>     count = 1_000_000
>>     arg = sys.argv[1]
>>     if arg == 'threads':
>>         test_mem_threads(count)
>>     if arg == 'coros':
>>         asyncio.run(test_mem_coros(count))
>>
>> ```
>> Here are the results:
>>
>> 1M coroutine objects:
>>
>> Coroutine memory usage before:
>> VmRSS:     14800 kB
>> Coroutine memory usage after creation:
>> VmRSS:    651916 kB
>> Coroutine memory usage after awaiting:
>> VmRSS:   1289528 kB
>>
>> 1M OS threads:
>>
>> Thread memory usage before:
>> VmRSS:     14816 kB
>> Thread memory usage after creation:
>> VmRSS:   4604356 kB
>> Traceback (most recent call last):
>>   File "temp.py", line 60, in <module>
>>     test_mem_threads(count)
>>   File "temp.py", line 44, in test_mem_threads
>>     thread.start()
>>   File "/usr/lib/python3.8/threading.py", line 852, in start
>>     _start_new_thread(self._bootstrap, ())
>> RuntimeError: can't start new thread
>>
>> (Python version: 3.8)
>> (Linux kernel version: 5.3.8)
>>
>> As shown in the results above, 1M OS threads can't even be run at
>> once, and the memory taken up just to create the 1M threads is ~3.6x more
>> than it costs to concurrently await the 1M coroutine objects. Based on
>> that, I think it would be reasonable to place a limit of 1M on the total
>> number of OS threads. It seems unlikely that a system would be able to
>> properly handle 1M threads at once anyway, whereas that seems entirely
>> feasible with 1M coroutine objects, especially on a high-traffic server.
>>
>
> This logic doesn't seem much different than would be for coroutines...
> Just need to wait for larger systems...
>
> With 100k threads started we're only using 8G memory, there are plenty of
> systems today with more than 80G of RAM
>
>
>> On Mon, Dec 9, 2019 at 12:01 PM Guido van Rossum <gu...@python.org>
>> wrote:
>>
>>> I want to question two specific limits.
>>>
>>> (a) Limiting the number of classes, in order to potentially save space
>>> in object headers, sounds like a big C API change, and I think it's better
>>> to lift this idea out of PEP 611 and debate the pros and cons separately.
>>>
>>> (b) Why limit coroutines? It's just another Python object and has no
>>> operating resources associated with it. Perhaps your definition of
>>> coroutine is different, and you are thinking of OS threads?
>>>
>>> --
>>> --Guido van Rossum (python.org/~guido)
>>> *Pronouns: he/him **(why is my pronoun here?)*
>>> <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
>
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FGTHP7ECCPHI74LV2OLJ72M42DDPJ7Q6/
Code of Conduct: http://python.org/psf/codeofconduct/