When creating UDP servers with asyncio's create_datagram_endpoint(), reuse_address defaults to True, resulting in a dangerous (and currently incorrectly documented) situation. I have proposed changing the default value, but understandably such a change to a core library function parameter is not to be taken lightly. Thus I am putting this up for discussion on the list.
As background, when creating TCP servers on UNIX-like systems, it is almost boilerplate to set SO_REUSEADDR on all server sockets to make sure that a restarting server can immediately bind the socket again. Without the SO_REUSEADDR sockopt, the kernel will hold the addr:port in a TIME_WAIT state for a while, preventing reuse. Thus, when creating TCP servers with loop.create_server(), the parameter reuse_address has a very reasonable default value of True.

However, things are very different in UDP-land. The kernel does not hold UDP server ports in a waiting state, so the SO_REUSEADDR sockopt was repurposed in Linux (and *BSD, AFAIK) to allow multiple processes to bind the SAME addr:port for a UDP server. The kernel will then feed incoming UDP packets to all such processes in a semi-fair round-robin manner. This is very useful in some scenarios; for example, I've used it myself in C++ projects to allow UDP servers to be scaled easily and rolling upgrades to be implemented without separate load-balancing. But for this to be the default behaviour is quite dangerous.

I discovered this default behaviour by accident: I had 2 separate Python programs (both doing SIP over UDP) mistakenly configured to use the same UDP port. The result was that my 2 processes were indeed "sharing the load": neither of them threw an exception at startup about the port being already in use, and both started getting ~half of the incoming packets.

So off to the docs I went, and discovered that the documentation for create_datagram_endpoint() does not mention this behaviour at all. Instead, it mistakenly refers to the TCP use of SO_REUSEADDR:

"reuse_address tells the kernel to reuse a local socket in TIME_WAIT state, without waiting for its natural timeout to expire. If not specified will automatically be set to True on Unix."
https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.create_datagram_endpoint

What makes this default especially dangerous is:

- Most people are not aware of this special mode that Linux allows for UDP sockets
- Even if it was documented to be the default, many people would miss it unless a big warning was slapped on the docs
- The problems are unlikely to appear in test scenarios and much more likely to pop up in production months or years after rolling out the code
- If you have never used it on purpose, it is very confusing to debug, causing you to doubt your own and the kernel's sanity
- The behaviour changes again if you happen to use a multicast address...

Thus, my proposal is to change the default value for UDP to False, or to deprecate the function and introduce a new one, as suggested by Yuri in my original bug report at:

https://bugs.python.org/issue37228
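
For anyone who wants to reproduce the behaviour, here is a minimal sketch using plain sockets on the loopback interface. The helper names (bind_udp, second_bind_succeeds, open_exclusive_endpoint) are made up for illustration and are not part of asyncio. The last helper shows one possible workaround under the current default: bind the socket yourself without SO_REUSEADDR and hand it to create_datagram_endpoint() through its sock argument. Treat this as a sketch of the idea, not a recommended implementation.

```python
import asyncio
import socket


def bind_udp(port: int, reuse: bool) -> socket.socket:
    """Bind a UDP socket on 127.0.0.1, optionally setting SO_REUSEADDR first."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    if reuse:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        sock.bind(("127.0.0.1", port))
    except OSError:
        sock.close()
        raise
    return sock


def second_bind_succeeds(reuse: bool) -> bool:
    """Try to bind two UDP sockets to the same addr:port; report whether
    the second bind went through."""
    first = bind_udp(0, reuse)  # port 0: let the kernel pick a free port
    port = first.getsockname()[1]
    try:
        second = bind_udp(port, reuse)
    except OSError:
        return False  # the usual "address already in use" error
    else:
        second.close()
        return True
    finally:
        first.close()


async def open_exclusive_endpoint(port: int):
    """Hypothetical workaround: pre-bind without SO_REUSEADDR and pass the
    socket to asyncio, so a second process cannot silently share the port."""
    sock = bind_udp(port, reuse=False)
    sock.setblocking(False)
    loop = asyncio.get_running_loop()
    # asyncio.DatagramProtocol is a no-op placeholder; use your own protocol.
    return await loop.create_datagram_endpoint(asyncio.DatagramProtocol, sock=sock)
```

Calling second_bind_succeeds(reuse=False) shows the behaviour most people expect (the second bind is refused). On Linux I would expect second_bind_succeeds(reuse=True) to report that both sockets bound the same port without any error, which is the situation described above; other platforms may differ.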