[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port
Jukka Väisänen added the comment: > Are you aware of what currently supported platforms have SO_REUSEPORT defined > where the *reuse_port* parameter doesn't actually work? This would be quite > helpful to know. This was reported by a partner that was working porting our code to Android but might be fixed current Android API levels. I cannot test this myself. > .. we don't directly support every single available platform, as making the > code specific to an OS incurs a long-term maintenance cost Fully agree, part of the reason why I was suggesting getting rid of reuse_address and reuse_port completely on this higher level API. But I'm happy with Antoine's proposal which effectively kills the reuse_address parameter, just suggesting we also kill reuse_port as well. -- ___ Python tracker <https://bugs.python.org/issue37228> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port
Jukka Väisänen added the comment: > A higher-level interface like asyncio doesn't necessarily want to map its > kwargs directly to non-portable low-level options like this. Also reuse_port has portability issues, the whole portability mess i s nicely summed up in: https://stackoverflow.com/questions/14388706/how-do-so-reuseaddr-and-so-reuseport-differ And the docs say that "If the SO_REUSEPORT constant is not defined then this capability is unsupported." which is true but the converse is not - some platforms apparently do have SO_REUSEPORT defined but the option still doesn't work, resulting in a ValueError exception from create_datagram_endpoint(). So to create truly portable code, the developer who really wants to use these options has to know differences between underlying OS and kernel versions, catch errors that shouldn't happen according to docs and then try again with different options. This is ok for lower level APIs like socket.socket where things map 1:1 to system calls but not a good thing on higher level APIs as you point out. In my C++ projects that use these, I just have a #ifdef block for every OS and don't even try to abstract it. -- ___ Python tracker <https://bugs.python.org/issue37228> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port
Jukka Väisänen added the comment: > We just chose an unfortunate default for one of them I'd like to point out that it is also documented completely wrong up to this point in time and thus people who chose True are most likely to be unaware of the actual consequences. A user's explicit choice based on misinformation is not really demonstrating intent to shoot oneself in the foot. So after changing my mind a couple of times I'm still attached to the idea that in 3.10 reuse_address and reuse_port are both removed and the tiny portion of developers who need them will just create a suitably setsockopt'd socket and pass it to the function. 2 lines of additional code is all it takes. If the documentation had been correct all along and just the default bad, this mess would be much smaller. -- ___ Python tracker <https://bugs.python.org/issue37228> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port
Jukka Väisänen added the comment: Going to SO_REUSEPORT will fix the security issue and emitting a deprecation warning for default value invocation will catch the eyes of some maintainers but it will not prevent what caused me to catch this issue in the first place - starting two processes (with same UID) accidentally listening on the same UDP port (in my case it was port 5060 as a default SIP port). Since there already is a reuse_port parameter to create_datagram_endpoint(), I assume the proposal is to set default value for reuse_addresss=False and reuse_port=True? But if reuse_address is explicitly set to True, are we going to just set SO_REUSEPORT instead and always leave SO_REUSEADDR unset? This would leave the reuse_address parameter completely useless and still allow accidental port reuse. What if I really do want SO_REUSEADDR? Ok I can create a socket separately, call setsockopt() on it and pass it as the sock parameter to create_datagram_endpoint(). Maybe I'm not fully grasping the proposal.. or maybe we should just deprecate reuse_port from both create_datagram_endpoint() and create_server() + reuse_addr from create_datagram_endpoint()? This would leave the TCP create_server() with the reuse_addr parameter, defaulting reasonably to True. To use TCP/UDP SO_REUSEPORT or UDP SO_REUSEADDR, the docs would tell you to bake your own socket with socket.socket(). Those few (like me) who really need the functionality can survive without all-in-one convenience functions on asyncio.loop -- ___ Python tracker <https://bugs.python.org/issue37228> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port
Jukka Väisänen added the comment: David, in terms of documentation changes and the emitted deprecation warning itself, I think it would be appropriate to instruct that please set the parameter explicitly to True or False to silence the warning AND point out that setting it to True has significant security and previously incorrectly documented functional implications. Now your updated docs and warning read more like we are working around a Linux security bug which is not really the case - this behavior was intentionally added to the kernels and some of the code I do for a living relies on it to work properly. Admittedly the restriction of having the same UID wouldn't hurt. And browsing again through the hits to my github searches, it makes me cringe how many people are already explicitly setting reuse_address=True in their code because the current documentation mistakenly makes it seem harmless and desirable. Makes me wonder if we need to put out a CVE? At the very least, I will be putting in PRs to the asyncio packages that I myself use and understand. -- ___ Python tracker <https://bugs.python.org/issue37228> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port
Jukka Väisänen added the comment: I had a quick search through github for calls to create_datagram_endpoint() and the reuse_address is either not set or set explicitly to True, probably due to the error in the documentation. Only in one case (of my admittedly small sample) did it seem like the parameter was set to True intentionally to get the behavior of binding multiple processes to the same port. Due to the nature of the function, it is also often buried inside packages implementing higher level protocols (SIP, STUN, DNS, RTP) which want to create a convenience function to create asyncio servers and clients. The deprecation warning and getting it into 3.6 and later sounds very reasonable to me. -- ___ Python tracker <https://bugs.python.org/issue37228> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port
Jukka Väisänen added the comment: Sure, I fully appreciate implications of changing default behaviour and will post on python-dev. -- ___ Python tracker <https://bugs.python.org/issue37228> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port
Jukka Väisänen added the comment: I definitely propose changing the default for UDP sockets. Having multiple processes binding to a the same port and load-balancing incoming UDP traffic intentionally is a rare scenario which should be supported but not the default. For TCP the default is reasonable because everyone creating server sockets will usually want to set it. -- ___ Python tracker <https://bugs.python.org/issue37228> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port
Jukka Väisänen added the comment: Can we still consider this for 3.9? I still think this is an easy way to accidentally blow your foot off with something that will probably only show up in production. -- versions: +Python 3.9 -Python 3.7 ___ Python tracker <https://bugs.python.org/issue37228> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port
New submission from Jukka Väisänen : When using loop.create_datagram_endpoint(), the default for reuse_address=True, which sets the SO_REUSEADDR sockopt for the UDP socket. This is a dangerous and unreasonable default for UDP, because in Linux it allows multiple processes to create listening sockets for the same UDP port and the kernel will randomly give incoming packets to these processes. I discovered this by accidentally starting two Python asyncio programs with the same UDP port and instead of getting an exception that the address is already in use, everything looked to be working except half my packets went to the wrong process. The documentation also refers to behavior with TCP sockets: https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.create_datagram_endpoint "reuse_address tells the kernel to reuse a local socket in TIME_WAIT state, without waiting for its natural timeout to expire. If not specified will automatically be set to True on Unix." This is true for TCP but not UDP. Either the documentation should reflect the current (dangerous) behavior or the behavior should be changed for UDP sockets. Workaround is of course to create your own socket without SO_REUSEADDR and pass it to create_datagram_endpoint(). -- components: asyncio messages: 345209 nosy: Jukka Väisänen, asvetlov, yselivanov priority: normal severity: normal status: open title: UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port type: behavior versions: Python 3.7 ___ Python tracker <https://bugs.python.org/issue37228> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com