This being a security issue, I think it's okay to break 3.6. We might even backport to 3.5 if it's easy?
On Dec 29, 2017 1:59 PM, "Christian Heimes" <christ...@python.org> wrote:

> Hi,
>
> tl;dr
> This mail is about internationalized domain names and TLS/SSL. It
> doesn't concern you if you live in ASCII-land. A couple of other
> developers and I would like to change the ssl module in a
> backwards-incompatible way to fix IDN support for TLS/SSL.
>
>
> Simply speaking, the IDNA standards (internationalized domain names for
> applications) describe how to encode non-ASCII domain names. The DNS
> system and X.509 certificates cannot handle non-ASCII host names. Any
> non-ASCII part of a hostname is punycode-encoded. For example, the host
> name 'www.bücher.de' (books) is translated into 'www.xn--bcher-kva.de'.
> In IDNA terms, 'www.bücher.de' is called an IDN U-label (Unicode) and
> 'www.xn--bcher-kva.de' an IDN A-label (ASCII). Please refer to the TR46
> document [1] for more information.
>
> In a perfect world, it would be very simple. We'd only have one IDNA
> standard. However, there are multiple standards that are incompatible
> with each other. The German TLD .de demands IDNA 2008 with UTS#46
> compatibility mapping. The hostname 'www.straße.de' maps to
> 'www.xn--strae-oqa.de'. However, in the older IDNA 2003 standard,
> 'www.straße.de' maps to 'www.strasse.de', but 'strasse.de' is a totally
> different domain!
>
>
> CPython only supports IDNA 2003.
>
> It's less of an issue for the socket module. It only converts text to
> IDNA bytes on the way in. All functions support bytes and text. Since
> IDNA encoding does not change ASCII and IDNA-encoded data is ASCII, it
> is also no problem to pass IDNA 2008-encoded text or bytes to all socket
> functions.
>
> Example:
>
> >>> import socket
> >>> import idna  # from PyPI
> >>> names = ['straße.de', b'strasse.de',
> ...          idna.encode('straße.de'),
> ...          idna.encode('straße.de').decode('ascii')]
> >>> for name in names:
> ...     print(name, socket.getaddrinfo(name, None, socket.AF_INET,
> ...           socket.SOCK_STREAM, 0, socket.AI_CANONNAME)[0][3:5])
> ...
> straße.de ('strasse.de', ('89.31.143.1', 0))
> b'strasse.de' ('strasse.de', ('89.31.143.1', 0))
> b'xn--strae-oqa.de' ('xn--strae-oqa.de', ('81.169.145.78', 0))
> xn--strae-oqa.de ('xn--strae-oqa.de', ('81.169.145.78', 0))
>
> As you can see, 'straße.de' is canonicalized as 'strasse.de'. The IDNA
> 2008 encoded hostname maps to a different IP address.
>
>
> On the other hand, the ssl module is currently completely broken for
> IDNs. It converts hostnames from bytes to text with the 'idna' codec in
> some places, but not in all. The SSLSocket.server_hostname attribute and
> the server name passed to the SSLContext.set_servername_callback()
> callback are decoded as U-labels. The certificate's common name and
> subject alternative name fields are not decoded and are therefore
> A-labels. They *must* stay A-labels because hostname verification is
> only defined in terms of A-labels. We even had a security issue once,
> because a partial wildcard like 'xn*.example.org' must not match IDN
> hosts like 'xn--bcher-kva.example.org'.
>
> In issue [2] and PR [3], we all agreed that the only sensible fix is to
> make 'SSLSocket.server_hostname' an ASCII text A-label. But this is a
> backwards-incompatible fix. On the other hand, IDNA is totally broken
> without the fix. Also, in my opinion, PR [3] does not go far enough.
> Since we have to break backwards compatibility anyway, I'd like to
> modify SSLContext.set_servername_callback() at the same time.
>
> Questions:
> - Is everybody OK with breaking backwards compatibility? The risk is
>   small. ASCII-only domains are not affected and IDNA users are broken
>   anyway.
> - Should I only fix 3.7 or should we consider a backport to 3.6, too?
>
> Regards,
> Christian
>
> [1] https://www.unicode.org/reports/tr46/
> [2] https://bugs.python.org/issue28414
> [3] https://github.com/python/cpython/pull/3010
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org
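
For readers who want to reproduce the U-label/A-label difference described in the quoted mail without network access or third-party packages, here is a minimal sketch using only the standard library. The helper names `to_alabel_idna2003` and `to_alabel_naive` are hypothetical and not part of any module. The second helper simply punycode-encodes each non-ASCII label as-is; it is an assumption-laden stand-in for IDNA 2008, with no validation and no UTS#46 mapping, and only serves to show why 'straße.de' can end up as two different ASCII names.

```python
def to_alabel_idna2003(hostname: str) -> str:
    """Encode with the stdlib 'idna' codec, which implements IDNA 2003.

    Its nameprep step maps 'ß' to 'ss' before encoding.
    """
    return hostname.encode("idna").decode("ascii")


def to_alabel_naive(hostname: str) -> str:
    """Hypothetical helper: punycode each non-ASCII label without mapping.

    This is NOT a real IDNA 2008 implementation (no validation, no
    UTS#46 compatibility mapping). It only illustrates that keeping 'ß'
    yields a different A-label than IDNA 2003 does.
    """
    encoded = []
    for label in hostname.split("."):
        if label.isascii():
            # ASCII labels pass through unchanged.
            encoded.append(label)
        else:
            # The ACE prefix 'xn--' marks a punycode-encoded label.
            encoded.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(encoded)


if __name__ == "__main__":
    for name in ("www.bücher.de", "straße.de"):
        print(name, to_alabel_idna2003(name), to_alabel_naive(name))
```

For 'www.bücher.de' both helpers agree, while for 'straße.de' they diverge, which is the incompatibility the mail describes. For real IDNA 2008 handling, the `idna` package from PyPI used in the quoted example is the more appropriate tool.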