On Wednesday, 28 April 2021 06:52:30 BST Glyph wrote:
> 
> > On Apr 27, 2021, at 8:58 PM, Wim Lewis <w...@hhhh.org> wrote:
> > 
> > On Thursday, April 8, 2021 8:43:35 AM PDT, Barry Scott wrote:
> >> We just added a patch to our twisted to prevent twisted from doing idna 
> >> validation.
> >> _idnaBytes and _idnaText now convert from bytes to unicode based on the 
> >> type of
> >> the provided arg.
> >> 
> >> We had to do this because there are domain names that youtube.com uses 
> >> that are
> >> not valid under IDNA-2008 
> >> https://tools.ietf.org/html/rfc5891#section-4.2.3.1
> > 
> > My reading of the RFC is that the YouTube domain you mention 
> > (r2---sn-aigzrn7e.googlevideo.com) is an invalid "U-Label", but that 
> > doesn't mean it's an entirely invalid domain label. It just means you can't 
> > legally run it through IDNA and turn it into "xn--r2---sn-aigzrn7e-". The 
> > intent, as I understand it, is to forbid any possibility of double-encoding 
> > or double-decoding a label, not to forbid the possibility of using labels 
> > like the one you mention.
> 
> I agree with this reading.
> 
> >> I can see why a UI would need to do IDNA-2008 converts and validation
> >> but I'm not clear why it's of value deep in the guts of twisted.
> > 
> > My guess is that this is just an accident of the way that the 
> > bytes/characters distinction and the IDNA features were added to Twisted, 
> > and is probably a bug.
> 
> +1.
> 
> We also have other issues with the Python IDNA library: 
> https://github.com/kjd/idna/issues/18 
> and would generally like to reduce our strictness via whatever mechanisms we 
> can, even for things that genuinely require it (which this does not).
> 
> >> Why is this code needed at all in twisted?
> >> If it's for a high level API then why isn't it being called at the
> >> edge of the high level API calls?
> > 
> > I'd argue that resolving URLs is in fact a high level API (from the point 
> > of view of the name resolution system) but even so, it seems to me that 
> > Twisted is doing the wrong thing here. The format of that label should 
> > prevent it from ever being transformed by IDNA, but shouldn't prevent it 
> > from being passed through unchanged, since it doesn't contain any 
> > codepoints outside of the usual ASCII range.
> 
> Also agreed with all of this.
> 
> >> The key idea here is that it's human input that will be converted.
> >> But the code is used deep in the _sslverify.py where no human
> >> input is entered.
> > 
> > _sslverify has to check whether the information in the server's certificate 
> > matches the URL that the user supplied. Certificates can contain Unicode 
> > text — at least in the (completely obsolete) CN-as-domain-name situation — 
> > so _sslverify probably picked up the requirement for IDNA transformations 
> > from that. (I don't remember whether dNSName SANs can contain unicode.)
> 
> Yep.
> 
> > What is the patch you decided to add to your version? Where in _sslverify 
> > did the problem surface?

It surfaced when _idnaBytes was called from ClientTLSOptions.__init__ and
raised an exception.
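
For reference, the rule in RFC 5891 section 4.2.3.1 that this hostname trips
over is that a U-label must not have "--" in its third and fourth character
positions. The sketch below only illustrates that one rule. The helper names
are made up for this example, and this is not how the idna library implements
its checks.

```python
def has_reserved_hyphens(label):
    """Return True if the label has "--" in the third and fourth positions.

    Hypothetical helper: it mirrors only the hyphen restriction in
    RFC 5891 section 4.2.3.1 and performs no other IDNA validation.
    Note that A-labels ("xn--...") have hyphens there by design, so this
    check is only meaningful for labels about to be IDNA-encoded.
    """
    return label[2:4] == "--"


def offending_labels(hostname):
    """List the dot-separated labels of hostname that break the hyphen rule."""
    return [label for label in hostname.split(".") if has_reserved_hyphens(label)]


# The hostname from this thread: only the first label is affected.
print(offending_labels("r2---sn-aigzrn7e.googlevideo.com"))
```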

> I am also very curious about this :).

Attached is the patch we are using. We are still on Twisted 19.7.0, for sad reasons.
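
For anyone who would rather not read the diff, here is a rough standalone
sketch of what the two patched helpers amount to. The function names are
invented for this example (the real ones are _idnaBytes and _idnaText), and
isinstance() is used here where the patch compares types directly. The main
trade-off is that non-ASCII hostnames are no longer encoded at all and raise
an error instead.

```python
def idna_bytes_passthrough(text):
    """Return a hostname as bytes without doing any IDNA validation.

    Hypothetical stand-in for the patched _idnaBytes.
    """
    if isinstance(text, bytes):
        return text
    # ASCII-only: a non-ASCII hostname raises UnicodeEncodeError here
    # instead of being IDNA-encoded.
    return text.encode("ascii")


def idna_text_passthrough(octets):
    """Return a hostname as text without doing any IDNA decoding.

    Hypothetical stand-in for the patched _idnaText.
    """
    if isinstance(octets, bytes):
        # ASCII-only: non-ASCII bytes raise UnicodeDecodeError here.
        return octets.decode("ascii")
    return octets


# The hostname from this thread passes straight through in both directions.
host = "r2---sn-aigzrn7e.googlevideo.com"
assert idna_bytes_passthrough(host) == host.encode("ascii")
assert idna_text_passthrough(host.encode("ascii")) == host
```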

Barry
Only in tmp2: twisted-remove-idna-checks.patch
diff -r -u tmp1/twisted-twisted-19.7.0/src/twisted/internet/_idna.py tmp2/twisted-twisted-19.7.0/src/twisted/internet/_idna.py
--- tmp1/twisted-twisted-19.7.0/src/twisted/internet/_idna.py	2019-07-28 10:17:29.000000000 +0100
+++ tmp2/twisted-twisted-19.7.0/src/twisted/internet/_idna.py	2021-04-08 14:38:04.449987636 +0100
@@ -22,14 +22,10 @@
     @return: The domain name's IDNA representation, encoded as bytes.
     @rtype: L{bytes}
     """
-    try:
-        import idna
-    except ImportError:
-        return text.encode("idna")
+    if type(text) == bytes:
+        return text
     else:
-        return idna.encode(text)
-
-
+        return text.encode('ascii')
 
 def _idnaText(octets):
     """
@@ -43,12 +39,10 @@
     @return: A human-readable domain name.
     @rtype: L{unicode}
     """
-    try:
-        import idna
-    except ImportError:
-        return octets.decode("idna")
+    if type(octets) == bytes:
+        return octets.decode('ascii') 
     else:
-        return idna.decode(octets)
+        return octets
 
 
 
_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
https://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
