Re: [Freeipa-devel] [PATCH 0070] Normalization check only for IDNA domains

Martin Basti Mon, 30 Jun 2014 04:50:26 -0700

On Mon, 2014-06-30 at 11:43 +0300, Alexander Bokovoy wrote:
> On Mon, 30 Jun 2014, Martin Basti wrote:
> >On Fri, 2014-06-27 at 14:03 +0300, Alexander Bokovoy wrote:
> >> On Fri, 27 Jun 2014, Martin Kosek wrote:
> >> >On 06/27/2014 12:10 PM, Alexander Bokovoy wrote:
> >> >> On Fri, 27 Jun 2014, Petr Spacek wrote:
> >> >>> On 27.6.2014 11:21, Jan Cholasta wrote:
> >> >>>> On 27.6.2014 10:58, Alexander Bokovoy wrote:
> >> >>>>> On Fri, 27 Jun 2014, Jan Cholasta wrote:
> >> >>>>>> On 27.6.2014 10:29, Alexander Bokovoy wrote:
> >> >>>>>>> On Fri, 27 Jun 2014, Jan Cholasta wrote:
> >> >>>>>>>> On 27.6.2014 10:15, Alexander Bokovoy wrote:
> >> >>>>>>>>> On Fri, 20 Jun 2014, Martin Basti wrote:
> >> >>>>>>>>>> On Fri, 2014-06-20 at 10:32 +0200, Jan Cholasta wrote:
> >> >>>>>>>>>>> On 18.6.2014 16:49, Martin Basti wrote:
> >> >>>>>>>>>>>> Due to compability with older versions, only IDNA domains 
> >> >>>>>>>>>>>> should be
> >> >>>>>>>>>>>> checked
> >> >>>>>>>>>>>> Patch attached.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> I'm not particularly happy about the u'\xdf' special case. 
> >> >>>>>>>>>>> Isn't
> >> >>>>>>>>>>> there a
> >> >>>>>>>>>>> better way to do this check?
> >> >>>>>>>>>> I cant find better way. u'\xdf' is mapped to ss, and ss is not 
> >> >>>>>>>>>> IDN
> >> >>>>>>>>>> string.
> >> >>>>>>>>>>
> >> >>>>>>>>>> Or just remove this validation.
> >> >>>>>>>>>>
> >> >>>>>>>>>>> (BTW I really think this should be a warning, not an error, 
> >> >>>>>>>>>>> but that
> >> >>>>>>>>>>> would require larger amount of work, so I guess it's OK for 
> >> >>>>>>>>>>> now.)
> >> >>>>>>>>>> (More pain than gain)
> >> >>>>>>>>> Main thing in this patch is that the check should not be done 
> >> >>>>>>>>> against
> >> >>>>>>>>> non-IDN strings. I want this version of the patch to go in for 
> >> >>>>>>>>> that
> >> >>>>>>>>> reason as currently you cannot even complete ipa-adtrust-install
> >> >>>>>>>>> run due
> >> >>>>>>>>> to IDN normalisation check being applied to non-IDN domains.
> >> >>>>>>>>
> >> >>>>>>>> On non-IDN domains, the only effect of IDN normalization is that 
> >> >>>>>>>> it
> >> >>>>>>>> lower-cases the names (right?), so the check should compare
> >> >>>>>>>> lower-cased original name with the normalized name, instead of
> >> >>>>>>>> special-casing certain characters etc.
> >> >>>>>>> .. what's the reason to do such comparison then? lower-cased 
> >> >>>>>>> non-IDN
> >> >>>>>>> name will be equal to lower-cased normalized non-IDN name by 
> >> >>>>>>> definition,
> >> >>>>>>> so the check is not needed in this case, at all.
> >> >>>>>>
> >> >>>>>> The point is that it works for both IDN and non-IDN, without
> >> >>>>>> u'\xdf'-style hacks.
> >> >>>>> No, your proposal of comparing low-cased value and normalized value 
> >> >>>>> is
> >> >>>>> not going to work because low-cased value is in general not equal to
> >> >>>>> normalized value for IDN names, only for non-IDN ones, due to the 
> >> >>>>> fact
> >> >>>>> that lower case for non-ASCII Unicode character may map to a 
> >> >>>>> completely
> >> >>>>> different character than in normalization situation. Take, for 
> >> >>>>> example,
> >> >>>>> Turkish alphabet where there are six letters with different case 
> >> >>>>> rules
> >> >>>>> (uppercase dotted i, dottless lowercase i, upper- and lowercase G 
> >> >>>>> with
> >> >>>>> breve accent, and upper- and lowercase S with cedilla), which will 
> >> >>>>> break
> >> >>>>> your generalized check.
> >> >>>>> So you'll anyway will need to split these cases.
> >> >>>>>
> >> >>>>
> >> >>>> I see.
> >> >>>>
> >> >>>> I'm still not comfortable with carrying the bit of knowledge about 
> >> >>>> u'\xdf' in
> >> >>>> this particular spot. Can we check that a name is IDN some other way 
> >> >>>> than
> >> >>>> "domain_name.is_idn() or u'\xdf' in value"?
> >> >>>
> >> >>> Why can't we simply fix string constants in ipa-adtrust-install and 
> >> >>> avoid
> >> >>> adding hacks for it?
> >> >> Because they are correct, in the sense that they follow what is defined
> >> >> for Active Directory. Yes, AD puts them in that case into DNS. There is
> >> >> simply no reason to force lower case for non-IDN names.
> >> >>
> >> >> That said, a newer fix is attached, where error message is formatted
> >> >> properly.
> >> >
> >> >I would personally be OK with the change if the is_* are fixed as Honza
> >> >proposed, current way is not so Python-ic/readable. I.e.:
> >> >
> >> >Instead of
> >> >+            is_idna = True in [encodings.idna.ToASCII(x) != x for x in 
> >> >labels]
> >> >Use
> >> >+            is_idna = any(encodings.idna.ToASCII(x) != x for x in labels)
> >> >
> >> >Instead of
> >> >+                is_nonnorm = True in [encodings.idna.nameprep(x) != x 
> >> >for x in
> >> >labels]
> >> >use
> >> >+                is_nonnorm = any(encodings.idna.nameprep(x) != x for x 
> >> >in labels)
> >> >
> >> >However, we can wait till Monday for Martin2's feedback.
> >> I've fixed this and also made a proper split on all dots that could
> >> separate labels according to RFC3490:
> >>
> >>     U+002E ( . ) FULL STOP
> >>     U+FF0E ( ． ) FULLWIDTH FULL STOP
> >>     U+3002 ( 。 ) IDEOGRAPHIC FULL STOP
> >>     U+FF61 ( ｡ ) HALFWIDTH IDEOGRAPHIC FULL STOP
> >>
> >>
> >> _______________________________________________
> >> Freeipa-devel mailing list
> >> Freeipa-devel@redhat.com
> >> https://www.redhat.com/mailman/listinfo/freeipa-devel
> >
> >Hi,
> >I analyzed how python detects IDNA labels.
> >
> >Python tests if domain is IDNA in this way:
> >
> >def ToASCII(label):
> >    try:
> >        # Step 1: try ASCII
> >        label = label.encode("ascii")
> >    except UnicodeError:
> >        pass
> >    else:
> >        # Skip to step 3: UseSTD3ASCIIRules is false, so
> >        # Skip to step 8.
> >        if 0 < len(label) < 64:
> >            return label
> >        raise UnicodeError("label empty or too long")
> >
> >    # Step 2: nameprep
> >    label = nameprep(label)
> >...
> >
> >We can use 'label = label.encode("ascii")' to detect if IDNA is needed,
> >without idna.ToASCII() conversion, and then use:
> >
> >is_nonnorm = any(encodings.idna.nameprep(x) != x for x in labels)
> Sounds good but don't forget exceptions' handling. :)
>


Updated patch attached.

I modified error messages, IDNA mapping is not only mapping to lowercase
-- 
Martin^2 Basti

>From a0e14f69e293573d631d0fd058991d60d1845471 Mon Sep 17 00:00:00 2001
From: Martin Basti <mba...@redhat.com>
Date: Wed, 18 Jun 2014 15:58:17 +0200
Subject: [PATCH] Check normalization only for IDNA domains

Backward compability with older IPA versions which allow to use uppper
case. Only IDNA domains will be checked.
---
 ipalib/parameters.py                    | 25 +++++++++++++++----------
 ipatests/test_xmlrpc/test_dns_plugin.py |  5 ++---
 2 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/ipalib/parameters.py b/ipalib/parameters.py
index 1dff13cc1371c26208cacd8a7cbaf44634622e4a..0cf14a4cd2900459ccd5d6d52912960c642223aa 100644
--- a/ipalib/parameters.py
+++ b/ipalib/parameters.py
@@ -1961,16 +1961,21 @@ class DNSNameParam(Param):
                 error = _('DNS label cannot be longer than 63 characters')
             except dns.exception.SyntaxError:
                 error = _('invalid domain name')
-
-            #compare if IDN normalized and original domain match
-            #there is N:1 mapping between unicode and IDNA names
-            #user should use normalized names to avoid mistakes
-            normalized_domain_name = encodings.idna.nameprep(value)
-            if value != normalized_domain_name:
-                error = _("domain name '%(domain)s' and normalized domain name"
-                          " '%(normalized)s' do not match. Please use only"
-                          " normalized domains") % {'domain': value,
-                          'normalized': normalized_domain_name}
+            else:
+                #compare if IDN normalized and original domain match
+                #there is N:1 mapping between unicode and IDNA names
+                #user should use normalized names to avoid mistakes
+                labels = re.split(u'[.\uff0e\u3002\uff61]', value, flags=re.UNICODE)
+                try:
+                    map(lambda label: label.encode("ascii"), labels)
+                except UnicodeError:
+                    # IDNA
+                    is_nonnorm = any(encodings.idna.nameprep(x) != x for x in labels)
+                    if is_nonnorm:
+                        error = _("domain name '%(domain)s' should be normalized to"
+                          ": %(normalized)s") % {
+                          'domain': value,
+                          'normalized': '.'.join([encodings.idna.nameprep(x) for x in labels])}
             if error:
                 raise ConversionError(name=self.get_param_name(), index=index,
                                       error=error)
diff --git a/ipatests/test_xmlrpc/test_dns_plugin.py b/ipatests/test_xmlrpc/test_dns_plugin.py
index 66af0efb89d622083adbd55cad96493f459843d1..2c8c85f93c73605dda93ddf50b80132ce0e50897 100644
--- a/ipatests/test_xmlrpc/test_dns_plugin.py
+++ b/ipatests/test_xmlrpc/test_dns_plugin.py
@@ -2504,11 +2504,10 @@ class test_dns(Declarative):
 
 
         dict(
-            desc='Add A denormalized record to %r in zone %r' % (idnres1, idnzone1),
+            desc='Add A denormalized record in zone %r' % (idnzone1),
             command=('dnsrecord_add', [idnzone1, u'gro\xdf'], {'arecord': u'172.16.0.1'}),
             expected=errors.ConversionError(name='name',
-                error=u'domain name \'gro\xdf\' and normalized domain name \'gross\''
-                + ' do not match. Please use only normalized domains'),
+                error=u'domain name \'gro\xdf\' should be normalized to: gross')
         ),
 
 
-- 
1.8.3.1

_______________________________________________
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel

Re: [Freeipa-devel] [PATCH 0070] Normalization check only for IDNA domains

Reply via email to