On Fri, 2014-06-27 at 14:03 +0300, Alexander Bokovoy wrote: > On Fri, 27 Jun 2014, Martin Kosek wrote: > >On 06/27/2014 12:10 PM, Alexander Bokovoy wrote: > >> On Fri, 27 Jun 2014, Petr Spacek wrote: > >>> On 27.6.2014 11:21, Jan Cholasta wrote: > >>>> On 27.6.2014 10:58, Alexander Bokovoy wrote: > >>>>> On Fri, 27 Jun 2014, Jan Cholasta wrote: > >>>>>> On 27.6.2014 10:29, Alexander Bokovoy wrote: > >>>>>>> On Fri, 27 Jun 2014, Jan Cholasta wrote: > >>>>>>>> On 27.6.2014 10:15, Alexander Bokovoy wrote: > >>>>>>>>> On Fri, 20 Jun 2014, Martin Basti wrote: > >>>>>>>>>> On Fri, 2014-06-20 at 10:32 +0200, Jan Cholasta wrote: > >>>>>>>>>>> On 18.6.2014 16:49, Martin Basti wrote: > >>>>>>>>>>>> Due to compability with older versions, only IDNA domains should > >>>>>>>>>>>> be > >>>>>>>>>>>> checked > >>>>>>>>>>>> Patch attached. > >>>>>>>>>>> > >>>>>>>>>>> I'm not particularly happy about the u'\xdf' special case. Isn't > >>>>>>>>>>> there a > >>>>>>>>>>> better way to do this check? > >>>>>>>>>> I cant find better way. u'\xdf' is mapped to ss, and ss is not IDN > >>>>>>>>>> string. > >>>>>>>>>> > >>>>>>>>>> Or just remove this validation. > >>>>>>>>>> > >>>>>>>>>>> (BTW I really think this should be a warning, not an error, but > >>>>>>>>>>> that > >>>>>>>>>>> would require larger amount of work, so I guess it's OK for now.) > >>>>>>>>>> (More pain than gain) > >>>>>>>>> Main thing in this patch is that the check should not be done > >>>>>>>>> against > >>>>>>>>> non-IDN strings. I want this version of the patch to go in for that > >>>>>>>>> reason as currently you cannot even complete ipa-adtrust-install > >>>>>>>>> run due > >>>>>>>>> to IDN normalisation check being applied to non-IDN domains. > >>>>>>>> > >>>>>>>> On non-IDN domains, the only effect of IDN normalization is that it > >>>>>>>> lower-cases the names (right?), so the check should compare > >>>>>>>> lower-cased original name with the normalized name, instead of > >>>>>>>> special-casing certain characters etc. > >>>>>>> .. what's the reason to do such comparison then? lower-cased non-IDN > >>>>>>> name will be equal to lower-cased normalized non-IDN name by > >>>>>>> definition, > >>>>>>> so the check is not needed in this case, at all. > >>>>>> > >>>>>> The point is that it works for both IDN and non-IDN, without > >>>>>> u'\xdf'-style hacks. > >>>>> No, your proposal of comparing low-cased value and normalized value is > >>>>> not going to work because low-cased value is in general not equal to > >>>>> normalized value for IDN names, only for non-IDN ones, due to the fact > >>>>> that lower case for non-ASCII Unicode character may map to a completely > >>>>> different character than in normalization situation. Take, for example, > >>>>> Turkish alphabet where there are six letters with different case rules > >>>>> (uppercase dotted i, dottless lowercase i, upper- and lowercase G with > >>>>> breve accent, and upper- and lowercase S with cedilla), which will break > >>>>> your generalized check. > >>>>> So you'll anyway will need to split these cases. > >>>>> > >>>> > >>>> I see. > >>>> > >>>> I'm still not comfortable with carrying the bit of knowledge about > >>>> u'\xdf' in > >>>> this particular spot. Can we check that a name is IDN some other way than > >>>> "domain_name.is_idn() or u'\xdf' in value"? > >>> > >>> Why can't we simply fix string constants in ipa-adtrust-install and avoid > >>> adding hacks for it? > >> Because they are correct, in the sense that they follow what is defined > >> for Active Directory. Yes, AD puts them in that case into DNS. There is > >> simply no reason to force lower case for non-IDN names. > >> > >> That said, a newer fix is attached, where error message is formatted > >> properly. > > > >I would personally be OK with the change if the is_* are fixed as Honza > >proposed, current way is not so Python-ic/readable. I.e.: > > > >Instead of > >+ is_idna = True in [encodings.idna.ToASCII(x) != x for x in > >labels] > >Use > >+ is_idna = any(encodings.idna.ToASCII(x) != x for x in labels) > > > >Instead of > >+ is_nonnorm = True in [encodings.idna.nameprep(x) != x for x > >in > >labels] > >use > >+ is_nonnorm = any(encodings.idna.nameprep(x) != x for x in > >labels) > > > >However, we can wait till Monday for Martin2's feedback. > I've fixed this and also made a proper split on all dots that could > separate labels according to RFC3490: > > U+002E ( . ) FULL STOP > U+FF0E ( . ) FULLWIDTH FULL STOP > U+3002 ( 。 ) IDEOGRAPHIC FULL STOP > U+FF61 ( 。 ) HALFWIDTH IDEOGRAPHIC FULL STOP > > > _______________________________________________ > Freeipa-devel mailing list > Freeipa-devel@redhat.com > https://www.redhat.com/mailman/listinfo/freeipa-devel
Hi, I analyzed how python detects IDNA labels. Python tests if domain is IDNA in this way: def ToASCII(label): try: # Step 1: try ASCII label = label.encode("ascii") except UnicodeError: pass else: # Skip to step 3: UseSTD3ASCIIRules is false, so # Skip to step 8. if 0 < len(label) < 64: return label raise UnicodeError("label empty or too long") # Step 2: nameprep label = nameprep(label) ... We can use 'label = label.encode("ascii")' to detect if IDNA is needed, without idna.ToASCII() conversion, and then use: is_nonnorm = any(encodings.idna.nameprep(x) != x for x in labels) -- Martin^2 Basti _______________________________________________ Freeipa-devel mailing list Freeipa-devel@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-devel