"Erik van der Poel" <[EMAIL PROTECTED]> writes: >> If I invoke: >> >> [EMAIL PROTECTED]:~$ idn --debug --quiet foo․bar > > Yes, libidn handles this case (ASCIIs followed by U+2024, followed by > ASCIIs) the same way as the others. Libidn handles the > <non-ASCII-label>U+2024<ASCII-label> differently. For example, in > HTML: > > <a href="http://十․com">blah</a>
Ah, now it is clear to me what the problem is.  Thanks.

>> The web page for the same input is:
>>
>> http://josefsson.org/idn.php/?data=foo%E2%80%A4bar&profile=Nameprep&mode=toascii&debug=on&charset=UTF-8&lastcharset=UTF-8
>>
>> This looks correct to me.  What is wrong?
>
> Try this one instead:
>
> http://josefsson.org/idn.php/?data=%E5%8D%81%E2%80%A4com&profile=Nameprep&mode=toascii&debug=on&charset=UTF-8&lastcharset=UTF-8
>
> MSIE 7 and Firefox 2 both end up with xn--kkr.com while libidn
> produces xn--.com-pq0g

Interesting.

As far as I can tell from RFC 3490, the libidn behaviour is what
follows from the specification.  I could not find anything in the
specification about treating U+2024 as a label separator.  Do you
agree with this?  If so, I think the first step is to update the RFC,
and when that is done we can adopt the new behaviour in libidn.

If libidn implements RFC 3490 incorrectly, we should definitely fix
that.  Right now I don't understand what part of RFC 3490 we implement
incorrectly.  So please explain further how the RFC 3490 language and
libidn differ.

I think one could argue more convincingly that MSIE/Firefox implement
RFC 3490 incorrectly here.  U+2024 isn't a label separator according
to RFC 3490, but they treat it as if it were.

Thanks,
/Simon

_______________________________________________
Help-libidn mailing list
Help-libidn@gnu.org
http://lists.gnu.org/mailman/listinfo/help-libidn
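
For anyone who wants to reproduce the two results, here is a rough Python sketch. It is not libidn code and not what the browsers actually run. It assumes that NFKC normalisation is a good enough stand-in for Nameprep on this input (NFKC maps U+2024 to U+002E), and the function names are made up for illustration. The only difference between the two functions is whether the name is split into labels before or after normalisation; that ordering is one plausible explanation for the MSIE/Firefox result, which nothing in this thread confirms.

```python
import re
import unicodedata

# The four label separators listed in RFC 3490, section 3.1.
# U+2024 (ONE DOT LEADER) is deliberately not among them.
SEPARATORS = re.compile("[\u002e\u3002\uff0e\uff61]")


def to_ascii_label(label):
    """Rough stand-in for ToASCII on one label.

    Assumption: NFKC replaces full Nameprep, and the prohibited-character
    and length checks are skipped.
    """
    prepped = unicodedata.normalize("NFKC", label)
    if prepped.isascii():
        return prepped
    return "xn--" + prepped.encode("punycode").decode("ascii")


def split_then_prepare(domain):
    """Split on the RFC 3490 separators first, then convert each label."""
    return ".".join(to_ascii_label(part) for part in SEPARATORS.split(domain))


def prepare_then_split(domain):
    """Normalise the whole name first, so U+2024 is already U+002E
    by the time the name is split into labels."""
    return split_then_prepare(unicodedata.normalize("NFKC", domain))


name = "\u5341\u2024com"  # the host name from the href example above
print(split_then_prepare(name))  # xn--.com-pq0g
print(prepare_then_split(name))  # xn--kkr.com
```

For the all-ASCII case (foo, U+2024, bar) both functions give foo.bar, so the ordering only shows up when a non-ASCII label is involved.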