If this is better suited elsewhere, such as dev-tech-network, please let me know.
For about five years I've been trying to figure out the IDNA algorithm that a) browsers follow and b) browsers want to follow, but I've not had much luck thus far getting folks to reply. E.g., https://lists.w3.org/Archives/Public/www-archive/2017Feb/0006.html went largely unaddressed. One big difference between http://www.unicode.org/reports/tr46/ and browsers is how ASCII is handled. Per UTS #46 ASCII is handled the same as non-ASCII. However, in browsers ASCII takes a "fast path" and skips the ToASCII algorithm. YouTube now depends on that (it has CDN domains with hyphens in the third and fourth position, as reported at, e.g., https://github.com/nodejs/node/issues/12965). A question I've had is whether we should standardize the fast path or try to get consistent handling. I've also raised this with the Unicode folks at https://docs.google.com/document/d/11PEww2N0PbXyPhbsCdW_PjD3BNgZMy5XHUv02SSXNqY/edit and elsewhere and it seems an upcoming draft adds another flag to the UTS #36 ToASCII algorithms to ignore hyphen requirements. However, hyphens are not the only requirement that might influence how a pure ASCII domain is handled and therefore it's unclear it actually solves the problem, especially if browsers continue to ignore it. Since I haven't gotten much cooperation my plan is to just standardize what browsers do (in https://url.spec.whatwg.org/ which ends up invoking UTS #46 ToASCII) and go from there, especially as not doing what browsers do tends to break other ecosystems, but I thought I'd raise this here as a final attempt to get some input. -- https://annevankesteren.nl/ _______________________________________________ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform