On Jul 25, 2010, at 6:43 AM, Boris Zbarsky wrote:

> On 7/25/10 8:57 AM, Adam Barth wrote:
>>> There's also the related question of what browsers should do with input
>>> typed into the URL field. Other than establishing that these rules may be
>>> different between the URL field and URLs present in content, I'm not sure
>>> this is amenable to spec. But perhaps a survey of what browsers do would be
>>> useful.
>>
>> I wasn't planning to cover that because it's not a critical to
>> interoperability
>
> Unfortunately, it is. In particular, servers need to know what to
> expect the browser to send if a user types non-ASCII into the url bar.
> There are real interoperability problems out there due to differing
> server and browser behavior in this regard.
>
> It may not be an _html_ interoperability problem, but it's certainly a
> _web_ interoperability problem.
>
>> There are also other
>> considerations there because the URLs are displayed to users as
>> security indicators.
>
> What's displayed is not a concern, in my opinion, in terms of
> interoperability. What's put on the wire is. The constraints that need
> to be imposed are much looser than on <a href> (e.g. we don't need to
> define exactly what url gets loaded if the user types "monkey" in the
> url bar), but sorting out the non-ASCII issue is definitely desirable.

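To make the non-ASCII point concrete: the same typed string can end up as different bytes on the wire depending on which character encoding is used before percent-encoding. The following is only a rough sketch in Python, assuming UTF-8 and ISO-8859-1 as the two candidate encodings. The helper name wire_form and the example string are made up for illustration and are not taken from any browser's actual behavior.

```python
from urllib.parse import quote


def wire_form(typed: str, encoding: str) -> str:
    """Percent-encode the typed text using the given character encoding.

    Hypothetical helper: the characters in `safe` are left untouched so the
    path and query delimiters survive; everything else outside the
    unreserved set is encoded byte by byte.
    """
    return quote(typed, safe="/?=&", encoding=encoding)


# Hypothetical input a user might type into the URL bar.
typed = "/search?q=café"

utf8_form = wire_form(typed, "utf-8")         # é becomes the bytes C3 A9
latin1_form = wire_form(typed, "iso-8859-1")  # é becomes the single byte E9

print(utf8_form)    # /search?q=caf%C3%A9
print(latin1_form)  # /search?q=caf%E9
```

A server that decodes the query assuming one encoding will misread input sent in the other, which is the kind of mismatch at issue here.
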
One thing to keep in mind is that browsers do all sorts of non-interoperable things for input that is not a valid URL, such as guessing that it is a hostname or performing a search with a search engine. So there's a limit to how much this can be spec'd. I agree that for certain URL-like strings that a user may type or cut & paste, there is an interop issue.

Regards,
Maciej