Martin Duerst <[EMAIL PROTECTED]> writes:

> Hello Simon,
>
> Very nice to put up such a script.

I believe I have fixed the problems you mention, thanks for taking the
time to point them out.

> It would be great if the default page was served as UTF-8.
> That way, on any recent browser, any user can just copy/paste
> or type in their idn and submit the query, without having to
> worry about encoding issues.

The page is served in the charset you select. Choose UTF-8 if you want
UTF-8. Only supporting UTF-8 would restrict the page's usefulness.
Standards-compliant browsers handle charset conversions in copy/paste.

> Using various different encodings the way you do is exposing
> your system internals in a way the Web was designed (and is
> implemented) to abstract from.
>
> The 'force charset to' drop-down menu is particularly dangerous,
> because it does not force the browser to send the characters
> that the user has pasted or input to the server in that encoding,
> it just forces the server to MISinterpret the octets that the
> browser sent.
>
> At the top of the page, you write:
>    Report problems to [EMAIL PROTECTED], but first please make sure your
>    browser really is encoding the data you type in the charset you select.
>    If not, incorrect output or an error is the proper response.
>
> This is heavily backwards. The browser will do the right thing if
> you just allow it to do so, and don't allow the user to mess
> around with it.

I have tried to make the intended behaviour clearer. You must type
characters in the charset the page uses. If you want to use another
charset, it is a two-step process: first change the charset, then enter
new data.

> Also, some browsers tend to send named or numeric character references
> when characters in a text field are outside of the encoding of the
> page. That as such is non-standard, and you don't necessarily
> have to deal with it. However, you should make sure that the
> output you send back is properly escaped. For example not
>
> $ echo 'Dürst.josefsson.org' | /usr/local/bin/idn --idna-to-ascii 2>&1
>
> but
>
> $ echo 'D&uuml;rst.josefsson.org' | /usr/local/bin/idn --idna-to-ascii 2>&1

Since it is non-standard, I'll deal with it using the garbage-in,
garbage-out philosophy. Someone might even find the current behaviour
useful.

> I tested this with several browsers. With IE, there were difficulties
> to interpret the encoding of your page correctly in the first place.
> My current guess is that this is due to the fact that you use additional
> double quotes in
> <meta http-equiv='Content-Type' content='text/html; charset="ISO-8859-1"' />,
> instead of simply
> <meta http-equiv='Content-Type' content='text/html; charset=ISO-8859-1' />
> I might be wrong, but other than that, I can't see any reason at the moment.

I don't see anything wrong with the code, and I don't have access to IE
to test this. If you, or someone else, wants to investigate this
further, it would be appreciated.
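
For readers following the charset discussion above, here is a minimal Python sketch of the two points under debate. It is not the code behind the page. It assumes, purely for illustration, that the browser submits the form as UTF-8, and all variable names are hypothetical. It shows what happens when the same octets are decoded under a different "forced" charset, and what escaping a submitted character reference looks like before echoing it back into HTML.

```python
import html

# Hypothetical input, taken from the example in the thread.
typed = "Dürst.josefsson.org"

# Assumption: the browser submits the form data as UTF-8 octets.
octets = typed.encode("utf-8")

# Decoding with the charset the browser actually used recovers the text.
correct = octets.decode("utf-8")

# "Forcing" ISO-8859-1 on the server does not change what was sent.
# It only reinterprets the same octets, so the two-byte UTF-8 sequence
# for "ü" (0xC3 0xBC) becomes two separate characters.
forced = octets.decode("iso-8859-1")

# A browser that falls back to a character reference sends this literal
# text. Escaping it before writing it into HTML output keeps the "&"
# visible exactly as it was submitted.
submitted = "D&uuml;rst.josefsson.org"
escaped = html.escape(submitted)

print(repr(correct))
print(repr(forced))
print(escaped)
```

The forced decode yields one more character than was typed, which is the misinterpretation Martin describes; whether the page should guard against it or leave it to the user is the open question in this thread.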
