But your claim that a browser would send form data containing numeric character references is wrong here: it violates the format required for forms submitted with the "GET" method (the data should be encoded as UTF-8, unless another charset is specified or the HTML form itself is not encoded in UTF-8, and then URL-encoded) or with the "POST" method.
I have heard such "claims" since last summer. As far as I know, IE 6 started using NCRs for characters that cannot be converted to the desired charset. Now it's Mozilla 1.5 as well. Wrong or not, browsers do it. In my opinion, this is at least better than substitution characters (like '?').
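To make the difference concrete, here is a rough Python sketch of the two fallbacks. It is not what any browser actually runs; the function names and the sample value are made up for illustration, and I assume the form's target charset is ISO-8859-1. The first function replaces unencodable characters with NCRs before URL-encoding, the second uses a substitution character instead.

```python
from urllib.parse import quote_plus


def encode_field_ncr(value: str, charset: str) -> str:
    """URL-encode a form value, writing unencodable characters as NCRs.

    Hypothetical helper: mimics the fallback described above.
    """
    # "xmlcharrefreplace" turns each unencodable character into &#NNNN;
    raw = value.encode(charset, errors="xmlcharrefreplace")
    return quote_plus(raw)


def encode_field_lossy(value: str, charset: str) -> str:
    """URL-encode a form value, writing unencodable characters as '?'.

    Hypothetical helper: the substitution-character alternative.
    """
    raw = value.encode(charset, errors="replace")
    return quote_plus(raw)


if __name__ == "__main__":
    sample = "price: 5 \u20ac"  # the euro sign is not in ISO-8859-1
    print(encode_field_ncr(sample, "iso-8859-1"))
    print(encode_field_lossy(sample, "iso-8859-1"))
```

With the NCR fallback the server can still recover the euro sign (at the cost of not being able to tell it apart from a user who literally typed "&#8364;"); with the '?' fallback the character is simply gone.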
I don't know which submission formats other than these two are supported by browsers, but I think that browsers should now adopt some XML format for form data submitted by "POST". This way, browsers will be able to use numeric character references for characters not supported in the selected target encoding.
http://www.w3.org/MarkUp/Forms/
markus
-- Opinions expressed here may not reflect my company's positions unless otherwise noted.

