On 25-Feb-06, at 10:05 AM, robert yates wrote:
I am no expert on character encodings, but aren't there a few issues
with how the current draft handles character encoding? Namely,
1) The fields dix:/message-type, dix:/membersite-url,
dix:/signature and possibly others require Latin characters in their
contents. However, isn't a browser free to post the form with a
character encoding that does not permit these characters?
2) The Canonicalization Algorithm relies on the browser posting the
form in the same character encoding used by the server to generate
it. With the current draft, can this be guaranteed?
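
To illustrate issue 2, here is a minimal Python sketch. It assumes the canonicalized form data ends up as bytes that are fed to a digest; SHA-1 and the helper name digest_of are only illustrative and are not taken from the draft. The same field value produces different bytes, and therefore a different digest, when the browser posts it in a different encoding than the one the server used.

```python
import hashlib

def digest_of(text: str, encoding: str) -> str:
    """Hash the bytes that would be posted if `encoding` were used.

    Hypothetical helper: SHA-1 stands in for whatever digest the
    canonicalization step actually feeds.
    """
    return hashlib.sha1(text.encode(encoding)).hexdigest()

# Hypothetical field value containing one non-ASCII character (e-acute).
value = "caf\u00e9"

utf8_bytes = value.encode("utf-8")      # two bytes for the e-acute
latin1_bytes = value.encode("latin-1")  # one byte for the e-acute

# Different encodings give different bytes, so the digests disagree.
digests_match = digest_of(value, "utf-8") == digest_of(value, "latin-1")

# A pure-ASCII value happens to survive, which can hide the problem in testing.
ascii_digests_match = digest_of("dix", "utf-8") == digest_of("dix", "latin-1")
```

Note that ASCII-only values encode identically in both encodings here, so the mismatch would only show up once a field carries a non-ASCII character.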
This article
http://www-306.ibm.com/software/globalization/misc/code_considerations/index.jsp
provides a good overview of the problem, although it may be a bit out
of date. Having read it, what do folks think of the following
proposal to address the above issues?
The dix spec ensures that information is always exchanged (via the
browser) encoded as UTF-8. It does this by stating that the
character encoding of the form pages must be UTF-8, and by
mandating the following rules for the HTML forms used for the data
transportation.
1. The HTML head section of the form MUST contain
<META http-equiv="Content-Type" content="text/html; charset=utf-8">.
There needs to be a corresponding statement for XHTML.
2. The FORM element MUST contain the Accept-Charset attribute and
it MUST be set to 'utf-8', i.e. <FORM Accept-Charset="utf-8" Type= ...>
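
As a rough sketch of how the two rules above could be checked mechanically, the following Python uses the standard html.parser module. The class name CharsetRuleChecker, the function follows_rules, and the sample pages are all hypothetical. It does not verify that the META element sits inside the head section, and it says nothing about the XHTML variant.

```python
from html.parser import HTMLParser

class CharsetRuleChecker(HTMLParser):
    """Records whether a page carries the UTF-8 META and FORM markers."""

    def __init__(self):
        super().__init__()
        self.meta_ok = False   # rule 1 satisfied by at least one META
        self.forms = 0         # number of FORM elements seen
        self.forms_ok = 0      # FORM elements with Accept-Charset="utf-8"

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        a = {name: (value or "") for name, value in attrs}
        if tag == "meta":
            content = a.get("content", "").lower().replace(" ", "")
            if (a.get("http-equiv", "").lower() == "content-type"
                    and "charset=utf-8" in content):
                self.meta_ok = True
        elif tag == "form":
            self.forms += 1
            if a.get("accept-charset", "").strip().lower() == "utf-8":
                self.forms_ok += 1

def follows_rules(page: str) -> bool:
    """True if the page has the UTF-8 META and every FORM accepts utf-8."""
    checker = CharsetRuleChecker()
    checker.feed(page)
    return bool(checker.meta_ok
                and checker.forms > 0
                and checker.forms == checker.forms_ok)

# Hypothetical sample pages; the action URL is a placeholder.
GOOD_PAGE = (
    '<html><head><META http-equiv="Content-Type" '
    'content="text/html; charset=utf-8"></head><body>'
    '<FORM Accept-Charset="utf-8" method="post" action="/example"></FORM>'
    '</body></html>'
)
BAD_PAGE = GOOD_PAGE.replace(' Accept-Charset="utf-8"', "")
```

Calling follows_rules(GOOD_PAGE) accepts the page, while follows_rules(BAD_PAGE) rejects it because the form no longer declares Accept-Charset.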
Great suggestion!