I think that is a very common mistake people WILL make.
Doug Ewell wrote:
Thanks to all who pointed out that noncharacters, unlike surrogate code
points, are NOT illegal or invalid in UTF-8 or any other CES. I don't
know why I said they were. (Bad brain! Bad, bad brain!)
-Doug Ewell
Fullerton,
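
The distinction drawn above, that surrogate code points are ill-formed in UTF-8 while noncharacters are not, can be illustrated with a minimal sketch. This is not the algorithm from the draft under discussion; `is_valid_utf8` and `is_noncharacter` are hypothetical names chosen for illustration, and the checks assume the current Unicode definition of well-formed UTF-8 (no overlong forms, no surrogates, nothing above U+10FFFF).

```python
def is_noncharacter(cp: int) -> bool:
    """True for the 66 noncharacters: U+FDD0..U+FDEF and every U+nFFFE / U+nFFFF."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: return True if data is well-formed UTF-8."""
    i, n = 0, len(data)
    while i < n:
        b0 = data[i]
        if b0 < 0x80:                      # one-byte form, U+0000..U+007F
            i += 1
            continue
        if 0xC2 <= b0 <= 0xDF:             # 0xC0 and 0xC1 could only start overlong forms
            length, cp, lowest = 2, b0 & 0x1F, 0x80
        elif 0xE0 <= b0 <= 0xEF:
            length, cp, lowest = 3, b0 & 0x0F, 0x800
        elif 0xF0 <= b0 <= 0xF4:
            length, cp, lowest = 4, b0 & 0x07, 0x10000
        else:
            return False                   # stray continuation byte or invalid lead byte
        if i + length > n:
            return False                   # sequence is truncated
        for b in data[i + 1:i + length]:
            if b & 0xC0 != 0x80:
                return False               # not a continuation byte
            cp = (cp << 6) | (b & 0x3F)
        if cp < lowest:
            return False                   # overlong (non-shortest) form
        if 0xD800 <= cp <= 0xDFFF:
            return False                   # surrogate code point: ill-formed in UTF-8
        if cp > 0x10FFFF:
            return False                   # beyond the Unicode code space
        # Noncharacters are deliberately NOT rejected here: they are valid in UTF-8.
        i += length
    return True
```

Under these assumptions, `is_valid_utf8(b"\xef\xbf\xbe")` (U+FFFE, a noncharacter) is accepted, while `is_valid_utf8(b"\xed\xa0\x80")` (U+D800, a surrogate) is rejected. An application that wants to flag noncharacters can call `is_noncharacter` as a separate policy check after validation.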
Frank Tang wrote:
I think that is a very common mistake people WILL make.
Especially if they keep telling each other the wrong thing,
and then rely on folklore about the standard as their source
of information.
The ultimate source of information about a standard is the
standard itself.
If
Doug Ewell wrote:
Yung-Fong Tang [EMAIL PROTECTED] wrote:
I am working on several projects that need to validate UTF-8 text.
Some people outside my company have also asked me to update the UTF-8
validation code to reflect the changes introduced in Unicode 3.1
and 3.2.
I am
OK, I found some problems and rewrote it a little. Here is rev 0.4.
Please give me your comments.
The Algorithm to Validate a UTF-8 String
Frank Yung-Fong Tang [EMAIL PROTECTED]
Draft: 0.4
Status of this document: DRAFT.
[EMAIL PROTECTED]
wrote:
Unfortunately, FSS-UTF in Unicode 1.1 IS NOT UTF-8. Most people
refer to UTF-8 by looking at RFC 2279 http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc2279.html and
RFC 2044 http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc2044.html but
in those two RFCs, when
Folks:
I am working on several projects that need to validate UTF-8 text. Some
people outside my company have also asked me to update the UTF-8 validation
code to reflect the changes introduced in Unicode 3.1 and 3.2. It would
be nice if some of you could review this paper for me. I plan to use it
Yung-Fong Tang [EMAIL PROTECTED] wrote:
I am working on several projects that need to validate UTF-8 text.
Some people outside my company have also asked me to update the UTF-8
validation code to reflect the changes introduced in Unicode 3.1
and 3.2.
I am still puzzled by claims that there have