Philippe,

> Also a broken opening tag for HTML/XML documents

In addition to not having endian problems, UTF-8 is also useful when tracing
intersystem communications data. XML and other tags are usually in the ASCII
subset of UTF-8, so they stand out in a trace and make it easier to find the
specific data you are looking for.
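
As a rough illustration of the point about tags standing out, here is a
minimal Python sketch. The sample message, the helper name ascii_tags and the
tag pattern are all made up for this example; the pattern is a loose
approximation, not a real XML parser.

```python
import re

# Loose, hypothetical pattern for ASCII-only tags in a raw byte trace.
TAG_PATTERN = re.compile(rb"</?[A-Za-z][A-Za-z0-9_.:-]*[^<>]*>")

def ascii_tags(raw: bytes) -> list:
    """Return the ASCII tags found in a raw byte buffer."""
    return [m.group(0) for m in TAG_PATTERN.finditer(raw)]

message = "<name>M\u00fcller</name>"   # made-up sample with one non-ASCII letter

utf8_trace = message.encode("utf-8")
utf16_trace = message.encode("utf-16-le")

# In UTF-8 the tags are plain ASCII bytes and can be found directly.
print(ascii_tags(utf8_trace))
# In UTF-16 every ASCII character is padded with a zero byte, so the
# same byte-level search finds nothing.
print(ascii_tags(utf16_trace))
```

The same search works with grep or a hex dump viewer on a captured UTF-8
stream, which is the convenience I had in mind.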

However, within the program itself, UTF-8 presents a problem when looking for
specific data in memory buffers.  Doing so is nasty, time consuming and error
prone.  Mapping UTF-16 to code points is a snap as long as you do not have a
lot of surrogates.  If you do, then UTF-32 should probably be considered.
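
To show what I mean by UTF-16 being a snap apart from surrogates, here is a
minimal Python sketch. The function name is my own, and it assumes the input
is a list of 16-bit code units already in native byte order. Unpaired
surrogates are simply passed through, which a real program would probably
want to flag.

```python
def utf16_units_to_code_points(units):
    """Map a sequence of 16-bit UTF-16 code units to Unicode code points."""
    code_points = []
    i = 0
    while i < len(units):
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Surrogate pair: combine the two 10-bit halves.
            low = units[i + 1]
            code_points.append(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        else:
            # Everything else maps one code unit to one code point.
            code_points.append(unit)
            i += 1
    return code_points

# 'A', the euro sign, and U+1F600 (which needs a surrogate pair)
units = [0x0041, 0x20AC, 0xD83D, 0xDE00]
print([hex(cp) for cp in utf16_units_to_code_points(units)])
```

Without the surrogate branch the loop is a straight one-to-one copy. With
UTF-32 the function disappears entirely, since each 32-bit unit already is
the code point.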

From a support-cost standpoint, there are valid reasons to use a mix of UTF
formats.

Carl


