Hi, Hannu,

Hannu Krosing wrote:
>>> Are you sure it's UCS-4 ? I've always thought that XML is what is given
>>> in <xml > tag, and utf-8 if no charset is given.
>> You have to distinguish between the supported charset, and the document
>> encoding.
> UCS-4 and UTF-8 are both encodings for UNICODE
> see: http://en.wikipedia.org/wiki/UTF-32

Yes, I know.

The point I wanted to make was that the document encoding is independent
of the allowed charset (except that the characters it can represent
directly have to be a subset of it). That is what XML entities and
numeric character references were defined for. So even in a document
using LATIN-1 as its encoding, the charset is still Unicode, which gives
us the possibility to use &entities; (or numeric references such as
&#x20AC;) for non-Latin-1 characters.

HTH,
Markus

-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org
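
P.S. To illustrate the encoding-versus-charset point, here is a minimal
sketch in Python (the language, the element name "price" and the sample
text are my own assumptions, chosen only for illustration). It parses a
document whose bytes are pure ISO-8859-1 but which still carries a euro
sign, a character Latin-1 cannot represent, via a numeric character
reference.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: element name and text are made up.
# Every byte on the wire is valid ISO-8859-1 (Latin-1); the e-acute is
# stored directly as byte 0xE9, the euro sign only as an ASCII reference.
doc_bytes = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    '<price>caf\u00e9 5 &#x20AC;</price>'
).encode("latin-1")

# The parser decodes the bytes using the declared encoding and expands
# the character reference into the Unicode code point U+20AC.
root = ET.fromstring(doc_bytes)
text = root.text

# Check that the euro sign really is outside Latin-1.
try:
    "\u20ac".encode("latin-1")
    euro_fits_latin1 = True
except UnicodeEncodeError:
    euro_fits_latin1 = False

print(repr(text))
print(euro_fits_latin1)
```

The parsed text contains both the directly encoded Latin-1 character and
the referenced non-Latin-1 one, so the encoding declaration only limits
which characters can appear as raw bytes, not which characters the
document may contain.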