On Wednesday, April 12, 2006 2:26 AM +0200 Dirk <[EMAIL PROTECTED]> wrote:
> In order to solve this issue, I'm thinking of embedding my own
> windows-1252 to UTF-8 conversion table into the ssphys program. Then I
> would output UTF-8 encoded strings and everybody would be happy.
> Is there any code for copy'n pasting around?
There's a 1252-to-UTF8 conversion routine here:
<http://discuss.joelonsoftware.com/default.asp?joel.3.325282.13>
The author, Ben Bryant, initially says it's for 8859-1 but then corrects
himself in a subsequent post.
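The trick is to pass plain ASCII (below 0x80) straight through, since it's
already the same as UTF-8, and to do the table lookup only for the problem
band (0x80 to 0x9f), where Windows-1252 differs from 8859-1. That band has
27 assigned characters, which need a small table of equivalent Unicode
points. Bytes 0xa0 to 0xff have the same code points as in 8859-1 but still
have to go through the two-byte UTF-8 escape mechanism.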
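
To make the table-lookup idea concrete, here is a minimal sketch of such a conversion. It is not Ben Bryant's routine and not code from ssphys; the names kCp1252High, appendUtf8 and cp1252ToUtf8 are made up for illustration. Mapping the five bytes that Windows-1252 leaves unassigned (0x81, 0x8d, 0x8f, 0x90, 0x9d) to U+FFFD is my assumption; another policy may suit the XML consumer better.

```cpp
#include <cstdint>
#include <string>

// Unicode code points for Windows-1252 bytes 0x80..0x9F.
// Unassigned bytes map to U+FFFD here (an assumption, not part of the code page).
static const std::uint16_t kCp1252High[32] = {
    0x20AC, 0xFFFD, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0xFFFD, 0x017D, 0xFFFD,
    0xFFFD, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0xFFFD, 0x017E, 0x0178
};

// Append one code point as UTF-8. Only handles U+0000..U+FFFF,
// which covers every character Windows-1252 can produce.
static void appendUtf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);                         // 1 byte (ASCII)
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));           // 2 bytes
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));          // 3 bytes
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Convert a Windows-1252 byte string to UTF-8.
std::string cp1252ToUtf8(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (unsigned char c : in) {
        // Only 0x80..0x9F needs the table; everything else equals its code point.
        std::uint32_t cp = (c >= 0x80 && c <= 0x9F) ? kCp1252High[c - 0x80] : c;
        appendUtf8(out, cp);
    }
    return out;
}
```

Something along these lines could be called from sanitizeForXML in place of the underscore substitution, though I haven't checked how that function is structured.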
I think the CXMLFormatter wants to use UTF-8 for the output encoding. I
don't think you need to set the locale. Digging further, it looks like the
place to fix this is in sanitizeForXML in XML.cpp. Instead of converting
invalid characters to underscores, use the table lookup and regenerate the
string.
I haven't yet dug far enough to see what consumes the resulting XML. Does
that understand the multi-byte UTF-8 characters that would result from this
conversion?
The other possibility is to ignore the conversion here and suppress the
sanitizer, and declare the encoding as Windows-1252, on the assumption that
the XML consumer knows how to read that. Again, don't call setlocale(), as
we're not using the locale features of iostream to render the output (I
don't think). Or maybe call setlocale(LC_ALL, "C") to suppress any locale
processing. That call can go around the call to TiXmlDocument::Print in
~CXMLFormatter to isolate its effect to the file output.
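
If the setlocale route turns out to be needed, one way to confine it to the Print call is a small scope guard. This is only a sketch; the class name ScopedCLocale is hypothetical and nothing like it exists in vss2svn or tinyxml as far as I know. It relies only on the standard setlocale from <clocale>.

```cpp
#include <clocale>
#include <string>

// Scope guard: switch to the "C" locale, restore the previous locale on exit.
class ScopedCLocale {
public:
    ScopedCLocale() {
        // Passing a null pointer only queries the current locale name.
        const char* current = std::setlocale(LC_ALL, nullptr);
        // Copy it: a later setlocale call may overwrite the returned buffer.
        if (current) saved_ = current;
        std::setlocale(LC_ALL, "C");
    }
    ~ScopedCLocale() {
        if (!saved_.empty()) std::setlocale(LC_ALL, saved_.c_str());
    }
    ScopedCLocale(const ScopedCLocale&) = delete;
    ScopedCLocale& operator=(const ScopedCLocale&) = delete;

private:
    std::string saved_;  // locale name in effect before the guard was created
};
```

In ~CXMLFormatter the guard would be declared in a block just before the call to TiXmlDocument::Print, so the previous locale comes back as soon as the block ends. Keep in mind that setlocale is process-wide, so this is not safe if other threads depend on the locale at the same time.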
It looks like tinyxml uses fprintf for everything. If it really wanted to
be tiny, it probably should have used fwrite. Either way, I don't think the
output path pays attention to the locale for string data; it should treat
the bytes as binary.
_______________________________________________
vss2svn-users mailing list
Project homepage:
http://www.pumacode.org/projects/vss2svn/
Subscribe/Unsubscribe/Admin:
http://lists.pumacode.org/mailman/listinfo/vss2svn-users-lists.pumacode.org