On Wednesday, April 12, 2006 2:26 AM +0200 Dirk <[EMAIL PROTECTED]> wrote:

> In order to solve this issue, I'm thinking of embedding my own
> windows-1252 to UTF-8 conversion table into the ssphys program. Then I
> would output UTF-8 encoded strings and everybody would be happy.
>
> Is there any code around for copy'n'pasting?

There's a 1252-to-UTF8 conversion routine here:

<http://discuss.joelonsoftware.com/default.asp?joel.3.325282.13>

The author, Ben Bryant, initially says it's for 8859-1 but then corrects himself in a subsequent post.

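The trick is to pass through anything below 0x80 unchanged, since those bytes are identical in UTF-8. Bytes 0xa0 to 0xff have the same code points as in Unicode (Windows-1252 matches ISO 8859-1 there), so they only need the standard two-byte UTF-8 encoding. The problem band is 0x80 to 0x9f: do the lookup only for the 27 characters defined there, using a small table of equivalent Unicode code points, and then apply the appropriate UTF-8 encoding.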
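Something along these lines should do it. This is only a sketch, not code taken from ssphys or from the linked post: the names cp1252ToUtf8 and kCp1252High are made up for illustration, and mapping the five bytes that Windows-1252 leaves undefined to U+FFFD is my own assumption.

```cpp
#include <cstddef>
#include <string>

// Unicode code points for Windows-1252 bytes 0x80..0x9F.
// The five undefined bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) are mapped
// to U+FFFD (replacement character); that choice is an assumption.
static const unsigned int kCp1252High[32] = {
    0x20AC, 0xFFFD, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0xFFFD, 0x017D, 0xFFFD,
    0xFFFD, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0xFFFD, 0x017E, 0x0178
};

// Hypothetical helper: convert a Windows-1252 byte string to UTF-8.
std::string cp1252ToUtf8(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        unsigned int cp = c;                  // 0x00..0x7F and 0xA0..0xFF map 1:1
        if (c >= 0x80 && c <= 0x9F)
            cp = kCp1252High[c - 0x80];       // problem band: table lookup

        if (cp < 0x80) {                      // ASCII: one byte, unchanged
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {              // two-byte UTF-8 sequence
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                              // three-byte UTF-8 sequence
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Every code point in the table is below U+10000, so three bytes is the longest sequence this ever emits.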

I think the CXMLFormatter wants to use UTF-8 for the output encoding. I don't think you need to set the locale. Digging further, it looks like the place to fix this is in sanitizeForXML in XML.cpp. Instead of converting invalid characters to underscores, use the table lookup and regenerate the string.

I haven't yet dug far enough to see what consumes the resulting XML. Does that understand the multi-byte UTF-8 characters that would result from this conversion?

The other possibility is to ignore the conversion here and suppress the sanitizer, and declare the encoding as Windows-1252, on the assumption that the XML consumer knows how to read that. Again, don't call setlocale(), as we're not using the locale features of iostream to render the output (I don't think). Or maybe call setlocale(LC_ALL, "C") to suppress any locale processing. The setlocale call can happen around the call to TiXmlDocument::Print in ~CXMLFormatter to isolate its effect to the file output.
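If the setlocale route turns out to be needed, a small scope guard would keep its effect local. This is a sketch only: the class name ScopedCLocale is invented here, and I'm assuming LC_ALL is the right category to switch.

```cpp
#include <clocale>
#include <string>

// Hypothetical guard: switches to the "C" locale on construction and
// restores the previously active locale on destruction.
class ScopedCLocale {
public:
    ScopedCLocale()
    {
        // Passing NULL queries the current locale without changing it.
        const char* current = std::setlocale(LC_ALL, NULL);
        if (current)
            saved_ = current;   // copy: the returned buffer may be overwritten
        std::setlocale(LC_ALL, "C");
    }

    ~ScopedCLocale()
    {
        if (!saved_.empty())
            std::setlocale(LC_ALL, saved_.c_str());
    }

private:
    std::string saved_;

    // Not copyable.
    ScopedCLocale(const ScopedCLocale&);
    ScopedCLocale& operator=(const ScopedCLocale&);
};
```

Declaring one of these on the stack in ~CXMLFormatter just before the TiXmlDocument::Print call would limit the "C" locale to the file output and put the old locale back when the destructor returns.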

It looks like tinyxml uses fprintf for everything. If it really wanted to be tiny, it probably should have used fwrite. I don't think that pays attention to the locale; it treats everything as binary.

