Dirk wrote:

Kenneth, I must admit that I am very inexperienced with issues regarding code pages and converting among encodings... according to your description, is ticket 26 <http://www.pumacode.org/projects/vss2svn/ticket/26> still valid? It sounds like the XML parser will "do the right thing" as long as the correct encoding is written to the XML files?


Oh, this old devil is back on my neck again ...

First: VSS itself does not store any codepage information. Since every client writes in its own codepage, you can end up with a mixed archive containing different encodings, and you cannot tell from the outside which one is correct. For example, consider two developers with different codepages working on the same archive. Both will write log messages in their own codepage, and therefore neither can correctly decode the other's log messages. And there is no way to prevent this.
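
To make the ambiguity concrete, here is a minimal Python sketch. It is not part of vss2svn or ssphys; the sample messages, the helper name decode_candidates, and the choice of cp1252 and cp1251 are assumptions made purely for illustration. It shows that the stored bytes decode without error under either codepage, so nothing in the data itself says which reading is the intended one.

```python
def decode_candidates(raw, codepages=("cp1252", "cp1251")):
    """Decode the same bytes under each candidate codepage (hypothetical helper)."""
    return {cp: raw.decode(cp) for cp in codepages}


# Two log messages as they might be stored by clients on different machines.
# The archive keeps only the bytes, not the codepage that produced them.
german = "Änderung geprüft".encode("cp1252")   # Western European client
russian = "Исправлено".encode("cp1251")        # Cyrillic client

for raw in (german, russian):
    # Every decoding succeeds; only one of them is readable text.
    print(raw, decode_candidates(raw))
```

Both decodings succeed for both messages, so a converter has no error to catch and has to be told which codepage to assume.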

Second: You have to distinguish between the version-controlled files and the associated information such as author and comment. No source control system deals with the "encoding" of the stored file itself; you can consider it black-box data. Only the comment and author information needs to be encoded in the correct codepage. This is why you will see UTF-8 encoded comments but codepage-encoded data in the dumpfile.
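
The split between metadata and file contents can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual vss2svn code path; the function name to_dump_fields and the default codepage are assumptions.

```python
def to_dump_fields(author, comment, content, codepage="cp1252"):
    """Transcode metadata to UTF-8 and pass file content through untouched.

    Hypothetical helper: 'codepage' is whatever the user believes the VSS
    clients were using; it cannot be detected from the archive.
    """
    return {
        "author": author.decode(codepage).encode("utf-8"),
        "comment": comment.decode(codepage).encode("utf-8"),
        # File content is black-box data: never decoded, never re-encoded.
        "content": content,
    }


fields = to_dump_fields(b"J\xfcrgen", b"ge\xe4ndert", b"\xe4\x00\xff\x81")
print(fields)
```

Note that the content bytes may contain sequences that are invalid in the chosen codepage; since they are never decoded, that does not matter.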

OK, thanks for helping explain the issue better for me. I think we Americans happen to have fewer troubles with encodings than the rest of the world!

Since it seems that all of us are finding it harder to get time to work on this project, I think we should find a solution that works "pretty well" with the least effort possible, so users don't need to recompile ssphys to get a good conversion. I think defaulting to Windows-1252 with the option to manually specify otherwise sounds like the way to go...
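
A rough sketch of what "default to Windows-1252, allow a manual override" could look like, written in Python for brevity. The --encoding option name and the parse_encoding helper are assumptions for illustration and are not claimed to match any existing vss2svn or ssphys option.

```python
import argparse
import codecs

DEFAULT_ENCODING = "windows-1252"


def parse_encoding(argv):
    """Return the user-selected encoding name, defaulting to Windows-1252."""
    parser = argparse.ArgumentParser()
    # Hypothetical option name, used here only to show the idea.
    parser.add_argument("--encoding", default=DEFAULT_ENCODING)
    args = parser.parse_args(argv)
    # Fail early with LookupError if the name is not a known codec.
    codecs.lookup(args.encoding)
    return args.encoding


print(parse_encoding([]))
print(parse_encoding(["--encoding", "windows-1251"]))
```

Validating the name up front means a typo in the codepage is reported before the conversion starts rather than halfway through it.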

1.) I'm not sure about the state of the Ken/Unicode [1] and Ken/ssphys-trusted-encoding branches [2] <http://www.pumacode.org/projects/vss2svn/browser/branches/Ken/ssphys-trusted-encoding>.
2.) What happens if you patch this line [3] to change it to your correct codepage?
>  TiXmlDeclaration decl ("1.0", "windows-1252", "");

If I understand Kenneth's reply, #2 works correctly and the XML parser will convert to the proper encoding.
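
The effect of the declared encoding can be tried out with a small Python sketch. It uses Python's xml.etree.ElementTree rather than TinyXML or the parser vss2svn actually uses, so it only illustrates the general principle; the Comment element and the parse_comment helper are made up for the example.

```python
import xml.etree.ElementTree as ET


def parse_comment(raw_comment, declared):
    """Wrap raw codepage bytes in an XML document that declares 'declared'.

    Hypothetical helper; assumes raw_comment contains no '<' or '&' bytes.
    """
    header = '<?xml version="1.0" encoding="%s"?>' % declared
    doc = header.encode("ascii") + b"<Comment>" + raw_comment + b"</Comment>"
    # The parser decodes the bytes according to the declaration.
    return ET.fromstring(doc).text


cyrillic = "Исправлено".encode("cp1251")
print(parse_comment(cyrillic, "windows-1251"))  # declaration matches the bytes
print(parse_comment(cyrillic, "windows-1252"))  # declaration is wrong
```

With a matching declaration the parser hands back the original text; with a mismatched one it still parses without complaint but returns the wrong characters, which is why the declaration has to be set to the codepage the archive was really written in.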

toby

_______________________________________________
vss2svn-users mailing list
Project homepage:
http://www.pumacode.org/projects/vss2svn/
Subscribe/Unsubscribe/Admin:
http://lists.pumacode.org/mailman/listinfo/vss2svn-users-lists.pumacode.org
Mailing list web interface (with searchable archives):
http://dir.gmane.org/gmane.comp.version-control.subversion.vss2svn.user
