Dirk wrote:
Kenneth, I must admit that I am very inexperienced with issues
regarding code pages and converting among encodings... according to
your description, is ticket 26
<http://www.pumacode.org/projects/vss2svn/ticket/26> still valid? It
sounds like the XML parser will "do the right thing" as long as the
correct encoding is written to the XML files?
Oh, this old devil jumps back into my neck ...
First: VSS itself does not store any codepage information. Since
every client writes its text using its own codepage, you can end up
with a mixed archive containing different encodings, and you cannot
tell from the outside which one is correct. E.g. consider two
developers with different codepages working on the same archive. Both
will write log messages in their own codepage, and therefore neither
can correctly decode the other's log messages. And there is no way to
prevent this.
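To illustrate the ambiguity, here is a minimal Python sketch. The sample comment and the choice of the two codepages (Windows-1252 and Windows-1251) are assumptions made for the example, not something taken from a real VSS archive:

```python
# The same stored bytes, read by two clients with different codepages.
# Sample text and codepages are hypothetical, chosen only for illustration.
raw = "Geändert".encode("cp1252")    # written by a Western European client

as_cp1252 = raw.decode("cp1252")     # read back with the writer's codepage
as_cp1251 = raw.decode("cp1251")     # read by a client using a Cyrillic codepage

print(raw)        # the bytes themselves carry no hint about their encoding
print(as_cp1252)
print(as_cp1251)
```

Both decodes succeed without error, so nothing in the data tells you which reading was intended.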
Second: You have to distinguish between the version-controlled files
and the associated information like author and comment. No source
control system will deal with the "encoding" of the stored file
itself; you can consider it black-box data. Only the comment and
author information needs to be encoded in the correct codepage.
This is why you will see UTF-8 encoded comments but codepage-encoded
data in the dumpfile.
OK, thanks for helping explain the issue better for me. I think we
Americans happen to have fewer troubles with encodings than the rest of
the world!
Since it seems that all of us are finding it harder to get time to work
on this project, I think we should find a solution that works "pretty
well" with the least effort possible, so users don't need to recompile
ssphys to get a good conversion. I think defaulting to Windows-1252 with
the option to manually specify otherwise sounds like the way to go...
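A rough Python sketch of what "default to Windows-1252, with a manual override" could look like. The function name and its parameter are hypothetical and are not part of vss2svn or ssphys; it only shows the idea of a default plus an option:

```python
# Assumed default, following the proposal in this thread.
DEFAULT_ENCODING = "windows-1252"

def comment_to_utf8(raw: bytes, encoding: str = DEFAULT_ENCODING) -> bytes:
    """Decode a raw comment from the given codepage and re-encode it as UTF-8."""
    # errors="replace" keeps the conversion going when a byte is not
    # defined in the chosen codepage, instead of aborting the whole run.
    text = raw.decode(encoding, errors="replace")
    return text.encode("utf-8")

# Default behaviour: treat the bytes as Windows-1252.
default_result = comment_to_utf8(b"Ge\xe4ndert")

# Manual override for an archive written with a different codepage.
override_result = comment_to_utf8(b"Ge\xe4ndert", encoding="cp1251")
```

Only comment and author strings would go through such a function; file contents would stay untouched as raw bytes.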
1.) I'm not sure about the state of the Ken/Unicode [1] and
Ken/ssphys-trusted-encoding branches [2]
<http://www.pumacode.org/projects/vss2svn/browser/branches/Ken/ssphys-trusted-encoding>
2.) What happens if you patch this line [3] to use your correct
codepage?
> TiXmlDeclaration decl ("1.0", "windows-1252", "");
If I understand Kenneth's reply, #2 works correctly and the XML parser
will convert to the proper encoding.
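As a sanity check of that idea, here is a small Python sketch using the standard library's XML parser rather than TinyXML, so it only demonstrates the general principle. The `<Comment>` element name and the helper function are made up for the example, and the raw comment is assumed to contain no markup characters:

```python
import xml.etree.ElementTree as ET

def comment_xml(raw_comment: bytes, encoding: str) -> bytes:
    """Wrap raw comment bytes in a tiny XML document that declares `encoding`."""
    decl = b'<?xml version="1.0" encoding="' + encoding.encode("ascii") + b'"?>'
    return decl + b"<Comment>" + raw_comment + b"</Comment>"

raw = b"Ge\xe4ndert"  # codepage-encoded bytes, written to the file unchanged

# The parser reads the declaration and decodes the bytes accordingly.
parsed_1252 = ET.fromstring(comment_xml(raw, "windows-1252")).text
parsed_1251 = ET.fromstring(comment_xml(raw, "windows-1251")).text
```

The bytes in the file are identical in both cases; only the declared encoding changes what the parser hands back.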
toby