Attached is a Java test routine. It produces the output below. In both cases I use the combined Unicode character.
I suppose it is a conversion problem between the Java code and the svnlook command.

    cmd.exe /C chcp 65001 & cmd.exe /C C:\Program Files (x86)\Subversion\bin\svnadmin create C:\test\repo
    Active code page: 65001

    cmd.exe /C chcp 65001 & cmd.exe /C C:\Program Files (x86)\Subversion\bin\svn checkout file:///C:/test/repo C:\test\wc
    Active code page: 65001
    Checked out revision 0.

    cmd.exe /C chcp 65001 & cmd.exe /C C:\Program Files (x86)\Subversion\bin\svn add C:\test\wc\a --depth infinity
    Active code page: 65001
    A wc\a
    A wc\a\o?

    cmd.exe /C chcp 65001 & cmd.exe /C C:\Program Files (x86)\Subversion\bin\svn commit C:\test\wc -m comment
    Active code page: 65001
    Adding wc\a
    Adding wc\a\o?
    Committed revision 1.

    cmd.exe /C chcp 65001 & cmd.exe /C C:\Program Files (x86)\Subversion\bin\svnlook proplist C:\test\repo //a//o?
    svnlook: E160013: Path '/a/o¨' does not exist
    Active code page: 65001

-----Original Message-----
From: Philip Martin [mailto:philip.mar...@wandisco.com]
Sent: Monday, 15 December 2014 14:59
To: Matthias Ludwig
Cc: users@subversion.apache.org
Subject: Re: svnlook proplist & unicode characters

"Matthias Ludwig" <matthias.lud...@stl-software.de> writes:

> I am trying to call svnlook proplist from within an svn hook on Windows.
>
> svnlook proplist <repo-path> <pathToFile>
>
> The <pathToFile> contains Unicode-only characters (Unicode combining
> characters).
>
> The Unicode characters are not passed correctly to svnlook.
>
> I googled around and found that one should set the code page with chcp.
> This changes the stdout encoding of svnlook's output, but I did not
> succeed in changing how the calling parameter is interpreted.
>
> The caller is a Java routine. I tried Runtime.getRuntime().exec() and
> native calls via JNA.
>
> I do not know exactly where the problem is. Does the call mess up the
> Unicode characters? Or is svnlook not capable of processing Unicode
> characters in input parameters?
svnlook should handle Unicode characters in parameters. However, Subversion has no special support for combining characters and just uses whatever literal UTF-8 sequence is supplied. That means the composed and decomposed forms are different paths in the repository: e.g. š encoded as 's' + U+030C is not the same path as š encoded as U+0161.

    $ svnadmin create repo
    $ svnmucc -mm -U file://`pwd`/repo mkdir `printf "s\u030c"` propset p v `printf "s\u030c"`
    $ svnlook tree repo
    /
     š/
    $ svnlook proplist repo `printf "s\u030c"`
    Properties on '/š':
      p
    $ svnlook proplist repo `printf "\u0161"`
    svnlook: E160013: Path '/š' does not exist

All Subversion utilities convert between UTF-8 and whatever local encoding is in use. If your local encoding is not UTF-8, the conversion to UTF-8 will probably generate either the composed or the decomposed form, and it can be difficult to generate the other one. You may have to switch your local encoding to UTF-8 and generate it yourself. I have no idea what that involves on Windows.

See also http://subversion.tigris.org/issues/show_bug.cgi?id=2464 which is about choosing a canonical representation.

--
Philip Martin | Subversion Committer
WANdisco // *Non-Stop Data*
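
To make the composed/decomposed distinction above concrete on the Java side, here is a minimal sketch using the standard java.text.Normalizer class. The class name NormalizationForms and the helper utf8Hex are made up for illustration and are not taken from the attached Test.java. It only shows that the two forms of š are different UTF-8 byte sequences; it does not call svnlook and says nothing about how Windows passes arguments to it.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

public class NormalizationForms {

    // Hypothetical helper: render the UTF-8 encoding of a string as
    // space-separated lowercase hex bytes.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Decomposed form: 's' followed by U+030C (combining caron).
        String decomposed = "s\u030c";
        // NFC composes the pair into the single code point U+0161.
        String composed = Normalizer.normalize(decomposed, Normalizer.Form.NFC);
        // NFD turns the composed form back into the two-code-point sequence.
        String roundTrip = Normalizer.normalize(composed, Normalizer.Form.NFD);

        System.out.println("decomposed UTF-8: " + utf8Hex(decomposed));
        System.out.println("composed UTF-8:   " + utf8Hex(composed));
        System.out.println("equal as Java strings: " + decomposed.equals(composed));
        System.out.println("NFD(composed) equals decomposed: " + roundTrip.equals(decomposed));
    }
}
```

The decomposed string encodes as 73 cc 8c and the composed one as c5 a1, so a repository path committed in one form will not match a lookup made with the other. If the caller knows which form was committed, normalizing the path to that form before building the svnlook arguments may be worth trying, although whether the form survives the trip through cmd.exe and the active code page is a separate question.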
Test.java
Description: Binary data