2012/2/3 Peter Samuelson <pe...@p12n.org>:
>
> [Hiroaki Nakamura]
>> Existing repositories, I think it would be better to convert them too using
>> svndump/svnload. And we change svnload to convert filenames to NFC.
>> However in reality we cannot force users to convert every existing
>> repository.
>
> Also note that if you convert a repository (via dump/load or whatever),
> all working copies based on the repository are invalidated and need to
> be re-checked-out. Avoiding _that_ problem would be really hairy, I
> think, very similar to the sort of work that would be needed to support
> obliterate without losing working copies.
>
>> We also need to changes servers in order to deal with existing 1.x
>> clients. We convert filenames to NFC when web_dav_svn and svnserve
>> receive filenames from clients, they must first convert filenames to
>> NFC.
>
> You keep saying what we "must" do on the server side. I propose
> something that is purely on the client side. It will solve the OS X /
> non-OS X interoperability problem. It will not solve every problem
> ever faced by a Subversion user. That's a job for 2.0.

OK. When I started this thread, I assumed we would focus on a long-term
solution for 2.x. That is because the short-term solution, option (4) in
http://svn.apache.org/repos/asf/subversion/trunk/notes/unicode-composition-for-filenames
seems too difficult and complex to me. But if a modified version of my
proposal fits into 1.x as a short-term solution, I will gladly modify it.

>
>> Yes, like I said above, "clients" actually includes components that
>> run on servers like web_dav_svn, and it should read as any components
>> that access to repositories and working copies.
>
> No. By "clients" I mean components that run on the client side. If my
> proposal had required changes to mod_dav_svn, I would not have said
> "strictly client-side". I do not propose any change to mod_dav_svn,
> svnserve, svnadmin, libsvn_repos, libsvn_fs, the repository data, or
> anything else on the server side.
>
>> If you think in analogy to ASCII uppercase and lowercase examples,
>> you miss the point. Please reread the Unicode Standard Annex #15
>> UAX #15: Unicode Normalization Forms
>> http://unicode.org/reports/tr15/
>
> Thanks, I've read it. The analogy stands. We could prevent NFC/NFD
> collisions as an additional service to users, something we have not
> done for the past 10 years. This would be along the lines of
> preventing users from shooting themselves in the foot.
>
> The actual _software_ problem that is solved by preventing collisions
> is the same as the software problem solved by preventing upper/lower
> case collisions: certain clients are unable to check out a folder that
> has such collisions. (Windows clients, in the case of upper/lower
> collisions; OS X clients, in the case of NFC/NFD collisions.)

Yes, I agree with that.

>
> I think we are talking past each other. You are trying to solve two
> distinct but related problems: 1. OS X client-side confusion when faced
> with a non-NFD repository path; 2. NFC/NFD collisions. I am only
> trying to solve problem 1.
> I'm ignoring problem 2 for two reasons:
>
> (a) Problem 2 requires server-side work and complex compatibility /
> upgrade scenarios (dump/load, re-check-out all wcs, etc).
>
> (b) Problem 2 can be worked around, for new repositories (or
> repositories with no existing collisions), with a pre-commit hook.
>
> ...neither of which are true for my proposal to solve problem 1.
>
> So long as you continue to insist that, to solve problem 1, we must
> also solve problem 2, I'm pretty sure we will never come to any
> agreement.

OK. So how about changing my proposal as follows:

(1) No server modification. Modify only svn_path_cstring_to_utf8.
(2) Let users install a pre-commit hook that rejects any non-NFC
    filenames.

This way, we only need to change one function. The modification is just
like the original OS X Unicode path patch:

utf8precompose_macosx_2.patch
http://subversion.tigris.org/nonav/issues/showattachment.cgi/813/utf8precompose_macosx_2.patch
in
http://subversion.tigris.org/issues/show_bug.cgi?id=2464

The only difference between the original patch and mine is that mine
will use utf8proc, so that it works on all platforms: Mac OS X, Windows,
and Linux.

--
Hiroaki Nakamura
hnaka...@gmail.com
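
For illustration only, here is a minimal sketch of the kind of check the
pre-commit hook in (2) could perform. It is not taken from any existing
patch. It uses Python's standard unicodedata module rather than utf8proc,
the helper names (non_nfc_paths, parse_changed, main) are hypothetical,
and it assumes that `svnlook changed` prints UTF-8 with the path starting
at the fifth character of each line.

```python
import subprocess
import sys
import unicodedata


def non_nfc_paths(paths):
    """Return the paths that are not already in Unicode NFC."""
    return [p for p in paths if unicodedata.normalize("NFC", p) != p]


def parse_changed(output):
    """Extract paths from `svnlook changed` output.

    Assumption: each line is a short status field followed by the
    path, with the path starting at index 4.
    """
    return [line[4:] for line in output.splitlines() if len(line) > 4]


def main(argv):
    """Hook body: argv[1] is the repository path, argv[2] the txn name."""
    repos, txn = argv[1], argv[2]
    result = subprocess.run(
        ["svnlook", "changed", "-t", txn, repos],
        capture_output=True,
        check=True,
    )
    # Assumption: svnlook output is UTF-8 in the hook's environment.
    bad = non_nfc_paths(parse_changed(result.stdout.decode("utf-8")))
    for path in bad:
        print("Path is not in NFC: %r" % path, file=sys.stderr)
    # A non-zero exit status makes Subversion reject the commit.
    return 1 if bad else 0

# A real hook script would end with: sys.exit(main(sys.argv))
```

A hook like this only blocks new non-NFC names. It does nothing about
non-NFC paths already in the repository, which is why it pairs with the
client-side normalization in (1) rather than replacing it.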