On Jun 14, 2011, at 18:59, Stefan Sperling wrote:

> On Tue, Jun 14, 2011 at 04:24:46PM -0700, Geoff Hoffman wrote:
>> I have a file with some (I believe) Portuguese characters in the filename
>> that someone managed to store in the repo without any problem, and I checked
>> it out without issues, too. However, now on my working copy, it thinks that
>> file is locally new.
> 
>> MacbookPro:ClearSale geoffh$ ls -la
>  ^^^
> 
> It's a Mac, so please see this issue:
> http://subversion.tigris.org/issues/show_bug.cgi?id=2464
> and make sure to read the notes in this file:
> http://svn.apache.org/repos/asf/subversion/trunk/notes/unicode-composition-for-filenames
> 
> Short summary:
> Do not use anything but ASCII in your filenames if you need things
> to work between Macs and other systems. The problem is that the Mac
> changes the filename in a subtle way.


I would clarify this by saying the problem is that Subversion assumes that a
filename submitted in one Unicode normalization form will always stay in that
form, and on the HFS+ filesystem, used by Mac OS X, that assumption is not
necessarily true. (HFS+ normalizes all filenames to a decomposed form.)
Subversion would happily allow you to create two filenames that humans would
consider identical (one with its accented characters composed, one with the
same characters decomposed). So to my mind that's a bug in Subversion (or
possibly apr or apr-util); it should normalize filenames before running
comparisons. It also seems like a bug in Windows and Linux filesystems; I
assume they also let you create multiple files whose names look identical and
differ only in the composition of their characters. Mac OS X's is the only
filesystem I know of that has fixed this bug, which therefore exposes the
problem when collaborating between Mac OS X systems (which have the fix) and
other systems (which do not).
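
To make the composed/decomposed distinction concrete, here is a minimal Python
sketch. It is only an illustration, not what Subversion or HFS+ actually runs:
the filename is made up, `same_name` is a hypothetical helper, and plain NFD
is used as a stand-in for the decomposed form HFS+ stores (which, as far as I
know, is close to NFD but not identical to it).

```python
import unicodedata


def same_name(a, b):
    """Hypothetical helper: compare two filenames after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Made-up example name; the c-cedilla and a-tilde are single code points here.
composed = "Informa\u00e7\u00e3o.txt"

# Roughly what a decomposing filesystem would hand back:
# 'c' + combining cedilla, 'a' + combining tilde.
decomposed = unicodedata.normalize("NFD", composed)

print(composed == decomposed)            # False: the code point sequences differ
print(len(composed), len(decomposed))    # 14 16
print(same_name(composed, decomposed))   # True: equal once both are normalized
```

A plain string comparison sees two different names, which is how a checked-out
file can end up looking locally new; comparing after normalization treats them
as the same name.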


Using only ASCII characters in your filenames is one way to combat the problem.
This strategy works fine for me, but users whose primary language is not
English might find it harder. If you want to continue using non-ASCII
characters in filenames, you can get a version of Subversion for Mac OS X that
attempts to work around this problem by installing MacPorts and then running:

sudo port install subversion +unicode_path

The patch the +unicode_path variant applies is of course not officially 
supported.
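
If you go the ASCII-only route instead, a small script can help find the
offending names in a working copy. This is just a sketch under my own
assumptions: `non_ascii_names` is a hypothetical helper, it needs Python 3.7
or later for `str.isascii`, and it only looks at names on disk, not at what is
stored in the repository.

```python
import os


def non_ascii_names(root):
    """Return paths under root whose file or directory name contains non-ASCII characters."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into Subversion's administrative directories.
        dirnames[:] = [d for d in dirnames if d != ".svn"]
        for name in dirnames + filenames:
            if not name.isascii():
                hits.append(os.path.join(dirpath, name))
    return hits


if __name__ == "__main__":
    # Scan the current directory, assumed to be the top of a working copy.
    for path in non_ascii_names("."):
        print(path)
```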


