[Stefan Sperling]
> It is indeed harder because we are passing paths verbatim to sqlite.
> I doubt having more than one form of a given path in wc.db is fun...

That's the implementation I would like to see, to be honest.

Start with the observation that we can treat Mac OS X NFD paths as a client character encoding. Now observe that it is lossy. But ... almost all non-Unicode client charsets are equally lossy, for exactly the same reason!

This suggests maintaining a mapping table in wc.db between server paths (UTF-8, unspecified NF) and wc paths (local charset, which is occasionally UTF-8 with NFD). This mapping table would be updated any time we write to the wc, and consulted any time we search for files in the wc. It's not really extra work - we have to do those UTF-8 <-> local charset conversions all the time anyway. The table would in effect cache those conversions.

The implementation on OS X might be a bit hairy if there isn't an easy way to retrieve the real pathname of the file you just created. Anywhere else, we just store the pathname we just calculated.

Peter
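
For illustration only, here is a minimal sketch of the mapping-table idea in Python with an in-memory SQLite database. The table name `path_map`, its columns, and the helper functions are all hypothetical and are not the real wc.db schema or any Subversion API. The OS X behaviour is simulated by normalizing to NFD, rather than by asking the filesystem for the name it actually stored.

```python
import sqlite3
import unicodedata


def open_db():
    """Create an in-memory database with a hypothetical path_map table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE path_map ("
        " server_path TEXT PRIMARY KEY,"  # UTF-8, whatever NF the server uses
        " wc_path TEXT NOT NULL)"         # the name as it exists on disk
    )
    return db


def to_wc_path(server_path, nfd_filesystem):
    """Stand-in for the UTF-8 -> local charset conversion.

    Assumption: an OS X-like filesystem is modelled by NFD normalization;
    any other system stores the path unchanged.
    """
    if nfd_filesystem:
        return unicodedata.normalize("NFD", server_path)
    return server_path


def record_write(db, server_path, nfd_filesystem=True):
    """Update the mapping whenever a file is written to the working copy."""
    wc_path = to_wc_path(server_path, nfd_filesystem)
    db.execute(
        "INSERT OR REPLACE INTO path_map (server_path, wc_path) VALUES (?, ?)",
        (server_path, wc_path),
    )
    return wc_path


def server_path_for(db, wc_path):
    """Consult the mapping when a file found on disk must be matched."""
    row = db.execute(
        "SELECT server_path FROM path_map WHERE wc_path = ?", (wc_path,)
    ).fetchone()
    return row[0] if row else None
```

Calling `record_write(db, "caf\u00e9.txt")` stores the NFC server path next to its NFD on-disk form, and `server_path_for` recovers the original server path from the on-disk name. SQLite compares the strings bytewise, so the two forms stay distinct in the table.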