> -----Original Message-----
> From: br...@apache.org [mailto:br...@apache.org]
> Sent: Monday, 12 November 2012 16:37
> To: comm...@subversion.apache.org
> Subject: svn commit: r1408325 - /subversion/branches/wc-collate-path/subversion/libsvn_subr/sqlite.c
> 
> Author: brane
> Date: Mon Nov 12 15:36:47 2012
> New Revision: 1408325
> 
> URL: http://svn.apache.org/viewvc?rev=1408325&view=rev
> Log:
> On the wc-collate-path branch: Enable GLOB and LIKE operator
> replacements.

Completely unrelated to this patch, but I'm still wondering what your overall
approach/plan for this branch will be.

I can see that we handle this collation in SQLite (even though this breaks
using a plain sqlite3 as a tool on wc.db, etc.), but
notes/unicode-composition-for-filenames describes several other problems that
need to be fixed at the same time in order not to break at least some current
Subversion users.
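
As a rough illustration of the sqlite3 point (this is not the branch's actual
code): the sketch below uses Python's sqlite3 module with a made-up collation
name, demo_unicode, and a made-up nodes table. The connection that registered
the collation can match a decomposed name against a precomposed one, while a
plain connection to the same file fails once a query needs the collation.

```python
import os
import sqlite3
import tempfile
import unicodedata


def nfc_compare(a, b):
    """Compare two strings after NFC normalization (hypothetical collation)."""
    a = unicodedata.normalize("NFC", a)
    b = unicodedata.normalize("NFC", b)
    return (a > b) - (a < b)


def demo():
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        # Connection that knows the custom collation.
        db = sqlite3.connect(path)
        db.create_collation("demo_unicode", nfc_compare)
        db.execute(
            "CREATE TABLE nodes (relpath TEXT COLLATE demo_unicode PRIMARY KEY)"
        )
        db.execute("INSERT INTO nodes VALUES (?)", ("caf\u00e9.txt",))
        db.commit()
        # Look up the decomposed spelling of the same name.
        found = db.execute(
            "SELECT count(*) FROM nodes WHERE relpath = ?", ("cafe\u0301.txt",)
        ).fetchone()[0]
        db.close()

        # Plain connection: the collation was never registered here.
        plain = sqlite3.connect(path)
        try:
            plain.execute(
                "SELECT count(*) FROM nodes WHERE relpath = ?", ("x",)
            ).fetchall()
            error = None
        except sqlite3.OperationalError as exc:
            error = str(exc)
        finally:
            plain.close()
        return found, error
    finally:
        os.remove(path)
```

Calling demo() returns the match count from the first connection and the
error text (if any) from the plain one.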

One of these things is that we use hash tables to represent all nodes in a
directory in several places. In some cases we get these from the working copy,
in some cases from the db, and in still other cases from the repository. Some
of these may be normalized in some way, while others are not (especially given
our compatibility guarantees within 1.X).

I'm afraid that just making wc.db compatible with normalization will only
shift the problem one layer, without fixing the real problem. Erik
Huelsmann thoroughly investigated this problem space some years ago, and he
documented that fixing the wc library is not enough to fix the generic
case. And if we are not fixing the generic case, I wonder whether we should
really be working on something that causes a major slowdown of every common
operation.

We currently have a binary format that can be used as a hash key, so many
comparison and lookup operations are constant time.
I'm not sure how they will perform once the collation handling is installed.
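
To make the hash key concern concrete, here is a minimal sketch in Python (not
Subversion code; the helper names and the sample entry are made up). A table
keyed on raw UTF-8 bytes gives a cheap exact lookup, but it misses a name that
differs only in normalization form; finding it anyway means normalizing and
comparing keys one by one, unless every producer of keys normalizes up front.

```python
import unicodedata

# The same file name in precomposed (NFC) and decomposed (NFD) form.
NFC_NAME = unicodedata.normalize("NFC", "caf\u00e9.txt")
NFD_NAME = unicodedata.normalize("NFD", "caf\u00e9.txt")


def lookup_exact(entries, name):
    """Byte-for-byte hash lookup: one hash computation, no normalization."""
    return entries.get(name.encode("utf-8"))


def lookup_normalizing(entries, name):
    """Normalization-aware lookup over un-normalized keys: a linear scan."""
    wanted = unicodedata.normalize("NFC", name)
    for key, value in entries.items():
        if unicodedata.normalize("NFC", key.decode("utf-8")) == wanted:
            return value
    return None


# Hypothetical directory listing whose keys arrived in NFC form.
entries = {NFC_NAME.encode("utf-8"): "node"}
```

Here lookup_exact(entries, NFD_NAME) finds nothing, while
lookup_normalizing(entries, NFD_NAME) finds the node at the cost of touching
every key.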


If we set the generic case aside, there are easier ways to resolve this issue.
One option would be to make apr (or a wrapper in Subversion) normalize the
on-disk paths in the other direction and deny (on the server) the
non-normalized paths. This would avoid the slowdown in most use cases that
don't have a problem right now, and keep the code clean for future problems.
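
A minimal sketch of that idea in Python, under assumptions: the canonical form
is taken to be NFC (the choice is not stated here), and from_disk and
server_accepts are hypothetical names, not apr or Subversion functions. Names
read from disk are converted once at the boundary, and the server side only
checks that a path is already in the canonical form.

```python
import unicodedata

# Assumption: NFC is the canonical form; the real choice is an open question.
CANONICAL_FORM = "NFC"


def from_disk(name):
    """Convert a name as read from the filesystem to the canonical form."""
    return unicodedata.normalize(CANONICAL_FORM, name)


def server_accepts(path):
    """Accept a path only if it is already in the canonical form."""
    return unicodedata.normalize(CANONICAL_FORM, path) == path
```

With this split, everything between the disk boundary and the server can keep
comparing paths byte for byte, because only one spelling of each name is ever
allowed in.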

If we have to check for collation handling everywhere in libsvn_wc and
libsvn_client, we make it much harder for outside developers to create patches,
and even fewer core Subversion developers would dare to touch these layers.



I'm glad somebody is finally looking into these issues, but I think we should 
look at the full picture before we can talk about getting this back on trunk.

        Bert

