On Dec 1, 2009, at 10:31 AM, Philip Martin wrote:

> "Hyrum K. Wright" <[email protected]> writes:
>
>> On Dec 1, 2009, at 6:46 AM, Philip Martin wrote:
>>
>>> At the moment locking doesn't really work. For example:
>>>
>>>   svnadmin create repo
>>>   svn co file://`pwd`/repo wc
>>>   svn mkdir --parents wc/foo/bar/zig/zag
>>
>> In this specific case, I can see us simply finding the parent
>> directory which is a member of a working copy, and just locking it
>> to depth infinity. This would implicitly lock any children, and
>
> There are no children when wc gets locked.
>
>> when that lock is removed, all the children would then be logically
>> unlocked. (Or at least, that's what the little voice in my head
>> tells me. If this reasoning is flawed, please let me know!)
>
> So we lock the wc database. Then, when we come to create the wc/foo
> database, we need to check the lock status of the parent database and
> automatically propagate the lock from the parent into the wc/foo
> database. Then the same for wc/foo/bar, with wc/foo as the parent.
> When we finally unlock the wc database it unlocks the wc/foo and
> wc/foo/bar databases as well.
>
> Or do you see just a single lock in the wc database, and no locks in
> any of the other databases?
Yes. The explicit lock on wc/ has depth infinity, so when future
iterations ask the question "is wc/foo/ locked?" the answer is "yes"
via the implicit locking mechanism. wc_db callers only have to manage
one actual row in the database, and other locks can be inferred from
that row, in much the same way that access batons opened to infinite
depth also open their children (only new children have to be
explicitly opened in this case).

>>> As far as I can see locking is currently pretty close to being
>>> completely broken. Are we going to try and fix it before we move
>>> to a centralised database or do we just abandon locking?
>>
>> No. I don't think abandoning locking is prudent or needed.
>>
>> Assuming the design meets our anticipated requirements, I think we
>> can move forward with it even without centralized metadata. We can
>> implement the wc_db APIs in such a way that they process and return
>> locks consistent with the design, and then let the implementation
>> worry about where those locks are actually stored, and what we need
>> to do to find them. In this way, callers are freed from worrying
>> about where the metadata actually lives, similar to
>> svn_wc__db_read_info() and friends.
>
> svn_wc__db_read_info is a very hard function to use. For example, if
> I call it to get repos_relpath, say, then I can't simply use the
> returned value because sometimes it's null. I need to examine
> something, status or the repos_relpath value I'm not exactly sure,
> and possibly search through the parents to get the real value. I
> suppose strictly speaking I don't have to worry about the location
> of the metadata, but it's not an easy-to-use interface.

Regarding the ease of use of svn_wc__db_read_info(), I agree. I
believe the decision was that that API would tell a caller what it
knew *right then*, and if that information wasn't known (i.e., it was
NULL), the caller could then do what it took to get that information,
either by recursing up the tree or some other means. read_info() is
actually pretty dumb and, in all likelihood, requires helpers to make
it useful. (Some of these have already started; see node.c.)

>> Assuming (and this is a big assumption) the design passes muster,
>> I'd like to do some of the locking implementation work over the
>> next few days to see how it works out. Critical review highly
>> appreciated.
>
> Obviously with enough effort we can make it work, it's just that it
> seems to be a long way off working at the moment. At the moment we
> can a) take locks that have no effect and b) make changes without
> locks. Apart from Julian's recent change there are no locking
> failures, but given the apparent absence of locking checks how can
> we have any confidence that locks are being taken correctly?

We can test edge cases, but testing concurrency and exclusion bugs is
just Hard; I'd much rather have a solution in place that, while not
necessarily formally proven, has some good thought and design behind
it. The current hodge-podge of brokenness is not acceptable long-term.

> One way to proceed would be to start by restricting locks in the
> wc_lock table to those that have an empty local_dir_relpath. Then
> each directory's database would hold the lock for that directory,
> much like the current lock files.

This is exactly what currently happens. We don't programmatically
restrict that value, but in practice that's the only value which
should end up in the wc_lock table.
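To make the single-row, depth-infinity idea above concrete, here's a
rough sketch of the inference in Python over an in-memory SQLite
table. The table and column names are illustrative only, not the real
wc-metadata schema, and depth is reduced to a simple level count:

  import os.path
  import sqlite3

  # Hypothetical, simplified layout: one lock table for the whole wc.
  db = sqlite3.connect(":memory:")
  db.execute("""CREATE TABLE wc_lock (
                  local_dir_relpath TEXT PRIMARY KEY,
                  levels INTEGER NOT NULL)""")   # -1 == infinite depth

  # One explicit row: the wc root (''), locked to infinite depth.
  db.execute("INSERT INTO wc_lock VALUES ('', -1)")

  def is_locked(relpath):
      """True if RELPATH is covered by an explicit or implicit lock.

      Walk from RELPATH up toward the wc root; an ancestor row whose
      depth reaches RELPATH implies a lock, so children never need
      rows of their own."""
      distance, path = 0, relpath
      while True:
          row = db.execute("SELECT levels FROM wc_lock"
                           " WHERE local_dir_relpath = ?",
                           (path,)).fetchone()
          if row is not None:
              return row[0] == -1 or row[0] >= distance
          if path == '':
              return False
          path, distance = os.path.dirname(path), distance + 1

  print(is_locked('foo/bar'))   # True: inferred from the one root row
  db.execute("DELETE FROM wc_lock WHERE local_dir_relpath = ''")
  print(is_locked('foo/bar'))   # False: one delete unlocks everything

The point being: callers only ever see the one explicit row, and
removing it makes every inferred lock on the children disappear at the
same time.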
> But perhaps we could do it differently: all the subdirectories could
> have an empty wc_lock table, or maybe no table at all, and all locks
> could be stored in the wc_lock table in the wc root. Essentially we
> could move to centralised locking before we move to centralised
> metadata.

This is an interesting idea. I guess it's a question of what part of
the single-db <-> centralized-metadata spectrum we want to be on right
now, how our various infrastructure (such as the upgrade bits)
supports such a scheme, and what the benefits and risks are.

Since there are already other places where we recurse up the tree, I
don't think that doing so to infer locks will be hard. Pioneering the
centralized metadata scheme for locks, on the other hand... Then
again, we're going centralized at some point; maybe this is the right
place to start.

-Hyrum
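P.S. If we did go the centralized route, I imagine taking a lock would
look roughly like the sketch below: check the wc root's database for
any ancestor lock that already covers the path (and, for an
infinite-depth request, for any existing lock underneath it), then
insert a single row. The table, column names, and functions here are
hypothetical, not the actual schema or wc_db API:

  import os.path
  import sqlite3

  # All locks for the whole working copy live in the wc root's db.
  db = sqlite3.connect(":memory:")
  db.execute("""CREATE TABLE wc_lock (
                  local_dir_relpath TEXT PRIMARY KEY,
                  levels INTEGER NOT NULL)""")   # -1 == infinite depth

  def acquire(relpath, levels=-1):
      """Take a lock on RELPATH, or fail if a conflicting lock exists."""
      # Refuse if RELPATH itself or an ancestor is already locked
      # deeply enough to cover RELPATH.
      distance, path = 0, relpath
      while True:
          row = db.execute("SELECT levels FROM wc_lock"
                           " WHERE local_dir_relpath = ?",
                           (path,)).fetchone()
          if row and (row[0] == -1 or row[0] >= distance):
              raise RuntimeError("'%s' already locked via '%s'"
                                 % (relpath, path))
          if path == '':
              break
          path, distance = os.path.dirname(path), distance + 1
      # An infinite-depth request also conflicts with any lock below.
      if levels == -1:
          prefix = relpath + '/' if relpath else ''
          child = db.execute("SELECT local_dir_relpath FROM wc_lock"
                             " WHERE local_dir_relpath LIKE ?",
                             (prefix + '%',)).fetchone()
          if child:
              raise RuntimeError("child '%s' already locked" % child[0])
      db.execute("INSERT INTO wc_lock VALUES (?, ?)", (relpath, levels))

  def release(relpath):
      """One DELETE in the root db releases everything the lock implied."""
      db.execute("DELETE FROM wc_lock WHERE local_dir_relpath = ?",
                 (relpath,))

  acquire('')             # lock the whole working copy
  try:
      acquire('foo/bar')  # refused: covered by the root lock
  except RuntimeError as e:
      print(e)
  release('')             # the whole tree is unlocked again
  acquire('foo/bar')      # fine now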

