Re: Two svn_wc__db_t for single-db upgrade
Bert Huijben b...@vmoo.com writes: The hard cases, like missing and obstructions of metadata are not handled and cannot be handled by the single-db WC-DB api as these cannot occur there . (There are no tests for this, and anything that looks like a test for this is disabled for some 4th tree reason). You can't even see that data is missing from most of the wc-db apis at they fall back to the parent stub. (And the wc-db api doesn't allow annotating a node that it misses some information) I don't think we can ever handle all the upgrade state with just the single-db optimized API. E.g. by just using the API you can't look through child-of-copy deletions, nor do we need any apis for this for 1.7. But svn_wc_entry_t just has this information directly. The upgrade code only ever needs to use the database to look at parents. If the parent information in the database is not correct then the upgrade has already failed. I don't understand your child-of-copy deletion example. That's one case the the existing multi-db upgrade code gets wrong, it creates a base_node where there should only be a working_node. It gets it wrong because svn_wc_entry_t does not have the copied information. The information is in the database and can be obtained by calling _read_info. Or we could make our own in-memory database of entries and query that, but I don't see why that is better. So you are suggesting that we change the DB API's to provide this information (or keep providing this multi-db specific information like the obstruction statee that really have to be removed to get back at 1.6 speed), while we will never need these functions and statee after this conversion to single-db? Using the db api for intermediate storage during update is making things harder for future api users that never want to look back at the pre-single-db era. And it slows down common code paths just to support upgrading from a version that is hopefully ancient for us in a few months. And then we can only fix the api by fixing this problem later... or by moving to wc-ng-NG. I'm not proposing to change any APIs. I introduce a new API to allow the creation of svn_wc__db_t that ignores 1.6 .svn directories when constructing pdhs. That's the only change. It allows the existing upgrade code to work in single-db just as it works in multi-db, with the same features and bugs. Just had a thought. Do you think I am proposing that we extract svn_wc_entry_t structures from the database during upgrade? That's not what I am proposing. Upgrade will read the 1.6 .svn/entries to get svn_wc_entry_t structires, but when we need information about the parent we use the existing wc-ng API to get things like svn_wc__db_status_t. -- Philip
RE: Two svn_wc__db_t for single-db upgrade
-Original Message- From: Philip Martin [mailto:philip.mar...@wandisco.com] Sent: vrijdag 27 augustus 2010 11:50 To: Bert Huijben Cc: 'Greg Stein'; dev@subversion.apache.org Subject: Re: Two svn_wc__db_t for single-db upgrade Bert Huijben b...@vmoo.com writes: The hard cases, like missing and obstructions of metadata are not handled and cannot be handled by the single-db WC-DB api as these cannot occur there . (There are no tests for this, and anything that looks like a test for this is disabled for some 4th tree reason). You can't even see that data is missing from most of the wc-db apis at they fall back to the parent stub. (And the wc-db api doesn't allow annotating a node that it misses some information) I don't think we can ever handle all the upgrade state with just the single-db optimized API. E.g. by just using the API you can't look through child-of-copy deletions, nor do we need any apis for this for 1.7. But svn_wc_entry_t just has this information directly. The upgrade code only ever needs to use the database to look at parents. If the parent information in the database is not correct then the upgrade has already failed. I don't understand your child-of-copy deletion example. That's one case the the existing multi-db upgrade code gets wrong, it creates a base_node where there should only be a working_node. It gets it wrong because svn_wc_entry_t does not have the copied information. The information is in the database and can be obtained by calling _read_info. Or we could make our own in-memory database of entries and query that, but I don't see why that is better. In case of a delete of copy you can have BASE normal (checked out N levels up) NODE_DATA normal (descendant of copy 2 levels up) NODE_DATA normal (child of copy 1 level up) WORKING: deleted (node itself) _read_info() will give you the information from working (It is deleted). And _base_get_info() shows you there was a base in the repository. So you see a normal base delete using the wc-db api. But actually you have 3 deletes in one place (one inherited). The current DB code can only represent the information you get from _read_info and _base_get_info(), so the rest is lost. svn_wc_entry_t however can represent one intermediate layer. Not the two from this example, but just one. (The other case cannot be created by libsvn_wc. It will say that you have to commit first). But even in that case there can be different information in the parent stub and the child directory itself. So you are suggesting that we change the DB API's to provide this information (or keep providing this multi-db specific information like the obstruction statee that really have to be removed to get back at 1.6 speed), while we will never need these functions and statee after this conversion to single-db? Using the db api for intermediate storage during update is making things harder for future api users that never want to look back at the pre-single-db era. And it slows down common code paths just to support upgrading from a version that is hopefully ancient for us in a few months. And then we can only fix the api by fixing this problem later... or by moving to wc-ng-NG. I'm not proposing to change any APIs. I introduce a new API to allow the creation of svn_wc__db_t that ignores 1.6 .svn directories when constructing pdhs. That's the only change. It allows the existing upgrade code to work in single-db just as it works in multi-db, with the same features and bugs. So, we have to add at least one IF that always evaluates to one case for all future Subversion versions except for upgrading for = 1.6? In my book, that is changing APIs just for upgrade. Just had a thought. Do you think I am proposing that we extract svn_wc_entry_t structures from the database during upgrade? That's not what I am proposing. Upgrade will read the 1.6 .svn/entries to get svn_wc_entry_t structires, but when we need information about the parent we use the existing wc-ng API to get things like svn_wc__db_status_t. No, (No!) I already suggested dropping all upgrade support for formats 13-(SINGLE_DB-1) from libsvn_wc and leave that to some hacky python script. Why would you ever want to use the entry read code via wc__db_t? You can just read the entry files recursively and insert in the db as needed. The wc_db api doesn't have to care about entries files at all; only for detecting that WCs must be upgraded. Bert
Re: Two svn_wc__db_t for single-db upgrade
Bert Huijben b...@qqmail.nl writes: But even in that case there can be different information in the parent stub and the child directory itself. That's why I want to use the database. So you are suggesting that we change the DB API's to provide this information (or keep providing this multi-db specific information like the obstruction statee that really have to be removed to get back at 1.6 speed), while we will never need these functions and statee after this conversion to single-db? Using the db api for intermediate storage during update is making things harder for future api users that never want to look back at the pre-single-db era. And it slows down common code paths just to support upgrading from a version that is hopefully ancient for us in a few months. And then we can only fix the api by fixing this problem later... or by moving to wc-ng-NG. I'm not proposing to change any APIs. I introduce a new API to allow the creation of svn_wc__db_t that ignores 1.6 .svn directories when constructing pdhs. That's the only change. It allows the existing upgrade code to work in single-db just as it works in multi-db, with the same features and bugs. So, we have to add at least one IF that always evaluates to one case for all future Subversion versions except for upgrading for = 1.6? In my book, that is changing APIs just for upgrade. Just had a thought. Do you think I am proposing that we extract svn_wc_entry_t structures from the database during upgrade? That's not what I am proposing. Upgrade will read the 1.6 .svn/entries to get svn_wc_entry_t structires, but when we need information about the parent we use the existing wc-ng API to get things like svn_wc__db_status_t. No, (No!) I already suggested dropping all upgrade support for formats 13-(SINGLE_DB-1) from libsvn_wc and leave that to some hacky python script. Why would you ever want to use the entry read code via wc__db_t? You can just read the entry files recursively and insert in the db as needed. The wc_db api doesn't have to care about entries files at all; only for detecting that WCs must be upgraded. Bert -- Philip
RE: Two svn_wc__db_t for single-db upgrade
-Original Message- From: Philip Martin [mailto:philip.mar...@wandisco.com] Sent: vrijdag 27 augustus 2010 14:57 To: Bert Huijben Cc: 'Bert Huijben'; 'Greg Stein'; dev@subversion.apache.org Subject: Re: Two svn_wc__db_t for single-db upgrade Bert Huijben b...@qqmail.nl writes: But even in that case there can be different information in the parent stub and the child directory itself. That's why I want to use the database. But even then, this doesn't fix that we must be able to recover when the upgrade fails. A working copy can never have a wc.db and an entries file. So you would also have to make a svn_wc__db_t that uses something else than '.svn/wc.db', and... and... and... (you are essentially reinventing the kind of hacks we had in WC-1.0, that we try to avoid with WC-NG) I really think that it is much easier to just walk the entries files using an old style-lock, constructing a new sqlite db 'upgrade.db' somewhere outside the normal location using upgrade specific code. Inserting the node data, properties checksums, etc. while walking and adding workingqueue operations as we go for moving the pristine files in place later (and deleting unneeded administrative data). If this step fails you just have to remove the upgrade.db file, because the old working copy is still unmodified. But when this step is finished, you can destroy the top level wcroot and install the upgrade.db as new wc.db file (actions: move wc.db in place and then delete the entries file). You can then fix the rest of the working copy by just running the working queue, which will remove all the 1.6 like markings from all the subdirs and fix the pristine store before the db is ready for usage (normal patter). This step is restartable by calling svn cleanup on the root. Bert
Re: Two svn_wc__db_t for single-db upgrade
Bert Huijben b...@qqmail.nl writes: In case of a delete of copy you can have BASE normal (checked out N levels up) NODE_DATA normal (descendant of copy 2 levels up) NODE_DATA normal (child of copy 1 level up) WORKING: deleted (node itself) _read_info() will give you the information from working (It is deleted). And _base_get_info() shows you there was a base in the repository. So you see a normal base delete using the wc-db api. But actually you have 3 deletes in one place (one inherited). The current DB code can only represent the information you get from _read_info and _base_get_info(), so the rest is lost. svn_wc_entry_t however can represent one intermediate layer. Not the two from this example, but just one. (The other case cannot be created by libsvn_wc. It will say that you have to commit first). That shows how complicated the update process can be. I'm not sure how it answers my question about why it's wrong to query the database. But even in that case there can be different information in the parent stub and the child directory itself. If I query the database I don't have to worry about parent stubs. There aren't any. I'm not proposing to change any APIs. I introduce a new API to allow the creation of svn_wc__db_t that ignores 1.6 .svn directories when constructing pdhs. That's the only change. It allows the existing upgrade code to work in single-db just as it works in multi-db, with the same features and bugs. So, we have to add at least one IF that always evaluates to one case for all future Subversion versions except for upgrading for = 1.6? In my book, that is changing APIs just for upgrade. The code that I want to change already understands 1.6 format. This is the change in svn_wc__db_pdh_parse_local_abspath: Index: ../src/subversion/libsvn_wc/wc_db_pdh.c === --- ../src/subversion/libsvn_wc/wc_db_pdh.c (revision 990126) +++ ../src/subversion/libsvn_wc/wc_db_pdh.c (working copy) @@ -526,7 +526,7 @@ five ancetors higher. We don't test for directory presence (just for the presence of subdirs/files), so we don't know when we can stop checking ... so just check always. */ - if (!moved_upwards || always_check) + if (!db-ignore_1_6 (!moved_upwards || always_check)) { SVN_ERR(get_old_version(wc_format, local_abspath, scratch_pool)); if (wc_format != 0) The ignore_1_6 flag is new, and I need a way to set it, but that's it. When the flag is set the wc_db code is almost the same, it just won't detect old 1.6 working copies. Just had a thought. Do you think I am proposing that we extract svn_wc_entry_t structures from the database during upgrade? That's not what I am proposing. Upgrade will read the 1.6 .svn/entries to get svn_wc_entry_t structires, but when we need information about the parent we use the existing wc-ng API to get things like svn_wc__db_status_t. No, (No!) I already suggested dropping all upgrade support for formats 13-(SINGLE_DB-1) from libsvn_wc and leave that to some hacky python script. Why would you ever want to use the entry read code via wc__db_t? You can just read the entry files recursively and insert in the db as needed. The wc_db api doesn't have to care about entries files at all; only for detecting that WCs must be upgraded. I still don't understand your objections. I am not proposing that svn_wc__db_t be used to read entries. The upgrade process work as follows: - read the 1.6 entries for a directory into svn_wc_entry_t in memory - write nodes into the SQLite db for the directory and immediate children - cleanup the old 1.6 files - move on to the subdirectories If we were to move the entries file after reading it but before writing to the database then the single-db upgrade code would work. The directory being upgraded would not look like a 1.6 directory and the pdh code would skip over it an find the root. This is exactly what happens once the old 1.6 files have been cleaned and we move on to the subdirectories. Moving the entries file is not a good solution because an interrupted upgrade would be harder to handle. The change I want to make has the same effect as moving the entries file: once we have started upgrading the directory then give me a way to treat it as a wc-ng directory not a 1.6 directory. Another way to achieve what I want is to provide a way to insert an appropriate pdh: svn_error_t * svn_wc__db_pdh_skip_1_6(svn_wc__db_t *db, const char *local_abspath, apr_pool_t *scratch_pool) { svn_wc__db_pdh_t *pdh; const char *parent_abspath = svn_dirent_dirname(local_abspath, scratch_pool); svn_wc__db_pdh_t *parent_pdh = apr_hash_get(db-dir_data, parent_abspath, APR_HASH_KEY_STRING); SVN_ERR_ASSERT(parent_pdh);
Re: Two svn_wc__db_t for single-db upgrade
Bert Huijben b...@qqmail.nl writes: I really think that it is much easier to just walk the entries files using an old style-lock, constructing a new sqlite db 'upgrade.db' somewhere outside the normal location using upgrade specific code. That might be another way to do it. If we construct a temporary 1.7 root within the 1.6 root, something like: .svn/tmp/qyh2h2n2/.svn/wc.db then we should be able to query that database by converting a path /some/wc/A/B by to /some/wc/.svn/tmp/qyh2h2n2/A/B. -- Philip
Re: Two svn_wc__db_t for single-db upgrade
Greg Stein gst...@gmail.com writes: Back up a step. *What* data do you need to query? Maybe there is a more direct solution. Upgrade calls _scan_addition on the parent when writing a node, see entries.c:write_entry. I very much dislike a special mode for wc_db. It just screams hack. If I put the new database it in .svn/wcng/.svn/wc.db, in the root of the working copy, then I no longer need the special mode. When the upgrade is complete I move it to .svn/wc.db. -- Philip
Re: Two svn_wc__db_t for single-db upgrade
Back up a step. *What* data do you need to query? Maybe there is a more direct solution. I very much dislike a special mode for wc_db. It just screams hack. On Aug 27, 2010 10:07 AM, Philip Martin philip.mar...@wandisco.com wrote: Bert Huijben b...@qqmail.nl writes: I really think that it is much easier to just walk the entries files using an old style-lock, constructing a new sqlite db 'upgrade.db' somewhere outside the normal location using upgrade specific code. That might be another way to do it. If we construct a temporary 1.7 root within the 1.6 root, something like: .svn/tmp/qyh2h2n2/.svn/wc.db then we should be able to query that database by converting a path /some/wc/A/B by to /some/wc/.svn/tmp/qyh2h2n2/A/B. -- Philip
Re: Two svn_wc__db_t for single-db upgrade
On Fri, Aug 27, 2010 at 12:32 PM, Stefan Sperling s...@elego.de wrote: On Fri, Aug 27, 2010 at 12:03:04PM -0400, Bob Archer wrote: I'm just talking as a user here... and not an svn dev... but do you really need to spend time on a 1.6 to 1.7 WC upgrade? Why not just have 1.7 not work with 1.7 WCs and tell the users they need to do a new checkout with 1.7. I mean... it might annoy some people, but I just think that the svn dev team would have that much more time to work on the real features/functionality of 1.7. I'm sure upgrades from WC-NG to WC-NG.Next will be much simpler and can/should still be included. You're not alone. I just raised the same question in the dev IRC channel. There's value in the upgrade capability for users who have many and large working copies. But I also think that we shouldn't let weeks of developer time sink into this feature. I think that getting the simple upgrade cases working (no local mods, no conflicts) would already make many people happy. We've already punted on some aspect of upgradability (no pending logs, for instance), and it may be valuable to shift the line further toward the common case. You have local mods? Better commit 'em or make a patch before upgrading. -Hyrum
Re: Two svn_wc__db_t for single-db upgrade
On Fri, 2010-08-27 at 12:46 -0400, Hyrum K. Wright wrote: On Fri, Aug 27, 2010 at 12:32 PM, Stefan Sperling s...@elego.de wrote: On Fri, Aug 27, 2010 at 12:03:04PM -0400, Bob Archer wrote: I'm just talking as a user here... and not an svn dev... but do you really need to spend time on a 1.6 to 1.7 WC upgrade? Why not just have 1.7 not work with 1.7 WCs and tell the users they need to do a new checkout with 1.7. I mean... it might annoy some people, but I just think that the svn dev team would have that much more time to work on the real features/functionality of 1.7. I'm sure upgrades from WC-NG to WC-NG.Next will be much simpler and can/should still be included. You're not alone. I just raised the same question in the dev IRC channel. There's value in the upgrade capability for users who have many and large working copies. But I also think that we shouldn't let weeks of developer time sink into this feature. I think that getting the simple upgrade cases working (no local mods, no conflicts) would already make many people happy. We've already punted on some aspect of upgradability (no pending logs, for instance), and it may be valuable to shift the line further toward the common case. You have local mods? Better commit 'em or make a patch before upgrading. The trouble is, people often won't find out until some time after they've upgraded, especially if it's a WC they aren't working on at the moment and they try to come back to work on it some weeks later. And for most people un-upgrading in order to do fix it isn't a practical option. - Julian
Re: Two svn_wc__db_t for single-db upgrade
On Fri, Aug 27, 2010 at 12:03:04PM -0400, Bob Archer wrote: I'm just talking as a user here... and not an svn dev... but do you really need to spend time on a 1.6 to 1.7 WC upgrade? Why not just have 1.7 not work with 1.7 WCs and tell the users they need to do a new checkout with 1.7. I mean... it might annoy some people, but I just think that the svn dev team would have that much more time to work on the real features/functionality of 1.7. I'm sure upgrades from WC-NG to WC-NG.Next will be much simpler and can/should still be included. You're not alone. I just raised the same question in the dev IRC channel. There's value in the upgrade capability for users who have many and large working copies. But I also think that we shouldn't let weeks of developer time sink into this feature. I think that getting the simple upgrade cases working (no local mods, no conflicts) would already make many people happy. Stefan
RE: Two svn_wc__db_t for single-db upgrade
Back up a step. *What* data do you need to query? Maybe there is a more direct solution. I very much dislike a special mode for wc_db. It just screams hack. On Aug 27, 2010 10:07 AM, Philip Martin philip.mar...@wandisco.com wrote: Bert Huijben b...@qqmail.nl writes: I really think that it is much easier to just walk the entries files using an old style-lock, constructing a new sqlite db 'upgrade.db' somewhere outside the normal location using upgrade specific code. That might be another way to do it. If we construct a temporary 1.7 root within the 1.6 root, something like: .svn/tmp/qyh2h2n2/.svn/wc.db then we should be able to query that database by converting a path /some/wc/A/B by to /some/wc/.svn/tmp/qyh2h2n2/A/B. -- Philip I'm just talking as a user here... and not an svn dev... but do you really need to spend time on a 1.6 to 1.7 WC upgrade? Why not just have 1.7 not work with 1.7 WCs and tell the users they need to do a new checkout with 1.7. I mean... it might annoy some people, but I just think that the svn dev team would have that much more time to work on the real features/functionality of 1.7. I'm sure upgrades from WC-NG to WC-NG.Next will be much simpler and can/should still be included. BOb
Re: Two svn_wc__db_t for single-db upgrade
On Fri, Aug 27, 2010 at 05:54:38PM +0100, Julian Foad wrote: The trouble is, people often won't find out until some time after they've upgraded, especially if it's a WC they aren't working on at the moment and they try to come back to work on it some weeks later. And for most people un-upgrading in order to do fix it isn't a practical option. That's reality, but not exactly best-practice. If your local mods have been lying in the working copy for weeks, they're probably not that important. If it's important, it should be checked in, or backed up in patch, or something. But yes, I can see how communicating this problem in a large organisation can be hard. The admin installs the new software, working copies stop working, and weeks later they still get support calls about subversion being broken. Still, we should dicide on a reasonable feature set for 1.6-1.7 working copy upgrades, and try not to get carried away by the huge amount of complicated details of this problem. Stefan
RE: Two svn_wc__db_t for single-db upgrade
On Fri, Aug 27, 2010 at 05:54:38PM +0100, Julian Foad wrote: The trouble is, people often won't find out until some time after they've upgraded, especially if it's a WC they aren't working on at the moment and they try to come back to work on it some weeks later. And for most people un-upgrading in order to do fix it isn't a practical option. That's reality, but not exactly best-practice. If your local mods have been lying in the working copy for weeks, they're probably not that important. If it's important, it should be checked in, or backed up in patch, or something. But yes, I can see how communicating this problem in a large organisation can be hard. The admin installs the new software, working copies stop working, and weeks later they still get support calls about subversion being broken. Still, we should dicide on a reasonable feature set for 1.6-1.7 working copy upgrades, and try not to get carried away by the huge amount of complicated details of this problem. Stefan Is there any way the 1.6 revert/cleanup/patch stuff can be left in 1.7? So if you have a 1.6 repo those are the only commands you can use on it? Or are there just to many dependencies? I do agree, if you have pending changes siting in a WC for weeks.. you probably don't need em. ;) Or, if not, the user can do a new checkout, and then use a compare tool to apply your pending changes to your new WC. This means, don't auto-update a WC that has pending changes in it. BOb
Re: Two svn_wc__db_t for single-db upgrade
On Fri, Aug 27, 2010 at 01:20:31PM -0400, Bob Archer wrote: Or, if not, the user can do a new checkout, and then use a compare tool to apply your pending changes to your new WC. This means, don't auto-update a WC that has pending changes in it. There won't be any auto-update, I think. The plan so far was that users will have to issue an svn upgrade command to upgrade the working copy. BTW, anyone working on upgrade, please don't feel discouraged by my comments to make it as good as you can. It's definitely a nice feature to have, so if you want to spend time on it, please do so. I only meant to suggest to keep the cost/benefit ratio in mind -- which I guess varies between people, depending on which use cases are expected to work. Stefan
Re: Two svn_wc__db_t for single-db upgrade
On 28.08.2010 02:37, Stefan Sperling wrote: On Fri, Aug 27, 2010 at 01:20:31PM -0400, Bob Archer wrote: Or, if not, the user can do a new checkout, and then use a compare tool to apply your pending changes to your new WC. This means, don't auto-update a WC that has pending changes in it. There won't be any auto-update, I think. The plan so far was that users will have to issue an svn upgrade command to upgrade the working copy. Hmmm, in that case I understand even less why this feature should be part of the regular svn binary. The point is that if we put svn upgrade in, we have to maintain it indefinitely in the 1.x version set, or at the very least until we release 1.9 and drop support for 1.7; and we have to retain any special functionality that the feature relies on. So I ask again, why not make this a separate program? -- Brane
RE: Two svn_wc__db_t for single-db upgrade
-Original Message- From: Philip Martin [mailto:philip.mar...@wandisco.com] Sent: donderdag 26 augustus 2010 16:34 To: dev@subversion.apache.org Subject: Two svn_wc__db_t for single-db upgrade One of the problems with single-db upgrade is that write_entry, called from svn_wc__write_upgraded_entries, want's to be able to query the new database using things like svn_wc__db_scan_addition. This fails because svn_wc__db_pdh_parse_local_abspath encounters old .svn dirs and creates pdhs with the wrong wcroot. One way to fix this would be to revamp write_entry so that it doesn't need to query the existing database, but that would involve caching in memory some of the stuff we write to the database. I think this code needs revamping anyway, to properly calculate op_depths and other things we need for the 4th tree that will never be returned by just calling read_info(). I think the upgrade code should just read the entries in a directory and then upgrade the nodes it sees, passing a chain of parent entries (or similar) so you never have to rely on the intermediate DB state while loading. (This also avoids actively maintaining the upgrade code in future versions). The normal recursive iterpool pattern should keep the amount of in memory data manageable. (Shouldn't be much more than an old svn_wc_walk_entries call + the upgrade state). An alternative is to use two svn_wc__db_t handles and arrange for one of them to ignore the old .svn directories. The following patch does this and passes the upgrade regression tests (admittedly not much of a test). Does this sound like a good approach? Thinking about the 4th tree, I think this will work now, but break in only a few weeks. A slightly easier variant (that needs less hacks) might be to create the wc.db in a temp directory, like you recently implemented for 'svn copy wc-dir wc-dir2'. Bert
Re: Two svn_wc__db_t for single-db upgrade
I'm with Bert. The entry writing is used *only* for upgrades. It may as well be tuned for exactly that: track any information you need while performing the upgrade. Also remember that the wc.db file that is being constructed during the upgrade process really should be called something like wc.db.upgrade (see SDB_FILE and SDB_FILE_UPGRADE). We just aren't doing that right now. And what *that* means is that the wc_db functions will not work during the upgrade process. And really: you shouldn't rely on them. You're in a transient state, trying to build a proper database. Cheers, -g On Thu, Aug 26, 2010 at 10:58, Bert Huijben b...@vmoo.com wrote: -Original Message- From: Philip Martin [mailto:philip.mar...@wandisco.com] Sent: donderdag 26 augustus 2010 16:34 To: dev@subversion.apache.org Subject: Two svn_wc__db_t for single-db upgrade One of the problems with single-db upgrade is that write_entry, called from svn_wc__write_upgraded_entries, want's to be able to query the new database using things like svn_wc__db_scan_addition. This fails because svn_wc__db_pdh_parse_local_abspath encounters old .svn dirs and creates pdhs with the wrong wcroot. One way to fix this would be to revamp write_entry so that it doesn't need to query the existing database, but that would involve caching in memory some of the stuff we write to the database. I think this code needs revamping anyway, to properly calculate op_depths and other things we need for the 4th tree that will never be returned by just calling read_info(). I think the upgrade code should just read the entries in a directory and then upgrade the nodes it sees, passing a chain of parent entries (or similar) so you never have to rely on the intermediate DB state while loading. (This also avoids actively maintaining the upgrade code in future versions). The normal recursive iterpool pattern should keep the amount of in memory data manageable. (Shouldn't be much more than an old svn_wc_walk_entries call + the upgrade state). An alternative is to use two svn_wc__db_t handles and arrange for one of them to ignore the old .svn directories. The following patch does this and passes the upgrade regression tests (admittedly not much of a test). Does this sound like a good approach? Thinking about the 4th tree, I think this will work now, but break in only a few weeks. A slightly easier variant (that needs less hacks) might be to create the wc.db in a temp directory, like you recently implemented for 'svn copy wc-dir wc-dir2'. Bert
Re: Two svn_wc__db_t for single-db upgrade
Greg Stein gst...@gmail.com writes: I'm with Bert. The entry writing is used *only* for upgrades. It may as well be tuned for exactly that: track any information you need while performing the upgrade. I realise we can do that, but I don't see why it's better. It means creating/using our own database in memory. Why is that better than using the real database we have just written? It means duplicating part of of _scan_addition. Also remember that the wc.db file that is being constructed during the upgrade process really should be called something like wc.db.upgrade (see SDB_FILE and SDB_FILE_UPGRADE). We just aren't doing that right now. And what *that* means is that the wc_db functions will not work But that's not hard to fix, when I create the second handle that ignores 1.6 I can tell it to use the upgrade name. during the upgrade process. And really: you shouldn't rely on them. You're in a transient state, trying to build a proper database. You and Bert have both mentioned transient state. What problem do you see? Why should we be creating some invalid, transient state? We write parents before children, we don't subsequently tweak the parent, the parent should be correct before we start on the children. If _scan_addition gives us the wrong answer during the upgrade how is it going to work when the upgrade is finished? [I'm aware that we don't add incomplete children when we add a complete parent, but the children don't care about siblings. And it should be easy to fix.] -- Philip
Re: Two svn_wc__db_t for single-db upgrade
Philip Martin philip.mar...@wandisco.com writes: [I'm aware that we don't add incomplete children when we add a complete parent, but the children don't care about siblings. And it should be easy to fix.] Turns out we do this the right way. We add a parent, and incomplete directory children (plus any files) in a single transaction, then we move on each directory child and do the same again. -- Philip
RE: Two svn_wc__db_t for single-db upgrade
-Original Message- From: Philip Martin [mailto:philip.mar...@wandisco.com] Sent: donderdag 26 augustus 2010 21:33 To: Greg Stein Cc: Bert Huijben; dev@subversion.apache.org Subject: Re: Two svn_wc__db_t for single-db upgrade Philip Martin philip.mar...@wandisco.com writes: [I'm aware that we don't add incomplete children when we add a complete parent, but the children don't care about siblings. And it should be easy to fix.] Turns out we do this the right way. We add a parent, and incomplete directory children (plus any files) in a single transaction, then we move on each directory child and do the same again. Yes, we do this *right*, but only in the *easy* cases. The hard cases, like missing and obstructions of metadata are not handled and cannot be handled by the single-db WC-DB api as these cannot occur there . (There are no tests for this, and anything that looks like a test for this is disabled for some 4th tree reason). You can't even see that data is missing from most of the wc-db apis at they fall back to the parent stub. (And the wc-db api doesn't allow annotating a node that it misses some information) I don't think we can ever handle all the upgrade state with just the single-db optimized API. E.g. by just using the API you can't look through child-of-copy deletions, nor do we need any apis for this for 1.7. But svn_wc_entry_t just has this information directly. So you are suggesting that we change the DB API's to provide this information (or keep providing this multi-db specific information like the obstruction statee that really have to be removed to get back at 1.6 speed), while we will never need these functions and statee after this conversion to single-db? Using the db api for intermediate storage during update is making things harder for future api users that never want to look back at the pre-single-db era. And it slows down common code paths just to support upgrading from a version that is hopefully ancient for us in a few months. And then we can only fix the api by fixing this problem later... or by moving to wc-ng-NG. Bert -- Philip
Re: Two svn_wc__db_t for single-db upgrade
On 26.08.2010 22:00, Bert Huijben wrote: -Original Message- From: Philip Martin [mailto:philip.mar...@wandisco.com] Sent: donderdag 26 augustus 2010 21:33 To: Greg Stein Cc: Bert Huijben; dev@subversion.apache.org Subject: Re: Two svn_wc__db_t for single-db upgrade Philip Martin philip.mar...@wandisco.com writes: [I'm aware that we don't add incomplete children when we add a complete parent, but the children don't care about siblings. And it should be easy to fix.] Turns out we do this the right way. We add a parent, and incomplete directory children (plus any files) in a single transaction, then we move on each directory child and do the same again. Yes, we do this *right*, but only in the *easy* cases. The hard cases, like missing and obstructions of metadata are not handled and cannot be handled by the single-db WC-DB api as these cannot occur there . (There are no tests for this, and anything that looks like a test for this is disabled for some 4th tree reason). You can't even see that data is missing from most of the wc-db apis at they fall back to the parent stub. (And the wc-db api doesn't allow annotating a node that it misses some information) I don't think we can ever handle all the upgrade state with just the single-db optimized API. E.g. by just using the API you can't look through child-of-copy deletions, nor do we need any apis for this for 1.7. But svn_wc_entry_t just has this information directly. So you are suggesting that we change the DB API's to provide this information (or keep providing this multi-db specific information like the obstruction statee that really have to be removed to get back at 1.6 speed), while we will never need these functions and statee after this conversion to single-db? Using the db api for intermediate storage during update is making things harder for future api users that never want to look back at the pre-single-db era. And it slows down common code paths just to support upgrading from a version that is hopefully ancient for us in a few months. And then we can only fix the api by fixing this problem later... or by moving to wc-ng-NG. I'm beginning to wonder what all this fuss is about, though. We're talking about converting the working copy ... and converting only on one direction, from multiple to single-db WC, never the other way. First of all let's not loose sight of the fact that a WC can be reconstructed from the repository. But since that's not always a welcome alternative, does the upgrade process really need to handle all the zillions of half-baked states that a working copy can have? And another question, does the upgrade functionality really belong into the svn binary, or even libsvn_wc? Because there's sure to be functionality in there that will only ever be used by the wc-db upgrade, i.e., only in versions 1.7 and 1.8; so why not make the upgrader a separate binary, with the messy parts of the DB handling done there instead of polluting the longer-lived part of the code? (Also I'm not quite clear here on the feature set for 1.7 ... if we intend to release with single-db support, then why support multi-db at all, except perhaps for detaching parts of the working copy?) -- Brane