Re: Two svn_wc__db_t for single-db upgrade

2010-08-27 Thread Philip Martin
Bert Huijben b...@vmoo.com writes:

 The hard cases, like missing and obstructions of metadata are not handled
 and cannot be handled by the single-db WC-DB api as these cannot occur there
 . (There are no tests for this, and anything that looks like a test for this
 is disabled for some 4th tree reason). You can't even see that data is
 missing from most of the wc-db apis at they fall back to the parent stub.
 (And the wc-db api doesn't allow annotating a node that it misses some
 information)

 I don't think we can ever handle all the upgrade state with just the
 single-db optimized API.
 E.g. by just using the API you can't look through child-of-copy deletions,
 nor do we need any apis for this for 1.7. But svn_wc_entry_t just has this
 information directly.

The upgrade code only ever needs to use the database to look at
parents.  If the parent information in the database is not correct
then the upgrade has already failed.

I don't understand your child-of-copy deletion example.  That's one
case the the existing multi-db upgrade code gets wrong, it creates a
base_node where there should only be a working_node.  It gets it wrong
because svn_wc_entry_t does not have the copied information.  The
information is in the database and can be obtained by calling
_read_info.  Or we could make our own in-memory database of entries
and query that, but I don't see why that is better.

 So you are suggesting that we change the DB API's to provide this
 information (or keep providing this multi-db specific information like the
 obstruction statee that really have to be removed to get back at 1.6 speed),
 while we will never need these functions and statee after this conversion to
 single-db?

 Using the db api for intermediate storage during update is making things
 harder for future api users that never want to look back at the
 pre-single-db era. 

 And it slows down common code paths just to support upgrading from a version
 that is hopefully ancient for us in a few months.

 And then we can only fix the api by fixing this problem later... or by
 moving to wc-ng-NG.

I'm not proposing to change any APIs. I introduce a new API to allow
the creation of svn_wc__db_t that ignores 1.6 .svn directories when
constructing pdhs.  That's the only change.  It allows the existing
upgrade code to work in single-db just as it works in multi-db, with
the same features and bugs.

Just had a thought.  Do you think I am proposing that we extract
svn_wc_entry_t structures from the database during upgrade?  That's
not what I am proposing.  Upgrade will read the 1.6 .svn/entries to
get svn_wc_entry_t structires, but when we need information about the
parent we use the existing wc-ng API to get things like
svn_wc__db_status_t.

-- 
Philip


RE: Two svn_wc__db_t for single-db upgrade

2010-08-27 Thread Bert Huijben


 -Original Message-
 From: Philip Martin [mailto:philip.mar...@wandisco.com]
 Sent: vrijdag 27 augustus 2010 11:50
 To: Bert Huijben
 Cc: 'Greg Stein'; dev@subversion.apache.org
 Subject: Re: Two svn_wc__db_t for single-db upgrade
 
 Bert Huijben b...@vmoo.com writes:
 
  The hard cases, like missing and obstructions of metadata are not
 handled
  and cannot be handled by the single-db WC-DB api as these cannot
 occur there
  . (There are no tests for this, and anything that looks like a test
 for this
  is disabled for some 4th tree reason). You can't even see that data
 is
  missing from most of the wc-db apis at they fall back to the parent
 stub.
  (And the wc-db api doesn't allow annotating a node that it misses
 some
  information)
 
  I don't think we can ever handle all the upgrade state with just the
  single-db optimized API.
  E.g. by just using the API you can't look through child-of-copy
 deletions,
  nor do we need any apis for this for 1.7. But svn_wc_entry_t just has
 this
  information directly.
 
 The upgrade code only ever needs to use the database to look at
 parents.  If the parent information in the database is not correct
 then the upgrade has already failed.
 
 I don't understand your child-of-copy deletion example.  That's one
 case the the existing multi-db upgrade code gets wrong, it creates a
 base_node where there should only be a working_node.  It gets it wrong
 because svn_wc_entry_t does not have the copied information.  The
 information is in the database and can be obtained by calling
 _read_info.  Or we could make our own in-memory database of entries
 and query that, but I don't see why that is better.

In case of a delete of copy you can have

BASE normal (checked out N levels up)
NODE_DATA normal (descendant of copy 2 levels up)
NODE_DATA normal (child of copy 1 level up)
WORKING: deleted (node itself)

_read_info() will give you the information from working (It is deleted). And
_base_get_info() shows you there was a base in the repository. So you see a
normal base delete using the wc-db api. But actually you have 3 deletes in
one place (one inherited).

The current DB code can only represent the information you get from
_read_info and _base_get_info(), so the rest is lost.

svn_wc_entry_t 
however can represent one intermediate layer. Not the two from this example,
but just one. (The other case cannot be created by libsvn_wc. It will say
that you have to commit first).

But even in that case there can be different information in the parent stub
and the child directory itself.

  So you are suggesting that we change the DB API's to provide this
  information (or keep providing this multi-db specific information
 like the
  obstruction statee that really have to be removed to get back at 1.6
 speed),
  while we will never need these functions and statee after this
 conversion to
  single-db?
 
  Using the db api for intermediate storage during update is making
 things
  harder for future api users that never want to look back at the
  pre-single-db era.
 
  And it slows down common code paths just to support upgrading from a
 version
  that is hopefully ancient for us in a few months.
 
  And then we can only fix the api by fixing this problem later... or
 by
  moving to wc-ng-NG.
 
 I'm not proposing to change any APIs. I introduce a new API to allow
 the creation of svn_wc__db_t that ignores 1.6 .svn directories when
 constructing pdhs.  That's the only change.  It allows the existing
 upgrade code to work in single-db just as it works in multi-db, with
 the same features and bugs.

So, we have to add at least one IF that always evaluates to one case for all
future Subversion versions except for upgrading for = 1.6?

In my book, that is changing APIs just for upgrade.

 Just had a thought.  Do you think I am proposing that we extract
 svn_wc_entry_t structures from the database during upgrade?  That's
 not what I am proposing.  Upgrade will read the 1.6 .svn/entries to
 get svn_wc_entry_t structires, but when we need information about the
 parent we use the existing wc-ng API to get things like
 svn_wc__db_status_t.

No, (No!)

I already suggested dropping all upgrade support for formats
13-(SINGLE_DB-1) from libsvn_wc and leave that to some hacky python script. 

Why would you ever want to use the entry read code via wc__db_t?

You can just read the entry files recursively and insert in the db as
needed. The wc_db api doesn't have to care about entries files at all; only
for detecting that WCs must be upgraded.

Bert



Re: Two svn_wc__db_t for single-db upgrade

2010-08-27 Thread Philip Martin
Bert Huijben b...@qqmail.nl writes:

 But even in that case there can be different information in the parent stub
 and the child directory itself.

That's why I want to use the database.


  So you are suggesting that we change the DB API's to provide this
  information (or keep providing this multi-db specific information
 like the
  obstruction statee that really have to be removed to get back at 1.6
 speed),
  while we will never need these functions and statee after this
 conversion to
  single-db?
 
  Using the db api for intermediate storage during update is making
 things
  harder for future api users that never want to look back at the
  pre-single-db era.
 
  And it slows down common code paths just to support upgrading from a
 version
  that is hopefully ancient for us in a few months.
 
  And then we can only fix the api by fixing this problem later... or
 by
  moving to wc-ng-NG.
 
 I'm not proposing to change any APIs. I introduce a new API to allow
 the creation of svn_wc__db_t that ignores 1.6 .svn directories when
 constructing pdhs.  That's the only change.  It allows the existing
 upgrade code to work in single-db just as it works in multi-db, with
 the same features and bugs.

 So, we have to add at least one IF that always evaluates to one case for all
 future Subversion versions except for upgrading for = 1.6?

 In my book, that is changing APIs just for upgrade.

 Just had a thought.  Do you think I am proposing that we extract
 svn_wc_entry_t structures from the database during upgrade?  That's
 not what I am proposing.  Upgrade will read the 1.6 .svn/entries to
 get svn_wc_entry_t structires, but when we need information about the
 parent we use the existing wc-ng API to get things like
 svn_wc__db_status_t.

   No, (No!)

 I already suggested dropping all upgrade support for formats
 13-(SINGLE_DB-1) from libsvn_wc and leave that to some hacky python script. 

 Why would you ever want to use the entry read code via wc__db_t?

 You can just read the entry files recursively and insert in the db as
 needed. The wc_db api doesn't have to care about entries files at all; only
 for detecting that WCs must be upgraded.

   Bert


-- 
Philip


RE: Two svn_wc__db_t for single-db upgrade

2010-08-27 Thread Bert Huijben


 -Original Message-
 From: Philip Martin [mailto:philip.mar...@wandisco.com]
 Sent: vrijdag 27 augustus 2010 14:57
 To: Bert Huijben
 Cc: 'Bert Huijben'; 'Greg Stein'; dev@subversion.apache.org
 Subject: Re: Two svn_wc__db_t for single-db upgrade
 
 Bert Huijben b...@qqmail.nl writes:
 
  But even in that case there can be different information in the
 parent stub
  and the child directory itself.
 
 That's why I want to use the database.

But even then, this doesn't fix that we must be able to recover when the
upgrade fails.

A working copy can never have a wc.db and an entries file.

So you would also have to make a svn_wc__db_t that uses something else than
'.svn/wc.db', and... and... and... (you are essentially reinventing the kind
of hacks we had in WC-1.0, that we try to avoid with WC-NG)


I really think that it is much easier to just walk the entries files using
an old style-lock, constructing a new sqlite db 'upgrade.db' somewhere
outside the normal location using upgrade specific code. Inserting the node
data, properties checksums, etc. while walking and adding workingqueue
operations as we go for moving the pristine files in place later (and
deleting unneeded administrative data).

If this step fails you just have to remove the upgrade.db file, because the
old working copy is still unmodified.


But when this step is finished, you can destroy the top level wcroot and
install the upgrade.db as new wc.db file (actions: move wc.db in place and
then delete the entries file). You can then fix the rest of the working copy
by just running the working queue, which will remove all the 1.6 like
markings from all the subdirs and fix the pristine store before the db is
ready for usage (normal patter).

This step is restartable by calling svn cleanup on the root.

Bert



Re: Two svn_wc__db_t for single-db upgrade

2010-08-27 Thread Philip Martin
Bert Huijben b...@qqmail.nl writes:

 In case of a delete of copy you can have

 BASE normal (checked out N levels up)
 NODE_DATA normal (descendant of copy 2 levels up)
 NODE_DATA normal (child of copy 1 level up)
 WORKING: deleted (node itself)

 _read_info() will give you the information from working (It is deleted). And
 _base_get_info() shows you there was a base in the repository. So you see a
 normal base delete using the wc-db api. But actually you have 3 deletes in
 one place (one inherited).

 The current DB code can only represent the information you get from
 _read_info and _base_get_info(), so the rest is lost.

 svn_wc_entry_t 
 however can represent one intermediate layer. Not the two from this example,
 but just one. (The other case cannot be created by libsvn_wc. It will say
 that you have to commit first).

That shows how complicated the update process can be.  I'm not sure
how it answers my question about why it's wrong to query the database.

 But even in that case there can be different information in the parent stub
 and the child directory itself.

If I query the database I don't have to worry about parent stubs.
There aren't any.

 I'm not proposing to change any APIs. I introduce a new API to allow
 the creation of svn_wc__db_t that ignores 1.6 .svn directories when
 constructing pdhs.  That's the only change.  It allows the existing
 upgrade code to work in single-db just as it works in multi-db, with
 the same features and bugs.

 So, we have to add at least one IF that always evaluates to one case for all
 future Subversion versions except for upgrading for = 1.6?

 In my book, that is changing APIs just for upgrade.

The code that I want to change already understands 1.6 format.  This
is the change in svn_wc__db_pdh_parse_local_abspath:

Index: ../src/subversion/libsvn_wc/wc_db_pdh.c
===
--- ../src/subversion/libsvn_wc/wc_db_pdh.c (revision 990126)
+++ ../src/subversion/libsvn_wc/wc_db_pdh.c (working copy)
@@ -526,7 +526,7 @@
  five ancetors higher. We don't test for directory presence (just
  for the presence of subdirs/files), so we don't know when we can
  stop checking ... so just check always.  */
-  if (!moved_upwards || always_check)
+  if (!db-ignore_1_6  (!moved_upwards || always_check))
 {
   SVN_ERR(get_old_version(wc_format, local_abspath, scratch_pool));
   if (wc_format != 0)

The ignore_1_6 flag is new, and I need a way to set it, but that's it.
When the flag is set the wc_db code is almost the same, it just won't
detect old 1.6 working copies.

 Just had a thought.  Do you think I am proposing that we extract
 svn_wc_entry_t structures from the database during upgrade?  That's
 not what I am proposing.  Upgrade will read the 1.6 .svn/entries to
 get svn_wc_entry_t structires, but when we need information about the
 parent we use the existing wc-ng API to get things like
 svn_wc__db_status_t.

   No, (No!)

 I already suggested dropping all upgrade support for formats
 13-(SINGLE_DB-1) from libsvn_wc and leave that to some hacky python script. 

 Why would you ever want to use the entry read code via wc__db_t?

 You can just read the entry files recursively and insert in the db as
 needed. The wc_db api doesn't have to care about entries files at all; only
 for detecting that WCs must be upgraded.

I still don't understand your objections.  I am not proposing that
svn_wc__db_t be used to read entries.

The upgrade process work as follows:

 - read the 1.6 entries for a directory into svn_wc_entry_t in memory
 - write nodes into the SQLite db for the directory and immediate children
 - cleanup the old 1.6 files
 - move on to the subdirectories

If we were to move the entries file after reading it but before
writing to the database then the single-db upgrade code would work.
The directory being upgraded would not look like a 1.6 directory and
the pdh code would skip over it an find the root.  This is exactly
what happens once the old 1.6 files have been cleaned and we move on
to the subdirectories.

Moving the entries file is not a good solution because an interrupted
upgrade would be harder to handle.  The change I want to make has the
same effect as moving the entries file: once we have started upgrading
the directory then give me a way to treat it as a wc-ng directory not
a 1.6 directory.

Another way to achieve what I want is to provide a way to insert an
appropriate pdh:

svn_error_t *
svn_wc__db_pdh_skip_1_6(svn_wc__db_t *db,
const char *local_abspath,
apr_pool_t *scratch_pool)
{
  svn_wc__db_pdh_t *pdh;
  const char *parent_abspath = svn_dirent_dirname(local_abspath, scratch_pool);
  svn_wc__db_pdh_t *parent_pdh = apr_hash_get(db-dir_data, parent_abspath,
  APR_HASH_KEY_STRING);
  SVN_ERR_ASSERT(parent_pdh);
  

Re: Two svn_wc__db_t for single-db upgrade

2010-08-27 Thread Philip Martin
Bert Huijben b...@qqmail.nl writes:

 I really think that it is much easier to just walk the entries files using
 an old style-lock, constructing a new sqlite db 'upgrade.db' somewhere
 outside the normal location using upgrade specific code.

That might be another way to do it.  If we construct a temporary 1.7
root within the 1.6 root, something like:

 .svn/tmp/qyh2h2n2/.svn/wc.db

then we should be able to query that database by converting a path
/some/wc/A/B by to /some/wc/.svn/tmp/qyh2h2n2/A/B.

-- 
Philip


Re: Two svn_wc__db_t for single-db upgrade

2010-08-27 Thread Philip Martin
Greg Stein gst...@gmail.com writes:

 Back up a step. *What* data do you need to query? Maybe there is a more
 direct solution.

Upgrade calls _scan_addition on the parent when writing a node, see
entries.c:write_entry.

 I very much dislike a special mode for wc_db. It just screams hack.

If I put the new database it in .svn/wcng/.svn/wc.db, in the root of
the working copy, then I no longer need the special mode.  When the
upgrade is complete I move it to .svn/wc.db.

-- 
Philip


Re: Two svn_wc__db_t for single-db upgrade

2010-08-27 Thread Greg Stein
Back up a step. *What* data do you need to query? Maybe there is a more
direct solution.

I very much dislike a special mode for wc_db. It just screams hack.

On Aug 27, 2010 10:07 AM, Philip Martin philip.mar...@wandisco.com
wrote:
 Bert Huijben b...@qqmail.nl writes:

 I really think that it is much easier to just walk the entries files
using
 an old style-lock, constructing a new sqlite db 'upgrade.db' somewhere
 outside the normal location using upgrade specific code.

 That might be another way to do it. If we construct a temporary 1.7
 root within the 1.6 root, something like:

 .svn/tmp/qyh2h2n2/.svn/wc.db

 then we should be able to query that database by converting a path
 /some/wc/A/B by to /some/wc/.svn/tmp/qyh2h2n2/A/B.

 --
 Philip


Re: Two svn_wc__db_t for single-db upgrade

2010-08-27 Thread Hyrum K. Wright
On Fri, Aug 27, 2010 at 12:32 PM, Stefan Sperling s...@elego.de wrote:
 On Fri, Aug 27, 2010 at 12:03:04PM -0400, Bob Archer wrote:
 I'm just talking as a user here... and not an svn dev... but do you
 really need to spend time on a 1.6 to 1.7 WC upgrade? Why not just
 have 1.7 not work with 1.7 WCs and tell the users they need to do a
 new checkout with 1.7. I mean... it might annoy some people, but I
 just think that the svn dev team would have that much more time to
 work on the real features/functionality of 1.7. I'm sure upgrades from
 WC-NG to WC-NG.Next will be much simpler and can/should still be
 included.

 You're not alone. I just raised the same question in the dev IRC channel.
 There's value in the upgrade capability for users who have many and large
 working copies. But I also think that we shouldn't let weeks of developer
 time sink into this feature.

 I think that getting the simple upgrade cases working (no local mods,
 no conflicts) would already make many people happy.

We've already punted on some aspect of upgradability (no pending logs,
for instance), and it may be valuable to shift the line further toward
the common case.  You have local mods?  Better commit 'em or make a
patch before upgrading.

-Hyrum


Re: Two svn_wc__db_t for single-db upgrade

2010-08-27 Thread Julian Foad
On Fri, 2010-08-27 at 12:46 -0400, Hyrum K. Wright wrote:
 On Fri, Aug 27, 2010 at 12:32 PM, Stefan Sperling s...@elego.de wrote:
  On Fri, Aug 27, 2010 at 12:03:04PM -0400, Bob Archer wrote:
  I'm just talking as a user here... and not an svn dev... but do you
  really need to spend time on a 1.6 to 1.7 WC upgrade? Why not just
  have 1.7 not work with 1.7 WCs and tell the users they need to do a
  new checkout with 1.7. I mean... it might annoy some people, but I
  just think that the svn dev team would have that much more time to
  work on the real features/functionality of 1.7. I'm sure upgrades from
  WC-NG to WC-NG.Next will be much simpler and can/should still be
  included.
 
  You're not alone. I just raised the same question in the dev IRC channel.
  There's value in the upgrade capability for users who have many and large
  working copies. But I also think that we shouldn't let weeks of developer
  time sink into this feature.
 
  I think that getting the simple upgrade cases working (no local mods,
  no conflicts) would already make many people happy.
 
 We've already punted on some aspect of upgradability (no pending logs,
 for instance), and it may be valuable to shift the line further toward
 the common case.  You have local mods?  Better commit 'em or make a
 patch before upgrading.

The trouble is, people often won't find out until some time after
they've upgraded, especially if it's a WC they aren't working on at the
moment and they try to come back to work on it some weeks later.  And
for most people un-upgrading in order to do fix it isn't a practical
option.

- Julian




Re: Two svn_wc__db_t for single-db upgrade

2010-08-27 Thread Stefan Sperling
On Fri, Aug 27, 2010 at 12:03:04PM -0400, Bob Archer wrote:
 I'm just talking as a user here... and not an svn dev... but do you
 really need to spend time on a 1.6 to 1.7 WC upgrade? Why not just
 have 1.7 not work with 1.7 WCs and tell the users they need to do a
 new checkout with 1.7. I mean... it might annoy some people, but I
 just think that the svn dev team would have that much more time to
 work on the real features/functionality of 1.7. I'm sure upgrades from
 WC-NG to WC-NG.Next will be much simpler and can/should still be
 included.

You're not alone. I just raised the same question in the dev IRC channel.
There's value in the upgrade capability for users who have many and large
working copies. But I also think that we shouldn't let weeks of developer
time sink into this feature.

I think that getting the simple upgrade cases working (no local mods,
no conflicts) would already make many people happy.

Stefan


RE: Two svn_wc__db_t for single-db upgrade

2010-08-27 Thread Bob Archer
 Back up a step. *What* data do you need to query? Maybe there is a
 more
 direct solution.
 
 I very much dislike a special mode for wc_db. It just screams
 hack.
 
 On Aug 27, 2010 10:07 AM, Philip Martin
 philip.mar...@wandisco.com
 wrote:
  Bert Huijben b...@qqmail.nl writes:
 
  I really think that it is much easier to just walk the entries
 files
 using
  an old style-lock, constructing a new sqlite db 'upgrade.db'
 somewhere
  outside the normal location using upgrade specific code.
 
  That might be another way to do it. If we construct a temporary
 1.7
  root within the 1.6 root, something like:
 
  .svn/tmp/qyh2h2n2/.svn/wc.db
 
  then we should be able to query that database by converting a
 path
  /some/wc/A/B by to /some/wc/.svn/tmp/qyh2h2n2/A/B.
 
  --
  Philip

I'm just talking as a user here... and not an svn dev... but do you really need 
to spend time on a 1.6 to 1.7 WC upgrade? Why not just have 1.7 not work with 
1.7 WCs and tell the users they need to do a new checkout with 1.7. I mean... 
it might annoy some people, but I just think that the svn dev team would have 
that much more time to work on the real features/functionality of 1.7. I'm 
sure upgrades from WC-NG to WC-NG.Next will be much simpler and can/should 
still be included.

BOb



Re: Two svn_wc__db_t for single-db upgrade

2010-08-27 Thread Stefan Sperling
On Fri, Aug 27, 2010 at 05:54:38PM +0100, Julian Foad wrote:
 The trouble is, people often won't find out until some time after
 they've upgraded, especially if it's a WC they aren't working on at the
 moment and they try to come back to work on it some weeks later.  And
 for most people un-upgrading in order to do fix it isn't a practical
 option.

That's reality, but not exactly best-practice.
If your local mods have been lying in the working copy for weeks,
they're probably not that important. If it's important, it should
be checked in, or backed up in patch, or something.

But yes, I can see how communicating this problem in a large
organisation can be hard. The admin installs the new software,
working copies stop working, and weeks later they still get support
calls about subversion being broken.

Still, we should dicide on a reasonable feature set for 1.6-1.7 working
copy upgrades, and try not to get carried away by the huge amount of
complicated details of this problem.

Stefan


RE: Two svn_wc__db_t for single-db upgrade

2010-08-27 Thread Bob Archer
 On Fri, Aug 27, 2010 at 05:54:38PM +0100, Julian Foad wrote:
  The trouble is, people often won't find out until some time after
  they've upgraded, especially if it's a WC they aren't working on
 at the
  moment and they try to come back to work on it some weeks later.
 And
  for most people un-upgrading in order to do fix it isn't a
 practical
  option.
 
 That's reality, but not exactly best-practice.
 If your local mods have been lying in the working copy for weeks,
 they're probably not that important. If it's important, it should
 be checked in, or backed up in patch, or something.
 
 But yes, I can see how communicating this problem in a large
 organisation can be hard. The admin installs the new software,
 working copies stop working, and weeks later they still get support
 calls about subversion being broken.
 
 Still, we should dicide on a reasonable feature set for 1.6-1.7
 working
 copy upgrades, and try not to get carried away by the huge amount
 of
 complicated details of this problem.
 
 Stefan

Is there any way the 1.6 revert/cleanup/patch stuff can be left in 1.7? So if 
you have a 1.6 repo those are the only commands you can use on it? Or are there 
just to many dependencies?

I do agree, if you have pending changes siting in a WC for weeks.. you probably 
don't need em. ;) 

Or, if not, the user can do a new checkout, and then use a compare tool to 
apply your pending changes to your new WC. This means, don't auto-update a WC 
that has pending changes in it.

BOb
  


Re: Two svn_wc__db_t for single-db upgrade

2010-08-27 Thread Stefan Sperling
On Fri, Aug 27, 2010 at 01:20:31PM -0400, Bob Archer wrote:
 Or, if not, the user can do a new checkout, and then use a compare
 tool to apply your pending changes to your new WC. This means, don't
 auto-update a WC that has pending changes in it.

There won't be any auto-update, I think. The plan so far was that users will
have to issue an svn upgrade command to upgrade the working copy. 

BTW, anyone working on upgrade, please don't feel discouraged by my comments
to make it as good as you can. It's definitely a nice feature to have,
so if you want to spend time on it, please do so. I only meant to
suggest to keep the cost/benefit ratio in mind -- which I guess varies
between people, depending on which use cases are expected to work.

Stefan


Re: Two svn_wc__db_t for single-db upgrade

2010-08-27 Thread Branko Čibej
 On 28.08.2010 02:37, Stefan Sperling wrote:
 On Fri, Aug 27, 2010 at 01:20:31PM -0400, Bob Archer wrote:
 Or, if not, the user can do a new checkout, and then use a compare
 tool to apply your pending changes to your new WC. This means, don't
 auto-update a WC that has pending changes in it.
 There won't be any auto-update, I think. The plan so far was that users will
 have to issue an svn upgrade command to upgrade the working copy.

Hmmm, in that case I understand even less why this feature should be
part of the regular svn binary. The point is that if we put svn
upgrade in, we have to maintain it indefinitely in the 1.x version set,
or at the very least until we release 1.9 and drop support for 1.7; and
we have to retain any special functionality that the feature relies on.

So I ask again, why not make this a separate program?

-- Brane


RE: Two svn_wc__db_t for single-db upgrade

2010-08-26 Thread Bert Huijben


 -Original Message-
 From: Philip Martin [mailto:philip.mar...@wandisco.com]
 Sent: donderdag 26 augustus 2010 16:34
 To: dev@subversion.apache.org
 Subject: Two svn_wc__db_t for single-db upgrade
 
 One of the problems with single-db upgrade is that write_entry, called
 from svn_wc__write_upgraded_entries, want's to be able to query the
 new database using things like svn_wc__db_scan_addition.  This fails
 because svn_wc__db_pdh_parse_local_abspath encounters old .svn dirs
 and creates pdhs with the wrong wcroot.
 
 One way to fix this would be to revamp write_entry so that it doesn't
 need to query the existing database, but that would involve caching in
 memory some of the stuff we write to the database.

I think this code needs revamping anyway, to properly calculate op_depths
and other things we need for the 4th tree that will never be returned by
just calling read_info().

I think the upgrade code should just read the entries in a directory and
then upgrade the nodes it sees, passing a chain of parent entries (or
similar) so you never have to rely on the intermediate DB state while
loading. (This also avoids actively maintaining the upgrade code in future
versions).

The normal recursive iterpool pattern should keep the amount of in memory
data manageable. (Shouldn't be much more than an old svn_wc_walk_entries
call + the upgrade state).

 An alternative is to use two svn_wc__db_t handles and arrange for one
 of them to ignore the old .svn directories. The following patch does
 this and passes the upgrade regression tests (admittedly not much of a
 test).  Does this sound like a good approach?

Thinking about the 4th tree, I think this will work now, but break in only a
few weeks.


A slightly easier variant (that needs less hacks) might be to create the
wc.db in a temp directory, like you recently implemented for 'svn copy
wc-dir wc-dir2'.

Bert



Re: Two svn_wc__db_t for single-db upgrade

2010-08-26 Thread Greg Stein
I'm with Bert. The entry writing is used *only* for upgrades. It may
as well be tuned for exactly that: track any information you need
while performing the upgrade.

Also remember that the wc.db file that is being constructed during the
upgrade process really should be called something like wc.db.upgrade
(see SDB_FILE and SDB_FILE_UPGRADE). We just aren't doing that right
now. And what *that* means is that the wc_db functions will not work
during the upgrade process. And really: you shouldn't rely on them.
You're in a transient state, trying to build a proper database.

Cheers,
-g

On Thu, Aug 26, 2010 at 10:58, Bert Huijben b...@vmoo.com wrote:


 -Original Message-
 From: Philip Martin [mailto:philip.mar...@wandisco.com]
 Sent: donderdag 26 augustus 2010 16:34
 To: dev@subversion.apache.org
 Subject: Two svn_wc__db_t for single-db upgrade

 One of the problems with single-db upgrade is that write_entry, called
 from svn_wc__write_upgraded_entries, want's to be able to query the
 new database using things like svn_wc__db_scan_addition.  This fails
 because svn_wc__db_pdh_parse_local_abspath encounters old .svn dirs
 and creates pdhs with the wrong wcroot.

 One way to fix this would be to revamp write_entry so that it doesn't
 need to query the existing database, but that would involve caching in
 memory some of the stuff we write to the database.

 I think this code needs revamping anyway, to properly calculate op_depths
 and other things we need for the 4th tree that will never be returned by
 just calling read_info().

 I think the upgrade code should just read the entries in a directory and
 then upgrade the nodes it sees, passing a chain of parent entries (or
 similar) so you never have to rely on the intermediate DB state while
 loading. (This also avoids actively maintaining the upgrade code in future
 versions).

 The normal recursive iterpool pattern should keep the amount of in memory
 data manageable. (Shouldn't be much more than an old svn_wc_walk_entries
 call + the upgrade state).

 An alternative is to use two svn_wc__db_t handles and arrange for one
 of them to ignore the old .svn directories. The following patch does
 this and passes the upgrade regression tests (admittedly not much of a
 test).  Does this sound like a good approach?

 Thinking about the 4th tree, I think this will work now, but break in only a
 few weeks.


 A slightly easier variant (that needs less hacks) might be to create the
 wc.db in a temp directory, like you recently implemented for 'svn copy
 wc-dir wc-dir2'.

        Bert




Re: Two svn_wc__db_t for single-db upgrade

2010-08-26 Thread Philip Martin
Greg Stein gst...@gmail.com writes:

 I'm with Bert. The entry writing is used *only* for upgrades. It may
 as well be tuned for exactly that: track any information you need
 while performing the upgrade.

I realise we can do that, but I don't see why it's better.  It means
creating/using our own database in memory.  Why is that better than
using the real database we have just written?  It means duplicating
part of of _scan_addition.

 Also remember that the wc.db file that is being constructed during the
 upgrade process really should be called something like wc.db.upgrade
 (see SDB_FILE and SDB_FILE_UPGRADE). We just aren't doing that right
 now. And what *that* means is that the wc_db functions will not work

But that's not hard to fix, when I create the second handle that
ignores 1.6 I can tell it to use the upgrade name.

 during the upgrade process. And really: you shouldn't rely on them.
 You're in a transient state, trying to build a proper database.

You and Bert have both mentioned transient state.  What problem do you
see?  Why should we be creating some invalid, transient state?  We
write parents before children, we don't subsequently tweak the parent,
the parent should be correct before we start on the children.  If
_scan_addition gives us the wrong answer during the upgrade how is it
going to work when the upgrade is finished?

[I'm aware that we don't add incomplete children when we add a
complete parent, but the children don't care about siblings.  And it
should be easy to fix.]

-- 
Philip


Re: Two svn_wc__db_t for single-db upgrade

2010-08-26 Thread Philip Martin
Philip Martin philip.mar...@wandisco.com writes:

 [I'm aware that we don't add incomplete children when we add a
 complete parent, but the children don't care about siblings.  And it
 should be easy to fix.]

Turns out we do this the right way.  We add a parent, and incomplete
directory children (plus any files) in a single transaction, then we
move on each directory child and do the same again.

-- 
Philip


RE: Two svn_wc__db_t for single-db upgrade

2010-08-26 Thread Bert Huijben


 -Original Message-
 From: Philip Martin [mailto:philip.mar...@wandisco.com]
 Sent: donderdag 26 augustus 2010 21:33
 To: Greg Stein
 Cc: Bert Huijben; dev@subversion.apache.org
 Subject: Re: Two svn_wc__db_t for single-db upgrade
 
 Philip Martin philip.mar...@wandisco.com writes:
 
  [I'm aware that we don't add incomplete children when we add a
  complete parent, but the children don't care about siblings.  And it
  should be easy to fix.]
 
 Turns out we do this the right way.  We add a parent, and incomplete
 directory children (plus any files) in a single transaction, then we
 move on each directory child and do the same again.

Yes, we do this *right*, but only in the *easy* cases. 
The hard cases, like missing and obstructions of metadata are not handled
and cannot be handled by the single-db WC-DB api as these cannot occur there
. (There are no tests for this, and anything that looks like a test for this
is disabled for some 4th tree reason). You can't even see that data is
missing from most of the wc-db apis at they fall back to the parent stub.
(And the wc-db api doesn't allow annotating a node that it misses some
information)

I don't think we can ever handle all the upgrade state with just the
single-db optimized API.
E.g. by just using the API you can't look through child-of-copy deletions,
nor do we need any apis for this for 1.7. But svn_wc_entry_t just has this
information directly.

So you are suggesting that we change the DB API's to provide this
information (or keep providing this multi-db specific information like the
obstruction statee that really have to be removed to get back at 1.6 speed),
while we will never need these functions and statee after this conversion to
single-db?

Using the db api for intermediate storage during update is making things
harder for future api users that never want to look back at the
pre-single-db era. 

And it slows down common code paths just to support upgrading from a version
that is hopefully ancient for us in a few months.

And then we can only fix the api by fixing this problem later... or by
moving to wc-ng-NG.

Bert

 
 --
 Philip



Re: Two svn_wc__db_t for single-db upgrade

2010-08-26 Thread Branko Čibej
On 26.08.2010 22:00, Bert Huijben wrote:

   
 -Original Message-
 From: Philip Martin [mailto:philip.mar...@wandisco.com]
 Sent: donderdag 26 augustus 2010 21:33
 To: Greg Stein
 Cc: Bert Huijben; dev@subversion.apache.org
 Subject: Re: Two svn_wc__db_t for single-db upgrade

 Philip Martin philip.mar...@wandisco.com writes:

 
 [I'm aware that we don't add incomplete children when we add a
 complete parent, but the children don't care about siblings.  And it
 should be easy to fix.]
   
 Turns out we do this the right way.  We add a parent, and incomplete
 directory children (plus any files) in a single transaction, then we
 move on each directory child and do the same again.
 
 Yes, we do this *right*, but only in the *easy* cases. 
 The hard cases, like missing and obstructions of metadata are not handled
 and cannot be handled by the single-db WC-DB api as these cannot occur there
 . (There are no tests for this, and anything that looks like a test for this
 is disabled for some 4th tree reason). You can't even see that data is
 missing from most of the wc-db apis at they fall back to the parent stub.
 (And the wc-db api doesn't allow annotating a node that it misses some
 information)

 I don't think we can ever handle all the upgrade state with just the
 single-db optimized API.
 E.g. by just using the API you can't look through child-of-copy deletions,
 nor do we need any apis for this for 1.7. But svn_wc_entry_t just has this
 information directly.

 So you are suggesting that we change the DB API's to provide this
 information (or keep providing this multi-db specific information like the
 obstruction statee that really have to be removed to get back at 1.6 speed),
 while we will never need these functions and statee after this conversion to
 single-db?

 Using the db api for intermediate storage during update is making things
 harder for future api users that never want to look back at the
 pre-single-db era. 

 And it slows down common code paths just to support upgrading from a version
 that is hopefully ancient for us in a few months.

 And then we can only fix the api by fixing this problem later... or by
 moving to wc-ng-NG.
   

I'm beginning to wonder what all this fuss is about, though. We're
talking about converting the working copy ... and converting only on one
direction, from multiple to single-db WC, never the other way.

First of all let's not loose sight of the fact that a WC can be
reconstructed from the repository. But since that's not always a welcome
alternative, does the upgrade process really need to handle all the
zillions of half-baked states that a working copy can have? And another
question, does the upgrade functionality really belong into the svn
binary, or even libsvn_wc? Because there's sure to be functionality in
there that will only ever be used by the wc-db upgrade, i.e., only in
versions 1.7 and 1.8; so why not make the upgrader a separate binary,
with the messy parts of the DB handling done there instead of polluting
the longer-lived part of the code?

(Also I'm not quite clear here on the feature set for 1.7 ... if we
intend to release with single-db support, then why support multi-db at
all, except perhaps for detaching parts of the working copy?)

-- Brane