Re: Lock non-existent to allow reserving a path

2014-03-03 Thread Thomas Åkesson
Thanks Philip for sharing your insight into the lock mechanisms.

Sorry about the delay, wanted to find time to investigate.


On 24 Feb 2014, at 19:56, Philip Martin philip.mar...@wandisco.com wrote:

 Thomas Åkesson thomas.akes...@simonsoft.se writes:
 
 Svn does not allow locking non-existent paths. It is blocked both in
 libsvn_fs_base/libsvn_fs_fs as well as in mod_dav_svn. In the same
 areas of the code in fs comments say:
 "While our locking implementation easily supports the locking of
 nonexistent paths, we deliberately choose not to allow such madness."
 
 Given that we now present valid use-cases, it is reassuring that the
 authors of the locking code said it easily supports the locking of
 nonexistent paths.
 
 There is a way to create such locks at present: checkout, lock a file,
 delete the file or parent directory, commit with --no-unlock.  We have a
 regression test for this: lock_tests.py:deleted_path_lock.  (Possibly
 this behaviour could be considered a bug, perhaps 'commit --no-unlock'
 should remove the locks on deleted files, but implementing that would be
 hard.)

As discussed else-thread, should likely be considered a bug. It is not useful 
for these use cases since it requires the file to already exist. 

 
 Breakdown of changes:
 
 1. Make mod_dav_svn and fs allow locking of non-existent paths. This
 part is mandatory and I am attaching a rough patch that demonstrates
 the feasibility. With this change, the use-case 3 can be supported.
 
 This part is the most important for us, to enable our software to
 avoid race conditions between users.
 
 It can be discussed whether it should be possible to configure the
 server to disallow these locks. I don't think they do much harm and
 configurability makes testing more complex.
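The reservation idea behind use-case 3 can be illustrated with a toy in-memory lock table (the class and method names below are purely hypothetical, not Subversion's actual API): one user locks a not-yet-existing path to reserve it, so another user's attempt to create that path is refused until the lock is released.

```python
class LockTable:
    """Toy model of path locks that also covers non-existent paths."""

    def __init__(self):
        self.locks = {}  # path -> owner

    def lock(self, path, owner):
        # Locking a non-existent path "reserves" it for this owner.
        if path in self.locks and self.locks[path] != owner:
            raise RuntimeError("path is locked by " + self.locks[path])
        self.locks[path] = owner

    def can_commit(self, path, owner):
        # A commit that creates `path` must not conflict with someone
        # else's lock on that path.
        return self.locks.get(owner and path) in (None, owner)


table = LockTable()
table.lock("/trunk/new-file.txt", "alice")                 # reserve the path
assert table.can_commit("/trunk/new-file.txt", "alice")    # alice may create it
assert not table.can_commit("/trunk/new-file.txt", "bob")  # bob is blocked
```

This is only a sketch of the semantics being asked for; the real change has to thread the check through mod_dav_svn and the fs lock code.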
 
 2. There are already functional tools to manage these locks (svnadmin
 lslocks/rmlocks/unlock and svn unlock URL). Are any improvements
 required? E.g. making svn lslocks available (remotely).
 
 3. Make the Working Copy support svn lock after svn add. This also
 requires taking care of (or blocking) some combinations with move and
 switch.
 
 The working copy is likely to be the hard bit.
 
 Probably we would want to block operations that undo the add: revert,
 move, delete, etc.  when the lock is present.  Attempting to move/remove
 the lock automatically would make these non-network operations into
 network operations.  I'm not sure exactly what behaviour would be
 acceptable here.  Do we want to prevent people reverting without network
 access?

Yes, I think we would want to block a good number of operations. If using locks 
on added files one might very well have to unlock before performing certain 
operations.

I just noticed that revert does not have a switch to release the locks. 
Actually, when reverting it is very easy to accidentally leave locks behind. It 
might make sense to never allow a revert of locked files without specifying 
either --unlock or --no-unlock.  

 
 Another tricky bit is how are update/status going to interact with these
 locks?  I think we would want the behaviour to be similar to the current
 locks.  So update will report if a lock gets broken/stolen and will
 remove the lock from the working copy.  I think this requires
 update/status to start reporting added files to the server so that the
 server can report on the locks.  That looks as if it will require
 protocol changes.

Interesting point. I agree we would want a consistent behaviour. 

Perhaps it would be an acceptable performance degradation to fetch a separate 
lock report when the working copy contains locked added files.

 
 It gets even more tricky with directories:
 
  svn mkdir wc1/A
  touch wc1/A/f
  svn add wc1/A/f
  svn lock wc1/A/f
 
  svn mkdir wc2/A
 
 The lock wc1/A/f will prevent the commit of wc2/A, or at least I think
 it will.  

Yes it does. Verified with the patch in my first post. 

This code is in fs_make_dir(..) in tree.c:
  /* Check (recursively) to see if some lock is 'reserving' a path at
 that location, or even some child-path; if so, check that we can
 use it. */ 
  if (root->txn_flags & SVN_FS_TXN_CHECK_LOCKS)
    SVN_ERR(svn_fs_fs__allow_locked_operation(path, root->fs, TRUE, FALSE,
                                              pool));

I can't see any reason to make a recursive check for locks when doing mkdir 
(does not matter now, when locking non-existent is not allowed). Should be a 
simple change. An mkdir does not carry the same implications as a move or rmdir 
(which must of course do a recursive check).
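The recursive vs. non-recursive distinction can be sketched over a flat lock table (a hypothetical helper, not the fs code): with a lock held on A/f, a recursive check of A sees the conflicting child lock, while a check of A itself passes — which is why mkdir could get away with the cheaper check.

```python
def locked_below(locks, path, recurse):
    """Return True if `path` itself (or, when recurse, any child) is locked."""
    if path in locks:
        return True
    if recurse:
        prefix = path.rstrip("/") + "/"
        return any(p.startswith(prefix) for p in locks)
    return False


locks = {"/A/f": "alice"}          # lock on a (possibly non-existent) child

# move/rmdir-style recursive check: blocked by the child lock
assert locked_below(locks, "/A", recurse=True)

# mkdir-style non-recursive check: the directory itself is unlocked
assert not locked_below(locks, "/A", recurse=False)
```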

Copy is more complicated.

 svn mkdir wc1/A
 touch wc1/A/f
 svn add wc1/A/f
 svn lock wc1/A/f

 svn cp wc2/B wc2/A

Copy also checks locks recursively for to_path. Not sure if that matters now 
(when locking non-existent is not allowed). 

If we allow locking non-existent paths, I personally think it would be fine to 
fail that copy if there are locks below (no change to current code). Otherwise 
we could investigate whether the locks actually conflict with 

Disk format change for trunk's FSFS version 7

2014-03-03 Thread Philip Martin
Stefan Fuhrmann stefan.fuhrm...@wandisco.com writes:

 On Sun, Mar 2, 2014 at 2:54 AM, Philip Martin 
 philip.mar...@wandisco.com wrote:

 There is a problem with the FNV1a checksums in format 7: the on-disk
 representation for big-endian systems, like SPARC, is different from
 that of little-endian systems, like x86.  Both systems calculate the
 same checksum value, however the checksum code calls htonl() before
 returning the value to the caller.  On SPARC this has no effect and on
 x86 it reverses the byte order, changing the value.  If we were to write
 the resulting memory directly to disk as a block of data this would be
 good because the disk representations would be the same, but that is not
 what happens. The value is passed to encode_uint() which uses arithmetic
 to write out a representation of the bytes.  Since the htonl() call
 changes the x86 value this means the disk representation changes.
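The effect can be reproduced in a few lines (a sketch of the problem, not the actual FSFS code): both architectures compute the same FNV-1a value, but byte-swapping it with htonl() before arithmetic encoding yields a different number on little-endian hosts.

```python
def fnv1a32(data):
    """32-bit FNV-1a checksum."""
    h = 0x811c9dc5                                   # FNV offset basis
    for byte in data:
        h = ((h ^ byte) * 0x01000193) & 0xffffffff   # FNV prime, mod 2^32
    return h

h = fnv1a32(b"a")
assert h == 0xe40c292c        # the checksum value is the same on any host

# htonl() is a no-op on big-endian hosts but swaps bytes on little-endian
# ones; simulate the little-endian case:
htonl_le = int.from_bytes(h.to_bytes(4, "little"), "big")
assert htonl_le != h          # the *number* handed to encode_uint() differs,
                              # so the arithmetic on-disk encoding differs too
```

Writing the four raw bytes of the swapped value to disk would have produced identical files on both architectures; it is the arithmetic re-encoding of the swapped integer that diverges.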

 Committed with a few additions as r1573371 and r1573375.

This changes the on-disk representation for the unreleased FSFS format 7
on little-endian systems, that's probably most systems.  Anyone with
repositories in this format needs to dump/load.

-- 
Philip Martin | Subversion Committer
WANdisco // *Non-Stop Data*


Re: Disk format change for trunk's FSFS version 7

2014-03-03 Thread Branko Čibej
On 03.03.2014 14:03, Philip Martin wrote:
 Stefan Fuhrmann stefan.fuhrm...@wandisco.com writes:

 On Sun, Mar 2, 2014 at 2:54 AM, Philip Martin 
 philip.mar...@wandisco.com wrote:

 There is a problem with the FNV1a checksums in format 7: the on-disk
 representation for big-endian systems, like SPARC, is different from
 that of little-endian systems, like x86.  Both systems calculate the
 same checksum value, however the checksum code calls htonl() before
 returning the value to the caller.  On SPARC this has no effect and on
 x86 it reverses the byte order, changing the value.  If we were to write
 the resulting memory directly to disk as a block of data this would be
 good because the disk representations would be the same, but that is not
 what happens. The value is passed to encode_uint() which uses arithmetic
 to write out a representation of the bytes.  Since the htonl() call
 changes the x86 value this means the disk representation changes.
 Committed with a few additions as r1573371 and r1573375.
 This changes the on-disk representation for the unreleased FSFS format 7
 on little-endian systems, that's probably most systems.  Anyone with
 repositories in this format needs to dump/load.

That's fine ... it'll teach them not to use unreleased versions. :)

-- 
Branko Čibej | Director of Subversion
WANdisco // Non-Stop Data
e. br...@wandisco.com


RE: [PATCH]: fix check-mime-type.pl for changes to 'svnlook proplist' output

2014-03-03 Thread Leo Davis
Hello,

Another approach is to dump 'svnlook proplist' altogether and use 'svnlook 
propget svn:mime-type' and 'svnlook propget svn:eol-style' instead.  That could 
be maximally backward compatible without introducing XML.

Regards,

Leo

From: Ben Reser b...@reser.org
Sent: Sunday, March 02, 2014 11:40 PM
To: Leo Davis; Daniel Shahaf
Cc: dev@subversion.apache.org
Subject: Re: [PATCH]: fix check-mime-type.pl for changes to 'svnlook proplist' 
output

On 3/2/14, 5:34 PM, Leo Davis wrote:
 As Ben pointed out, the current parser in the script for svnlook <= 1.7.x is 
 broken and unfixable for multiline properties.  The closest we can get to 
 DTRT in this case is to have svnlook output XML.  Hopefully no one still 
 cares about svnlook <= 1.3 (?) that cannot output XML.

 On Mar 2, 2014, at 8:11 AM, Daniel Shahaf d...@daniel.shahaf.name wrote:
 One more issue: however you change the parser, it will break if
 svnlook 1.7 or older is used.  It would be nice to DTRT in that case
 (either error out or retain the old parser).

It would be nice to have, but I think the effort to provide it is just too
great unless we go down the XML path.  Which is a pretty large change in the
requirements of the script.

Anyone that wants it can go get a copy of the script off the 1.7.x branch
(assuming we merge the fix to the 1.8.x branch).  This is contrib, so we have no
compatibility guarantees to worry about either.

Just put a prominent notice at the top of the script saying that it is only
intended for use with 1.8.x or newer servers.




Mergeinfo containing r0 makes svnsync and svnadmin dump fail

2014-03-03 Thread Julian Foad
A customer found that 'svnsync' errored out on trying to sync a revision in 
which the source repository introduced some mergeinfo starting with r0, similar 
to this example:

  $ svn propget svn:mergeinfo ^/foo@1234567
  /bar:0-100,111,122

The svnsync error message was:

  $ svnsync sync ...
  svnsync: E175008: At least one property change failed; repository is unchanged
  svnsync: E175002: Error setting property 'svn:mergeinfo':
  Could not execute PROPPATCH.

We believe this mergeinfo entered the repository through a 1.6 server. It was 
committed in mid-2012. 1.7.x servers reject such commits, as do the later 1.6.x 
servers [1]. Probably 1.6 clients were in use too, although it may have been 
committed from a non-Subversion client such as git-svn.

Anyhow, the situation is that we have at least one Subversion repository 
containing data that the 1.7 server tools reject as invalid. 'svnsync' errors 
out. Even 'svnadmin dump' errors out at this revision if you specify any 
non-zero start revision, because it parses the mergeinfo to see if it points to 
a revision earlier than the start rev. Like this:

  $ svnadmin dump --incremental -r5 repo
  svnadmin: E200020: Invalid revision number '0' found in range list

What is our migration path for this data?

We can figure out a short term work-around, perhaps using the unsupported 
SVNSYNC_UNSUPPORTED_STRIP_MERGEINFO environment variable to bypass the 
mergeinfo change for each revision that adds or changes such mergeinfo, if 
there aren't too many of them and if they aren't present on active branches. 
(We can write a pre-commit hook to make svnsync stop after just one revision, 
since it doesn't have a --revision option.)

But for a proper fix?

In the past we decided that some known ways in which mergeinfo can be malformed 
should be silently corrected or tolerated.

  leading slash is required
    => in some cases we add it if missing

  path-revs pointing to non-existent node-revs
    => in some cases, such as a dump-load cycle, we remove these ranges

  revision zero
    => in some cases, such as dump, we error out on reading it

  other parse errors
    => in some cases we error out
    => in the case of a merge, we report "invalid mergeinfo: merge tracking not 
possible"
    => in some cases we ignore the parse error, keep whatever amount of 
mergeinfo we managed to parse, and skip the processing we were going to do 
(issue #3896)

This all makes me a bit uneasy. We seem to have a number of data 
transformations going on at quite a low level, and I'm not sure what the 
canonical position is. I would like us to have a definition of what constitutes 
the same mergeinfo in a repository before and after dump/load, and a way of 
testing that.

Philip pointed out that it's a good idea for 'dump' to dump whatever is 
present, and not error out and not try to correct or normalize it. If any 
correction or normalization is to be done, 'load' is a better place to do it. 
That minimizes the risk of a damaged archive due to bugs, if you archive the 
dump file.

Clearly there are some things we should do:

  * Make 'dump' not error out, but rather ignore the broken mergeinfo for the 
purposes of the warning that it refers to older revisions.

Other changes?

  * Make 'svnsync sync' strip out the revision 0 from that mergeinfo? Or make 
it strip out the whole mergeinfo property if it fails to parse? Or just that 
line of the property value? (If we do something like that, I'd like us to do it 
everywhere we ignore bad mergeinfo, not just here.)

Thoughts?

- Julian

[1] Servers >= 1.6.18 for HTTP, >= 1.6.17 for the other protocols, reject 
unparseable mergeinfo -- see issue 
http://subversion.tigris.org/issues/show_bug.cgi?id=3895.

--
Join WANdisco's free daily demo sessions on Scaling Subversion for the 
Enterprise
http://www.wandisco.com/training/webinars


Re: svn_ra_get_file_revs2 vs. blame vs. FS API

2014-03-03 Thread Julian Foad
Stefan Fuhrmann wrote:
 Stefan Fuhrmann wrote:
 Julian Foad wrote:
 Stefan Fuhrmann wrote:
 Julian Foad wrote:
  -- a quick optimization API -- the definitely/maybe question -- like the
 existing implementations but documented properly and named appropriately; and

  -- (perhaps) a definite content comparison API: has the content changed?

 I tend to prefer an extra flag parameter in a bumped version [...]
 
 Well, this is just a general API design style issue, but I think [...]
 
 Fair point. I was thinking about revving svn_ra_get_file_revs2
 as well with the same new option (callers can choose whether they
 need exact reports or want to process the delta anyway).
 That would be a pass-through value for svn_fs_*_changed.

 I implemented just that in my working copy and it turned out
 to be a mess. Every RA layer had to be updated and in the
 end, there would only be the blame function to use it - with
 some fixed value for the STRICT option.

 So, I decided to go with the separate API functions for the
 different behaviors in r1573111.

Looks good.

Thanks.
- Julian



Re: Mergeinfo containing r0 makes svnsync and svnadmin dump fail

2014-03-03 Thread Branko Čibej
On 03.03.2014 17:24, Julian Foad wrote:
   * Make 'svnsync sync' strip out the revision 0 from that mergeinfo? Or make 
 it strip out the whole mergeinfo property if it fails to parse? Or just that 
 line of the property value? (If we do something like that, I'd like us to do 
 it everywhere we ignore bad mergeinfo, not just here.)

In the case of r0, I think it's safe enough to convert it to r1 on sync
and load; this can't change the semantics of the mergeinfo, because
there's currently no way to create a merge from r0.
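That conversion could look roughly like this (a sketch operating on the raw property value; a real fix would live in the mergeinfo parser, and edge cases such as a bare "0" range are ignored here): rewrite any range that starts at r0 so it starts at r1 instead.

```python
import re

def normalize_r0(mergeinfo):
    """Rewrite ranges starting at r0 to start at r1 (e.g. '0-100' -> '1-100').

    Mergeinfo is one 'path:ranges' entry per line; range lists are sorted
    ascending, so only a leading '0-N' range can contain r0.
    """
    def fix_line(line):
        path, _, ranges = line.partition(":")
        return path + ":" + re.sub(r"^0-", "1-", ranges)

    return "\n".join(fix_line(l) for l in mergeinfo.splitlines())


# The example from earlier in the thread:
assert normalize_r0("/bar:0-100,111,122") == "/bar:1-100,111,122"
```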

-- Brane


-- 
Branko Čibej | Director of Subversion
WANdisco // Non-Stop Data
e. br...@wandisco.com


Use of non-portable find options in Makefile.in

2014-03-03 Thread Philip Martin
I've been experimenting with a SPARC Solaris build recently and the
non-portable use of find in Makefile.in is annoying, it means that the
'make clean' target fails.  There are two uses of

  find some/path -mindepth 1 -maxdepth 1 -print0 | xargs -0 rm -rf --

Solaris find doesn't support -mindepth, -maxdepth or -print0.

Are we using -print0 because problems with spaces in paths have been
observed or simply as some sort of future-proofing?  Don't we control
the paths thus allowing us to avoid spaces?

The use of -mindepth 1 -maxdepth 1 is also something I don't understand,
what advantage is there over simply using a shell expansion?  Like this:

  rm -rf some/path/*

The shell expansion doesn't appear to have a problem with spaces when I
test it.

There is also one use of

  find some/path -print0 | xargs -0 rm -rf --

looking for gcov files.  Have problems with spaces been observed or is
this also future-proofing?

I'd like to have a more portable Makefile.

-- 
Philip Martin | Subversion Committer
WANdisco // *Non-Stop Data*


Re: Use of non-portable find options in Makefile.in

2014-03-03 Thread Stefan Sperling
On Mon, Mar 03, 2014 at 05:34:22PM +, Philip Martin wrote:
 I've been experimenting with a SPARC Solaris build recently and the
 non-portable use of find in Makefile.in is annoying, it means that the
 'make clean' target fails.  There are two uses of
 
   find some/path -mindepth 1 -maxdepth 1 -print0 | xargs -0 rm -rf --
 
 Solaris find doesn't support -mindepth, -maxdepth or -print0.
 
 Are we using -print0 because problems with spaces in paths have been
 observed or simply as some sort of future-proofing?  Don't we control
 the paths thus allowing us to avoid spaces?
 
 The use of -mindepth 1 -maxdepth 1 is also something I don't understand,

The first instance was added in r1421636, and the second one
copied from there in r1539215.

I agree that the Makefile should be as portable as possible
and support any of the alternatives you've mentioned.


Re: [PATCH] Fix recover/hotcopy erroring out for old FSFS repositories

2014-03-03 Thread Stefan Fuhrmann
On Tue, Mar 4, 2014 at 12:06 AM, Evgeny Kotkov
evgeny.kot...@visualsvn.com wrote:

 Hi Stefan,

  Except for the typo, your patch looks good.
  I like that we get around the chicken/egg
  problem with the revnum vs. HEAD check.
  Please commit.

 I fixed the typo and committed this patch in r1573744.  Thank you for
 the review.

  Mind applying equivalent changes to get_root_changes_offset()?

 Done in r1573794.


Yup, looks good. Thanks!

-- Stefan^2.