Re: [PATCH] allow svnsync to translate non-UTF-8 log messages to UTF-8 (v. 2)

2010-09-28 Thread Gavin Beau Baumanis
Daniel,

Absolutely no apology required.
Just about everyone here is a volunteer giving up their own spare time.
I was checking for my own purposes, in case I had missed an alternate message 
thread.


Gavin "Beau" Baumanis


On 29/09/2010, at 11:14 AM, Daniel Trebbien wrote:

> Hi Gavin,
> 
> There have been some things that I have needed to do recently, so I
> haven't been working on the patches as much as I would like.  But, I
> am still interested in finishing them.
> 
> My apologies for the delay...
> 
> Daniel
> 
> 
> On Tue, Sep 28, 2010 at 6:07 PM, Gavin Beau Baumanis
>  wrote:
>> Hi Daniel (Trebbien),
>> 
>> Just thought I would ask for an update, please.
>> I still have you on my "watch - list" - so there is no rush - just making 
>> sure for my own sanity that I haven't missed an email / thread about your 
>> patch submission.
>> 
>> 
>> Gavin "Beau" Baumanis


Re: [PATCH] allow svnsync to translate non-UTF-8 log messages to UTF-8 (v. 2)

2010-09-28 Thread Daniel Trebbien
Hi Gavin,

There have been some things that I have needed to do recently, so I
haven't been working on the patches as much as I would like.  But, I
am still interested in finishing them.

My apologies for the delay...

Daniel


On Tue, Sep 28, 2010 at 6:07 PM, Gavin Beau Baumanis
 wrote:
> Hi Daniel (Trebbien),
>
> Just thought I would ask for an update, please.
> I still have you on my "watch - list" - so there is no rush - just making 
> sure for my own sanity that I haven't missed an email / thread about your 
> patch submission.
>
>
> Gavin "Beau" Baumanis


Re: [PATCH] allow svnsync to translate non-UTF-8 log messages to UTF-8 (v. 2)

2010-09-28 Thread Gavin Beau Baumanis
Hi Daniel (Trebbien),

Just thought I would ask for an update, please.
I still have you on my "watch - list" - so there is no rush - just making sure 
for my own sanity that I haven't missed an email / thread about your patch 
submission.


Gavin "Beau" Baumanis


On 22/09/2010, at 2:56 AM, Daniel Shahaf wrote:

> Daniel Trebbien wrote on Tue, Sep 21, 2010 at 08:12:38 -0700:
>> On Sun, Sep 19, 2010 at 9:39 PM, Daniel Shahaf  
>> wrote:
>>> Need the word "to" after the semicolon for the English to be correct.
>> 
>> Fixed.  (When I say "fixed", I mean that I committed a fix in my own
>> fork of Subversion:  http://github.com/dtrebbien/subversion )
>> 
> 
> Fine, but please post patches and log messages to this list as usual.
> It's easier for me/us to process them that way.  Thanks.
> 
 +svn_error_t *
 +svn_subst_translate_cstring2(const char *src,
 +                             const char **dst,
 +                             const char *eol_str,
 +                             svn_boolean_t repair,
 +                             apr_hash_t *keywords,
 +                             svn_boolean_t expand,
 +                             apr_pool_t *pool)
 +{
 +  return translate_cstring2(dst, NULL, src, eol_str, repair, keywords,
 +                            expand, pool);
 +}
>>> 
>>> So, this is revved because svn_subst_translate_string2() needs it (and
>>> svn_subst_translate_string() doesn't).
>> 
>> I am unsure of what you mean.
>> 
>> I took the original implementation of `svn_subst_translate_cstring2`
>> and placed it in the new static utility function `translate_cstring2`.
>> `translate_cstring2` accepts another parameter, and its definition
>> (the old definition on `svn_subst_translate_cstring2`) was
>> appropriately modified to handle this new parameter. This way, I don't
>> have to modify the public API.
>> 
> 
> Yes, yes.  You revved three svn_subst_* functions; one publicly and two
> privately.  So I was pointing out (to myself), for each of the latter
> two, where the call site is that needs the additional functionality.
> 
>>> I suggest you go ahead (respond to the review, submit an updated patch,
>>> etc) without waiting; the svnsync part will eventually get its share of
>>> reviews.  Also, you may want to consider splitting this patch into two
>>> parts (subst work separate from svnsync work).
>> 
>> I like the idea of splitting this into two patches.
> 
> Okay.  Please post an updated patch to this list when you have it ready :-)
> 
> Best,
> 
> Daniel
> 


Re: [PATCH] Support command line options in svn_apply_autoprops.py

2010-09-28 Thread Gavin Beau Baumanis
As there have been no comments on this patch, I have logged it in the issue
tracker so that it doesn't get lost.

http://subversion.tigris.org/issue-tracker.html
Issue Number : #3723



Gavin "Beau" Baumanis


On 04/09/2010, at 11:43 PM, Wei-Yin Chen wrote:

> Sorry, don't know why the attachment was removed again. Does this
> mailing list have a strict "Content-Type" filter? I'm trying
> text/x-patch this time from my linux box. Type
> application/octet-stream doesn't seem to pass through.
> 
> From web interface of the mailing archive, the format of the inlined
> version looks strange, and the indentation seems wrong, so here's
> another way to get it:
> 
> http://pastebin.com/YWhpnZvF
> 
> On Sat, Sep 4, 2010 at 9:05 PM, Wei-Yin Chen  
> wrote:
>> 
>> Hi,
>> 
>> This patch makes trunk/contrib/client-side/svn_apply_autoprops.py take
>> command line options.
>> The usage would be:
>> svn_apply_autoprops.py [-c config_file] [paths to process...]
>> 
>> The -c option specifies the configuration file name, and it overrides
>> the setting in environment variable SVN_CONFIG_FILENAME.
>> The rest of the command line arguments are treated as paths to process.
>> 
>> [[[
>> Make svn_apply_autoprops.py take command line options.
>> Option -c for svn configuration filename, and the rest for paths to process.
>> ]]]
>> 
>> Regards,
>> Wei-Yin
>> 
> 
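The interface described above can be sketched with Python's getopt module, which the contrib scripts of that era typically used; the actual patch may parse options differently:

```python
import getopt
import os

def parse_args(argv):
    """Parse [-c config_file] [paths...]; -c overrides the
    SVN_CONFIG_FILENAME environment variable."""
    config_file = os.environ.get('SVN_CONFIG_FILENAME')
    opts, paths = getopt.getopt(argv, 'c:')
    for opt, value in opts:
        if opt == '-c':
            config_file = value
    if not paths:
        paths = ['.']  # default to the current directory
    return config_file, paths
```

Note that getopt stops at the first non-option argument, so `-c` has to precede the paths, exactly as in the usage line quoted above.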


Re: [PATCH] don't do autoprops on symbolic links

2010-09-28 Thread Gavin Beau Baumanis
As there have been no comments on this patch, I have logged it in the issue
tracker so that it doesn't get lost.

http://subversion.tigris.org/issue-tracker.html
Issue Number : #3722



Gavin "Beau" Baumanis


On 04/09/2010, at 10:25 PM, Wei-Yin Chen wrote:

> Dear Gavin,
> 
> Thanks. The attachment was in my sent box, but it's absent in the mailing 
> archive. Don't know why.
> 
> Per Branko's suggestion, I'm using join this time. In case the attachment 
> goes missing again, it is also embedded in the mail body.
> 
> Regards,
> Wei-Yin
> 
> --- svn_apply_autoprops.py.old  2010-09-04 20:16:28.0 +0800
> +++ svn_apply_autoprops.py.nolink   2010-09-04 20:17:20.0 +0800
> @@ -122,10 +122,12 @@
>for autoprops_line in autoprop_lines:
>  fnmatch_str = autoprops_line[0]
>  prop_list = autoprops_line[1]
> 
>      matching_filenames = fnmatch.filter(filenames, fnmatch_str)
> +    matching_filenames = [f for f in matching_filenames \
> +      if not os.path.islink(os.path.join(dirname, f))]
>      if not matching_filenames:
>        continue
> 
>  for prop in prop_list:
>command = ['svn', 'propset', prop[0], prop[1]]
> 
> 
> On Sat, Sep 4, 2010 at 4:08 PM, Gavin Beau Baumanis  
> wrote:
> Hi Wei-Yin,
> 
> Just thought I would mention that you did not attach your updated patch.
> 
> 
> Gavin "Beau" Baumanis
> E: ga...@thespidernet.com
> 



Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster

2010-09-28 Thread Daniel Shahaf
Johan Corveleyn wrote on Tue, Sep 28, 2010 at 23:37:23 +0200:
> On Tue, Sep 28, 2010 at 4:11 PM, Daniel Shahaf  
> wrote:
> >> Index: subversion/include/svn_diff.h
> >> ===
> >> --- subversion/include/svn_diff.h     (revision 1001548)
> >> +++ subversion/include/svn_diff.h     (working copy)
> >> @@ -112,6 +112,11 @@
> > (personally I prefer 'svn diff -x-p' to show the function name here)
> 
> Ok, will do next time.
> 

Thanks.

> I've had some progress:
> - The blame_tests.py tests now all pass (tests 2 and 7 provoked a core
> dump), so all tests pass now. However, although I fixed the core dump,
> I don't fully understand the root cause (why they ended up in that
> situation), so I'm going to study it a bit more first.
> - I have now included support for files with \r eol-style.
> 
> I'll send a new version of the patch shortly.
> 

Sounds good.

> I'm also thinking that I could try to take advantage of the -x-b/-x-w
> or -x--ignore-eol-style options to make it even faster (right now the
> prefix/suffix scanning stops at the first difference, regardless of
> whether it's a whitespace or eol difference that could/should be ignored).
> 
...
> 
> Thoughts?

None, actually; I'm not (yet) sufficiently familiar with diff internals. 
But let's please have this work in small, digestible (read: reviewable) 
pieces --- support for -x--foo is exactly the sort of additional 
optimization that can be done in a separate patch (on top of this 
starting patch).


Re: svn commit: r1002313 - /subversion/trunk/subversion/tests/cmdline/prop_tests.py

2010-09-28 Thread Daniel Shahaf
Paul Burba wrote on Tue, Sep 28, 2010 at 18:06:36 -0400:
> On Tue, Sep 28, 2010 at 3:30 PM, Daniel Shahaf  
> wrote:
> > pbu...@apache.org wrote on Tue, Sep 28, 2010 at 18:38:19 -:
> >> +  # Run propget -vR svn:mergeinfo and collect the stdout.
> >> +  exit_code, pg_stdout, pg_stderr = svntest.actions.run_and_verify_svn(
> >> +    None, None, [], 'propget', SVN_PROP_MERGEINFO, '-vR', wc_dir)
> >> +
> >
> > exit_code and pg_stderr aren't checked anywhere.
> 
> Neither is pg_stdout...but that entire statement was cruft from an
> earlier version of the test; removed it.
> 

:-)

> >> +  redir_file = open(redirect_file, 'wb')
> >> +  pg_proc = subprocess.Popen(arglist, stdout=redir_file)
> >
> > Shouldn't this use the svntest/ infrastructure?  Compare
> > svntest.actions.check_prop().
> 
> I didn't use it for two reasons:

I didn't actually mean check_prop() specifically, but rather the
infrastructure it uses --- main.run_command() and main.svn_binary.

> First, svntest.actions.check_prop() only supports finding the props on
> a single path (and as far as I can tell that works fine, no issue
> #3721).
> 
> Second, Issue #3721 only occurs when the output of 'svn pg -vR' is
> *redirected to a file* - see
> http://subversion.tigris.org/issues/show_bug.cgi?id=3721#desc1.
> check_prop() is (obviously) all processes and pipes underneath the
> covers, so while it may also be possible to show the bug using it, I
> wrote the test to hew as closely to the actual bug I witnessed as
> possible.
> 

Aha, so you're using Popen() directly because you need to have svn's
stdout be a file and not a pipe.  I'm a bit surprised that we have a bug
that's that sensitive to trigger, but okay. :-)

> >> +  pg_proc.wait()
> >> +  redir_file.close()
> >> +  pg_stdout_redir = open(redirect_file, 'r').readlines()
> >> +
> >> +  # Check if the redirected output of svn pg -vR is what we expect.
> >> +  #
> >> +  # Currently this fails because the mergeinfo for the three paths is
> >> +  # interleaved and the line endings are (at least on Windows) a mix
> >> +  # of  and . See
> >> +  # http://subversion.tigris.org/issues/show_bug.cgi?id=3721#desc1
> >> +  unordered_expected_output = svntest.verify.UnorderedOutput([
> >> +    "Properties on '" + B_path +  "':\n",
> >> +    "Properties on '" + C_path +  "':\n",
> >> +    "Properties on '" + D_path +  "':\n",
> >> +    "  svn:mergeinfo\n",
> >> +    "    /subversion/branches/1.5.x:872364-874936\n",
> >> +    "    /subversion/branches/1.5.x-34184:874657-874741\n",
> >> +    "    /subversion/branches/1.5.x-34432:874744-874798\n",
> >
> > So, 'unordered' also ignores repetitions?  (since the last 4 lines
> > appear only once each, rather than three times each)
> 
> I think you mean the first 3 lines appear only once, all the other
> lines appear 3 times each (because the test sets the same mergeinfo on
> all three paths and the expected output is for svn pg -v***R***).
> 
> But yeah, this test isn't perfect as it would allow repetitions.  That
> is the price we pay when using svntest.verify.UnorderedOutput(), which
> is required here because there is no guarantee as to the order in
> which svn pg -vR will report the three paths.  I tweaked the test to
> check the expected number of lines in the actual redirected output, so
> that we catch any dups.
> 

In short, "Yes" (UnorderedOutput() does ignore repetitions), but thanks
for the detailed-as-usual replies!

> Paul


Re: svn commit: r1002313 - /subversion/trunk/subversion/tests/cmdline/prop_tests.py

2010-09-28 Thread Paul Burba
Hi Daniel,

Comments inline.  All fixes mentioned were done in r1002372.

On Tue, Sep 28, 2010 at 3:30 PM, Daniel Shahaf  wrote:
> pbu...@apache.org wrote on Tue, Sep 28, 2010 at 18:38:19 -:
>> +  # Run propget -vR svn:mergeinfo and collect the stdout.
>> +  exit_code, pg_stdout, pg_stderr = svntest.actions.run_and_verify_svn(
>> +    None, None, [], 'propget', SVN_PROP_MERGEINFO, '-vR', wc_dir)
>> +
>
> exit_code and pg_stderr aren't checked anywhere.

Neither is pg_stdout...but that entire statement was cruft from an
earlier version of the test; removed it.

>> +  # Run propget -vR svn:mergeinfo, redirecting the stdout to a file.
>> +  arglist = ['svn.exe', 'propget', SVN_PROP_MERGEINFO, '-vR', wc_dir]
>
> s/.exe// ?

Actually that should ideally be 'svntest.main.svn_binary'.  Fixed that.

>> +  redir_file = open(redirect_file, 'wb')
>> +  pg_proc = subprocess.Popen(arglist, stdout=redir_file)
>
> Shouldn't this use the svntest/ infrastructure?  Compare
> svntest.actions.check_prop().

I didn't use it for two reasons:

First, svntest.actions.check_prop() only supports finding the props on
a single path (and as far as I can tell that works fine, no issue
#3721).

Second, Issue #3721 only occurs when the output of 'svn pg -vR' is
*redirected to a file* - see
http://subversion.tigris.org/issues/show_bug.cgi?id=3721#desc1.
check_prop() is (obviously) all processes and pipes underneath the
covers, so while it may also be possible to show the bug using it, I
wrote the test to hew as closely to the actual bug I witnessed as
possible.
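The file-versus-pipe distinction Paul draws can be demonstrated with plain subprocess calls. This is a generic sketch, not the svntest infrastructure; a Python child process stands in for svn:

```python
import os
import subprocess
import sys
import tempfile

# A pipe: the helper infrastructure gives the child a pipe on stdout.
piped = subprocess.run([sys.executable, '-c', 'print("hello")'],
                       stdout=subprocess.PIPE)

# A real file: the child writes straight to a file-backed descriptor,
# which is what a shell redirection like 'svn pg -vR > out.txt' does.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, 'w') as redir_file:
    subprocess.run([sys.executable, '-c', 'print("hello")'],
                   stdout=redir_file)
with open(path) as f:
    redirected = f.read()
os.remove(path)
```

The child can detect the difference (via isatty/fstat on its stdout), which is why a bug like issue #3721 can show up only in the redirected case.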

>> +  pg_proc.wait()
>> +  redir_file.close()
>> +  pg_stdout_redir = open(redirect_file, 'r').readlines()
>> +
>> +  # Check if the redirected output of svn pg -vR is what we expect.
>> +  #
>> +  # Currently this fails because the mergeinfo for the three paths is
>> +  # interleaved and the line endings are (at least on Windows) a mix
>> +  # of  and . See
>> +  # http://subversion.tigris.org/issues/show_bug.cgi?id=3721#desc1
>> +  unordered_expected_output = svntest.verify.UnorderedOutput([
>> +    "Properties on '" + B_path +  "':\n",
>> +    "Properties on '" + C_path +  "':\n",
>> +    "Properties on '" + D_path +  "':\n",
>> +    "  svn:mergeinfo\n",
>> +    "    /subversion/branches/1.5.x:872364-874936\n",
>> +    "    /subversion/branches/1.5.x-34184:874657-874741\n",
>> +    "    /subversion/branches/1.5.x-34432:874744-874798\n",
>
> So, 'unordered' also ignores repetitions?  (since the last 4 lines
> appear only once each, rather than three times each)

I think you mean the first 3 lines appear only once, all the other
lines appear 3 times each (because the test sets the same mergeinfo on
all three paths and the expected output is for svn pg -v***R***).

But yeah, this test isn't perfect as it would allow repetitions.  That
is the price we pay when using svntest.verify.UnorderedOutput(), which
is required here because there is no guarantee as to the order in
which svn pg -vR will report the three paths.  I tweaked the test to
check the expected number of lines in the actual redirected output, so
that we catch any dups.
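The count-sensitive unordered comparison Paul ended up with (order ignored, but duplicates caught) can be phrased directly with collections.Counter; this is a sketch, not how svntest.verify.UnorderedOutput is implemented:

```python
from collections import Counter

def lines_match_unordered(actual, expected):
    """True when both line multisets are identical: ordering is
    ignored, but each line must occur the same number of times,
    so repetition bugs are caught."""
    return Counter(actual) == Counter(expected)
```

Comparing Counters is equivalent to the "unordered match plus a line-count check" that the tweaked test performs in two steps.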

Paul


Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster

2010-09-28 Thread Johan Corveleyn
Hi Daniel,

Thanks for the feedback.

On Tue, Sep 28, 2010 at 4:11 PM, Daniel Shahaf  wrote:
>> Index: subversion/include/svn_diff.h
>> ===
>> --- subversion/include/svn_diff.h     (revision 1001548)
>> +++ subversion/include/svn_diff.h     (working copy)
>> @@ -112,6 +112,11 @@
> (personally I prefer 'svn diff -x-p' to show the function name here)

Ok, will do next time.

>>    svn_error_t *(*datasource_open)(void *diff_baton,
>>                                    svn_diff_datasource_e datasource);
>>
>> +  /** Open the datasources of type @a datasources. */
>> +  svn_error_t *(*datasources_open)(void *diff_baton, apr_off_t *prefix_lines,
>> +                                   svn_diff_datasource_e datasource0,
>> +                                   svn_diff_datasource_e datasource1);
>> +
>
> So, you're extending the svn_diff_fns_t struct, which is defined in
> a public header.
>
> It's a public struct with no constructor function, so I believe you have
> to revv it (into svn_diff_fns2_t) in order to extend it (for binary
> compatibility: people allocating this struct and then using a newer
> Subversion library at runtime).
>
> If it did have a constructor function, you'd have to extend it only at
> the end, and even then make sure that if the added elements are NULL (eg
> because an old caller didn't know to fill them) then everything still
> worked.
>
> Probably more important to get the logic right than to revv it right
> away, though; the latter is a SMOP.

Doh! I'm sure that observation was in the back of my head somewhere,
but I forgot about it while working on the solution. Anyway, you're
right: there is first some more work to get the algorithm right.

I've had some progress:
- The blame_tests.py tests now all pass (tests 2 and 7 provoked a core
dump), so all tests pass now. However, although I fixed the core dump,
I don't fully understand the root cause (why they ended up in that
situation), so I'm going to study it a bit more first.
- I have now included support for files with \r eol-style.

I'll send a new version of the patch shortly.

I'm also thinking that I could try to take advantage of the -x-b/-x-w
or -x--ignore-eol-style options to make it even faster (right now the
prefix/suffix scanning stops at the first difference, regardless of
whether it's a whitespace or eol difference that could/should be ignored).

However, I'm not sure if I should implement this myself, as part of
the find_identical_prefix and find_identical_suffix functions, or
switch to the usage of datasource_get_next_token (which is the
function that is used by the "real" diff algorithm to extract the
lines, and which uses util.c#svn_diff__normalize_buffer to normalize
whitespace and eol's if needed).

Right now, I don't read entire lines (tokens) but compare each byte as
I go. But I could do it line-based as well (extract line from file1,
extract line from file2, memcmp lines). I would have to make the
calculation of the adler32 hash in datasource_get_next_token
conditional on some boolean, or factor out the part of the function
that's useful to me into a new static function.

There is one caveat to this approach: I'm not sure if it would work
backwards (for suffix scanning). Well, the normalization function
wouldn't have to be changed, but the extraction of lines would have to
go backward. Surely it's possible, but I have no idea how much I'd
have to change the code in get_next_token to get lines backwards...

I'm also not sure if one would be (significantly) faster than the
other: comparing byte-by-byte while going through both files, or
extracting entire lines and then comparing the lines.
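The two strategies Johan weighs can be sketched in Python. This is a simplification (the real implementation is C inside libsvn_diff, and a tokenizer like datasource_get_next_token also normalizes whitespace and eols):

```python
def identical_prefix_bytes(a, b):
    """Byte-by-byte scan: number of leading bytes the buffers share."""
    n, limit = 0, min(len(a), len(b))
    while n < limit and a[n] == b[n]:
        n += 1
    return n

def identical_prefix_lines(a, b):
    """Line-based scan: extract whole lines (as a get_next_token-style
    tokenizer would) and compare each pair with one equality test."""
    count = 0
    for la, lb in zip(a.splitlines(True), b.splitlines(True)):
        if la != lb:
            break
        count += 1
    return count
```

For a line-based diff, a byte count would still have to be snapped back to the last complete line boundary, which is one reason the line-based variant composes more naturally with the rest of the algorithm.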

Thoughts?

Cheers,
-- 
Johan


Re: svn commit: r1001413 - in /subversion/branches/performance/subversion: include/svn_string.h libsvn_subr/svn_string.c

2010-09-28 Thread Ed Price
One correction (in the comments -- which are great, BTW):

s/descression/discretion

-Ed


On Mon, Sep 27, 2010 at 5:05 PM, Hyrum K. Wright
 wrote:
> +1 to merge to trunk.
>
> On Sun, Sep 26, 2010 at 6:01 AM,   wrote:
>> Author: stefan2
>> Date: Sun Sep 26 11:01:03 2010
>> New Revision: 1001413
>>
>> URL: http://svn.apache.org/viewvc?rev=1001413&view=rev
>> Log:
>> Extensively document the benefits of svn_stringbuf_appendbyte and
>> the rationals behind its implementation. To simplify the explanations,
>> one statement had to be moved.
>>
>> * subversion/include/svn_string.h
>>  (svn_stringbuf_appendbyte): extend docstring to indicate that this method
>>  is cheaper to call and execute
>> * subversion/libsvn_subr/svn_string.c
>>  (svn_stringbuf_appendbyte): reorder statements for easier description;
>>  add extensive description about the optimizations done
>>
>> Modified:
>>    subversion/branches/performance/subversion/include/svn_string.h
>>    subversion/branches/performance/subversion/libsvn_subr/svn_string.c
>>
>> Modified: subversion/branches/performance/subversion/include/svn_string.h
>> URL: 
>> http://svn.apache.org/viewvc/subversion/branches/performance/subversion/include/svn_string.h?rev=1001413&r1=1001412&r2=1001413&view=diff
>> ==
>> --- subversion/branches/performance/subversion/include/svn_string.h 
>> (original)
>> +++ subversion/branches/performance/subversion/include/svn_string.h Sun Sep 
>> 26 11:01:03 2010
>> @@ -259,6 +259,10 @@ void
>>  svn_stringbuf_fillchar(svn_stringbuf_t *str, unsigned char c);
>>
>>  /** Append a single character @a byte onto @a targetstr.
>> + * This is an optimized version of @ref svn_stringbuf_appendbytes
>> + * that is much faster to call and execute. Gains vary with the ABI.
>> + * The advantages extend beyond the actual call because the reduced
>> + * register pressure allows for more optimization within the caller.
>>  *
>>  * reallocs if necessary. @a targetstr is affected, nothing else is.
>>  * @since New in 1.7.
>>
>> Modified: subversion/branches/performance/subversion/libsvn_subr/svn_string.c
>> URL: 
>> http://svn.apache.org/viewvc/subversion/branches/performance/subversion/libsvn_subr/svn_string.c?rev=1001413&r1=1001412&r2=1001413&view=diff
>> ==
>> --- subversion/branches/performance/subversion/libsvn_subr/svn_string.c 
>> (original)
>> +++ subversion/branches/performance/subversion/libsvn_subr/svn_string.c Sun 
>> Sep 26 11:01:03 2010
>> @@ -393,28 +393,60 @@ svn_stringbuf_ensure(svn_stringbuf_t *st
>>  }
>>
>>
>> +/* WARNING - Optimized code ahead!
>> + * This function has been hand-tuned for performance. Please read
>> + * the comments below before modifying the code.
>> + */
>>  void
>>  svn_stringbuf_appendbyte(svn_stringbuf_t *str, char byte)
>>  {
>> +  char *dest;
>> +  apr_size_t old_len = str->len;
>> +
>>   /* In most cases, there will be pre-allocated memory left
>>    * to just write the new byte at the end of the used section
>>    * and terminate the string properly.
>>    */
>> -  apr_size_t old_len = str->len;
>> -  if (str->blocksize > old_len + 1)
>> +  if (str->blocksize < old_len + 1)
>>     {
>> -      char *dest = str->data;
> >> +       * The following read does not depend on this write, so we
>> +       * can issue the write first to minimize register pressure:
>> +       * The value of old_len+1 is no longer needed; on most processors,
>> +       * dest[old_len+1] will be calculated implicitly as part of
>> +       * the addressing scheme.
>> +       */
>> +      str->len = old_len+1;
>> +
>> +      /* Since the compiler cannot be sure that *src->data and *src
>> +       * don't overlap, we read src->data *once* before writing
>> +       * to *src->data. Replacing dest with str->data would force
>> +       * the compiler to read it again after the first byte.
>> +       */
>> +      dest = str->data;
>>
>> +      /* If not already available in a register as per ABI, load
>> +       * "byte" into the register (e.g. the one freed from old_len+1),
>> +       * then write it to the string buffer and terminate it properly.
>> +       *
>> +       * Including the "byte" fetch, all operations so far could be
>> +       * issued at once and be scheduled at the CPU's descression.
>> +       * Most likely, no-one will soon depend on the data that will be
>> +       * written in this function. So, no stalls there, either.
>> +       */
>>       dest[old_len] = byte;
>>       dest[old_len+1] = '\0';
>> -
>> -      str->len = old_len+1;
>>     }
>>   else
>>     {
>>       /* we need to re-allocate the string buffer
>>        * -> let the more generic implementation take care of that part
>>        */
>> +
>> +      /* Depending on the ABI, "byte" is a register value. If we were
>> +       * to take its address directly, the compiler might decide to
>> +       * put in on the stack *uncond

Re: [WIP] Gracefully handle svn_wc_prop_set4() errors in svn patch

2010-09-28 Thread Daniel Näslund
On Tue, Sep 28, 2010 at 07:18:39PM +0100, Philip Martin wrote:
> Daniel Näslund  writes:
> 
> > On Tue, Sep 28, 2010 at 03:45:33PM +0200, Daniel Shahaf wrote:
> >> Daniel Näslund wrote on Mon, Sep 27, 2010 at 22:09:26 +0200:
> >> 
> >> err_abort() is called when an error object hadn't been svn_error_clear()'d.
> >> (The error creation installed err_abort() as a pool cleanup callback,
> >> and clearing the error unregisters the callback.)
> >
> > Yes, that was how I understood it.
> 
> If you run the program under gdb, it will catch the abort.  If you step up
> the stack to err_abort and print err[0] then you will see the file and
> line where the error was created.  (You may well have worked this out
> already).

Turned out that it was caused by prop_target->was_deleted (the flag that
was set when an error was detected) not being initialized to FALSE.

Thanks for the suggestion on checking err. Didn't think of that (but I
really should have!).

Thanks,
Daniel


Re: svn commit: r1002313 - /subversion/trunk/subversion/tests/cmdline/prop_tests.py

2010-09-28 Thread Daniel Shahaf
pbu...@apache.org wrote on Tue, Sep 28, 2010 at 18:38:19 -:
> +  # Run propget -vR svn:mergeinfo and collect the stdout.
> +  exit_code, pg_stdout, pg_stderr = svntest.actions.run_and_verify_svn(
> +    None, None, [], 'propget', SVN_PROP_MERGEINFO, '-vR', wc_dir)
> +

exit_code and pg_stderr aren't checked anywhere.

> +  # Run propget -vR svn:mergeinfo, redirecting the stdout to a file.
> +  arglist = ['svn.exe', 'propget', SVN_PROP_MERGEINFO, '-vR', wc_dir]

s/.exe// ?

> +  redir_file = open(redirect_file, 'wb')
> +  pg_proc = subprocess.Popen(arglist, stdout=redir_file)

Shouldn't this use the svntest/ infrastructure?  Compare
svntest.actions.check_prop().

> +  pg_proc.wait()
> +  redir_file.close()
> +  pg_stdout_redir = open(redirect_file, 'r').readlines()
> +
> +  # Check if the redirected output of svn pg -vR is what we expect.
> +  #
> +  # Currently this fails because the mergeinfo for the three paths is
> +  # interleaved and the line endings are (at least on Windows) a mix
> +  # of  and . See
> +  # http://subversion.tigris.org/issues/show_bug.cgi?id=3721#desc1
> +  unordered_expected_output = svntest.verify.UnorderedOutput([
> +"Properties on '" + B_path +  "':\n",
> +"Properties on '" + C_path +  "':\n",
> +"Properties on '" + D_path +  "':\n",
> +"  svn:mergeinfo\n",
> +"/subversion/branches/1.5.x:872364-874936\n",
> +"/subversion/branches/1.5.x-34184:874657-874741\n",
> +"/subversion/branches/1.5.x-34432:874744-874798\n",

So, 'unordered' also ignores repetitions?  (since the last 4 lines
appear only once each, rather than three times each)


Re: svn commit: r1002151 - in /subversion/trunk/subversion: libsvn_auth_kwallet/kwallet.cpp svn/main.c svnsync/main.c

2010-09-28 Thread Blair Zajac

On 9/28/2010 6:07 AM, s...@apache.org wrote:

Author: stsp
Date: Tue Sep 28 13:07:23 2010
New Revision: 1002151

URL: http://svn.apache.org/viewvc?rev=1002151&view=rev
Log:
Remove some goo introduced in r878078 and follow-ups, which was related to
the Linux-specific code which has been removed in r1002144.



-  INITIALIZE_APPLICATION
+  QCoreApplication *app;
+  if (! qApp)
+    {
+      int argc = 1;
+      app = new QCoreApplication(argc, (char *[1]) {(char *) "svn"});
+    }


Out of curiosity, what does this do?

QCoreApplication *app isn't static, so does this do some setup logic 
that we don't need to keep track of?


Blair


Re: [WIP] Gracefully handle svn_wc_prop_set4() errors in svn patch

2010-09-28 Thread Philip Martin
Daniel Näslund  writes:

> On Tue, Sep 28, 2010 at 03:45:33PM +0200, Daniel Shahaf wrote:
>> Daniel Näslund wrote on Mon, Sep 27, 2010 at 22:09:26 +0200:
>> 
>> err_abort() is called when an error object hadn't been svn_error_clear()'d.
>> (The error creation installed err_abort() as a pool cleanup callback,
>> and clearing the error unregisters the callback.)
>
> Yes, that was how I understood it.

If you run the program under gdb, it will catch the abort.  If you step up
the stack to err_abort and print err[0] then you will see the file and
line where the error was created.  (You may well have worked this out
already).

>> > @@ -2260,14 +2283,23 @@ install_patched_prop_targets(patch_target_t *targe
>> > +      err = svn_wc_prop_set4(ctx->wc_ctx, target->local_abspath,
>> > +                             prop_target->name,
>> > +                             svn_string_create_from_buf(prop_content,
>> > +                                                        iterpool),
>> > +                             TRUE /* skip_checks */,
>> > +                             NULL, NULL,
>> > +                             iterpool);
>> > +      if (err)
>> > +        {
>> > +          prop_target->was_rejected = TRUE;
>> > +          prop_target->err = err;
>> 
>> Does prop_target->err always get cleared?
>> 
>> (The answer is probably "No".)
>
> As I said above, my intention was to clear it in
> send_patch_notification().

You will still have a problem if there is more than one error: if the
assignment above overwrites a previous err, the overwritten err will
have leaked.
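One way to model Philip's point in a sketch (hypothetical structure, not Subversion's actual types): accumulate errors instead of overwriting a single slot, so no earlier error is ever dropped. The C analogue would be chaining with something like svn_error_compose_create rather than plain assignment:

```python
class PropTarget:
    """Hypothetical stand-in for prop_patch_target_t: errors are kept
    in a list, so recording a second error never drops the first."""
    def __init__(self, name):
        self.name = name
        self.was_rejected = False
        self.errs = []

def record_error(target, err):
    # Append rather than assign: an overwrite here is exactly the
    # leak being described, since the old error would be lost
    # without ever being cleared.
    target.errs.append(err)
    target.was_rejected = True

target = PropTarget('svn:mergeinfo')
record_error(target, ValueError('first failure'))
record_error(target, ValueError('second failure'))
```

In Python the garbage collector hides the leak, but the ownership rule is the same: whoever stores the error is responsible for making sure every stored error is eventually cleared.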

-- 
Philip


Re: [WIP] Gracefully handle svn_wc_prop_set4() errors in svn patch

2010-09-28 Thread Daniel Näslund
On Tue, Sep 28, 2010 at 03:45:33PM +0200, Daniel Shahaf wrote:
> Daniel Näslund wrote on Mon, Sep 27, 2010 at 22:09:26 +0200:
> > * What is wrong with the way I handle the error? I hit err_abort() when
> >   the program terminates. (I'm expecting the answer to hurt a bit - this
> >   is surely something that I should have understood by now). I thought
> >   that since the error is allocated on the heap I could just store the
> >   pointer to it and free it later, e.g. call svn_error_clear().
> 
> err_abort() is called when an error object hadn't been svn_error_clear()'d.
> (The error creation installed err_abort() as a pool cleanup callback,
> and clearing the error unregisters the callback.)

Yes, that was how I understood it.

> So, yeah, you can do whatever you want with the error (they get
> allocated in a global pool) as long as you svn_error_clear() them
> eventually.

Ok.

> > Index: subversion/svn/notify.c
> > ===
> > --- subversion/svn/notify.c (revision 1001829)
> > +++ subversion/svn/notify.c (arbetskopia)
> > @@ -464,6 +464,16 @@ notify(void *baton, const svn_wc_notify_t *n, apr_
> >  goto print_error;
> >break;
> >  
> > +case svn_wc_notify_failed_prop_patch:
> > +  nb->received_some_change = TRUE;
> > +  err = svn_cmdline_printf(pool,
> > +   _("property '%s' rejected from '%s'.\n"),
> > +   n->prop_name, path_local);
> > +  svn_handle_warning2(stderr, n->err, "svn: ");
> > +  if (err)
> > +goto print_error;
> > +  break;
> > +
> 
> That's fine, print_error: clears the error.

> > Index: subversion/libsvn_client/patch.c
> > ===
> > --- subversion/libsvn_client/patch.c(revision 1001829)
> > +++ subversion/libsvn_client/patch.c(arbetskopia)
> > @@ -130,6 +130,12 @@ typedef struct prop_patch_target_t {
> >  
> >/* ### Here we'll add flags telling if the prop was added, deleted,
> > * ### had_rejects, had_local_mods prior to patching and so on. */
> > +
> > +  /* TRUE if the property could not be set on the path. */
> > +  svn_boolean_t was_rejected;
> > +
> > +  /* Set if was_rejected is TRUE. */
> > +  svn_error_t *err;
> >  } prop_patch_target_t;
> >  
> >  typedef struct patch_target_t {
> > @@ -1573,6 +1579,22 @@ send_patch_notification(const patch_target_t *targ
> >
> >prop_target = svn__apr_hash_index_val(hash_index);
> >  
> > +      if (prop_target->was_rejected)
> > +        {
> > +          svn_wc_notify_t *notify;
> > +          svn_wc_notify_action_t action = svn_wc_notify_failed_prop_patch;
> > +
> > +          notify = svn_wc_create_notify(target->local_abspath
> > +                                        ? target->local_abspath
> > +                                        : target->local_relpath,
> > +                                        action, pool);
> > +          notify->prop_name = prop_target->name;
> > +          notify->err = prop_target->err;
> > +
> > +          (*ctx->notify_func2)(ctx->notify_baton2, notify, pool);
> > +          svn_error_clear(prop_target->err);

Here I'm clearing prop_target->err. Since prop_target->was_rejected is
only set when an error exists (i.e. prop_target->err != NULL), I was
expecting that err would always be cleared.

[...]

> > @@ -2260,14 +2283,23 @@ install_patched_prop_targets(patch_target_t *targe
> > +  err = svn_wc_prop_set4(ctx->wc_ctx, target->local_abspath,
> > + prop_target->name,
> > + svn_string_create_from_buf(prop_content, 
> > +iterpool),
> > + TRUE /* skip_checks */,
> > + NULL, NULL,
> > + iterpool);
> > +  if (err)
> > +{
> > +  prop_target->was_rejected = TRUE;
> > +  prop_target->err = err;
> 
> Does prop_target->err always get cleared?
> 
> (The answer is probably "No".)

As I said above, my intention was to clear it in
send_patch_notification().

I'll check again and see if I can spot why the err isn't always cleared.

Thanks for your feedback,
Daniel (dannas)


Re: svn commit: r1002271 - /subversion/trunk/subversion/include/svn_client.h

2010-09-28 Thread Daniel Shahaf
julianf...@apache.org wrote on Tue, Sep 28, 2010 at 17:16:59 -:
> Author: julianfoad
> Date: Tue Sep 28 17:16:59 2010
> New Revision: 1002271
> 
> URL: http://svn.apache.org/viewvc?rev=1002271&view=rev
> Log:
> * subversion/include/svn_client.h
>   (svn_client_move4): Document the current rather than the historical
> behaviour of the 'force' flag.  A follow-up to r1002260.
> 
> +++ subversion/trunk/subversion/include/svn_client.h Tue Sep 28 17:16:59 2010
> @@ -3892,11 +3892,10 @@ svn_client_move5(svn_commit_info_t **com
>   * move_as_child set to @c FALSE, @a revprop_table passed as NULL, and
>   * @a make_parents set to @c FALSE.
>   *
> - * If @a src_path is a working copy path:
> - *
> - *   - If one of @a src_paths contains locally modified and/or unversioned
> - * items and @a force is not set, the move will fail. If @a force is set
> - * such items will be removed.
> + * Note: The behaviour of @a force changed in r860885 and r861421, when the

Given that it's a public API's docstring, wouldn't it make more sense
to talk here in terms of release numbers than revision numbers?

i.e., "the behaviour of @a force changed in 1.7.2 (r860885 and r861421) ..."

> + * 'move' semantics were improved to just move the source including any
> + * modified and/or unversioned items in it.  Before that, @a force
> + * controlled what happened to such items, but now @a force is ignored.
>   *
>   * @since New in 1.4.
>   *
> 
> 


Re: Some tips on profiling

2010-09-28 Thread Philip Martin
Stefan Fuhrmann  writes:

>  On 27.09.2010 21:44, Ramkumar Ramachandra wrote:
>> Could you tell me which tools you use to profile the various
>> applications in trunk? I'm looking to profile svnrdump to fix some
>> perf issues, but OProfile doesn't seem to work for me.
>
> Under Linux, I'm using Valgrind / Callgrind and visualize in KCachegrind.
> That gives me a good idea of what code gets executed too often, how
> often a jump (loop or condition) has been taken etc. It will not show the
> non-user and non-CPU runtime, e.g. wait for disk I/O. And it will slow the
> execution by a factor of 100 (YMMV).

The performance of svnrdump is likely to be dominated by IO from the
repository, network or disk depending on the RA layer.  strace is a
useful tool to see opens/reads/writes.  You can see what order the
calls occur, how many there are, how big they are and how long they
take.

Valgrind/Callgrind is good and doesn't require you to instrument the
code, but it does help to build with debug information.  It does
impose a massive runtime overhead.

OProfile will give you CPU usage with far lower overhead than
valgrind/callgrind.  Like valgrind/callgrind you don't need to
instrument the code but it works better with debug information and
with modern gcc if you use -O2 then you need -fno-omit-frame-pointer
for the callgraph stuff to work.  I use it like so:

opcontrol --init
opcontrol --no-vmlinux --separate=library --callgraph=32
opcontrol --start
opcontrol --reset
subversion/svnrdump/svnrdump ...
opcontrol --stop
opcontrol --dump
opreport --merge all -l image:/path/to/lt-svnrdump

This is what I get when dumping 1000 revisions from a local mirror of
the Subversion repository over ra_neon:

CPU: Core 2, speed 1200 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 10
samples  %        app name                 symbol name
4738     41.1893  no-vmlinux               (no symbols)
1037      9.0150  libxml2.so.2.6.32        (no symbols)
700       6.0854  libneon.so.27.1.2        (no symbols)
238       2.0690  libc-2.7.so              _int_malloc
228       1.9821  libc-2.7.so              memcpy
221       1.9212  libc-2.7.so              memset
217       1.8865  libc-2.7.so              strlen
191       1.6604  libsvn_subr-1.so.0.0.0   decode_bytes
180       1.5648  libc-2.7.so              vfprintf
171       1.4866  libc-2.7.so              strcmp
153       1.3301  libapr-1.so.0.2.12       apr_hashfunc_default
134       1.1649  libapr-1.so.0.2.12       apr_vformatter
130       1.1301  libapr-1.so.0.2.12       apr_palloc

That's on my Debian desktop.  At the recent Apache Retreat I tried to
demonstrate OProfile on my Ubuntu laptop and could not get it to work
properly, probably because I forgot about -fno-omit-frame-pointer.

Finally there is traditional gprof.  It's a long time since I used it
so I don't remember the details.  You instrument the code at compile
time using CFLAGS=-pg.  If an instrumented function foo calls into a
library bar that is not instrumented then bar is invisible, all you
see is how long foo took to execute.

-- 
Philip


Re: Some tips on profiling

2010-09-28 Thread Ramkumar Ramachandra
Hi Stefan,

Stefan Fuhrmann writes:
> Under Linux, I'm using Valgrind / Callgrind and visualize in KCachegrind.
> That gives me a good idea of what code gets executed too often, how
> often a jump (loop or condition) has been taken etc. It will not show the
> non-user and non-CPU runtime, e.g. wait for disk I/O. And it will slow the
> execution by a factor of 100 (YMMV).

Ah, they're brilliant! Just what I was looking for :D I've yet to
interpret the output and use it to optimize svnrdump though; I might
need some more help there. Are you available on #svn-dev or something
sometime? Will keep you updated on this either way.

> Under Windows, I'm using the VisualStudio sampling profiler. The
> measurements
> are pretty accurate and the overhead is low. It does not tell me, how often
> a certain code path has been executed. Due to the low overhead, it is
> well-suited for long running (>1 min) operations.

Interesting. I didn't know Visual Studio had a profiler. Then again,
I've not used Windows for several years now.

> Also, I find a simple "time <command>" very useful to get a first impression
> whether my code is bound by user-runtime, I/O wait or system runtime.

Ah, I want to know how to interpret this correctly. For equivalent
operations
svnrdump says: 7.49s user 1.34s system 91% cpu 9.682 total
svnadmin says: 4.84s user 0.44s system 97% cpu 5.417 total
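
The arithmetic behind those two lines can be checked directly (my own
sketch, not from the thread): the "% cpu" column is just (user + system)
divided by wall-clock time, and whatever is left over was spent off-CPU,
e.g. blocked on I/O.

```python
# Sanity-check the two `time` lines quoted above: "% cpu" is
# (user + system) / total wall-clock; the remainder is off-CPU time
# (I/O wait, scheduler delays, ...).

def cpu_share(user, system, total):
    """Fraction of wall-clock time the process actually spent on a CPU."""
    return (user + system) / total

# svnrdump: 7.49s user 1.34s system 91% cpu 9.682 total
print(round(100 * cpu_share(7.49, 1.34, 9.682)))  # -> 91
# svnadmin: 4.84s user 0.44s system 97% cpu 5.417 total
print(round(100 * cpu_share(4.84, 0.44, 5.417)))  # -> 97
```

So svnrdump spends roughly 9% of its wall-clock time waiting rather
than computing, against about 3% for svnadmin, and its user time alone
is over 50% higher.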

> To collect data on I/O activity, compression rates and other not readily
> available information, I use simple static counters and printf()
> their values
> at the end of each client call, for instance.

I see.

> Hope that helps.

A ton! Thanks a lot :)

-- Ram


Re: add NODES.prior_deleted boolean column

2010-09-28 Thread Greg Stein
On Tue, Sep 28, 2010 at 08:46, Philip Martin  wrote:
> Julian Foad  writes:
>
>>> I believe the answer is "often". A simple 'svn status' will need to
>>> distinguish between 'A' and 'R', and that is done by checking
>>> prior_deleted.
>>>
>>> And no... this amount of denormalization will not hurt us.
>>
>> OK.  I know that "svn status" speed is a big deal.
>
> I'm not sure prior_deleted is an optimisation.  When I last looked at

Great analysis below. I'll take your word for it :-)

prior_deleted is effectively something like base_shadowed, and yeah...
if we're iterating over all nodes within a parent, then it can be
generated "simply".

> the performance of SQLite I concluded that status would be too slow if
> it ran per-node queries, it has to run per-dir queries.  So
>
>  SELECT ... FROM BASE_NODE WHERE wc_id = ?1 AND local_relpath = ?2
>
> is too slow, we need to run
>
>  SELECT ... FROM BASE_NODE WHERE wc_id = ?1 AND parent_relpath = ?2
>
> iterate over the rows caching the data and then generate the status
> for each child.

To do this, it seems that we're going to need to expose (from wc_db.h)
a structure containing "all" the row data. Thankfully, this big ol'
structure is *private* to libsvn_wc, and we can change it at will
(unlike svn_wc_status_t). I would suggest that we use a callback --
wc_db can step through each row of the result, fill in the structure,
and execute a callback (clearing an iterpool on each "step" with the
cursor).

The one tricky part to a callback is eliding all but the highest
op_depth. Does an ORDER BY in the query affect its runtime at all?

>...
> but that nested SELECT is expensive.  It's not as bad as doing
> per-node queries but it is still too slow, the database query alone is
> slower than 1.6 status.  I don't think this is something that can be
> fixed using an index because on my test data the runtime generally
> goes up by orders of magnitude when the indexing is wrong.

Yeah... the more indexes present, the more that need to be maintained
when rows are modified.

> I can get round this by querying for all op_depth and using the cache
> to select the highest.  The cache is a hash, keyed on local_relpath,
> that stores the data associated with the highest op_depth seen and it
> simply discards/overwrites lower op_depth.  I've prototyped this and

If we order by local_relpath, then the "cache" is simply one row
(hanging out in a structure, waiting to be passed to that callback).

> it's as fast as the per-dir BASE_NODE query on my test data.  This is
> not surprising since my test data consists of mostly single op_depth
> with a few double op_depth.  We have to have good performance on such
> working copies because they are the most common, I think it will be
> unusual to have a large working copy where lots of nodes have a large
> number of op_depth.

Agreed.

> Now to prior_deleted.  The fundamental status query looks like
>
>  SELECT ... FROM NODES WHERE wc_id = ?1 AND parent_relpath = ?2
>
> and all op_depth are seen.  It's quite simple to cache the two highest
> op_depth so prior_deleted only provides information that is already
> available. It's not an optimisation for status, if anything it will
> make the database bigger and slower.

Again, thanks for the analysis. Okay... let's skip it!

>...

Cheers,
-g


Re: svnrdump tool

2010-09-28 Thread Ramkumar Ramachandra
Hi Daniel,

Daniel Shahaf writes:
> > $ cat /test.dump
> > Node-path: test.txt
> > Node-path: trunk
> > Node-path: trunk/A
> > Node-path: trunk/A/B
> > Node-path: trunk/A/B/E
> > Node-path: trunk/A/B/E/alpha
> > Node-path: trunk/A/B/E/beta
> > Node-path: trunk/A/B/F
> > Node-path: trunk/A/B/lambda
> > Node-path: trunk/A/C
> > Node-path: trunk/A/D
> > Node-path: trunk/A/D/G
> > Node-path: trunk/A/D/G/pi
> > Node-path: trunk/A/D/G/rho
> > Node-path: trunk/A/D/G/tau
> > Node-path: trunk/A/D/H
> > Node-path: trunk/A/D/H/chi
> > Node-path: trunk/A/D/H/omega
> > Node-path: trunk/A/D/H/psi
> > Node-path: trunk/A/D/gamma
> > Node-path: trunk/A/mu
> > Node-path: trunk/iota
> > Node-path: trunk/A/D/H/psi
> > Node-path: trunk/A
> > Node-path: trunk/iota
> > Node-path: trunk/B
> > Node-path: trunk/A
> > Node-path: trunk/A
> > Node-path: trunk/B/D
> > Node-path: trunk
> 
> Why does "Node-path: trunk" appear twice?

Some prop change. Unrelated anyway.

-- Ram


Re: migration to NODES

2010-09-28 Thread Hyrum K. Wright
On Tue, Sep 28, 2010 at 4:47 AM, Greg Stein  wrote:
> On Mon, Sep 27, 2010 at 13:25, Julian Foad  wrote:
>> On Fri, 2010-09-24, Greg Stein wrote:
>>> On Fri, Sep 24, 2010 at 11:16, Julian Foad  wrote:
>>> >...
>>> > I think we should produce a test framework that can give us a WC
>>> > containing all the different possible WC states.  Then we can write
>>> > tests against this framework, some tests that test specific state, and
>>> > other tests that apply the same operations to all the states and check
>>> > that it works in all the states that it should.
>>>
>>> This requires a manual process of thinking of all states and all
>>> permutations. I don't trust it.
>>
>> This kind of testing is more about checking that the design is
>> implemented and that the code paths are exercised ...
>>
>>> If we could somehow capture working copies from during normal test
>>> runs, *then* I think we'd have "everything". We can easily get the
>>> terminal state for each test, which is a great first step. It would be
>>> great if we could also get the intermediate steps.
>>
>> ... while this kind is more about checking for regression or unwanted
>> changes of behaviour.  The two approaches are complementary.
>
> Fair enough.
>
> I took the 1.6.x branch and ran the test suite. It produced about 950
> working copies, and the (compressed) tarball sits at about 850k. I'm
> thinking about recording 'svn status' for each working copy, and
> checking the tarball in as test data. We can then run an 'svn upgrade'
> on each working copy, then 'svn status', and verify that the two
> status results match. (caveat minor improvements in state tracking and
> status reporting)
>
> But... running upgrade on about 950 working copies and checking their
> status isn't cheap. The tarball sizes don't bother me too much... I'm
> more concerned about test suite runtime.
>
> Anybody else? Should this be "normal test run"? Or should we set up an
> "extended" set of tests and drop this into that batch?

I've been advocating for a long time the need for an "extended" test
suite.  (I mean, really, merge_tests.py exercises all of the "basic"
functions in so many more ways than basic_tests.py, that the latter is
pretty much redundant when doing pre-commit checks.)  This might be an
opportunity to start that ball rolling, but it's probably also a lot
of work. :)

-Hyrum


Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster

2010-09-28 Thread Daniel Shahaf
> Index: subversion/include/svn_diff.h
> ===================================================================
> --- subversion/include/svn_diff.h (revision 1001548)
> +++ subversion/include/svn_diff.h (working copy)
> @@ -112,6 +112,11 @@
(personally I prefer 'svn diff -x-p' to show the function name here)
>svn_error_t *(*datasource_open)(void *diff_baton,
>svn_diff_datasource_e datasource);
>  
> +  /** Open the datasources of type @a datasources. */
> +  svn_error_t *(*datasources_open)(void *diff_baton, apr_off_t *prefix_lines,
> +   svn_diff_datasource_e datasource0,
> +   svn_diff_datasource_e datasource1);
> +

So, you're extending the svn_diff_fns_t struct, which is defined in
a public header.

It's a public struct with no constructor function, so I believe you have
to revv it (into svn_diff_fns2_t) in order to extend it (for binary
compatibility: people allocating this struct and then using a newer
Subversion library at runtime).

If it did have a constructor function, you'd have to extend it only at
the end, and even then make sure that if the added elements are NULL (eg
because an old caller didn't know to fill them) then everything still
worked.

Probably more important to get the logic right than to revv it right
away, though; the latter is a SMOP.

>/** Close the datasource of type @a datasource. */
>svn_error_t *(*datasource_close)(void *diff_baton,
> svn_diff_datasource_e datasource);
> @@ -124,6 +129,9 @@
>  void *diff_baton,
>  svn_diff_datasource_e 
> datasource);
>  
> +  /** Get the number of identical prefix lines from the @a diff_baton. */
> +  apr_off_t (*get_prefix_lines)(void *diff_baton);
> +
>/** A function for ordering the tokens, resembling 'strcmp' in 
> functionality.
> * @a compare should contain the return value of the comparison:
> * If @a ltoken and @a rtoken are "equal", return 0.  If @a ltoken is
> Index: subversion/libsvn_diff/diff_memory.c
> ===================================================================
> --- subversion/libsvn_diff/diff_memory.c  (revision 1001548)
> +++ subversion/libsvn_diff/diff_memory.c  (working copy)
> @@ -95,7 +95,23 @@
>return SVN_NO_ERROR;
>  }
>  
> +/* Implements svn_diff_fns_t::datasources_open */
> +static svn_error_t *
> +datasources_open(void *baton, apr_off_t *prefix_lines,
> + svn_diff_datasource_e datasource0, 
> + svn_diff_datasource_e datasource1)
> +{
> +  /* Do nothing: everything is already there and initialized to 0 */
> +  return SVN_NO_ERROR;
> +}
>  
> +/* Implements svn_diff_fns_t::datasource_get_prefix_lines */
> +static apr_off_t
> +get_prefix_lines(void *baton)
> +{
> +  return 0;
> +}
> +
>  /* Implements svn_diff_fns_t::datasource_close */
>  static svn_error_t *
>  datasource_close(void *baton, svn_diff_datasource_e datasource)
> @@ -189,8 +205,10 @@
>  static const svn_diff_fns_t svn_diff__mem_vtable =
>  {
>datasource_open,
> +  datasources_open,
>datasource_close,
>datasource_get_next_token,
> +  get_prefix_lines,
>token_compare,
>token_discard,
>token_discard_all
> Index: subversion/libsvn_diff/diff_file.c
> ===================================================================
> --- subversion/libsvn_diff/diff_file.c    (revision 1001548)
> +++ subversion/libsvn_diff/diff_file.c    (working copy)
> @@ -77,6 +77,10 @@
>char *curp[4];
>char *endp[4];
>  
> +  apr_off_t prefix_lines;
> +  int suffix_start_chunk[4];
> +  apr_off_t suffix_offset_in_chunk[4];
> +
>/* List of free tokens that may be reused. */
>svn_diff__file_token_t *tokens;
>  
> @@ -233,7 +237,330 @@
>  curp, length, 0, file_baton->pool);
>  }
>  
> +static svn_error_t *
> +increment_pointer_or_chunk(svn_diff__file_baton_t *file_baton,
> +   char **curp, char **endp, int *chunk_number,
> +   char *buffer, apr_off_t last_chunk_number, int 
> idx)
> +{
> +  apr_off_t length;
>  
> +  if ((*curp) == (*endp) - 1)
> +{
> +  if (*chunk_number == last_chunk_number)
> +(*curp)++; /* *curp == *endp with last chunk signals end of file */
> +  else
> +{
> +  (*chunk_number)++;
> +  length = *chunk_number == last_chunk_number ?
> +offset_in_chunk(file_baton->size[idx]) : CHUNK_SIZE;
> +  SVN_ERR(read_chunk(file_baton->file[idx],
> + file_baton->path[idx],
> + buffer, length,
> + chunk_to_offset(*chunk_number),
> + file_baton->pool));
> +  *endp = buffer + length;
> +  *curp = buffer;
> +}
> +}
> +  else
> +{
> +  (*curp)++;
> +}
> +
> +  ret

Re: svnrdump tool

2010-09-28 Thread Daniel Shahaf
Ramkumar Ramachandra wrote on Tue, Sep 28, 2010 at 00:44:40 +0530:
> Hi Daniel,
> 
> Daniel Shahaf writes:
> > svnsync allows you to sync a subdir of a repository (i.e.,
> >   svnsync $REPOS/trunk/A/B $MIRROR
> > ), but does it also create /trunk/A/B in the mirror?
> 
> Yes, it does.
> 
> > But for now I still think that svnrdump invocation I quoted above
> > shouldn't have outputted a 'create /trunk' entry in the dumpfile :-).
> > @all, what do you think?
> 
> Again, it works exactly like svnsync- you might like to read the
> tests. Here's a verbose example.
> 

So, I suppose we'll have to leave it the way it is, to parallel
svnsync.  Okay.

> $ cat /test.dump
> Node-path: test.txt
> Node-path: trunk
> Node-path: trunk/A
> Node-path: trunk/A/B
> Node-path: trunk/A/B/E
> Node-path: trunk/A/B/E/alpha
> Node-path: trunk/A/B/E/beta
> Node-path: trunk/A/B/F
> Node-path: trunk/A/B/lambda
> Node-path: trunk/A/C
> Node-path: trunk/A/D
> Node-path: trunk/A/D/G
> Node-path: trunk/A/D/G/pi
> Node-path: trunk/A/D/G/rho
> Node-path: trunk/A/D/G/tau
> Node-path: trunk/A/D/H
> Node-path: trunk/A/D/H/chi
> Node-path: trunk/A/D/H/omega
> Node-path: trunk/A/D/H/psi
> Node-path: trunk/A/D/gamma
> Node-path: trunk/A/mu
> Node-path: trunk/iota
> Node-path: trunk/A/D/H/psi
> Node-path: trunk/A
> Node-path: trunk/iota
> Node-path: trunk/B
> Node-path: trunk/A
> Node-path: trunk/A
> Node-path: trunk/B/D
> Node-path: trunk

Why does "Node-path: trunk" appear twice?

> $ # svnadmin create /test-repo, /mirror and enable pre-revprop-change
> $ svnadmin load /test-repo < /test.dump
> $ svnsync init file:///mirror file:///test-repo/trunk/A/B
> $ svnsync sync file:///mirror
> $ svnadmin dump /mirror | grep Node-path:
> Node-path: trunk
> Node-path: trunk/A
> Node-path: trunk/A/B
> Node-path: trunk/A/B/E
> Node-path: trunk/A/B/E/alpha
> Node-path: trunk/A/B/E/beta
> Node-path: trunk/A/B/F
> Node-path: trunk/A/B/lambda
> Node-path: trunk/A
> Node-path: trunk/A
> Node-path: trunk/A
> Node-path: trunk/A/G
> Node-path: trunk/A/G/pi
> Node-path: trunk/A/G/rho
> Node-path: trunk/A/G/tau
> Node-path: trunk/A/H
> Node-path: trunk/A/H/chi
> Node-path: trunk/A/H/omega
> Node-path: trunk/A/H/psi
> Node-path: trunk/A/gamma
> Node-path: trunk
> $ svnrdump dump file:///test-repo/trunk/A/B
> Node-path: trunk
> Node-path: trunk/A
> Node-path: trunk/A/B
> Node-path: trunk/A/B/E
> Node-path: trunk/A/B/E/alpha
> Node-path: trunk/A/B/E/beta
> Node-path: trunk/A/B/F
> Node-path: trunk/A/B/lambda
> Node-path: trunk/A
> Node-path: trunk/A
> Node-path: trunk/A
> Node-path: trunk/A/G
> Node-path: trunk/A/G/pi
> Node-path: trunk/A/G/rho
> Node-path: trunk/A/G/tau
> Node-path: trunk/A/H
> Node-path: trunk/A/H/chi
> Node-path: trunk/A/H/omega
> Node-path: trunk/A/H/psi
> Node-path: trunk/A/gamma
> Node-path: trunk
> 
> Hope that helps.
> 
> -- Ram

Thanks for the example,

Daniel


Re: place of svnrdump

2010-09-28 Thread Daniel Shahaf
Ramkumar Ramachandra wrote on Tue, Sep 28, 2010 at 00:27:38 +0530:
> Hi Neels,
> 
> Neels J Hofmeyr writes:
> > > I just benchmarked it recently and found that it dumps 10000 revisions
> > > of the ASF repository in 106 seconds: that's about 94 revisions per
> > > second.
> > 
> > Wow! That's magnitudes better than the 5 to 10 revisions per second I'm used
> > to (I think that's using svnsync).
> 
> Yep :)
> 
> > While we're at it... svnsync's slowness is particularly painful when doing
> > 'svnsync copy-revprops'. With revprop changes enabled, any revprops may be
> > changed at any time. So to maintain an up-to-date mirror, one would like to
> > copy *all* revprops at the very least once per day. With a repos of average
> > corporate size, though, that can take the whole night and soon longer than
> > the developers need to go home and come back to work next morning (to find
> > the mirror lagging). So one could copy only the youngest 1000 revprops each
> > night and do a complete run every weekend. Or script a revprop-change hook
> > that propagates revprop change signals to mirrors. :(
> 
> Wow. This is quite a serious problem. I'm a very new developer, and I
> don't really use Subversion. You should probably let the other
> Subversion developers know about this on a new thread?
> 
> @Daniel, @Stefan: Thoughts on this?
> 

Use the commits@ list and run copy-revprops only on revisions that
actually had been revprop-edited?

> > svnrdump won't help in that compartment, would it?
> 
> That would be a feature request (although I'm not sure svnrdump will
> ever be extended to handle that),

How could svnrdump help here?  What we might need is an RA call that has
the server provide the N last revisions to have undergone revprop edits...

> because svnrdump is still very
> young - it just dumps/loads dumpfiles from remote repositories quickly
> at the moment. I've decided to feature freeze until I fix the perf
> issues for the upcoming release - I'll keep this in mind though.
> 
> -- Ram


Re: [WIP] Gracefully handle svn_wc_prop_set4() errors in svn patch

2010-09-28 Thread Daniel Shahaf
Daniel Näslund wrote on Mon, Sep 27, 2010 at 22:09:26 +0200:
> * What is wrong with the way I handle the error? I hit err_abort() when
>   the program terminates. (I'm expecting the answer to hurt a bit - this
>   is surely something that I should have understood by now). I thought
>   that since the error is allocated on the heap I could just store the
>   pointer to it and free it later, e.g. call svn_error_clear().

err_abort() is called when an error object hadn't been svn_error_clear()'d.
(The error creation installed err_abort() as a pool cleanup callback,
and clearing the error unregisters the callback.)

So, yeah, you can do whatever you want with the error (they get
allocated in a global pool) as long as you svn_error_clear() them
eventually.

> 
> Index: subversion/svn/notify.c
> ===================================================================
> --- subversion/svn/notify.c   (revision 1001829)
> +++ subversion/svn/notify.c   (arbetskopia)
> @@ -464,6 +464,16 @@ notify(void *baton, const svn_wc_notify_t *n, apr_
>  goto print_error;
>break;
>  
> +case svn_wc_notify_failed_prop_patch:
> +  nb->received_some_change = TRUE;
> +  err = svn_cmdline_printf(pool,
> +   _("property '%s' rejected from '%s'.\n"),
> +   n->prop_name, path_local);
> +  svn_handle_warning2(stderr, n->err, "svn: ");
> +  if (err)
> +goto print_error;
> +  break;
> +

That's fine, print_error: clears the error.

> Index: subversion/libsvn_client/patch.c
> ===================================================================
> --- subversion/libsvn_client/patch.c  (revision 1001829)
> +++ subversion/libsvn_client/patch.c  (arbetskopia)
> @@ -130,6 +130,12 @@ typedef struct prop_patch_target_t {
>  
>/* ### Here we'll add flags telling if the prop was added, deleted,
> * ### had_rejects, had_local_mods prior to patching and so on. */
> +
> +  /* TRUE if the property could not be set on the path. */
> +  svn_boolean_t was_rejected;
> +
> +  /* Set if was_rejected is TRUE. */
> +  svn_error_t *err;
>  } prop_patch_target_t;
>  
>  typedef struct patch_target_t {
> @@ -1573,6 +1579,22 @@ send_patch_notification(const patch_target_t *targ
>
>prop_target = svn__apr_hash_index_val(hash_index);
>  
> +  if (prop_target->was_rejected)
> +{
> +  svn_wc_notify_t *notify;
> +  svn_wc_notify_action_t action = 
> svn_wc_notify_failed_prop_patch;
> +
> +  notify = svn_wc_create_notify(target->local_abspath 
> +? target->local_abspath
> +: target->local_relpath,
> +action, pool);
> +  notify->prop_name = prop_target->name;
> +  notify->err = prop_target->err;
> +
> +  (*ctx->notify_func2)(ctx->notify_baton2, notify, pool);
> +  svn_error_clear(prop_target->err);
> +}
> +
>for (i = 0; i < prop_target->content_info->hunks->nelts; i++)
>  {
>const hunk_info_t *hi;
> @@ -2189,6 +2211,7 @@ install_patched_prop_targets(patch_target_t *targe
>svn_stringbuf_t *prop_content;
>const char *eol_str;
>svn_boolean_t eof;
> +  svn_error_t *err;
>  
>svn_pool_clear(iterpool);
>  
> @@ -2260,14 +2283,23 @@ install_patched_prop_targets(patch_target_t *targe
> * ### stsp: I'd say reject the property hunk.
> * ###   We should verify all modified prop hunk texts using
> * ###   svn_wc_canonicalize_svn_prop() before starting the
> -   * ###   patching process, to reject them as early as possible. */
> -  SVN_ERR(svn_wc_prop_set4(ctx->wc_ctx, target->local_abspath,
> -   prop_target->name,
> -   svn_string_create_from_buf(prop_content, 
> -  iterpool),
> -   TRUE /* skip_checks */,
> -   NULL, NULL,
> -   iterpool));
> +   * ###   patching process, to reject them as early as possible. 
> +   *
> +   * ### dannas: Unfortunately we need the prop_content to run
> +   * ### svn_wc_canonicalize_svn_prop() and we don't have that 
> +   * ### until we've applied our text changes. */
> +  err = svn_wc_prop_set4(ctx->wc_ctx, target->local_abspath,
> + prop_target->name,
> + svn_string_create_from_buf(prop_content, 
> +iterpool),
> + TRUE /* skip_checks */,
> + NULL, NULL,
> + iterpool);
> +  if (err)
> +{
> +  prop_target->was_rejected = TRUE;
> +  

Re: migration to NODES

2010-09-28 Thread Daniel Shahaf
I'd rather have O(1) hand-crafted wc's than all 950 wc's the test suite
generates, of which likely 99% are 'normal'... (especially as it's the
end-of-run state, after any conflicts have been resolved etc)

Greg Stein wrote on Tue, Sep 28, 2010 at 05:47:36 -0400:
> I took the 1.6.x branch and ran the test suite. It produced about 950
> working copies, and the (compressed) tarball sits at about 850k. I'm
> thinking about recording 'svn status' for each working copy, and
> checking the tarball in as test data. We can then run an 'svn upgrade'
> on each working copy, then 'svn status', and verify that the two
> status results match. (caveat minor improvements in state tracking and
> status reporting)
> 
> But... running upgrade on about 950 working copies and checking their
> status isn't cheap. The tarball sizes don't bother me too much... I'm
> more concerned about test suite runtime.
> 
> Anybody else? Should this be "normal test run"? Or should we set up an
> "extended" set of tests and drop this into that batch?
> 
> Or not even go with this approach?


Re: add NODES.prior_deleted boolean column

2010-09-28 Thread Philip Martin
Julian Foad  writes:

>> I believe the answer is "often". A simple 'svn status' will need to
>> distinguish between 'A' and 'R', and that is done by checking
>> prior_deleted.
>> 
>> And no... this amount of denormalization will not hurt us.
>
> OK.  I know that "svn status" speed is a big deal.

I'm not sure prior_deleted is an optimisation.  When I last looked at
the performance of SQLite I concluded that status would be too slow if
it ran per-node queries, it has to run per-dir queries.  So

  SELECT ... FROM BASE_NODE WHERE wc_id = ?1 AND local_relpath = ?2

is too slow, we need to run

  SELECT ... FROM BASE_NODE WHERE wc_id = ?1 AND parent_relpath = ?2

iterate over the rows caching the data and then generate the status
for each child.

NODES and op_depth adds a complication.  Getting the equivalent of
BASE_NODE is easy, we add "AND op_depth = 0", but status wants the row
with the highest op_depth for each node, and that's a different number
for each row in a per-dir query.  We can do it like this:

  SELECT ... FROM NODES n WHERE wc_id = ?1 AND parent_relpath = ?2
  AND op_depth = (SELECT MAX(op_depth) FROM NODES m
   WHERE m.wc_id = n.wc_id
 AND m.local_relpath = n.local_relpath)

but that nested SELECT is expensive.  It's not as bad as doing
per-node queries but it is still too slow, the database query alone is
slower than 1.6 status.  I don't think this is something that can be
fixed using an index because on my test data the runtime generally
goes up by orders of magnitude when the indexing is wrong.

I can get round this by querying for all op_depth and using the cache
to select the highest.  The cache is a hash, keyed on local_relpath,
that stores the data associated with the highest op_depth seen and it
simply discards/overwrites lower op_depth.  I've prototyped this and
it's as fast as the per-dir BASE_NODE query on my test data.  This is
not surprising since my test data consists of mostly single op_depth
with a few double op_depth.  We have to have good performance on such
working copies because they are the most common, I think it will be
unusual to have a large working copy where lots of nodes have a large
number of op_depth.
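
The caching trick described above fits in a few lines (my own
illustration, not Philip's prototype; the schema is cut down from his
generator script below):

```python
# Minimal sketch of the status cache: one per-dir query over all
# op_depths, with a hash keyed on local_relpath that keeps only the
# data for the highest op_depth seen, overwriting lower ones.
import sqlite3

c = sqlite3.connect(":memory:")
c.execute("""create table nodes (
               wc_id integer, local_relpath text, op_depth integer,
               parent_relpath text, presence text,
               primary key (wc_id, local_relpath, op_depth))""")
c.executemany("insert into nodes values (?, ?, ?, ?, ?)",
              [(1, "A/f", 0, "A", "not-present"),  # deleted BASE row
               (1, "A/f", 2, "A", "normal"),       # replacement
               (1, "A/g", 0, "A", "normal")])      # plain BASE node

cache = {}
for relpath, op_depth, presence in c.execute(
        """select local_relpath, op_depth, presence from nodes
           where wc_id = ?1 and parent_relpath = ?2""", (1, "A")):
    prev = cache.get(relpath)
    if prev is None or op_depth > prev[0]:
        cache[relpath] = (op_depth, presence)  # discard lower op_depth

assert cache["A/f"] == (2, "normal")  # highest op_depth wins
assert cache["A/g"] == (0, "normal")
```

Keeping the two highest op_depths per path instead of one is an equally
small change to the same loop, which is the point about prior_deleted
below.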

Now to prior_deleted.  The fundamental status query looks like

  SELECT ... FROM NODES WHERE wc_id = ?1 AND parent_relpath = ?2

and all op_depth are seen.  It's quite simple to cache the two highest
op_depth so prior_deleted only provides information that is already
available. It's not an optimisation for status, if anything it will
make the database bigger and slower.

Below are the Python script that generates a big NODES database, and a
C program that prototypes the status queries:

#!/usr/bin/python

import os, sqlite3

try: os.remove('wcx.db')
except: pass

c = sqlite3.connect('wcx.db')
c.execute("""create table repository (
   id integer primary key autoincrement,
   root text unique not null,
   uuid text not null)""")
c.execute("""create index i_uuid on repository (uuid)""")
c.execute("""create index i_root on repository (root)""")
c.execute("""create table wcroot (
   id integer primary key autoincrement,
   local_abspath text unique)""")
c.execute("""create unique index i_local_abspath on wcroot (local_abspath)""")
c.execute("""create table nodes (
   wc_id integer not null references wcroot (id),
   local_relpath text not null,
   op_depth integer not null,
   parent_relpath text,
   repos_id integer references repository (id),
   repos_path text,
   revision integer,
   presence text not null,
   depth text,
   moved_here integer,
   moved_to text,
   kind text not null,
   changed_revision integer,
   changed_date integer,
   changed_author text,
   checksum text,
   properties blob,
   translated_size integer,
   last_mod_time integer,
   dav_cache blob,
   symlink_target text,
   file_external text,
   primary key(wc_id, local_relpath, op_depth))""")
c.execute("""create index i_parent on nodes (wc_id,
                                             parent_relpath,
                                             local_relpath)""")
c.execute("""insert into repository (root, uuid) values (
   "http://example.com/repo",
   "f738be9e-409d-481f-b246-1fb6a969aba2")""")
c.execute("""insert into wcroot(local_abspath) values ("/wc")""")

c.execute("""insert into nodes (
   wc_id,
   local_relpath,
   op_depth,
   repos_id,
   repos_path,
   parent_relpath,
   presence,
   kind)
 values (
   1,
   "",
   0,
   1,
  

Re: NODES table - conditional compilation for op_depth

2010-09-28 Thread Philip Martin
Greg Stein  writes:

> Seems like this will make things even more complicated. I'd be in
> favor of *not* switching to NODES unless/until the op_depth is done
> properly. If you switch early, then you're going to require another
> format bump to reset all the op_depth fields to their proper values. I
> don't think an early-switch to NODES before proper op_depth
> computation will buy us anything.

There is an overhead to maintaining both sets of code.  It's possible
to build the wrong version, run regression tests with the wrong
version, or break the other version.

-- 
Philip


Re: svn propchange: r1001678 - svn:log

2010-09-28 Thread Philip Martin
Julian Foad  writes:

> Greg Stein wrote:
>> Why are tests failing because of the different values?
>
> I think it's not because the op_depth value is different, but because
> creating a row with a different op_depth now in some cases means
> creating an *additional* row, and the rest of the code is not yet
> prepared for finding two or more WORKING rows for the same node.
> Something like that.

A typical failure was a delete followed by an add/replace.  The delete
creates a row with op_depth=2 (a temporary value) and the add creates
a row with op_depth=1 (a correct value).  When we select the row with
the highest op_depth we don't see the add.

We had a brief discussion on IRC about this.  The add does an INSERT
OR REPLACE and the op_depth difference makes it an insert rather than
a replace.  Should add determine whether it should insert or replace
and make the appropriate query?  The temporary op_depth values lead to
a corrupt database if the op_depth values are assumed to be correct.
Should add detect the corruption?
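To make the INSERT OR REPLACE behaviour concrete, here is a minimal
sqlite demonstration (simplified schema, made-up data): because op_depth
is part of the primary key, an add issued with a different op_depth than
the delete's row is an insert rather than a replace, and both rows end
up coexisting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (wc_id INTEGER, local_relpath TEXT, "
             "op_depth INTEGER, presence TEXT, "
             "PRIMARY KEY (wc_id, local_relpath, op_depth))")

# The delete creates a row with the temporary op_depth=2.
conn.execute("INSERT OR REPLACE INTO nodes VALUES (1, 'A/f', 2, 'base-deleted')")
# The add/replace uses the correct op_depth=1: the key differs, so this
# is an insert, not a replace -- both rows now coexist.
conn.execute("INSERT OR REPLACE INTO nodes VALUES (1, 'A/f', 1, 'normal')")

rows = conn.execute("SELECT op_depth, presence FROM nodes "
                    "WHERE local_relpath = 'A/f' "
                    "ORDER BY op_depth DESC").fetchall()
# Selecting the highest op_depth sees the stale delete, not the add.
print(rows[0])   # (2, 'base-deleted')
```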

-- 
Philip


Re: svn propchange: r1001678 - svn:log

2010-09-28 Thread Julian Foad
Greg Stein wrote:
> Why are tests failing because of the different values?

I think it's not because the op_depth value is different, but because
creating a row with a different op_depth now in some cases means
creating an *additional* row, and the rest of the code is not yet
prepared for finding two or more WORKING rows for the same node.
Something like that.

- Julian


> Philip stated that we weren't really examining the values. That "zero
> vs non-zero" was the only real test so far. So why does inserting
> other values break things? That seems to be a more interesting problem
> to solve than simply throwing up our collective hands and #ifdef'ing
> the values out (along with that additional complexity!)
> 
> Cheers,
> -g
> 
> On Mon, Sep 27, 2010 at 07:58,   wrote:
> > Author: julianfoad
> > Revision: 1001678
> > Modified property: svn:log
> >
> > Modified: svn:log at Mon Sep 27 11:58:18 2010
> > --
> > --- svn:log (original)
> > +++ svn:log Mon Sep 27 11:58:18 2010
> > @@ -1,2 +1,12 @@
> > +Guard some ongoing op_depth work inside a new conditional: 
> > SVN_WC__OP_DEPTH.
> > +Guards the change made in r1000557, but not yet the similar changes made in
> > +r1000955.  These op_depth changes are causing test failures, and the fixes
> > +are expected to be extensive.
> > +
> > +Also document an existing function.
> > +
> >  * subversion/libsvn_wc/wc_db.c
> >   (construct_like_arg): Add a doc string. Fix naming of the pool arg.
> > +  (svn_wc__db_op_add_directory, svn_wc__db_op_add_file,
> > +   svn_wc__db_op_add_symlink): Change op_depth back to a hard-coded '2'
> > +unless SVN_WC__OP_DEPTH is defined.
> >
> >




Re: svn propchange: r1001678 - svn:log

2010-09-28 Thread Greg Stein
Why are tests failing because of the different values?

Philip stated that we weren't really examining the values. That "zero
vs non-zero" was the only real test so far. So why does inserting
other values break things? That seems to be a more interesting problem
to solve than simply throwing up our collective hands and #ifdef'ing
the values out (along with that additional complexity!)

Cheers,
-g

On Mon, Sep 27, 2010 at 07:58,   wrote:
> Author: julianfoad
> Revision: 1001678
> Modified property: svn:log
>
> Modified: svn:log at Mon Sep 27 11:58:18 2010
> --
> --- svn:log (original)
> +++ svn:log Mon Sep 27 11:58:18 2010
> @@ -1,2 +1,12 @@
> +Guard some ongoing op_depth work inside a new conditional: SVN_WC__OP_DEPTH.
> +Guards the change made in r1000557, but not yet the similar changes made in
> +r1000955.  These op_depth changes are causing test failures, and the fixes
> +are expected to be extensive.
> +
> +Also document an existing function.
> +
>  * subversion/libsvn_wc/wc_db.c
>   (construct_like_arg): Add a doc string. Fix naming of the pool arg.
> +  (svn_wc__db_op_add_directory, svn_wc__db_op_add_file,
> +   svn_wc__db_op_add_symlink): Change op_depth back to a hard-coded '2'
> +    unless SVN_WC__OP_DEPTH is defined.
>
>


Re: NODES table - conditional compilation for op_depth

2010-09-28 Thread Greg Stein
Seems like this will make things even more complicated. I'd be in
favor of *not* switching to NODES unless/until the op_depth is done
properly. If you switch early, then you're going to require another
format bump to reset all the op_depth fields to their proper values. I
don't think an early-switch to NODES before proper op_depth
computation will buy us anything.

Cheers,
-g

On Mon, Sep 27, 2010 at 10:00, Julian Foad  wrote:
> Philip and I have started implementing op_depth in the NODES table, but
> we soon found there is more to do than simply calculating a value here
> and there.
>
> ("Implementing op_depth" means enabling multiply nested
> copies/replacements/adds/deletes etc., with the ability to revert at
> each level.)
>
> In the meantime, some tests were breaking, so we have made the full
> op_depth support conditional on SVN_WC__OP_DEPTH.
>
> Why?  The interim 0/1/2 op_depth values have been working OK in the
> limited context of the amount of nesting that BASE+WORKING supports.  We
> might want to make a transition from BASE_NODE/WORKING_NODE to NODES
> first, before enabling the full op_depth support.  That is probably the
> main reason why this further conditional is useful.  The alternative
> would be to complete op_depth support before switching the default build
> over to NODES.
>
> Any concerns about working within SVN_WC__OP_DEPTH?
>
> - Julian
>
>
>


Re: migration to NODES

2010-09-28 Thread Greg Stein
On Mon, Sep 27, 2010 at 13:25, Julian Foad  wrote:
> On Fri, 2010-09-24, Greg Stein wrote:
>> On Fri, Sep 24, 2010 at 11:16, Julian Foad  wrote:
>> >...
>> > I think we should produce a test framework that can give us a WC
>> > containing all the different possible WC states.  Then we can write
>> > tests against this framework, some tests that test specific state, and
>> > other tests that apply the same operations to all the states and check
>> > that it works in all the states that it should.
>>
>> This requires a manual process of thinking of all states and all
>> permutations. I don't trust it.
>
> This kind of testing is more about checking that the design is
> implemented and that the code paths are exercised ...
>
>> If we could somehow capture working copies from during normal test
>> runs, *then* I think we'd have "everything". We can easily get the
>> terminal state for each test, which is a great first step. It would be
>> great if we could also get the intermediate steps.
>
> ... while this kind is more about checking for regression or unwanted
> changes of behaviour.  The two approaches are complementary.

Fair enough.

I took the 1.6.x branch and ran the test suite. It produced about 950
working copies, and the (compressed) tarball sits at about 850k. I'm
thinking about recording 'svn status' for each working copy, and
checking the tarball in as test data. We can then run an 'svn upgrade'
on each working copy, then 'svn status', and verify that the two
status results match. (caveat minor improvements in state tracking and
status reporting)
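A sketch of what such a harness might look like; the svn invocations and
the status-normalization rules are assumptions here, not an existing
test-suite API:

```python
import subprocess

def normalized_status(output):
    # Compare runs ignoring trailing whitespace and line order.
    return sorted(line.rstrip() for line in output.splitlines() if line.strip())

def statuses_match(before, after):
    return normalized_status(before) == normalized_status(after)

def check_upgraded_wc(wc_path, recorded_status):
    # Hypothetical driver for one captured 1.6 working copy: upgrade it,
    # re-run status, and compare against the recorded pre-upgrade status.
    subprocess.check_call(["svn", "upgrade", wc_path])
    out = subprocess.check_output(["svn", "status", "-v", wc_path],
                                  universal_newlines=True)
    if not statuses_match(recorded_status, out):
        raise AssertionError("status mismatch after upgrade: " + wc_path)
```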

But... running upgrade on about 950 working copies and checking their
status isn't cheap. The tarball sizes don't bother me too much... I'm
more concerned about test suite runtime.

Anybody else? Should this be "normal test run"? Or should we set up an
"extended" set of tests and drop this into that batch?

Or not even go with this approach?

>...

Cheers,
-g


RE: place of svnrdump

2010-09-28 Thread Bolstridge, Andrew


-Original Message-
From: Ramkumar Ramachandra [mailto:artag...@gmail.com] 
Sent: 27 September 2010 19:58
To: Neels J Hofmeyr
Cc: dev@subversion.apache.org; Daniel Shahaf; Stefan Sperling
Subject: Re: place of svnrdump

Neels J Hofmeyr writes:

> While we're at it... svnsync's slowness is particularly painful when 
> doing 'svnsync copy-revprops'. With revprop changes enabled, any 
> revprops may be changed at any time. So to maintain an up-to-date 
> mirror, one would like to copy *all* revprops at the very least once 
> per day. With a repos of average corporate size, though, that can take

> the whole night and soon longer than the developers need to go home 
> and come back to work next morning (to find the mirror lagging). So 
> one could copy only the youngest 1000 revprops each night and do a 
> complete run every weekend. Or script a revprop-change hook that 
> propagates revprop change signals to mirrors. :(

Of course, you could put a post-revprop-change hook in place to note
which revprop was changed, and then run a script that only syncs those
revprops.

I wouldn't recommend putting the 'svnsync copy-revprops' command in the
post-revprop-change hook itself: if someone commits a revision and then
immediately updates a revprop, the sync will fail (as the revision may
not have been synced to the mirror yet).

If anything, changing svnsync to ignore a failed copy-revprops command
when the target revision does not yet exist on the mirror would fix this
problem, and the copy-revprops call could then be put in the hook
without worry.
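A sketch of that division of labour, assuming the standard
post-revprop-change argument order (repos-path, rev, user, propname,
action); the log path and helper names here are made up:

```python
import subprocess

LOG = "/var/svn/revprop-changes.log"   # hypothetical log location

def record(argv, log_path=LOG):
    # post-revprop-change hook body: argv is
    # [repos_path, rev, user, propname, action] as Subversion passes it.
    # Just note which revprop changed; do no syncing here.
    rev, propname = argv[1], argv[3]
    with open(log_path, "a") as f:
        f.write("%s %s\n" % (rev, propname))

def pending_revisions(entries, synced_head):
    # From recorded "REV PROPNAME" entries, pick only revisions the
    # mirror already has; newer ones are deferred to the next run
    # instead of making the sync fail.
    revs = {int(e.split()[0]) for e in entries if e.strip()}
    return sorted(r for r in revs if r <= synced_head)

def sync_revprops(mirror_url, entries, synced_head):
    # A nightly cron job would call this with the mirror's synced head.
    for rev in pending_revisions(entries, synced_head):
        subprocess.check_call(["svnsync", "copy-revprops", mirror_url,
                               "-r", str(rev)])
```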