[svnbench] Revision: 1135007 compiled Jun 13 2011, 00:21:18

2011-06-12 Thread neels
/home/neels/svnbench/20110613-002336
Started at Mon Jun 13 00:23:36 UTC 2011


-

Results for dir levels: 5  spread: 5
Timings for 5x5_1.6
N   min max avgoperation  (unit is seconds)
6  594.60  608.46  601.67  TOTAL RUN
  8870.01   20.540.13  add
   120.02   19.917.15  checkout
   489.69   81.98   31.70  commit
60.450.670.52  copy
6   21.33   30.67   25.77  delete
   128.86   27.17   16.48  merge
  9600.010.450.01  mkdir
 64070.016.600.01  prop mod
  2610.010.430.01  propdel
125500.011.250.01  proplist
253730.017.890.01  propset
   120.010.010.01  resolve
   120.241.830.94  resolved
   840.131.760.40  status
6   16.63   26.98   21.52  switch
   900.12   31.08   10.32  update
---
Timings for 5x5_trunk
N   min max avgoperation  (unit is seconds)
6  339.54  346.05  342.57  TOTAL RUN
  8870.011.730.02  add
   120.026.433.09  checkout
   481.46   81.40   24.36  commit
60.200.250.22  copy
60.760.780.77  delete
   127.28   18.53   12.14  merge
  9600.010.430.01  mkdir
 64070.010.410.01  prop mod
  2610.010.010.01  propdel
125500.010.010.01  proplist
253730.010.270.01  propset
   120.010.010.01  resolve
   120.470.640.56  resolved
   840.190.860.50  status
63.563.613.59  switch
   900.166.811.98  update
---
COMPARE 5x5_1.6 to 5x5_trunk
  TOTAL RUN times: 601.7 seconds avg for 5x5_1.6
   342.6 seconds avg for 5x5_trunk
  min  max  avg operation
   0.57|-255.0520.57|-262.4140.57|-259.098  TOTAL RUN
   1.58| +0.003 0.08|-18.811 0.16| -0.112   add
   0.94| -0.001 0.32|-13.481 0.43| -4.059   checkout
   0.15| -8.234 0.99| -0.583 0.77| -7.340   commit
   0.45| -0.247 0.37| -0.419 0.42| -0.303   copy
   0.04|-20.576 0.03|-29.888 0.03|-24.997   delete
   0.82| -1.576 0.68| -8.635 0.74| -4.341   merge
   1.49| +0.003 0.95| -0.023 1.57| +0.004   mkdir
   1.47| +0.003 0.06| -6.195 0.86| -0.002   prop mod
   1.39| +0.002 0.03| -0.420 0.77| -0.003   propdel
   1.47| +0.002 0.01| -1.239 1.32| +0.002   proplist
   1.49| +0.003 0.03| -7.619 0.92| -0.001   propset
   1.34| +0.002 1.33| +0.002 1.33| +0.002   resolve
   1.98| +0.234 0.35| -1.192 0.59| -0.383   resolved
   1.46| +0.060 0.49| -0.893 1.24| +0.096   status
   0.21|-13.072 0.13|-23.370 0.17|-17.929   switch
   1.40| +0.046 0.22|-24.271 0.19| -8.340   update
("1.23|+0.45"  means factor=1.23, difference in seconds = 0.45
factor < 1 or difference < 0 means '5x5_trunk' is faster than '5x5_1.6')

-

Results for dir levels: 100  spread: 1
Timings for 100x1_1.6
N   min max avgoperation  (unit is seconds)
6   69.99   73.65   72.11  TOTAL RUN
  1480.010.880.04  add
   120.014.912.35  checkout
   481.334.402.09  commit
60.210.880.33  copy
60.981.141.04  delete
   120.700.960.80  merge
  1070.010.060.01  mkdir
  3610.010.010.01  prop mod
   170.010.020.01  propdel
  6870.010.010.01  proplist
 11130.010.150.01  propset
   120.010.010.01  resolve
   120.060.350.11  resolved
   840.040.270.08  status
63.904.774.41  switch
   900.055.852.53  update
---
Timings for 100x1_trunk
N   min max avgoperation  (unit is seconds)
6   24.49   25.60   24.93  TOTAL RUN
  1480.010.140.02  add
   120.020.360.19  checkout
   480.294.241.60  commit
60.170.210.19  copy
60.110.120.11  delete
   120.581.280.79  merge
  1070.010.010.01  mkdir
  3610.010.010.01  prop mod
   170.010.010.01  propdel
  6870.010.010.01  proplist
 11130.010.020.01  propset
   120.010.010.01  resolve
   120.060.090.07  resolved
   840.040.110.08  status
60.310.360.34  switch
   900.040.510.23  update
---
COMPARE 100x1_1.6 to 100x1_trunk
  TOTAL RUN times:  72.1 seconds avg for 100x1_1.6
24.9 seconds avg for 100x1_trunk
  min  max  avg operation
   0.35|-45.497 0.35|-48.054 0.35|-47.179   TOTAL RUN
   1.62| +0.003 0.15| -0.744 0.39| -0.025   add
   1.40| +0.005 0.07| -4.548 0.08| -2.156   checkout
   0.22| -1.039 0.96| -0.1

Re: Improvements to diff3 (merge) performance

2011-06-12 Thread Greg Hudson
My executive summary of your post is that you want diff3 to try to merge
related, but not identical, changes occuring between a pair of sync
points.  I'm wary about this for two reasons.

First, the benefit appears to arise chiefly for what Bram Cohen calls
"implicit cherrypicking" use cases--that is, cases where a change is
merged and then merged again together with other changes.  After
extensive research, Bram eventually concluded that trying to support
this is a bad idea (http://bramcohen.livejournal.com/52148.html).  I
tend to think that a merge algorithm should not increase its false
negative rate for the benefit of implicit cherrypicking.

Second, I can see a lot of potential for incorrect merges based on
superficial similarities.  For instance, if you and I both add code
between the same two sync points, and the last line of my change is the
same as the first line of yours (perhaps a blank line), that could be
enough to unambiguously determine an ordering.  Perhaps both of our code
additions do the same thing in different ways, and now it gets done
twice (which is almost never desirable).  Certainly, the existing
algorithm can produce incorrect merges too, but my intuition says that
the practical likelihood would become much higher.

Of course, I could be off base.  All merge algorithms are heuristic, and
it would take a lot of research to really compare the efficacy of one
against another.  You need a cost function to determine the tradeoff
between the false negative rate and the false positive rate, and you
also need to measure how any given algorithmic change affects the false
negative and false positive rates in practice.  Both of these seem
really hard.

It would definitely affect my opinion if I learned that the three-way
merge algorithms in other popular version control systems worked this
way, or if I learned that Subversion was more restrictive than, say, gnu
diff3.



Re: Improvements to diff3 (merge) performance

2011-06-12 Thread Morten Kloster
Ugh - using "---" to separate sections of the post was clearly a
really bad idea; sorry about that. :-(

Here it is again, in a hopefully more readable format:

Hi,

I'm not very happy with the current performance of the diff3
algorithm; I think it returns too many and too big conflicts. I have
some ideas on how to improve it, but I'd like to hear other people's
opinions first. Note that the primary purpose is to discuss how we
want merge to behave, not whether it is easy to implement.

I'll use the terms Base (original), Mine (modified) and Theirs (latest)
to refer to the three source files. In the merged file, C indicates a
conflict.

===

First, here's a quick run-though of how diff3 works now: After
running LCS on Base vs. Mine and Base vs. Theirs, it looks for
"sync points" - points where both Mine and Theirs match the Base.
Note that there is no need for an actual overlap: If Mine matches
Base up to a point and Theirs matches Base from that point
onwards, that is a valid sync point. If no two of the three are
identical between two neighboring sync points, that creates a
conflict.

Adjacent changes are accepted:

abcd (Base)
a1cd (Mine)
ab2d (Theirs)
a12d (Merged)

and

abcd (Base)
acd (Mine)
abc (Theirs)
ad (Merged)

while strictly larger changes are not:

abcd (Base)
a1cd (Mine)
a12d (Theirs)
aCd (Merged)

and

abcd (Base)
acd (Mine)
ad (Theirs)
aCd (Merged)

Note that this means that if changed files A and B are merged to
produce C, then attempting to merge A (or B) with C could well
produce a conflict (using the same Base the whole time), whereas
in my opinion, that ought to work fine, yielding C again.

===

My proposed main change is to let diff3 also sync on points where
Mine and Theirs have identical changes relative to Base, if either file
(Mine or Theirs) can also be synced to Base at the same point. This
can happen if either Mine or Theirs is unchanged from Base either
before or after that point, or if either of the changes from Base was a
pure insertion (as there is then no uncertainty as to the position in
the Base file). This would let diff3 incorporate all changes relative to
Base from both Mine and Theirs as long as there is no true conflict.

Some consequences:

Given two changed files A and B that only have deletions relative to
Base, the merged file would only contain the lines that are present
in both A and B.

Different insertions would be merged, not marked as conflict,
unless they truly conflict. There are only true conflicts if the diff2
between the two insertions have actual changes, not just separated
deletions and insertions (or equivalently, if the LCS has gaps in
both files between neighboring segments):

ag (Base)
abcefg (Mine)
acdeg (Theirs)
abcdefg (Merged)

Here, the order of the inserted lines is unambiguously determined;
partially from Mine and partially from Theirs.

Incremental and overlapping changes would merge cleanly:

abcde (Base)
a12de (Mine)
ab23e (Theirs)
a123e (Merged)

Of course there are cases where the merged result might not be
what would be desired, but that is also the case with the current
algorithm - keep in mind that any conflict can be "resolved" by
inserting blank lines in Base, Mine and Theirs to serve as sync
points. Here's an unfortunate case with the current algorithm:

abcbcd (Base)
acbcd (Mine)
abcd (Theirs)
acd (Merged)

- even though Theirs < Mine < Base, in strict subsequence sense,
Merged < Theirs.


As a secondary change, I would like to reduce the sizes of
detected conflicts when the change in one file is a pure deletion.
Example:

abcdef (Base)
abc1ef (Mine
af (Theirs)
aCf (Merged)

Here, only the change d -> 1 is a true conflict, and in a graphical
interface such as TortoiseSVN, I would like to see only that single
line conflicted. However, as was pointed out when I suggested
this on our dev IRC, if can be useful to have the context when
editing the conflict in e.g. vim. One suggestion on IRC was to
split the conflict into 3; two not-really-conflicts around the true
conflict. Ideally, the not-really-conflicts would be marked as such
in the diff, which would require an API change.

An alternate way of handling it, however, would be to merely
improve the resolved_diff result for that conflict (which currently
only contains the whole thing as a conflict). This means that a
resolved_diff could also contain diffs of type
svn_diff__type_diff_modified or svn_diff__type_diff_latest, which
is currently not the case. It might be best to add more display
options for the output functions, so an API change might still be
required. Some or all of the output functions would need to be
adjusted to handle these possibilities.

===

If you know of reasons why these changes would be a bad
idea, please provide examples when possible.


Morten Kloster


Improvements to diff3 (merge) performance

2011-06-12 Thread Morten Kloster
Hi,

I'm not very happy with the current performance of the diff3
algorithm; I think it returns too many and too big conflicts. I have
some ideas on how to improve it, but I'd like to hear other people's
opinions first. Note that the primary purpose is to discuss how we
want merge to behave, not whether it is easy to implement.

I'll use the terms Base (original), Mine (modified) and Theirs (latest)
to refer to the three source files. In the merged file, C indicates a
conflict.

---

First, here's a quick run-though of how diff3 works now: After
running LCS on Base vs. Mine and Base vs. Theirs, it looks for
"sync points" - points where both Mine and Theirs match the Base.
Note that there is no need for an actual overlap: If Mine matches
Base up to a point and Theirs matches Base from that point
onwards, that is a valid sync point. If no two of the three are
identical between two neighboring sync points, that creates a
conflict.

Adjacent changes are accepted:

abcd (Base)
a1cd (Mine)
ab2d (Theirs)
a12d (Merged)

and

abcd (Base)
acd (Mine)
abc (Theirs)
ad (Merged)

while strictly larger changes are not:

abcd (Base)
a1cd (Mine)
a12d (Theirs)
aCd (Merged)

and

abcd (Base)
acd (Mine)
ad (Theirs)
aCd (Merged)

Note that this means that if changed files A and B are merged to
produce C, then attempting to merge A (or B) with C could well
produce a conflict (using the same Base the whole time), whereas
in my opinion, that ought to work fine, yielding C again.

---

My proposed main change is to let diff3 also sync on points where
Mine and Theirs have identical changes relative to Base, if either file
(Mine or Theirs) can also be synced to Base at the same point. This
can happen if either Mine or Theirs is unchanged from Base either
before or after that point, or if either of the changes from Base was a
pure insertion (as there is then no uncertainty as to the position in
the Base file). This would let diff3 incorporate all changes relative to
Base from both Mine and Theirs as long as there is no true conflict.

Some consequences:

Given two changed files A and B that only have deletions relative to
Base, the merged file would only contain the lines that are present
in both A and B.

Different insertions would be merged, not marked as conflict,
unless they truly conflict. There are only true conflicts if the diff2
between the two insertions have actual changes, not just separated
deletions and insertions (or equivalently, if the LCS has gaps in
both files between neighboring segments):

ag (Base)
abcefg (Mine)
acdeg (Theirs)
abcdefg (Merged)

Here, the order of the inserted lines is unambiguously determined;
partially from Mine and partially from Theirs.

Incremental and overlapping changes would merge cleanly:

abcde (Base)
a12de (Mine)
ab23e (Theirs)
a123e (Merged)

Of course there are cases where the merged result might not be
what would be desired, but that is also the case with the current
algorithm - keep in mind that any conflict can be "resolved" by
inserting blank lines in Base, Mine and Theirs to serve as sync
points. Here's an unfortunate case with the current algorithm:

abcbcd (Base)
acbcd (Mine)
abcd (Theirs)
acd (Merged)

- even though Theirs < Mine < Base, in strict subsequence sense,
Merged < Theirs.


As a secondary change, I would like to reduce the sizes of
detected conflicts when the change in one file is a pure deletion.
Example:

abcdef (Base)
abc1ef (Mine
af (Theirs)
aCf (Merged)

Here, only the change d -> 1 is a true conflict, and in a graphical
interface such as TortoiseSVN, I would like to see only that single
line conflicted. However, as was pointed out when I suggested
this on our dev IRC, if can be useful to have the context when
editing the conflict in e.g. vim. One suggestion on IRC was to
split the conflict into 3; two not-really-conflicts around the true
conflict. Ideally, the not-really-conflicts would be marked as such
in the diff, which would require an API change.

An alternate way of handling it, however, would be to merely
improve the resolved_diff result for that conflict (which currently
only contains the whole thing as a conflict). This means that a
resolved_diff could also contain diffs of type
svn_diff__type_diff_modified or svn_diff__type_diff_latest, which
is currently not the case. It might be best to add more display
options for the output functions, so an API change might still be
required. Some or all of the output functions would need to be
adjusted to handle these possibilities.

---

If you know of reasons why these changes would be a bad
idea, please provide examples when possible.


Morten Kloster


Re: [PATCH] Issue #3919: fix for spurious property conflict during merge

2011-06-12 Thread Brian Neal
On Sat, Jun 11, 2011 at 11:36 PM, noorul Islam. Kamal Malmiyoda
 wrote:
>
>> 
>
> I think conditions can be combined to form one "if" statement without "else 
> if".
>
> Thanks and Regards
> Noorul

Hello -

I'm not sure I follow you, can you please explain a bit more? I'm not
sure how you could capture all 3 cases without an else-if.

I probably could change it from an "if/else if/else" to an "if/else
if", but I would have to repeat a condition test in both the "if" and
"else if" parts. Because of that, I feel the current patch is more
readable even if it takes a little more vertical space.

Thanks for the feedback and best regards,
BN


Re: svn commit: r1130303 - in /subversion/trunk/subversion: libsvn_repos/authz.c mod_dav_svn/authz.c tests/cmdline/svnsync_tests.py

2011-06-12 Thread Daniel Shahaf
hwri...@apache.org wrote on Wed, Jun 01, 2011 at 21:09:23 -:
> Author: hwright
> Date: Wed Jun  1 21:09:22 2011
> New Revision: 1130303
> 
> URL: http://svn.apache.org/viewvc?rev=1130303&view=rev
> Log:
> Commit the fix for CVE-2011-1921 and CVE-2011-1783.
> 
> (Hopefully somebody with a bit more knowledge than me will fill in the 
> detailed
> log message.)
> 
> * subversion/mod_dav_svn/authz.c
>   (dav_svn__allow_read): Foo.
> 

Ahem ping?

> * subversion/tests/cmdline/svnsync_tests.py
>   (specific_deny_authz): New test.
>   (test_list): Run the new test.
> 
> * subversion/libsvn_repos/authz.c
>   (svn_repos_authz_check_access): Foo.
> 

And here.

> Modified:
> subversion/trunk/subversion/libsvn_repos/authz.c
> subversion/trunk/subversion/mod_dav_svn/authz.c
> subversion/trunk/subversion/tests/cmdline/svnsync_tests.py
> 
> Modified: subversion/trunk/subversion/libsvn_repos/authz.c
> URL: 
> http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_repos/authz.c?rev=1130303&r1=1130302&r2=1130303&view=diff
> ==
> --- subversion/trunk/subversion/libsvn_repos/authz.c (original)
> +++ subversion/trunk/subversion/libsvn_repos/authz.c Wed Jun  1 21:09:22 2011
> @@ -776,6 +776,9 @@ svn_repos_authz_check_access(svn_authz_t
>return SVN_NO_ERROR;
>  }
>  
> +  /* Sanity check. */
> +  SVN_ERR_ASSERT(path[0] == '/');
> +
>/* Determine the granted access for the requested path. */
>path = svn_fspath__canonicalize(path, pool);
>current_path = path;
> 
> Modified: subversion/trunk/subversion/mod_dav_svn/authz.c
> URL: 
> http://svn.apache.org/viewvc/subversion/trunk/subversion/mod_dav_svn/authz.c?rev=1130303&r1=1130302&r2=1130303&view=diff
> ==
> --- subversion/trunk/subversion/mod_dav_svn/authz.c (original)
> +++ subversion/trunk/subversion/mod_dav_svn/authz.c Wed Jun  1 21:09:22 2011
> @@ -54,6 +54,11 @@ dav_svn__allow_read(request_rec *r,
>return TRUE;
>  }
>  
> +  /* Sometimes we get paths that do not start with '/' and 
> + hence below uri concatenation would lead to wrong uris .*/
> +  if (path && path[0] != '/')
> +path = apr_pstrcat(pool, "/", path, NULL);
> +
>/* If bypass is specified and authz has exported the provider.
>   Otherwise, we fall through to the full version.  This should be
>   safer than allowing or disallowing all accesses if there is a
> 
> Modified: subversion/trunk/subversion/tests/cmdline/svnsync_tests.py
> URL: 
> http://svn.apache.org/viewvc/subversion/trunk/subversion/tests/cmdline/svnsync_tests.py?rev=1130303&r1=1130302&r2=1130303&view=diff
> ==
> --- subversion/trunk/subversion/tests/cmdline/svnsync_tests.py (original)
> +++ subversion/trunk/subversion/tests/cmdline/svnsync_tests.py Wed Jun  1 
> 21:09:22 2011
> @@ -870,6 +870,67 @@ def commit_a_copy_of_root(sbox):
>#Testcase for issue 3438.
>run_test(sbox, "repo-with-copy-of-root-dir.dump")
>  
> +
> +@Skip(svntest.main.is_ra_type_file)
> +def specific_deny_authz(sbox):
> +  "verify if specifically denied paths dont sync"
> +
> +  sbox.build("specific-deny-authz")
> +
> +  dest_sbox = sbox.clone_dependent()
> +  build_repos(dest_sbox)
> +
> +  svntest.actions.enable_revprop_changes(dest_sbox.repo_dir)
> +
> +  run_init(dest_sbox.repo_url, sbox.repo_url)
> +
> +  svntest.main.run_svn(None, "cp",
> +   os.path.join(sbox.wc_dir, "A"),
> +   os.path.join(sbox.wc_dir, "A_COPY")
> +   )
> +  svntest.main.run_svn(None, "ci", "-mm", sbox.wc_dir)
> +
> +  write_restrictive_svnserve_conf(sbox.repo_dir)
> +
> +  # For mod_dav_svn's parent path setup we need per-repos permissions in
> +  # the authz file...
> +  if sbox.repo_url.startswith('http'):
> +svntest.main.file_write(sbox.authz_file,
> +"[specific-deny-authz:/]\n"
> +"* = r\n"
> +"\n"
> +"[specific-deny-authz:/A]\n"
> +"* = \n"
> +"\n"
> +"[specific-deny-authz:/A_COPY/B/lambda]\n"
> +"* = \n"
> +"\n"
> +"[specific-deny-authz-1:/]\n"
> +"* = rw\n")
> +  # Otherwise we can just go with the permissions needed for the source
> +  # repository.
> +  else:
> +svntest.main.file_write(sbox.authz_file,
> +"[/]\n"
> +"* = r\n"
> +"\n"
> +"[/A]\n"
> +"* = \n"
> +"\n"
> +"[/A_COPY/B/lambda]\n"
> +