[svnbench] Revision: 1135007 compiled Jun 13 2011, 00:21:18
/home/neels/svnbench/20110613-002336 Started at Mon Jun 13 00:23:36 UTC 2011 - Results for dir levels: 5 spread: 5 Timings for 5x5_1.6 N min max avgoperation (unit is seconds) 6 594.60 608.46 601.67 TOTAL RUN 8870.01 20.540.13 add 120.02 19.917.15 checkout 489.69 81.98 31.70 commit 60.450.670.52 copy 6 21.33 30.67 25.77 delete 128.86 27.17 16.48 merge 9600.010.450.01 mkdir 64070.016.600.01 prop mod 2610.010.430.01 propdel 125500.011.250.01 proplist 253730.017.890.01 propset 120.010.010.01 resolve 120.241.830.94 resolved 840.131.760.40 status 6 16.63 26.98 21.52 switch 900.12 31.08 10.32 update --- Timings for 5x5_trunk N min max avgoperation (unit is seconds) 6 339.54 346.05 342.57 TOTAL RUN 8870.011.730.02 add 120.026.433.09 checkout 481.46 81.40 24.36 commit 60.200.250.22 copy 60.760.780.77 delete 127.28 18.53 12.14 merge 9600.010.430.01 mkdir 64070.010.410.01 prop mod 2610.010.010.01 propdel 125500.010.010.01 proplist 253730.010.270.01 propset 120.010.010.01 resolve 120.470.640.56 resolved 840.190.860.50 status 63.563.613.59 switch 900.166.811.98 update --- COMPARE 5x5_1.6 to 5x5_trunk TOTAL RUN times: 601.7 seconds avg for 5x5_1.6 342.6 seconds avg for 5x5_trunk min max avg operation 0.57|-255.0520.57|-262.4140.57|-259.098 TOTAL RUN 1.58| +0.003 0.08|-18.811 0.16| -0.112 add 0.94| -0.001 0.32|-13.481 0.43| -4.059 checkout 0.15| -8.234 0.99| -0.583 0.77| -7.340 commit 0.45| -0.247 0.37| -0.419 0.42| -0.303 copy 0.04|-20.576 0.03|-29.888 0.03|-24.997 delete 0.82| -1.576 0.68| -8.635 0.74| -4.341 merge 1.49| +0.003 0.95| -0.023 1.57| +0.004 mkdir 1.47| +0.003 0.06| -6.195 0.86| -0.002 prop mod 1.39| +0.002 0.03| -0.420 0.77| -0.003 propdel 1.47| +0.002 0.01| -1.239 1.32| +0.002 proplist 1.49| +0.003 0.03| -7.619 0.92| -0.001 propset 1.34| +0.002 1.33| +0.002 1.33| +0.002 resolve 1.98| +0.234 0.35| -1.192 0.59| -0.383 resolved 1.46| +0.060 0.49| -0.893 1.24| +0.096 status 0.21|-13.072 0.13|-23.370 0.17|-17.929 switch 1.40| +0.046 0.22|-24.271 0.19| -8.340 update ("1.23|+0.45" means factor=1.23, difference in seconds = 0.45 factor < 1 or difference < 0 means '5x5_trunk' is faster than '5x5_1.6') - Results for dir levels: 100 spread: 1 Timings for 100x1_1.6 N min max avgoperation (unit is seconds) 6 69.99 73.65 72.11 TOTAL RUN 1480.010.880.04 add 120.014.912.35 checkout 481.334.402.09 commit 60.210.880.33 copy 60.981.141.04 delete 120.700.960.80 merge 1070.010.060.01 mkdir 3610.010.010.01 prop mod 170.010.020.01 propdel 6870.010.010.01 proplist 11130.010.150.01 propset 120.010.010.01 resolve 120.060.350.11 resolved 840.040.270.08 status 63.904.774.41 switch 900.055.852.53 update --- Timings for 100x1_trunk N min max avgoperation (unit is seconds) 6 24.49 25.60 24.93 TOTAL RUN 1480.010.140.02 add 120.020.360.19 checkout 480.294.241.60 commit 60.170.210.19 copy 60.110.120.11 delete 120.581.280.79 merge 1070.010.010.01 mkdir 3610.010.010.01 prop mod 170.010.010.01 propdel 6870.010.010.01 proplist 11130.010.020.01 propset 120.010.010.01 resolve 120.060.090.07 resolved 840.040.110.08 status 60.310.360.34 switch 900.040.510.23 update --- COMPARE 100x1_1.6 to 100x1_trunk TOTAL RUN times: 72.1 seconds avg for 100x1_1.6 24.9 seconds avg for 100x1_trunk min max avg operation 0.35|-45.497 0.35|-48.054 0.35|-47.179 TOTAL RUN 1.62| +0.003 0.15| -0.744 0.39| -0.025 add 1.40| +0.005 0.07| -4.548 0.08| -2.156 checkout 0.22| -1.039 0.96| -0.1
Re: Improvements to diff3 (merge) performance
My executive summary of your post is that you want diff3 to try to merge related, but not identical, changes occuring between a pair of sync points. I'm wary about this for two reasons. First, the benefit appears to arise chiefly for what Bram Cohen calls "implicit cherrypicking" use cases--that is, cases where a change is merged and then merged again together with other changes. After extensive research, Bram eventually concluded that trying to support this is a bad idea (http://bramcohen.livejournal.com/52148.html). I tend to think that a merge algorithm should not increase its false negative rate for the benefit of implicit cherrypicking. Second, I can see a lot of potential for incorrect merges based on superficial similarities. For instance, if you and I both add code between the same two sync points, and the last line of my change is the same as the first line of yours (perhaps a blank line), that could be enough to unambiguously determine an ordering. Perhaps both of our code additions do the same thing in different ways, and now it gets done twice (which is almost never desirable). Certainly, the existing algorithm can produce incorrect merges too, but my intuition says that the practical likelihood would become much higher. Of course, I could be off base. All merge algorithms are heuristic, and it would take a lot of research to really compare the efficacy of one against another. You need a cost function to determine the tradeoff between the false negative rate and the false positive rate, and you also need to measure how any given algorithmic change affects the false negative and false positive rates in practice. Both of these seem really hard. It would definitely affect my opinion if I learned that the three-way merge algorithms in other popular version control systems worked this way, or if I learned that Subversion was more restrictive than, say, gnu diff3.
Re: Improvements to diff3 (merge) performance
Ugh - using "---" to separate sections of the post was clearly a really bad idea; sorry about that. :-( Here it is again, in a hopefully more readable format: Hi, I'm not very happy with the current performance of the diff3 algorithm; I think it returns too many and too big conflicts. I have some ideas on how to improve it, but I'd like to hear other people's opinions first. Note that the primary purpose is to discuss how we want merge to behave, not whether it is easy to implement. I'll use the terms Base (original), Mine (modified) and Theirs (latest) to refer to the three source files. In the merged file, C indicates a conflict. === First, here's a quick run-though of how diff3 works now: After running LCS on Base vs. Mine and Base vs. Theirs, it looks for "sync points" - points where both Mine and Theirs match the Base. Note that there is no need for an actual overlap: If Mine matches Base up to a point and Theirs matches Base from that point onwards, that is a valid sync point. If no two of the three are identical between two neighboring sync points, that creates a conflict. Adjacent changes are accepted: abcd (Base) a1cd (Mine) ab2d (Theirs) a12d (Merged) and abcd (Base) acd (Mine) abc (Theirs) ad (Merged) while strictly larger changes are not: abcd (Base) a1cd (Mine) a12d (Theirs) aCd (Merged) and abcd (Base) acd (Mine) ad (Theirs) aCd (Merged) Note that this means that if changed files A and B are merged to produce C, then attempting to merge A (or B) with C could well produce a conflict (using the same Base the whole time), whereas in my opinion, that ought to work fine, yielding C again. === My proposed main change is to let diff3 also sync on points where Mine and Theirs have identical changes relative to Base, if either file (Mine or Theirs) can also be synced to Base at the same point. This can happen if either Mine or Theirs is unchanged from Base either before or after that point, or if either of the changes from Base was a pure insertion (as there is then no uncertainty as to the position in the Base file). This would let diff3 incorporate all changes relative to Base from both Mine and Theirs as long as there is no true conflict. Some consequences: Given two changed files A and B that only have deletions relative to Base, the merged file would only contain the lines that are present in both A and B. Different insertions would be merged, not marked as conflict, unless they truly conflict. There are only true conflicts if the diff2 between the two insertions have actual changes, not just separated deletions and insertions (or equivalently, if the LCS has gaps in both files between neighboring segments): ag (Base) abcefg (Mine) acdeg (Theirs) abcdefg (Merged) Here, the order of the inserted lines is unambiguously determined; partially from Mine and partially from Theirs. Incremental and overlapping changes would merge cleanly: abcde (Base) a12de (Mine) ab23e (Theirs) a123e (Merged) Of course there are cases where the merged result might not be what would be desired, but that is also the case with the current algorithm - keep in mind that any conflict can be "resolved" by inserting blank lines in Base, Mine and Theirs to serve as sync points. Here's an unfortunate case with the current algorithm: abcbcd (Base) acbcd (Mine) abcd (Theirs) acd (Merged) - even though Theirs < Mine < Base, in strict subsequence sense, Merged < Theirs. As a secondary change, I would like to reduce the sizes of detected conflicts when the change in one file is a pure deletion. Example: abcdef (Base) abc1ef (Mine af (Theirs) aCf (Merged) Here, only the change d -> 1 is a true conflict, and in a graphical interface such as TortoiseSVN, I would like to see only that single line conflicted. However, as was pointed out when I suggested this on our dev IRC, if can be useful to have the context when editing the conflict in e.g. vim. One suggestion on IRC was to split the conflict into 3; two not-really-conflicts around the true conflict. Ideally, the not-really-conflicts would be marked as such in the diff, which would require an API change. An alternate way of handling it, however, would be to merely improve the resolved_diff result for that conflict (which currently only contains the whole thing as a conflict). This means that a resolved_diff could also contain diffs of type svn_diff__type_diff_modified or svn_diff__type_diff_latest, which is currently not the case. It might be best to add more display options for the output functions, so an API change might still be required. Some or all of the output functions would need to be adjusted to handle these possibilities. === If you know of reasons why these changes would be a bad idea, please provide examples when possible. Morten Kloster
Improvements to diff3 (merge) performance
Hi, I'm not very happy with the current performance of the diff3 algorithm; I think it returns too many and too big conflicts. I have some ideas on how to improve it, but I'd like to hear other people's opinions first. Note that the primary purpose is to discuss how we want merge to behave, not whether it is easy to implement. I'll use the terms Base (original), Mine (modified) and Theirs (latest) to refer to the three source files. In the merged file, C indicates a conflict. --- First, here's a quick run-though of how diff3 works now: After running LCS on Base vs. Mine and Base vs. Theirs, it looks for "sync points" - points where both Mine and Theirs match the Base. Note that there is no need for an actual overlap: If Mine matches Base up to a point and Theirs matches Base from that point onwards, that is a valid sync point. If no two of the three are identical between two neighboring sync points, that creates a conflict. Adjacent changes are accepted: abcd (Base) a1cd (Mine) ab2d (Theirs) a12d (Merged) and abcd (Base) acd (Mine) abc (Theirs) ad (Merged) while strictly larger changes are not: abcd (Base) a1cd (Mine) a12d (Theirs) aCd (Merged) and abcd (Base) acd (Mine) ad (Theirs) aCd (Merged) Note that this means that if changed files A and B are merged to produce C, then attempting to merge A (or B) with C could well produce a conflict (using the same Base the whole time), whereas in my opinion, that ought to work fine, yielding C again. --- My proposed main change is to let diff3 also sync on points where Mine and Theirs have identical changes relative to Base, if either file (Mine or Theirs) can also be synced to Base at the same point. This can happen if either Mine or Theirs is unchanged from Base either before or after that point, or if either of the changes from Base was a pure insertion (as there is then no uncertainty as to the position in the Base file). This would let diff3 incorporate all changes relative to Base from both Mine and Theirs as long as there is no true conflict. Some consequences: Given two changed files A and B that only have deletions relative to Base, the merged file would only contain the lines that are present in both A and B. Different insertions would be merged, not marked as conflict, unless they truly conflict. There are only true conflicts if the diff2 between the two insertions have actual changes, not just separated deletions and insertions (or equivalently, if the LCS has gaps in both files between neighboring segments): ag (Base) abcefg (Mine) acdeg (Theirs) abcdefg (Merged) Here, the order of the inserted lines is unambiguously determined; partially from Mine and partially from Theirs. Incremental and overlapping changes would merge cleanly: abcde (Base) a12de (Mine) ab23e (Theirs) a123e (Merged) Of course there are cases where the merged result might not be what would be desired, but that is also the case with the current algorithm - keep in mind that any conflict can be "resolved" by inserting blank lines in Base, Mine and Theirs to serve as sync points. Here's an unfortunate case with the current algorithm: abcbcd (Base) acbcd (Mine) abcd (Theirs) acd (Merged) - even though Theirs < Mine < Base, in strict subsequence sense, Merged < Theirs. As a secondary change, I would like to reduce the sizes of detected conflicts when the change in one file is a pure deletion. Example: abcdef (Base) abc1ef (Mine af (Theirs) aCf (Merged) Here, only the change d -> 1 is a true conflict, and in a graphical interface such as TortoiseSVN, I would like to see only that single line conflicted. However, as was pointed out when I suggested this on our dev IRC, if can be useful to have the context when editing the conflict in e.g. vim. One suggestion on IRC was to split the conflict into 3; two not-really-conflicts around the true conflict. Ideally, the not-really-conflicts would be marked as such in the diff, which would require an API change. An alternate way of handling it, however, would be to merely improve the resolved_diff result for that conflict (which currently only contains the whole thing as a conflict). This means that a resolved_diff could also contain diffs of type svn_diff__type_diff_modified or svn_diff__type_diff_latest, which is currently not the case. It might be best to add more display options for the output functions, so an API change might still be required. Some or all of the output functions would need to be adjusted to handle these possibilities. --- If you know of reasons why these changes would be a bad idea, please provide examples when possible. Morten Kloster
Re: [PATCH] Issue #3919: fix for spurious property conflict during merge
On Sat, Jun 11, 2011 at 11:36 PM, noorul Islam. Kamal Malmiyoda wrote: > >> > > I think conditions can be combined to form one "if" statement without "else > if". > > Thanks and Regards > Noorul Hello - I'm not sure I follow you, can you please explain a bit more? I'm not sure how you could capture all 3 cases without an else-if. I probably could change it from an "if/else if/else" to an "if/else if", but I would have to repeat a condition test in both the "if" and "else if" parts. Because of that, I feel the current patch is more readable even if it takes a little more vertical space. Thanks for the feedback and best regards, BN
Re: svn commit: r1130303 - in /subversion/trunk/subversion: libsvn_repos/authz.c mod_dav_svn/authz.c tests/cmdline/svnsync_tests.py
hwri...@apache.org wrote on Wed, Jun 01, 2011 at 21:09:23 -: > Author: hwright > Date: Wed Jun 1 21:09:22 2011 > New Revision: 1130303 > > URL: http://svn.apache.org/viewvc?rev=1130303&view=rev > Log: > Commit the fix for CVE-2011-1921 and CVE-2011-1783. > > (Hopefully somebody with a bit more knowledge than me will fill in the > detailed > log message.) > > * subversion/mod_dav_svn/authz.c > (dav_svn__allow_read): Foo. > Ahem ping? > * subversion/tests/cmdline/svnsync_tests.py > (specific_deny_authz): New test. > (test_list): Run the new test. > > * subversion/libsvn_repos/authz.c > (svn_repos_authz_check_access): Foo. > And here. > Modified: > subversion/trunk/subversion/libsvn_repos/authz.c > subversion/trunk/subversion/mod_dav_svn/authz.c > subversion/trunk/subversion/tests/cmdline/svnsync_tests.py > > Modified: subversion/trunk/subversion/libsvn_repos/authz.c > URL: > http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_repos/authz.c?rev=1130303&r1=1130302&r2=1130303&view=diff > == > --- subversion/trunk/subversion/libsvn_repos/authz.c (original) > +++ subversion/trunk/subversion/libsvn_repos/authz.c Wed Jun 1 21:09:22 2011 > @@ -776,6 +776,9 @@ svn_repos_authz_check_access(svn_authz_t >return SVN_NO_ERROR; > } > > + /* Sanity check. */ > + SVN_ERR_ASSERT(path[0] == '/'); > + >/* Determine the granted access for the requested path. */ >path = svn_fspath__canonicalize(path, pool); >current_path = path; > > Modified: subversion/trunk/subversion/mod_dav_svn/authz.c > URL: > http://svn.apache.org/viewvc/subversion/trunk/subversion/mod_dav_svn/authz.c?rev=1130303&r1=1130302&r2=1130303&view=diff > == > --- subversion/trunk/subversion/mod_dav_svn/authz.c (original) > +++ subversion/trunk/subversion/mod_dav_svn/authz.c Wed Jun 1 21:09:22 2011 > @@ -54,6 +54,11 @@ dav_svn__allow_read(request_rec *r, >return TRUE; > } > > + /* Sometimes we get paths that do not start with '/' and > + hence below uri concatenation would lead to wrong uris .*/ > + if (path && path[0] != '/') > +path = apr_pstrcat(pool, "/", path, NULL); > + >/* If bypass is specified and authz has exported the provider. > Otherwise, we fall through to the full version. This should be > safer than allowing or disallowing all accesses if there is a > > Modified: subversion/trunk/subversion/tests/cmdline/svnsync_tests.py > URL: > http://svn.apache.org/viewvc/subversion/trunk/subversion/tests/cmdline/svnsync_tests.py?rev=1130303&r1=1130302&r2=1130303&view=diff > == > --- subversion/trunk/subversion/tests/cmdline/svnsync_tests.py (original) > +++ subversion/trunk/subversion/tests/cmdline/svnsync_tests.py Wed Jun 1 > 21:09:22 2011 > @@ -870,6 +870,67 @@ def commit_a_copy_of_root(sbox): >#Testcase for issue 3438. >run_test(sbox, "repo-with-copy-of-root-dir.dump") > > + > +@Skip(svntest.main.is_ra_type_file) > +def specific_deny_authz(sbox): > + "verify if specifically denied paths dont sync" > + > + sbox.build("specific-deny-authz") > + > + dest_sbox = sbox.clone_dependent() > + build_repos(dest_sbox) > + > + svntest.actions.enable_revprop_changes(dest_sbox.repo_dir) > + > + run_init(dest_sbox.repo_url, sbox.repo_url) > + > + svntest.main.run_svn(None, "cp", > + os.path.join(sbox.wc_dir, "A"), > + os.path.join(sbox.wc_dir, "A_COPY") > + ) > + svntest.main.run_svn(None, "ci", "-mm", sbox.wc_dir) > + > + write_restrictive_svnserve_conf(sbox.repo_dir) > + > + # For mod_dav_svn's parent path setup we need per-repos permissions in > + # the authz file... > + if sbox.repo_url.startswith('http'): > +svntest.main.file_write(sbox.authz_file, > +"[specific-deny-authz:/]\n" > +"* = r\n" > +"\n" > +"[specific-deny-authz:/A]\n" > +"* = \n" > +"\n" > +"[specific-deny-authz:/A_COPY/B/lambda]\n" > +"* = \n" > +"\n" > +"[specific-deny-authz-1:/]\n" > +"* = rw\n") > + # Otherwise we can just go with the permissions needed for the source > + # repository. > + else: > +svntest.main.file_write(sbox.authz_file, > +"[/]\n" > +"* = r\n" > +"\n" > +"[/A]\n" > +"* = \n" > +"\n" > +"[/A_COPY/B/lambda]\n" > +