[PATCH] enhance svn diff by adding --diff-copy-from switch
Hi,

This patch enhances the current svn diff by adding a new --diff-copy-from switch. I have attached the patch and the log message to this mail. Please review and share your comments.

I am resending this patch with the extension added to the log and patch files.

Regards
Prabhu

[[[
Make svn diff accept --diff-copy-from in order to compare/diff against the
copy-source file.

* subversion/libsvn_ra/deprecated.c
  (): deprecated svn_ra_do_diff3
  (svn_ra_do_diff2): pass FALSE for 'diff_copy_from' argument in do_diff
    function.

* subversion/libsvn_ra/wrapper_template.h
  (compat_do_diff): pass FALSE for 'diff_copy_from' argument in do_diff
    function.

* subversion/libsvn_ra/ra_loader.c
  (svn_ra_do_diff4): introduced 'diff_copy_from' argument.
    pass 'diff_copy_from' argument in do_diff function.

* subversion/libsvn_ra/ra_loader.h
  svn_ra__vtable_t: introduced 'diff-copy-from' argument in do_diff.

* subversion/libsvn_ra_local/ra_plugin.c
  (svn_ra_local__do_diff): introduced 'diff_copy_from' argument.
    if 'diff_copy_from', pass true for send_copyfrom_args
    if 'diff_copy_from', pass false for ignore_ancestry

* subversion/libsvn_ra_svn/client.c
  (ra_svn_diff): introduced 'diff_copy_from' option and pass it to
    svn_ra_svn_write_cmd function.

* subversion/svn/cl.h
  svn_cl__opt_state_t: introduced 'diff_copy_from' option.

* subversion/svn/diff-cmd.c
  (svn_cl__diff): passed 'diff_copy_from' option to svn_client_diff5
    function and the svn_client_diff_peg5 function.

* subversion/svn/log-cmd.c
  (log_entry_receiver): passed false for diff_copy_from to
    svn_client_diff5 function.

* subversion/svn/main.c
  (): introduced the diff_copy_from option and display information to user.
  svn_cl__options[]: added information about the diff_copy_from option.
  svn_cl__cmd_table[]: added diff-copy-from to the subcommand table.
  (main): handle the 'diff-copy-from' case.

* subversion/include/svn_client.h
  (svn_client_diff5): introduced the 'diff_copy_from' argument.
  (svn_client_diff_peg5): introduced the 'diff_copy_from' argument.

* subversion/include/svn_ra.h
  (): deprecated svn_ra_do_diff3 and introduced the svn_ra_do_diff4 to
    accept 'diff_copy_from' argument.

* subversion/libsvn_client/deprecated.c
  (svn_client_diff4): pass false for diff_copy_from in svn_client_diff5
    for backporting purposes.
  (svn_client_diff_peg4): pass false for diff_copy_from in
    svn_client_diff_peg5 for backporting purposes.

* subversion/libsvn_client/repos_diff.c
  edit_baton: store the value of diff_copy_from.
  (get_file_from_ra): introduced the 'path' argument to pass the
    copyfrom_path.
  (diff_deleted_dir): passed path in the 'path' argument.
  (delete_entry): pass 'path' in the get_file_from_ra function.
  (add_file): if 'diff_copy_from' and copyfrom_revision are valid, add the
    file from the copy-source.
  (close_file): if diff_copy_from is set, call the file_changed callback.
  (svn_client__get_diff_editor): introduced the 'diff_copy_from' argument
    and assigned it in the edit baton.

* subversion/libsvn_client/client.h
  (svn_client__get_diff_editor): introduced the 'diff_copy_from' argument.

* subversion/libsvn_client/merge.c
  (drive_merge_report_editor): pass false for diff_copy_from to
    svn_ra_do_diff4 function and svn_client__get_diff_editor function.

* subversion/libsvn_client/diff.c
  (diff_repos_repos): if diff_copy_from is set, perform diff with respect
    to the copy-source.
  (diff_repos_wc): if diff_copy_from is set, perform diff with respect to
    the copy-source.
  (do_diff): perform diff with respect to the diff_copy_from argument.
  (diff_summarize_repos_repos): pass false for 'diff_copy_from' to
    svn_ra_do_diff4 function.
  (svn_client_diff5): introduced the 'diff_copy_from' argument and pass it
    to do_diff function.
  (svn_client_diff_peg5): introduced the 'diff_copy_from' argument and pass
    it to do_diff function.

* subversion/libsvn_ra_neon/ra_neon.h
  (svn_ra_neon__do_diff): introduced the 'diff_copy_from' argument.

* subversion/libsvn_ra_neon/fetch.c
  (svn_ra_neon__do_diff): introduced the 'diff_copy_from' argument and pass
    it to make_reporter.

* subversion/libsvn_ra_serf/ra_serf.h
  (svn_ra_serf__do_diff): introduced the 'diff_copy_from' argument.

* subversion/libsvn_ra_serf/update.c
  (svn_ra_serf__do_diff): introduced the 'diff_copy_from' argument and pass
    it to make_update_reporter.

* subversion/svnserve/serve.c
  (diff): introduced the 'diff_copy_from' argument and pass it to
    accept_reporter.

Patch by: Prabhu Gnana Sundar prabh...@collab.net
Suggested by: Kamesh
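
The pattern the log message describes (a new API revision grows a 'diff_copy_from' argument, and the deprecated entry point forwards to it with FALSE) can be sketched roughly as below. This is only an illustration: toy_do_diff, toy_do_diff2 and diff_request_t are hypothetical stand-ins, not the real Subversion signatures, and the value of send_copyfrom_args when the flag is off is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the state a diff request carries.
   These are NOT the real Subversion types or signatures. */
typedef struct diff_request_t
{
  const char *target;
  int ignore_ancestry;
  int send_copyfrom_args;
} diff_request_t;

/* New revision of the (toy) API: takes the extra DIFF_COPY_FROM flag. */
void toy_do_diff2(diff_request_t *req, const char *target,
                  int ignore_ancestry, int diff_copy_from)
{
  assert(req != NULL);
  req->target = target;
  /* Mirrors the ra_local entry in the log message: when diff_copy_from
     is set, send copyfrom args and do not ignore ancestry.  Leaving
     send_copyfrom_args off otherwise is an assumption of this sketch. */
  req->send_copyfrom_args = diff_copy_from ? 1 : 0;
  req->ignore_ancestry = diff_copy_from ? 0 : ignore_ancestry;
}

/* Deprecated (toy) entry point: keeps the old signature and forwards
   to the new one with the flag off, so existing callers are unaffected. */
void toy_do_diff(diff_request_t *req, const char *target,
                 int ignore_ancestry)
{
  toy_do_diff2(req, target, ignore_ancestry, 0);
}
```

Callers of the old function keep their behaviour, while new callers opt in by passing a nonzero flag to the new one.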
Re: AW: How to find out the rev number where a file was deleted?
[ moving to dev@ ]

Following up on a discussion on the users list about the lack of a way to easily find the rev number in which a file was deleted...

Already referred to issue #3627 (FS API support for oldest-to-youngest history traversal) and FS-NG, as mentioned on the roadmap. But the discussion continued about why this is so hard right now, and whether there are alternative approaches. See below...

On Mon, Nov 29, 2010 at 3:51 AM, Daniel Shahaf d...@daniel.shahaf.name wrote:
Johan Corveleyn wrote on Sun, Nov 28, 2010 at 21:20:28 +0100:
On Sun, Nov 28, 2010 at 6:35 PM, Daniel Shahaf d...@daniel.shahaf.name wrote:
Stefan Sperling wrote on Sun, Nov 28, 2010 at 16:48:30 +0100:

The real problem is that we want to be able to answer these questions very fast, and some design aspects work against this. For instance, FSFS by design does not allow modifying old revisions. So where do we store the copy-to information for a given p...@n?

copy-to information is immutable (never changes once created), so we could add another hierarchy (parallel to revs/ and revprops/) in which to store that information. Any 'cp f...@n bar' operation would need to create/append a file in that hierarchy.

Open question: how to organize $new_hierarchy/16/16384/** to make it efficiently appendable and queryable (and for what queries? "Iterate all copied-to places" is one).

Makes sense?

I'm not sure. But there is another alternative: while we wait for FS-NG (or another solution like you propose), one could implement the slow algorithm within the current design.

Are you advocating to implement it in the core (as an svn_fs_* API) or as a third-party script? The latter is certainly fine, but regarding the former I don't see the point of adding an API that cannot be implemented efficiently at this time.

Why not in the core? "We can't do this quickly, so we don't do it" is not a very strong argument against having this very useful functionality IMHO.

Having it in the core is vastly more useful for people like me (and my colleagues): works on Windows, regardless of whether or not one has perl/python installed, no need to distribute an additional script, guaranteed to be available everywhere an svn client is installed, ...

It's actually quite similar to the way blame is implemented currently: we don't really have the design (line-based information) to do this quickly, but we calculate it from the other information that we have available (in a way that could also be done by a script on the client: diffing every interesting revision against the next, remembering the lines that were added/removed in every step).

Can you imagine not having blame in svn core just because we can't do it quickly? Ok, blame may be a more important use case than finding the rev number where a file was deleted, but still ...

So I still think it's definitely worth it to have this in the core and offer an API, and implement it slowly now because that's the only way we can do it (besides, I don't think it will be *that* slow). And optimize it later when we have FS-NG, or another way to retrieve this info quickly...

However, having said all that doesn't change the fact that someone still needs to implement it, and I must admit I don't have the cycles for that currently :-(.

Cheers,
Johan

Just automating what a user (or script) currently does when looking for this information, i.e. a binary search. Of course it would be slow, but it would certainly already provide value. At the very least, it saves users a lot of time searching FAQs and list archives, wondering why this doesn't work, understanding the design limitations, and then finally implementing their own script or doing a one-time manual search.

Then, when FS-NG arrives, or someone comes up with a way to index this information, it can be optimized.

I don't know if there would be fundamental problems with that, apart from the fact that someone still needs to implement it of course ...

Cheers,
--
Johan
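
The binary search mentioned above could look roughly like the following sketch. It assumes the path exists at one known revision, is absent at a later one, and was deleted exactly once in between (no delete/re-add cycles). The path_exists_fn callback is hypothetical: it stands in for whatever repository query a client or script would make, and is not a Subversion API. toy_exists is a fake "repository" for trying the function out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: returns nonzero if the path exists in revision
   REV.  A real client would ask the repository; this is not a
   Subversion API. */
typedef int (*path_exists_fn)(long rev, void *baton);

/* Return the revision in which the path was deleted, i.e. the smallest
   revision in (REV_ALIVE, REV_GONE] in which the path no longer exists.
   Assumes the path exists at REV_ALIVE, is absent at REV_GONE, and was
   deleted exactly once in between. */
long find_deleting_rev(long rev_alive, long rev_gone,
                       path_exists_fn exists, void *baton)
{
  assert(rev_alive < rev_gone);
  while (rev_gone - rev_alive > 1)
    {
      long mid = rev_alive + (rev_gone - rev_alive) / 2;
      if (exists(mid, baton))
        rev_alive = mid;   /* still present: deleted later than MID */
      else
        rev_gone = mid;    /* already gone: deleted in MID or earlier */
    }
  return rev_gone;
}

/* Toy stand-in for a repository query: the baton points at the revision
   in which the path was deleted. */
int toy_exists(long rev, void *baton)
{
  long deleted_in = *(long *)baton;
  return rev < deleted_in;
}
```

Each probe halves the candidate range, so a history of N revisions needs on the order of log2(N) existence checks rather than N.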
Re: AW: How to find out the rev number where a file was deleted?
Johan Corveleyn wrote on Mon, Nov 29, 2010 at 10:14:01 +0100:

Having it in the core is vastly more useful for people like me (and my colleagues): works on Windows, regardless of whether or not one has perl/python installed, no need to distribute an additional script, guaranteed to be available everywhere an svn client is installed, ...

You are talking about having the functionality supported by the svn* binaries. I was talking about having the functionality supported by the svn_fs_* API. I agree these questions are related, but they aren't precisely the same question.

It's actually quite similar to the way blame is implemented currently: we don't really have the design (line-based information) to do this quickly, but we calculate it from the other information that we have available (in a way that could also be done by a script on the client: diffing every interesting revision against the next, remembering the lines that were added/removed in every step).

If svn_client_blameN() re-uses its RA session, then it has an advantage over a shell script that calls 'svn diff' repeatedly. I agree it still doesn't have an advantage over a C bindings script that calls svn_client_diffN() repeatedly.
Apache CMS running latest apreq candidate
A new CMS service was put into place by the ASF sysadmins over the past few months, and it makes very good use of subversion, modperl2 and libapreq2. To see it in action you need to be an Apache committer and visit https://cms.apache.org/, but the code is publicly available at https://svn.apache.org/repos/infra/websites/cms

Would love feedback on the design and implementation details, especially as it's based on Subversion and I had a bit of trouble with some of the existing perl glue for SVN (e.g., couldn't make heads or tails out of what the glue for `svn status` produces so I used the shell, and tainted variables don't play well with the glue either).
Re: r1039140 (or, spineless code is for the birds).
On 11/29/2010 02:23 PM, Julian Foad wrote:
Julian Foad wrote:

We agreed that C-Mike will clean up print_update_summary() and I will make the caller reject URL targets in the first place, like Noorul has done for other subcommands recently.

r1040232 (mine) and r1040233 (Mike's).

Thanks for raising this, Mike.

Thanks for reviewing the commits flying by so we can ultimately catch stuff like this.

--
C. Michael Pilato cmpil...@collab.net
CollabNet www.collab.net Distributed Development On Demand
Re: [PATCH] extend svn_subst_translate_string() to record whether re-encoding and/or line ending translation were performed (v. 2)
Attached is a benchmark and Makefile that I used to test the speed of svn_subst_translate_string() from trunk versus the new svn_subst_translate_string2(). The program reads a text file named `2600.txt` in the current working directory and repeatedly calls svn_subst_translate_string() on the contents. For `2600.txt`, I used the plain text version of War and Peace from Project Gutenberg (http://www.gutenberg.org/ebooks/2600.txt.utf8).

The data that I generated for tr...@1040115 were:

trunk_at_1040115 <- c(778, 791, 787, 766, 784, 776, 762, 750, 786, 780, 764, 774, 776, 785, 801, 780, 773, 770, 790, 776, 779, 797, 770, 771, 799, 783, 778, 781, 773, 760)

The data for the HEAD sources (commit 6f828b0a4e07d1e14189b9b8c84bd0f884c59164 from my repo; https://github.com/dtrebbien/subversion/tree/6f828b0a4e07d1e14189b9b8c84bd0f884c59164) were:

HEAD <- c(805, 823, 798, 815, 795, 860, 808, 842, 800, 802, 842, 796, 801, 820, 808, 849, 819, 792, 782, 778, 788, 854, 797, 825, 883, 854, 831, 827, 801, 799)

Note: This is not version 3 of the patch. It is essentially tr...@1040115 plus version 3 plus this changeset: https://github.com/dtrebbien/subversion/commit/d22329a54dcf58cddc2b618f913597c6defbcb2d

The t-test allows us to conclude with high confidence that the mean time to run the benchmark with libsvn_subr-1 compiled from tr...@1040115 is less than the mean time to run the benchmark with libsvn_subr-1 compiled from the HEAD sources:

t.test(trunk_at_1040115, HEAD, alternative = "less", var.equal = TRUE, conf.level = 0.90)

    Two Sample t-test

data: trunk_at_1040115 and HEAD
t = -7.473, df = 58, p-value = 2.350e-10
alternative hypothesis: true difference in means is less than 0
90 percent confidence interval:
 -Inf -317939.7
sample estimates:
mean of x mean of y
      778   8164667

I realized, however, that this is not a fair comparison because the HEAD sources simply call svn_subst_translate_string2() within svn_subst_translate_string(), meaning that there is an extra layer of indirection.

After modifying the benchmark to call svn_subst_translate_string2() directly, I generated these timings:

HEAD_new <- c(785, 789, 808, 798, 782, 788, 785, 754, 847, 823, 841, 788, 741, 749, 742, 765, 743, 743, 753, 772, 794, 778, 807, 784, 787, 797, 769, 791, 786, 762)

Now we cannot reject the null hypothesis that the mean time to run the benchmark with libsvn_subr-1 compiled from tr...@1040115 is greater than or equal to the mean time to run the modified benchmark with libsvn_subr-1 compiled from the HEAD sources:

t.test(trunk_at_1040115, HEAD_new, alternative = "less", var.equal = TRUE, conf.level = 0.90)

    Two Sample t-test

data: trunk_at_1040115 and HEAD_new
t = -0.6839, df = 58, p-value = 0.2484
alternative hypothesis: true difference in means is less than 0
90 percent confidence interval:
 -Inf 33129.55
sample estimates:
mean of x mean of y
      778   7817000

One other set of timings that I generated was for the modified benchmark running with libsvn_subr-1 compiled from the HEAD sources, slightly modified to set `repair` to TRUE:

HEAD_new_repair <- c(766, 756, 757, 754, 767, 779, 746, 784, 806, 779, 800, 783, 837, 801, 773, 780, 790, 773, 773, 779, 775, 793, 786, 781, 793, 784, 789, 746, 779, 773)

We cannot reject the null hypothesis that the mean time to run the modified benchmark with libsvn_subr-1 compiled from the HEAD sources is the same as the mean time to run the modified benchmark with libsvn_subr-1 compiled from slightly-modified HEAD sources (`repair` is set to TRUE):

t.test(HEAD_new, HEAD_new_repair, var.equal = TRUE, conf.level = 0.90)

    Two Sample t-test

data: HEAD_new and HEAD_new_repair
t = 0.3815, df = 58, p-value = 0.7042
alternative hypothesis: true difference in means is not equal to 0
90 percent confidence interval:
 -4.74 123774.74
sample estimates:
mean of x mean of y
  7817000   7794000

t.test(trunk_at_1040115, HEAD_new_repair, alternative = "less", var.equal = TRUE, conf.level = 0.90)

    Two Sample t-test

data: trunk_at_1040115 and HEAD_new_repair
t = -0.3501, df = 58, p-value = 0.3638
alternative hypothesis: true difference in means is less than 0
90 percent confidence interval:
 -Inf 37836.36
sample estimates:
mean of x mean of y
      778   7794000

Therefore, I do not have evidence to support my earlier claim:

3.) This penalizes repair translations.

My conclusion from all of this is that regardless of the value of `repair`, my changes do not appear to decrease
SQLite and callbacks
We use callbacks extensively throughout our code as a means of providing streamy feedback to callers. It's a pretty good paradigm, and one that has served us well. We don't put many restrictions on what the callbacks can do in terms of fetching more information or calling other functions. Enter wc-ng. Stefan's patch to make a recursive proplist much more performant highlights the great benefit that our sqlite-backed storage can have. However, he reverted it due to concerns about the potential for database contention. The theory was that the callback might try and call additional wc functions to get more information, and such nested statements weren't healthy for sqlite. We talked about it for a bit in IRC this morning, and the picture raised by this issue was quite dire. In an attempt to find out what the consequences of these nested queries are, I wrote a test program to attempt to demonstrate the failure, only now I can't seem to do so. Attached is the test program, but when I run it, I'm able to successfully execute multiple prepared statements on the same set of rows simultaneously, which was the concern we had about our callback mechanism in sqlite. So is this a valid problem? If so, could somebody use the attached test program to illustrate it for those of us who may not fully understand the situation? 
Thanks,
-Hyrum

#include <stdio.h>
#include <sqlite3.h>

/* Report any error on DB other than the normal row/done step results. */
#define CHECK_ERR \
  if (sqlite3_errcode(db) \
      && (sqlite3_errcode(db) != SQLITE_ROW) \
      && (sqlite3_errcode(db) != SQLITE_DONE)) \
    fprintf(stderr, "%d: %d: %s\n", __LINE__, \
            sqlite3_errcode(db), sqlite3_errmsg(db));

#define TEST_DATA \
  "create table foo (num int, message text); " \
  " " \
  "insert into foo values (1, 'A is for Allegator'); " \
  "insert into foo values (2, 'B is for Bayou'); " \
  "insert into foo values (3, 'C is for Cyprus Trees'); " \
  "insert into foo values (4, 'D is for Dew'); " \
  "insert into foo values (5, 'E is for Everything like'); " \
  "insert into foo values (6, 'F Ferns or'); " \
  "insert into foo values (7, 'G Grass that''s'); " \
  "insert into foo values (8, 'H Home to you'); "

/* Runs a second statement on the same table while the outer
   statement in get_numbers() is still being stepped. */
void
callback(sqlite3 *db, int num)
{
  const char *query = "select message from foo where num = ?;";
  sqlite3_stmt *stmt;
  const unsigned char *msg;

  printf("Got number: %d, now getting message\n", num);

  sqlite3_prepare_v2(db, query, -1, &stmt, NULL);
  CHECK_ERR;

  sqlite3_bind_int(stmt, 1, num);
  CHECK_ERR;

  sqlite3_step(stmt);
  CHECK_ERR;

  msg = sqlite3_column_text(stmt, 0);
  CHECK_ERR;

  printf("Message: %s\n", msg);

  sqlite3_finalize(stmt);
  CHECK_ERR;
}

/* Steps through every row of foo, invoking CALLBACK for each number. */
void
get_numbers(sqlite3 *db, void (*callback)(sqlite3 *, int))
{
  const char *query = "select num from foo;";
  sqlite3_stmt *stmt;
  int code;

  sqlite3_prepare_v2(db, query, -1, &stmt, NULL);
  CHECK_ERR;

  code = sqlite3_step(stmt);
  CHECK_ERR;

  while (code == SQLITE_ROW)
    {
      int number = sqlite3_column_int(stmt, 0);

      callback(db, number);

      code = sqlite3_step(stmt);
      CHECK_ERR;
    }

  sqlite3_finalize(stmt);
  CHECK_ERR;
}

int
main(int argc, char *argv[])
{
  sqlite3 *db;

  remove("test.db");
  sqlite3_open_v2("test.db", &db,
                  SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE, NULL);
  CHECK_ERR;

  sqlite3_extended_result_codes(db, 1);
  CHECK_ERR;

  sqlite3_exec(db, TEST_DATA, NULL, NULL, NULL);
  CHECK_ERR;

  get_numbers(db, callback);

  sqlite3_close(db);
  return 0;
}
Re: svn commit: r1028084 - /subversion/trunk/build/ac-macros/java.m4
On 10/27/2010 01:16 PM, jwhitl...@apache.org wrote:

Author: jwhitlock
Date: Wed Oct 27 20:16:36 2010
New Revision: 1028084

URL: http://svn.apache.org/viewvc?rev=1028084&view=rev
Log:
Fixed the problem where compiling JavaHL on OS X fails after Apple's latest Java update.

* build/ac-macros/java.m4: Added two new paths to check for the jni.h when on the Darwin platform.

Should we also add URLs to the download location for the Java developer kits for 10.5 and 10.6, as people may not know that they need to install them? Or add it to the INSTALL notes?

10.6: https://connect.apple.com/cgi-bin/WebObjects/MemberSite.woa/wa/getSoftware?bundleID=20719
10.5: https://connect.apple.com/cgi-bin/WebObjects/MemberSite.woa/wa/getSoftware?bundleID=20720

Blair
Re: svn commit: r1028084 - /subversion/trunk/build/ac-macros/java.m4
On Mon, Nov 29, 2010 at 6:50 PM, Blair Zajac bl...@orcaware.com wrote:
On 10/27/2010 01:16 PM, jwhitl...@apache.org wrote:

Author: jwhitlock
Date: Wed Oct 27 20:16:36 2010
New Revision: 1028084

URL: http://svn.apache.org/viewvc?rev=1028084&view=rev
Log:
Fixed the problem where compiling JavaHL on OS X fails after Apple's latest Java update.

* build/ac-macros/java.m4: Added two new paths to check for the jni.h when on the Darwin platform.

Should we also add URLs to the download location for the Java developer kits for 10.5 and 10.6, as people may not know that they need to install them? Or add it to the INSTALL notes?

It would be interesting to see a proof of this idea. I mean, I could see this getting very verbose and then I'd suggest an INSTALL mention. If it weren't very verbose, outputting it as part of a configure failure could be ideal.
1.7.x bug - svn add no longer accepts wildcards?
Hi,

During testing of a 1.7.x build, I've noticed that 'svn add' on trunk no longer accepts wildcards:

svn 1.6.x:

D:\temp\svn_sandpit\workingcopy\trunk\A>echo 111 1>alpha.txt
D:\temp\svn_sandpit\workingcopy\trunk\A>echo 111 1>beta.txt
D:\temp\svn_sandpit\workingcopy\trunk\A>svn add *.txt
A         alpha.txt
A         beta.txt

Whereas 1.7.x gives:

D:\temp\svn_sandpit\workingcopy\trunk\A>echo 111 1>alpha.txt
D:\temp\svn_sandpit\workingcopy\trunk\A>echo 111 1>beta.txt
D:\temp\svn_sandpit\workingcopy\trunk\A>D:\temp\svn_sandpit\svn7.exe add *.txt
svn: warning: 'D:\temp\svn_sandpit\workingcopy\trunk\A\*.txt' not found

I thought that wildcards were expanded by the OS/Shell, and then passed to SVN. Is this not the case?

Cheers,
---
Daniel Becroft
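
On the question of who expands wildcards: POSIX shells expand an unquoted *.txt before the program starts, whereas cmd.exe on Windows passes the pattern through unchanged, so any expansion has to happen inside the program or its C runtime startup code. A small sketch for checking what a program actually received is below. The helper names are made up for this illustration, and nothing here says how svn itself handles its arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return nonzero if ARG still contains a glob metacharacter. */
int has_wildcard(const char *arg)
{
  assert(arg != NULL);
  return strpbrk(arg, "*?") != NULL;
}

/* Count the arguments in ARGV[1..ARGC-1] that reached the program with
   their wildcards unexpanded.  ARGV[0] (the program name) is skipped. */
int count_unexpanded(int argc, const char *const *argv)
{
  int i, count = 0;
  assert(argc >= 0 && argv != NULL);
  for (i = 1; i < argc; i++)
    if (has_wildcard(argv[i]))
      count++;
  return count;
}
```

Called from main() with the program's own argc and argv, a nonzero result for an invocation such as 'prog add *.txt' in a directory that contains matching files suggests that neither the shell nor the runtime expanded the pattern. A quoted pattern on a POSIX shell also arrives unexpanded, so the check is only a hint.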
Re: svn commit: r1037738 - Summary of updates
C. Michael Pilato wrote on Mon, Nov 29, 2010 at 14:17:13 -0500:
On 11/26/2010 01:01 AM, Daniel Shahaf wrote:

Thanks for this :-).

It works as expected when the current directory is entered by its name, but not when it's entered through a symlink:

[[[
% cd /tmp
% ln -s wc1 wcalias
% cd wc1
% $svn up /tmp/wc1/trunk/iota
Updating 'trunk/iota' ...
% cd /tmp/wcalias
% $svn up /tmp/wc1/trunk/iota
Updating 'trunk/iota' ...
% $svn up /tmp/wcalias/trunk/iota
Updating '/tmp/wcalias/trunk/iota' ...
]]]

Ideally, the last output would have used the relative path 'trunk/iota', too. However, I'm not sure how to easily solve that --- is it as easy as calling some "resolve symlinks" function on the absolute-cwd string?

Do we have a "resolve symlinks" function? I don't see anything in APR or in svn_io.h.

Can we use realpath(3)? Or should we roll our own?

Daniel
(I rolled a prototype here)
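
As a rough illustration of the realpath(3) idea, and not the prototype mentioned above, the sketch below resolves symlinks in both the working directory and the target and then strips the common ancestor. It assumes a POSIX.1-2008 system where realpath() allocates its result when passed NULL. The names skip_ancestor and relative_to_cwd are made up for this example and are not Subversion or APR functions.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* If CHILD equals PARENT or lies below it, return the part of CHILD that
   is relative to PARENT ("" when they are equal); otherwise return NULL.
   Both arguments are expected to be canonical absolute paths. */
const char *skip_ancestor(const char *parent, const char *child)
{
  size_t len;
  assert(parent != NULL && child != NULL);
  len = strlen(parent);
  if (strncmp(parent, child, len) != 0)
    return NULL;
  if (child[len] == '\0')
    return child + len;
  if (child[len] == '/')
    return child + len + 1;
  if (len > 0 && parent[len - 1] == '/')   /* PARENT is the root "/" */
    return child + len;
  return NULL;                             /* e.g. /tmp/wc1 vs /tmp/wc10 */
}

/* Resolve symlinks in CWD and TARGET with realpath(3), then express
   TARGET relative to CWD.  Returns a malloc'd string the caller must
   free, or NULL if a path cannot be resolved or TARGET is not below
   CWD. */
char *relative_to_cwd(const char *cwd, const char *target)
{
  char *real_cwd = realpath(cwd, NULL);
  char *real_target = realpath(target, NULL);
  char *result = NULL;

  if (real_cwd != NULL && real_target != NULL)
    {
      const char *rel = skip_ancestor(real_cwd, real_target);
      if (rel != NULL)
        {
          result = malloc(strlen(rel) + 1);
          if (result != NULL)
            strcpy(result, rel);
        }
    }
  free(real_cwd);
  free(real_target);
  return result;
}
```

With the layout from the transcript, relative_to_cwd("/tmp/wcalias", "/tmp/wcalias/trunk/iota") would be expected to yield "trunk/iota", since both arguments resolve through the symlink to paths under /tmp/wc1. Whether a non-portable call like this is acceptable, or an APR-based equivalent is needed, is the open question in the thread.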