[PATCH] enhance svn diff by adding --diff-copy-from switch

2010-11-29 Thread Prabhu Gnana Sundar
Hi,

This patch enhances the current svn diff by adding a new
--diff-copy-from switch. I have attached the patch and the log message
along with this mail. Please review and share your comments on the same.

I am resending this patch with the extension added to the log and the
patch files.

Regards
Prabhu
[[[
Make 'svn diff' accept --diff-copy-from in order to diff against the
copy-source file.

* subversion/libsvn_ra/deprecated.c
  (): deprecated svn_ra_do_diff3
  (svn_ra_do_diff2): pass FALSE for 'diff_copy_from' argument in do_diff function.

* subversion/libsvn_ra/wrapper_template.h
  (compat_do_diff): pass FALSE for 'diff_copy_from' argument in do_diff function.

* subversion/libsvn_ra/ra_loader.c
  (svn_ra_do_diff4): introduced 'diff_copy_from' argument.
 pass 'diff_copy_from' argument in do_diff function.

* subversion/libsvn_ra/ra_loader.h
  svn_ra__vtable_t: introduced 'diff-copy-from' argument in do_diff.

* subversion/libsvn_ra_local/ra_plugin.c
  (svn_ra_local__do_diff): introduced 'diff_copy_from' argument.
   if 'diff_copy_from', pass true for send_copyfrom_args
   if 'diff_copy_from', pass false for ignore_ancestry

* subversion/libsvn_ra_svn/client.c
  (ra_svn_diff): introduced 'diff_copy_from' option and pass it to 
 svn_ra_svn_write_cmd function.

* subversion/svn/cl.h
  svn_cl__opt_state_t: introduced 'diff_copy_from' option.

* subversion/svn/diff-cmd.c
  (svn_cl__diff): passed 'diff_copy_from' option to svn_client_diff5 function
  and the svn_client_diff_peg5 function.

* subversion/svn/log-cmd.c
  (log_entry_receiver): passed false for diff_copy_from to svn_client_diff5 function.

* subversion/svn/main.c
  (): introduced the diff_copy_from option and display information to user.
  svn_cl__options[]: added information about the diff_copy_from option.
  svn_cl__cmd_table[]: added diff-copy-from to the subcommand table.
  (main): handle the 'diff-copy-from' case.

* subversion/include/svn_client.h
  (svn_client_diff5): introduced the 'diff_copy_from' argument.
  (svn_client_diff_peg5): introduced the 'diff_copy_from' argument.

* subversion/include/svn_ra.h
  (): deprecated svn_ra_do_diff3 and introduced the 
  svn_ra_do_diff4 to accept 'diff_copy_from' argument.

* subversion/libsvn_client/deprecated.c
  (svn_client_diff4): pass false for diff_copy_from in svn_client_diff5 for 
  backporting purpose.
  (svn_client_diff_peg4): pass false for diff_copy_from in svn_client_diff_peg5
  for backporting purpose.

* subversion/libsvn_client/repos_diff.c
  edit_baton: store the value of diff_copy_from.
  (get_file_from_ra): introduced the 'path' argument to pass the copyfrom_path.
  (diff_deleted_dir): passed path in the 'path' argument.
  (delete_entry): pass 'path' in the get_file_from_ra function.
  (add_file): if 'diff_copy_from' and copyfrom_revision are valid, add the file
  from the copy-source.
  (close_file): if diff_copy_from is set, call the file_changed callback.
  (svn_client__get_diff_editor): introduced the 'diff_copy_from' argument and
 assigned it in the edit baton.

* subversion/libsvn_client/client.h
  (svn_client__get_diff_editor): introduced the 'diff_copy_from' argument.

* subversion/libsvn_client/merge.c
  (drive_merge_report_editor): pass false for diff_copy_from to svn_ra_do_diff4
   function and svn_client__get_diff_editor function.

* subversion/libsvn_client/diff.c
  (diff_repos_repos): if diff_copy_from is set, perform diff with respect to the
  copy-source.
  (diff_repos_wc): if diff_copy_from is set, perform diff with respect to the
  copy-source.
  (do_diff): perform diff with respect to the diff_copy_from argument.
  (diff_summarize_repos_repos): pass false for 'diff_copy_from' to 
svn_ra_do_diff4 function.
  (svn_client_diff5): introduced the 'diff_copy_from' argument and pass it to
  do_diff function.
  (svn_client_diff_peg5): introduced the 'diff_copy_from' argument and pass it
  to do_diff function.

* subversion/libsvn_ra_neon/ra_neon.h
  (svn_ra_neon__do_diff): introduced the 'diff_copy_from' argument.

* subversion/libsvn_ra_neon/fetch.c
  (svn_ra_neon__do_diff): introduced the 'diff_copy_from' argument and pass it to
  make_reporter.

* subversion/libsvn_ra_serf/ra_serf.h
  (svn_ra_serf__do_diff): introduced the 'diff_copy_from' argument.

* subversion/libsvn_ra_serf/update.c
  (svn_ra_serf__do_diff): introduced the 'diff_copy_from' argument and pass it to 
  make_update_reporter.

* subversion/svnserve/serve.c
  (diff): introduced the 'diff_copy_from' argument and pass it to accept_reporter.



Patch by: Prabhu Gnana Sundar prabh...@collab.net
Suggested by: Kamesh 

Re: AW: How to find out the rev number where a file was deleted?

2010-11-29 Thread Johan Corveleyn
[ moving to dev@ ]

Following up on a discussion on the users list about the lack of a way
to easily find the rev number in which a file was deleted...

I already referred to issue #3627 (FS API support for oldest-to-youngest
history traversal) and to FS-NG, as mentioned on the roadmap. But the
discussion continued about why this is so hard right now, and whether
there are alternative approaches. See below...

On Mon, Nov 29, 2010 at 3:51 AM, Daniel Shahaf d...@daniel.shahaf.name wrote:
 Johan Corveleyn wrote on Sun, Nov 28, 2010 at 21:20:28 +0100:
 On Sun, Nov 28, 2010 at 6:35 PM, Daniel Shahaf d...@daniel.shahaf.name 
 wrote:
  Stefan Sperling wrote on Sun, Nov 28, 2010 at 16:48:30 +0100:
  The real problem is that we want to be able to answer these questions
  very fast, and some design aspects work against this. For instance,
  FSFS by design does not allow modifying old revisions. So where do
  we store the copy-to information for a given p...@n?
 
  copy-to information is immutable (never changes once created), so we
  could add another hierarchy (parallel to revs/ and revprops/) in which
  to store that information.  Any 'cp f...@n bar' operation would need to
  create/append a file in that hierarchy.
 
  Open question: how to organize $new_hierarchy/16/16384/** to make it
  efficiently appendable and queryable (and for what queries? Iterate
  all copied-to places is one).
 
  Makes sense?

 I'm not sure. But there is another alternative: while we wait for
 FS-NG (or another solution like you propose), one could implement the
 slow algorithm within the current design.

 Are you advocating to implement it in the core (as an svn_fs_* API) or
 as a third-party script?  The latter is certainly fine, but regarding
 the former I don't see the point of adding an API that cannot be
 implemented efficiently at this time.

Why not in the core? "We can't do this quickly, so we don't do it" is
not a very strong argument against having this very useful
functionality IMHO.

Having it in the core is vastly more useful for people like me (and my
colleagues): works on Windows, regardless of whether or not one has
perl/python installed, no need to distribute an additional script,
guaranteed to be available everywhere an svn client is installed, ...

It's actually quite similar to the way blame is implemented
currently: we don't really have the design (line-based information) to
do this quickly, but we calculate it from the other information that
we have available (in a way that could also be done by a script on the
client: diffing every interesting revision against the next,
remembering the lines that were added/removed in every step). Can you
imagine not having blame in svn core just because we can't do it
quickly? Ok, blame may be a more important use case than finding the
rev number where a file was deleted, but still ...

So I still think it's definitely worth it to have this in the core and
offer an API, and implement it slowly now because that's the only way
we can do it (besides, I don't think it will be *that* slow). And
optimize it later when we have FS-NG, or another way to retrieve
this info quickly...

However, having said all that doesn't change the fact that someone
still needs to implement it, and I must admit I don't have the cycles
for that currently :-(.

Cheers,
Johan

 Just automating what a
 user (or script) currently does when looking for this information,
 i.e. a binary search.

 Of course it would be slow, but it would certainly already provide
 value. At the very least, it saves users a lot of time searching FAQ's
 and list archives, wondering why this doesn't work, understanding the
 design limitations, and then finally implementing their own script or
 doing a one-time manual search.

 Then, when FS-NG arrives, or someone comes up with a way to index this
 information, it can be optimized.

 I don't know if there would be fundamental problems with that, apart
 from the fact that someone still needs to implement it of course ...

 Cheers,
 --
 Johan
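
The binary search mentioned above could be sketched roughly as follows.
This is only an illustration, not an existing Subversion API: the names
find_deleted_rev, exists_fn and exists_before are made up for this
example, and the predicate stands in for whatever a real implementation
would use to ask the repository whether the path exists at a given
revision. It assumes the path exists at the lower bound, is gone at the
upper bound, and was deleted only once in between (no re-add).

```c
#include <assert.h>

/* Hypothetical predicate: returns nonzero if the path exists at REV.
   A real one would query the repository; BATON carries its state. */
typedef int (*exists_fn)(long rev, void *baton);

/* Return the first revision in (LO, HI] at which the path is gone.
   Assumes: the path exists at LO, does not exist at HI, and was
   deleted exactly once in between (an assumption of this sketch). */
long
find_deleted_rev(long lo, long hi, exists_fn exists, void *baton)
{
  assert(lo < hi);

  /* Invariant: the path exists at LO and does not exist at HI. */
  while (hi - lo > 1)
    {
      long mid = lo + (hi - lo) / 2;

      if (exists(mid, baton))
        lo = mid;
      else
        hi = mid;
    }

  return hi;
}

/* Stand-in predicate for trying the search without a repository:
   BATON points to the revision in which the path was deleted. */
int
exists_before(long rev, void *baton)
{
  return rev < *(long *)baton;
}
```

With the stand-in predicate and a deletion revision of 1234,
find_deleted_rev(1, 5000, exists_before, &deleted) needs about a dozen
probes, which is why the "slow" approach may not be that slow in
practice; the cost is dominated by how expensive each existence check is.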



Re: AW: How to find out the rev number where a file was deleted?

2010-11-29 Thread Daniel Shahaf
Johan Corveleyn wrote on Mon, Nov 29, 2010 at 10:14:01 +0100:
 Having it in the core is vastly more useful for people like me (and my
 colleagues): works on Windows, regardless of whether or not one has
 perl/python installed, no need to distribute an additional script,
 guaranteed to be available everywhere an svn client is installed, ...
 

You are talking about having the functionality supported by the svn*
binaries.  I was talking about having the functionality supported by
the svn_fs_* API.

I agree these questions are related, but they aren't precisely the same
question.

 It's actually quite similar to the way blame is implemented
 currently: we don't really have the design (line-based information) to
 do this quickly, but we calculate it from the other information that
 we have available (in a way that could also be done by a script on the
 client: diffing every interesting revision against the next,
 remembering the lines that were added/removed in every step).
 

If svn_client_blameN() re-uses its RA session, then it has an advantage
over a shell script that calls 'svn diff' repeatedly.  I agree it still
doesn't have an advantage over a C bindings script that calls
svn_client_diffN() repeatedly.


Apache CMS running latest apreq candidate

2010-11-29 Thread Joe Schaefer
A new CMS service was put into place by the ASF
sysadmins over the past few months, and it makes
very good use of subversion, modperl2 and libapreq2.
To see it in action you need to be an Apache committer
and visit https://cms.apache.org/, but the code
is publicly available at

https://svn.apache.org/repos/infra/websites/cms

Would love feedback on the design and implementation
details, especially as it's based on Subversion and
I had a bit of trouble with some of the existing perl
glue for SVN (e.g., couldn't make heads or tails out of
what the glue for `svn status` produces so I used the shell,
and tainted variables don't play well with the glue either).


  


Re: r1039140 (or, spineless code is for the birds).

2010-11-29 Thread C. Michael Pilato
On 11/29/2010 02:23 PM, Julian Foad wrote:
 Julian Foad wrote:
 We agreed that C-Mike will clean up print_update_summary() and I will
 make the caller reject URL targets in the first place, like Noorul has
 done for other subcommands recently.
 
 r1040232 (mine) and r1040233 (Mike's).  Thanks for raising this, Mike.

Thanks for reviewing the commits flying by so we can ultimately catch stuff
like this.

-- 
C. Michael Pilato cmpil...@collab.net
CollabNet  www.collab.net  Distributed Development On Demand


Re: [PATCH] extend svn_subst_translate_string() to record whether re-encoding and/or line ending translation were performed (v. 2)

2010-11-29 Thread Danny Trebbien
Attached is a benchmark and Makefile that I used to test the speed of
svn_subst_translate_string() from trunk versus the new
svn_subst_translate_string2().  The program reads a text file named
`2600.txt` in the current working directory and repeatedly calls
svn_subst_translate_string() on the contents.  For `2600.txt`, I used
the plain text version of War and Peace from Project Gutenberg
(http://www.gutenberg.org/ebooks/2600.txt.utf8).

The data that I generated for tr...@1040115 were:
trunk_at_1040115 <- c(778, 791, 787, 766, 784,
776, 762, 750, 786, 780, 764, 774,
776, 785, 801, 780, 773, 770, 790,
776, 779, 797, 770, 771, 799, 783,
778, 781, 773, 760)

The data for the HEAD sources (commit
6f828b0a4e07d1e14189b9b8c84bd0f884c59164 from my repo;
https://github.com/dtrebbien/subversion/tree/6f828b0a4e07d1e14189b9b8c84bd0f884c59164)
were:
HEAD <- c(805, 823, 798, 815, 795, 860,
808, 842, 800, 802, 842, 796, 801,
820, 808, 849, 819, 792, 782, 778,
788, 854, 797, 825, 883, 854, 831,
827, 801, 799)
Note:  This is not version 3 of the patch.  It is essentially
tr...@1040115 plus version 3 plus this changeset:
https://github.com/dtrebbien/subversion/commit/d22329a54dcf58cddc2b618f913597c6defbcb2d

The t-test allows us to conclude with high confidence that the mean
time to run the benchmark with libsvn_subr-1 compiled from
tr...@1040115 is less than the mean time to run the benchmark with
libsvn_subr-1 compiled from the HEAD sources:
> t.test(trunk_at_1040115, HEAD, alternative = "less", var.equal = TRUE,
         conf.level = 0.90)

Two Sample t-test

data:  trunk_at_1040115 and HEAD
t = -7.473, df = 58, p-value = 2.350e-10
alternative hypothesis: true difference in means is less than 0
90 percent confidence interval:
  -Inf -317939.7
sample estimates:
mean of x mean of y
  778   8164667

I realized, however, that this is not a fair comparison because the
HEAD sources simply call svn_subst_translate_string2() within
svn_subst_translate_string(), meaning that there is an extra layer of
indirection.  After modifying the benchmark to call
svn_subst_translate_string2() directly, I generated these timings:
HEAD_new <- c(785, 789, 808, 798, 782, 788,
785, 754, 847, 823, 841, 788, 741,
749, 742, 765, 743, 743, 753, 772,
794, 778, 807, 784, 787, 797, 769,
791, 786, 762)

Now we cannot reject the null hypothesis that the mean time to run the
benchmark with libsvn_subr-1 compiled from tr...@1040115 is greater
than or equal to the mean time to run the modified benchmark with
libsvn_subr-1 compiled from the HEAD sources:
> t.test(trunk_at_1040115, HEAD_new, alternative = "less", var.equal = TRUE,
         conf.level = 0.90)

Two Sample t-test

data:  trunk_at_1040115 and HEAD_new
t = -0.6839, df = 58, p-value = 0.2484
alternative hypothesis: true difference in means is less than 0
90 percent confidence interval:
 -Inf 33129.55
sample estimates:
mean of x mean of y
  778   7817000


One other set of timings that I generated were for the modified
benchmark running with libsvn_subr-1 compiled from the HEAD sources,
slightly modified to set `repair` to TRUE:
HEAD_new_repair <- c(766, 756, 757, 754, 767,
779, 746, 784, 806, 779, 800, 783,
837, 801, 773, 780, 790, 773, 773,
779, 775, 793, 786, 781, 793, 784,
789, 746, 779, 773)

We cannot reject the null hypothesis that the mean time to run the
modified benchmark with libsvn_subr-1 compiled from the HEAD sources
is the same as the mean time to run the modified benchmark with
libsvn_subr-1 compiled from slightly-modified HEAD sources (`repair`
is set to TRUE):
> t.test(HEAD_new, HEAD_new_repair, var.equal = TRUE, conf.level = 0.90)

Two Sample t-test

data:  HEAD_new and HEAD_new_repair
t = 0.3815, df = 58, p-value = 0.7042
alternative hypothesis: true difference in means is not equal to 0
90 percent confidence interval:
 -4.74 123774.74
sample estimates:
mean of x mean of y
  7817000   7794000

> t.test(trunk_at_1040115, HEAD_new_repair, alternative = "less",
         var.equal = TRUE, conf.level = 0.90)

Two Sample t-test

data:  trunk_at_1040115 and HEAD_new_repair
t = -0.3501, df = 58, p-value = 0.3638
alternative hypothesis: true difference in means is less than 0
90 percent confidence interval:
 -Inf 37836.36
sample estimates:
mean of x mean of y
  778   7794000

Therefore, I do not have evidence to support my earlier claim:  3.)
This penalizes repair translations.

My conclusion from all of this is that regardless of the value of
`repair`, my changes do not appear to decrease 

SQLite and callbacks

2010-11-29 Thread Hyrum K. Wright
We use callbacks extensively throughout our code as a means of
providing streamy feedback to callers.  It's a pretty good paradigm,
and one that has served us well.  We don't put many restrictions on
what the callbacks can do in terms of fetching more information or
calling other functions.

Enter wc-ng.

Stefan's patch to make a recursive proplist much more performant
highlights the great benefit that our sqlite-backed storage can have.
However, he reverted it due to concerns about the potential for
database contention.  The theory was that the callback might try and
call additional wc functions to get more information, and such nested
statements weren't healthy for sqlite.  We talked about it for a bit
in IRC this morning, and the picture raised by this issue was quite
dire.

In an attempt to find out what the consequences of these nested
queries are, I wrote a test program to attempt to demonstrate the
failure, only now I can't seem to do so.  Attached is the test
program, but when I run it, I'm able to successfully execute multiple
prepared statements on the same set of rows simultaneously, which was
the concern we had about our callback mechanism in sqlite.

So is this a valid problem?  If so, could somebody use the attached
test program to illustrate it for those of us who may not fully
understand the situation?

Thanks,
-Hyrum
#include <stdio.h>

#include <sqlite3.h>

#define CHECK_ERR  \
  if (sqlite3_errcode(db) && \
      (sqlite3_errcode(db) != SQLITE_ROW) && \
      (sqlite3_errcode(db) != SQLITE_DONE))  \
    fprintf(stderr, "%d: %d: %s\n", __LINE__, sqlite3_errcode(db), sqlite3_errmsg(db));

#define TEST_DATA \
  "create table foo (num int, message text);"   \
  ""   \
  "insert into foo values (1, 'A is for Allegator');"  \
  "insert into foo values (2, 'B is for Bayou');"  \
  "insert into foo values (3, 'C is for Cyprus Trees');"  \
  "insert into foo values (4, 'D is for Dew');"  \
  "insert into foo values (5, 'E is for Everything like');"  \
  "insert into foo values (6, 'F Ferns or');"  \
  "insert into foo values (7, 'G Grass that''s');"  \
  "insert into foo values (8, 'H Home to you');"

void callback(sqlite3 *db, int num)
{
  const char *query = "select message from foo where num = ?;";
  sqlite3_stmt *stmt;
  const unsigned char *msg;

  printf("Got number: %d, now getting message\n", num);

  sqlite3_prepare_v2(db, query, -1, &stmt, NULL);
  CHECK_ERR;

  sqlite3_bind_int(stmt, 1, num);
  CHECK_ERR;

  sqlite3_step(stmt);
  CHECK_ERR;

  msg = sqlite3_column_text(stmt, 0);
  CHECK_ERR;

  printf("Message: %s\n", msg);

  sqlite3_finalize(stmt);
  CHECK_ERR;
}

void get_numbers(sqlite3 *db,
                 void (*callback)(sqlite3 *, int))
{
  const char *query = "select num from foo;";
  sqlite3_stmt *stmt;
  int code;

  sqlite3_prepare_v2(db, query, -1, &stmt, NULL);
  CHECK_ERR;

  code = sqlite3_step(stmt);
  CHECK_ERR;
  while (code == SQLITE_ROW)
    {
      int number = sqlite3_column_int(stmt, 0);
      callback(db, number);

      code = sqlite3_step(stmt);
      CHECK_ERR;
    }

  sqlite3_finalize(stmt);
  CHECK_ERR;
}


int
main(int argc, char *argv[])
{
  sqlite3 *db;

  remove("test.db");

  sqlite3_open_v2("test.db", &db, SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE,
                  NULL);
  CHECK_ERR;

  sqlite3_extended_result_codes(db, 1);
  CHECK_ERR;

  sqlite3_exec(db, TEST_DATA, NULL, NULL, NULL);
  CHECK_ERR;

  get_numbers(db, callback);

  sqlite3_close(db);

  return 0;
}


Re: svn commit: r1028084 - /subversion/trunk/build/ac-macros/java.m4

2010-11-29 Thread Blair Zajac

On 10/27/2010 01:16 PM, jwhitl...@apache.org wrote:

Author: jwhitlock
Date: Wed Oct 27 20:16:36 2010
New Revision: 1028084

URL: http://svn.apache.org/viewvc?rev=1028084&view=rev
Log:
Fixed the problem where compiling JavaHL on OS X fails after Apple's latest
Java update.

* build/ac-macros/java.m4: Added two new paths to check for the jni.h when
on the Darwin platform.


Should we also add URLs to the download location for the Java developer 
kits for 10.5 and 10.6, as people may not know that they need to install 
it?  Or add it to the INSTALL notes?


10.6:

https://connect.apple.com/cgi-bin/WebObjects/MemberSite.woa/wa/getSoftware?bundleID=20719

10.5:

https://connect.apple.com/cgi-bin/WebObjects/MemberSite.woa/wa/getSoftware?bundleID=20720

Blair


Re: svn commit: r1028084 - /subversion/trunk/build/ac-macros/java.m4

2010-11-29 Thread Jeremy Whitlock
On Mon, Nov 29, 2010 at 6:50 PM, Blair Zajac bl...@orcaware.com wrote:
 On 10/27/2010 01:16 PM, jwhitl...@apache.org wrote:

 Author: jwhitlock
 Date: Wed Oct 27 20:16:36 2010
 New Revision: 1028084

 URL: http://svn.apache.org/viewvc?rev=1028084&view=rev
 Log:
 Fixed the problem where compiling JavaHL on OS X fails after Apple's latest
 Java update.

 * build/ac-macros/java.m4: Added two new paths to check for the jni.h when
on the Darwin platform.

 Should we also add URLs to the download location for the Java developer kits
 for 10.5 and 10.6, as people may not know that they need to install it?  Or
 add it to the INSTALL notes?

It would be interesting to see a proof of this idea.  I mean, I could
see this getting very verbose and then I'd suggest an INSTALL mention.
 If it weren't very verbose, outputting it as part of a configure
failure could be ideal.


1.7.x bug - svn add no longer accepts wildcards?

2010-11-29 Thread Daniel Becroft
Hi,

During testing of a 1.7.x build, I've noticed that 'svn add' on trunk no
longer accepts wildcards:

svn 1.6.x:
   D:\temp\svn_sandpit\workingcopy\trunk\A>echo 111  1>alpha.txt
   D:\temp\svn_sandpit\workingcopy\trunk\A>echo 111  1>beta.txt
   D:\temp\svn_sandpit\workingcopy\trunk\A>svn add *.txt
   A alpha.txt
   A beta.txt

Whereas 1.7.x gives:
   D:\temp\svn_sandpit\workingcopy\trunk\A>echo 111  1>alpha.txt
   D:\temp\svn_sandpit\workingcopy\trunk\A>echo 111  1>beta.txt
   D:\temp\svn_sandpit\workingcopy\trunk\A>D:\temp\svn_sandpit\svn7.exe add *.txt
   svn: warning: 'D:\temp\svn_sandpit\workingcopy\trunk\A\*.txt' not found

I thought that wildcards were expanded by the OS/Shell, and then passed to
SVN. Is this not the case?

Cheers,
---
Daniel Becroft


Re: svn commit: r1037738 - Summary of updates

2010-11-29 Thread Daniel Shahaf
C. Michael Pilato wrote on Mon, Nov 29, 2010 at 14:17:13 -0500:
 On 11/26/2010 01:01 AM, Daniel Shahaf wrote:
  Thanks for this :-).  It works as expected when the current directory is
  entered by its name, but not when it's entered through a symlink:
  
  [[[
  % cd /tmp
  
  % ln -s wc1 wcalias
  
  % cd wc1
  
  % $svn up /tmp/wc1/trunk/iota
  Updating 'trunk/iota' ...
  
  % cd /tmp/wcalias
  
  % $svn up /tmp/wc1/trunk/iota
  Updating 'trunk/iota' ...
  
  % $svn up /tmp/wcalias/trunk/iota
  Updating '/tmp/wcalias/trunk/iota' ...
  ]]]
  
  
  Ideally, the last output would have used the relative path 'trunk/iota',
  too.  However, I'm not sure how to easily solve that --- is it as easy
  as calling some "resolve symlinks" function on the absolute-cwd string?
 
 Do we have a "resolve symlinks" function?

I don't see anything in APR or in svn_io.h.  Can we use realpath(3)?  Or
should we roll our own?

Daniel
(I rolled a prototype here)
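
For the sake of discussion, here is a rough sketch of what the
realpath(3) route might look like. It is an assumption-laden
illustration, not Subversion code: skip_ancestor and relative_to_cwd are
names invented for this example, plain char buffers are used instead of
pools, and realpath() is POSIX-only, so Windows would need a different
code path. The idea is to resolve symlinks in both the current directory
and the target, and only then strip the common prefix.

```c
#define _XOPEN_SOURCE 700   /* for realpath() under strict C modes */

#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Return the part of CHILD below PARENT, "" if they are the same path,
   or NULL if CHILD is not inside PARENT.  Both arguments must already
   be canonical absolute paths (hypothetical helper for this sketch). */
const char *
skip_ancestor(const char *parent, const char *child)
{
  size_t len = strlen(parent);

  if (strncmp(parent, child, len) != 0)
    return NULL;
  if (child[len] == '\0')
    return child + len;                /* same path */
  if (len > 0 && parent[len - 1] == '/')
    return child + len;                /* PARENT is the root "/" */
  if (child[len] == '/')
    return child + len + 1;            /* skip the separator */
  return NULL;                         /* e.g. /tmp/wc1 vs /tmp/wc10 */
}

/* Resolve symlinks in CWD and TARGET, then write TARGET relative to
   CWD into OUT.  Returns 0 on success, -1 if either path cannot be
   resolved (realpath() needs the path to exist) or if TARGET lies
   outside CWD. */
int
relative_to_cwd(const char *cwd, const char *target, char out[PATH_MAX])
{
  char real_cwd[PATH_MAX];
  char real_target[PATH_MAX];
  const char *rel;

  if (realpath(cwd, real_cwd) == NULL
      || realpath(target, real_target) == NULL)
    return -1;

  rel = skip_ancestor(real_cwd, real_target);
  if (rel == NULL)
    return -1;

  strcpy(out, rel);   /* REL is a suffix of REAL_TARGET, so it fits */
  return 0;
}
```

In the example from the transcript, both /tmp/wcalias and /tmp/wc1 would
resolve to the same directory, so the target under /tmp/wcalias would
come out as 'trunk/iota'. Whether resolving the user's paths this way is
desirable in all cases is a separate question.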