Re: [PATCH] issue #3469: tree conflict under a directory external

2011-03-28 Thread Daniel Shahaf
'Daniel Shahaf' wrote on Mon, Mar 28, 2011 at 00:16:49 +0200:
> Bert Huijben wrote on Sun, Mar 27, 2011 at 19:26:50 +0200:
> > and there are easier (and more
> > performant) ways to check if a node is part of a working copy (e.g. helpers
> > to check if a path is switched/a wc root). 
> 
> Could you elaborate on these please?  I looked through the header files
> and am not sure which helpers you're referring to.

New patch, incorporating your previous feedback, modulo the above point:

[[[
Resolve issue #3469 (tree conflict under a directory external).

Review by: rhuijben

* subversion/libsvn_client/merge.c
  (merge_cmd_baton_t): Add TARGET_WCROOT_ABSPATH member.
  (do_merge): Initialize new member.
  (merge_file_added):
Skip files that belong to externals or to disjoint working copies.
Exploit knowledge of libsvn_wc implementation details (queries on
directories are faster) to make the check cheaper.

TODO: remove @XFail decorator
(from tree_conflict_tests.py 22)
]]]

[[[
Index: subversion/libsvn_client/merge.c
===================================================================
--- subversion/libsvn_client/merge.c (revision 1086475)
+++ subversion/libsvn_client/merge.c (working copy)
@@ -249,6 +249,8 @@ typedef struct merge_cmd_baton_t {
  versioned dir (dry-run only) */
   const char *target_abspath; /* Absolute working copy target of
  the merge. */
+  const char *target_wcroot_abspath;  /* Absolute path to root of wc that
+ TARGET_ABSPATH belongs to. */
 
   /* The left and right URLs and revs.  The value of this field changes to
  reflect the merge_source_t *currently* being merged by do_merge(). */
@@ -1555,6 +1557,32 @@ merge_file_added(const char *local_dir_abspath,
   if (tree_conflicted)
 *tree_conflicted = FALSE;
 
+  /* Easy out: not the same working copy.  (So, a disjoint working copy or
+ an external.) */
+  {
+const char *mine_wcroot_abspath;
+
+/* ### White-box optimization:
+
+   The code knows that MINE_ABSPATH is not a directory (it's an added
+   file), and we know that internally libsvn_wc queries are faster for
+   directories (this is an implementation detail).  Therefore, query for
+   the wcroot of the containing directory of MINE_ABSPATH.
+   
+   (We rely on the implementation detail only for performance, not for
+   correctness; under any implementation it would be valid to query for
+   the parent's wcroot.)
+ */
+SVN_ERR(svn_wc_get_wc_root(&mine_wcroot_abspath, merge_b->ctx->wc_ctx,
+   svn_dirent_dirname(mine_abspath, scratch_pool),
+   scratch_pool, scratch_pool));
+if (strcmp(mine_wcroot_abspath, merge_b->target_wcroot_abspath))
+  {
+*content_state = svn_wc_notify_state_obstructed;
+return SVN_NO_ERROR;
+  }
+  }
+
   /* Apply the prop changes to a new hash table. */
   file_props = apr_hash_copy(subpool, original_props);
   for (i = 0; i < prop_changes->nelts; ++i)
@@ -8653,6 +8685,9 @@ do_merge(apr_hash_t **modified_subtrees,
   merge_cmd_baton.target_missing_child = FALSE;
   merge_cmd_baton.reintegrate_merge = reintegrate_merge;
   merge_cmd_baton.target_abspath = target_abspath;
+  SVN_ERR(svn_wc_get_wc_root(&merge_cmd_baton.target_wcroot_abspath,
+ ctx->wc_ctx, merge_cmd_baton.target_abspath,
+ pool, subpool));
   merge_cmd_baton.pool = subpool;
   merge_cmd_baton.merge_options = merge_options;
   merge_cmd_baton.diff3_cmd = diff3_cmd;
]]]


[l10n] Translation status report for trunk r1086468

2011-03-28 Thread Subversion Translation Status
Translation status report for trunk@r1086468

  lang    trans  untrans    fuzzy      obs
------------------------------------------
    de     2049      135      255      217
    es     1986      198      285      355
    fr     2171       13       30       24
    it     1840      344      479      174
    ja     1978      206      354      602
    ko     2119       65        6        1
    nb     2039      145      213      326
    pl     2072      112      166       99
 pt_BR     1808      376      499      166
    sv     1762      422      521      174
 zh_CN     2180        4        5        1
 zh_TW     1741      443      557      236


Re: Performance benchmarks

2011-03-28 Thread Greg Stein
On Mon, Mar 28, 2011 at 18:51, John Beranek  wrote:
> On 28/03/2011 23:45, Greg Stein wrote:
>> On Mon, Mar 28, 2011 at 17:42, John Beranek  wrote:
>>> On 25/03/2011 17:33, Mark Phippard wrote:
 Hi,

 I have been working on a framework for writing tests to record
 performance.  I have something good enough to share:
>>>
>>> May I make an observation about these benchmarks...?
>>>
>>> When I provided some benchmarks that included 'checkout' tests I was
>>> specifically asked to make tests that separate WC and RA functionality.
>>>
>>> I did this, released results, and the (portable) benchmark code.
>>>
>>> Now Mark has released a new set of benchmarks, which don't separate WC
>>> and RA functionality. No one has (yet) noted this fact. ;)
>>
>> I think your benchmarks are going to be more helpful for us to locate
>> hotspots and get them fixed. Mark's seem more high-level, for
>> policy-making rather than coding.
>>
>> Did your benchmark scripts get checked in? (I've been out a couple
>> weeks and may have missed that) And whether they did or not, would you
>> want commit access to get them committed, and/or continue work on them
>> within the svn repository?
>
> I checked them into a Git repository, both for ease of repository
> creation, and for ease of cloning. Of course, hosting SVN tools in Git
> may be seen as sacrilegious by some... ;)

Heh. I certainly don't mind. They just aren't going to get a lot of
attention outside of our own repository, I think.

At a minimum, what's the URL?

Cheers,
-g


Re: Performance benchmarks

2011-03-28 Thread John Beranek
On 28/03/2011 23:45, Greg Stein wrote:
> On Mon, Mar 28, 2011 at 17:42, John Beranek  wrote:
>> On 25/03/2011 17:33, Mark Phippard wrote:
>>> Hi,
>>>
>>> I have been working on a framework for writing tests to record
>>> performance.  I have something good enough to share:
>>
>> May I make an observation about these benchmarks...?
>>
>> When I provided some benchmarks that included 'checkout' tests I was
>> specifically asked to make tests that separate WC and RA functionality.
>>
>> I did this, released results, and the (portable) benchmark code.
>>
>> Now Mark has released a new set of benchmarks, which don't separate WC
>> and RA functionality. No one has (yet) noted this fact. ;)
> 
> I think your benchmarks are going to be more helpful for us to locate
> hotspots and get them fixed. Mark's seem more high-level, for
> policy-making rather than coding.
> 
> Did your benchmark scripts get checked in? (I've been out a couple
> weeks and may have missed that) And whether they did or not, would you
> want commit access to get them committed, and/or continue work on them
> within the svn repository?

I checked them into a Git repository, both for ease of repository
creation, and for ease of cloning. Of course, hosting SVN tools in Git
may be seen as sacrilegious by some... ;)

John.

-- 
John Beranek To generalise is to be an idiot.
http://redux.org.uk/ -- William Blake





Re: Performance benchmarks

2011-03-28 Thread Greg Stein
On Mon, Mar 28, 2011 at 17:42, John Beranek  wrote:
> On 25/03/2011 17:33, Mark Phippard wrote:
>> Hi,
>>
>> I have been working on a framework for writing tests to record
>> performance.  I have something good enough to share:
>
> May I make an observation about these benchmarks...?
>
> When I provided some benchmarks that included 'checkout' tests I was
> specifically asked to make tests that separate WC and RA functionality.
>
> I did this, released results, and the (portable) benchmark code.
>
> Now Mark has released a new set of benchmarks, which don't separate WC
> and RA functionality. No one has (yet) noted this fact. ;)

I think your benchmarks are going to be more helpful for us to locate
hotspots and get them fixed. Mark's seem more high-level, for
policy-making rather than coding.

Did your benchmark scripts get checked in? (I've been out a couple
weeks and may have missed that) And whether they did or not, would you
want commit access to get them committed, and/or continue work on them
within the svn repository?

Cheers,
-g


Re: Performance benchmarks

2011-03-28 Thread John Beranek
On 28/03/2011 23:00, Mark Phippard wrote:
> On Mon, Mar 28, 2011 at 5:42 PM, John Beranek  wrote:
>> On 25/03/2011 17:33, Mark Phippard wrote:
>>> Hi,
>>>
>>> I have been working on a framework for writing tests to record
>>> performance.  I have something good enough to share:
>>
>> May I make an observation about these benchmarks...?
>>
>> When I provided some benchmarks that included 'checkout' tests I was
>> specifically asked to make tests that separate WC and RA functionality.
>>
>> I did this, released results, and the (portable) benchmark code.
> 
> If your point is why I didn't use your code, it is because it is in
> Perl and I do not know Perl.  I also did not see any conversation
> happening around your benchmarks (or else I would not have bothered to
> try to get things going again).

Don't get me wrong, I'm not trying to put down your efforts, just
restating some things from the previous discussion that did seem to make
sense to me.

For reference, the thread I'm talking about was entitled "Subversion
trunk (r1078338) HTTP(/WC?) performance problems?"
.

> I have tried to make it clear that this is just something I decided to
> work on to help.  Whether it means anything or not or whether we use
> these benchmarks to make decisions remains to be seen.  Feel free to
> try to revive discussion around the tests you wrote, I will not be
> offended.

My benchmark script certainly can't be considered "finished", but I had
added extra individual and averaged tests to it. The tests that I got
around to implementing were mostly RA-related, but I could certainly
look at adding some more tests on the WC side. I was somewhat shot down
for showing WC performance problems, because of a "We know about that"
sentiment.

>> Now Mark has released a new set of benchmarks, which don't separate WC
>> and RA functionality. No one has (yet) noted this fact. ;)
> 
> I focused my tests on WC functions.  I am not sure what you mean by RA
> functionality.  Some of our biggest problems are in walking the tree
> during things like update and commit.

Well, as I understand it, a 'checkout' over HTTP is affected both by RA
performance _and_ WC performance. So, my benchmarks were modified to do
'export' instead, to separate out the RA component. Equally if you
wanted to separate out the WC component, you'd do 'checkout' operations
with ra_local, or (as in your tests) other WC operations like
'proplist', 'status' etc.

> Anyway, I was not trying to offend you.  I just wanted to help and I
> have no desire to learn Perl (or even Python which obviously would
> have been preferred).  I was going to find the email where you posted
> your tests, but since I never recalled anyone else running them or
> discussing them I did not see the benefit in doing so and it would not
> have accomplished my goal to help.

No offence taken, I too am just trying to help the testing effort.

John.

-- 
John Beranek To generalise is to be an idiot.
http://redux.org.uk/ -- William Blake


Re: Performance tests - 1.7, serf and HTTPv2

2011-03-28 Thread Greg Stein
On Mon, Mar 28, 2011 at 18:14, Philip Martin  wrote:
> Mark Phippard  writes:
>
>> There is one issue that I want to raise that I do not think is an
>> outlier.  I was expecting HTTPv2 to yield significant improvements,
>> and so I stopped the Apache server after each test so I could grab its
>> logs.  I wanted to be able to show how much the HTTP traffic was
>> reduced.  I have not done this yet, but for Serf it looks like the
>> logs were bigger.  The issue I want to raise though is about Serf in
>> general.  Running these benchmarks with Neon yields an Access log of
>> about 102KB and a Subversion log of about 3KB.  Running the benchmarks
>> with Serf yields an Access log over 12MB and a Subversion log over 5
>> MB.
>
> Yes, we discussed this earlier this month.  Serf makes more requests
> than neon for checkout/export/update, a GET and PROPFIND per file versus
> a single custom REQUEST.  My measurements using Subversion trunk as a
> dataset showed serf slightly faster than neon.  John's measurements with
> part of the Linux source tree showed serf slower than neon.  When I
> switched to his dataset I got the same result as he did.  The obvious
> difference between the datasets is that his was about half the size of
> mine but had about twice as many files.  So the serf approach of making
> lots of requests appears to be less efficient on lots of small files.

I'd be interested to see whether the checkout speed is client-bound or
server-bound. I'm assuming it is client, right?

The multiple request approach allows the server to provide for more
clients simultaneously. Lots of smaller requests rather than chugging
on one big response. For even larger provisioning, multiple servers
against one filesystem can service yet more clients simultaneously.

It also helps with proxying/caching, though Mark says SSL obviates
that (which I dispute, but it *is* valid to note that it is uncommon
and tricky to set up).

Lastly, the many-requests approach works better if the connection
drops for some reason and needs to be restarted. ra_serf can request
the bits that were dropped, while neon has to restart the whole
process (though I'm not even sure that neon has automatic
re-connection logic).

Cheers,
-g


Re: Performance benchmarks

2011-03-28 Thread Johan Corveleyn
Hi Mark,

Here is another data point, for my old (t)rusty Windows XP (32-bit)
this time, on a system with a pretty slow hard disk (5.4k rpm), 1.83
GHz Intel T2400 cpu, 3 GB RAM.

I must say the results look very good for 1.7 (r1086021) compared to
1.6.16 on this system. Especially for the "Folder tests" and the
"Binaries tests" (rather: you can see how much it hurts sometimes to
use 1.6.x on this kind of system, when working with working copies
with lots of folders). Apart from "Bin-commit" and "Bin-rm", I can't
see a significant performance degradation within your test-suite on my
system.

Both test runs were run with antivirus enabled (AVG free). I also did
a run of 1.7 with AV disabled, but it made no significant difference.
I rebooted between each test run just to be sure about disk caching
effects.

Oh, and I made my 1.7 build with "Release" configuration. At least on
Windows, I know there is a significant performance difference between
"Debug" builds and "Release" builds (that became clear during my
perf-work on "svn diff"). Best to compare release builds with release
builds, I think...

Cheers,
-- 
Johan
=== TEST RESULTS ==
SVN Version: 1.7.0-dev


Tests: Basic Tests
            Action        Time    Millis
------------------  ----------  --------
  Checkout:  0:41.735 41735
Update:  0:41.468 41468
Switch:  0:00.407   407
  Proplist:  0:00.437   437
Status:  0:00.375   375
Commit:  0:01.516  1516
svnversion:  0:00.437   437

Tests: Merge Tests
            Action        Time    Millis
------------------  ----------  --------
 Merge-all:  0:13.734 13734
  Merge-revert:  0:04.094  4094
   Merge-synch:  0:08.907  8907
 Merge-reintegrate:  0:15.719 15719

Tests: Folder Tests
            Action        Time    Millis
------------------  ----------  --------
        Folder-co:    8:09.922    489922
        Folder-st:    0:02.000      2000
        Folder-ci:    0:15.625     15625
        Folder-up:    0:05.203      5203
       svnversion:    0:03.547      3547

Tests: Binaries Tests
            Action        Time    Millis
------------------  ----------  --------
           Bin-co:    4:36.109    276109
       Bin-up-r25:    0:22.172     22172
           Bin-sw:    3:33.875    213875
      Bin-cleanup:    0:00.172       172
           Bin-rm:    0:03.890      3890
           Bin-st:    0:00.454       454
       Bin-commit:    0:04.265      4265
           Bin-mv:    0:03.703      3703
        Bin-st-mv:    0:00.453       453
       Bin-commit:    0:07.219      7219
       svnversion:    0:00.610       610

===  END RESULTS ==
  Total execution time: 21:38.859   1298859

Results in wiki format:

Basic Tests:
| 1.7.0-dev | r | 0:41.735 | 0:41.468 | 0:00.407 | 0:00.437 | 0:00.375 
| 0:01.516 | 0:00.437

Merge Tests:
| 1.7.0-dev | r | 0:13.734 | 0:04.094 | 0:08.907 | 0:15.719

Folder Tests:
| 1.7.0-dev | r | 8:09.922 | 0:02.000 | 0:15.625 | 0:05.203 | 0:03.547

Binaries Tests:
| 1.7.0-dev | r | 4:36.109 | 0:22.172 | 3:33.875 | 0:00.172 | 0:03.890 
| 0:00.454 | 0:04.265 | 0:03.703 | 0:00.453 | 0:07.219 | 0:00.610
=== TEST RESULTS ==
SVN Version: 1.6.16-SlikSvn-tag-1.6.16@1076804-WIN32


Tests: Basic Tests
            Action        Time    Millis
------------------  ----------  --------
  Checkout:  1:05.328 65328
Update:  0:57.422 57422
Switch:  0:02.094  2094
  Proplist:  0:01.312  1312
Status:  0:00.407   407
Commit:  0:02.703  2703
svnversion:  0:00.812   812

Tests: Merge Tests
            Action        Time    Millis
------------------  ----------  --------
 Merge-all:  0:12.204 12204
  Merge-revert:  0:07.250  7250
   Merge-synch:  0:11.140 11140
 Merge-reintegrate:  0:21.797 21797

Tests: Folder Tests
            Action        Time    Millis
------------------  ----------  --------
        Folder-co:   20:41.344   1241344
        Folder-st:    1:01.140     61140
        Folder-ci:    0:58.297     58297
        Folder-up:    1:57.938    117938
       svnversion:    0:27.718     27718

Tests: Binaries Tests
            Action        Time    Millis
------------------  ----------  --------
           Bin-co:   10:08.313    608313
       Bin-up-r25:    3:30.593    210593
           Bin-sw:    9:06.563    546563
      Bin-cleanup:    0:02.015      2015
           Bin-rm:    0:02.610      2610
           Bin-st:    0:00.454       454
       Bin-commit:    0:07.500      7500
           Bin-mv:    0:06.328

Re: Performance tests - 1.7, serf and HTTPv2

2011-03-28 Thread Philip Martin
Mark Phippard  writes:

> There is one issue that I want to raise that I do not think is an
> outlier.  I was expecting HTTPv2 to yield significant improvements,
> and so I stopped the Apache server after each test so I could grab its
> logs.  I wanted to be able to show how much the HTTP traffic was
> reduced.  I have not done this yet, but for Serf it looks like the
> logs were bigger.  The issue I want to raise though is about Serf in
> general.  Running these benchmarks with Neon yields an Access log of
> about 102KB and a Subversion log of about 3KB.  Running the benchmarks
> with Serf yields an Access log over 12MB and a Subversion log over 5
> MB.

Yes, we discussed this earlier this month.  Serf makes more requests
than neon for checkout/export/update, a GET and PROPFIND per file versus
a single custom REQUEST.  My measurements using Subversion trunk as a
dataset showed serf slightly faster than neon.  John's measurements with
part of the Linux source tree showed serf slower than neon.  When I
switched to his dataset I got the same result as he did.  The obvious
difference between the datasets is that his was about half the size of
mine but had about twice as many files.  So the serf approach of making
lots of requests appears to be less efficient on lots of small files.

-- 
Philip


Re: Performance benchmarks

2011-03-28 Thread Mark Phippard
On Mon, Mar 28, 2011 at 5:42 PM, John Beranek  wrote:
> On 25/03/2011 17:33, Mark Phippard wrote:
>> Hi,
>>
>> I have been working on a framework for writing tests to record
>> performance.  I have something good enough to share:
>
> May I make an observation about these benchmarks...?
>
> When I provided some benchmarks that included 'checkout' tests I was
> specifically asked to make tests that separate WC and RA functionality.
>
> I did this, released results, and the (portable) benchmark code.

If your point is why I didn't use your code, it is because it is in
Perl and I do not know Perl.  I also did not see any conversation
happening around your benchmarks (or else I would not have bothered to
try to get things going again).

I have tried to make it clear that this is just something I decided to
work on to help.  Whether it means anything or not or whether we use
these benchmarks to make decisions remains to be seen.  Feel free to
try to revive discussion around the tests you wrote, I will not be
offended.

> Now Mark has released a new set of benchmarks, which don't separate WC
> and RA functionality. No one has (yet) noted this fact. ;)

I focused my tests on WC functions.  I am not sure what you mean by RA
functionality.  Some of our biggest problems are in walking the tree
during things like update and commit.  So we have to run those
commands in order to see the performance issues.  I have created WC's
with three different "shapes" to show different areas where there
might be problems.  I have avoided all purely RA functions like log
and I also do not bother to repeatedly show the results for commands
like checkout.  I generally show it once for each shape working copy
and then just skip reporting how long it took in subsequent tests.

Anyway, I was not trying to offend you.  I just wanted to help and I
have no desire to learn Perl (or even Python which obviously would
have been preferred).  I was going to find the email where you posted
your tests, but since I never recalled anyone else running them or
discussing them I did not see the benefit in doing so and it would not
have accomplished my goal to help.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/


Re: Performance benchmarks

2011-03-28 Thread John Beranek
On 25/03/2011 17:33, Mark Phippard wrote:
> Hi,
> 
> I have been working on a framework for writing tests to record
> performance.  I have something good enough to share:

May I make an observation about these benchmarks...?

When I provided some benchmarks that included 'checkout' tests I was
specifically asked to make tests that separate WC and RA functionality.

I did this, released results, and the (portable) benchmark code.

Now Mark has released a new set of benchmarks, which don't separate WC
and RA functionality. No one has (yet) noted this fact. ;)

Cheers,

John.

-- 
John Beranek To generalise is to be an idiot.
http://redux.org.uk/ -- William Blake


Re: [PATCH] allow svnsync to translate non-UTF-8 log messages to UTF-8 (v. 5.1.1)

2011-03-28 Thread Gavin Baumanis
Thanks Hyrum,

Appreciate the update.
I'll remove it from my "watch" list.


On 28/03/2011, at 11:58 PM, Hyrum K Wright wrote:

> Actually, it did.  Mike committed it in r1084335, and reported that
> fact in this thread on Mar 22.
> 
> Best,
> -Hyrum
> 
> On Sun, Mar 27, 2011 at 7:27 PM, Gavin Baumanis  
> wrote:
>> Ping.
>> This patch submission has received no comments.
>> 
>> 
>> 
>> On 22/03/2011, at 8:27 AM, Danny Trebbien wrote:
>> 
>>> On Mon, Mar 21, 2011 at 10:20 AM, Philip Martin
>>>  wrote:
 Danny Trebbien  writes:
> Attached are the patch, the log message, and the two TGZ archives of
> DUMP files (for the tests).
 
 The patch is missing.
 
 --
 Philip
>>> 
>>> Sorry.  Attached is the patch with a .txt extension.
>>> 
>> 
>> 

Gavin "Beau" Baumanis
E: gav...@thespidernet.com




Performance tests - 1.7, serf and HTTPv2

2011-03-28 Thread Mark Phippard
I ran the benchmark test against an http 1.6 and 1.7 server.  Details are here:

https://ctf.open.collab.net/sf/wiki/do/viewPage/projects.csvn/wiki/HTTPv2

The results are not encouraging.

* Compared to 1.6, checkout seems to be the biggest problem.

* HTTPv2 seems to help Neon more than it does Serf.  In many cases,
both are slower.

* Neon is generally still faster than Serf.  In some cases, the HTTPv2
changes seem to have widened the gap.

As I noted on the wiki, I am taking these numbers with a grain of
salt.  I think the tests just need to be run more times and by more
people before we draw conclusions.  I saw enough outliers during the
tests to say something was going on (perhaps Anti-Virus rearing its
head again?).

There is one issue that I want to raise that I do not think is an
outlier.  I was expecting HTTPv2 to yield significant improvements,
and so I stopped the Apache server after each test so I could grab its
logs.  I wanted to be able to show how much the HTTP traffic was
reduced.  I have not done this yet, but for Serf it looks like the
logs were bigger.  The issue I want to raise though is about Serf in
general.  Running these benchmarks with Neon yields an Access log of
about 102KB and a Subversion log of about 3KB.  Running the benchmarks
with Serf yields an Access log over 12MB and a Subversion log over 5
MB.  That concerns me.  I am just one user running only a handful of
commands against the server.  It looks like a real Subversion server
with just a few dozen active users will be generating logs in excess
of a GB every day.  At a minimum we are going to need to document this
in the release notes.  I can tell from users@ that most users do not
have log rotation set up.  I am concerned about how this might affect
performance of the server and it will certainly leave users vulnerable
to running out of disk space.

Other than looking at some of these logs to see if they show some kind
of problem in Serf, I do not know what else we can do.  We know Serf
generates many more requests than Neon when doing checkout/update.
Still, this is alarming when you see the reality of what it means.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/


Re: Performance benchmarks

2011-03-28 Thread Daniel Shahaf
Mark Phippard wrote on Mon, Mar 28, 2011 at 13:08:05 -0400:
> On Mon, Mar 28, 2011 at 11:28 AM, Arwin Arni  wrote:
> 
> > I'm running Ubuntu 10.04 on an Intel Pentium 4 CPU 2.26GHz with 2GiB of RAM.
> >
> > Here are the benchmark results for svn 1.6.6 (provided by canonical for my
> > OS) and svn trunk (r1086245).
> >
> > Trunk is taking nearly twice as long as 1.6.6... Am I doing something
> > wrong... is it because of enable-maintainer-mode...
> 
> Thanks, I added your results to the wiki.  AFAIK,
> enable-maintainer-mode does not impact performance.

Inaccurate: we do have a few places where debug checks are enabled only
in maintainer mode.  (For example, enabling the FOREIGN_KEYS pragma and
a stat in the pristines code.)

Oh, and compile-time optimizations are disabled by default in maintainer mode.


Re: Performance benchmarks

2011-03-28 Thread Mark Phippard
On Mon, Mar 28, 2011 at 1:12 PM, Hyrum K Wright  wrote:

> Very cool to see something which will hopefully give us some
> quantitative measure of performance.
>
> I've seen people submit reports based on particular revisions.  Would
> it be possible to run the same suite of tools across a number of
> different revisions to give us some sense of change over time?  It'd
> be nice to know if we're getting better or worse, or how particular
> changes impacted performance, etc.
>
> Just a thought.

Not sure if you have looked at the wiki but I have been posting that
info with the stats for that reason.  I figured many of the same
people would just post new results for 1.7 over time and we would
start seeing a pattern of improvement.

I did not go back to some of the older 1.7 revisions to show some of
the recent performance work, but that is something I have considered.

I am currently running the tests to an HTTP server and changing the
server between 1.6 and 1.7.  I will post the results when all the
combinations(serf/neon1.6/1.7) are done.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/


Re: Performance benchmarks

2011-03-28 Thread Hyrum K Wright
On Fri, Mar 25, 2011 at 12:33 PM, Mark Phippard  wrote:
> Hi,
>
> I have been working on a framework for writing tests to record
> performance.  I have something good enough to share:
>
> https://ctf.open.collab.net/sf/projects/csvn
>
> It is pretty easy to add new tests if you have ideas on more tests you
> think we should add.  I think I have pretty good coverage of the major
> functions.  The wiki on the site I linked to above has details on how
> I have constructed the current tests.  I am going to put out a call to
> users for feedback and try to get more people to run the tests and
> record results.
>
> I am not claiming these are anything definitive or even that we will
> use them to help us make the release decision, but I think it is a
> start on coming up with some reproducible tests that people can run
> easily.  If after people look at and run the tests they think they are
> useful or can be tweaked to be useful, then great.  If not, then at
> least I got to write some code for a change :)
>
> The tests are written in Java because that is what I know and it gives
> me good cross platform coverage.  However, the Java just drives the
> command line so all you need to do is have the svn command line in
> your PATH and that is what it uses for all the work.

Very cool to see something which will hopefully give us some
quantitative measure of performance.

I've seen people submit reports based on particular revisions.  Would
it be possible to run the same suite of tools across a number of
different revisions to give us some sense of change over time?  It'd
be nice to know if we're getting better or worse, or how particular
changes impacted performance, etc.

Just a thought.

-Hyrum


Re: Performance benchmarks

2011-03-28 Thread Mark Phippard
On Mon, Mar 28, 2011 at 11:28 AM, Arwin Arni  wrote:

> I'm running Ubuntu 10.04 on an Intel Pentium 4 CPU 2.26GHz with 2GiB of RAM.
>
> Here are the benchmark results for svn 1.6.6 (provided by canonical for my
> OS) and svn trunk (r1086245).
>
> Trunk is taking nearly twice as long as 1.6.6... Am I doing something
> wrong... is it because of enable-maintainer-mode...

Thanks, I added your results to the wiki.  AFAIK,
enable-maintainer-mode does not impact performance.  Your results were
somewhat similar to what Mike Pilato posted for Ubuntu.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/


Re: Performance benchmarks

2011-03-28 Thread Arwin Arni

Hi Mark,

I'm running Ubuntu 10.04 on an Intel Pentium 4 CPU 2.26GHz with 2GiB of RAM.

Here are the benchmark results for svn 1.6.6 (provided by canonical for 
my OS) and svn trunk (r1086245).


Trunk is taking nearly twice as long as 1.6.6... Am I doing something 
wrong... is it because of enable-maintainer-mode...


Regards,
Arwin Arni
=== TEST RESULTS ==
SVN Version: 1.6.6
(This is what Canonical provides for Ubuntu 10.04)

Tests: Basic Tests
            Action        Time    Millis
------------------  ----------  --------
  Checkout:  0:06.062  6062
Update:  0:09.009  9009
Switch:  0:00.344   344
  Proplist:  0:00.405   405
Status:  0:00.262   262
Commit:  0:00.934   934
svnversion:  0:00.108   108

Tests: Merge Tests
            Action        Time    Millis
------------------  ----------  --------
 Merge-all:  0:03.822  3822
  Merge-revert:  0:01.395  1395
   Merge-synch:  0:10.959 10959
 Merge-reintegrate:  0:05.924  5924

Tests: Folder Tests
            Action        Time    Millis
------------------  ----------  --------
        Folder-co:    3:14.795    194795
        Folder-st:    0:04.627      4627
        Folder-ci:    0:04.725      4725
        Folder-up:    0:03.541      3541
       svnversion:    0:01.182      1182

Tests: Binaries Tests
            Action        Time    Millis
------------------  ----------  --------
           Bin-co:    2:19.648    139648
       Bin-up-r25:    2:44.211    164211
           Bin-sw:    2:39.057    159057
      Bin-cleanup:    0:00.341       341
           Bin-rm:    0:03.137      3137
           Bin-st:    0:00.454       454
       Bin-commit:    0:02.709      2709
           Bin-mv:    0:02.731      2731
        Bin-st-mv:    0:00.445       445
       Bin-commit:    0:02.025      2025
       svnversion:    0:00.364       364

===  END RESULTS ==
  Total execution time: 12:48.904    768904

Results in wiki format:

Basic Tests:
| 1.6.6 | r | 0:06.062 | 0:09.009 | 0:00.344 | 0:00.405 | 0:00.262 | 
0:00.934 | 0:00.108

Merge Tests:
| 1.6.6 | r | 0:03.822 | 0:01.395 | 0:10.959 | 0:05.924

Folder Tests:
| 1.6.6 | r | 3:14.795 | 0:04.627 | 0:04.725 | 0:03.541 | 0:01.182

Binaries Tests:
| 1.6.6 | r | 2:19.648 | 2:44.211 | 2:39.057 | 0:00.341 | 0:03.137 | 
0:00.454 | 0:02.709 | 0:02.731 | 0:00.445 | 0:02.025 | 0:00.364

=== TEST RESULTS ==
SVN Version: 1.7.0-dev


Tests: Basic Tests
            Action        Time    Millis
------------------  ----------  --------
  Checkout:  0:11.959 11959
Update:  0:12.614 12614
Switch:  0:01.002  1002
  Proplist:  0:00.303   303
Status:  0:00.284   284
Commit:  0:02.731  2731
svnversion:  0:00.389   389

Tests: Merge Tests
            Action        Time    Millis
------------------  ----------  --------
 Merge-all:  0:06.818  6818
  Merge-revert:  0:03.280  3280
   Merge-synch:  0:06.718  6718
 Merge-reintegrate:  0:17.598 17598

Tests: Folder Tests
            Action        Time    Millis
------------------  ----------  --------
        Folder-co:    3:06.657    186657
        Folder-st:    0:02.842      2842
        Folder-ci:    0:21.746     21746
        Folder-up:    0:07.181      7181
       svnversion:    0:03.889      3889

Tests: Binaries Tests
            Action        Time    Millis
------------------  ----------  --------
           Bin-co:    8:03.254    483254
       Bin-up-r25:    0:08.678      8678
           Bin-sw:    7:18.574    438574
      Bin-cleanup:    0:00.112       112
           Bin-rm:    0:08.315      8315
           Bin-st:    0:00.790       790
       Bin-commit:    0:05.599      5599
           Bin-mv:    0:06.790      6790
        Bin-st-mv:    0:00.821       821
       Bin-commit:    0:07.569      7569
       svnversion:    0:00.885       885

===  END RESULTS ==
  Total execution time: 21:31.163   1291163

Results in wiki format:

Basic Tests:
| 1.7.0-dev | r1086245 | 0:11.959 | 0:12.614 | 0:01.002 | 0:00.303 | 0:00.284 | 
0:02.731 | 0:00.389

Merge Tests:
| 1.7.0-dev | r1086245 | 0:06.818 | 0:03.280 | 0:06.718 | 0:17.598

Folder Tests:
| 1.7.0-dev | r1086245 | 3:06.657 | 0:02.842 | 0:21.746 | 0:07.181 | 0:03.889

Binaries Tests:
| 1.7.0-dev | r1086245 | 8:03.254 | 0:08.678 | 7:18.574 | 0:00.112 | 0:08.315 | 
0:00.790 | 0:05.599 | 0:06.790 | 0:00.821 | 0:07.569 | 0:00.885



Re: svn commit: r1081484 - in /subversion/trunk/subversion/libsvn_wc: wc-queries.sql wc_db.c

2011-03-28 Thread Stefan Sperling
On Mon, Mar 28, 2011 at 08:42:56AM -0500, Hyrum K Wright wrote:
> So the SELECT 1 determines what data gets returned (which you don't care
> about), and the LIMIT 1 determines how much data gets returned (of
> which you only care about 0 or >0 rows, hence the LIMIT).

Thanks, that makes sense :)  r1086245 adds the LIMIT 1.


RE: svn commit: r1085523 - /subversion/trunk/subversion/tests/cmdline/svnrdump_tests.py

2011-03-28 Thread Bert Huijben


> -Original Message-
> From: cmpil...@apache.org [mailto:cmpil...@apache.org]
> Sent: vrijdag 25 maart 2011 20:11
> To: comm...@subversion.apache.org
> Subject: svn commit: r1085523 -
> /subversion/trunk/subversion/tests/cmdline/svnrdump_tests.py
> 
> Author: cmpilato
> Date: Fri Mar 25 19:10:51 2011
> New Revision: 1085523
> 
> URL: http://svn.apache.org/viewvc?rev=1085523&view=rev
> Log:
> For issue #3844 ("svnrdump chokes on dumpfiles with multiple prop
> edits to the same node"):  Make svnrdump_tests.py 38 PASS by driving
> the binaries the right way.
> 
> * subversion/tests/cmdline/svnrdump_tests.py
>   (run_load_test): Add 'expect_deltas' parameter, passed to
> run_and_verify_dump() to turn out a --deltas dump.
>   (multi_prop_edit_load): Pass False for expect_deltas to
> run_load_test(), and remove @XFail decorator.

It looks like this test still fails for svnserve. (I think it needs a second ra 
session)

Bert
 




[PATCH] issue #3014 "svn log | head" should not print "Write error: Broken pipe"

2011-03-28 Thread Stefan Sperling
Here is a patch that attempts to fix issue #3014.

In the issue, Julian suggests that there should be a policy as to when
svn should and should not report errors on broken pipes.
I think that would complicate things too much.
So this patch takes a much simpler approach (see log message).

Does anyone see problems with this approach?
It's a bit hackish, but works.

This patch touches everything except svnrdump, for now.

[[[
Fix issue #3014 "svn log | head" should not print "Write error: Broken pipe".

Make svn and other commands exit silently if a pipe write error occurred.
The rationale is that the process closing the pipe is also responsible
for printing a related error message, if necessary.

* subversion/libsvn_subr/cmdline.c
  (svn_cmdline_fputs, svn_cmdline_fflush): Return
   SVN_ERR_IO_PIPE_WRITE_ERROR when writes fail due to EPIPE, so that
   callers can choose to ignore this kind of error.
  (svn_cmdline_handle_exit_error): Don't print message for pipe write errors.

* subversion/libsvn_subr/opt.c
  (svn_opt_print_generic_help2): Don't print message for pipe write errors.

* subversion/libsvn_subr/io.c
  (do_io_file_wrapper_cleanup): Same.

* subversion/svn/notify.c
  (notify): Same.

* subversion/svn/main.c
  (main): Same.

* subversion/svn/obliterate-cmd.c
  (notify): Same.

* subversion/svnadmin/main.c
  (main): Same.

* subversion/include/svn_error_codes.h
  (SVN_ERR_IO_PIPE_WRITE_ERROR): New error code.
]]]

Index: subversion/libsvn_subr/cmdline.c
===================================================================
--- subversion/libsvn_subr/cmdline.c (revision 1086209)
+++ subversion/libsvn_subr/cmdline.c (working copy)
@@ -338,7 +338,19 @@ svn_cmdline_fputs(const char *string, FILE* stream
   if (fputs(out, stream) == EOF)
 {
   if (errno)
-return svn_error_wrap_apr(errno, _("Write error"));
+{
+  err = svn_error_wrap_apr(errno, _("Write error"));
+
+  /* ### Issue #3014: Return a specific error for broken pipes,
+   * ### with a single element in the error chain. */
+  if (APR_STATUS_IS_EPIPE(err->apr_err))
+{
+  svn_error_clear(err);
+  return svn_error_create(SVN_ERR_IO_PIPE_WRITE_ERROR, NULL, NULL);
+}
+  else
+return svn_error_return(err);
+}
   else
 return svn_error_create
   (SVN_ERR_IO_WRITE_ERROR, NULL, NULL);
@@ -355,7 +367,19 @@ svn_cmdline_fflush(FILE *stream)
   if (fflush(stream) == EOF)
 {
   if (errno)
-return svn_error_wrap_apr(errno, _("Write error"));
+{
+  svn_error_t *err = svn_error_wrap_apr(errno, _("Write error"));
+
+  /* ### Issue #3014: Return a specific error for broken pipes,
+   * ### with a single element in the error chain. */
+  if (APR_STATUS_IS_EPIPE(err->apr_err))
+{
+  svn_error_clear(err);
+  return svn_error_create(SVN_ERR_IO_PIPE_WRITE_ERROR, NULL, NULL);
+}
+  else
+return svn_error_return(err);
+}
   else
 return svn_error_create(SVN_ERR_IO_WRITE_ERROR, NULL, NULL);
 }
@@ -376,7 +400,15 @@ svn_cmdline_handle_exit_error(svn_error_t *err,
   apr_pool_t *pool,
   const char *prefix)
 {
-  svn_handle_error2(err, stderr, FALSE, prefix);
+  /* Issue #3014:
+   * Don't print anything on broken pipes. The pipe was likely
+   * closed by the process at the other end. We expect that
+   * process to perform error reporting as necessary.
+   *
+   * ### This assumes that there is only one error in a chain for
+   * ### SVN_ERR_IO_PIPE_WRITE_ERROR. See svn_cmdline_fputs(). */
+  if (err->apr_err != SVN_ERR_IO_PIPE_WRITE_ERROR)
+svn_handle_error2(err, stderr, FALSE, prefix);
   svn_error_clear(err);
   if (pool)
 svn_pool_destroy(pool);
Index: subversion/libsvn_subr/opt.c
===================================================================
--- subversion/libsvn_subr/opt.c (revision 1086209)
+++ subversion/libsvn_subr/opt.c (working copy)
@@ -383,7 +383,15 @@ svn_opt_print_generic_help2(const char *header,
   return;
 
  print_error:
-  svn_handle_error2(err, stderr, FALSE, "svn: ");
+  /* Issue #3014:
+   * Don't print anything on broken pipes. The pipe was likely
+   * closed by the process at the other end. We expect that
+   * process to perform error reporting as necessary.
+   *
+   * ### This assumes that there is only one error in a chain for
+   * ### SVN_ERR_IO_PIPE_WRITE_ERROR. See svn_cmdline_fputs(). */
+  if (err->apr_err != SVN_ERR_IO_PIPE_WRITE_ERROR)
+svn_handle_error2(err, stderr, FALSE, "svn: ");
   svn_error_clear(err);
 }
 
Index: subversion/libsvn_subr/io.c
===================================================================
--- subversion/libsvn_subr/io.c (revision 1086209)
+++ subversion/libsvn_subr/io.c (working copy)
@@ -3034,6 +3034,11 @@ do_io_file_wra

Re: svn commit: r1081484 - in /subversion/trunk/subversion/libsvn_wc: wc-queries.sql wc_db.c

2011-03-28 Thread Hyrum K Wright
On Mon, Mar 28, 2011 at 8:30 AM, Stefan Sperling  wrote:
> On Mon, Mar 28, 2011 at 08:42:02AM -0400, Greg Stein wrote:
>> On Mar 28, 2011 8:32 AM, "Stefan Sperling"  wrote:
>> >
>> > On Mon, Mar 28, 2011 at 07:49:22AM -0400, Greg Stein wrote:
>> > > On Mar 28, 2011 7:41 AM, "Greg Stein"  wrote:
>> > > > Think about how the query optimizer/planner would work. Consider
>> whether a
>> > > result set needs to be constructed.
>> > > >
>> > > > ... a LIMIT 1 provides a ton of help to the SQL engine about what
>> needs to
>> > > be fetched (or not).
>> > > >
>> > > > There is also a semantic element to it. Readers of the code will see
>> that
>> > > you're looking for basic presence rather than a set of results.
>> > > >
>> > > > If/when you need a query containing the set of nodes with mods, then
>> you
>> > > can consider another query. Until that time, the LIMIT is helpful/best.
>> > > >
>> > > > Cheers,
>> > > > -g
>> > >
>> > > In fact, you can switch from selecting local_relpath to a simple 1:
>> > >
>> > > --STMT_ANY_NODES_WITH_PROP_MODS
>> > > SELECT 1 FROM ...
>> > >
>> > > (note stmt name change, too)
>> >
>> > Fair enough. Does r1086208 look better?
>>
>> I like the new names, but the LIMIT didn't make it into the commit :-)
>
> Didn't you imply that SELECT 1 has the same effect as LIMIT 1?
> Or did I misunderstand?

(late joining the conversation)

SELECT 1 doesn't have the same effect as LIMIT 1.  I would assume
because you're simply testing row existence, you don't actually need
any data from the row, so you can use SELECT 1 (which would return 1
for every row it encounters).  The LIMIT 1 limits the number of rows
the query returns.

So the SELECT 1 determines what data gets returned (which you don't care
about), and the LIMIT 1 determines how much data gets returned (of
which you only care about 0 or >0 rows, hence the LIMIT).
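
To illustrate the pattern, here is a minimal sketch against the plain
sqlite3 C API; the function name and the simplified WHERE clause are
made up for illustration and are not the actual wc_db code:

  #include <sqlite3.h>

  /* Presence test: "SELECT 1 ... LIMIT 1" returns at most one row whose
     only column is the constant 1, so a single step answers "is there
     any matching row?" without fetching real column data. */
  static int
  any_prop_mods(sqlite3 *db)
  {
    sqlite3_stmt *stmt;
    int have_row = 0;

    if (sqlite3_prepare_v2(db,
                           "SELECT 1 FROM actual_node "
                           "WHERE properties IS NOT NULL "
                           "LIMIT 1",
                           -1, &stmt, NULL) == SQLITE_OK)
      {
        have_row = (sqlite3_step(stmt) == SQLITE_ROW);
        sqlite3_finalize(stmt);
      }
    return have_row;
  }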

-Hyrum


Re: svn commit: r1081484 - in /subversion/trunk/subversion/libsvn_wc: wc-queries.sql wc_db.c

2011-03-28 Thread Stefan Sperling
On Mon, Mar 28, 2011 at 08:42:02AM -0400, Greg Stein wrote:
> On Mar 28, 2011 8:32 AM, "Stefan Sperling"  wrote:
> >
> > On Mon, Mar 28, 2011 at 07:49:22AM -0400, Greg Stein wrote:
> > > On Mar 28, 2011 7:41 AM, "Greg Stein"  wrote:
> > > > Think about how the query optimizer/planner would work. Consider
> whether a
> > > result set needs to be constructed.
> > > >
> > > > ... a LIMIT 1 provides a ton of help to the SQL engine about what
> needs to
> > > be fetched (or not).
> > > >
> > > > There is also a semantic element to it. Readers of the code will see
> that
> > > you're looking for basic presence rather than a set of results.
> > > >
> > > > If/when you need a query containing the set of nodes with mods, then
> you
> > > can consider another query. Until that time, the LIMIT is helpful/best.
> > > >
> > > > Cheers,
> > > > -g
> > >
> > > In fact, you can switch from selecting local_relpath to a simple 1:
> > >
> > > --STMT_ANY_NODES_WITH_PROP_MODS
> > > SELECT 1 FROM ...
> > >
> > > (note stmt name change, too)
> >
> > Fair enough. Does r1086208 look better?
> 
> I like the new names, but the LIMIT didn't make it into the commit :-)

Didn't you imply that SELECT 1 has the same effect as LIMIT 1?
Or did I misunderstand?


RE: resetting sqlite statements and cancellation

2011-03-28 Thread Bert Huijben


> -Original Message-
> From: Branko Čibej [mailto:br...@xbc.nu] On Behalf Of Branko Cibej
> Sent: maandag 28 maart 2011 14:54
> To: dev@subversion.apache.org
> Subject: Re: resetting sqlite statements and cancellation
> 
> On 28.03.2011 14:13, Greg Stein wrote:
> > Stepping back... should cancellation even be in the API? When it was first
> > designed, I never imagined any function taking long enough to require a
> > cancel func.
> 
> As an example, I can cite that recursive proplist that I turned upside
> down some time ago. On a large working copy, it takes quite a bit of
> time, so it checks for cancelation between callbacks. Of course it does
> that only while reading from the temporary table of intermediate
> results, so that's per-connection and doesn't affect the wc-db proper.
> 
> >  That would be left to callers, since each function would be
> > short/atomic/quick.
> >
> > And what happens with the transaction here?
> 
> I'd imagine clearing some scratch pool will cause the sqlite handle to
> be closed, which should roll back any pending transactions?

The SQLite handle is stored in the svn_wc__db_t instance, which in turn is 
stored in the svn_wc_context_t, which is kept in svn_client_ctx_t. So only when 
the pool of the client context is recycled the database handle is closed. So, 
"No: we can't wait for sqlite instance cleanup"

> > And is it really expected for this function to run for more than a second?
> 
> I don't know about "this" specifically, but certainly "some" will.

Some of the recent additions from Stephan might include comparing files to 
their pristines within a db transaction. So given that this can be on network 
paths we really want cancelation support here.

Or (like I suggested in an earlier mail) we can move the comparison to a 
callback outside wc_db.c. 
In that case wc_db just has to provide proper error handling for the callback, 
which it already had to do for normal io errors.

Bert



Re: [PATCH] Remove unused import from svntest/main.py

2011-03-28 Thread Hyrum K Wright
r1086218.  Thanks for the ping.

-Hyrum

On Sun, Mar 27, 2011 at 7:12 PM, Gavin Baumanis  wrote:
> Ping. This patch submission has received no comments.
>
>
> On 28/02/2011, at 2:15 AM, Noorul Islam K M wrote:
>
>>
>> Attached patch removes unused import.
>>
>> Log
>> [[[
>>
>> * subversion/tests/cmdline/svntest/main.py
>>    Remove unused import.
>>
>> Patch by: Noorul Islam K M 
>> ]]]
>>
>> Thanks and Regards
>> Noorul
>>
>> Index: subversion/tests/cmdline/svntest/main.py
>> ===================================================================
>> --- subversion/tests/cmdline/svntest/main.py  (revision 1074971)
>> +++ subversion/tests/cmdline/svntest/main.py  (working copy)
>> @@ -29,7 +29,6 @@
>> import re
>> import stat    # for ST_MODE
>> import subprocess
>> -import copy    # for deepcopy()
>> import time    # for time()
>> import traceback # for print_exc()
>> import threading
>
>


Re: [PATCH] allow svnsync to translate non-UTF-8 log messages to UTF-8 (v. 5.1.1)

2011-03-28 Thread Hyrum K Wright
Actually, it did.  Mike committed it in r1084335, and reported that
fact in this thread on Mar 22.

Best,
-Hyrum

On Sun, Mar 27, 2011 at 7:27 PM, Gavin Baumanis  wrote:
> Ping.
> This patch submission has received no comments.
>
>
>
> On 22/03/2011, at 8:27 AM, Danny Trebbien wrote:
>
>> On Mon, Mar 21, 2011 at 10:20 AM, Philip Martin
>>  wrote:
>>> Danny Trebbien  writes:
 Attached are the patch, the log message, and the two TGZ archives of
 DUMP files (for the tests).
>>>
>>> The patch is missing.
>>>
>>> --
>>> Philip
>>
>> Sorry.  Attached is the patch with a .txt extension.
>> 
>
>


Re: resetting sqlite statements and cancellation

2011-03-28 Thread Branko Čibej
On 28.03.2011 14:13, Greg Stein wrote:
> Stepping back... should cancellation even be in the API? When it was first
> designed, I never imagined any function taking long enough to require a
> cancel func.

As an example, I can cite that recursive proplist that I turned upside
down some time ago. On a large working copy, it takes quite a bit of
time, so it checks for cancelation between callbacks. Of course it does
that only while reading from the temporary table of intermediate
results, so that's per-connection and doesn't affect the wc-db proper.

>  That would be left to callers, since each function would be
> short/atomic/quick.
>
> And what happens with the transaction here?

I'd imagine clearing some scratch pool will cause the sqlite handle to
be closed, which should roll back any pending transactions?

> And is it really expected for this function to run for more than a second?

I don't know about "this" specifically, but certainly "some" will.

-- Brane

> On Mar 28, 2011 8:06 AM, "Stefan Sperling"  wrote:
>> We have this pattern in several places within wc_db:
>>
>> svn_sqlite__get_statement(&stmt, ...)
>> svn_sqlite_step(&have_row, stmt);
>> while (have_row)
>> {
>> if (cancel_func)
>> SVN_ERR(cancel_func(cancel_baton));
>>
>> /* do stuff */
>>
>> svn_sqlite_step(&have_row, stmt);
>> }
>> svn_sqlite_reset(stmt);
>>
>> So a cancellation handler can return without resetting the SQL statement.
>>
>> This is probably not an issue with the CLI client which will just exit.
>> But what about GUI clients? If the user clicks a cancel button and tries
>> another operation that reuses the statement, will there be a problem?
>>
>> If so, what's the best way to solve this?
>> I think it would be ugly to check the return value of cancel_func
> manually.
>> What about a convenience SVN_ERR-like macro that also accepts a statement
>> and resets it in case an error is thrown?
>>
>> if (cancel_func)
>> SVN_ERR_RESET_STMT(cancel_func(cancel_baton), stmt);



Re: svn commit: r1081484 - in /subversion/trunk/subversion/libsvn_wc: wc-queries.sql wc_db.c

2011-03-28 Thread Greg Stein
On Mar 28, 2011 8:32 AM, "Stefan Sperling"  wrote:
>
> On Mon, Mar 28, 2011 at 07:49:22AM -0400, Greg Stein wrote:
> > On Mar 28, 2011 7:41 AM, "Greg Stein"  wrote:
> > > Think about how the query optimizer/planner would work. Consider
whether a
> > result set needs to be constructed.
> > >
> > > ... a LIMIT 1 provides a ton of help to the SQL engine about what
needs to
> > be fetched (or not).
> > >
> > > There is also a semantic element to it. Readers of the code will see
that
> > you're looking for basic presence rather than a set of results.
> > >
> > > If/when you need a query containing the set of nodes with mods, then
you
> > can consider another query. Until that time, the LIMIT is helpful/best.
> > >
> > > Cheers,
> > > -g
> >
> > In fact, you can switch from selecting local_relpath to a simple 1:
> >
> > --STMT_ANY_NODES_WITH_PROP_MODS
> > SELECT 1 FROM ...
> >
> > (note stmt name change, too)
>
> Fair enough. Does r1086208 look better?

I like the new names, but the LIMIT didn't make it into the commit :-)


Re: svn commit: r1081484 - in /subversion/trunk/subversion/libsvn_wc: wc-queries.sql wc_db.c

2011-03-28 Thread Stefan Sperling
On Mon, Mar 28, 2011 at 07:49:22AM -0400, Greg Stein wrote:
> On Mar 28, 2011 7:41 AM, "Greg Stein"  wrote:
> > Think about how the query optimizer/planner would work. Consider whether a
> result set needs to be constructed.
> >
> > ... a LIMIT 1 provides a ton of help to the SQL engine about what needs to
> be fetched (or not).
> >
> > There is also a semantic element to it. Readers of the code will see that
> you're looking for basic presence rather than a set of results.
> >
> > If/when you need a query containing the set of nodes with mods, then you
> can consider another query. Until that time, the LIMIT is helpful/best.
> >
> > Cheers,
> > -g
> 
> In fact, you can switch from selecting local_relpath to a simple 1:
> 
> --STMT_ANY_NODES_WITH_PROP_MODS
> SELECT 1 FROM ...
> 
> (note stmt name change, too)

Fair enough. Does r1086208 look better?


Re: API policy

2011-03-28 Thread Greg Stein
On Mar 28, 2011 8:23 AM, "Philip Martin"  wrote:
>
> Greg Stein  writes:
>
> > I recognize we need to grow the scope of transactions for speed
purposes,
> > but let's do it in a way based on semantics rather than implementation.
For
> > example: to speed checkout, we could have an API to add a complete
directory
> > of nodes, rather than an API to start/end a txn across those single
> > additions.
>
> That doesn't seem to fit with our current editor.  When we want to add a
> directory we don't know the nodes.  If we delay adding the directory
> until the nodes are known we would have to store the entire checkout
> somewhere else.

I'm well aware of the implications. I'm just trying to say: let's try to
avoid poisoning our entire WC library with SQL transaction concepts, which
should be restricted to just the internals of wc_db.

Cheers,
-g


RE: Bug in svnrdump (trunk@1085375) -- malformed dump file with multiple properties

2011-03-28 Thread neil.winton
On 25 March 2011 at 19:19, Hyrum K Wright wrote
> On Fri, Mar 25, 2011 at 2:11 PM, C. Michael Pilato  
> wrote:
>> On 03/25/2011 11:31 AM, Hyrum K Wright wrote:
>>> Neil,
>>> Thanks for the report.  In attempting to reproduce, I wasn't even able
>>> to get as far as comparing the dumpfiles, as svnrdump asserted
>>> somewhere in the process.  I opened issue 3844[1] to track this bug,
>>> and committed an XFailing test in r1085428.  I also marked the issue
>>> as important for a 1.7.0 release.
>>>
>>> -Hyrum
>>>
>>> [1] http://subversion.tigris.org/issues/show_bug.cgi?id=3844
>>
>> See r1085523.
> 
> Thanks, Mike.
> 
> Neil, when you get a chance, could you test a r1085523 and see if it
> fixes your problem?
> 
> Cheers,
> -Hyrum

Hi guys,

Sorry, but no cigar :( This probably fixes the assertion found by Hyrum, but 
doesn't address the original issue. The current test case added to the test 
suite addresses loading multiple propedits. That's good, but it's not 
sufficient. The heart of the issue is that a transaction with multiple 
propedits to a single path is *dumped* incorrectly. So for an original dump 
representation like this:

---
Node-path: 
Node-kind: dir
Node-action: change
Prop-content-length: 42
Content-length: 42

K 3
foo
V 3
bar
K 3
bar
V 3
baz
PROPS-END
---

The svnrdump output for the same transaction comes out like this:

---
Node-path: 
Node-kind: dir
Node-action: change
Prop-delta: true
Prop-content-length: 26
Content-length: 26

K 3
foo
V 3
bar
PROPS-END


Prop-delta: true
Prop-content-length: 26
Content-length: 26

K 3
bar
V 3
baz
PROPS-END
---

Note the second "orphaned" prop-delta.

It would be great if svnrdump produced exactly the same output as svnadmin 
(it'd make the test assertions nice and easy), but from a bit of 
experimentation (I haven't read the svnadmin code here) I don't believe that 
it's necessary for the two prop-deltas actually to be merged for the dumpfile 
to be valid. So if it's easier to repeat the preceding node-path information 
for each prop-delta then I guess that would be OK, as long as the net result is 
the same.

Thanks,
Neil


Re: API policy

2011-03-28 Thread Philip Martin
Greg Stein  writes:

> I recognize we need to grow the scope of transactions for speed purposes,
> but let's do it in a way based on semantics rather than implementation. For
> example: to speed checkout, we could have an API to add a complete directory
> of nodes, rather than an API to start/end a txn across those single
> additions.

That doesn't seem to fit with our current editor.  When we want to add a
directory we don't know the nodes.  If we delay adding the directory
until the nodes are known we would have to store the entire checkout
somewhere else.

-- 
Philip


Re: resetting sqlite statements and cancellation

2011-03-28 Thread Greg Stein
Stepping back... should cancellation even be in the API? When it was first
designed, I never imagined any function taking long enough to require a
cancel func. That would be left to callers, since each function would be
short/atomic/quick.

And what happens with the transaction here?

And is it really expected for this function to run for more than a second?
On Mar 28, 2011 8:06 AM, "Stefan Sperling"  wrote:
> We have this pattern in several places within wc_db:
>
> svn_sqlite__get_statement(&stmt, ...)
> svn_sqlite_step(&have_row, stmt);
> while (have_row)
> {
> if (cancel_func)
> SVN_ERR(cancel_func(cancel_baton));
>
> /* do stuff */
>
> svn_sqlite_step(&have_row, stmt);
> }
> svn_sqlite_reset(stmt);
>
> So a cancellation handler can return without resetting the SQL statement.
>
> This is probably not an issue with the CLI client which will just exit.
> But what about GUI clients? If the user clicks a cancel button and tries
> another operation that reuses the statement, will there be a problem?
>
> If so, what's the best way to solve this?
> I think it would be ugly to check the return value of cancel_func
manually.
> What about a convenience SVN_ERR-like macro that also accepts a statement
> and resets it in case an error is thrown?
>
> if (cancel_func)
> SVN_ERR_RESET_STMT(cancel_func(cancel_baton), stmt);


resetting sqlite statements and cancellation

2011-03-28 Thread Stefan Sperling
We have this pattern in several places within wc_db:

  svn_sqlite__get_statement(&stmt, ...)
  svn_sqlite_step(&have_row, stmt);
  while (have_row)
{
  if (cancel_func)
SVN_ERR(cancel_func(cancel_baton));

  /* do stuff */  

  svn_sqlite_step(&have_row, stmt);
}
  svn_sqlite_reset(stmt);

So a cancellation handler can return without resetting the SQL statement.

This is probably not an issue with the CLI client which will just exit.
But what about GUI clients? If the user clicks a cancel button and tries
another operation that reuses the statement, will there be a problem?

If so, what's the best way to solve this?
I think it would be ugly to check the return value of cancel_func manually.
What about a convenience SVN_ERR-like macro that also accepts a statement
and resets it in case an error is thrown?

  if (cancel_func)
SVN_ERR_RESET_STMT(cancel_func(cancel_baton), stmt);
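
A minimal sketch of what such a macro could look like (hypothetical: it
assumes the existing svn_error_compose_create() and svn_sqlite__reset()
helpers, and the macro name simply follows the suggestion above):

  #include "svn_error.h"
  #include "private/svn_sqlite.h"

  /* Hypothetical convenience macro: evaluate EXPR and, if it fails,
     reset STMT before propagating, composing any reset error with the
     original one so that neither is leaked. */
  #define SVN_ERR_RESET_STMT(expr, stmt)                              \
    do {                                                              \
      svn_error_t *svn_err__temp = (expr);                            \
      if (svn_err__temp)                                              \
        return svn_error_compose_create(svn_err__temp,                \
                                        svn_sqlite__reset(stmt));     \
    } while (0)

Composing the two errors, rather than clearing the reset error, keeps
both failures visible to the caller.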


Re: svn commit: r1081484 - in /subversion/trunk/subversion/libsvn_wc: wc-queries.sql wc_db.c

2011-03-28 Thread Greg Stein
On Mar 28, 2011 7:41 AM, "Greg Stein"  wrote:
>
>
> On Mar 28, 2011 7:32 AM, "Stefan Sperling"  wrote:
> >
> > On Sat, Mar 26, 2011 at 09:46:48AM -0400, Greg Stein wrote:
> > > If it only needs to detect a single row, then LIMIT 1 should be in the
> > > query.
> >
> > Is there really a significant difference between LIMIT 1 and not
stepping
> > through more than one result row?
> >
> > I'd rather keep the queries reusable for contexts where a complete
> > list of modified nodes is required, instead of adding COUNT() (as
> > suggested elsewhere) or LIMIT 1.
>
> Think about how the query optimizer/planner would work. Consider whether a
result set needs to be constructed.
>
> ... a LIMIT 1 provides a ton of help to the SQL engine about what needs to
be fetched (or not).
>
> There is also a semantic element to it. Readers of the code will see that
you're looking for basic presence rather than a set of results.
>
> If/when you need a query containing the set of nodes with mods, then you
can consider another query. Until that time, the LIMIT is helpful/best.
>
> Cheers,
> -g

In fact, you can switch from selecting local_relpath to a simple 1:

--STMT_ANY_NODES_WITH_PROP_MODS
SELECT 1 FROM ...

(note stmt name change, too)

Cheers,
-g


Re: svn propchange: r1081487 - svn:log

2011-03-28 Thread Stefan Sperling
On Fri, Mar 25, 2011 at 08:38:55PM -, hwri...@apache.org wrote:
> Author: hwright
> Revision: 1081487
> Modified property: svn:log
> 
> Modified: svn:log at Fri Mar 25 20:38:55 2011
> --
> --- svn:log (original)
> +++ svn:log Fri Mar 25 20:38:55 2011
> @@ -1,2 +1,6 @@
> +Flush output from each of the C tests immediately after it runs.  This enables
> +a process watching the log file to get immediate results, rather than waiting
> +until all tests have run.
> +
>  * subversion/tests/svn_test_main.c
>(do_test_num): Flush stdout between individual tests.

This seems to make the test progress bar actually work for me :)
Thanks!


Re: svn commit: r1081484 - in /subversion/trunk/subversion/libsvn_wc: wc-queries.sql wc_db.c

2011-03-28 Thread Greg Stein
On Mar 28, 2011 7:32 AM, "Stefan Sperling"  wrote:
>
> On Sat, Mar 26, 2011 at 09:46:48AM -0400, Greg Stein wrote:
> > If it only needs to detect a single row, then LIMIT 1 should be in the
> > query.
>
> Is there really a significant difference between LIMIT 1 and not stepping
> through more than one result row?
>
> I'd rather keep the queries reusable for contexts where a complete
> list of modified nodes is required, instead of adding COUNT() (as
> suggested elsewhere) or LIMIT 1.

Think about how the query optimizer/planner would work. Consider whether a
result set needs to be constructed.

... a LIMIT 1 provides a ton of help to the SQL engine about what needs to
be fetched (or not).

There is also a semantic element to it. Readers of the code will see that
you're looking for basic presence rather than a set of results.

If/when you need a query containing the set of nodes with mods, then you can
consider another query. Until that time, the LIMIT is helpful/best.

Cheers,
-g


Re: svn commit: r1081484 - in /subversion/trunk/subversion/libsvn_wc: wc-queries.sql wc_db.c

2011-03-28 Thread Branko Čibej
On 28.03.2011 13:32, Stefan Sperling wrote:
> On Sat, Mar 26, 2011 at 09:46:48AM -0400, Greg Stein wrote:
>> If it only needs to detect a single row, then LIMIT 1 should be in the
>> query.
> Is there really a significant difference between LIMIT 1 and not stepping
> through more than one result row?

I'd expect there to be a difference in the amount of memory used and
duration of the sqlite transaction.

> I'd rather keep the queries reusable for contexts where a complete
> list of modified nodes is required, instead of adding COUNT() (as
> suggested elsewhere) or LIMIT 1.

Well ... one can't actually parametrize the statements like this (you
can parametrize the limit, but not the existence of a limit clause). I
like reusable statements too, but on the other hand, I believe in doing
as much as possible within the query engine. It always makes a
difference, if nowhere else, then in the complexity of the processing code.
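
For example, a sketch against the plain sqlite3 C API (the query text
and function name are illustrative only): the LIMIT value can be bound
as a parameter and varied per execution, but a statement compiled
without a LIMIT clause cannot acquire one at bind time:

  #include <sqlite3.h>

  /* Step a presence-style query whose LIMIT is supplied at bind time. */
  static int
  step_with_limit(sqlite3 *db, int limit)
  {
    sqlite3_stmt *stmt;
    int rc;

    rc = sqlite3_prepare_v2(db,
                            "SELECT 1 FROM actual_node LIMIT ?1",
                            -1, &stmt, NULL);
    if (rc != SQLITE_OK)
      return rc;

    sqlite3_bind_int(stmt, 1, limit);   /* e.g. 1 for an existence test */
    rc = sqlite3_step(stmt);            /* SQLITE_ROW if anything matched */
    sqlite3_finalize(stmt);
    return rc;
  }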

-- Brane


Re: svn commit: r1081484 - in /subversion/trunk/subversion/libsvn_wc: wc-queries.sql wc_db.c

2011-03-28 Thread Stefan Sperling
On Sat, Mar 26, 2011 at 09:46:48AM -0400, Greg Stein wrote:
> If it only needs to detect a single row, then LIMIT 1 should be in the
> query.

Is there really a significant difference between LIMIT 1 and not stepping
through more than one result row?

I'd rather keep the queries reusable for contexts where a complete
list of modified nodes is required, instead of adding COUNT() (as
suggested elsewhere) or LIMIT 1.


Re: svn commit: r1081484 - in /subversion/trunk/subversion/libsvn_wc: wc-queries.sql wc_db.c

2011-03-28 Thread Stefan Sperling
On Sat, Mar 26, 2011 at 09:51:46AM -0400, Greg Stein wrote:
> > --- subversion/trunk/subversion/libsvn_wc/wc-queries.sql (original)
> > +++ subversion/trunk/subversion/libsvn_wc/wc-queries.sql Mon Mar 14
> 17:24:05 2011
> > @@ -910,6 +910,11 @@ WHERE wc_id = ?1 AND (local_relpath = ?2
> >presence = 'base-deleted')
> >   AND file_external IS NULL;
> >
> > +-- STMT_SELECT_NODES_WITH_PROP_MODIFICATIONS
> > +SELECT local_relpath FROM actual_node
> > +WHERE wc_id = ?1 AND (local_relpath = ?2 OR local_relpath LIKE ?3 ESCAPE
> '#')
> > +  AND properties IS NOT NULL;
> 
> This looks like you're searching for *children* of the target node. Why?

The entire point of this query is that it is recursive: the
"local_relpath = ?2 OR local_relpath LIKE ?3" pattern matches the target
node itself plus everything below it.  We want to know if there are any
nodes in the working copy with property modifications.

> > +  /* If this query returns a row, the working copy is modified. */
> > +  SVN_ERR(svn_sqlite__step(&have_row, stmt));
> > +  *is_modified = have_row;
> 
> Why use 'have_row' at all? Why not pass is_modified directly?

Sure, r1086192.