Re: Proposal: Change repository's UUID over RA layer

2010-08-06 Thread Justin Erenkrantz
On Fri, Aug 6, 2010 at 7:34 AM, Branko Čibej  wrote:
> Ahem. You guys have forgotten about Justin's RW-master/RO-slave
> replication hack, which *requires* the slave repositories to have the
> same UUID as the master. And that 'svnadmin load' has both --ignore-uuid
> and --force-uuid.

When doing any master/slave initial configuration, doing a local
'svnadmin setuuid' should be sufficient.  IOW, I don't see that use
case as driving this need.  -- justin


Re: Noise: spicy autoindex httpd.conf workaround #fail

2010-08-20 Thread Justin Erenkrantz
Based on what you describe, I think this is likely what you are looking for:

http://httpd.apache.org/docs/2.0/mod/mod_mime.html#modmimeusepathinfo

http://mail-archives.apache.org/mod_mbox/httpd-dev/200209.mbox/%3c20020904211343.ga16...@apache.org%3e
http://mail-archives.apache.org/mod_mbox/httpd-dev/200306.mbox/%3c2147483647.1054772...@[10.0.1.37]%3e

HTH.  -- justin

On Fri, Aug 20, 2010 at 1:27 AM, C. Michael Pilato  wrote:
> [Warning:  This matter is far from highly pertinent.  One tackles strange
> non-problems when in an atypical environment, such as a hotel room in CA.]
>
> I had someone ask me about Subversion autoindex support.  So, like, you
> point a web browser at
> http://svn.apache.org/repos/asf/subversion/site/publish/ and *pow* magically
> you are now looking at the index.html inside that directory.
>
> Clearly, this could be done with an hour or two of mod_dav_svn hackery and
> some new directives there.  But I was trying to come up with an httpd.conf
> workaround that did the trick.  Here's what I tried.  (On my system, all my
> Subversion repositories live inside the /repos/ Location.)
>
>   # If this is a GET request (but not a subrequest) aimed at my
>   # collection of Subversion repositories and with a trailing slash, and
>   # if there exists an index.html file inside that directory, then
>   # temporarily redirect the browser to the index.html file.
>   RewriteEngine on
>   RewriteCond %{IS_SUBREQ}              false
>   RewriteCond %{REQUEST_METHOD}         GET
>   RewriteCond %{REQUEST_URI}            ^/repos/.*/$
>   RewriteCond %{REQUEST_URI}index.html  -U
>   RewriteRule /repos/(.*)/$             /repos/$1/index.html [L,R]
>
> The result was that for every directory in which an index.html was found,
> that file was served (via a browser redirect).  Yay!  Unfortunately, the
> redirect was transmitted for directories which had no index.html child, too.
>  Boo!
>
> Sadly, I found that despite the fact that the Apache docs say about that
> "-U" test the following:
>
>   '-U' (is existing URL, via subrequest)
>   Checks whether or not TestString is a valid URL, accessible via all
>   the server's currently-configured access controls for that path. This
>   uses an internal subrequest to do the check, so use it with care - it
>   can impact your server's performance!
>
> In reality "validity" in this context seems to have nothing to do with
> "existence".  I traced the subrequest that mod_rewrite made into Subversion,
> and found that it never enters mod_dav to actually perform an existence get.
>  I guess I expected that the subrequest would GET all the way into
> Subversion, where it would get the appropriate error code (HTTP_NOT_FOUND).
>  In retrospect, I think I knew that subrequests don't behavior like
> full-fledged content-fetching requests.  But the documentation quoted above
> is pretty misleading, at any rate, IMO.
>
> --
> C. Michael Pilato 
> CollabNet   <>   www.collab.net   <>   Distributed Development On Demand
>


Re: Reduce the 1.7 release feature set a bit?

2010-08-20 Thread Justin Erenkrantz
On Fri, Aug 20, 2010 at 3:29 AM, Stefan Sperling  wrote:
>  just me): http://subversion.tigris.org/issues/show_bug.cgi?id=3684

Lieven fixed this in serf trunk a few months back.  IOW, I can
reproduce with serf 0.6.1, but not serf trunk.

Greg/Lieven: shall we cut a new release?  Greg did the last release,
so I would prefer he do it.

I'll also go figure out what needs to be backported in ra_serf from
trunk to 1.6.x to work against current serf releases.  -- justin


Re: [serf-dev] Re: Reduce the 1.7 release feature set a bit?

2010-08-20 Thread Justin Erenkrantz
On Fri, Aug 20, 2010 at 11:15 AM, Lieven Govaerts
 wrote:
> The memory usage problem is probably still there. Last time I looked
> at it with valgrind (at the elego days), most of the memory was
> allocated in the sqlite libraries. This has probably improved a lot
> with the single-db move, but I haven't tested yet.

FWIW, I didn't see the memory go above ~48MB on my Mac when checking
out SVN trunk with subversion & serf trunk.  So, I'm not tempted to
pursue it further at this time.

Getting ra_serf in 1.6.x happy with current versions of serf is more
important (to me).  =)  -- justin


Re: [serf-dev] Re: Reduce the 1.7 release feature set a bit?

2010-08-20 Thread Justin Erenkrantz
On Fri, Aug 20, 2010 at 11:19 AM, Justin Erenkrantz
 wrote:
> On Fri, Aug 20, 2010 at 11:15 AM, Lieven Govaerts
>  wrote:
>> The memory usage problem is probably still there. Last time I looked
>> at it with valgrind (at the elego days), most of the memory was
>> allocated in the sqlite libraries. This has probably improved a lot
>> with the single-db move, but I haven't tested yet.
>
> FWIW, I didn't see the memory go above ~48MB on my Mac when checking
> out SVN trunk with subversion & serf trunk.  So, I'm not tempted to
> pursue it further at this time.
>
> Getting ra_serf in 1.6.x happy with current versions of serf is more
> important (to me).  =)  -- justin

1.6.x + r879757 backport group now works with serf trunk again.  All
'make check' tests (via davautocheck.sh) pass except:

FAIL:  merge_tests.py 133: replace that causes a tree-conflict

Considering everything else passes, I'll go out on a limb and say
that's likely not due to the serf version.  (I'm guessing it's a known
failure??)

Greg will likely cut a Serf release this weekend that will incorporate
r1401 from serf that will let the old-style authentication handlers in
1.6.x-era ra_serf work.  I guess it'll either be 0.7.0 or 0.6.2 - not
sure what his plans are - I just validated against serf trunk.  (If we
didn't incorporate r1401 in the new serf release, we'd have to
backport all of the authn changes in ra_serf from trunk and I'd rather
not do that.  ra_serf in SVN trunk already uses the new serf authn
framework.)

BTW, those who complain that no one's working on ra_serf hasn't looked
at the diffs between 1.6.x and trunk.  New authn framework, digest
auth, Kerberos auth, error handling fixes, better proxy support,
HTTPv2 support, etc.  Many many kudos to everyone driving ra_serf
forward!  *group hug*  -- justin


Re: svn commit: r987689 - /subversion/branches/1.6.x/STATUS

2010-08-22 Thread Justin Erenkrantz
On Sun, Aug 22, 2010 at 1:02 PM, Lieven Govaerts  wrote:
> Hey Justin,
>
> On Sat, Aug 21, 2010 at 4:30 AM,   wrote:
>> Author: jerenkrantz
>> Date: Sat Aug 21 02:30:31 2010
>> New Revision: 987689
>>
>> URL: http://svn.apache.org/viewvc?rev=987689&view=rev
>> Log:
>> Propose r879757 & r880320 for backport to 1.6.x.
>>
>> Modified:
>>    subversion/branches/1.6.x/STATUS
>>
>> Modified: subversion/branches/1.6.x/STATUS
>> URL: 
>> http://svn.apache.org/viewvc/subversion/branches/1.6.x/STATUS?rev=987689&r1=987688&r2=987689&view=diff
>> ==
>> --- subversion/branches/1.6.x/STATUS (original)
>> +++ subversion/branches/1.6.x/STATUS Sat Aug 21 02:30:31 2010
>> @@ -233,6 +233,17 @@ Candidate changes:
>>    Votes:
>>      +1: danielsh
>>
>> + * r879757, r880320
>> +   Let ra_serf work with current serf releases.
>> +   Justification:
>> +     Having a dud client is bad. This seems to be the minimal required 
>> changes.
>> +   Backport branch:
>> +     ^/subversion/branches/1.6.x-r879757
>> +   Notes:
>> +     r879757 is the main fix.  r880320 is a follow up fix.
>> +   Votes:
>> +     +1: jerenkrantz
>
> I didn't want to propose r879757 for backport because it changes the
> svn_ra_serf__conn_setup function declaration, which is used as a
> callback function for serf, in an incompatible way with serf 0.3. As
> long as one builds and runs svn with the same serf version there is no
> problem. The idea was to just raise the minimum serf version with svn
> 1.7 release, so this problem couldn't happen.
>
> Is this something we make promises about?

It has the ifdef so older serf's should work fine.  Or, am I missing
something?  Is the issue compiling against 0.4.x+ and running with
0.3.x+?  If so, I'm not sure that's worth worrying about.  (Serf
doesn't have a promise of binary API compatibility...)

The core issue that I'm trying to address in this backport is that we
don't have enough auto-fu checks currently in place to block
combinations of 1.6.x with current releases of serf.  We have no
version checks at configure-time - only at compile-time; and the
compile-time checks in 1.6.x don't complain if it sees a serf version
it doesn't know about.  So, right now, 1.6.x (without r879757) builds
"successfully", but due to the API mismatches, we get a dud client
with serf 0.4.0+ and 1.6.x.  -- justin


Re: Upgrade from 1.6 must use the same incremental steps? [was: svn commit: r987526 - ...]

2010-08-22 Thread Justin Erenkrantz
On Sun, Aug 22, 2010 at 11:14 AM, Daniel Shahaf  wrote:
> Greg Stein wrote on Fri, Aug 20, 2010 at 14:11:21 -0400:
>> I wish you wouldn't change the subject line so often.
>
> Why not?  It has proper References: and In-Reply-To:.  Should be enough
> for threading to work, no?

Not with Gmail - changes in subjects are treated as a new
conversation.  -- justin


Re: svn commit: r987689 - /subversion/branches/1.6.x/STATUS

2010-08-22 Thread Justin Erenkrantz
On Sun, Aug 22, 2010 at 7:25 PM, Greg Stein  wrote:
> I also added an API to serf to facilitate runtime version checks (ie.
> hopefully before a call to a bogus signature is performed). But that
> call is only in later versions :-P

Yup - I noticed that when I added the minimum version auto-fu check
today.  I'm not sure how critical it is to do that run-time check at
configure-time - I know that APR does it for BDB, but...

If we decide not to do the proposed backport, then 1.6.x might
actually need a *maximum* version check.  But, I'd rather just make
1.6.x work with a current serf instead...and probably force Subversion
to require 0.6.2/0.7.0 (depending upon what Greg rolls soon) as
0.4.0-0.6.1 won't work with 1.6.x, but 0.3.1 might.

My head hurts.  =)  -- justin


Re: svn commit: r987956 - /subversion/trunk/build/ac-macros/serf.m4

2010-08-23 Thread Justin Erenkrantz
On Mon, Aug 23, 2010 at 4:28 AM, Daniel Shahaf  wrote:
> Does this actually expand SERF_VERSION_STRING?  A quick independent test
> indicates it wouldn't...

Ah, you're right.  Hmm.  Any ideas?  -- justin


Re: Noise: spicy autoindex httpd.conf workaround #fail

2010-08-23 Thread Justin Erenkrantz
Why do you want mod_rewrite at all?Enabling that mod_mime
directive will let mod_autoindex remap / to /index.html when it exists
in the virtual filesystem.  -- justin

On Mon, Aug 23, 2010 at 10:48 AM, C. Michael Pilato  wrote:
> Hrm.  I'm not seeing the connection here.  I'm looking for a way to pull off
> a Rewrite condition based on the existence of a given URI.  The docs imply
> that this can be done with "RewriteCond SOME_URI -U", but appear to just be
> wrong -- the existence of SOME_URI doesn't appear to tested at all, only
> it's accessibility (from an authn/authz standpoint).
>
>
> On 08/20/2010 12:51 PM, Justin Erenkrantz wrote:
>> Based on what you describe, I think this is likely what you are looking for:
>>
>> http://httpd.apache.org/docs/2.0/mod/mod_mime.html#modmimeusepathinfo
>>
>> http://mail-archives.apache.org/mod_mbox/httpd-dev/200209.mbox/%3c20020904211343.ga16...@apache.org%3e
>> http://mail-archives.apache.org/mod_mbox/httpd-dev/200306.mbox/%3c2147483647.1054772...@[10.0.1.37]%3e
>>
>> HTH.  -- justin
>>
>> On Fri, Aug 20, 2010 at 1:27 AM, C. Michael Pilato  
>> wrote:
>>> [Warning:  This matter is far from highly pertinent.  One tackles strange
>>> non-problems when in an atypical environment, such as a hotel room in CA.]
>>>
>>> I had someone ask me about Subversion autoindex support.  So, like, you
>>> point a web browser at
>>> http://svn.apache.org/repos/asf/subversion/site/publish/ and *pow* magically
>>> you are now looking at the index.html inside that directory.
>>>
>>> Clearly, this could be done with an hour or two of mod_dav_svn hackery and
>>> some new directives there.  But I was trying to come up with an httpd.conf
>>> workaround that did the trick.  Here's what I tried.  (On my system, all my
>>> Subversion repositories live inside the /repos/ Location.)
>>>
>>>   # If this is a GET request (but not a subrequest) aimed at my
>>>   # collection of Subversion repositories and with a trailing slash, and
>>>   # if there exists an index.html file inside that directory, then
>>>   # temporarily redirect the browser to the index.html file.
>>>   RewriteEngine on
>>>   RewriteCond %{IS_SUBREQ}              false
>>>   RewriteCond %{REQUEST_METHOD}         GET
>>>   RewriteCond %{REQUEST_URI}            ^/repos/.*/$
>>>   RewriteCond %{REQUEST_URI}index.html  -U
>>>   RewriteRule /repos/(.*)/$             /repos/$1/index.html [L,R]
>>>
>>> The result was that for every directory in which an index.html was found,
>>> that file was served (via a browser redirect).  Yay!  Unfortunately, the
>>> redirect was transmitted for directories which had no index.html child, too.
>>>  Boo!
>>>
>>> Sadly, I found that despite the fact that the Apache docs say about that
>>> "-U" test the following:
>>>
>>>   '-U' (is existing URL, via subrequest)
>>>   Checks whether or not TestString is a valid URL, accessible via all
>>>   the server's currently-configured access controls for that path. This
>>>   uses an internal subrequest to do the check, so use it with care - it
>>>   can impact your server's performance!
>>>
>>> In reality "validity" in this context seems to have nothing to do with
>>> "existence".  I traced the subrequest that mod_rewrite made into Subversion,
>>> and found that it never enters mod_dav to actually perform an existence get.
>>>  I guess I expected that the subrequest would GET all the way into
>>> Subversion, where it would get the appropriate error code (HTTP_NOT_FOUND).
>>>  In retrospect, I think I knew that subrequests don't behavior like
>>> full-fledged content-fetching requests.  But the documentation quoted above
>>> is pretty misleading, at any rate, IMO.
>>>
>>> --
>>> C. Michael Pilato 
>>> CollabNet   <>   www.collab.net   <>   Distributed Development On Demand
>>>
>
>
> --
> C. Michael Pilato 
> CollabNet   <>   www.collab.net   <>   Distributed Development On Demand
>


httpd trunk broken with Subversion: ap_log_rerror busted

2010-08-24 Thread Justin Erenkrantz
In r951893, httpd modified a #define for APLOG_MARK to add in a new
parameter called APLOG_MODULE_INDEX (in addition to file and line
info).

This busts Subversion - specifically, mod_authz_svn which has a function called:

static void
log_access_verdict(const char *file, int line,
   const request_rec *r, int allowed,
   const char *repos_path, const char *dest_repos_path)

it is called with:

  log_access_verdict(APLOG_MARK, r, 1, repos_path, dest_repos_path);

Reading the very obtuse (unhelpful) commit log for r951893 as well as
any comments in that general area of http_log.h (which are lacking &
unhelpful), I have no idea what this APLOG_MODULE_INDEX is or why it's
even there.

Furthermore, this error case is very very hard to track down because
we're relying upon multiple levels of #define's and indirections
(hidden static variables??, weird STDC wrappers, redefinitions of
function names, etc, etc.).  So, it's not going to be obvious to
downstream developers what is going on and why it is broken.  In
general, I'm not quite sure it's a good idea to bust a #define that
has been the same since the 1.3 days (if not earlier actually).  If we
want modules to use a new optimized log API, then we should introduce
a new optimized log API and not bust the old one in a way that the
failure case isn't obvious to track down.

Yes, we could fix this by making mod_authz_svn conditional on the new
MMN, but - again, it's about even figuring out that the API is changed
and what to do about it.  The root cause is just way too obscured,
IMO.

(I wish I could veto this whole commit just on over-complication alone
- it's not an elegant solution at all.)

My $.02.  -- justin


Re: httpd trunk broken with Subversion: ap_log_rerror busted

2010-08-25 Thread Justin Erenkrantz
On Wed, Aug 25, 2010 at 4:48 AM, Stefan Fritsch  wrote:
> I agree that the comments/documentation should be improved. I will write a
> how-to for adjusting modules to the new API.

Here is a constructive suggestion (*grin*): in APR, for some of the
more complex declarations (see apr_pools.h and apr_pool_create), what
we have done is to create a #ifdef DOXYGEN tag which indicates the
normalized reduced form of the function declaration.  The #else clause
contains the optimized, macro-ized version.  (In httpd, I don't know
what the appropriate #ifdef should be though.)

This way those reading http_log.h will be able to see what the
declaration for ap_log_rerror actually reduces to without trying to go
through several layers of indirection.  -- justin


Re: [serf-dev] Re: serf 0.7.0 released

2010-08-25 Thread Justin Erenkrantz
On Wed, Aug 25, 2010 at 2:45 PM, Blair Zajac  wrote:
> Does it change the ABI?  Do we need to wait for the next svn 1.6.x release
> before we can use it?  I'm asking because I maintain the MacPorts serf
> package and don't want to break existing installs.

Subversion 1.6.12 is busted with serf 0.6.1 anyway.  =(

You'd need the next release of Subversion 1.6.x (assuming the pending
backport is approved) to make 1.6.x work in combination with 0.7.0.
-- justin


Re: [serf-dev] serf 0.7.0 released

2010-08-25 Thread Justin Erenkrantz
On Wed, Aug 25, 2010 at 2:16 PM, Greg Stein  wrote:
> Hi all,
>
> I'm pleased to announce the 0.7.0 release of serf.

Thanks!  -- justin


Re: [serf-dev] Re: serf 0.7.0 released

2010-08-25 Thread Justin Erenkrantz
On Wed, Aug 25, 2010 at 4:01 PM, Peter Samuelson  wrote:
> Yeah but presumably he's asking because he's linking libsvn_ra_serf to
> serf 0.3.x now, so the question remains: is 0.7.0 ABI-compatible with
> 0.3.x?  As far as I know, 0.3 is ABI-compatible with 0.2 and 0.1.

Macports is at 0.6.1.  IOW, it's already busted now.  =(  -- justin


Re: [serf-dev] Re: serf 0.7.0 released

2010-08-25 Thread Justin Erenkrantz
On Wed, Aug 25, 2010 at 4:24 PM, Blair Zajac  wrote:
> On 8/25/10 4:16 PM, Justin Erenkrantz wrote:
>>
>> On Wed, Aug 25, 2010 at 4:01 PM, Peter Samuelson  wrote:
>>>
>>> Yeah but presumably he's asking because he's linking libsvn_ra_serf to
>>> serf 0.3.x now, so the question remains: is 0.7.0 ABI-compatible with
>>> 0.3.x?  As far as I know, 0.3 is ABI-compatible with 0.2 and 0.1.
>>
>> Macports is at 0.6.1.  IOW, it's already busted now.  =(  -- justin
>
> Busted checkouts and operations are better than core dumps :)

Subversion 1.6.12 and serf 0.6.1 from Macports doesn't even do any
type of a checkout - it always errors out immediately.

Backing up, there are three faults:

 - API/ABI compatibility - this was broken way back in 0.4.0 - no
current 1.6.x release of Subversion can work with anything from serf
0.4.0 or higher.  You need the proposed backport to Subversion
(1.6.x-r879757) to make 1.6.x work with the serf 0.4.0+ API.

 - Authn framework fix - this is fixed in 0.7.0.

 - Memory usage - this is also fixed in 0.7.0.

1.7.x/trunk of Subversion could work with 0.4.0-0.6.1 - this is what
Stefan and others reported the memory leaks - but Subversion has never
officially released a version that works with serf 0.4.0+.

> Can I get an answer to the question?

I'm not sure there is any API/ABI changes between 0.6.1 and 0.7.0.
Greg?  Lieven?  -- justin


Re: Worried about single-db performance

2010-09-03 Thread Justin Erenkrantz
On Fri, Sep 3, 2010 at 8:39 AM, Greg Stein  wrote:
> It "should" already be faster. Obviously, that's not the case.

I just spent a little bit time with Shark and gdb.  A cold run of 'svn
st' against Subversion trunk checkouts for 1.6 yields 0.402 seconds
and 1.7 is 0.919 seconds.  Hot runs for 1.6 are about 0.055 seconds
with 1.7 at 0.750 seconds.

One striking difference in the perf profile between 1.6 & trunk is
that we seem to do a larger amount of stat() calls in 1.7.

>From looking at the traces and code, I *think*
svn_wc__db_pdh_parse_local_abspath's call to svn_io_check_special_path
may be in play here:

  /* ### at some point in the future, we may need to find a way to get
 ### rid of this stat() call. it is going to happen for EVERY call
 ### into wc_db which references a file. calls for directories could
 ### get an early-exit in the hash lookup just above.  */
  SVN_ERR(svn_io_check_special_path(local_abspath, &kind,
&special /* unused */, scratch_pool));

I see this stat() call getting called approximately seven times *per*
file in a WC - that can't be good.  The patch below brings us down to
one invocation of this particular svn_io_check_special_path call per
status run.  (If anyone would like the stacktraces, I can send 'em
along.  If we could somehow stop asking for this seven times per file,
that'd be good too.  *grin*)

Shark says this patch saves us about 2.5% time-wise - not a whole lot
- but, something like this is likely needed to relieve pressure on the
I/O subsystem.  (I haven't rigorously tested this patch to see if it
breaks anything else - hope someone more familiar with libsvn_wc will
see if this is useful.)

I'll keep looking through the timing profiles to see what else I can
uncover.  -- justin

---

Remember WC files we've seen before to save on stat() calls.

* subversion/libsvn_wc/wc_db_private.h
  (svn_wc__db_t): Add a hash table to remember file entries.
* subversion/libsvn_wc/wc_db_pdh.c
  (svn_wc__db_open): Create file_data hash.
  (svn_wc__db_pdh_parse_local_abspath): Use hash to save stat() lookups.

Index: wc_db_pdh.c
===
--- wc_db_pdh.c (revision 992534)
+++ wc_db_pdh.c (working copy)
@@ -216,6 +216,7 @@ svn_wc__db_open(svn_wc__db_t **db,
   (*db)->auto_upgrade = auto_upgrade;
   (*db)->enforce_empty_wq = enforce_empty_wq;
   (*db)->dir_data = apr_hash_make(result_pool);
+  (*db)->file_data = apr_hash_make(result_pool);
   (*db)->state_pool = result_pool;

   return SVN_NO_ERROR;
@@ -424,10 +425,21 @@ svn_wc__db_pdh_parse_local_abspath(svn_wc__db_pdh_
   return SVN_NO_ERROR;
 }

-  /* ### at some point in the future, we may need to find a way to get
- ### rid of this stat() call. it is going to happen for EVERY call
- ### into wc_db which references a file. calls for directories could
- ### get an early-exit in the hash lookup just above.  */
+  /* Have we already successfully seen this file before? */
+  *pdh = apr_hash_get(db->file_data, local_abspath, APR_HASH_KEY_STRING);
+  if (*pdh != NULL && (*pdh)->wcroot != NULL) {
+  const char *local_abspath_parent, *dir_relpath;
+  svn_dirent_split(&local_abspath_parent, &build_relpath, local_abspath,
+   scratch_pool);
+
+  /* Stashed directory's local_relpath + basename. */
+  dir_relpath = svn_wc__db_pdh_compute_relpath(*pdh, NULL);
+  *local_relpath = svn_relpath_join(dir_relpath,
+build_relpath,
+result_pool);
+  return SVN_NO_ERROR;
+  }
+
   SVN_ERR(svn_io_check_special_path(local_abspath, &kind,
 &special /* unused */, scratch_pool));
   if (kind != svn_node_dir)
@@ -440,7 +452,8 @@ svn_wc__db_pdh_parse_local_abspath(svn_wc__db_pdh_
  For both of these cases, strip the basename off of the path and
  move up one level. Keep record of what we strip, though, since
  we'll need it later to construct local_relpath.  */
-  svn_dirent_split(&local_abspath, &build_relpath, local_abspath,
+  const char *local_abspath_parent;
+  svn_dirent_split(&local_abspath_parent, &build_relpath, local_abspath,
scratch_pool);

   /* ### if *pdh != NULL (from further above), then there is (quite
@@ -448,7 +461,8 @@ svn_wc__db_pdh_parse_local_abspath(svn_wc__db_pdh_
  ### clear it out? but what if there is an access baton?  */

   /* Is this directory in our hash?  */
-  *pdh = apr_hash_get(db->dir_data, local_abspath, APR_HASH_KEY_STRING);
+  *pdh = apr_hash_get(db->dir_data, local_abspath_parent,
+  APR_HASH_KEY_STRING);
   if (*pdh != NULL && (*pdh)->wcroot != NULL)
 {
   const char *dir_relpath;
@@ -458,6 +472,8 @@ svn_wc__db_pdh_parse_local_abspath(svn_wc__db_pdh_
   *local_relpath = svn_relpath_join(dir_relpath,

Re: Worried about single-db performance

2010-09-04 Thread Justin Erenkrantz
On Sat, Sep 4, 2010 at 2:23 AM, Bert Huijben  wrote:
> SQLite also does a stat call per statement, unless there is already a shared
> lock open, just to check if there is no other process that opened a
> transaction.
> (On Windows this specific stat to check for other processes operating on the
> same db is the performance killer for svn status: Just this stat takes more
> than 50% of the total processing).

Aha.  Adding exclusive locking into our pragma
[http://www.sqlite.org/pragma.html] calls in "svn_sqlite__open":

"PRAGMA locking_mode=exclusive;"

brings the time for "svn st" down from 0.680 to 0.310 seconds.  And,
yes, the I/O percentages drop dramatically:

~/Applications/svn-trunk/bin/svn st  0.37s user 0.31s system 99% cpu 0.684 total
~/Applications/svn-trunk/bin/svn st  0.26s user 0.05s system 98% cpu 0.308 total

I *think* we'd be okay with having Sqlite holding its read/write locks
for the duration of our database connection rather than constantly
giving it up and refetching it between every read and write operation.
 As I read the sqlite docs, we should still be able to have shared
readers in this model - but, it'd create the case where there were
idle shared readers (due to network I/O?) would block an attempted
writer.  With a normal locking mode, a writer could intercept a reader
if it were idle and not active.  However, I'd think our other locks
would interfere with this anyway...so I think it'd be safe.

Thoughts?  -- justin


Repeated SQL queries when doing 'svn st'

2010-09-04 Thread Justin Erenkrantz
When compiled with SVN_DEBUG and SQLITE3_DEBUG and 'svn st' against a
svn trunk WC, a number of things pop out.

We perform 28,062 SQL queries.

---
DBG: sqlite.c:  63: sql="select root, uuid from repository where id = 1;"
---

We execute *this* query (STMT_SELECT_REPOSITORY_BY_ID) 2215 times.  Yikes.

I think this has to do with svn_wc__db_base_get_info's call to
fetch_repos_info.  I'd think we'd be able to cache this result.  I'll
take a stab and see if this reduction saves us any real time.  The
root and uuid should be constant for an wc_id...right?

For each file that we hit the DB for, we execute the following queries:

---
DBG: sqlite.c:  63: sql="select repos_id, repos_relpath, presence,
kind, revnum, checksum,   translated_size, changed_rev, changed_date,
changed_author, depth,   symlink_target, last_mod_time, properties
from base_node where wc_id = 1 and local_relpath =
'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="select presence, kind, checksum,
translated_size,   changed_rev, changed_date, changed_author, depth,
symlink_target,   copyfrom_repos_id, copyfrom_repos_path,
copyfrom_revnum,   moved_here, moved_to, last_mod_time, properties
from working_node where wc_id = 1 and local_relpath =
'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="select prop_reject, changelist, conflict_old,
conflict_new, conflict_working, tree_conflict_data, properties from
actual_node where wc_id = 1 and local_relpath =
'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="select base_node.repos_id,
base_node.repos_relpath, presence, kind,   revnum, checksum,
translated_size, changed_rev, changed_date,   changed_author, depth,
symlink_target, last_mod_time, properties,   lock_token, lock_owner,
lock_comment, lock_date from base_node left outer join lock on
base_node.repos_id = lock.repos_id   and base_node.repos_relpath =
lock.repos_relpath where wc_id = 1 and local_relpath =
'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="select presence, kind, checksum,
translated_size,   changed_rev, changed_date, changed_author, depth,
symlink_target,   copyfrom_repos_id, copyfrom_repos_path,
copyfrom_revnum,   moved_here, moved_to, last_mod_time, properties
from working_node where wc_id = 1 and local_relpath =
'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="select prop_reject, changelist, conflict_old,
conflict_new, conflict_working, tree_conflict_data, properties from
actual_node where wc_id = 1 and local_relpath =
'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="select root, uuid from repository where id = 1;"
DBG: sqlite.c:  63: sql="select prop_reject, changelist, conflict_old,
conflict_new, conflict_working, tree_conflict_data, properties from
actual_node where wc_id = 1 and local_relpath =
'contrib/server-side';"
DBG: sqlite.c:  63: sql="SELECT properties, presence FROM WORKING_NODE
WHERE wc_id = 1 AND local_relpath =
'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="select properties from base_node where wc_id
= 1 and local_relpath = 'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="select properties from actual_node where
wc_id = 1 and local_relpath = 'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="SELECT properties, presence FROM WORKING_NODE
WHERE wc_id = 1 AND local_relpath =
'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c:  63: sql="select properties from base_node where wc_id
= 1 and local_relpath = 'contrib/server-side/fsfsverify.py';"
---

Notably, AFAICT, we're repeating a few of these queries:

- STMT_SELECT_WORKING_NODE (2 times)
- STMT_SELECT_ACTUAL_NODE (3 times)
- STMT_SELECT_WORKING_PROPS (2 times)
- STMT_SELECT_BASE_PROPS (2 times)

I haven't yet dug into why we're repeating the queries.

So, I'd bet we can cut our volume of SQL calls dramatically with some
minor rejiggering not to lose the values when we do the first calls of
the statement.  -- justin


Re: Repeated SQL queries when doing 'svn st'

2010-09-04 Thread Justin Erenkrantz
On Sat, Sep 4, 2010 at 10:18 AM, Justin Erenkrantz
 wrote:
> When compiled with SVN_DEBUG and SQLITE3_DEBUG and 'svn st' against a
> svn trunk WC, a number of things pop out.
>
> We perform 28,062 SQL queries.
>
> ---
> DBG: sqlite.c:  63: sql="select root, uuid from repository where id = 1;"
> ---
>
> We execute *this* query (STMT_SELECT_REPOSITORY_BY_ID) 2215 times.  Yikes.
>
> I think this has to do with svn_wc__db_base_get_info's call to
> fetch_repos_info.  I'd think we'd be able to cache this result.  I'll
> take a stab and see if this reduction saves us any real time.  The
> root and uuid should be constant for an wc_id...right?

It's actually svn_wc__db_read_info's fetch_repos_info call...

With 2215 queries: ~/Applications/svn-trunk-no-debug/bin/svn st  0.26s
user 0.05s system 98% cpu 0.311 total
With a quick-and-hacky cache:
~/Applications/svn-trunk-no-debug/bin/svn st  0.25s user 0.05s system
98% cpu 0.298 total

It's worth a good 4% time savings...

A quick back-of-the-envelope calculation says that if we can remove
all of the extraneous 13,290 SQL queries (out of 28,062 ; leaving
behind 14,772 queries) - we will likely gain something like 25% from
the 0.311 down to around 0.233 seconds.

It's still much higher than 0.050 than 'svn st' on 1.6.x yields, but
inching closer...  -- justin


Re: Repeated SQL queries when doing 'svn st'

2010-09-04 Thread Justin Erenkrantz
On Sat, Sep 4, 2010 at 10:18 AM, Justin Erenkrantz
 wrote:
> Notably, AFAICT, we're repeating a few of these queries:
>
> - STMT_SELECT_WORKING_NODE (2 times)
> - STMT_SELECT_ACTUAL_NODE (3 times)
> - STMT_SELECT_WORKING_PROPS (2 times)
> - STMT_SELECT_BASE_PROPS (2 times)
>
> I haven't yet dug into why we're repeating the queries.

Okay - I now know why we're repeating the core queries twice.

In get_dir_status, we want to do a check to identify if the node
exists and what kind it is - which is done by a call to
svn_wc__db_read_info (around line 1269 in status.c).  But, most of the
parameters (except for node and kind) are NULL.  If it's not excluded
and we can go into the depth, then we call handle_dir_entry on the
entry a few lines down - which turns right around and calls
svn_wc__db_read_info - this time asking for everything.

This causes the core per-file queries to be executed twice.

I'm going to see what a quick check to retrieve just the kind and
status will do for the query volume.  I think it's unlikely we have to
pull everything out of sqlite to answer that basic question.  --
justin


Re: Repeated SQL queries when doing 'svn st'

2010-09-04 Thread Justin Erenkrantz
On Sat, Sep 4, 2010 at 12:45 PM, Justin Erenkrantz
 wrote:
> I'm going to see what a quick check to retrieve just the kind and
> status will do for the query volume.  I think it's unlikely we have to
> pull everything out of sqlite to answer that basic question.  --
> justin

We can reduce the query volume by one (per file) as we can skip the
active table query - see quick & dirty patch below.

It does replace 2 larger queries with 2 smaller ones, but I think our
bottleneck is likely more around query volume than query complexity...

Anyway, I'll stop replying to myself and enjoy the long weekend.  =)  -- justin

Index: status.c
===
--- status.c(revision 992534)
+++ status.c(working copy)
@@ -1263,10 +1270,7 @@ get_dir_status(const struct walk_status_baton *wb,
   svn_wc__db_status_t node_status;
   svn_wc__db_kind_t node_kind;

-  SVN_ERR(svn_wc__db_read_info(&node_status, &node_kind, NULL, NULL,
-   NULL, NULL, NULL, NULL, NULL, NULL, NULL,
-   NULL, NULL, NULL, NULL, NULL, NULL, NULL,
-   NULL, NULL, NULL, NULL, NULL, NULL,
+  SVN_ERR(svn_wc__db_read_info_exist(&node_status, &node_kind,
wb->db, node_abspath, iterpool, iterpool));

   if (node_status != svn_wc__db_status_not_present
Index: wc_db.c
===
--- wc_db.c (revision 992534)
+++ wc_db.c (working copy)
@@ -51,7 +51,6 @@
 #include "private/svn_wc_private.h"
 #include "private/svn_token.h"

-
 #define NOT_IMPLEMENTED() SVN__NOT_IMPLEMENTED()


@@ -5051,6 +5050,161 @@ svn_wc__db_temp_op_delete(svn_wc__db_t *db,


 svn_error_t *
+svn_wc__db_read_info_exist(svn_wc__db_status_t *status,
+ svn_wc__db_kind_t *kind,
+ svn_wc__db_t *db,
+ const char *local_abspath,
+ apr_pool_t *result_pool,
+ apr_pool_t *scratch_pool)
+{
+  svn_wc__db_pdh_t *pdh;
+  const char *local_relpath;
+  svn_sqlite__stmt_t *stmt_base;
+  svn_sqlite__stmt_t *stmt_work;
+  svn_boolean_t have_base;
+  svn_boolean_t have_work;
+  svn_error_t *err = NULL;
+
+  SVN_ERR_ASSERT(svn_dirent_is_absolute(local_abspath));
+
+  SVN_ERR(svn_wc__db_pdh_parse_local_abspath(&pdh, &local_relpath, db,
+  local_abspath, svn_sqlite__mode_readonly,
+  scratch_pool, scratch_pool));
+  VERIFY_USABLE_PDH(pdh);
+
+  SVN_ERR(svn_sqlite__get_statement(&stmt_base, pdh->wcroot->sdb,
+STMT_SELECT_BASE_NODE_EXIST));
+  SVN_ERR(svn_sqlite__bindf(stmt_base, "is",
+pdh->wcroot->wc_id, local_relpath));
+  SVN_ERR(svn_sqlite__step(&have_base, stmt_base));
+
+  SVN_ERR(svn_sqlite__get_statement(&stmt_work, pdh->wcroot->sdb,
+STMT_SELECT_WORKING_NODE_EXIST));
+  SVN_ERR(svn_sqlite__bindf(stmt_work, "is",
+pdh->wcroot->wc_id, local_relpath));
+  SVN_ERR(svn_sqlite__step(&have_work, stmt_work));
+
+  if (have_base || have_work)
+{
+  svn_wc__db_kind_t node_kind;
+
+  if (have_work)
+node_kind = svn_sqlite__column_token(stmt_work, 1, kind_map);
+  else
+node_kind = svn_sqlite__column_token(stmt_base, 1, kind_map);
+
+  if (status)
+{
+  if (have_base)
+{
+  *status = svn_sqlite__column_token(stmt_base, 0, presence_map);
+
+  /* We have a presence that allows a WORKING_NODE override
+ (normal or not-present), or we don't have an override.  */
+  /* ### for now, allow an override of an incomplete BASE_NODE
+ ### row. it appears possible to get rows in BASE/WORKING
+ ### both set to 'incomplete'.  */
+  SVN_ERR_ASSERT((*status != svn_wc__db_status_absent
+  && *status != svn_wc__db_status_excluded
+  /* && *status != svn_wc__db_status_incomplete */)
+ || !have_work);
+
+#ifndef SVN_WC__SINGLE_DB
+  if (node_kind == svn_wc__db_kind_subdir
+  && *status == svn_wc__db_status_normal)
+{
+  /* We should have read a row from the subdir wc.db. It
+ must be obstructed in some way.
+
+ It is also possible that a WORKING node will override
+ this value with a proper status.  */
+  *status = svn_wc__db_status_obstructed;
+}
+#end

Autoconf warnings on trunk

2010-11-01 Thread Justin Erenkrantz
I'm getting a slew of warnings when I run autogen.sh on trunk.
(autoconf 2.68 via Macports.)

configure.ac:119: warning: AC_LANG_CONFTEST: no AC_LANG_SOURCE call
detected in body
../../lib/autoconf/lang.m4:194: AC_LANG_CONFTEST is expanded from...
../../lib/autoconf/general.m4:2662: _AC_LINK_IFELSE is expanded from...
../../lib/autoconf/general.m4:2679: AC_LINK_IFELSE is expanded from...
build/ac-macros/neon.m4:72: SVN_NEON_CONFIG is expanded from...
../../lib/m4sugar/m4sh.m4:606: AS_IF is expanded from...
../../lib/autoconf/general.m4:1482: AC_ARG_WITH is expanded from...
build/ac-macros/neon.m4:39: SVN_LIB_NEON is expanded from...
configure.ac:119: the top level
configure.ac:135: warning: AC_LANG_CONFTEST: no AC_LANG_SOURCE call
detected in body
../../lib/autoconf/lang.m4:194: AC_LANG_CONFTEST is expanded from...
../../lib/autoconf/general.m4:2662: _AC_LINK_IFELSE is expanded from...
../../lib/autoconf/general.m4:2679: AC_LINK_IFELSE is expanded from...
build/ac-macros/gssapi.m4:26: SVN_LIB_RA_SERF_GSSAPI is expanded from...
configure.ac:135: the top level
...more warnings...

I've tried to wrap the calls to AC_LINK_IFELSE like so:

Index: build/ac-macros/neon.m4
===
--- build/ac-macros/neon.m4 (revision 1029788)
+++ build/ac-macros/neon.m4 (working copy)
@@ -113,7 +113,7 @@
 #include 
 int main()
 {ne_xml_create(); ne_decompress_destroy(NULL);}"
-  AC_LINK_IFELSE([$neon_test_code], shared_linking="yes",
shared_linking="no")
+  AC_LINK_IFELSE(AC_LANG_SOURCE([$neon_test_code]),
shared_linking="yes", shared_linking="no")
   if test "$shared_linking" = "no"; then
 NEON_LIBS=`$PKG_CONFIG neon --libs --static`
 LIBS="$LIBS $NEON_LIBS"

But, that doesn't seem to be enough.  Also tried [[ ]]s within
AC_LANG_SOURCE and no dice.

Any idea what I'm missing?  Is this some lame autoconf bug?  -- justin


Re: Autoconf warnings on trunk

2010-11-01 Thread Justin Erenkrantz
On Mon, Nov 1, 2010 at 11:49 AM, Mark Phippard  wrote:
> On Mon, Nov 1, 2010 at 2:47 PM, Justin Erenkrantz  
> wrote:
>> I'm getting a slew of warnings when I run autogen.sh on trunk.
>> (autoconf 2.68 via Macports.)
>
> Cool, it is not just me!
>
> I recently reinstalled MacPorts and thought I did something wrong.

Yah, even though I haven't been able to figure it out, I didn't see
any mention of these warnings in the archives, so I thought it'd be
worth posting even though I'm a bit stumped at the moment.  -- justin


Re: Autoconf warnings on trunk

2010-11-01 Thread Justin Erenkrantz
On Mon, Nov 1, 2010 at 12:00 PM, Philip Martin
 wrote:
> The documentation suggests another '[' is needed:
>
>                 AC_LINK_IFELSE([AC_LANG_SOURCE

r1029802.  (also tested with stock 2.59 as well...seems okay.)

Yah, I somehow missed needing the leading [.

m4 as a language blows.  =)  -- justin


Re: What stands between us and branching 1.7?

2011-01-13 Thread Justin Erenkrantz
On Thu, Jan 6, 2011 at 1:04 PM, C. Michael Pilato  wrote:
> On 01/06/2011 03:48 PM, Stefan Küng wrote:
>> On 06.01.2011 21:41, C. Michael Pilato wrote:
>>> I'm sorry if I asked this before -- I've been asking individual folks for
>>> over a month now, but I can't quickly find a public broadcast thread about
>>> it, at least -- but I've been wondering lately:
>>>
>>>    What, exactly, stands in the way of us branching for 1.7 stabilization?
>>>
>>> ra_serf stabilization?  No... that's fairly well taken care of, and would
>>> fit perfectly within in the scope of post-branch work anyway.
>>
>> At least on Windows, I doubt that ra_serf is even used right now. Because of
>> the huge memory leak serf has/had (See here:
>> http://code.google.com/p/serf/source/detail?r=1416). But even though the
>> leak is fixed, there hasn't been another release yet.
>> With the latest release without that fix, serf is not usable at all.
>> To get more people to test ra_serf, serf itself first needs a new release
>> which includes that fix.
>>
>> Stefan
>>
>
> Xlnt feedback.  I've noted this on our roadmap.html page.  What else?

We could cut a serf bug fix release that has a few fixes that Lieven
committed shortly after we released 0.7.0.

Greg or Lieven, any thoughts here?  -- justin


Re: [serf-dev] Re: What stands between us and branching 1.7?

2011-01-20 Thread Justin Erenkrantz
On Thu, Jan 20, 2011 at 3:18 PM, Lieven Govaerts  wrote:
>> Greg or Lieven, any thoughts here?  -- justin
>
> At least the one rev that fixes this issue, don't know if the other are
> already working in all scenario's.
> I'll look at it this weekend and make a release.

Woohoo.  Thanks.  I'll be sure to test it.  Let me know if you need
any help.  -- justin


Re: [serf-dev] Re: What stands between us and branching 1.7?

2011-01-23 Thread Justin Erenkrantz
On Sun, Jan 23, 2011 at 5:41 AM, Lieven Govaerts  wrote:
> I'd appreciate some more testing, before I make the release later this week.
> The code to get is:
> http://serf.googlecode.com/svn/branches/0.7.x at r1427.

% uname -v
Darwin Kernel Version 10.5.0: Fri Nov  5 23:20:39 PDT 2010;
root:xnu-1504.9.17~1/RELEASE_I386
(OS X 10.6.5)
% ln -s ~/work/serf-0.7.x/test/serftestca.pem test/serftestca.pem
(I use VPATH builds, so had to bring over serftestca.pem into the
build dir to get the 2 SSL tests to pass)
% ./test/test_all
.

OK (17 tests)

And, 0.7.x builds fine against latest SVN 1.6.x branches and basic
co/ls functionality via ra_serf seems good.

Looks good!  Thanks!  -- justin


Re: [serf-dev] Re: What stands between us and branching 1.7?

2011-01-23 Thread Justin Erenkrantz
On Sun, Jan 23, 2011 at 8:42 AM, Stefan Küng  wrote:
> Ok, tested with serf from the 0.7.x branch: memory rise is still higher than
> with neon, indicating that there's still some (small) memory leak somewhere.
> But checkouts and updates of even larger projects succeed without using up
> all available memory as before.
> To compare: checkout of the TSVN source including all externals with serf
> uses up about 40MB more RAM than when using neon. I'd say that's ok.

Does the memory keep going up?  Or, does it reach a steady point?  I'd
expect that ra_serf would use a bit more memory than ra_neon as it has
to manage a lot more than what neon has to do.

As a point of reference, on Mac OS X 10.6, svn 1.6.x with ra_serf
checking out svn trunk peaks at 81MB, while ra_neon peaks at 12MB.

HTH.  -- justin


Tuning recommendations for Apache HTTP Server w/ra_serf clients (MaxKeepAliveRequests)

2011-01-23 Thread Justin Erenkrantz
I think this probably belongs in the book, but I don't know exactly
where...so I'll ensure that it's noted again and let someone else
figure out where it should live.

As we start to have more users with ra_serf, we should also recommend
that httpd server admins bump up their MaxKeepAliveRequests from 100
(default) to at least 1000.  This'll help ra_serf utilize TCP
connections more efficiently - as ra_serf issues 2 HTTP requests per
file that it needs to check out or update.

ra_serf has heuristics so that it'll work gracefully under the default
settings (and determine what the value actually is), but you'll get
better network performance if you can utilize fewer TCP connections.
This way, ra_serf will only need to open a new connection every 500
files rather than every 50.  (Honestly, there is really no reason that
it couldn't be 1.)

If we also have a tuning section, we should remind folks to also
enable mod_deflate (SetOutputFilter DEFLATE in their Location block)
as that'll help a bit when transferring XML content back and forth.
mod_deflate will come at a slight latency penalty, but that'll be
offset for folks with slower connections taking less time to transfer
the responses overall.

FWIW, I've already asked ASF infra to bump up the MaxKeepAliveRequests
settings for svn.apache.org.

Thanks!  -- justin


Re: [serf-dev] Re: What stands between us and branching 1.7?

2011-01-23 Thread Justin Erenkrantz
On Sun, Jan 23, 2011 at 10:22 AM, Lieven Govaerts  wrote:
> Things have changed since then though. Can anyone test with svn 1.6.x to see
> how it uses memory?

For ra_serf, I'm wondering if we're creating an additional pool that
isn't necessary - namely the editor_pool.

I've done some light testing with the ra_serf patch below, which does
the following things:
 - Create a separate pool hierarchy for files
 - Combines the per-file editor pool with the regular pool for file

For serf itself, I've also switched zlib to use the bucket allocator -
which also helps which churn - as every call to inflateInit/inflateEnd
invokes malloc/free - so we can save on that quite bit.

Overall, this seems to reduce the amount of 20KB allocations
substantially...but not sure it does much to the overall memory usage.

Not sure if/when I'll pick this up again...so patches below if someone
else wants to run with it.  -- justin

Index: update.c
===
--- update.c(revision 1062462)
+++ update.c(working copy)
@@ -107,6 +107,7 @@
   /* controlling dir baton - this is only created in open_dir() */
   void *dir_baton;
   apr_pool_t *dir_baton_pool;
+  apr_pool_t *file_pool;

   /* Our master update editor and baton. */
   const svn_delta_editor_t *update_editor;
@@ -202,9 +203,6 @@
   /* The properties for this file */
   apr_hash_t *props;

-  /* pool passed to update->add_file, etc. */
-  apr_pool_t *editor_pool;
-
   /* controlling file_baton and textdelta handler */
   void *file_baton;
   const char *base_checksum;
@@ -403,7 +401,7 @@
   report_info_t *new_info;

   new_info = apr_pcalloc(info_parent_pool, sizeof(*new_info));
-  apr_pool_create(&new_info->pool, info_parent_pool);
+  apr_pool_create(&new_info->pool, info->dir->file_pool);
   new_info->file_baton = NULL;
   new_info->lock_token = NULL;
   new_info->fetch_file = FALSE;
@@ -500,6 +498,7 @@
   if (dir->base_name[0] == '\0')
 {
   apr_pool_create(&dir->dir_baton_pool, dir->pool);
+  apr_pool_create(&dir->file_pool, dir->pool);

   if (dir->report_context->destination &&
   dir->report_context->sess->wc_callbacks->invalidate_wc_props)
@@ -519,6 +518,7 @@
   SVN_ERR(open_dir(dir->parent_dir));

   apr_pool_create(&dir->dir_baton_pool, dir->parent_dir->dir_baton_pool);
+  dir->file_pool = dir->parent_dir->file_pool;

   if (SVN_IS_VALID_REVNUM(dir->base_rev))
 {
@@ -632,7 +632,7 @@
   if (lock_val)
 {
   char *new_lock;
-  new_lock = apr_pstrdup(info->editor_pool, lock_val);
+  new_lock = apr_pstrdup(info->pool, lock_val);
   apr_collapse_spaces(new_lock, new_lock);
   lock_val = new_lock;
 }
@@ -641,7 +641,7 @@
 {
   svn_string_t *str;

-  str = svn_string_ncreate("", 1, info->editor_pool);
+  str = svn_string_ncreate("", 1, info->pool);

   svn_ra_serf__set_ver_prop(info->dir->removed_props, info->base_name,
 info->base_rev, "DAV:", "lock-token",
@@ -756,13 +756,11 @@
   return error_fetch(request, fetch_ctx, err);
 }

-  apr_pool_create(&info->editor_pool, info->dir->dir_baton_pool);
-
   /* Expand our full name now if we haven't done so yet. */
   if (!info->name)
 {
   info->name_buf = svn_stringbuf_dup(info->dir->name_buf,
- info->editor_pool);
+ info->pool);
   svn_path_add_component(info->name_buf, info->base_name);
   info->name = info->name_buf->data;
 }
@@ -772,7 +770,7 @@
   err = info->dir->update_editor->open_file(info->name,
 info->dir->dir_baton,
 info->base_rev,
-info->editor_pool,
+info->pool,
 &info->file_baton);
 }
   else
@@ -781,7 +779,7 @@
info->dir->dir_baton,
info->copyfrom_path,
info->copyfrom_rev,
-   info->editor_pool,
+   info->pool,
&info->file_baton);
 }

@@ -792,7 +790,7 @@

   err = info->dir->update_editor->apply_textdelta(info->file_baton,
   info->base_checksum,
-  info->editor_pool,
+  info->pool,
   &info->textdelta,
 

Re: Tuning recommendations for Apache HTTP Server w/ra_serf clients (MaxKeepAliveRequests)

2011-01-23 Thread Justin Erenkrantz
On Sun, Jan 23, 2011 at 11:26 AM, Stefan Sperling  wrote:
> On Sun, Jan 23, 2011 at 09:27:18AM -0800, Justin Erenkrantz wrote:
>> If we also have a tuning section, we should remind folks to also
>> enable mod_deflate (SetOutputFilter DEFLATE in their Location block)
>> as that'll help a bit when transferring XML content back and forth.
>> mod_deflate will come at a slight latency penalty, but that'll be
>> offset for folks with slower connections taking less time to transfer
>> the responses overall.
>
> Has this problem been fixed, then?
> http://svn.haxx.se/dev/archive-2009-08/0274.shtml
> I don't think we should recommend mod_deflate if that problem still exists.

No, it doesn't look like it has been fixed yet.

However, how many real-world HTTP clients are out there that don't
speak zlib?  =)

> I don't think that thread came to a conclusion.
> Greg was implying the bug was in mod_deflate rather than svn:
> http://svn.haxx.se/dev/archive-2009-08/0290.shtml

Greg is correct in that the patch doesn't go far enough - mod_dav_svn
should change all uses of output to be a ** rather than *.

As far as mod_deflate being broken, yah, it probably shouldn't be
trying to do any memory usage until it knows it is
activated...but...that's a topic for another list (dev@httpd).
mod_deflate should probably memorize the fact that the checks have
already run and get out of the way...probably with setting no-gzip in
subprocess_env?  -- justin


Re: Code doesn't seem ... right

2011-01-24 Thread Justin Erenkrantz
On Mon, Jan 24, 2011 at 2:22 PM, C. Michael Pilato  wrote:
> [Using dev@ as a public TODO list to avoid pushing stack on a task.]
>
> In mod_dav_svn/mirror.c:dav_svn__location_body_filter() and
> dav_svn__location_in_filter() are code blocks like this:
>
>    if (uri.path)
>        canonicalized_uri = svn_urlpath__canonicalize(uri.path, r->pool);
>    else
>        canonicalized_uri = uri.path;
>    if (strcmp(canonicalized_uri, root_dir) == 0) {
>    [...]
>
> So ... if uri.path == NULL, then canonicalized_uri is set to NULL, and then
> that NULL is used in a strcmp().  Won't that SEGFAULT?

It'd be difficult (if not outright impossible) to hit that else case.
Follow apr_uri_parse and apr_pstrmemdup.  Also know that we don't hit
this code block if master_uri isn't set.  The original code I wrote
was just a straight strcmp - I believe the check for null is spurious.

My $.02.  -- justin


Re: crash in serf on windows

2011-01-25 Thread Justin Erenkrantz
On Tue, Jan 25, 2011 at 12:02 AM, Marc Haesen
 wrote:
> I have seen a crash (null pointer access) during svn merge using serf. The
> same command using neon was successful.
>
>
>
> I am using a trunk compiled version of svn (revision 1061787) on windows
> together with a trunk version of serf (revision 1421) compiled with visual
> studio 2010
>
>
>
> The reason is that the serf_httpconn_socket received a null pointer as
> bucket.

The serf_httpconn stuff is probably not quite ready yet - I know
Lieven is still working through some issues there.  You may have far
better luck with the 0.7.x branch which we're prepping for a new
release soon:

 https://serf.googlecode.com/svn/branches/0.7.x

If you can give that a try, it'd be appreciated!

> Attached are the 2 crash report files and a screen dump of the .dmp file
> loaded in the debugger.

Sadly, there's nothing attached...so I have no clue what you're
seeing.  =(  -- justin


Re: Code doesn't seem ... right

2011-01-27 Thread Justin Erenkrantz
On Wed, Jan 26, 2011 at 10:28 AM, C. Michael Pilato  wrote:
> I'm not sure how to move forward.  Should we require that SVNMasterURI
> values point to something other than the server root?

It was never intended to be at the server root - doing a proxy of / is
always an odd proposition.  -- justin


Re: svn commit: r1064093 -/subversion/trunk/subversion/libsvn_repos/authz.c

2011-01-27 Thread Justin Erenkrantz
On Thu, Jan 27, 2011 at 11:14 AM, C. Michael Pilato  wrote:
> If we have have the option of moving towards case-sensitivity -- that is, a
> *more*-precise authz policy -- that seems like a good thing.  I'd even be in
> favor of making this behavior optional (like the force_username_case option
> we already have).

Given that we do wonky things on the client side when cases are mixed
in the same path, I think it'd be very weird to make this behavior
optional.  It's so highly unlikely that anyone is going to want apply
rules to "AuthZ" but not to "authz" - especially given that it
wouldn't have worked *at all* before anyway.

My $.02.  -- justin


Re: [Proposal] Remove DAV properties cache in ra_serf

2011-02-15 Thread Justin Erenkrantz
On Mon, Feb 14, 2011 at 7:57 PM, Ivan Zhakov  wrote:
> Of course these problems can be fixed, but I'm not sure that we need
> this code. Since in svn 1.7 we have HTTPv2 which doesn't use so many
> PROPFIND requests that we tried to reduce using DAV properties cache.
>
> So I'm propose just to remove DAV properties cache from ra_serf.
> Objections? Comments?

+1.  Makes sense to me.  -- justin


Re: HTTPv2 question - eliminate stub URLs in OPTIONS response?

2011-03-07 Thread Justin Erenkrantz
On Mon, Mar 7, 2011 at 9:34 AM, C. Michael Pilato  wrote:
> Such an optimization would be beyond negligible, processing- and
> network-usage-wise.  And by allowing the server to dictate "This is how you
> need to talk to me" we leave allow ourselves the option of making changes to
> the server's URL-space in a fashion that any HTTPv2-ready client can
> gracefully adjust to.

+1.  -- justin


Re: Subversion trunk (r1078338) HTTP(/WC?) performance problems?

2011-03-07 Thread Justin Erenkrantz
On Mon, Mar 7, 2011 at 3:26 PM, John Beranek  wrote:
> Hmm...I'm surprised (and disappointed). No one is interested in
> Subversion 1.7 being lower performance than 1.6?

You're not telling us something we don't already know (go read the
archives some time).  Many folks are still working on improving the
performance of 1.7...so, general complaints aren't going to be
terribly productive.  -- justin


Re: Subversion trunk (r1078338) HTTP(/WC?) performance problems?

2011-03-09 Thread Justin Erenkrantz
On Wed, Mar 9, 2011 at 8:28 AM, Philip Martin
 wrote:
>     0.54 epoll_wait(3, {{EPOLLOUT, {u32=30583944, u64=30583944}}}, 16, 
> 36) = 1 <0.11>

As best as I can tell, that strace isn't matching your earlier
description.  The last epoll_wait is taking 0.11 seconds?  --
justin


Re: Subversion trunk (r1078338) HTTP(/WC?) performance problems?

2011-03-09 Thread Justin Erenkrantz
On Wed, Mar 9, 2011 at 8:57 AM, Philip Martin
 wrote:
> You are not looking at the right epoll_wait, look 14 lines down from the
> start.  That epoll_wait call takes <0.039952>.  The line below that
> shows a delay of 0.040042 to the subsequent read call.  So epoll_wait is
> responsible for nearly all the delay.

This epoll_wait() is waiting for the response back from the server, so
it makes sense that it'd take longer.  Do we know the server responded
sooner?  Could it be something on the server side?  IOW, this pattern
looks correct on the syscall() side.  -- justin


Re: Subversion trunk (r1078338) HTTP(/WC?) performance problems?

2011-03-09 Thread Justin Erenkrantz
On Wed, Mar 9, 2011 at 9:12 AM, Philip Martin
 wrote:
> Joe Orton  writes:
>
>> On Wed, Mar 09, 2011 at 08:50:37AM -0800, Justin Erenkrantz wrote:
>>> On Wed, Mar 9, 2011 at 8:28 AM, Philip Martin
>>>  wrote:
>>> >     0.54 epoll_wait(3, {{EPOLLOUT, {u32=30583944, u64=30583944}}}, 
>>> > 16, 36) = 1 <0.11>
>>>
>>> As best as I can tell, that strace isn't matching your earlier
>>> description.  The last epoll_wait is taking 0.11 seconds?  --
>>
>> Smells like Nagle to me - looks like serf is turning Nagle *on* not
>> off... ("nodelay off" == "delay on") - try this?
>>
>> Index: outgoing.c
>> ===
>> --- outgoing.c        (revision 1440)
>> +++ outgoing.c        (working copy)
>> @@ -201,7 +201,7 @@
>>
>>          /* Disable Nagle's algorithm */
>>          if ((status = apr_socket_opt_set(skt,
>> -                                         APR_TCP_NODELAY, 0)) != 
>> APR_SUCCESS)
>> +                                         APR_TCP_NODELAY, 1)) != 
>> APR_SUCCESS)
>>              return status;
>>
>>          /* Configured. Store it into the connection now. */
>
> Yes, that's it.  Serf is now comparable to neon.

This patch is now applied to trunk and 0.7.x.  Thanks!  -- justin


Re: Subversion trunk (r1078338) HTTP(/WC?) performance problems?

2011-03-10 Thread Justin Erenkrantz
On Thu, Mar 10, 2011 at 7:23 AM, Philip Martin
 wrote:
> We using a lot more CPU and elapsed time is hugely increased.

Just to make sure I understand, it's likely that this is more about
the libsvn_wc rewrite than anything with ra_serf?  That is, ra_neon
gives similar timings over NFS in 1.7?  Or?  -- justin


Re: Subversion trunk (r1078338) HTTP(/WC?) performance problems?

2011-03-10 Thread Justin Erenkrantz
On Thu, Mar 10, 2011 at 8:22 AM, John Beranek  wrote:
> 1.6.16 from localhost trunk server (10 iterations)
> ra_neon: 1.43
> ra_serf: 2.91
>
> trunk from localhost trunk server (10 iterations)
> ra_neon: 1.48
> ra_serf: 2.95
>
> So, neon and serf retain their speeds relative to each other...

Heh, while we've got you testing stuff...what happens if the server is
on a different physical machine?  Are your results in line with
Philip's which says that ra_serf and ra_neon are within margin of
error (if serf is not indeed faster)?  Are the timings still off by
that much?  Because it's async, I expect serf has substantially
different performance characteristics when it is going over localhost
(no network) versus using the actual network stack...  -- justin


Re: wc_db API discussion

2011-03-12 Thread Justin Erenkrantz
On Fri, Mar 11, 2011 at 4:11 PM, Mark Phippard  wrote:
> I think we have to get this work done soon.  We cannot release with
> performance like it is.  How do we define the scope of the work that
> needs to be done so that we can divide and conquer and get these
> changes in place?

It sounds like we should codify what our performance targets are.

Is it acceptable if 1.7 is as fast as 1.6?  Should it be faster?
Could we accept a slowdown for 1.7 as long as we know how we can get
it on par (or faster) for 1.8?

What are the operations (and test cases?) that are important to us?  -- justin


Re: Random serf checkout failures

2012-11-05 Thread Justin Erenkrantz
On Mon, Nov 5, 2012 at 1:33 PM, C. Michael Pilato  wrote:
> My debugging indicates that when close_all_dirs() is called, there are a
> slew of unclosed directories which each have a non-zero ref_count.  Save for
> the root directory of the edit (/tags), fetch_props == TRUE,
> propfind_handler->done == TRUE, and tag_closed == TRUE for all open child
> directories.  So I'm looking around at how file references are managed to
> see if anything has leaked.  I might have inadvertently botched something
> when adding the pull-contents-from-the-wc-cache logic.

Yup - your analysis is pretty accurate IIRC.  I posted about this when
I was in Berlin and even had a few patches to start to clean this up.
However, Greg was planning on redoing how we parse the XML report;
when that happens, some of this management should be far simpler (as
we know when the tags are closed) - so the thinking was to wait until
that was done and then assess whether that was fixed or not.  --
justin


Re: Random serf checkout failures

2012-11-06 Thread Justin Erenkrantz
On Tue, Nov 6, 2012 at 3:22 PM, C. Michael Pilato  wrote:
> On 11/06/2012 10:15 AM, C. Michael Pilato wrote:
>> On 11/06/2012 08:29 AM, C. Michael Pilato wrote:
>>> I recall this patch of yours -- I even asked you about it post-Berlin.  I'll
>>> take a look at it now and see if it can help us out.
>>
>> Not seeing the active_dir_propfinds list corruption you mentioned, Justin.
>
> Bummer.  Now I am.

*grin*

Yah, the pool lifetimes are all...overly complicated.  The 'right' way
to solve it, IMO, is to redo the parsing logic akin to how Greg was
redoing the other ra_serf files.  -- justin


Re: Random serf checkout failures

2012-11-06 Thread Justin Erenkrantz
On Tue, Nov 6, 2012 at 4:41 PM, Ivan Zhakov  wrote:
> Another problem is how serf reads data from network in case of
> multiple connections: it reads data from one connection until EAGAIN.
> But if data comes from network really fast (from local server for
> example) it continue reading data from this connection, without
> reading data from other connection! Which leads time out them.
>
> See outgoing.c:read_from_connection()

Hmm.  I'm not sure if there's an easy answer for this while keeping
ra_serf single-threaded.  Perhaps keep a track of how much we've read
in a single connection and give up reading that connection for a bit
if we've read 100KB or some such to see if there are other connections
with responses ready to read.

But, then again, here in 2012, I'd probably fight a lot harder to make
the checkout logic run across multiple network threads as that would
spread the SSL or gzip loads across multiple cores.  -- justin


Re: Random serf checkout failures

2012-11-06 Thread Justin Erenkrantz
On Tue, Nov 6, 2012 at 12:13 PM, Lieven Govaerts  wrote:
> okay, so with the Timeout directive added I can reproduce this issue.
>
> What I see is that the server closes the connection in the middle of
> sending a response to the client. It doesn't even finalize the
> response first.
> So ra_serf is reading the data from the response bucket, but gets an
> APR_EOF when it needs more data than specified in the Content-Length
> header of the response.
>
> What is the expected behavior here, let serf close down the connection
> and try the request again on a new connection?

Sure sounds like an early TCP close causing a lost response.  httpd
shouldn't be closing the TCP connection in the middle of the response
as that's not when httpd would evaluate the Timeout directive...but,
if you are going over a loopback connection, I think I've seen Windows
lose bytes...but, I don't think Philip uses Windows...weird - I'd
guess something else is going on.  -- justin


Re: Content-Length in HEAD responses

2012-11-09 Thread Justin Erenkrantz
On Fri, Nov 9, 2012 at 10:15 AM, C. Michael Pilato wrote:

> request, so I'm wondering ... does Apache just handle a HEAD as a GET
> under-the-hood and then discard the resulting response body?  The comment
>

Unless the handler does something special, yes, httpd will treat HEAD as a
GET until it is time to send the response body...and simply drops the
response body.  -- justin


Re: Content-Length in HEAD responses

2012-11-10 Thread Justin Erenkrantz
On Fri, Nov 9, 2012 at 11:37 PM, Branko Čibej  wrote:

> That would imply that, if content-length doesn't get set on a HEAD
> response, but Transfer-Encoding: chunked does, then everything is
> correct, right? If somewhat inefficient.
>

The chunked header is only sent on non-HEAD responses.  r->chunked is set
as a side-effect in a conditional in httpd and that conditional gets
short-circuited earlier if we are a HEAD request.

BTW, if I do:

% curl --head
https://svn.apache.org/repos/asf/subversion/trunk/subversion/mod_dav_svn/repos.c
HTTP/1.1 200 OK
Date: Sat, 10 Nov 2012 11:48:26 GMT
Server: Apache/2.2.17 (Unix) mod_ssl/2.2.17 OpenSSL/1.0.0c DAV/2
mod_wsgi/3.1 Python/2.6.6 SVN/1.7.0
Last-Modified: Fri, 19 Oct 2012 14:11:49 GMT
ETag: "1400104//subversion/trunk/subversion/mod_dav_svn/repos.c"
Accept-Ranges: bytes
Content-Length: 157969
Vary: Accept-Encoding
Content-Type: text/plain; charset=ISO-8859-1

There is a C-L header...so, I don't know what the original poster is
seeing, but we're already doing the right thing...  -- justin


Re: Content-Length in HEAD responses

2012-11-10 Thread Justin Erenkrantz
On Sat, Nov 10, 2012 at 6:49 AM, Justin Erenkrantz wrote:

> There is a C-L header...so, I don't know what the original poster is
> seeing, but we're already doing the right thing...  -- justin
>

I bet the OP is trying to do HEAD on a directory as there was talk
elsethread about trying to discover ACLs on a directories...that's not
going to generate a C-L.  And, C-L is not meaningful in anyway on
directories - only actual resources.  mod_autoindex doesn't do it either:

% curl --head https://archive.apache.org/dist/
HTTP/1.1 200 OK
Date: Sat, 10 Nov 2012 11:53:36 GMT
Server: Apache/2.4.1 (Unix) OpenSSL/1.0.0g
Content-Type: text/html;charset=ISO-8859-1

(The only way to generate the C-L would be to run through the
response...which we don't do for HEAD.)

HTH.  -- justin


Re: Content-Length in HEAD responses

2012-11-10 Thread Justin Erenkrantz
On Sat, Nov 10, 2012 at 3:25 PM, Thomas Åkesson  wrote:

> I suppose this means that it would be a significant optimization to
> perform HEAD rather than GET when discovering ACLs for every subdirectory
> in a directory listing?
>

Probably - doing the HEAD request will run the full authn and authz checks,
but it won't produce the bodies - you'll also save not having to send the
responses on the wire - but you won't know what the directory listing is
unless you do a GET in the first place.  So, it might help at the leaf
nodes in the tree.  (But how would you know it's a leaf!  Fun.)


> Branko's concern is still interesting... because this behaviour (omitting
> CL for HEAD requests) does seem to violate the HTTP RFC, but for good
> reason. Given that mod_autoindex as well as mod_php (at least on the config
> I tested) also omits CL for HEAD I suppose it is a well accepted
> optimization in practice.
>

IIRC, there is an out in the RFC if the content is dynamic - this also may
be something cleaned up in the httpbis RFC clarifications as forcing the
server to generate the content only to throw it away (when it can't be
pre-computed) is kinda pointless.  -- justin


Re: serf in 1.8

2012-11-12 Thread Justin Erenkrantz
On Mon, Nov 12, 2012 at 9:13 PM, Philip Martin
wrote:

> I have a checkout of the gcc tree, it has 78,000 files.  Now it uses
> svn: but if it were to use http: then the serf checkout log would be 4
> orders of magnitude bigger than the neon log.  83 years becomes 1 or 2
> days.
>
> The neon log is independent of the size of the checkout, the serf log
> scales with the size of the checkout.  If this were memory we would say
> we have a scaling problem.  Do scaling problems not apply to disk space?


If you have a high-traffic site and you choose not allocate the disk space,
rotate the logs or *gasp* don't log.

There are clear solutions to this "problem" that are well-known by any
competent owner.  -- justin


Re: [serf-dev] Re: Double compression over HTTPS

2012-11-13 Thread Justin Erenkrantz
On Tue, Nov 13, 2012 at 2:41 AM, Lieven Govaerts  wrote:

> The proposed approach here is to disable SSL compression completely,
> so in terms of the above that leaves us with scenario 2.
>
> Serf doesn't have an option currently to disable SSL compression from
> the client side. I plan to add it in the next version.


+1.  -- justin


Re: serf in 1.8

2012-11-13 Thread Justin Erenkrantz
On Tue, Nov 13, 2012 at 2:26 AM, Lieven Govaerts  wrote:

> Is that a fix that we can get in apache 2.2? Justin?
>

I don't readily recall the issue in mod_deflate.  Is it already resolved in
the 2.4 series?


> work. These concern maybe only 1% of the users, but we should have a
> plan to support them, and I think keeping neon in the build as the
> easiest way to do that.
>

IMO, the "plan to support them" (besides welcoming patches to support that
1% case) is tell them to stick with 1.7.  =P  -- justin


Re: svn commit: r1415864 - /subversion/trunk/subversion/libsvn_ra_serf/update.c

2012-11-30 Thread Justin Erenkrantz
On Fri, Nov 30, 2012 at 4:54 PM,  wrote:

> Author: cmpilato
> Date: Fri Nov 30 21:54:35 2012
> New Revision: 1415864
>
> URL: http://svn.apache.org/viewvc?rev=1415864&view=rev
> Log:
> Implement in ra_serf "send-all" mode support for update-style REPORTs
> and their responses.  (Currently disabled by compile-time conditionals.)
>
> (This one goes out to Ivan Zhakov.)
>

I've stated for a long time that I think the send-all mode is a huge
mistake architecturally because it is too prone to double-compression and
TCP pipeline stalls and is a tremendous burden on a properly-configured
httpd (by not taking advantage of server-side parallelism), it's nice to
see it's not *too* hard to shoehorn this bad idea back into ra_serf.  We'd
never be able to shove the non-send-all approach into ra_neon.  =)

Here's my suggestion for consideration - let's experiment with this setting
in the beta release process with the setting as-is - that is we always do
the parallel updates unconditionally (except perhaps when svnrdump is being
stupid).  If we get real users complaining about the update during that
cycle, we can then figure out either switching the default and/or adding a
config-option or even allowing some control via capabilities exchange.

Does that sound reasonable?  -- justin


Re: reposurgeon now writes Subversion repositories

2012-11-30 Thread Justin Erenkrantz
On Fri, Nov 30, 2012 at 5:26 PM, Branko Čibej  wrote:

> When I create an account at [your favourite forge], I tell it my name
> and give it one of my e-mail addresses. It is, in my opinion, up to the
> forge software to use that in svn:author, not up to local user
> preferences. That /is/ a fundamental difference between the centralised
> and distributed models.
>
> Why do forges not do that? I don't know, but it's definitely not because
> Subversion doesn't give them fifteen ways of manipulating the svn:author
> property.
>

+1.

Furthermore, if the user wants to set their own custom revprops that have
nothing to do with svn:author (e.g., svn:federation-id) and end up with a
practice by convention rather than by server fiat that tools like
reposurgeon can understand, there's plenty of ways for the user *or the
server* to add in additional revprops as well.  Plus, since revprops aren't
versioned, they can always be fixed up after-the-fact to include
appropriate svn:federation-id tags...finally, a "feature" of revprops not
being versioned!  -- justin


Re: svn commit: r1415864 - /subversion/trunk/subversion/libsvn_ra_serf/update.c

2012-12-01 Thread Justin Erenkrantz
On Sat, Dec 1, 2012 at 5:59 AM, Johan Corveleyn  wrote:

> I'm wondering whether your concerns apply to both internet-wide
> deployments and local (all on the same LAN) ones.
>

That line is certainly a fair one to draw in the sand.  That said, I think
the internal use case cries out even *more* for the parallel updates as the
internal server in that environment is often wildly over-provisioned on the
CPU side - with a fairly low-traffic environment, you want to take
advantage of the parallel cores of a CPU to drive the updates.

Generally speaking, what I discovered years ago back in 2006 (yikes) and I
believe is still true as we near 2013 (shudder), if everything else is
perfectly optimized (disk, latency, bandwidth, etc.), you're going to
eventually bottleneck on the checksumming on both client and server - which
is entirely CPU-bound and is expensive.  You can solve that by splitting
out the work across multiple cores - for a server, you need to utilize
multiple parallel requests in-flight; and for a client, you then need to
parallelize the editor drive.

The reason that disk isn't such a bottleneck as you might first expect is
due to the OS's buffer cache - for reads on the server-side, common data is
already going to be in RAM so hot spots in the fsfs repos will already be
in memory, for writes on the client-side, modern client OSes won't
necessarily block you until everything is sync'd to disk.  But, once you
exhaust the capabilities of RAM, your underlying disk architecture matters
a lot and one that might not be intuitive to those that haven't spent a lot
of time closely with them.  (Hi Brane!)  If you are using direct-attached
storage locally on either server or client, then you will probably be
bottlenecked right there.  However, if your corporate environment has an
NFS filer or SAN (a la NetApp/EMC) backing the FSFS repository or as NFS
working copies (oh so common), those large disk subsystems are geared
towards parallel I/Os - not single-threaded I/O performance -
Isilon/BlueArc-class storage is however; but I've yet to see anyone
obsessed enough about SVN I/O perf to place either their repository or
working copies on a BlueArc-class storage system!  So, if you are not using
direct-attached storage and are using NFS today in a corporate environment
on either client or server, then you want to parallelize everything so that
you can take advantage of the disk/network I/O architecture preferred by
NetApp/EMC.  Throwing more cores against a NetApp/EMC storage system in a
high-available bandwidth environment allows for linear performance returns
(i.e., reading/writing one I/O is 1X, two threads is 2X, three threads is
3X, etc, etc.).

To that end, I'd eventually love to see ra_serf drive the update editor
across multiple threads so that the checksum and disk I/O bottleneck can be
distributed across cores on the client-side as well.  Compared to where we
were in 2006, that's the biggest inefficiency we have yet to solve and take
advantage of.  And, I'm sure this'll break all sorts of promises in the Ev1
and perhaps Ev2 world and drive C-Mike even crazier.  =)  But, if you want
to put a rocket pack on our HTTP performance, that's exactly what we should
do.  I'm reasonably certain that serf itself could be finely tuned to
handle network I/O in a single thread at or close to wire-speed even on a
10G connection with a modern processor/OS - it's what we do with the file
contents/textdeltas that needs to be shoved to a different set of worker
threads and remove all of that libsvn_wc processing from blocking network
traffic processing and get it all distributed and thread-safe.  If we do
that, woah, I'd bet that we are we going to make things way faster across
the board and completely blow everything else out of the water when our
available bandwidth is high - which is the case in an internal network.
 And, yes, that clearly could all be done in time for 1.8 without
jeopardizing the timelines one tiny bit.  =P

So, that's my long-winded answer of saying that, yah, even in an internal
LAN environment, you still want to parallelize.

However, I'm definitely not going to veto a patch that would add an httpd
directive that allows the server to steer the client - unless overridden by
the client's local config - to using parallel updates or not.  -- justin


Re: RFC: simple proposal for Internet-scoped IDs

2012-12-01 Thread Justin Erenkrantz
On Sat, Dec 1, 2012 at 8:53 AM, Eric S. Raymond  wrote:

> I'm not certain, but if your server-side hooks work the way I think
> they do, all of this except (1) can be done in Python.  Not having to
> add complexity to your C code is a significant virtue.
>

Here's another approach to take with regards to setting the FULLNAME field
that doesn't require any change to the client and can be deployed
server-side via hooks without any code changes at all.  So, (1) can be done
in Python pretty easily for the lazy users and coders who don't want to do
anything at all.  =)

If you have a centralized registry (like either LDAP or
http://people.apache.org/committer-index.html), the server in the
server-side hooks can set FULLNAME for each svn:author if isn't set by the
client by looking up its internal directory.  Within the ASF
infrastructure, we have tools to allow committers to manage fields like
this in a self-service way.  So, the server admin can default that field as
they like and give the users to set that field as they like.

I believe that this is one of the benefits of a centralized infrastructure
- we can make it so that every client doesn't *have* to set something
themselves on their client to utilize FULLNAME.

And, once again, I'll reiterate my earlier point that FULLNAME can be added
retroactively pretty easily to existing SVN repositories.  So, for
svn.apache.org, after we might deploy a FULLNAME infrastructure, we could
easily craft a tool to go back to all old revisions and annotate them
correctly.  Easy peasy.  -- justin


Re: svn commit: r1415864 - /subversion/trunk/subversion/libsvn_ra_serf/update.c

2012-12-01 Thread Justin Erenkrantz
On Sat, Dec 1, 2012 at 9:01 AM, Lieven Govaerts  wrote:

> There are some scenario's where either the server admin or the user
> can decide if parallel requests make sense or not.
>
> I'm specifically thinking of the use Kerberos per request
> authentication. These responses can't be cached on the client side,
> and require the authorization header to be sent for each request.
> Assuming 2 step handshake of which serf can bypass the first, this
> means an overhead per request of 1-10KB, with a 3 step handshake each
> request has to be sent twice further increasing the overhead.
> IMHO in this scenario the server admin should be able to veto the use
> of parallel requests.
>
> And the same is true for https connections, where it's also the server
> admin who can decide if the necessary caches have been put in place to
> enable the benefits of parallel requests.
>

Totally agreed.  I'd favor a three-value httpd directive option on the
server-side that is advertised in the capabilities exchange:

- default (client defaults to parallel if ra_serf, serial if older ra_neon
client; or if client overrides ra_serf via their local servers options)
- serial (server suggests to client that it should be serial; but permit
parallel when client wants it)
- force-serial (same capability advertisement, but always trigger send-all
responses regardless of what client asks for)

I'm 95% sure we have code in ra_serf that handles the case where the server
sends us inline responses anyway as older (prior to 1.2, IIRC) always sent
inline responses no matter what we send...so, it should be fairly
straightforward decision tree with minimal code changes.

My $.02...which is still not enough for me to write the patch.  =)  --
justin


Re: RFC: simple proposal for Internet-scoped IDs

2012-12-01 Thread Justin Erenkrantz
On Sat, Dec 1, 2012 at 9:28 AM, Daniel Shahaf wrote:

> BTW, the ability to change svn:author at will is one of the reasons they
> aren't global-scoped: if Subversion ever migrated away from ASF, we can
> _then_ change all svn:author revprops --- just like we once changed
> "zhakov" (implied @tigris) to "ivan" (implied @apache).
>

Exactly.  And, to be honest, I probably never realized that Ivan's author
tag changed...I knew it was him in both cases as I'm involved in the
project.  So, for most human-scale projects, I think that you don't need
globally unique IDs as there's a defined community and set of participants.
 Whether I call him "zhakov" or "ivan" - it's the same person.  I know
that, you know that...and, really, all of the people who care know that.  =)

For projects where people don't know everyone (Linux), I can see why
globally unique IDs are helpful to contributors.  But, I would shudder if
suddenly svn's own blame output emitted "Daniel Shahaf <
d...@daniel.shahaf.name>" instead of "danielsh".  I have that map already in
my head thank-you-very-much.  Hence, this is why I'd be a strong proponent
of them being in separate revprops - a "local" project name (svn:author)
and something that more uniquely identifies the contributor (FULLNAME blah
blah blah).  And, perhaps have an option on the client as to which one to
use - I could see some folks wanting the GUID, but that's just way too
verbose for me...  -- justin


Re: svn commit: r1415864 - /subversion/trunk/subversion/libsvn_ra_serf/update.c

2012-12-01 Thread Justin Erenkrantz
On Sat, Dec 1, 2012 at 9:41 AM, Branko Čibej  wrote:

> On 01.12.2012 14:31, Justin Erenkrantz wrote:
> > And, yes, that clearly could all be done in time for 1.8 without
> > jeopardizing the timelines one tiny bit. =P
>
> Eep ... :)
>
>
> Another thing I've been thinking about is this: Why are we using SHA1
> checksums on the server and on the wire for consistency checks when a
> 64-bit CRC would do the job just as well, and 15 times cheaper? And
> banging my head against the wall for not thinking of this 10 years ago.
>
> I can sort of understand the use of SHA1 as a content index for
> client-side pristine files. On the server, however ... dunno. Maybe we
> could design something akin to what the rsync protocol does, but for
> repository-wide data storage. Could be quite tricky to achieve locality,
> however.
>

The one thing that's nice with using SHA checksums is we're using it
everywhere.  It makes protocol debugging a *lot* easier - since we also
used SHA checksums as the content index, that makes it easier to compare
what we recorded in libsvn_wc to what was sent by the server.  If we
diverged the checksums algorithms, it'd be hard to do a quick comparison
visually (do the checksums match?) without actually running the checksum
yourself!

So, I think we optimized for humans here...and I'm okay with that.  We can
always build faster processors...and take advantage of parallelism.  =)

There I go off on a tangent again.
>

*grin*  -- justin


Re: svn commit: r1415864 - /subversion/trunk/subversion/libsvn_ra_serf/update.c

2012-12-01 Thread Justin Erenkrantz
On Sat, Dec 1, 2012 at 12:00 PM, Mark Phippard  wrote:

> I feel pretty strongly that we should at minimum use the send-all
> approach when talking to pre-1.8 servers.  Even though in some
> situations it could still offer good performance.  I just think it
> would be more respectful to our users (server admins in this case) to
> not change this behavior in a way that could surprise them.  Maybe we
> could come up with exceptions, such as older servers that are using
> the SVNAllowBulkUpdates off directive.  In that situation we should
> use the new behavior since that is basically what that directive is
> asking for.
>

Without a lot of concrete feedback that parallel updates should be removed
by default, I strongly believe that we should not be conservative on this
issue.  The issue here is not one of compatibility - ra_serf has been
around for years and can talk just fine to older servers (way back to prior
to 1.0 servers actually).  The only argument against altering the default
behavior is that there might be an admin of a high-traffic site somewhere
that might suddenly be shocked by more HTTP requests coming in.  I honestly
have little to no sympathy for such an admin who doesn't properly
understand how to manage a large installation - they likely have other
issues that they are not paying attention to.  Until we have hoards of
users coming in and complaining about this, I think it's silly to be
conservative here.

I'm definitely not against giving knobs to the client or to the admin in
weird corner cases (provided someone cares enough to write that up), but I
strongly believe that for now we should do the right thing out of the box
in 1.8 - which is to utilize parallel updates.  -- justin


Re: Error running context

2012-12-02 Thread Justin Erenkrantz
On Sun, Dec 2, 2012 at 9:05 AM, Lieven Govaerts  wrote:

> Attached the patch. ( I get paid per mail I send to this list in case
> no one noticed. ;) )


The patch looks right to me - the short-circuit to do the simple window can
clearly be called for any window regardless of offset.  Perhaps we can add
this testcase to the regression tests as well?

As for the looping, didn't we put in logic to stop retrying after a number
of request failures?  That's probably a reasonable thing to do...I
definitely think a very nice feature of serf is that it *will* retry (which
is helpful in flaky network situations), but it needs to stop at a certain
point.  =)  -- justin


Re: svn commit: r1417639 - in /subversion/trunk/subversion/mod_dav_svn: dav_svn.h mod_dav_svn.c reports/update.c

2012-12-06 Thread Justin Erenkrantz
On Wed, Dec 5, 2012 at 4:50 PM, Stefan Sperling  wrote:

> In the 1.8 release notes We can raise attention in the fact that a 1.8
> server should turn off skelta mode if necessary, such as when each GET
> request causes an authentication request to LDAP.
>

+1.  I'm all for documenting and giving knobs to users who RTFM.

My very strong preference is that we should do the "right thing" out of the
box - and that is to use parallel updates - even against older servers.
 This has always been the behavior of ra_serf and it's definitely backwards
compatible with all 1.x servers.

If an admin doesn't like it (because of per-req authn or bad logging
subsystems or the phase of the moon), they can try to direct 1.8+ clients
away from parallel updates via the new directive/headers; but, we should
not require any configurations to do the best generically recommended
approach.  -- justin


Re: svn commit: r1417639 - in /subversion/trunk/subversion/mod_dav_svn: dav_svn.h mod_dav_svn.c reports/update.c

2012-12-13 Thread Justin Erenkrantz
On Wed, Dec 12, 2012 at 4:24 PM, C. Michael Pilato wrote:

> Here's a question I've been wondering for some time:  should we expose to
> users a configuration option for declaring the number of aux connections
> ra_serf should use?  I mean, Firefox has exposed such an option for years.
> If we did so for Subversion, we could say, "Yeah, ra_serf+svnrdump is a bad
> combination.  Here's a workaround.  Run it with
> --config-option=servers:global:serf-max-connections=2".  Or better yet (as
> per my other mail in this thread), we could just hardcode that option
> override in svnrdump itself.


Both approaches (allow config option to control # of conns; have svnrdump
disable parallelism) seems reasonable to me.  FWIW, I'd probably set it to
8 -  since 2006, most browsers upped their connection parallelism to 8.
 *grin*

And, yah, it's quite obvious from reading the thread that you missed a
bunch of Lieven's emails.  -- justin


Re: [PATCH] Test for line ending bug in svnrdump (issue 4263)

2012-12-13 Thread Justin Erenkrantz
On Wed, Dec 12, 2012 at 5:06 PM, Daniel Shahaf  wrote:

> P.S.  This thread was an unusually long one, for a patch that adds about
> a dozen lines of code.
>

Uh, how is that at all unusual for this crowd?  =)  -- justin


Re: Would branching across repositories be theoretically possible?

2013-01-01 Thread Justin Erenkrantz
On Mon, Dec 31, 2012 at 9:05 AM, Stefan Sperling  wrote:

> BTW, it would have been better to send your question to the users@ list.
> The dev@ list is off-topic for questions like this. I suppose you chose
> dev@ because of your idea to develop support for cross-repository copies,
> which I'll address now:


At a meta-level, I've seen this dismissal and plea to just send to
users@instead happen a few times recently on dev@.
 I'd like to ensure we don't hinder potential contributors by sending out
the wrong tone in their initial interactions with dev@.  We welcome all
contributors with open arms.  To me, Zem's initial post was perfectly
appropriate for dev@ and given the ensuing discussions including some that
apparently happened off-list, it was (and is) a good catalyst for
discussion and bringing those discussions back to the list.  In other
words, in my opinion, a very good and on-topic dev@ post.

Back to the topic at hand, my take on this feature is it's probably a good
idea to figure out how to solve it...the landscape over a decade ago when
we started out isn't the same as it is now.  -- justin

P.S. Hi Zem!  =)


Re: Proposal for OPW project

2013-01-01 Thread Justin Erenkrantz
On Tue, Jan 1, 2013 at 4:17 PM, Gabriela Gibson
wrote:

> The following is a conversation that should have been held on the
> list, please excuse the misunderstanding on my part.
>

No worries - we've all been there before.  You'll get the hang of it soon
enough.  =)  Keep up the posts!

Cheers.  -- justin


Re: Apache Subversion: GNOME Women Outreach

2013-01-01 Thread Justin Erenkrantz
On Mon, Dec 31, 2012 at 4:22 AM, Miriam Hochwald
wrote:

> I am new to this random public email writing ... and, at least for me it
> takes a fair amount of courage to do so. I have had to face a few phobia
> barriers based on experience.
>

Welcome!  It can indeed be quite scary at first...then, you just give in
and hope no one looks at your initial posts a decade from now.  =)


> I am interested in contributing to the project:
> *
> *
> *“Improve progress output displayed during update and commit.”*
>
> I have written an approach towards it as follows:
>
>  This is basically a Human Computer Interaction (HCI) problem combined
> with understanding the code, and estimating or updating the time it takes
> to complete a file packet transfer. The user needs to be provided with more
> detailed progress information for ‘svn’ commands, so they know how much
> longer they need to wait. This feature is very important, especially for
> GUI clients. Currently there are a rudimentary series of dots, one dot per
> file with a filename after each complete file. This approach provides
> insufficient visual communication regarding the percentage of the total
> size to transmit, or the total number of files to transmit. In general
> this is an interface change, which could be approached such that it
> refrains form disturbing the existing parsers of ‘svn commit’ output. Some
> progress information is available such as the number of bytes or progress
> of a data chunk to be transferred. Missing from the equation is the number
> of files/ chunks that need to be transferred and how many already are
> transferred. For when doing a ‘svn up’, there is no way of telling the user
> whether it will take seconds, minutes or hours until the update will be
> complete if a lot has been changed. One approach could be to increase the
> granularity of update feedback. For example there could be progress bar
> feedback for every 50K in either direction. Alternately, since it is known
> how many files there are to transmit, a progress bar could indicate the
> number of files successfully transferred as a percentage of the total
> counted. For each file DeltaV knows the data to be transmit, so a progress
> bar could be established for it.
>
Yup, ra_serf (what we use on the client-side for HTTP/pseudo-DeltaV) has a
pretty good idea what the overall file size is and how much has been
written on the wire (when committing) or receiving (when updating).  We've
always just copped out with one dot per file at the notification - it's
always been an annoyance to me that we haven't come up with a better
mechanism - some files are 1KB and some doofuses like to commit binary 30MB
JAR files (looking at you Java devs!) - yet, both are represented with one
dot.  So, I look forward to what you come up with here!  I don't know how
much we'll be able to improve ra_svn - my hunch is that it'd be easier to
get it working with ra_serf first and then come back to ra_svn after you
have all of the notification APIs updated.

If what I just wrote doesn't make complete sense yet, don't panic - after
you get more acquainted with the source code, the paragraph above should
make more sense!  =)

> As yet I am still setting up the development environment, working through
> the follow instructions:
>
> http://svn.apache.org/repos/asf/subversion/trunk/INSTALL
>
> Putting together all these bits and bods seems quite foreign to me, and is
> quite confronting. So, I am hopeful that people in this community can be
> supportive and understanding of a learning/ experience gap ... as I trail
> blaze into something I haven't tried before.
>
If you see anywhere in the INSTALL documentation that isn't clear, ask!
 That documentation is intended for new contributors like you - so, if it
doesn't make sense, let us know.  Even better, submit patches to the
INSTALL docs.  =)

> I look forward to getting to know some of you in the near future.
>
Same here!  -- justin


Re: Apache Subversion: GNOME Women Outreach

2013-01-02 Thread Justin Erenkrantz
On Tue, Jan 1, 2013 at 9:43 PM, Miriam Hochwald wrote:

> Hello Justin,
>
> Actually the INSTALL docs  ... whilst detailed make me phobic. It seems to
> target the audience that likes to tinker, rather than just click and oh ...
> there it is (think iPad users ... which is most of the world). It might be
> better to have something other than plain text, in a kind of "Wizard"
> approach - problem solver perspective.
>
> A simple piece of "stupidity" to a Noob which I have just encountered is
> ... use of 'make' - well, if you don't have a C complier installed, then
> "ba-bum!" no fun. So, then I needed to go back into the plain text doc (no
> highlighters or formatting assistance) and then go ... hmmm... C compiler,
> C compiler. Googles gcc. Googles Xcode. Needs updated Mac OS. Downloads
> updated Mac OS - updates Mac. Think she has lost *all* her files - Panic!
> Finds files - "phew!" Downloads Eclipse - chooses version and updates.
> Computer doesn't like update etc. ... etc. Ok ... so you see the picture.
> Eventually when I finish updating Xcode and obtain gcc I will go back to
> that unformatted plain text and read the next line. Convoluted.
>

If you're up for creating one, I'm sure we'd find a home for a separate
document that goes soup-to-nuts on what needs to be installed.  I
definitely agree that we make some assumptions about the environment.  =)

At times, I wish we had a comprehensive toolchain install doc for Windows
that was clear; so, it's probably reasonable to assume that there could be
a use for a clear bootstrap doc for Mac too for devs who might not be as
familiar with Mac OS X and its toolchain.

Keep chugging along!  -- justin


[PATCH] Add -no-suppress to LT_CFLAGS

2013-01-02 Thread Justin Erenkrantz
I'm not entirely sure if we should commit this, but I'll throw this on-list
for posterity and future reference.

I was trying to compile trunk and was getting the following:

/bin/sh path-to-svn/libtool --tag=CC --silent --mode=compile ccache gcc
-std=c89  -DDARWIN -DSIGPROCMASK_SETS_THREAD_MASK -no-cpp-precomp
-DDARWIN_10  -g -g -O2-I../../subversion/subversion/include
-I./subversion -I/opt/local/include/apr-1   -I/opt/local/include/apr-1
-I/opt/local/include -I/opt/local/include/serf-1 -I/opt/local/include   -o
subversion/svn/conflict-callbacks.lo -c
../../subversion/subversion/svn/conflict-callbacks.c
gmake: *** [subversion/svn/conflict-callbacks.lo] Error 1

The key thing here was that I did not get an error message - even though
there was an error code.  (Running with --debug indicated that libtool was
seeing an error returned by ccache and subsequently deleting the .lo files.)

Because GNU libtool always compiles code twice on Mac OS X (insert rant
against GNU libtool here), it defaults to hiding all compiler output on the
second pass.  In their infinite wisdom, I guess they figure that if the
first compile passes, the second always will too.  So, if I add
"-no-suppress" to LT_CFLAGS, I get:

/bin/sh path-to-svn/libtool --tag=CC --silent --mode=compile ccache gcc
-std=c89  -DDARWIN -DSIGPROCMASK_SETS_THREAD_MASK -no-cpp-precomp
-DDARWIN_10  -g -no-suppress -g -O2
 -I../../subversion/subversion/include -I./subversion
-I/opt/local/include/apr-1   -I/opt/local/include/apr-1
-I/opt/local/include -I/opt/local/include/serf-1 -I/opt/local/include   -o
subversion/svn/conflict-callbacks.lo -c
../../subversion/subversion/svn/conflict-callbacks.c
ccache: FATAL: Could not create
/Users/xx/.ccache/9/9/3aa71ed31e8f3313f3404f17c0985c-599411.o.tmp.stdout.xx.local.25073
(permission denied?)
gmake: *** [subversion/svn/conflict-callbacks.lo] Error 1

That of course is a far more useful error message and led me to the root
cause (bad perms in ~/.ccache/).  In subsequent builds, no more output was
produced with -no-suppress.  I guess if we have warnings in the code, it'd
perhaps emit the warnings twice for every compilation.  But, then again, I
always like reminding devs how evil libtool is and that it's
double-compiling everything.  =P  The right and proper thing would likely
be if libtool stashed the output in a variable and only emitted it when
there's an error code...alas.  -- justin

* configure.ac: Never suppress possible error output from the second-pass
compile.

Index: configure.ac
===
--- configure.ac(revision 1428128)
+++ configure.ac(working copy)
@@ -260,6 +260,7 @@
  [Build shared libraries]),
   [svn_enable_shared="$enableval"], [svn_enable_shared="yes"])

+LT_CFLAGS="-no-suppress"
 if test "$svn_enable_static" = "yes" && test "$svn_enable_shared" = "yes"
; then
   AC_MSG_NOTICE([building both shared and static libraries])
 elif test "$svn_enable_static" = "yes" ; then


Re: [PATCH] Add -no-suppress to LT_CFLAGS

2013-01-03 Thread Justin Erenkrantz
If you look at the path of the file that failed compilation, it is in
subversion/svn/.  On Mac OS X there is no need to compile it twice as there
is no real supported concept of PIC - it's a no-op for the compiler.
Linking the libraries statically or dynamically is different of course.  --
justin
On Jan 3, 2013 2:18 AM, "Peter Samuelson"  wrote:

>
> [Justin Erenkrantz]
> > Because GNU libtool always compiles code twice on Mac OS X (insert rant
> > against GNU libtool here)
> ...
> > But, then again, I always like reminding devs how evil libtool is and
> > that it's double-compiling everything.  =P
>
> Well, you asked it to build both static and shared libraries.  What's
> it supposed to do?  With most ABIs, you want to build shared library
> code in PIC mode and all other code in non-PIC mode.  So yes, that
> means double-compiling.  You'll notice it does _not_ do this for
> non-library code such as subversion/svn/ or subversion/svnadmin/.  (At
> least on Linux it doesn't.  I actually have no idea what it does on Mac
> OS.)
>
> Frankly, if you think double-compiling is "evil", you should be
> suggesting that we make either '--disable-static' or '--disable-shared'
> the default - see configure.ac, line 238 or so.  Not blaming libtool.
>


Re: [RFC] Deprecate Berkelety DB filesystem backend

2013-01-05 Thread Justin Erenkrantz
+1!  -- justin
On Jan 5, 2013 10:40 AM, "Greg Stein"  wrote:

> Is "+1" too short of a response?
>
> :-)
> On Jan 4, 2013 7:35 PM, "Branko Čibej"  wrote:
>
>>  For the following reasons
>>
>>- FSFS has been the default filesystem backend for almost 7 years,
>>since 1.2.
>>
>> - Looking at commit history, I've not seen a single (functional or
>>performance) improvement, beyond a few bug fixes, in the BDB backend in at
>>least two years. The last significant change that I'm aware of was 
>> released
>>in 1.4 (support for BDB 4.4. and DB_REGISTER) back in 2006. In effect, BDB
>>is in "barely maintained" mode whereas interesting things are happening to
>>FSFS all the time.
>>
>> - I cannot remember seeing a BDB-related bug report in a month of
>>Sundays (meaning that it's either rock-solid, or not used).
>>
>> - The BDB backend is an order of magnitude slower on trunk than FSFS
>>   - timing parallel "make check" on my 4x4-core i7+ssd mac:
>>  - FSFS: real 7m33.213s, user 19m8.075s, sys 10m54.739s
>>  - BDB: real 35m17.766s, user 15m28.395s, sys 11m58.824s
>>
>> I propose that we:
>>
>>- Declare the BDB backend deprecated in 1.8, adding appropriate
>>warnings when it's used or manipulated (to svnadmin?)
>>
>> - Stop supporting it (including bug fixes) in 1.9
>>
>> - Completely remove the BDB-related code in 1.10 (I'm making an
>>assumption that we'll have the FSv2 API and releated refactoring of FSFS 
>> by
>>then, and at least a draft experimental FSv2 implementation).
>>
>>
>> I realize I'm raising this question at a time when we should be focusing
>> on branching 1.8. On the other hand, this release may be a good opportunity
>> to deprecate the Berkeley DB filesystem.
>>
>>
>> --
>> Branko Čibej
>> Director of Subversion | WANdisco | www.wandisco.com
>>
>>


Re: BDB vs FSFS - OMG!

2013-01-07 Thread Justin Erenkrantz
On Sun, Jan 6, 2013 at 9:44 PM, C. Michael Pilato wrote:

> hosted elsewhere for them.  The BDB backend (thanks to improvements to the
> Berkeley DB library itself) is much more stable today than it was when we
> first started this project, so it's quite possible that we don't hear noise
>

That's quite surprising.  My understanding from the Sleepycat/Oracle team
way back when was that our core usage of BDB was wrong and would never be
properly supported by them.  Have they embraced multiple reader/writer
processes now, or do they still advocate that having a single-process is
the only Right Way(tm)?  -- justin


Re: Subversion & Windows

2013-01-08 Thread Justin Erenkrantz
IMO, we should follow most other (all?) ASF projects - you need 3 +1s for
release regardless of platform.  The fact that we require 3 +1s just for
Windows is very odd - we don't require 3 +1s for Mac and 3 +1s for RHEL and
3 +1s for Ubuntu, etc.  -- justin
On Jan 8, 2013 3:29 PM, "Ben Reser"  wrote:

> We seem to be having trouble getting releases out the door and the
> delay is almost always related to Windows votes.
>
> Consider the following data:
> Release Planned Actual  Unix vs Windows
> 1.6.19  10 Sep 2012 21 Sep 2012 7 days
> 1.7.7   09 Oct 2012 09 Oct 2012 1 day
> 1.7.8   17 Dec 2012 20 Dec 2012 6 days
> 1.6.20  04 Jan 2013 ??  4+ days
>
> Windows Voters:
> Paul Burba
> Johan Corveleyn
> Ben Reser
> Mark Phippard
> Bert Huijben
>
> The Unix vs Windows column indicates how many days after the last Unix
> vote was reached did the Windows vote get to 3.  1.6.20 hasn't hit the
> required votes for Windows yet but it will be at least 4 or more days
> since we're on day 5 since Unix has been ready to release.
>
> Windows Voters Unix Voters
> Paul Burba   4 C. Michael Pilato  3
> Johan Corveleyn  4 Philip Martin  4
> Ben Reser1 Ben Reser  1
> Mark Phippard1 Branko Čibej   4
> Bert Huijben 1 Stefan Sperling2
>Julian Foad1
>
> The numbers after the names is the number of releases they voted in.
> This also doesn't count people who were RM'ing.
>
> I'd say that the problem is worse for 1.7 but I suspect that 1.7.7 is
> an outlier for some reason.  However, I hope that it's clear that we
> have a problem getting releases out due to Windows testing.
>
> I think flat out the problem is that building on Windows is just a
> pain.  I remember it took me several days to get a working build
> environment so I could be the last signature on 1.6.19.  Unfortunately
> I can't be the last vote on the more recent releases because I've been
> the RM.
>
> There are several possible solutions to this problem.
>
> 1) We could lower the votes required for Windows.  It seems like we
> are able to consistently get 2 votes, it's the 3rd that is often the
> problem.  If you look at the last several releases often the 2 votes
> for Windows come in before the 2nd or 3rd vote for Unix.  If you
> assume each of the 6 votes needed come from different people that
> means you need 6 different people to approve a release.  I'm not sure
> how many active contributors we have but 6 could easily be half.  It's
> not necessary that all 6 be given by different people, but in practice
> this is what happens.  Another point is we only require 3 votes for
> Unix when there are many platforms that fall under that category, some
> of them quite different.  Most of the time the 3 Unix votes ends up
> being done on Linux, sometimes even one distro of Linux.  On the flip
> side of this most of our client users are using Subversion on Windows,
> yet we do most of our development on Linux.  So maybe we are justified
> in the current voting setup.  But changing the voting would help
> resolve the release delays.
>
> 2) I could stop RM'ing.  That throws another person in the pool of
> people who test on Windows.  Right now there are 5 people who have
> tested on Windows in the last 4 releases, I'm one of those people.  In
> comparison Unix has a pool of about 6 people (with me being the person
> who's overlappping).  I do think the actual pool of potential Unix
> voters is larger than is suggested by the last 4 releases, since there
> are quite a few names not included that I know could have voted if
> necessary.  However, I'm not thinking of anyone included in the
> Windows pool that is known to have a working Windows build setup.  It
> would give me a chance to spend some time on hopefully improving the
> situation since I'd be dealing with Windows for the release voting
> (which I haven't done since 1.6.19).
>
> 3) WANdisco has non-committers building and testing the release
> candidates in order to provide binary packages.  I'd imagine
> Collab.net has something similar going on since they also provide
> Windows binaries.  We could start accepting the results of these tests
> as a vote with a committer validating that the tests were done with
> the proper source package and provides the signature for the vote.
> You can look at this as not really any different than a committer
> committing a non-committers code change.
>
> These three above responses might help provide some more immediate
> solutions but they don't really resolve the problem.  So let's look at
> some options that solve the root problem.
>
> 4) One of the major problems with building Subversion on Windows is
> building the dependencies.  We could build and package a pre-built set
> of dependencies that we provide for download.  A script could be
> written that downloads this, puts everything 

Re: Subversion & Windows

2013-01-08 Thread Justin Erenkrantz
On Tue, Jan 8, 2013 at 10:07 PM, Ben Reser  wrote:

> Given stefan2's argument I don't think it's unreasonable to lower the
> Windows votes, but I think removing them entirely is probably going
> too far.
>

Given the fact that we repeatedly have trouble securing votes just for
Windows, I think you are in the clear minority.  =P  -- justin


Re: Subversion & Windows

2013-01-09 Thread Justin Erenkrantz
On Wed, Jan 9, 2013 at 3:30 AM, Branko Čibej  wrote:

> We talked about that a couple days ago, but the problem is that a
> Windows VM requires a Windows OS license for every user of that VM.
> That's not something we can provide or hack around (well ... we could
> hack around it, but it would be a really bad idea).
>

At least for committers, Microsoft makes an MSDN subscription
available...so, it's a tractable problem if our intention is to get more
committers voting for Windows releases (which, duh, is why Microsoft does
it in the first place!).  -- justin


Re: 1.8 Release Status : Issue triage

2013-01-18 Thread Justin Erenkrantz
On Thu, Jan 17, 2013 at 1:19 AM, Branko Čibej  wrote:

> > 1 has not been marked as started nor been assigned but is tied to an
> > open serf issue:
> > 4274  DEFECT  P3  All issues@subversion   philip  NEW
>   1.8.0   serf
> > client hangs when server crashes
>
> I reviewed this the other day and submitted
>
> https://code.google.com/p/serf/issues/detail?id=94
>
> as I believe the root cause is in Serf itself.
>
>
How about this?  If the connection never resulted in a successful HTTP
response during its current lifetime (which gets zero'd in
reset_connection), then we can return an error back upstream.  I think
we'll still be fine on Windows as the HUP should be detected *after* the
full response is read by serf.  So, in this case, we should bug out after a
max of 2 attempts...and maybe one if OPTIONS is the first request on the
connection.  -- justin

Index: outgoing.c
===
--- outgoing.c  (revision 1716)
+++ outgoing.c  (working copy)
@@ -,7 +,10 @@
 /* The connection got reset by the server. On Windows this can
happen
when all data is read, so just cleanup the connection and open
a new one. */
-return reset_connection(conn, 1);
+if (conn->completed_responses) {
+return reset_connection(conn, 1);
+}
+return APR_EGENERAL;
 }
 if ((events & APR_POLLERR) != 0) {
 /* We might be talking to a buggy HTTP server that doesn't


Re: 1.8 Release Status : Issue triage

2013-01-19 Thread Justin Erenkrantz
On Fri, Jan 18, 2013 at 7:54 AM, Branko Čibej  wrote:

>  Doesn't help ... apparently because we get both APR_POLLIN and
> APR_POLLHUP events, and the former is processed first, the function returns
> without error in the previous block:
>
> /* If we decided to reset our connection, return now as we don't
>  * want to write.
>  */
> if ((conn->seen_in_pollset & APR_POLLHUP) != 0) {
> return APR_SUCCESS;
> }
>
> I don't know enough about Serf to even begin guessing at the correct
> solution.
>

The same if check works in that block too.  =)   With the debug-abort flag
set with serf r1717+ (backported to 1.2.x as well):

% svn ls http://localhost:21974/svn-test-work/repositories/basic_tests-3
svn: E120108: Unable to connect to a repository at URL '
http://localhost:21974/svn-test-work/repositories/basic_tests-3'
svn: E120108: Error running context: The server unexpectedly closed the
connection.

Enjoy.  -- justin


Re: packages tree

2013-01-29 Thread Justin Erenkrantz
On Tue, Jan 29, 2013 at 1:04 PM, Ben Reser  wrote:

> If we're not going to keep these things up to date then I think we
> should just remove them.
>

+1 to remove.  (It's all version-controlled anyway!)  -- justin


FOSDEM 2013

2013-01-29 Thread Justin Erenkrantz
FWIW, I'll be at FOSDEM in Brussels this weekend (arriving on Friday
morning) - if any other SVN devs are going to be there, it'd be fun to
catch up.

Cheers.  -- justin


Re: svn commit: r1441814 - in /subversion/trunk/subversion: svn/ tests/cmdline/

2013-02-03 Thread Justin Erenkrantz
As another data point, I have hit this text-as-binary myself just a few
weeks ago when I added a bunch of HTML files to a local repository - so,
it's definitely occurring automatically.  I did not have a chance to dig
into why the magic detection failed so miserably...  -- justin

On Saturday, February 2, 2013, Bert Huijben wrote:

>
>
> > -Original Message-
> > From: s...@apache.org  [mailto:s...@apache.org
> ]
> > Sent: zaterdag 2 februari 2013 23:05
> > To: comm...@subversion.apache.org 
> > Subject: svn commit: r1441814 - in /subversion/trunk/subversion: svn/
> > tests/cmdline/
> >
> > Author: stsp
> > Date: Sat Feb  2 22:04:44 2013
> > New Revision: 1441814
> >
> > URL: http://svn.apache.org/viewvc?rev=1441814&view=rev
> > Log:
> > When a binary mime-type is set on a file that looks like a text file,
> > make the 'svn' client print a warning about potential future problems
> > with operations such as diff, merge, and blame.
> >
> > This is only done during local propset for now, because the file needs
> > to be present on disk to detect its mime-type.
> >
> > See for related discussion: http://mail-
> > archives.apache.org/mod_mbox/subversion-
> > dev/201301.mbox/%3C20130131185725.GA13721%40ted.stsp.name%3E
>
> From my users I hear that another way this property is introduced is via
> conversions from other version management systems. Visual SourceSafe (long
> dead, but still used in a lot of small shops) marks UTF-8 files with a BOM
> as binary when it does an auto detect.
> (Well what would you guess for a system that wasn’t really updated since
> that format became popular)
>
> Most conversion tools just copy the binary flag, and there you have this
> problem on all your historic utf-8 files.
> (Where I worked we had this problem on all .xml files previously stored in
> sourcesafe).
>
>
> I don't see a lot of users accidentally adding invalid properties
> themselves.
>
> Bert
>
>


Re: svn commit: r1441814 - in /subversion/trunk/subversion: svn/

2013-02-03 Thread Justin Erenkrantz
On Sun, Feb 3, 2013 at 4:41 AM, Bert Huijben  wrote:

> Maybe we should try to fix this automatic detection then (Does your
> setting match the libmagic output for that file?) instead of adding
> just a warning when setting the file binary explicitly.
>
> Maybe we should validate our libmagic based decision to make something
> binary?
>

Taking a look while here at FOSDEM...libsvn_client set the mime-type as
"application/xml":

% svn diff index.shtml
Index: index.shtml
===
Cannot display: file marked as a binary type.
svn:mime-type = application/xml
% svn pg svn:mime-type index.shtml
application/xml
% file --mime-type index.shtml
index.shtml: application/xml
% file -v
file-5.11
magic file from /opt/local/share/misc/magic

I thought we treated application/xml as text at one time?  -- justin


Re: svn commit: r1441814 - in /subversion/trunk/subversion: svn/

2013-02-03 Thread Justin Erenkrantz
On Sun, Feb 3, 2013 at 6:06 AM, Justin Erenkrantz wrote:

> I thought we treated application/xml as text at one time?  -- justin
>

Yes, we did...and Ben -1'ed it back in 2004.  See:

http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=374218

I guess...but...really?  Is the only way for a user to override this is to
manually set their local auto-props?  *sigh*  -- justin


Re: svn commit: r1441814 - in /subversion/trunk/subversion: svn/

2013-02-03 Thread Justin Erenkrantz
On Sun, Feb 3, 2013 at 6:42 AM, Stefan Sperling  wrote:

> On Sun, Feb 03, 2013 at 06:16:03AM -0500, Justin Erenkrantz wrote:
> > On Sun, Feb 3, 2013 at 6:06 AM, Justin Erenkrantz  >wrote:
> >
> > > I thought we treated application/xml as text at one time?  -- justin
> > >
> >
> > Yes, we did...and Ben -1'ed it back in 2004.  See:
> >
> >
> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=374218
> >
> > I guess...but...really?  Is the only way for a user to override this is
> to
> > manually set their local auto-props?  *sigh*  -- justin
>
> Perhaps we should treat mime-types that contain 'charset=UTF-8'
> as text by default? libmagic is able to detect the charset AFAIK.
> Our code currently removes charset information from the mime-type
> returned by libmagic.
>
> I don't really like the idea of adding yet another special case
> to our definition of what a "text" mime-type is. But it might
> be worth doing this if it impacts a lot of people.
>

% file -i -k index.shtml
index.shtml: application/xml; charset=us-ascii

Not all of us use UTF-8, but yes the sentiment is valid.  =)  I do wonder
if printing out this new message saying SVN considers XML files as binary
is going to increase the amount of people realizing that they are tripping
over this.  -- justin


Re: exclusive WC locks by default? (was: Re: svn commit: r1447487 - /subversion/trunk/subversion/svn/svn.c)

2013-02-18 Thread Justin Erenkrantz
On Feb 18, 2013 7:08 PM, "Stefan Sperling"  wrote:
>
> On Mon, Feb 18, 2013 at 10:21:16PM +0100, Bert Huijben wrote:
> > If I would use 1.8 to run
> >
> > svn diff |more
> >
> > And leave that open in a console I would block TortoiseSVN, AnkhSVN,
Subclipse, etc on my entire working copy, while this command doesn't even
obtain a working copy lock.
> >
> > And as a developer on both 'svn' and AnkhSVN I often use 'svn diff'
just for reviews.
> >
> > And all those other clients would just hang without a way to recover...
>
> I only use the command line but run 'svn' commands concurrently quite
often.
>
> E.g. I run 'svn diff' while 'svn commit' has the log message editor open.
> I also use 'svn log --diff' a lot and sometimes still have it running
while
> committing from the same working copy.
>
> I've disabled exclusive locking in my client configuration for this reason
> and wouldn't mind either if exclusive locking was off by default.

+1 - "svn diff" while running "svn commit" (blocking waiting for $EDITOR to
exit) is my SOP as well.  If we break that by default, I consider it a
regression.   -- justin


Re: BDB deprecation (was: branch 1.8 or at least start making alpha releases?)

2013-02-26 Thread Justin Erenkrantz
On Tue, Feb 26, 2013 at 1:54 PM, C. Michael Pilato wrote:

> * The appropriate time to stop supporting Berkeley DB is in the same
> release
> for which existing FSFS will also have to dump/load.  It is cruel to force
> admins to endure the migration process twice -- possibly in successive
> releases of Subversion -- and especially when one of those times is just
> for
> a (possibly less-than-compelling) bit of a performance boost.
>

I brought this up here in Portland with Brane et al - but, I'd be a tad
concerned if we're going to make a dump/load *mandatory* for FSFS.  Sure,
we can advise a dump/load to get better performance, but I think we have
shot ourselves in the foot with the client-side WC upgrade being mandatory.
 I hate the fact that 1.7 and 1.8 can't share WCs and force me to do 'svn
upgrade'.  As a developer testing trunk, this really blows...


> * That said, I'm okay with deprecating Berkeley DB today as a warning to
> existing BDB users that change is a-comin', though the release notes should
> (again) indicate that there's no reason to rush off and convert to FSFS
> until an as-yet-undecided future revision forces the issue for *all*
> Subversion users.
>

+1.  -- justin


Re: Printing error stacks.

2013-04-13 Thread Justin Erenkrantz
On Fri, Apr 12, 2013 at 6:28 PM, Peter Samuelson  wrote:

> What is the utility of the "(apr_err=_)" suffix there?  I think it's
> better to omit it.  Print apr_err= only when it is new information.
>

+1.  -- justin


Re: log --search test failures on trunk and 1.8.x

2013-04-21 Thread Justin Erenkrantz
On Sun, Apr 21, 2013 at 9:37 AM, Bert Huijben  wrote:

> Do we want case folding (or at least case variant compare) support in our
> libraries for 1.8?
>

No.


> Or is this 1.9+ scope?
>

Yes.  =)

My $.02.  -- justin


Re: log --search test failures on trunk and 1.8.x

2013-04-24 Thread Justin Erenkrantz
On Wed, Apr 24, 2013 at 1:34 AM, Branko Čibej  wrote:

> If you consider all this, the easiest approach by far might be to simply
> add a Lucene index of all log messages to the server, then you can and
> any number of bells and whistles including  language-specific stemming.
> I'd consider that a better solution then any homegrown full-text search
> facility; these are never easy.


Ha!  That might actually be easier than importing all of the Unicode
nastiness.  =)  -- justin


Re: svn commit: r1483795 - /subversion/branches/1.8.x/STATUS

2013-05-17 Thread Justin Erenkrantz
On Fri, May 17, 2013 at 5:58 PM, Ivan Zhakov  wrote:

> Yes, this will be good improvement anyway, but I think repository
> integrity should be first goal.
>

+1!  =P  -- justin


Re: C++ thoughts for Berlin

2013-05-29 Thread Justin Erenkrantz
On Wed, May 29, 2013 at 12:05 PM, Blair Zajac  wrote:

> I'm generally in favor of a move to C++, it would be nice to get features
> that we work around now in C.
>

Rewriting even some of our core libraries to use C++ (even if it we kept
the existing C API) just doesn't seem to address any real problems that we
have.  We'd likely be having to write off support for a lot of platforms
due to the inconsistent nature of many C++ compilers on platforms we have
supported since 1.0.  I do not think this is a good thing.

With regards to libraries, I have had nothing but horrible developer
experiences with Boost - it's pretty counter-intuitive in a lot of places;
and C++11 isn't anywhere near widely supported to be considered if we want
to keep broad platform support.

As trying to use APR in a C++-based memory management model is fraught with
paradigm conflicts, we'd quite likely need to write a new portability layer
and new HTTP networking layer.  Fun!  (Not.)

BTW, I believe that GCC is special - due to its bootstrapping
methodologies, it's only really meant to be compiled by itself - this
doesn't apply to Subversion, so I think that analogy is a bit of red
herring.

If we really switched to having core libraries written in C++, I would
forcefully argue that it has to be SVN 2 (regardless if we kept the C API
identical)...and I'd probably say we should just rename the project to
something else - it's not Subversion at that point, but something else.  --
justin


Re: C++ thoughts for Berlin

2013-05-29 Thread Justin Erenkrantz
On Wed, May 29, 2013 at 6:03 PM, Blair Zajac  wrote:

> Yup, I've had lots of issues with this.  Putting C++ pool wrappers in C++
> classes and having them destroy in the correct order can be tricky to get
> right (lots of core dumps in our internal RPC server).  One of the nice
> things about moving to C++, assuming we don't use pools, is that one could
> stop thinking about memory management.
>

I don't see how you can have a clean C++ implementation as we expose pools
in the C API.  I just think we're going to bring a world of hurt upon us
all if we try to do a wholesale rewrite of C++ and try to shove the pool
model into it.  (The batons and callbacks all require pools with a certain
lifetime - so there are third-party code that is relying upon the pool
hierarchy being correct.)

If we really wanted to go that route, the far saner (if you can call it
that) approach is to just call it Subversion 2.0 and have a pool-free
external API...from where I sit, I don't see much value going that far -
how many years do we want to dedicate to 2.0?  Are things ultimately so
broken that we simply can't make Subversion better unless we go to C++?
 It's not a zero-cost item.

To be perfectly clear, I'm well aware that many folks have written some
particular library modules in C++ over the years. Perhaps I could see a
case for a new FS layer written in C++ - those who don't have a modern C++
compiler can still use fsfs.  (We go from 2 to 1 and back to 2 FS impl's
again!)  Rewriting libsvn_client or libsvn_wc in C++ is where I think
things would just breakdown if we tried to do it in a piecemeal approach -
at that point, everything underneath would have to change to C++.  And, we
know how much fun it was to rewrite libsvn_wc the last time.  -- justin


Re: Dependency versioning question

2013-06-05 Thread Justin Erenkrantz
+1 for "no big deal" too, I think.

Depending upon how far we take serf 2.0 and its APIs, a new libsvn_ra_serf2
library may be in order.  So, it'd be no different than all of the
libsvn_fs* and libsvn_wc internal changes.

If we altered the RA APIs, then that would be a no-no if we did it in a
backwards-incompatible manner.  -- justin


On Thu, Jun 6, 2013 at 8:50 AM, Greg Stein  wrote:

> Hey all,
>
> I've got a question where I'm not quite sure what the answer "should"
> be. While I wrote the rules on versioning components, it never talked
> about cross-component and dependency versioning. For Subversion, we
> said "all svn components should be at the same version".
>
> But dependencies.
>
> Within serf, we're thinking about new models for connection and
> protocol handling. And bumping to 2.0 to make that happen. What is the
> effect upon libsvn_ra_serf and svn in general?
>
> My thinking is "no big deal". That upgrading to svn 1.9 (or 1.10 or
> whatever) would require installation of serf 2.0. All svn APIs would
> remain unchanged, per our own versioning guidelines (both source and
> binary APIs).
>
> Thoughts?
>
> Cheers,
> -g
>


Re: --rm-I alias to --remove-ignored

2013-07-02 Thread Justin Erenkrantz
On Mon, Jul 1, 2013 at 5:37 PM, Daniel Shahaf wrote:

> How about adding 'svn cleanup --rm-I' as a short option for 'svn cleanup
> --remove-ignored'?
>

I don't like either option (rm-I or rm-?) as I don't find either an
improvement over the long option.  Mixing lower-and-upper in the same
argument is confusing.  For the other option, having a non-alphanumeric
character seems like a really bad idea in a supposed short-option string.

My $.02.  -- justin


Re: Tri-state for http-detect-chunking config option

2013-07-13 Thread Justin Erenkrantz
On Thu, Jul 11, 2013 at 10:15 PM, Greg Stein  wrote:

> Hey all,
>
> Right now, trunk and the 1.8.1 backport request implements the following
> option:
>
> http-detect-chunking = off / default: perform no detection. A 411
> response will throw an error message suggest to turn this option on.
>

I'm not really a fan of exposing this as a config knob in any situation.

I think that we should always run the detection.  It's one additional
request in 1.8.x...and I think we can indeed remove that extra round-trip
in 1.9.x.  -- justin


Re: [PATCH] Change error reporting for xlate problems

2013-07-13 Thread Justin Erenkrantz
On Sat, Jul 13, 2013 at 8:39 PM, Daniel Shahaf  wrote:

> It appears as soon as svn_utf_cstring_to_utf8() is called --- which is
> normally during argv parsing.  The error happens even if the argv
> arguments are all ASCII, which effectively adds a new dependency on
> apr_xlate_* even for people who use only ASCII.  I assume this new
> failure mode for ASCII-only users means we cannot backport this change.
>

This now means iconv or apr-iconv are a hard global dependency where they
haven't been before.  I'm not sure that I like that.  (If you're going to
do this patch, remove the if check for EINVAL/ENOTIMPL - it's unnecessary.)

IIRC, the reason is that this code gets called all the time - we had to
silence the ENOTIMPL and EINVAL errors as it really shouldn't be treated as
a fatal error.  So, in this case, we're back to treating it as an
irrecoverable error.

It's probably better to just check APR_HAS_XLATE and return an error in
svnsync if that's 0...and let the string pass through unmodified - which is
probably fine for svnsync case...or perhaps go a step further and just
refuse to build svnsync if iconv isn't supported as we might not be able to
guarantee the integrity of the sync'd logs.  I'm not sure how paranoid we
need to be here.  -- justin


Re: Security patches release process

2013-08-07 Thread Justin Erenkrantz
On Wed, Aug 7, 2013 at 5:48 AM, Daniel Shahaf  wrote:

> (v3)
> - Receive report
> - Come up with a fix
> - Gather 3 +1 votes via shadow status file
> - Send pre-notifications
> - Release a signed .diff file (instead of a tarball) as 1.8.(x+1)
> - Commit fix with a log message clearly outlining the security implications
> - Backport without going via STATUS (already has 3 +1s)
> - Tag and roll 1.8.(x+2) whenever we had scheduled the next release to.
>

I'm okay with choice 3 - however, the "release a signed .diff" should also
include a full release - including whatever normal things we include with a
standard release process.  If 1.8.(x+1) is *only* available as a diff
against 1.8.x, that makes it hard for downstream users (or packagers) until
1.8.(x+2) is released.  -- justin


  1   2   3   >