Re: Proposal: Change repository's UUID over RA layer
On Fri, Aug 6, 2010 at 7:34 AM, Branko Čibej wrote:
> Ahem. You guys have forgotten about Justin's RW-master/RO-slave
> replication hack, which *requires* the slave repositories to have the
> same UUID as the master. And that 'svnadmin load' has both --ignore-uuid
> and --force-uuid.

When doing any master/slave initial configuration, doing a local
'svnadmin setuuid' should be sufficient. IOW, I don't see that use case
as driving this need. -- justin
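[Editorial note: 'svnadmin setuuid' (available since Subversion 1.5) is a one-liner; the repository path and UUID below are placeholders, not values from this thread.]

```
# Force the slave's UUID to match the master's:
svnadmin setuuid /var/svn/slave-repos 2c545d64-0a3f-4806-90ca-f82d64f30b25

# With no UUID argument, 'svnadmin setuuid' generates a brand-new UUID:
svnadmin setuuid /var/svn/slave-repos
```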
Re: Noise: spicy autoindex httpd.conf workaround #fail
Based on what you describe, I think this is likely what you are looking for:

http://httpd.apache.org/docs/2.0/mod/mod_mime.html#modmimeusepathinfo

http://mail-archives.apache.org/mod_mbox/httpd-dev/200209.mbox/%3c20020904211343.ga16...@apache.org%3e
http://mail-archives.apache.org/mod_mbox/httpd-dev/200306.mbox/%3c2147483647.1054772...@[10.0.1.37]%3e

HTH. -- justin

On Fri, Aug 20, 2010 at 1:27 AM, C. Michael Pilato wrote:
> [Warning: This matter is far from highly pertinent. One tackles strange
> non-problems when in an atypical environment, such as a hotel room in CA.]
>
> I had someone ask me about Subversion autoindex support. So, like, you
> point a web browser at
> http://svn.apache.org/repos/asf/subversion/site/publish/ and *pow* magically
> you are now looking at the index.html inside that directory.
>
> Clearly, this could be done with an hour or two of mod_dav_svn hackery and
> some new directives there. But I was trying to come up with an httpd.conf
> workaround that did the trick. Here's what I tried. (On my system, all my
> Subversion repositories live inside the /repos/ Location.)
>
>    # If this is a GET request (but not a subrequest) aimed at my
>    # collection of Subversion repositories and with a trailing slash, and
>    # if there exists an index.html file inside that directory, then
>    # temporarily redirect the browser to the index.html file.
>    RewriteEngine on
>    RewriteCond %{IS_SUBREQ} false
>    RewriteCond %{REQUEST_METHOD} GET
>    RewriteCond %{REQUEST_URI} ^/repos/.*/$
>    RewriteCond %{REQUEST_URI}index.html -U
>    RewriteRule /repos/(.*)/$ /repos/$1/index.html [L,R]
>
> The result was that for every directory in which an index.html was found,
> that file was served (via a browser redirect). Yay! Unfortunately, the
> redirect was transmitted for directories which had no index.html child, too.
> Boo!
>
> Sadly, I found that despite the fact that the Apache docs say about that
> "-U" test the following:
>
>    '-U' (is existing URL, via subrequest)
>    Checks whether or not TestString is a valid URL, accessible via all
>    the server's currently-configured access controls for that path. This
>    uses an internal subrequest to do the check, so use it with care - it
>    can impact your server's performance!
>
> In reality "validity" in this context seems to have nothing to do with
> "existence". I traced the subrequest that mod_rewrite made into Subversion,
> and found that it never enters mod_dav to actually perform an existence
> check. I guess I expected that the subrequest would GET all the way into
> Subversion, where it would get the appropriate error code (HTTP_NOT_FOUND).
> In retrospect, I think I knew that subrequests don't behave like
> full-fledged content-fetching requests. But the documentation quoted above
> is pretty misleading, at any rate, IMO.
>
> --
> C. Michael Pilato
> CollabNet <> www.collab.net <> Distributed Development On Demand
Re: Reduce the 1.7 release feature set a bit?
On Fri, Aug 20, 2010 at 3:29 AM, Stefan Sperling wrote:
> just me): http://subversion.tigris.org/issues/show_bug.cgi?id=3684

Lieven fixed this in serf trunk a few months back. IOW, I can reproduce
with serf 0.6.1, but not serf trunk.

Greg/Lieven: shall we cut a new release? Greg did the last release, so I
would prefer he do it.

I'll also go figure out what needs to be backported in ra_serf from
trunk to 1.6.x to work against current serf releases. -- justin
Re: [serf-dev] Re: Reduce the 1.7 release feature set a bit?
On Fri, Aug 20, 2010 at 11:15 AM, Lieven Govaerts wrote:
> The memory usage problem is probably still there. Last time I looked
> at it with valgrind (at the elego days), most of the memory was
> allocated in the sqlite libraries. This has probably improved a lot
> with the single-db move, but I haven't tested yet.

FWIW, I didn't see the memory go above ~48MB on my Mac when checking out
SVN trunk with subversion & serf trunk. So, I'm not tempted to pursue it
further at this time.

Getting ra_serf in 1.6.x happy with current versions of serf is more
important (to me). =) -- justin
Re: [serf-dev] Re: Reduce the 1.7 release feature set a bit?
On Fri, Aug 20, 2010 at 11:19 AM, Justin Erenkrantz wrote:
> On Fri, Aug 20, 2010 at 11:15 AM, Lieven Govaerts wrote:
>> The memory usage problem is probably still there. Last time I looked
>> at it with valgrind (at the elego days), most of the memory was
>> allocated in the sqlite libraries. This has probably improved a lot
>> with the single-db move, but I haven't tested yet.
>
> FWIW, I didn't see the memory go above ~48MB on my Mac when checking
> out SVN trunk with subversion & serf trunk. So, I'm not tempted to
> pursue it further at this time.
>
> Getting ra_serf in 1.6.x happy with current versions of serf is more
> important (to me). =) -- justin

1.6.x + the r879757 backport group now works with serf trunk again. All
'make check' tests (via davautocheck.sh) pass except:

  FAIL:  merge_tests.py 133: replace that causes a tree-conflict

Considering everything else passes, I'll go out on a limb and say that's
likely not due to the serf version. (I'm guessing it's a known failure??)

Greg will likely cut a serf release this weekend that will incorporate
r1401 from serf, which will let the old-style authentication handlers in
1.6.x-era ra_serf work. I guess it'll either be 0.7.0 or 0.6.2 - not
sure what his plans are - I just validated against serf trunk. (If we
didn't incorporate r1401 in the new serf release, we'd have to backport
all of the authn changes in ra_serf from trunk, and I'd rather not do
that. ra_serf in SVN trunk already uses the new serf authn framework.)

BTW, those who complain that no one's working on ra_serf haven't looked
at the diffs between 1.6.x and trunk. New authn framework, digest auth,
Kerberos auth, error handling fixes, better proxy support, HTTPv2
support, etc. Many, many kudos to everyone driving ra_serf forward!
*group hug* -- justin
Re: svn commit: r987689 - /subversion/branches/1.6.x/STATUS
On Sun, Aug 22, 2010 at 1:02 PM, Lieven Govaerts wrote:
> Hey Justin,
>
> On Sat, Aug 21, 2010 at 4:30 AM, wrote:
>> Author: jerenkrantz
>> Date: Sat Aug 21 02:30:31 2010
>> New Revision: 987689
>>
>> URL: http://svn.apache.org/viewvc?rev=987689&view=rev
>> Log:
>> Propose r879757 & r880320 for backport to 1.6.x.
>>
>> Modified:
>>     subversion/branches/1.6.x/STATUS
>>
>> Modified: subversion/branches/1.6.x/STATUS
>> URL: http://svn.apache.org/viewvc/subversion/branches/1.6.x/STATUS?rev=987689&r1=987688&r2=987689&view=diff
>> ==============================================================================
>> --- subversion/branches/1.6.x/STATUS (original)
>> +++ subversion/branches/1.6.x/STATUS Sat Aug 21 02:30:31 2010
>> @@ -233,6 +233,17 @@ Candidate changes:
>>     Votes:
>>       +1: danielsh
>>
>> + * r879757, r880320
>> +   Let ra_serf work with current serf releases.
>> +   Justification:
>> +     Having a dud client is bad. This seems to be the minimal required changes.
>> +   Backport branch:
>> +     ^/subversion/branches/1.6.x-r879757
>> +   Notes:
>> +     r879757 is the main fix. r880320 is a follow up fix.
>> +   Votes:
>> +     +1: jerenkrantz
>
> I didn't want to propose r879757 for backport because it changes the
> svn_ra_serf__conn_setup function declaration, which is used as a
> callback function for serf, in an incompatible way with serf 0.3. As
> long as one builds and runs svn with the same serf version there is no
> problem. The idea was to just raise the minimum serf version with the
> svn 1.7 release, so this problem couldn't happen.
>
> Is this something we make promises about?

It has the ifdef, so older serfs should work fine. Or, am I missing
something? Is the issue compiling against 0.4.x+ and running with
0.3.x+? If so, I'm not sure that's worth worrying about. (Serf doesn't
have a promise of binary API compatibility...)

The core issue that I'm trying to address in this backport is that we
don't have enough auto-fu checks currently in place to block
combinations of 1.6.x with current releases of serf. We have no version
checks at configure-time - only at compile-time; and the compile-time
checks in 1.6.x don't complain if they see a serf version they don't
know about.

So, right now, 1.6.x (without r879757) builds "successfully", but due to
the API mismatches, we get a dud client with serf 0.4.0+ and 1.6.x. -- justin
Re: Upgrade from 1.6 must use the same incremental steps? [was: svn commit: r987526 - ...]
On Sun, Aug 22, 2010 at 11:14 AM, Daniel Shahaf wrote:
> Greg Stein wrote on Fri, Aug 20, 2010 at 14:11:21 -0400:
>> I wish you wouldn't change the subject line so often.
>
> Why not? It has proper References: and In-Reply-To:. Should be enough
> for threading to work, no?

Not with Gmail - changes in subjects are treated as a new
conversation. -- justin
Re: svn commit: r987689 - /subversion/branches/1.6.x/STATUS
On Sun, Aug 22, 2010 at 7:25 PM, Greg Stein wrote:
> I also added an API to serf to facilitate runtime version checks (ie.
> hopefully before a call to a bogus signature is performed). But that
> call is only in later versions :-P

Yup - I noticed that when I added the minimum version auto-fu check
today. I'm not sure how critical it is to do that run-time check at
configure-time - I know that APR does it for BDB, but...

If we decide not to do the proposed backport, then 1.6.x might actually
need a *maximum* version check. But, I'd rather just make 1.6.x work
with a current serf instead...and probably force Subversion to require
0.6.2/0.7.0 (depending upon what Greg rolls soon), as 0.4.0-0.6.1 won't
work with 1.6.x, but 0.3.1 might.

My head hurts. =) -- justin
Re: svn commit: r987956 - /subversion/trunk/build/ac-macros/serf.m4
On Mon, Aug 23, 2010 at 4:28 AM, Daniel Shahaf wrote:
> Does this actually expand SERF_VERSION_STRING? A quick independent test
> indicates it wouldn't...

Ah, you're right. Hmm. Any ideas? -- justin
Re: Noise: spicy autoindex httpd.conf workaround #fail
Why do you want mod_rewrite at all? Enabling that mod_mime directive
will let mod_autoindex remap / to /index.html when it exists in the
virtual filesystem. -- justin

On Mon, Aug 23, 2010 at 10:48 AM, C. Michael Pilato wrote:
> Hrm. I'm not seeing the connection here. I'm looking for a way to pull off
> a Rewrite condition based on the existence of a given URI. The docs imply
> that this can be done with "RewriteCond SOME_URI -U", but appear to just be
> wrong -- the existence of SOME_URI doesn't appear to be tested at all, only
> its accessibility (from an authn/authz standpoint).
>
> On 08/20/2010 12:51 PM, Justin Erenkrantz wrote:
>> Based on what you describe, I think this is likely what you are looking for:
>>
>> http://httpd.apache.org/docs/2.0/mod/mod_mime.html#modmimeusepathinfo
>>
>> [...]
>
> --
> C. Michael Pilato
> CollabNet <> www.collab.net <> Distributed Development On Demand
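[Editorial note: an untested sketch of the non-rewrite configuration Justin is alluding to; the /repos paths mirror Mike's setup, and whether DirectoryIndex fully applies inside mod_dav_svn's virtual filesystem is exactly what ModMimeUsePathInfo is claimed to enable here.]

```
<Location /repos>
  DAV svn
  SVNParentPath /repos

  # Treat trailing path info as part of the filename, so per-directory
  # directives (like DirectoryIndex) see mod_dav_svn's virtual paths.
  ModMimeUsePathInfo On
  DirectoryIndex index.html
</Location>
```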
httpd trunk broken with Subversion: ap_log_rerror busted
In r951893, httpd modified a #define for APLOG_MARK to add in a new
parameter called APLOG_MODULE_INDEX (in addition to file and line info).

This busts Subversion - specifically, mod_authz_svn, which has a
function declared as:

  static void log_access_verdict(const char *file, int line,
                                 const request_rec *r, int allowed,
                                 const char *repos_path,
                                 const char *dest_repos_path)

and called with:

  log_access_verdict(APLOG_MARK, r, 1, repos_path, dest_repos_path);

Reading the very obtuse (unhelpful) commit log for r951893, as well as
any comments in that general area of http_log.h (which are lacking &
unhelpful), I have no idea what this APLOG_MODULE_INDEX is or why it's
even there.

Furthermore, this error case is very, very hard to track down because
we're relying upon multiple levels of #defines and indirections (hidden
static variables??, weird STDC wrappers, redefinitions of function
names, etc., etc.). So, it's not going to be obvious to downstream
developers what is going on and why it is broken.

In general, I'm not quite sure it's a good idea to bust a #define that
has been the same since the 1.3 days (if not earlier, actually). If we
want modules to use a new optimized log API, then we should introduce a
new optimized log API and not bust the old one in a way where the
failure case isn't obvious to track down.

Yes, we could fix this by making mod_authz_svn conditional on the new
MMN, but - again, it's about even figuring out that the API has changed
and what to do about it. The root cause is just way too obscured, IMO.
(I wish I could veto this whole commit just on over-complication alone -
it's not an elegant solution at all.)

My $.02. -- justin
Re: httpd trunk broken with Subversion: ap_log_rerror busted
On Wed, Aug 25, 2010 at 4:48 AM, Stefan Fritsch wrote:
> I agree that the comments/documentation should be improved. I will write a
> how-to for adjusting modules to the new API.

Here is a constructive suggestion (*grin*): in APR, for some of the more
complex declarations (see apr_pools.h and apr_pool_create), what we have
done is to create a #ifdef DOXYGEN block which contains the normalized,
reduced form of the function declaration. The #else clause contains the
optimized, macro-ized version. (In httpd, I don't know what the
appropriate #ifdef should be, though.)

This way, those reading http_log.h will be able to see what the
declaration for ap_log_rerror actually reduces to without trying to go
through several layers of indirection. -- justin
Re: [serf-dev] Re: serf 0.7.0 released
On Wed, Aug 25, 2010 at 2:45 PM, Blair Zajac wrote:
> Does it change the ABI? Do we need to wait for the next svn 1.6.x release
> before we can use it? I'm asking because I maintain the MacPorts serf
> package and don't want to break existing installs.

Subversion 1.6.12 is busted with serf 0.6.1 anyway. =(

You'd need the next release of Subversion 1.6.x (assuming the pending
backport is approved) to make 1.6.x work in combination with
0.7.0. -- justin
Re: [serf-dev] serf 0.7.0 released
On Wed, Aug 25, 2010 at 2:16 PM, Greg Stein wrote:
> Hi all,
>
> I'm pleased to announce the 0.7.0 release of serf.

Thanks! -- justin
Re: [serf-dev] Re: serf 0.7.0 released
On Wed, Aug 25, 2010 at 4:01 PM, Peter Samuelson wrote:
> Yeah but presumably he's asking because he's linking libsvn_ra_serf to
> serf 0.3.x now, so the question remains: is 0.7.0 ABI-compatible with
> 0.3.x? As far as I know, 0.3 is ABI-compatible with 0.2 and 0.1.

MacPorts is at 0.6.1. IOW, it's already busted now. =( -- justin
Re: [serf-dev] Re: serf 0.7.0 released
On Wed, Aug 25, 2010 at 4:24 PM, Blair Zajac wrote:
> On 8/25/10 4:16 PM, Justin Erenkrantz wrote:
>> On Wed, Aug 25, 2010 at 4:01 PM, Peter Samuelson wrote:
>>> Yeah but presumably he's asking because he's linking libsvn_ra_serf to
>>> serf 0.3.x now, so the question remains: is 0.7.0 ABI-compatible with
>>> 0.3.x? As far as I know, 0.3 is ABI-compatible with 0.2 and 0.1.
>>
>> MacPorts is at 0.6.1. IOW, it's already busted now. =( -- justin
>
> Busted checkouts and operations are better than core dumps :)

Subversion 1.6.12 with serf 0.6.1 from MacPorts doesn't even do any type
of a checkout - it always errors out immediately.

Backing up, there are three faults:

 - API/ABI compatibility - this was broken way back in 0.4.0 - no
   current 1.6.x release of Subversion can work with anything from serf
   0.4.0 or higher. You need the proposed backport to Subversion
   (1.6.x-r879757) to make 1.6.x work with the serf 0.4.0+ API.
 - Authn framework fix - this is fixed in 0.7.0.
 - Memory usage - this is also fixed in 0.7.0.

1.7.x/trunk of Subversion could work with 0.4.0-0.6.1 - this is where
Stefan and others reported the memory leaks - but Subversion has never
officially released a version that works with serf 0.4.0+.

> Can I get an answer to the question?

I'm not sure there are any API/ABI changes between 0.6.1 and 0.7.0.
Greg? Lieven? -- justin
Re: Worried about single-db performance
On Fri, Sep 3, 2010 at 8:39 AM, Greg Stein wrote:
> It "should" already be faster. Obviously, that's not the case.

I just spent a little bit of time with Shark and gdb. A cold run of
'svn st' against a Subversion trunk checkout yields 0.402 seconds for
1.6 and 0.919 seconds for 1.7. Hot runs for 1.6 are about 0.055 seconds,
with 1.7 at 0.750 seconds.

One striking difference in the perf profile between 1.6 & trunk is that
we seem to do a larger amount of stat() calls in 1.7.

From looking at the traces and code, I *think*
svn_wc__db_pdh_parse_local_abspath's call to svn_io_check_special_path
may be in play here:

  /* ### at some point in the future, we may need to find a way to get
     ### rid of this stat() call. it is going to happen for EVERY call
     ### into wc_db which references a file. calls for directories could
     ### get an early-exit in the hash lookup just above. */
  SVN_ERR(svn_io_check_special_path(local_abspath, &kind,
                                    &special /* unused */, scratch_pool));

I see this stat() call getting made approximately seven times *per* file
in a WC - that can't be good. The patch below brings us down to one
invocation of this particular svn_io_check_special_path call per status
run. (If anyone would like the stacktraces, I can send 'em along. If we
could somehow stop asking for this seven times per file, that'd be good
too. *grin*)

Shark says this patch saves us about 2.5% time-wise - not a whole lot -
but something like this is likely needed to relieve pressure on the I/O
subsystem. (I haven't rigorously tested this patch to see if it breaks
anything else - I hope someone more familiar with libsvn_wc will see if
this is useful.)

I'll keep looking through the timing profiles to see what else I can
uncover. -- justin

---
Remember WC files we've seen before to save on stat() calls.

* subversion/libsvn_wc/wc_db_private.h
  (svn_wc__db_t): Add a hash table to remember file entries.

* subversion/libsvn_wc/wc_db_pdh.c
  (svn_wc__db_open): Create file_data hash.
  (svn_wc__db_pdh_parse_local_abspath): Use hash to save stat() lookups.

Index: wc_db_pdh.c
===================================================================
--- wc_db_pdh.c	(revision 992534)
+++ wc_db_pdh.c	(working copy)
@@ -216,6 +216,7 @@ svn_wc__db_open(svn_wc__db_t **db,
   (*db)->auto_upgrade = auto_upgrade;
   (*db)->enforce_empty_wq = enforce_empty_wq;
   (*db)->dir_data = apr_hash_make(result_pool);
+  (*db)->file_data = apr_hash_make(result_pool);
   (*db)->state_pool = result_pool;

   return SVN_NO_ERROR;
@@ -424,10 +425,21 @@ svn_wc__db_pdh_parse_local_abspath(svn_wc__db_pdh_
       return SVN_NO_ERROR;
     }

-  /* ### at some point in the future, we may need to find a way to get
-     ### rid of this stat() call. it is going to happen for EVERY call
-     ### into wc_db which references a file. calls for directories could
-     ### get an early-exit in the hash lookup just above. */
+  /* Have we already successfully seen this file before? */
+  *pdh = apr_hash_get(db->file_data, local_abspath, APR_HASH_KEY_STRING);
+  if (*pdh != NULL && (*pdh)->wcroot != NULL)
+    {
+      const char *local_abspath_parent, *dir_relpath;
+      svn_dirent_split(&local_abspath_parent, &build_relpath, local_abspath,
+                       scratch_pool);
+
+      /* Stashed directory's local_relpath + basename. */
+      dir_relpath = svn_wc__db_pdh_compute_relpath(*pdh, NULL);
+      *local_relpath = svn_relpath_join(dir_relpath,
+                                        build_relpath,
+                                        result_pool);
+      return SVN_NO_ERROR;
+    }
+
   SVN_ERR(svn_io_check_special_path(local_abspath, &kind,
                                     &special /* unused */, scratch_pool));
   if (kind != svn_node_dir)
@@ -440,7 +452,8 @@ svn_wc__db_pdh_parse_local_abspath(svn_wc__db_pdh_
      For both of these cases, strip the basename off of the path and
      move up one level. Keep record of what we strip, though, since
      we'll need it later to construct local_relpath.
   */
-  svn_dirent_split(&local_abspath, &build_relpath, local_abspath,
+  const char *local_abspath_parent;
+  svn_dirent_split(&local_abspath_parent, &build_relpath, local_abspath,
                    scratch_pool);

   /* ### if *pdh != NULL (from further above), then there is (quite
      ### clear it out? but what if there is an access baton? */

   /* Is this directory in our hash? */
-  *pdh = apr_hash_get(db->dir_data, local_abspath, APR_HASH_KEY_STRING);
+  *pdh = apr_hash_get(db->dir_data, local_abspath_parent,
+                      APR_HASH_KEY_STRING);
   if (*pdh != NULL && (*pdh)->wcroot != NULL)
     {
       const char *dir_relpath;

@@ -458,6 +472,8 @@ svn_wc__db_pdh_parse_local_abspath(svn_wc__db_pdh_
       *local_relpath = svn_relpath_join(dir_relpath,
Re: Worried about single-db performance
On Sat, Sep 4, 2010 at 2:23 AM, Bert Huijben wrote:
> SQLite also does a stat call per statement, unless there is already a shared
> lock open, just to check if there is no other process that opened a
> transaction.
> (On Windows this specific stat to check for other processes operating on the
> same db is the performance killer for svn status: Just this stat takes more
> than 50% of the total processing).

Aha. Adding exclusive locking into our pragma
[http://www.sqlite.org/pragma.html] calls in svn_sqlite__open:

  PRAGMA locking_mode=exclusive;

brings the time for 'svn st' down from 0.680 to 0.310 seconds. And, yes,
the I/O percentages drop dramatically:

  ~/Applications/svn-trunk/bin/svn st  0.37s user 0.31s system 99% cpu 0.684 total
  ~/Applications/svn-trunk/bin/svn st  0.26s user 0.05s system 98% cpu 0.308 total

I *think* we'd be okay with having SQLite hold its read/write locks for
the duration of our database connection, rather than constantly giving
them up and refetching them between every read and write operation.

As I read the SQLite docs, we should still be able to have shared
readers in this model - but it'd create the case where idle shared
readers (due to network I/O?) would block an attempted writer. With the
normal locking mode, a writer could sneak in while a reader is idle
rather than active. However, I'd think our other locks would interfere
with this anyway...so I think it'd be safe.

Thoughts? -- justin
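[Editorial note: the pragma can be demonstrated with any SQLite binding; a minimal Python sqlite3 sketch - the table and values are made up:]

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "wc.db")
conn = sqlite3.connect(db_path)

# Switch this connection to exclusive locking. SQLite then acquires its
# file locks on first use and holds them until the connection closes,
# instead of re-acquiring (and re-checking the lock state on disk) per
# statement. The pragma echoes back the newly effective mode.
mode = conn.execute("PRAGMA locking_mode=exclusive;").fetchone()[0]

conn.execute("CREATE TABLE repository (id INTEGER PRIMARY KEY, uuid TEXT)")
conn.execute("INSERT INTO repository VALUES (1, 'fake-uuid')")
conn.commit()  # from here on, the exclusive lock stays with this connection

rows = conn.execute("SELECT uuid FROM repository WHERE id = 1").fetchall()
print(mode, rows)
```

The trade-off Justin describes is visible here too: while this connection holds its lock after the first write, a second connection attempting to write the same file gets a busy error until the first one closes.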
Repeated SQL queries when doing 'svn st'
When compiled with SVN_DEBUG and SQLITE3_DEBUG, running 'svn st' against
an svn trunk WC makes a number of things pop out.

We perform 28,062 SQL queries.

---
DBG: sqlite.c: 63: sql="select root, uuid from repository where id = 1;"
---

We execute *this* query (STMT_SELECT_REPOSITORY_BY_ID) 2215 times. Yikes.

I think this has to do with svn_wc__db_base_get_info's call to
fetch_repos_info. I'd think we'd be able to cache this result. I'll
take a stab and see if this reduction saves us any real time. The root
and uuid should be constant for a wc_id...right?

For each file that we hit the DB for, we execute the following queries:

---
DBG: sqlite.c: 63: sql="select repos_id, repos_relpath, presence, kind, revnum, checksum, translated_size, changed_rev, changed_date, changed_author, depth, symlink_target, last_mod_time, properties from base_node where wc_id = 1 and local_relpath = 'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c: 63: sql="select presence, kind, checksum, translated_size, changed_rev, changed_date, changed_author, depth, symlink_target, copyfrom_repos_id, copyfrom_repos_path, copyfrom_revnum, moved_here, moved_to, last_mod_time, properties from working_node where wc_id = 1 and local_relpath = 'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c: 63: sql="select prop_reject, changelist, conflict_old, conflict_new, conflict_working, tree_conflict_data, properties from actual_node where wc_id = 1 and local_relpath = 'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c: 63: sql="select base_node.repos_id, base_node.repos_relpath, presence, kind, revnum, checksum, translated_size, changed_rev, changed_date, changed_author, depth, symlink_target, last_mod_time, properties, lock_token, lock_owner, lock_comment, lock_date from base_node left outer join lock on base_node.repos_id = lock.repos_id and base_node.repos_relpath = lock.repos_relpath where wc_id = 1 and local_relpath = 'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c: 63: sql="select presence, kind, checksum, translated_size, changed_rev, changed_date, changed_author, depth, symlink_target, copyfrom_repos_id, copyfrom_repos_path, copyfrom_revnum, moved_here, moved_to, last_mod_time, properties from working_node where wc_id = 1 and local_relpath = 'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c: 63: sql="select prop_reject, changelist, conflict_old, conflict_new, conflict_working, tree_conflict_data, properties from actual_node where wc_id = 1 and local_relpath = 'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c: 63: sql="select root, uuid from repository where id = 1;"
DBG: sqlite.c: 63: sql="select prop_reject, changelist, conflict_old, conflict_new, conflict_working, tree_conflict_data, properties from actual_node where wc_id = 1 and local_relpath = 'contrib/server-side';"
DBG: sqlite.c: 63: sql="SELECT properties, presence FROM WORKING_NODE WHERE wc_id = 1 AND local_relpath = 'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c: 63: sql="select properties from base_node where wc_id = 1 and local_relpath = 'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c: 63: sql="select properties from actual_node where wc_id = 1 and local_relpath = 'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c: 63: sql="SELECT properties, presence FROM WORKING_NODE WHERE wc_id = 1 AND local_relpath = 'contrib/server-side/fsfsverify.py';"
DBG: sqlite.c: 63: sql="select properties from base_node where wc_id = 1 and local_relpath = 'contrib/server-side/fsfsverify.py';"
---

Notably, AFAICT, we're repeating a few of these queries:

 - STMT_SELECT_WORKING_NODE (2 times)
 - STMT_SELECT_ACTUAL_NODE (3 times)
 - STMT_SELECT_WORKING_PROPS (2 times)
 - STMT_SELECT_BASE_PROPS (2 times)

I haven't yet dug into why we're repeating the queries. So, I'd bet we
can cut our volume of SQL calls dramatically with some minor rejiggering
so that we don't lose the values from the first execution of each
statement. -- justin
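[Editorial note: the shape of the fix - memoizing identical read-only queries for the duration of a crawl - can be sketched like so in Python; the table, values, and 2215 count are illustrative, not wc_db code:]

```python
import sqlite3

# Illustrative sketch: cache results of identical read-only queries so a
# 'status'-style crawl doesn't re-run them once per node visited.
class CachingDb:
    def __init__(self, conn):
        self.conn = conn
        self.executed = 0   # how many queries actually reached SQLite
        self.cache = {}

    def query(self, sql, params=()):
        key = (sql, params)
        if key not in self.cache:
            self.executed += 1
            self.cache[key] = self.conn.execute(sql, params).fetchall()
        return self.cache[key]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE repository (id INTEGER PRIMARY KEY, root TEXT, uuid TEXT)")
conn.execute(
    "INSERT INTO repository VALUES (1, 'http://example.com/repos', 'dummy-uuid')")

db = CachingDb(conn)
# STMT_SELECT_REPOSITORY_BY_ID ran 2215 times during one 'svn st'; the
# root/uuid for a wc_id never change mid-run, so one execution suffices.
for _ in range(2215):
    root, uuid = db.query(
        "SELECT root, uuid FROM repository WHERE id = ?", (1,))[0]

print(db.executed)  # -> 1
```

This is safe for a read-only operation like status; a cache like this would need invalidation (or simply not be used) once the same connection starts writing.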
Re: Repeated SQL queries when doing 'svn st'
On Sat, Sep 4, 2010 at 10:18 AM, Justin Erenkrantz wrote:
> When compiled with SVN_DEBUG and SQLITE3_DEBUG and 'svn st' against a
> svn trunk WC, a number of things pop out.
>
> We perform 28,062 SQL queries.
>
> ---
> DBG: sqlite.c: 63: sql="select root, uuid from repository where id = 1;"
> ---
>
> We execute *this* query (STMT_SELECT_REPOSITORY_BY_ID) 2215 times. Yikes.
>
> I think this has to do with svn_wc__db_base_get_info's call to
> fetch_repos_info. I'd think we'd be able to cache this result. I'll
> take a stab and see if this reduction saves us any real time. The
> root and uuid should be constant for an wc_id...right?

It's actually svn_wc__db_read_info's fetch_repos_info call...

With 2215 queries:
  ~/Applications/svn-trunk-no-debug/bin/svn st  0.26s user 0.05s system 98% cpu 0.311 total

With a quick-and-hacky cache:
  ~/Applications/svn-trunk-no-debug/bin/svn st  0.25s user 0.05s system 98% cpu 0.298 total

It's worth a good 4% time savings...

A quick back-of-the-envelope calculation says that if we can remove all
of the extraneous 13,290 SQL queries (out of 28,062, leaving behind
14,772 queries), we will likely gain something like 25% - from 0.311
down to around 0.233 seconds.

It's still much higher than the 0.050 that 'svn st' on 1.6.x yields, but
inching closer... -- justin
Re: Repeated SQL queries when doing 'svn st'
On Sat, Sep 4, 2010 at 10:18 AM, Justin Erenkrantz wrote:
> Notably, AFAICT, we're repeating a few of these queries:
>
> - STMT_SELECT_WORKING_NODE (2 times)
> - STMT_SELECT_ACTUAL_NODE (3 times)
> - STMT_SELECT_WORKING_PROPS (2 times)
> - STMT_SELECT_BASE_PROPS (2 times)
>
> I haven't yet dug into why we're repeating the queries.

Okay - I now know why we're repeating the core queries twice.

In get_dir_status, we want to do a check to identify whether the node
exists and what kind it is - which is done by a call to
svn_wc__db_read_info (around line 1269 in status.c). But most of the
parameters (except for status and kind) are NULL.

If it's not excluded and we can go into the depth, then we call
handle_dir_entry on the entry a few lines down - which turns right
around and calls svn_wc__db_read_info, this time asking for everything.
This causes the core per-file queries to be executed twice.

I'm going to see what a quick check to retrieve just the kind and status
will do for the query volume. I think it's unlikely we have to pull
everything out of sqlite to answer that basic question. -- justin
Re: Repeated SQL queries when doing 'svn st'
On Sat, Sep 4, 2010 at 12:45 PM, Justin Erenkrantz wrote:
> I'm going to see what a quick check to retrieve just the kind and
> status will do for the query volume. I think it's unlikely we have to
> pull everything out of sqlite to answer that basic question. --
> justin

We can reduce the query volume by one query (per file), as we can skip
the ACTUAL_NODE table query - see the quick & dirty patch below. It does
replace 2 larger queries with 2 smaller ones, but I think our bottleneck
is likely more around query volume than query complexity...

Anyway, I'll stop replying to myself and enjoy the long weekend. =) -- justin

Index: status.c
===================================================================
--- status.c	(revision 992534)
+++ status.c	(working copy)
@@ -1263,10 +1270,7 @@ get_dir_status(const struct walk_status_baton *wb,
       svn_wc__db_status_t node_status;
       svn_wc__db_kind_t node_kind;

-      SVN_ERR(svn_wc__db_read_info(&node_status, &node_kind, NULL, NULL,
-                                   NULL, NULL, NULL, NULL, NULL, NULL, NULL,
-                                   NULL, NULL, NULL, NULL, NULL, NULL, NULL,
-                                   NULL, NULL, NULL, NULL, NULL, NULL,
+      SVN_ERR(svn_wc__db_read_info_exist(&node_status, &node_kind,
                                    wb->db, node_abspath,
                                    iterpool, iterpool));
       if (node_status != svn_wc__db_status_not_present
Index: wc_db.c
===================================================================
--- wc_db.c	(revision 992534)
+++ wc_db.c	(working copy)
@@ -51,7 +51,6 @@
 #include "private/svn_wc_private.h"
 #include "private/svn_token.h"

-
 #define NOT_IMPLEMENTED() SVN__NOT_IMPLEMENTED()
@@ -5051,6 +5050,161 @@ svn_wc__db_temp_op_delete(svn_wc__db_t *db,
 svn_error_t *
+svn_wc__db_read_info_exist(svn_wc__db_status_t *status,
+                           svn_wc__db_kind_t *kind,
+                           svn_wc__db_t *db,
+                           const char *local_abspath,
+                           apr_pool_t *result_pool,
+                           apr_pool_t *scratch_pool)
+{
+  svn_wc__db_pdh_t *pdh;
+  const char *local_relpath;
+  svn_sqlite__stmt_t *stmt_base;
+  svn_sqlite__stmt_t *stmt_work;
+  svn_boolean_t have_base;
+  svn_boolean_t have_work;
+  svn_error_t *err = NULL;
+
+  SVN_ERR_ASSERT(svn_dirent_is_absolute(local_abspath));
+
+  SVN_ERR(svn_wc__db_pdh_parse_local_abspath(&pdh, &local_relpath, db,
+                                             local_abspath,
+                                             svn_sqlite__mode_readonly,
+                                             scratch_pool, scratch_pool));
+  VERIFY_USABLE_PDH(pdh);
+
+  SVN_ERR(svn_sqlite__get_statement(&stmt_base, pdh->wcroot->sdb,
+                                    STMT_SELECT_BASE_NODE_EXIST));
+  SVN_ERR(svn_sqlite__bindf(stmt_base, "is",
+                            pdh->wcroot->wc_id, local_relpath));
+  SVN_ERR(svn_sqlite__step(&have_base, stmt_base));
+
+  SVN_ERR(svn_sqlite__get_statement(&stmt_work, pdh->wcroot->sdb,
+                                    STMT_SELECT_WORKING_NODE_EXIST));
+  SVN_ERR(svn_sqlite__bindf(stmt_work, "is",
+                            pdh->wcroot->wc_id, local_relpath));
+  SVN_ERR(svn_sqlite__step(&have_work, stmt_work));
+
+  if (have_base || have_work)
+    {
+      svn_wc__db_kind_t node_kind;
+
+      if (have_work)
+        node_kind = svn_sqlite__column_token(stmt_work, 1, kind_map);
+      else
+        node_kind = svn_sqlite__column_token(stmt_base, 1, kind_map);
+
+      if (status)
+        {
+          if (have_base)
+            {
+              *status = svn_sqlite__column_token(stmt_base, 0, presence_map);
+
+              /* We have a presence that allows a WORKING_NODE override
+                 (normal or not-present), or we don't have an override. */
+              /* ### for now, allow an override of an incomplete BASE_NODE
+                 ### row. it appears possible to get rows in BASE/WORKING
+                 ### both set to 'incomplete'. */
+              SVN_ERR_ASSERT((*status != svn_wc__db_status_absent
+                              && *status != svn_wc__db_status_excluded
+                              /* && *status != svn_wc__db_status_incomplete */)
+                             || !have_work);
+
+#ifndef SVN_WC__SINGLE_DB
+              if (node_kind == svn_wc__db_kind_subdir
+                  && *status == svn_wc__db_status_normal)
+                {
+                  /* We should have read a row from the subdir wc.db. It
+                     must be obstructed in some way.
+
+                     It is also possible that a WORKING node will override
+                     this value with a proper status. */
+                  *status = svn_wc__db_status_obstructed;
+                }
+#end
Autoconf warnings on trunk
I'm getting a slew of warnings when I run autogen.sh on trunk. (autoconf 2.68 via Macports.)

configure.ac:119: warning: AC_LANG_CONFTEST: no AC_LANG_SOURCE call detected in body
../../lib/autoconf/lang.m4:194: AC_LANG_CONFTEST is expanded from...
../../lib/autoconf/general.m4:2662: _AC_LINK_IFELSE is expanded from...
../../lib/autoconf/general.m4:2679: AC_LINK_IFELSE is expanded from...
build/ac-macros/neon.m4:72: SVN_NEON_CONFIG is expanded from...
../../lib/m4sugar/m4sh.m4:606: AS_IF is expanded from...
../../lib/autoconf/general.m4:1482: AC_ARG_WITH is expanded from...
build/ac-macros/neon.m4:39: SVN_LIB_NEON is expanded from...
configure.ac:119: the top level
configure.ac:135: warning: AC_LANG_CONFTEST: no AC_LANG_SOURCE call detected in body
../../lib/autoconf/lang.m4:194: AC_LANG_CONFTEST is expanded from...
../../lib/autoconf/general.m4:2662: _AC_LINK_IFELSE is expanded from...
../../lib/autoconf/general.m4:2679: AC_LINK_IFELSE is expanded from...
build/ac-macros/gssapi.m4:26: SVN_LIB_RA_SERF_GSSAPI is expanded from...
configure.ac:135: the top level
...more warnings...

I've tried to wrap the calls to AC_LINK_IFELSE like so:

Index: build/ac-macros/neon.m4
===
--- build/ac-macros/neon.m4     (revision 1029788)
+++ build/ac-macros/neon.m4     (working copy)
@@ -113,7 +113,7 @@
 #include
 int main() {ne_xml_create(); ne_decompress_destroy(NULL);}"
-      AC_LINK_IFELSE([$neon_test_code], shared_linking="yes", shared_linking="no")
+      AC_LINK_IFELSE(AC_LANG_SOURCE([$neon_test_code]), shared_linking="yes", shared_linking="no")
       if test "$shared_linking" = "no"; then
         NEON_LIBS=`$PKG_CONFIG neon --libs --static`
         LIBS="$LIBS $NEON_LIBS"

But, that doesn't seem to be enough. Also tried [[ ]]s within AC_LANG_SOURCE and no dice. Any idea what I'm missing? Is this some lame autoconf bug? -- justin
Re: Autoconf warnings on trunk
On Mon, Nov 1, 2010 at 11:49 AM, Mark Phippard wrote: > On Mon, Nov 1, 2010 at 2:47 PM, Justin Erenkrantz > wrote: >> I'm getting a slew of warnings when I run autogen.sh on trunk. >> (autoconf 2.68 via Macports.) > > Cool, it is not just me! > > I recently reinstalled MacPorts and thought I did something wrong. Yah - I haven't been able to figure it out, but I didn't see any mention of these warnings in the archives, so I thought it'd be worth posting even though I'm a bit stumped at the moment. -- justin
Re: Autoconf warnings on trunk
On Mon, Nov 1, 2010 at 12:00 PM, Philip Martin wrote: > The documentation suggests another '[' is needed: > > AC_LINK_IFELSE([AC_LANG_SOURCE r1029802. (also tested with stock 2.59 as well...seems okay.) Yah, I somehow missed needing the leading [. m4 as a language blows. =) -- justin
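[For the archives: the properly-quoted form that matches the fix described above - per the autoconf documentation, AC_LANG_SOURCE must itself be inside the quoted first argument. The surrounding neon.m4 variable names are taken from the patch earlier in this thread.]

```m4
dnl Note the extra [ ]: AC_LANG_SOURCE is quoted along with its argument,
dnl so autoconf expands it inside AC_LINK_IFELSE's body.
AC_LINK_IFELSE([AC_LANG_SOURCE([$neon_test_code])],
               [shared_linking="yes"],
               [shared_linking="no"])
```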
Re: What stands between us and branching 1.7?
On Thu, Jan 6, 2011 at 1:04 PM, C. Michael Pilato wrote: > On 01/06/2011 03:48 PM, Stefan Küng wrote: >> On 06.01.2011 21:41, C. Michael Pilato wrote: >>> I'm sorry if I asked this before -- I've been asking individual folks for >>> over a month now, but I can't quickly find a public broadcast thread about >>> it, at least -- but I've been wondering lately: >>> >>> What, exactly, stands in the way of us branching for 1.7 stabilization? >>> >>> ra_serf stabilization? No... that's fairly well taken care of, and would >>> fit perfectly within the scope of post-branch work anyway. >> >> At least on Windows, I doubt that ra_serf is even used right now. Because of >> the huge memory leak serf has/had (See here: >> http://code.google.com/p/serf/source/detail?r=1416). But even though the >> leak is fixed, there hasn't been another release yet. >> With the latest release without that fix, serf is not usable at all. >> To get more people to test ra_serf, serf itself first needs a new release >> which includes that fix. >> >> Stefan >> > > Xlnt feedback. I've noted this on our roadmap.html page. What else? We could cut a serf bug fix release that has a few fixes that Lieven committed shortly after we released 0.7.0. Greg or Lieven, any thoughts here? -- justin
Re: [serf-dev] Re: What stands between us and branching 1.7?
On Thu, Jan 20, 2011 at 3:18 PM, Lieven Govaerts wrote: >> Greg or Lieven, any thoughts here? -- justin > > At least the one rev that fixes this issue; don't know if the others are > already working in all scenarios. > I'll look at it this weekend and make a release. Woohoo. Thanks. I'll be sure to test it. Let me know if you need any help. -- justin
Re: [serf-dev] Re: What stands between us and branching 1.7?
On Sun, Jan 23, 2011 at 5:41 AM, Lieven Govaerts wrote:
> I'd appreciate some more testing, before I make the release later this week.
> The code to get is:
> http://serf.googlecode.com/svn/branches/0.7.x at r1427.

% uname -v
Darwin Kernel Version 10.5.0: Fri Nov 5 23:20:39 PDT 2010; root:xnu-1504.9.17~1/RELEASE_I386 (OS X 10.6.5)

% ln -s ~/work/serf-0.7.x/test/serftestca.pem test/serftestca.pem

(I use VPATH builds, so had to bring over serftestca.pem into the build dir to get the 2 SSL tests to pass)

% ./test/test_all
.
OK (17 tests)

And, 0.7.x builds fine against latest SVN 1.6.x branches and basic co/ls functionality via ra_serf seems good. Looks good! Thanks! -- justin
Re: [serf-dev] Re: What stands between us and branching 1.7?
On Sun, Jan 23, 2011 at 8:42 AM, Stefan Küng wrote: > Ok, tested with serf from the 0.7.x branch: memory rise is still higher than > with neon, indicating that there's still some (small) memory leak somewhere. > But checkouts and updates of even larger projects succeed without using up > all available memory as before. > To compare: checkout of the TSVN source including all externals with serf > uses up about 40MB more RAM than when using neon. I'd say that's ok. Does the memory keep going up? Or, does it reach a steady point? I'd expect that ra_serf would use a bit more memory than ra_neon as it has to manage a lot more than what neon has to do. As a point of reference, on Mac OS X 10.6, svn 1.6.x with ra_serf checking out svn trunk peaks at 81MB, while ra_neon peaks at 12MB. HTH. -- justin
Tuning recommendations for Apache HTTP Server w/ra_serf clients (MaxKeepAliveRequests)
I think this probably belongs in the book, but I don't know exactly where...so I'll ensure that it's noted again and let someone else figure out where it should live. As we start to have more users with ra_serf, we should also recommend that httpd server admins bump up their MaxKeepAliveRequests from 100 (default) to at least 1000. This'll help ra_serf utilize TCP connections more efficiently - as ra_serf issues 2 HTTP requests per file that it needs to check out or update. ra_serf has heuristics so that it'll work gracefully under the default settings (and determine what the value actually is), but you'll get better network performance if you can utilize fewer TCP connections. This way, ra_serf will only need to open a new connection every 500 files rather than every 50. (Honestly, there is really no reason that it couldn't be 1.) If we also have a tuning section, we should remind folks to also enable mod_deflate (SetOutputFilter DEFLATE in their Location block) as that'll help a bit when transferring XML content back and forth. mod_deflate will come at a slight latency penalty, but that'll be offset for folks with slower connections taking less time to transfer the responses overall. FWIW, I've already asked ASF infra to bump up the MaxKeepAliveRequests settings for svn.apache.org. Thanks! -- justin
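[For reference, a minimal httpd.conf sketch of both recommendations above. The directives are standard httpd/mod_dav_svn ones; the /repos location and the repository parent path are assumed examples, not anyone's actual config.]

```apache
# Let ra_serf reuse each TCP connection for many requests.
# The default of 100, at 2 HTTP requests per file, covers only ~50 files
# before a new connection must be opened; 1000 stretches that to ~500.
MaxKeepAliveRequests 1000

<Location /repos>
  DAV svn
  SVNParentPath /var/svn        # assumed example path

  # Compress the XML report bodies.  Slight latency cost, but a net win
  # for clients on slower links.
  SetOutputFilter DEFLATE
</Location>
```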
Re: [serf-dev] Re: What stands between us and branching 1.7?
On Sun, Jan 23, 2011 at 10:22 AM, Lieven Govaerts wrote:
> Things have changed since then though. Can anyone test with svn 1.6.x to see
> how it uses memory?

For ra_serf, I'm wondering if we're creating an additional pool that isn't necessary - namely the editor_pool. I've done some light testing with the ra_serf patch below, which does the following things:

 - Creates a separate pool hierarchy for files
 - Combines the per-file editor pool with the regular pool for the file

For serf itself, I've also switched zlib to use the bucket allocator - which also helps with churn - as every call to inflateInit/inflateEnd invokes malloc/free - so we can save on that quite a bit. Overall, this seems to reduce the number of 20KB allocations substantially...but I'm not sure it does much for overall memory usage. Not sure if/when I'll pick this up again...so patches below if someone else wants to run with it. -- justin

Index: update.c
===
--- update.c    (revision 1062462)
+++ update.c    (working copy)
@@ -107,6 +107,7 @@
   /* controlling dir baton - this is only created in open_dir() */
   void *dir_baton;
   apr_pool_t *dir_baton_pool;
+  apr_pool_t *file_pool;

   /* Our master update editor and baton. */
   const svn_delta_editor_t *update_editor;
@@ -202,9 +203,6 @@
   /* The properties for this file */
   apr_hash_t *props;

-  /* pool passed to update->add_file, etc. */
-  apr_pool_t *editor_pool;
-
   /* controlling file_baton and textdelta handler */
   void *file_baton;
   const char *base_checksum;
@@ -403,7 +401,7 @@
   report_info_t *new_info;

   new_info = apr_pcalloc(info_parent_pool, sizeof(*new_info));
-  apr_pool_create(&new_info->pool, info_parent_pool);
+  apr_pool_create(&new_info->pool, info->dir->file_pool);
   new_info->file_baton = NULL;
   new_info->lock_token = NULL;
   new_info->fetch_file = FALSE;
@@ -500,6 +498,7 @@
   if (dir->base_name[0] == '\0')
     {
       apr_pool_create(&dir->dir_baton_pool, dir->pool);
+      apr_pool_create(&dir->file_pool, dir->pool);

       if (dir->report_context->destination &&
           dir->report_context->sess->wc_callbacks->invalidate_wc_props)
@@ -519,6 +518,7 @@
       SVN_ERR(open_dir(dir->parent_dir));

       apr_pool_create(&dir->dir_baton_pool, dir->parent_dir->dir_baton_pool);
+      dir->file_pool = dir->parent_dir->file_pool;

       if (SVN_IS_VALID_REVNUM(dir->base_rev))
         {
@@ -632,7 +632,7 @@
   if (lock_val)
     {
       char *new_lock;
-      new_lock = apr_pstrdup(info->editor_pool, lock_val);
+      new_lock = apr_pstrdup(info->pool, lock_val);
       apr_collapse_spaces(new_lock, new_lock);
       lock_val = new_lock;
     }
@@ -641,7 +641,7 @@
     {
       svn_string_t *str;

-      str = svn_string_ncreate("", 1, info->editor_pool);
+      str = svn_string_ncreate("", 1, info->pool);
       svn_ra_serf__set_ver_prop(info->dir->removed_props, info->base_name,
                                 info->base_rev, "DAV:", "lock-token",
@@ -756,13 +756,11 @@
       return error_fetch(request, fetch_ctx, err);
     }

-  apr_pool_create(&info->editor_pool, info->dir->dir_baton_pool);
-
   /* Expand our full name now if we haven't done so yet. */
   if (!info->name)
     {
       info->name_buf = svn_stringbuf_dup(info->dir->name_buf,
-                                         info->editor_pool);
+                                         info->pool);
       svn_path_add_component(info->name_buf, info->base_name);
       info->name = info->name_buf->data;
     }
@@ -772,7 +770,7 @@
       err = info->dir->update_editor->open_file(info->name,
                                                 info->dir->dir_baton,
                                                 info->base_rev,
-                                                info->editor_pool,
+                                                info->pool,
                                                 &info->file_baton);
     }
   else
@@ -781,7 +779,7 @@
                                                info->dir->dir_baton,
                                                info->copyfrom_path,
                                                info->copyfrom_rev,
-                                               info->editor_pool,
+                                               info->pool,
                                                &info->file_baton);
     }
@@ -792,7 +790,7 @@
   err = info->dir->update_editor->apply_textdelta(info->file_baton,
                                                   info->base_checksum,
-                                                  info->editor_pool,
+                                                  info->pool,
                                                   &info->textdelta,
Re: Tuning recommendations for Apache HTTP Server w/ra_serf clients (MaxKeepAliveRequests)
On Sun, Jan 23, 2011 at 11:26 AM, Stefan Sperling wrote: > On Sun, Jan 23, 2011 at 09:27:18AM -0800, Justin Erenkrantz wrote: >> If we also have a tuning section, we should remind folks to also >> enable mod_deflate (SetOutputFilter DEFLATE in their Location block) >> as that'll help a bit when transferring XML content back and forth. >> mod_deflate will come at a slight latency penalty, but that'll be >> offset for folks with slower connections taking less time to transfer >> the responses overall. > > Has this problem been fixed, then? > http://svn.haxx.se/dev/archive-2009-08/0274.shtml > I don't think we should recommend mod_deflate if that problem still exists. No, it doesn't look like it has been fixed yet. However, how many real-world HTTP clients are out there that don't speak zlib? =) > I don't think that thread came to a conclusion. > Greg was implying the bug was in mod_deflate rather than svn: > http://svn.haxx.se/dev/archive-2009-08/0290.shtml Greg is correct in that the patch doesn't go far enough - mod_dav_svn should change all uses of output to be a ** rather than a *. As far as mod_deflate being broken: yah, it probably shouldn't be doing any memory allocation until it knows it is active...but that's a topic for another list (dev@httpd). mod_deflate should probably remember that the checks have already run and get out of the way...probably by setting no-gzip in subprocess_env? -- justin
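[For context, the no-gzip environment variable mentioned above is the standard mechanism mod_deflate already honors for opting requests out of compression; e.g., an admin can set it from the config. The matching pattern below is an illustrative example, not a recommendation.]

```apache
# mod_deflate skips any request where the no-gzip env var is set.
# Example: don't recompress already-compressed image responses.
SetEnvIf Request_URI "\.(png|jpe?g|gif)$" no-gzip
```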
Re: Code doesn't seem ... right
On Mon, Jan 24, 2011 at 2:22 PM, C. Michael Pilato wrote: > [Using dev@ as a public TODO list to avoid pushing stack on a task.] > > In mod_dav_svn/mirror.c:dav_svn__location_body_filter() and > dav_svn__location_in_filter() are code blocks like this: > > if (uri.path) > canonicalized_uri = svn_urlpath__canonicalize(uri.path, r->pool); > else > canonicalized_uri = uri.path; > if (strcmp(canonicalized_uri, root_dir) == 0) { > [...] > > So ... if uri.path == NULL, then canonicalized_uri is set to NULL, and then > that NULL is used in a strcmp(). Won't that SEGFAULT? It'd be difficult (if not outright impossible) to hit that else case. Follow apr_uri_parse and apr_pstrmemdup. Also know that we don't hit this code block if master_uri isn't set. The original code I wrote was just a straight strcmp - I believe the check for null is spurious. My $.02. -- justin
Re: crash in serf on windows
On Tue, Jan 25, 2011 at 12:02 AM, Marc Haesen wrote: > I have seen a crash (null pointer access) during svn merge using serf. The > same command using neon was successful. > > > > I am using a trunk compiled version of svn (revision 1061787) on windows > together with a trunk version of serf (revision 1421) compiled with visual > studio 2010 > > > > The reason is that the serf_httpconn_socket received a null pointer as > bucket. The serf_httpconn stuff is probably not quite ready yet - I know Lieven is still working through some issues there. You may have far better luck with the 0.7.x branch which we're prepping for a new release soon: https://serf.googlecode.com/svn/branches/0.7.x If you can give that a try, it'd be appreciated! > Attached are the 2 crash report files and a screen dump of the .dmp file > loaded in the debugger. Sadly, there's nothing attached...so I have no clue what you're seeing. =( -- justin
Re: Code doesn't seem ... right
On Wed, Jan 26, 2011 at 10:28 AM, C. Michael Pilato wrote: > I'm not sure how to move forward. Should we require that SVNMasterURI > values point to something other than the server root? It was never intended to be at the server root - doing a proxy of / is always an odd proposition. -- justin
Re: svn commit: r1064093 -/subversion/trunk/subversion/libsvn_repos/authz.c
On Thu, Jan 27, 2011 at 11:14 AM, C. Michael Pilato wrote: > If we have the option of moving towards case-sensitivity -- that is, a > *more*-precise authz policy -- that seems like a good thing. I'd even be in > favor of making this behavior optional (like the force_username_case option > we already have). Given that we do wonky things on the client side when cases are mixed in the same path, I think it'd be very weird to make this behavior optional. It's so highly unlikely that anyone is going to want to apply rules to "AuthZ" but not to "authz" - especially given that it wouldn't have worked *at all* before anyway. My $.02. -- justin
Re: [Proposal] Remove DAV properties cache in ra_serf
On Mon, Feb 14, 2011 at 7:57 PM, Ivan Zhakov wrote: > Of course these problems can be fixed, but I'm not sure that we need > this code, since in svn 1.7 we have HTTPv2, which doesn't issue the many > PROPFIND requests that the DAV properties cache was meant to reduce. > > So I propose just removing the DAV properties cache from ra_serf. > Objections? Comments? +1. Makes sense to me. -- justin
Re: HTTPv2 question - eliminate stub URLs in OPTIONS response?
On Mon, Mar 7, 2011 at 9:34 AM, C. Michael Pilato wrote: > Such an optimization would be beyond negligible, processing- and > network-usage-wise. And by allowing the server to dictate "This is how you > need to talk to me" we allow ourselves the option of making changes to > the server's URL-space in a fashion that any HTTPv2-ready client can > gracefully adjust to. +1. -- justin
Re: Subversion trunk (r1078338) HTTP(/WC?) performance problems?
On Mon, Mar 7, 2011 at 3:26 PM, John Beranek wrote: > Hmm...I'm surprised (and disappointed). No one is interested in > Subversion 1.7 being lower performance than 1.6? You're not telling us something we don't already know (go read the archives some time). Many folks are still working on improving the performance of 1.7...so, general complaints aren't going to be terribly productive. -- justin
Re: Subversion trunk (r1078338) HTTP(/WC?) performance problems?
On Wed, Mar 9, 2011 at 8:28 AM, Philip Martin wrote:
> 0.54 epoll_wait(3, {{EPOLLOUT, {u32=30583944, u64=30583944}}}, 16, 36) = 1 <0.11>

As best as I can tell, that strace isn't matching your earlier description. The last epoll_wait is taking 0.11 seconds? -- justin
Re: Subversion trunk (r1078338) HTTP(/WC?) performance problems?
On Wed, Mar 9, 2011 at 8:57 AM, Philip Martin wrote: > You are not looking at the right epoll_wait, look 14 lines down from the > start. That epoll_wait call takes <0.039952>. The line below that > shows a delay of 0.040042 to the subsequent read call. So epoll_wait is > responsible for nearly all the delay. This epoll_wait() is waiting for the response back from the server, so it makes sense that it'd take longer. Do we know the server responded sooner? Could it be something on the server side? IOW, this pattern looks correct on the syscall() side. -- justin
Re: Subversion trunk (r1078338) HTTP(/WC?) performance problems?
On Wed, Mar 9, 2011 at 9:12 AM, Philip Martin wrote:
> Joe Orton writes:
>
>> On Wed, Mar 09, 2011 at 08:50:37AM -0800, Justin Erenkrantz wrote:
>>> On Wed, Mar 9, 2011 at 8:28 AM, Philip Martin wrote:
>>> > 0.54 epoll_wait(3, {{EPOLLOUT, {u32=30583944, u64=30583944}}}, 16, 36) = 1 <0.11>
>>>
>>> As best as I can tell, that strace isn't matching your earlier
>>> description.  The last epoll_wait is taking 0.11 seconds?  --
>>
>> Smells like Nagle to me - looks like serf is turning Nagle *on* not
>> off... ("nodelay off" == "delay on") - try this?
>>
>> Index: outgoing.c
>> ===
>> --- outgoing.c (revision 1440)
>> +++ outgoing.c (working copy)
>> @@ -201,7 +201,7 @@
>>
>>     /* Disable Nagle's algorithm */
>>     if ((status = apr_socket_opt_set(skt,
>> -                                    APR_TCP_NODELAY, 0)) != APR_SUCCESS)
>> +                                    APR_TCP_NODELAY, 1)) != APR_SUCCESS)
>>         return status;
>>
>>     /* Configured. Store it into the connection now. */
>
> Yes, that's it.  Serf is now comparable to neon.

This patch is now applied to trunk and 0.7.x. Thanks! -- justin
Re: Subversion trunk (r1078338) HTTP(/WC?) performance problems?
On Thu, Mar 10, 2011 at 7:23 AM, Philip Martin wrote: > We're using a lot more CPU and elapsed time is hugely increased. Just to make sure I understand, it's likely that this is more about the libsvn_wc rewrite than anything with ra_serf? That is, ra_neon gives similar timings over NFS in 1.7? Or? -- justin
Re: Subversion trunk (r1078338) HTTP(/WC?) performance problems?
On Thu, Mar 10, 2011 at 8:22 AM, John Beranek wrote: > 1.6.16 from localhost trunk server (10 iterations) > ra_neon: 1.43 > ra_serf: 2.91 > > trunk from localhost trunk server (10 iterations) > ra_neon: 1.48 > ra_serf: 2.95 > > So, neon and serf retain their speeds relative to each other... Heh, while we've got you testing stuff...what happens if the server is on a different physical machine? Are your results in line with Philip's which says that ra_serf and ra_neon are within margin of error (if serf is not indeed faster)? Are the timings still off by that much? Because it's async, I expect serf has substantially different performance characteristics when it is going over localhost (no network) versus using the actual network stack... -- justin
Re: wc_db API discussion
On Fri, Mar 11, 2011 at 4:11 PM, Mark Phippard wrote: > I think we have to get this work done soon. We cannot release with > performance like it is. How do we define the scope of the work that > needs to be done so that we can divide and conquer and get these > changes in place? It sounds like we should codify what our performance targets are. Is it acceptable if 1.7 is as fast as 1.6? Should it be faster? Could we accept a slowdown for 1.7 as long as we know how we can get it on par (or faster) for 1.8? What are the operations (and test cases?) that are important to us? -- justin
Re: Random serf checkout failures
On Mon, Nov 5, 2012 at 1:33 PM, C. Michael Pilato wrote: > My debugging indicates that when close_all_dirs() is called, there are a > slew of unclosed directories which each have a non-zero ref_count. Save for > the root directory of the edit (/tags), fetch_props == TRUE, > propfind_handler->done == TRUE, and tag_closed == TRUE for all open child > directories. So I'm looking around at how file references are managed to > see if anything has leaked. I might have inadvertently botched something > when adding the pull-contents-from-the-wc-cache logic. Yup - your analysis is pretty accurate IIRC. I posted about this when I was in Berlin and even had a few patches to start to clean this up. However, Greg was planning on redoing how we parse the XML report; when that happens, some of this management should be far simpler (as we know when the tags are closed) - so the thinking was to wait until that was done and then assess whether that was fixed or not. -- justin
Re: Random serf checkout failures
On Tue, Nov 6, 2012 at 3:22 PM, C. Michael Pilato wrote: > On 11/06/2012 10:15 AM, C. Michael Pilato wrote: >> On 11/06/2012 08:29 AM, C. Michael Pilato wrote: >>> I recall this patch of yours -- I even asked you about it post-Berlin. I'll >>> take a look at it now and see if it can help us out. >> >> Not seeing the active_dir_propfinds list corruption you mentioned, Justin. > > Bummer. Now I am. *grin* Yah, the pool lifetimes are all...overly complicated. The 'right' way to solve it, IMO, is to redo the parsing logic akin to how Greg was redoing the other ra_serf files. -- justin
Re: Random serf checkout failures
On Tue, Nov 6, 2012 at 4:41 PM, Ivan Zhakov wrote: > Another problem is how serf reads data from the network in the case of > multiple connections: it reads data from one connection until EAGAIN. > But if data comes from the network really fast (from a local server, for > example), it continues reading data from that connection without > reading data from the other connections - which leads to them timing out. > > See outgoing.c:read_from_connection() Hmm. I'm not sure if there's an easy answer for this while keeping ra_serf single-threaded. Perhaps keep track of how much we've read on a single connection and give up reading that connection for a bit once we've read 100KB or some such, to see if there are other connections with responses ready to read. But, then again, here in 2012, I'd probably fight a lot harder to make the checkout logic run across multiple network threads, as that would spread the SSL or gzip load across multiple cores. -- justin
Re: Random serf checkout failures
On Tue, Nov 6, 2012 at 12:13 PM, Lieven Govaerts wrote: > okay, so with the Timeout directive added I can reproduce this issue. > > What I see is that the server closes the connection in the middle of > sending a response to the client. It doesn't even finalize the > response first. > So ra_serf is reading the data from the response bucket, but gets an > APR_EOF when it needs more data than specified in the Content-Length > header of the response. > > What is the expected behavior here, let serf close down the connection > and try the request again on a new connection? Sure sounds like an early TCP close causing a lost response. httpd shouldn't be closing the TCP connection in the middle of the response as that's not when httpd would evaluate the Timeout directive...but, if you are going over a loopback connection, I think I've seen Windows lose bytes...but, I don't think Philip uses Windows...weird - I'd guess something else is going on. -- justin
Re: Content-Length in HEAD responses
On Fri, Nov 9, 2012 at 10:15 AM, C. Michael Pilato wrote: > request, so I'm wondering ... does Apache just handle a HEAD as a GET > under-the-hood and then discard the resulting response body? The comment > Unless the handler does something special, yes, httpd will treat HEAD as a GET until it is time to send the response body...and simply drops the response body. -- justin
Re: Content-Length in HEAD responses
On Fri, Nov 9, 2012 at 11:37 PM, Branko Čibej wrote: > That would imply that, if content-length doesn't get set on a HEAD > response, but Transfer-Encoding: chunked does, then everything is > correct, right? If somewhat inefficient. > The chunked header is only sent on non-HEAD responses. r->chunked is set as a side-effect in a conditional in httpd, and that conditional gets short-circuited earlier if we are a HEAD request. BTW, if I do:

% curl --head https://svn.apache.org/repos/asf/subversion/trunk/subversion/mod_dav_svn/repos.c
HTTP/1.1 200 OK
Date: Sat, 10 Nov 2012 11:48:26 GMT
Server: Apache/2.2.17 (Unix) mod_ssl/2.2.17 OpenSSL/1.0.0c DAV/2 mod_wsgi/3.1 Python/2.6.6 SVN/1.7.0
Last-Modified: Fri, 19 Oct 2012 14:11:49 GMT
ETag: "1400104//subversion/trunk/subversion/mod_dav_svn/repos.c"
Accept-Ranges: bytes
Content-Length: 157969
Vary: Accept-Encoding
Content-Type: text/plain; charset=ISO-8859-1

There is a C-L header...so, I don't know what the original poster is seeing, but we're already doing the right thing... -- justin
Re: Content-Length in HEAD responses
On Sat, Nov 10, 2012 at 6:49 AM, Justin Erenkrantz wrote: > There is a C-L header...so, I don't know what the original poster is > seeing, but we're already doing the right thing... -- justin > I bet the OP is trying to do HEAD on a directory, as there was talk elsethread about trying to discover ACLs on directories...that's not going to generate a C-L. And, C-L is not meaningful in any way on directories - only actual resources. mod_autoindex doesn't do it either:

% curl --head https://archive.apache.org/dist/
HTTP/1.1 200 OK
Date: Sat, 10 Nov 2012 11:53:36 GMT
Server: Apache/2.4.1 (Unix) OpenSSL/1.0.0g
Content-Type: text/html;charset=ISO-8859-1

(The only way to generate the C-L would be to run through the response...which we don't do for HEAD.) HTH. -- justin
Re: Content-Length in HEAD responses
On Sat, Nov 10, 2012 at 3:25 PM, Thomas Åkesson wrote: > I suppose this means that it would be a significant optimization to > perform HEAD rather than GET when discovering ACLs for every subdirectory > in a directory listing? > Probably - doing the HEAD request will run the full authn and authz checks, but it won't produce the bodies - you'll also save not having to send the responses on the wire - but you won't know what the directory listing is unless you do a GET in the first place. So, it might help at the leaf nodes in the tree. (But how would you know it's a leaf! Fun.) > Branko's concern is still interesting... because this behaviour (omitting > CL for HEAD requests) does seem to violate the HTTP RFC, but for good > reason. Given that mod_autoindex as well as mod_php (at least on the config > I tested) also omits CL for HEAD I suppose it is a well accepted > optimization in practice. > IIRC, there is an out in the RFC if the content is dynamic - this also may be something cleaned up in the httpbis RFC clarifications as forcing the server to generate the content only to throw it away (when it can't be pre-computed) is kinda pointless. -- justin
Re: serf in 1.8
On Mon, Nov 12, 2012 at 9:13 PM, Philip Martin wrote: > I have a checkout of the gcc tree, it has 78,000 files. Now it uses > svn: but if it were to use http: then the serf checkout log would be 4 > orders of magnitude bigger than the neon log. 83 years becomes 1 or 2 > days. > > The neon log is independent of the size of the checkout, the serf log > scales with the size of the checkout. If this were memory we would say > we have a scaling problem. Do scaling problems not apply to disk space? If you have a high-traffic site and choose not to allocate the disk space, then rotate the logs or *gasp* don't log. There are clear solutions to this "problem" that are well-known to any competent site owner. -- justin
Re: [serf-dev] Re: Double compression over HTTPS
On Tue, Nov 13, 2012 at 2:41 AM, Lieven Govaerts wrote: > The proposed approach here is to disable SSL compression completely, > so in terms of the above that leaves us with scenario 2. > > Serf doesn't have an option currently to disable SSL compression from > the client side. I plan to add it in the next version. +1. -- justin
Re: serf in 1.8
On Tue, Nov 13, 2012 at 2:26 AM, Lieven Govaerts wrote: > Is that a fix that we can get in apache 2.2? Justin? > I don't readily recall the issue in mod_deflate. Is it already resolved in the 2.4 series? > work. These concern maybe only 1% of the users, but we should have a > plan to support them, and I think keeping neon in the build is the > easiest way to do that. > IMO, the "plan to support them" (besides welcoming patches to support that 1% case) is to tell them to stick with 1.7. =P -- justin
Re: svn commit: r1415864 - /subversion/trunk/subversion/libsvn_ra_serf/update.c
On Fri, Nov 30, 2012 at 4:54 PM, wrote: > Author: cmpilato > Date: Fri Nov 30 21:54:35 2012 > New Revision: 1415864 > > URL: http://svn.apache.org/viewvc?rev=1415864&view=rev > Log: > Implement in ra_serf "send-all" mode support for update-style REPORTs > and their responses. (Currently disabled by compile-time conditionals.) > > (This one goes out to Ivan Zhakov.) > While I've stated for a long time that I think the send-all mode is a huge architectural mistake - it is too prone to double-compression and TCP pipeline stalls, and it puts a tremendous burden on a properly-configured httpd (by not taking advantage of server-side parallelism) - it's nice to see it's not *too* hard to shoehorn this bad idea back into ra_serf. We'd never be able to shove the non-send-all approach into ra_neon. =) Here's my suggestion for consideration: let's experiment with this setting in the beta release process as-is - that is, we always do the parallel updates unconditionally (except perhaps when svnrdump is being stupid). If we get real users complaining about updates during that cycle, we can then figure out either switching the default and/or adding a config option, or even allowing some control via capabilities exchange. Does that sound reasonable? -- justin
Re: reposurgeon now writes Subversion repositories
On Fri, Nov 30, 2012 at 5:26 PM, Branko Čibej wrote: > When I create an account at [your favourite forge], I tell it my name > and give it one of my e-mail addresses. It is, in my opinion, up to the > forge software to use that in svn:author, not up to local user > preferences. That /is/ a fundamental difference between the centralised > and distributed models. > > Why do forges not do that? I don't know, but it's definitely not because > Subversion doesn't give them fifteen ways of manipulating the svn:author > property. > +1. Furthermore, if the user wants to set their own custom revprops that have nothing to do with svn:author (e.g., svn:federation-id) and end up with a practice by convention rather than by server fiat that tools like reposurgeon can understand, there's plenty of ways for the user *or the server* to add in additional revprops as well. Plus, since revprops aren't versioned, they can always be fixed up after-the-fact to include appropriate svn:federation-id tags...finally, a "feature" of revprops not being versioned! -- justin
Re: svn commit: r1415864 - /subversion/trunk/subversion/libsvn_ra_serf/update.c
On Sat, Dec 1, 2012 at 5:59 AM, Johan Corveleyn wrote: > I'm wondering whether your concerns apply to both internet-wide > deployments and local (all on the same LAN) ones. > That line is certainly a fair one to draw in the sand. That said, I think the internal use case cries out even *more* for parallel updates, as the internal server in that environment is often wildly over-provisioned on the CPU side - in a fairly low-traffic environment, you want to take advantage of the parallel cores of a CPU to drive the updates. Generally speaking, what I discovered back in 2006 (yikes), and what I believe is still true as we near 2013 (shudder), is that if everything else is perfectly optimized (disk, latency, bandwidth, etc.), you're eventually going to bottleneck on the checksumming on both client and server - which is entirely CPU-bound and expensive. You can solve that by splitting the work across multiple cores - for a server, you need to utilize multiple parallel requests in-flight; for a client, you then need to parallelize the editor drive. The reason that disk isn't as much of a bottleneck as you might first expect is the OS's buffer cache - for reads on the server side, common data is already going to be in RAM, so hot spots in the fsfs repos will already be in memory; for writes on the client side, modern client OSes won't necessarily block you until everything is sync'd to disk. But, once you exhaust the capabilities of RAM, your underlying disk architecture matters a lot, in ways that may not be intuitive to those who haven't spent a lot of time working closely with storage systems. (Hi Brane!) If you are using direct-attached storage locally on either server or client, then you will probably be bottlenecked right there.
However, if your corporate environment has an NFS filer or SAN (a la NetApp/EMC) backing the FSFS repository, or hosting NFS working copies (oh so common), those large disk subsystems are geared towards parallel I/Os - not single-threaded I/O performance. (Isilon/BlueArc-class storage is geared for single-threaded performance, but I've yet to see anyone obsessed enough about SVN I/O perf to place either their repository or working copies on a BlueArc-class storage system!) So, if you are not using direct-attached storage and are using NFS today in a corporate environment on either client or server, then you want to parallelize everything so that you can take advantage of the disk/network I/O architecture preferred by NetApp/EMC. Throwing more cores against a NetApp/EMC storage system in an environment with plenty of available bandwidth allows for linear performance returns (i.e., one thread doing I/O is 1X, two threads is 2X, three threads is 3X, etc.). To that end, I'd eventually love to see ra_serf drive the update editor across multiple threads so that the checksum and disk I/O bottleneck can be distributed across cores on the client side as well. Compared to where we were in 2006, that's the biggest inefficiency we have yet to solve and take advantage of. And, I'm sure this'll break all sorts of promises in the Ev1 and perhaps Ev2 world and drive C-Mike even crazier. =) But, if you want to put a rocket pack on our HTTP performance, that's exactly what we should do. I'm reasonably certain that serf itself could be finely tuned to handle network I/O in a single thread at or close to wire speed, even on a 10G connection with a modern processor/OS - it's what we do with the file contents/textdeltas that needs to be shoved off to a different set of worker threads, removing all of that libsvn_wc processing from blocking network traffic processing and getting it all distributed and thread-safe.
If we do that, woah, I'd bet that we are going to make things way faster across the board and completely blow everything else out of the water when our available bandwidth is high - which is the case in an internal network. And, yes, that clearly could all be done in time for 1.8 without jeopardizing the timelines one tiny bit. =P So, that's my long-winded way of saying that, yah, even in an internal LAN environment, you still want to parallelize. However, I'm definitely not going to veto a patch that would add an httpd directive that allows the server to steer the client - unless overridden by the client's local config - to using parallel updates or not. -- justin
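[The CPU-bound checksumming argument above is easy to demonstrate in miniature. The sketch below uses made-up blob data standing in for file contents/textdeltas, and fans SHA-1 work out across threads; in CPython, hashlib releases the GIL while hashing large buffers, so this genuinely uses multiple cores - analogous to spreading update work across parallel requests:]

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

def sha1_hex(data: bytes) -> str:
    # Pure CPU work -- the bottleneck described above.
    return hashlib.sha1(data).hexdigest()

# Hypothetical file contents; a real client would be hashing textdeltas.
blobs = [bytes([i]) * 100_000 for i in range(8)]

serial = [sha1_hex(b) for b in blobs]           # one core
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(sha1_hex, blobs))  # spread across cores

assert serial == parallel  # same checksums, just computed in parallel
```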
Re: RFC: simple proposal for Internet-scoped IDs
On Sat, Dec 1, 2012 at 8:53 AM, Eric S. Raymond wrote: > I'm not certain, but if your server-side hooks work the way I think > they do, all of this except (1) can be done in Python. Not having to > add complexity to your C code is a significant virtue. > Here's another approach to setting the FULLNAME field that doesn't require any change to the client and can be deployed server-side via hooks without any code changes at all. So, even (1) can be done in Python pretty easily for the lazy users and coders who don't want to do anything at all. =) If you have a centralized registry (like either LDAP or http://people.apache.org/committer-index.html), the server-side hooks can set FULLNAME for each svn:author that isn't set by the client, by looking it up in the internal directory. Within the ASF infrastructure, we have tools to allow committers to manage fields like this in a self-service way. So, the server admin can default that field as they like and give the users the ability to set that field as they like. I believe that this is one of the benefits of a centralized infrastructure - we can make it so that every client doesn't *have* to set something themselves on their client to utilize FULLNAME. And, once again, I'll reiterate my earlier point that FULLNAME can be added retroactively pretty easily to existing SVN repositories. So, for svn.apache.org, if we deploy a FULLNAME infrastructure, we could easily craft a tool to go back to all old revisions and annotate them correctly. Easy peasy. -- justin
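[A minimal sketch of the hook idea, assuming an in-memory stand-in for the LDAP/committer-index lookup. None of these names are real Subversion API - a real post-commit hook would shell out to 'svnadmin setrevprop' or 'svn propset --revprop' to record the value:]

```python
# Hypothetical post-commit hook logic: fill in a FULLNAME revprop from a
# central registry when the committing client didn't set one. The DIRECTORY
# dict is an illustrative stand-in for LDAP or
# http://people.apache.org/committer-index.html.
DIRECTORY = {
    "ivan": "Ivan Zhakov",
    "danielsh": "Daniel Shahaf",
}

def fullname_for(author, existing_fullname=None, directory=DIRECTORY):
    """Return the FULLNAME to record, preferring what the client sent."""
    if existing_fullname:
        return existing_fullname   # client already set it; leave it alone
    return directory.get(author)   # server-side default from the registry

assert fullname_for("ivan") == "Ivan Zhakov"
assert fullname_for("ivan", "Custom Name") == "Custom Name"
assert fullname_for("unknown-user") is None  # nothing to default to
```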
Re: svn commit: r1415864 - /subversion/trunk/subversion/libsvn_ra_serf/update.c
On Sat, Dec 1, 2012 at 9:01 AM, Lieven Govaerts wrote: > There are some scenarios where either the server admin or the user > can decide if parallel requests make sense or not. > > I'm specifically thinking of the use of Kerberos per-request > authentication. These responses can't be cached on the client side, > and require the authorization header to be sent for each request. > Assuming a 2-step handshake of which serf can bypass the first, this > means an overhead per request of 1-10KB; with a 3-step handshake each > request has to be sent twice, further increasing the overhead. > IMHO in this scenario the server admin should be able to veto the use > of parallel requests. > > And the same is true for https connections, where it's also the server > admin who can decide if the necessary caches have been put in place to > enable the benefits of parallel requests. > Totally agreed. I'd favor a three-value httpd directive on the server side that is advertised in the capabilities exchange: - default (client defaults to parallel if ra_serf, serial if an older ra_neon client; or whatever the client overrides via its local servers options) - serial (server suggests to the client that it should be serial, but permits parallel when the client wants it) - force-serial (same capability advertisement, but always trigger send-all responses regardless of what the client asks for) I'm 95% sure we have code in ra_serf that handles the case where the server sends us inline responses anyway, as older servers (prior to 1.2, IIRC) always sent inline responses no matter what we sent...so, it should be a fairly straightforward decision tree with minimal code changes. My $.02...which is still not enough for me to write the patch. =) -- justin
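[For concreteness, the three-value knob described above might look something like this in httpd.conf. The directive name "SVNUpdateMode" and its values are invented here for illustration - no such mod_dav_svn directive existed at the time of this proposal:]

```apache
# Hypothetical sketch only: "SVNUpdateMode" is not a real directive.
<Location /repos>
  DAV svn
  SVNParentPath /var/svn
  # default      -- let the client choose (parallel for ra_serf)
  # serial       -- advertise a preference for serial, but permit parallel
  # force-serial -- always send inline (send-all) responses
  SVNUpdateMode serial
</Location>
```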
Re: RFC: simple proposal for Internet-scoped IDs
On Sat, Dec 1, 2012 at 9:28 AM, Daniel Shahaf wrote: > BTW, the ability to change svn:author at will is one of the reasons they > aren't global-scoped: if Subversion ever migrated away from ASF, we can > _then_ change all svn:author revprops --- just like we once changed > "zhakov" (implied @tigris) to "ivan" (implied @apache). > Exactly. And, to be honest, I probably never realized that Ivan's author tag changed...I knew it was him in both cases as I'm involved in the project. So, for most human-scale projects, I think that you don't need globally unique IDs as there's a defined community and set of participants. Whether I call him "zhakov" or "ivan" - it's the same person. I know that, you know that...and, really, all of the people who care know that. =) For projects where people don't know everyone (Linux), I can see why globally unique IDs are helpful to contributors. But, I would shudder if suddenly svn's own blame output emitted "Daniel Shahaf < d...@daniel.shahaf.name>" instead of "danielsh". I have that map already in my head thank-you-very-much. Hence, this is why I'd be a strong proponent of them being in separate revprops - a "local" project name (svn:author) and something that more uniquely identifies the contributor (FULLNAME blah blah blah). And, perhaps have an option on the client as to which one to use - I could see some folks wanting the GUID, but that's just way too verbose for me... -- justin
Re: svn commit: r1415864 - /subversion/trunk/subversion/libsvn_ra_serf/update.c
On Sat, Dec 1, 2012 at 9:41 AM, Branko Čibej wrote: > On 01.12.2012 14:31, Justin Erenkrantz wrote: > > And, yes, that clearly could all be done in time for 1.8 without > > jeopardizing the timelines one tiny bit. =P > > Eep ... :) > > > Another thing I've been thinking about is this: Why are we using SHA1 > checksums on the server and on the wire for consistency checks when a > 64-bit CRC would do the job just as well, and 15 times cheaper? And > banging my head against the wall for not thinking of this 10 years ago. > > I can sort of understand the use of SHA1 as a content index for > client-side pristine files. On the server, however ... dunno. Maybe we > could design something akin to what the rsync protocol does, but for > repository-wide data storage. Could be quite tricky to achieve locality, > however. > The one thing that's nice about using SHA checksums is that we use them everywhere. It makes protocol debugging a *lot* easier - since we also use SHA checksums as the content index, it's easy to compare what we recorded in libsvn_wc to what was sent by the server. If we diverged the checksum algorithms, it'd be hard to do a quick visual comparison (do the checksums match?) without actually running the checksum yourself! So, I think we optimized for humans here...and I'm okay with that. We can always build faster processors...and take advantage of parallelism. =) There I go off on a tangent again. *grin* -- justin
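[Side by side, the two checksum families being discussed look like this (Python stand-ins; the point is that a CRC is a small integer while SHA-1 is the familiar 40-hex-digit digest, so mixed algorithms could never be eyeballed against each other):]

```python
import hashlib
import zlib

data = b"example file contents"

sha = hashlib.sha1(data).hexdigest()  # cryptographic digest, 160 bits
crc = zlib.crc32(data)                # cheap checksum, 32 bits

assert len(sha) == 40                 # the 40 hex chars svn prints everywhere
assert 0 <= crc <= 0xFFFFFFFF         # just an unsigned 32-bit integer
# Both are deterministic: recomputing over the same bytes matches exactly,
# which is what makes quick visual protocol debugging possible.
assert hashlib.sha1(data).hexdigest() == sha
assert zlib.crc32(data) == crc
```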
Re: svn commit: r1415864 - /subversion/trunk/subversion/libsvn_ra_serf/update.c
On Sat, Dec 1, 2012 at 12:00 PM, Mark Phippard wrote: > I feel pretty strongly that we should at minimum use the send-all > approach when talking to pre-1.8 servers. Even though in some > situations it could still offer good performance. I just think it > would be more respectful to our users (server admins in this case) to > not change this behavior in a way that could surprise them. Maybe we > could come up with exceptions, such as older servers that are using > the SVNAllowBulkUpdates off directive. In that situation we should > use the new behavior since that is basically what that directive is > asking for. > Without a lot of concrete feedback that parallel updates should be removed by default, I strongly believe that we should not be conservative on this issue. The issue here is not one of compatibility - ra_serf has been around for years and can talk just fine to older servers (way back to pre-1.0 servers, actually). The only argument against altering the default behavior is that there might be an admin of a high-traffic site somewhere who might suddenly be shocked by more HTTP requests coming in. I honestly have little to no sympathy for such an admin who doesn't properly understand how to manage a large installation - they likely have other issues that they are not paying attention to. Until we have hordes of users coming in and complaining about this, I think it's silly to be conservative here. I'm definitely not against giving knobs to the client or to the admin in weird corner cases (provided someone cares enough to write that up), but I strongly believe that for now we should do the right thing out of the box in 1.8 - which is to utilize parallel updates. -- justin
Re: Error running context
On Sun, Dec 2, 2012 at 9:05 AM, Lieven Govaerts wrote: > Attached the patch. ( I get paid per mail I send to this list in case > no one noticed. ;) ) The patch looks right to me - the short-circuit to do the simple window can clearly be called for any window regardless of offset. Perhaps we can add this test case to the regression tests as well? As for the looping, didn't we put in logic to stop retrying after a number of request failures? That's probably a reasonable thing to do...I definitely think a very nice feature of serf is that it *will* retry (which is helpful in flaky network situations), but it needs to stop at a certain point. =) -- justin
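[The "retry, but give up eventually" behavior suggested for serf can be sketched generically. This is an illustration of the pattern only - the names and the attempt cap are assumptions, not serf API:]

```python
# Sketch: retry a flaky request a bounded number of times, then give up.
MAX_ATTEMPTS = 4

class RetriesExhausted(Exception):
    pass

def with_retries(request_fn, max_attempts=MAX_ATTEMPTS):
    last_error = None
    for _ in range(max_attempts):
        try:
            return request_fn()
        except ConnectionError as err:  # stand-in for a flaky network failure
            last_error = err
    raise RetriesExhausted("gave up after %d attempts" % max_attempts) from last_error

# Demo: a request that fails twice and then succeeds stays under the cap.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("connection reset")
    return "ok"

assert with_retries(flaky) == "ok"
assert calls["n"] == 3
```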
Re: svn commit: r1417639 - in /subversion/trunk/subversion/mod_dav_svn: dav_svn.h mod_dav_svn.c reports/update.c
On Wed, Dec 5, 2012 at 4:50 PM, Stefan Sperling wrote: > In the 1.8 release notes we can draw attention to the fact that a 1.8 > server should turn off skelta mode if necessary, such as when each GET > request causes an authentication request to LDAP. > +1. I'm all for documenting and giving knobs to users who RTFM. My very strong preference is that we should do the "right thing" out of the box - and that is to use parallel updates - even against older servers. This has always been the behavior of ra_serf and it's definitely backwards compatible with all 1.x servers. If an admin doesn't like it (because of per-req authn or bad logging subsystems or the phase of the moon), they can try to direct 1.8+ clients away from parallel updates via the new directive/headers; but, we should not require any configuration to get the best generally recommended approach. -- justin
Re: svn commit: r1417639 - in /subversion/trunk/subversion/mod_dav_svn: dav_svn.h mod_dav_svn.c reports/update.c
On Wed, Dec 12, 2012 at 4:24 PM, C. Michael Pilato wrote: > Here's a question I've been wondering for some time: should we expose to > users a configuration option for declaring the number of aux connections > ra_serf should use? I mean, Firefox has exposed such an option for years. > If we did so for Subversion, we could say, "Yeah, ra_serf+svnrdump is a bad > combination. Here's a workaround. Run it with > --config-option=servers:global:serf-max-connections=2". Or better yet (as > per my other mail in this thread), we could just hardcode that option > override in svnrdump itself. Both approaches (allow a config option to control the number of connections; have svnrdump disable parallelism) seem reasonable to me. FWIW, I'd probably set it to 8 - since 2006, most browsers have upped their connection parallelism to 8. *grin* And, yah, it's quite obvious from reading the thread that you missed a bunch of Lieven's emails. -- justin
Re: [PATCH] Test for line ending bug in svnrdump (issue 4263)
On Wed, Dec 12, 2012 at 5:06 PM, Daniel Shahaf wrote: > P.S. This thread was an unusually long one, for a patch that adds about > a dozen lines of code. > Uh, how is that at all unusual for this crowd? =) -- justin
Re: Would branching across repositories be theoretically possible?
On Mon, Dec 31, 2012 at 9:05 AM, Stefan Sperling wrote: > BTW, it would have been better to send your question to the users@ list. > The dev@ list is off-topic for questions like this. I suppose you chose > dev@ because of your idea to develop support for cross-repository copies, > which I'll address now: At a meta-level, I've seen this dismissal and plea to just send to users@ instead happen a few times recently on dev@. I'd like to ensure we don't hinder potential contributors by setting the wrong tone in their initial interactions with dev@. We welcome all contributors with open arms. To me, Zem's initial post was perfectly appropriate for dev@ and, given the ensuing discussions - including some that apparently happened off-list - it was (and is) a good catalyst for discussion and for bringing those discussions back to the list. In other words, in my opinion, a very good and on-topic dev@ post. Back to the topic at hand, my take on this feature is that it's probably a good idea to figure out how to solve it...the landscape over a decade ago when we started out isn't the same as it is now. -- justin P.S. Hi Zem! =)
Re: Proposal for OPW project
On Tue, Jan 1, 2013 at 4:17 PM, Gabriela Gibson wrote: > The following is a conversation that should have been held on the > list, please excuse the misunderstanding on my part. > No worries - we've all been there before. You'll get the hang of it soon enough. =) Keep up the posts! Cheers. -- justin
Re: Apache Subversion: GNOME Women Outreach
On Mon, Dec 31, 2012 at 4:22 AM, Miriam Hochwald wrote: > I am new to this random public email writing ... and, at least for me it > takes a fair amount of courage to do so. I have had to face a few phobia > barriers based on experience. > Welcome! It can indeed be quite scary at first...then, you just give in and hope no one looks at your initial posts a decade from now. =) > I am interested in contributing to the project: > > “Improve progress output displayed during update and commit.” > > I have written an approach towards it as follows: > > This is basically a Human Computer Interaction (HCI) problem combined > with understanding the code, and estimating or updating the time it takes > to complete a file packet transfer. The user needs to be provided with more > detailed progress information for ‘svn’ commands, so they know how much > longer they need to wait. This feature is very important, especially for > GUI clients. Currently there is a rudimentary series of dots, one dot per > file, with the filename after each complete file. This approach provides > insufficient visual communication regarding the percentage of the total > size to transmit, or the total number of files to transmit. In general > this is an interface change, which could be approached such that it > refrains from disturbing the existing parsers of ‘svn commit’ output. Some > progress information is available, such as the number of bytes or progress > of a data chunk to be transferred. Missing from the equation is the number > of files/chunks that need to be transferred and how many already are > transferred. When doing an ‘svn up’, there is no way of telling the user > whether it will take seconds, minutes or hours until the update will be > complete if a lot has been changed. One approach could be to increase the > granularity of update feedback. For example, there could be progress bar > feedback for every 50K in either direction. 
Alternately, since it is known > how many files there are to transmit, a progress bar could indicate the > number of files successfully transferred as a percentage of the total > counted. For each file DeltaV knows the data to be transmitted, so a progress > bar could be established for it. > Yup, ra_serf (what we use on the client-side for HTTP/pseudo-DeltaV) has a pretty good idea what the overall file size is and how much has been written on the wire (when committing) or received (when updating). We've always just copped out with one dot per file at the notification - it's always been an annoyance to me that we haven't come up with a better mechanism - some files are 1KB and some doofuses like to commit binary 30MB JAR files (looking at you, Java devs!) - yet both are represented with one dot. So, I look forward to what you come up with here! I don't know how much we'll be able to improve ra_svn - my hunch is that it'd be easier to get it working with ra_serf first and then come back to ra_svn after you have all of the notification APIs updated. If what I just wrote doesn't make complete sense yet, don't panic - after you get more acquainted with the source code, the paragraph above should make more sense! =) > As yet I am still setting up the development environment, working through > the following instructions: > > http://svn.apache.org/repos/asf/subversion/trunk/INSTALL > > Putting together all these bits and bobs seems quite foreign to me, and is > quite confronting. So, I am hopeful that people in this community can be > supportive and understanding of a learning/experience gap ... as I trail > blaze into something I haven't tried before. > If you see anywhere in the INSTALL documentation that isn't clear, ask! That documentation is intended for new contributors like you - so, if it doesn't make sense, let us know. Even better, submit patches to the INSTALL docs. =) > I look forward to getting to know some of you in the near future. > Same here! -- justin
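[As a toy illustration of the byte-granular feedback discussed above: given (bytes done, total bytes), a notification callback could render a bar instead of emitting one dot per file. The helper name is an assumption for illustration, not part of any Subversion notification API:]

```python
# Sketch: render update progress from (bytes done, total bytes) rather than
# one dot per file. render_progress() is an illustrative helper only.
def render_progress(done_bytes, total_bytes, width=20):
    if total_bytes <= 0:
        return "[" + "?" * width + "]"  # total unknown: no percentage to show
    frac = min(done_bytes / total_bytes, 1.0)
    filled = int(frac * width)
    return "[%s%s] %3d%%" % ("#" * filled, "-" * (width - filled), int(frac * 100))

print(render_progress(512_000, 1_024_000))  # -> [##########----------]  50%
```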
Re: Apache Subversion: GNOME Women Outreach
On Tue, Jan 1, 2013 at 9:43 PM, Miriam Hochwald wrote: > Hello Justin, > > Actually the INSTALL docs ... whilst detailed make me phobic. It seems to > target the audience that likes to tinker, rather than just click and oh ... > there it is (think iPad users ... which is most of the world). It might be > better to have something other than plain text, in a kind of "Wizard" > approach - problem solver perspective. > > A simple piece of "stupidity" to a Noob which I have just encountered is > ... use of 'make' - well, if you don't have a C compiler installed, then > "ba-bum!" no fun. So, then I needed to go back into the plain text doc (no > highlighters or formatting assistance) and then go ... hmmm... C compiler, > C compiler. Googles gcc. Googles Xcode. Needs updated Mac OS. Downloads > updated Mac OS - updates Mac. Thinks she has lost *all* her files - Panic! > Finds files - "phew!" Downloads Eclipse - chooses version and updates. > Computer doesn't like update etc. ... etc. Ok ... so you see the picture. > Eventually when I finish updating Xcode and obtain gcc I will go back to > that unformatted plain text and read the next line. Convoluted. > If you're up for creating one, I'm sure we'd find a home for a separate document that goes soup-to-nuts on what needs to be installed. I definitely agree that we make some assumptions about the environment. =) At times, I wish we had a comprehensive toolchain install doc for Windows that was clear; so, it's probably reasonable to assume that there could be a use for a clear bootstrap doc for Mac too, for devs who might not be as familiar with Mac OS X and its toolchain. Keep chugging along! -- justin
[PATCH] Add -no-suppress to LT_CFLAGS
I'm not entirely sure if we should commit this, but I'll throw this on-list for posterity and future reference. I was trying to compile trunk and was getting the following:

/bin/sh path-to-svn/libtool --tag=CC --silent --mode=compile ccache gcc -std=c89 -DDARWIN -DSIGPROCMASK_SETS_THREAD_MASK -no-cpp-precomp -DDARWIN_10 -g -g -O2 -I../../subversion/subversion/include -I./subversion -I/opt/local/include/apr-1 -I/opt/local/include/apr-1 -I/opt/local/include -I/opt/local/include/serf-1 -I/opt/local/include -o subversion/svn/conflict-callbacks.lo -c ../../subversion/subversion/svn/conflict-callbacks.c
gmake: *** [subversion/svn/conflict-callbacks.lo] Error 1

The key thing here was that I did not get an error message - even though there was an error code. (Running with --debug indicated that libtool was seeing an error returned by ccache and subsequently deleting the .lo files.) Because GNU libtool always compiles code twice on Mac OS X (insert rant against GNU libtool here), it defaults to hiding all compiler output on the second pass. In their infinite wisdom, I guess they figure that if the first compile passes, the second always will too. So, if I add "-no-suppress" to LT_CFLAGS, I get:

/bin/sh path-to-svn/libtool --tag=CC --silent --mode=compile ccache gcc -std=c89 -DDARWIN -DSIGPROCMASK_SETS_THREAD_MASK -no-cpp-precomp -DDARWIN_10 -g -no-suppress -g -O2 -I../../subversion/subversion/include -I./subversion -I/opt/local/include/apr-1 -I/opt/local/include/apr-1 -I/opt/local/include -I/opt/local/include/serf-1 -I/opt/local/include -o subversion/svn/conflict-callbacks.lo -c ../../subversion/subversion/svn/conflict-callbacks.c
ccache: FATAL: Could not create /Users/xx/.ccache/9/9/3aa71ed31e8f3313f3404f17c0985c-599411.o.tmp.stdout.xx.local.25073 (permission denied?)
gmake: *** [subversion/svn/conflict-callbacks.lo] Error 1

That of course is a far more useful error message and led me to the root cause (bad perms in ~/.ccache/). 
In subsequent builds, no more output was produced with -no-suppress. I guess if we have warnings in the code, it'd perhaps emit the warnings twice for every compilation. But, then again, I always like reminding devs how evil libtool is and that it's double-compiling everything. =P The right and proper thing would likely be for libtool to stash the output in a variable and only emit it when there's an error code...alas. -- justin

* configure.ac: Never suppress possible error output from the second-pass
  compile.

Index: configure.ac
===
--- configure.ac	(revision 1428128)
+++ configure.ac	(working copy)
@@ -260,6 +260,7 @@
                [Build shared libraries]),
 [svn_enable_shared="$enableval"], [svn_enable_shared="yes"])
+LT_CFLAGS="-no-suppress"
 if test "$svn_enable_static" = "yes" && test "$svn_enable_shared" = "yes" ; then
   AC_MSG_NOTICE([building both shared and static libraries])
 elif test "$svn_enable_static" = "yes" ; then
Re: [PATCH] Add -no-suppress to LT_CFLAGS
If you look at the path of the file that failed compilation, it is in subversion/svn/. On Mac OS X there is no need to compile it twice as there is no real supported concept of PIC - it's a no-op for the compiler. Linking the libraries statically or dynamically is different of course. -- justin On Jan 3, 2013 2:18 AM, "Peter Samuelson" wrote: > > [Justin Erenkrantz] > > Because GNU libtool always compiles code twice on Mac OS X (insert rant > > against GNU libtool here) > ... > > But, then again, I always like reminding devs how evil libtool is and > > that it's double-compiling everything. =P > > Well, you asked it to build both static and shared libraries. What's > it supposed to do? With most ABIs, you want to build shared library > code in PIC mode and all other code in non-PIC mode. So yes, that > means double-compiling. You'll notice it does _not_ do this for > non-library code such as subversion/svn/ or subversion/svnadmin/. (At > least on Linux it doesn't. I actually have no idea what it does on Mac > OS.) > > Frankly, if you think double-compiling is "evil", you should be > suggesting that we make either '--disable-static' or '--disable-shared' > the default - see configure.ac, line 238 or so. Not blaming libtool. >
Re: [RFC] Deprecate Berkeley DB filesystem backend
+1! -- justin

On Jan 5, 2013 10:40 AM, "Greg Stein" wrote:
> Is "+1" too short of a response?
>
> :-)
> On Jan 4, 2013 7:35 PM, "Branko Čibej" wrote:
>> For the following reasons:
>>
>> - FSFS has been the default filesystem backend for almost 7 years,
>>   since 1.2.
>>
>> - Looking at commit history, I've not seen a single (functional or
>>   performance) improvement, beyond a few bug fixes, in the BDB backend in
>>   at least two years. The last significant change that I'm aware of was
>>   released in 1.4 (support for BDB 4.4 and DB_REGISTER) back in 2006. In
>>   effect, BDB is in "barely maintained" mode whereas interesting things
>>   are happening to FSFS all the time.
>>
>> - I cannot remember seeing a BDB-related bug report in a month of
>>   Sundays (meaning that it's either rock-solid, or not used).
>>
>> - The BDB backend is an order of magnitude slower on trunk than FSFS -
>>   timing parallel "make check" on my 4x4-core i7+ssd mac:
>>   - FSFS: real 7m33.213s, user 19m8.075s, sys 10m54.739s
>>   - BDB: real 35m17.766s, user 15m28.395s, sys 11m58.824s
>>
>> I propose that we:
>>
>> - Declare the BDB backend deprecated in 1.8, adding appropriate
>>   warnings when it's used or manipulated (to svnadmin?)
>>
>> - Stop supporting it (including bug fixes) in 1.9
>>
>> - Completely remove the BDB-related code in 1.10 (I'm making an
>>   assumption that we'll have the FSv2 API and related refactoring of FSFS
>>   by then, and at least a draft experimental FSv2 implementation).
>>
>> I realize I'm raising this question at a time when we should be focusing
>> on branching 1.8. On the other hand, this release may be a good opportunity
>> to deprecate the Berkeley DB filesystem.
>>
>> --
>> Branko Čibej
>> Director of Subversion | WANdisco | www.wandisco.com
Re: BDB vs FSFS - OMG!
On Sun, Jan 6, 2013 at 9:44 PM, C. Michael Pilato wrote: > hosted elsewhere for them. The BDB backend (thanks to improvements to the > Berkeley DB library itself) is much more stable today than it was when we > first started this project, so it's quite possible that we don't hear noise > That's quite surprising. My understanding from the Sleepycat/Oracle team way back when was that our core usage of BDB was wrong and would never be properly supported by them. Have they embraced multiple reader/writer processes now, or do they still advocate that having a single-process is the only Right Way(tm)? -- justin
Re: Subversion & Windows
IMO, we should follow most other (all?) ASF projects - you need 3 +1s for release regardless of platform. The fact that we require 3 +1s just for Windows is very odd - we don't require 3 +1s for Mac and 3 +1s for RHEL and 3 +1s for Ubuntu, etc. -- justin

On Jan 8, 2013 3:29 PM, "Ben Reser" wrote:
> We seem to be having trouble getting releases out the door and the
> delay is almost always related to Windows votes.
>
> Consider the following data:
> Release  Planned       Actual        Unix vs Windows
> 1.6.19   10 Sep 2012   21 Sep 2012   7 days
> 1.7.7    09 Oct 2012   09 Oct 2012   1 day
> 1.7.8    17 Dec 2012   20 Dec 2012   6 days
> 1.6.20   04 Jan 2013   ??            4+ days
>
> Windows Voters:
> Paul Burba
> Johan Corveleyn
> Ben Reser
> Mark Phippard
> Bert Huijben
>
> The Unix vs Windows column indicates how many days after the last Unix
> vote was reached did the Windows vote get to 3. 1.6.20 hasn't hit the
> required votes for Windows yet but it will be at least 4 or more days
> since we're on day 5 since Unix has been ready to release.
>
> Windows Voters        Unix Voters
> Paul Burba       4    C. Michael Pilato  3
> Johan Corveleyn  4    Philip Martin      4
> Ben Reser        1    Ben Reser          1
> Mark Phippard    1    Branko Čibej       4
> Bert Huijben     1    Stefan Sperling    2
>                       Julian Foad        1
>
> The numbers after the names are the number of releases they voted in.
> This also doesn't count people who were RM'ing.
>
> I'd say that the problem is worse for 1.7 but I suspect that 1.7.7 is
> an outlier for some reason. However, I hope that it's clear that we
> have a problem getting releases out due to Windows testing.
>
> I think flat out the problem is that building on Windows is just a
> pain. I remember it took me several days to get a working build
> environment so I could be the last signature on 1.6.19. Unfortunately
> I can't be the last vote on the more recent releases because I've been
> the RM.
>
> There are several possible solutions to this problem.
>
> 1) We could lower the votes required for Windows. 
> It seems like we are able to consistently get 2 votes, it's the 3rd
> that is often the problem. If you look at the last several releases
> often the 2 votes for Windows come in before the 2nd or 3rd vote for
> Unix. If you assume each of the 6 votes needed come from different
> people, that means you need 6 different people to approve a release.
> I'm not sure how many active contributors we have but 6 could easily
> be half. It's not necessary that all 6 be given by different people,
> but in practice this is what happens. Another point is we only require
> 3 votes for Unix when there are many platforms that fall under that
> category, some of them quite different. Most of the time the 3 Unix
> votes end up being done on Linux, sometimes even one distro of Linux.
> On the flip side of this, most of our client users are using
> Subversion on Windows, yet we do most of our development on Linux. So
> maybe we are justified in the current voting setup. But changing the
> voting would help resolve the release delays.
>
> 2) I could stop RM'ing. That throws another person in the pool of
> people who test on Windows. Right now there are 5 people who have
> tested on Windows in the last 4 releases, and I'm one of those people.
> In comparison Unix has a pool of about 6 people (with me being the
> person who's overlapping). I do think the actual pool of potential
> Unix voters is larger than is suggested by the last 4 releases, since
> there are quite a few names not included that I know could have voted
> if necessary. However, I'm not thinking of anyone included in the
> Windows pool that is known to have a working Windows build setup. It
> would give me a chance to spend some time on hopefully improving the
> situation since I'd be dealing with Windows for the release voting
> (which I haven't done since 1.6.19).
>
> 3) WANdisco has non-committers building and testing the release
> candidates in order to provide binary packages.
> I'd imagine Collab.net has something similar going on since they also
> provide Windows binaries. We could start accepting the results of
> these tests as a vote, with a committer validating that the tests were
> done with the proper source package and providing the signature for
> the vote. You can look at this as not really any different than a
> committer committing a non-committer's code change.
>
> These three responses above might help provide some more immediate
> solutions but they don't really resolve the problem. So let's look at
> some options that solve the root problem.
>
> 4) One of the major problems with building Subversion on Windows is
> building the dependencies. We could build and package a pre-built set
> of dependencies that we provide for download. A script could be
> written that downloads this, puts everything
Re: Subversion & Windows
On Tue, Jan 8, 2013 at 10:07 PM, Ben Reser wrote: > Given stefan2's argument I don't think it's unreasonable to lower the > Windows votes, but I think removing them entirely is probably going > too far. > Given the fact that we repeatedly have trouble securing votes just for Windows, I think you are in the clear minority. =P -- justin
Re: Subversion & Windows
On Wed, Jan 9, 2013 at 3:30 AM, Branko Čibej wrote: > We talked about that a couple days ago, but the problem is that a > Windows VM requires a Windows OS license for every user of that VM. > That's not something we can provide or hack around (well ... we could > hack around it, but it would be a really bad idea). > At least for committers, Microsoft makes an MSDN subscription available...so, it's a tractable problem if our intention is to get more committers voting for Windows releases (which, duh, is why Microsoft does it in the first place!). -- justin
Re: 1.8 Release Status : Issue triage
On Thu, Jan 17, 2013 at 1:19 AM, Branko Čibej wrote:
> > 1 has not been marked as started nor been assigned but is tied to an
> > open serf issue:
> > 4274 DEFECT P3 All issues@subversion philip NEW 1.8.0 serf
> > client hangs when server crashes
>
> I reviewed this the other day and submitted
>
> https://code.google.com/p/serf/issues/detail?id=94
>
> as I believe the root cause is in Serf itself.

How about this? If the connection never resulted in a successful HTTP response during its current lifetime (which gets zero'd in reset_connection), then we can return an error back upstream. I think we'll still be fine on Windows as the HUP should be detected *after* the full response is read by serf. So, in this case, we should bug out after a max of 2 attempts...and maybe one if OPTIONS is the first request on the connection. -- justin

Index: outgoing.c
===
--- outgoing.c (revision 1716)
+++ outgoing.c (working copy)
@@ -,7 +,10 @@
     /* The connection got reset by the server. On Windows this
        can happen when all data is read, so just cleanup the
        connection and open a new one. */
-    return reset_connection(conn, 1);
+    if (conn->completed_responses) {
+        return reset_connection(conn, 1);
+    }
+    return APR_EGENERAL;
 }
 if ((events & APR_POLLERR) != 0) {
     /* We might be talking to a buggy HTTP server that doesn't
Re: 1.8 Release Status : Issue triage
On Fri, Jan 18, 2013 at 7:54 AM, Branko Čibej wrote:
> Doesn't help ... apparently because we get both APR_POLLIN and
> APR_POLLHUP events, and the former is processed first, the function
> returns without error in the previous block:
>
>     /* If we decided to reset our connection, return now as we don't
>      * want to write.
>      */
>     if ((conn->seen_in_pollset & APR_POLLHUP) != 0) {
>         return APR_SUCCESS;
>     }
>
> I don't know enough about Serf to even begin guessing at the correct
> solution.

The same if check works in that block too. =) With the debug-abort flag set with serf r1717+ (backported to 1.2.x as well):

% svn ls http://localhost:21974/svn-test-work/repositories/basic_tests-3
svn: E120108: Unable to connect to a repository at URL
'http://localhost:21974/svn-test-work/repositories/basic_tests-3'
svn: E120108: Error running context: The server unexpectedly closed the connection.

Enjoy. -- justin
Re: packages tree
On Tue, Jan 29, 2013 at 1:04 PM, Ben Reser wrote: > If we're not going to keep these things up to date then I think we > should just remove them. > +1 to remove. (It's all version-controlled anyway!) -- justin
FOSDEM 2013
FWIW, I'll be at FOSDEM in Brussels this weekend (arriving on Friday morning) - if any other SVN devs are going to be there, it'd be fun to catch up. Cheers. -- justin
Re: svn commit: r1441814 - in /subversion/trunk/subversion: svn/ tests/cmdline/
As another data point, I hit this text-as-binary problem myself just a few weeks ago when I added a bunch of HTML files to a local repository - so, it's definitely occurring automatically. I did not have a chance to dig into why the magic detection failed so miserably... -- justin

On Saturday, February 2, 2013, Bert Huijben wrote:
> > -Original Message-
> > From: s...@apache.org [mailto:s...@apache.org]
> > Sent: zaterdag 2 februari 2013 23:05
> > To: comm...@subversion.apache.org
> > Subject: svn commit: r1441814 - in /subversion/trunk/subversion: svn/
> > tests/cmdline/
> >
> > Author: stsp
> > Date: Sat Feb 2 22:04:44 2013
> > New Revision: 1441814
> >
> > URL: http://svn.apache.org/viewvc?rev=1441814&view=rev
> > Log:
> > When a binary mime-type is set on a file that looks like a text file,
> > make the 'svn' client print a warning about potential future problems
> > with operations such as diff, merge, and blame.
> >
> > This is only done during local propset for now, because the file needs
> > to be present on disk to detect its mime-type.
> >
> > See for related discussion: http://mail-archives.apache.org/mod_mbox/subversion-dev/201301.mbox/%3C20130131185725.GA13721%40ted.stsp.name%3E
>
> From my users I hear that another way this property is introduced is via
> conversions from other version management systems. Visual SourceSafe
> (long dead, but still used in a lot of small shops) marks UTF-8 files
> with a BOM as binary when it does an auto detect.
> (Well, what would you guess for a system that wasn't really updated
> since that format became popular?)
>
> Most conversion tools just copy the binary flag, and there you have this
> problem on all your historic utf-8 files.
> (Where I worked we had this problem on all .xml files previously stored
> in SourceSafe.)
>
> I don't see a lot of users accidentally adding invalid properties
> themselves.
>
> Bert
Re: svn commit: r1441814 - in /subversion/trunk/subversion: svn/
On Sun, Feb 3, 2013 at 4:41 AM, Bert Huijben wrote:
> Maybe we should try to fix this automatic detection then (Does your
> setting match the libmagic output for that file?) instead of adding
> just a warning when setting the file binary explicitly.
>
> Maybe we should validate our libmagic based decision to make something
> binary?

Taking a look while here at FOSDEM...libsvn_client set the mime-type as "application/xml":

% svn diff index.shtml
Index: index.shtml
===
Cannot display: file marked as a binary type.
svn:mime-type = application/xml
% svn pg svn:mime-type index.shtml
application/xml
% file --mime-type index.shtml
index.shtml: application/xml
% file -v
file-5.11
magic file from /opt/local/share/misc/magic

I thought we treated application/xml as text at one time? -- justin
Re: svn commit: r1441814 - in /subversion/trunk/subversion: svn/
On Sun, Feb 3, 2013 at 6:06 AM, Justin Erenkrantz wrote:
> I thought we treated application/xml as text at one time? -- justin

Yes, we did...and Ben -1'ed it back in 2004. See: http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=374218

I guess...but...really? Is the only way for a user to override this to manually set their local auto-props? *sigh* -- justin
Re: svn commit: r1441814 - in /subversion/trunk/subversion: svn/
On Sun, Feb 3, 2013 at 6:42 AM, Stefan Sperling wrote:
> On Sun, Feb 03, 2013 at 06:16:03AM -0500, Justin Erenkrantz wrote:
> > On Sun, Feb 3, 2013 at 6:06 AM, Justin Erenkrantz wrote:
> > > I thought we treated application/xml as text at one time? -- justin
> >
> > Yes, we did...and Ben -1'ed it back in 2004. See:
> > http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=374218
> >
> > I guess...but...really? Is the only way for a user to override this
> > to manually set their local auto-props? *sigh* -- justin
>
> Perhaps we should treat mime-types that contain 'charset=UTF-8'
> as text by default? libmagic is able to detect the charset AFAIK.
> Our code currently removes charset information from the mime-type
> returned by libmagic.
>
> I don't really like the idea of adding yet another special case
> to our definition of what a "text" mime-type is. But it might
> be worth doing this if it impacts a lot of people.

% file -i -k index.shtml
index.shtml: application/xml; charset=us-ascii

Not all of us use UTF-8, but yes, the sentiment is valid. =) I do wonder if printing out this new message saying SVN considers XML files as binary is going to increase the number of people realizing that they are tripping over this. -- justin
Re: exclusive WC locks by default? (was: Re: svn commit: r1447487 - /subversion/trunk/subversion/svn/svn.c)
On Feb 18, 2013 7:08 PM, "Stefan Sperling" wrote: > > On Mon, Feb 18, 2013 at 10:21:16PM +0100, Bert Huijben wrote: > > If I would use 1.8 to run > > > > svn diff |more > > > > And leave that open in a console I would block TortoiseSVN, AnkhSVN, Subclipse, etc on my entire working copy, while this command doesn't even obtain a working copy lock. > > > > And as a developer on both 'svn' and AnkhSVN I often use 'svn diff' just for reviews. > > > > And all those other clients would just hang without a way to recover... > > I only use the command line but run 'svn' commands concurrently quite often. > > E.g. I run 'svn diff' while 'svn commit' has the log message editor open. > I also use 'svn log --diff' a lot and sometimes still have it running while > committing from the same working copy. > > I've disabled exclusive locking in my client configuration for this reason > and wouldn't mind either if exclusive locking was off by default. +1 - "svn diff" while running "svn commit" (blocking waiting for $EDITOR to exit) is my SOP as well. If we break that by default, I consider it a regression. -- justin
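For reference, the setting Stefan says he disabled lives in the client runtime configuration. A sketch of the relevant stanza, with option name and section as described for the 1.8-era client - verify against your client's generated config template before relying on it:

```ini
# ~/.subversion/config - opt out of SQLite exclusive working-copy
# locking, so long-running read-only commands (svn diff | more,
# GUI clients) don't block a concurrent svn commit.
[working-copy]
exclusive-locking = false
```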
Re: BDB deprecation (was: branch 1.8 or at least start making alpha releases?)
On Tue, Feb 26, 2013 at 1:54 PM, C. Michael Pilato wrote: > * The appropriate time to stop supporting Berkeley DB is in the same > release > for which existing FSFS will also have to dump/load. It is cruel to force > admins to endure the migration process twice -- possibly in successive > releases of Subversion -- and especially when one of those times is just > for > a (possibly less-than-compelling) bit of a performance boost. > I brought this up here in Portland with Brane et al - but, I'd be a tad concerned if we're going to make a dump/load *mandatory* for FSFS. Sure, we can advise a dump/load to get better performance, but I think we have shot ourselves in the foot with the client-side WC upgrade being mandatory. I hate the fact that 1.7 and 1.8 can't share WCs and force me to do 'svn upgrade'. As a developer testing trunk, this really blows... > * That said, I'm okay with deprecating Berkeley DB today as a warning to > existing BDB users that change is a-comin', though the release notes should > (again) indicate that there's no reason to rush off and convert to FSFS > until an as-yet-undecided future revision forces the issue for *all* > Subversion users. > +1. -- justin
Re: Printing error stacks.
On Fri, Apr 12, 2013 at 6:28 PM, Peter Samuelson wrote: > What is the utility of the "(apr_err=_)" suffix there? I think it's > better to omit it. Print apr_err= only when it is new information. > +1. -- justin
Re: log --search test failures on trunk and 1.8.x
On Sun, Apr 21, 2013 at 9:37 AM, Bert Huijben wrote: > Do we want case folding (or at least case variant compare) support in our > libraries for 1.8? > No. > Or is this 1.9+ scope? > Yes. =) My $.02. -- justin
Re: log --search test failures on trunk and 1.8.x
On Wed, Apr 24, 2013 at 1:34 AM, Branko Čibej wrote:
> If you consider all this, the easiest approach by far might be to simply
> add a Lucene index of all log messages to the server; then you can add
> any number of bells and whistles, including language-specific stemming.
> I'd consider that a better solution than any homegrown full-text search
> facility; these are never easy.

Ha! That might actually be easier than importing all of the Unicode nastiness. =) -- justin
Re: svn commit: r1483795 - /subversion/branches/1.8.x/STATUS
On Fri, May 17, 2013 at 5:58 PM, Ivan Zhakov wrote: > Yes, this will be good improvement anyway, but I think repository > integrity should be first goal. > +1! =P -- justin
Re: C++ thoughts for Berlin
On Wed, May 29, 2013 at 12:05 PM, Blair Zajac wrote:
> I'm generally in favor of a move to C++, it would be nice to get
> features that we work around now in C.

Rewriting even some of our core libraries to use C++ (even if we kept the existing C API) just doesn't seem to address any real problems that we have. We'd likely be writing off support for a lot of platforms due to the inconsistent nature of many C++ compilers on platforms we have supported since 1.0. I do not think this is a good thing.

With regards to libraries, I have had nothing but horrible developer experiences with Boost - it's pretty counter-intuitive in a lot of places; and C++11 isn't anywhere near widely enough supported to be considered if we want to keep broad platform support. As trying to use APR in a C++-based memory management model is fraught with paradigm conflicts, we'd quite likely need to write a new portability layer and a new HTTP networking layer. Fun! (Not.)

BTW, I believe that GCC is special - due to its bootstrapping methodology, it's only really meant to be compiled by itself - this doesn't apply to Subversion, so I think that analogy is a bit of a red herring.

If we really switched to having core libraries written in C++, I would forcefully argue that it has to be SVN 2 (regardless of whether we kept the C API identical)...and I'd probably say we should just rename the project to something else - it's not Subversion at that point, but something else. -- justin
Re: C++ thoughts for Berlin
On Wed, May 29, 2013 at 6:03 PM, Blair Zajac wrote:
> Yup, I've had lots of issues with this. Putting C++ pool wrappers in
> C++ classes and having them destroy in the correct order can be tricky
> to get right (lots of core dumps in our internal RPC server). One of
> the nice things about moving to C++, assuming we don't use pools, is
> that one could stop thinking about memory management.

I don't see how you can have a clean C++ implementation as we expose pools in the C API. I just think we're going to bring a world of hurt upon us all if we try to do a wholesale rewrite in C++ and try to shove the pool model into it. (The batons and callbacks all require pools with a certain lifetime - so there is third-party code relying upon the pool hierarchy being correct.) If we really wanted to go that route, the far saner (if you can call it that) approach is to just call it Subversion 2.0 and have a pool-free external API...from where I sit, I don't see much value going that far - how many years do we want to dedicate to 2.0? Are things ultimately so broken that we simply can't make Subversion better unless we go to C++? It's not a zero-cost item.

To be perfectly clear, I'm well aware that many folks have written particular library modules in C++ over the years. Perhaps I could see a case for a new FS layer written in C++ - those who don't have a modern C++ compiler can still use fsfs. (We go from 2 to 1 and back to 2 FS impl's again!) Rewriting libsvn_client or libsvn_wc in C++ is where I think things would just break down if we tried to do it in a piecemeal approach - at that point, everything underneath would have to change to C++. And we know how much fun it was to rewrite libsvn_wc the last time. -- justin
Re: Dependency versioning question
+1 for "no big deal" too, I think. Depending upon how far we take serf 2.0 and its APIs, a new libsvn_ra_serf2 library may be in order. So, it'd be no different than all of the libsvn_fs* and libsvn_wc internal changes. If we altered the RA APIs, then that would be a no-no if we did it in a backwards-incompatible manner. -- justin On Thu, Jun 6, 2013 at 8:50 AM, Greg Stein wrote: > Hey all, > > I've got a question where I'm not quite sure what the answer "should" > be. While I wrote the rules on versioning components, it never talked > about cross-component and dependency versioning. For Subversion, we > said "all svn components should be at the same version". > > But dependencies. > > Within serf, we're thinking about new models for connection and > protocol handling. And bumping to 2.0 to make that happen. What is the > effect upon libsvn_ra_serf and svn in general? > > My thinking is "no big deal". That upgrading to svn 1.9 (or 1.10 or > whatever) would require installation of serf 2.0. All svn APIs would > remain unchanged, per our own versioning guidelines (both source and > binary APIs). > > Thoughts? > > Cheers, > -g >
Re: --rm-I alias to --remove-ignored
On Mon, Jul 1, 2013 at 5:37 PM, Daniel Shahaf wrote:
> How about adding 'svn cleanup --rm-I' as a short option for 'svn cleanup
> --remove-ignored'?

I don't like either option (rm-I or rm-?) as I don't find either an improvement over the long option. Mixing lower- and uppercase in the same argument is confusing. For the other option, having a non-alphanumeric character seems like a really bad idea in a supposed short-option string. My $.02. -- justin
Re: Tri-state for http-detect-chunking config option
On Thu, Jul 11, 2013 at 10:15 PM, Greg Stein wrote:
> Hey all,
>
> Right now, trunk and the 1.8.1 backport request implement the following
> option:
>
> http-detect-chunking = off / default: perform no detection. A 411
> response will throw an error message suggesting to turn this option on.

I'm not really a fan of exposing this as a config knob in any situation. I think that we should always run the detection. It's one additional request in 1.8.x...and I think we can indeed remove that extra round-trip in 1.9.x. -- justin
Re: [PATCH] Change error reporting for xlate problems
On Sat, Jul 13, 2013 at 8:39 PM, Daniel Shahaf wrote:
> It appears as soon as svn_utf_cstring_to_utf8() is called --- which is
> normally during argv parsing. The error happens even if the argv
> arguments are all ASCII, which effectively adds a new dependency on
> apr_xlate_* even for people who use only ASCII. I assume this new
> failure mode for ASCII-only users means we cannot backport this change.

This now means iconv or apr-iconv is a hard global dependency where it hasn't been before. I'm not sure that I like that. (If you're going to do this patch, remove the if check for EINVAL/ENOTIMPL - it's unnecessary.)

IIRC, the reason is that this code gets called all the time - we had to silence the ENOTIMPL and EINVAL errors as they really shouldn't be treated as fatal errors. So, in this case, we're back to treating it as an irrecoverable error. It's probably better to just check APR_HAS_XLATE and return an error in svnsync if that's 0...and let the string pass through unmodified - which is probably fine for the svnsync case...or perhaps go a step further and just refuse to build svnsync if iconv isn't supported, as we might not be able to guarantee the integrity of the sync'd logs. I'm not sure how paranoid we need to be here. -- justin
Re: Security patches release process
On Wed, Aug 7, 2013 at 5:48 AM, Daniel Shahaf wrote: > (v3) > - Receive report > - Come up with a fix > - Gather 3 +1 votes via shadow status file > - Send pre-notifications > - Release a signed .diff file (instead of a tarball) as 1.8.(x+1) > - Commit fix with a log message clearly outlining the security implications > - Backport without going via STATUS (already has 3 +1s) > - Tag and roll 1.8.(x+2) whenever we had scheduled the next release to. > I'm okay with choice 3 - however, the "release a signed .diff" should also include a full release - including whatever normal things we include with a standard release process. If 1.8.(x+1) is *only* available as a diff against 1.8.x, that makes it hard for downstream users (or packagers) until 1.8.(x+2) is released. -- justin