Re: svn commit: r983766 - /subversion/branches/performance/subversion/libsvn_client/export.c

2010-08-12 Thread Stefan Fuhrmann

Daniel Shahaf wrote:

Stefan Fuhrmann wrote on Thu, Aug 12, 2010 at 22:08:03 +0200:
  

Hyrum K. Wright wrote:


On Tue, Aug 10, 2010 at 5:56 AM, Stefan Sperling  wrote:
  
  

On Mon, Aug 09, 2010 at 09:45:18PM +0300, Daniel Shahaf wrote:



Also, this isn't really related to performance; it belongs on /trunk.  Next
time, you could send this with a [PATCH] marker in the subject line, and
a full committer could +1 you to commit that to directly to /trunk. 
  

Yes, please send patches if you have a change that isn't direclty
related to your performance improvements work.

The scope of the branch is not "stefan2 makes all of his commits there",
it's "this branch is for stefan2's performance-related work".


This was a special case because without the patch, my load
tests wouldn't run. So, I could at least kind of justify the process
violation to myself as "performance-related work".

Sure, but we still want to get the patch in trunk one way or another (and
before the whole branch is merged).  Just committing it to the branch and
forgetting about it doesn't solve the problem :-)


Point taken ;)

Speaking of which: what's the plan for merging the branch to trunk?  i.e.,
it's great to see you working there and making progress, but we all would
like to see that in 1.7 (or, at least, 1.8)...
  

I'm still in the process of copying changes from my prototype source
to the performance branch. Commit-count-wise, I might be half through
the pile. The most expensive changes, those requiring large amounts of
commentary or touch many, many file, should be in the SVN repo by now.
The whole synchronization process is the reason why my patches arrive
in a seemingly arbitrary order - it is determined by the manageability of
the diff tool(s).

So, my hope is that all essential changes are in by the end of this month,
including some basic tests for the new classes I introduced.

Basically, my whole motivation for all that work is to have it in 1.7 so
I can have it rolled out in my company for the next "infrastructure cycle".
If that would mean to leave some smaller parts out, I would rather go
for that part I can get than waiting a long time for the whole thing.

-- Stefan^2.


Re: svn commit: r966822 - in /subversion/trunk/subversion: libsvn_client/repos_diff.c tests/cmdline/merge_tests.py

2010-08-12 Thread Paul Burba
On Tue, Aug 3, 2010 at 12:40 PM, Paul Burba  wrote:
> Ok, I am waving the white flag as far as implementing a fix for issue
> #3657 in the RA layer.  I simply don't see how it can be done outside
> of this:  Put the DAV update report handler into send-all=TRUE
>  mode when creating a
> svn_ra_reporter3_t * during a merge/diff.  This would prevent
>  response that causes the client to fetch *all* the
> properties of a path which subsequently pushes these to the
> svn_delta_editor_t callbacks as if they were all prop *changes* (as
> Mike discussed here
> http://subversion.tigris.org/issues/show_bug.cgi?id=3657#desc9).
>
> fyi: Currently this is how the two DAV RA layers use send-all mode:
>
>               send-all   send-all
>               mode       mode
>  operation    ra_serf    ra_neon
>  -       -
>  update       FALSE      TRUE
>  status       FALSE      TRUE
>  switch       FALSE      TRUE
>  diff         FALSE      FALSE
>
> In that attached patch I reverted r966822 and tried the aforementioned
> approach by tweaking ra_neon so the send-all mode on diff/merge is
> TRUE.  The whole test suite passes (including the test for issue
> #3657).  I haven't got an equivalent change to ra_serf to work yet,
> but I'm not too worried about that yet because...
>
> ...After testing out this idea, I realized the problem with this
> approach: It reopens issue #2048 'Primary connection (of parallel
> network connections used) in ra-dav diff/merge timeout
> unnecessarily.':
>
>  http://subversion.tigris.org/issues/show_bug.cgi?id=2048
>  http://svn.apache.org/viewvc?view=revision&revision=853457
>
> So I'm at a dead-end right now.  I'm happy to revert r966822 and
> reopen issue #3657 if the consensus is that my initial fix is a sloppy
> band-aid to cover incorrect ra_neon/ra_serf behavior and that the
> latter two are where the fix belongs.
>
> Paul

Mike suggested to me that the fix for this problem can be made in
mod_dav_svn, since the update reporter already knows which properties
have changed, *regardless* of the send-all mode, but it discards this
information if send-all=FALSE and instead instead sends the client a
'fetch-props' response.  As described previously in this thread, this
causes the client to fetch all the properties of the path, not just
the changes, and *all* the props are pushed through the editor
callbacks as if they were all actual changes (which again is the core
problem of issue #3657).

I'm working on that approach, but in the meantime I have a question
about the behavior of svn diff --summarize when describing the
addition of paths with properties.

### Let's take a vanilla greek tree, add a file and directory,
### set a prop on each, and commit as r2:

  >echo nu file > nu

  >svn add nu
  A nu

  >svn ps prop val nu
  property 'prop' set on 'nu'

  >svn mkdir J
  A J

  >svn ps prop val J
  property 'prop' set on 'J'

### Status looks right, two scheduled additions:

  >svn st
  A   nu
  A   J

### Commit it

  >svn ci -m ""
  Adding J
  Adding nu
  Transmitting file data .
  Committed revision 2.

### Check diff --summarize for r2

  >svn diff -r1:2 --summarize
  AM  nu
  AM  J

Should we really be describing the props for this added (not copied!)
file as modified?  It seems incorrect to call the props modified in
this case.  I ask because this is one of the problems I am running
into with fixing this issue in mod_dav_svn, and before I spend more
time trying to "fix" it I want to be sure the current behavior is
really what we expect.

Paul "Dying slowly at the hands of mod_dav_svn" Burba


Re: Note to self (with all of dev@ peering over my shoulder...)

2010-08-12 Thread Роман Донченко
C. Michael Pilato  писал в своём письме Fri, 13 Aug  
2010 00:41:36 +0400:



Simplest recipe I can find:

$ python -c 'from svn import client, core;
ctx = client.svn_client_ctx_t();
ctx.config = core.svn_config_get_config(None)'
Traceback (most recent call last):
  File "", line 1, in 
  File "/usr/local/subversion/lib/svn-python/libsvn/client.py", line  
633, in

__setattr__
return _swig_setattr(self, self.__class__, name, value)
  File "/usr/local/subversion/lib/svn-python/libsvn/client.py", line 44,  
in

_swig_setattr
return _swig_setattr_nondynamic(self,class_type,name,value,0)
  File "/usr/local/subversion/lib/svn-python/libsvn/client.py", line 37,  
in

_swig_setattr_nondynamic
if method: return method(self,value)
TypeError: Unexpected NULL parent pool on proxy object
$



You're supposed to use svn_client_create_context (that is,  
client.create_context).


Roman.



Re: [patch] fsfs revprop packing paranoia

2010-08-12 Thread Daniel Shahaf
I figure this patch can't do any harm (except cost another file-read when a
write-lock or txn-current-lock is being acquired), and it might as well
help, so I committed it in r984990.

Daniel Shahaf wrote on Tue, Aug 03, 2010 at 18:59:33 +0300:
> Daniel Shahaf wrote on Mon, Aug 02, 2010 at 23:44:10 +0300:
> > Is this necessary to avoid some race conditions around a revprop becoming
> > packed just before commit_body()'s write-lock had been acquired?
> 
> [[[
> Index: subversion/libsvn_fs_fs/fs_fs.c
> ===
> --- subversion/libsvn_fs_fs/fs_fs.c   (revision 981921)
> +++ subversion/libsvn_fs_fs/fs_fs.c   (working copy)
> @@ -2781,6 +2789,7 @@ set_revision_proplist(svn_fs_t *fs,
>  
>SVN_ERR(ensure_revision_exists(fs, rev, pool));
>  
> +  SVN_ERR(update_min_unpacked_revprop(fs, pool));
>if (ffd->format < SVN_FS_FS__MIN_PACKED_REVPROP_FORMAT ||
>rev >= ffd->min_unpacked_revprop)
>  {
> ]]]
> 
> 
> And, to be honest, I think I'd rather fix this once and for all by updating
> the ffd->min_unpacked_* caches whenever a write-lock is obtained anywhere: 
> 
> [[[
> Index: subversion/libsvn_fs_fs/fs_fs.c
> ===
> --- subversion/libsvn_fs_fs/fs_fs.c   (revision 981921)
> +++ subversion/libsvn_fs_fs/fs_fs.c   (working copy)
> @@ -150,6 +150,9 @@ read_min_unpacked_rev(svn_revnum_t *min_unpacked_r
>  static svn_error_t *
>  update_min_unpacked_rev(svn_fs_t *fs, apr_pool_t *pool);
>  
> +static svn_error_t *
> +update_min_unpacked_revprop(svn_fs_t *fs, apr_pool_t *pool);
> +
>  /* Pathname helper functions */
>  
>  /* Return TRUE is REV is packed in FS, FALSE otherwise. */
> @@ -567,7 +570,8 @@ get_lock_on_filesystem(const char *lock_filename,
> BATON and that subpool, destroy the subpool (releasing the write
> lock) and return what BODY returned. */
>  static svn_error_t *
> -with_some_lock(svn_error_t *(*body)(void *baton,
> +with_some_lock(svn_fs_t *fs,
> +   svn_error_t *(*body)(void *baton,
>  apr_pool_t *pool),
> void *baton,
> const char *lock_filename,
> @@ -594,7 +598,11 @@ static svn_error_t *
>err = get_lock_on_filesystem(lock_filename, subpool);
>  
>if (!err)
> -err = body(baton, subpool);
> +{
> +  SVN_ERR(update_min_unpacked_rev(fs, pool));
> +  SVN_ERR(update_min_unpacked_revprop(fs, pool));
> +  err = body(baton, subpool);
> +}
>  
>svn_pool_destroy(subpool);
>  
> @@ -622,7 +630,7 @@ svn_fs_fs__with_write_lock(svn_fs_t *fs,
>apr_thread_mutex_t *mutex = ffsd->fs_write_lock;
>  #endif
>  
> -  return with_some_lock(body, baton,
> +  return with_some_lock(fs, body, baton,
>  path_lock(fs, pool),
>  #if SVN_FS_FS__USE_LOCK_MUTEX
>  mutex,
> @@ -645,7 +653,7 @@ with_txn_current_lock(svn_fs_t *fs,
>apr_thread_mutex_t *mutex = ffsd->txn_current_lock;
>  #endif
>  
> -  return with_some_lock(body, baton,
> +  return with_some_lock(fs, body, baton,
>  path_txn_current_lock(fs, pool),
>  #if SVN_FS_FS__USE_LOCK_MUTEX
>  mutex,
> ]]]


Re: svn commit: r984926 - in /subversion/trunk: ./ subversion/libsvn_subr/dirent_uri.c

2010-08-12 Thread Daniel Shahaf
Lieven Govaerts wrote on Thu, Aug 12, 2010 at 22:19:48 +0200:
> Slightly off topic remark, but it would be nice if we could run
> stefan's branch on the buildbots from time to time, would be
> interesting to see the impact on the test run times.

I'm pretty sure gmcdonald can arrange that, if we (collectively) would like
this.


Re: svn commit: r983766 - /subversion/branches/performance/subversion/libsvn_client/export.c

2010-08-12 Thread Daniel Shahaf
Stefan Fuhrmann wrote on Thu, Aug 12, 2010 at 22:08:03 +0200:
> Hyrum K. Wright wrote:
>> On Tue, Aug 10, 2010 at 5:56 AM, Stefan Sperling  wrote:
>>   
>>> On Mon, Aug 09, 2010 at 09:45:18PM +0300, Daniel Shahaf wrote:
>>> 
 Also, this isn't really related to performance; it belongs on /trunk.  Next
 time, you could send this with a [PATCH] marker in the subject line, and
 a full committer could +1 you to commit that to directly to /trunk.
   
>>> Yes, please send patches if you have a change that isn't direclty
>>> related to your performance improvements work.
>>>
>>> The scope of the branch is not "stefan2 makes all of his commits there",
>>> it's "this branch is for stefan2's performance-related work".
>>> 
> This was a special case because without the patch, my load
> tests wouldn't run. So, I could at least kind of justify the process
> violation to myself as "performance-related work".
>

Sure, but we still want to get the patch in trunk one way or another (and
before the whole branch is merged).  Just committing it to the branch and
forgetting about it doesn't solve the problem :-)

Speaking of which: what's the plan for merging the branch to trunk?  i.e.,
it's great to see you working there and making progress, but we all would
like to see that in 1.7 (or, at least, 1.8)...

> If I'm not mistaken, there are no outstanding behavioral patches
> left that I would need for my performance improvements.

Okay.  (I don't recall any either, of those I'd reviewed.)


Re: Note to self (with all of dev@ peering over my shoulder...)

2010-08-12 Thread C. Michael Pilato
On 07/15/2010 04:04 PM, Daniel Shahaf wrote:
> C. Michael Pilato wrote on Thu, Jul 15, 2010 at 15:43:10 -0400:
>> [ Mailing this to the list so I don't forget about it -- I haven't
>>   time to dive in right now. ]
>>
>> I ran into a problem today with the SWIG Python bindings when doing some
>> ViewVC work (annotate view under standalone.py).  Exception below:
>>
> 
> How to reproduce?  (just go to the "annotate" view!?)

Simplest recipe I can find:

$ python -c 'from svn import client, core;
ctx = client.svn_client_ctx_t();
ctx.config = core.svn_config_get_config(None)'
Traceback (most recent call last):
  File "", line 1, in 
  File "/usr/local/subversion/lib/svn-python/libsvn/client.py", line 633, in
__setattr__
return _swig_setattr(self, self.__class__, name, value)
  File "/usr/local/subversion/lib/svn-python/libsvn/client.py", line 44, in
_swig_setattr
return _swig_setattr_nondynamic(self,class_type,name,value,0)
  File "/usr/local/subversion/lib/svn-python/libsvn/client.py", line 37, in
_swig_setattr_nondynamic
if method: return method(self,value)
TypeError: Unexpected NULL parent pool on proxy object
$


-- 
C. Michael Pilato 
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand



signature.asc
Description: OpenPGP digital signature


Re: What C structure contains the HTTP Headers

2010-08-12 Thread Loren Cahlander
After discussing things with Mike, I an going to be making the changes through 
the HTTP PROPPATCH method.


Thank you Mike and Hyrum for you help and patience.

Loren


On Aug 12, 2010, at 12:33 PM, C. Michael Pilato wrote:

> On 08/12/2010 12:49 PM, Loren Cahlander wrote:
>> I want to change the following function in subversion/mod_dav_svn/version.c 
>> to get those two headers values and set the values for SVN_PROP_REVISION_LOG 
>> and SVN_PROP_REVISION_AUTHOR.  Is there a simple way to get the request_rec 
>> structure from the svn_fs_txn_t or apr_pool_t structures?  Another thing 
>> that might help is where is the entry into mod_dav_svn for a HTTP PUT 
>> request?
> 
> You'll need to tweak the dav_svn__attach_auto_revprops() function's
> signature to accept a 'request_rec *r' parameter, and tweak the callers to
> provide it.
> 
> As for the entry point, mod_dav_svn is actually a DAV filesystem provider
> that sits behind Apache's own mod_dav.  In the Apache source code sits a
> modules/dav/main/ directory, and the source code in that directory contains
> functions with names of the form 'dav_method_REQUESTTYPE' where REQUESTTYPE
> is the HTTP request method (in lowercase).  So,
> ${HTTPD_SRC}/modules/dav/main/mod_dav.c:dav_method_put() is the entry point
> into this whole stack of WebDAV-based version control.
> 
> 
> -- 
> C. Michael Pilato 
> CollabNet   <>   www.collab.net   <>   Distributed Development On Demand
> 



Re: svn commit: r984926 - in /subversion/trunk: ./ subversion/libsvn_subr/dirent_uri.c

2010-08-12 Thread Lieven Govaerts
On Thu, Aug 12, 2010 at 10:08 PM, Lieven Govaerts  wrote:
> On Thu, Aug 12, 2010 at 9:59 PM, Daniel Shahaf  
> wrote:
>> stef...@apache.org wrote on Thu, Aug 12, 2010 at 19:29:24 -:
>>> Author: stefan2
>>> Date: Thu Aug 12 19:29:23 2010
>>> New Revision: 984926
>>>
>>> URL: http://svn.apache.org/viewvc?rev=984926&view=rev
>>> Log:
>>> Merge r983764 from branches/performance. Approved by: danielsh
>>> http://svn.haxx.se/dev/archive-2010-08/0217.shtml
>>
>> Usually, the "Approved by:" appears on a separate line, and an accompanying 
>> URL
>> is not necessary.  (See [1].)
>>
>> I won't be bothered if these log messages aren't fixed, but please be aware
>> of the convention for next time.
>
> These conventions are there to make the contribulyzer work no?
>
> http://www.red-bean.com/svnproject/contribulyzer/
>
> Not following these conventions means we have to pay attention to your
> work instead of having an automated tool do that for us :-).
>

Slightly off topic remark, but it would be nice if we could run
stefan's branch on the buildbots from time to time, would be
interesting to see the impact on the test run times.

Lieven


Re: [PATCH] Tweak description of BASE_NODE table

2010-08-12 Thread Greg Stein
Looks good!

On Aug 12, 2010 1:12 PM, "Julian Foad"  wrote:
> Can someone check this before I commit?
>
> [[[
> Index: subversion/libsvn_wc/wc_db.h
> ===
> --- subversion/libsvn_wc/wc_db.h (revision 984862)
> +++ subversion/libsvn_wc/wc_db.h (working copy)
> @@ -438,14 +438,19 @@ svn_wc__db_get_wcroot(const char **wcroo
>
> /* @defgroup svn_wc__db_base BASE tree management
>
> - BASE should be what we get from the server. The *absolute* pristine
copy.
> - Nothing can change it -- it is always a reflection of the repository.
> + BASE is what we get from the server. It is the *absolute* pristine copy.
> You need to use checkout, update, switch, or commit to alter your view of
> the repository.
>
> + In the BASE tree, each node corresponds to a particular node-rev in the
> + repository. It can be a mixed-revision tree. Each node holds either a
> + copy of the node-rev as it exists in the repository (if presence =
> + 'normal'), or a place-holder (if presence = 'absent' or 'excluded' or
> + 'not-present').
> +
> @{
> */
> ]]]
>
> - Julian
>
>


Returning revprops through commit info

2010-08-12 Thread Hyrum K. Wright
Recently, we've updated the various client APIs which do commits to
return commit info back through a callback[1], since they may be
extended to perform multiple commits (see issue #1199 for instance).
In doing so, I've had the opportunity to take a look at the
svn_commit_info_t struct.  It's pretty simplistic, but includes fields
for the author and date.

In 1.6 (I believe) we changed the various log and commit APIs to use a
hash of arbitrary revision properties, rather than a hard coded list.
I wonder if it's worth it to do so in the commit_info struct.  We'd
still keep the existing fields for compat, but we would also add a
hash of revision properties, for consistency with the other APIs, and
for greater future extensibility.

Thoughts?

-Hyrum

[1] The callback was added as a member of the client context, instead
of a per-API func/baton set.  This choice was somewhat arbitrary, and
conversations with Mark regarding the same in the JavaHL wrappers have
made me wonder if we should go with the explicit func/baton args,
rather than using the client ctx.  Anybody have any strong feelings
about this issue?


Re: svn commit: r984926 - in /subversion/trunk: ./ subversion/libsvn_subr/dirent_uri.c

2010-08-12 Thread Lieven Govaerts
On Thu, Aug 12, 2010 at 9:59 PM, Daniel Shahaf  wrote:
> stef...@apache.org wrote on Thu, Aug 12, 2010 at 19:29:24 -:
>> Author: stefan2
>> Date: Thu Aug 12 19:29:23 2010
>> New Revision: 984926
>>
>> URL: http://svn.apache.org/viewvc?rev=984926&view=rev
>> Log:
>> Merge r983764 from branches/performance. Approved by: danielsh
>> http://svn.haxx.se/dev/archive-2010-08/0217.shtml
>
> Usually, the "Approved by:" appears on a separate line, and an accompanying 
> URL
> is not necessary.  (See [1].)
>
> I won't be bothered if these log messages aren't fixed, but please be aware
> of the convention for next time.

These conventions are there to make the contribulyzer work no?

http://www.red-bean.com/svnproject/contribulyzer/

Not following these conventions means we have to pay attention to your
work instead of having an automated tool do that for us :-).

>
> Thanks again for your work!
>

+1

Lieven


Re: svn commit: r983766 - /subversion/branches/performance/subversion/libsvn_client/export.c

2010-08-12 Thread Stefan Fuhrmann

Hyrum K. Wright wrote:

On Tue, Aug 10, 2010 at 5:56 AM, Stefan Sperling  wrote:
  

On Mon, Aug 09, 2010 at 09:45:18PM +0300, Daniel Shahaf wrote:


Also, this isn't really related to performance; it belongs on /trunk.  Next
time, you could send this with a [PATCH] marker in the subject line, and
a full committer could +1 you to commit that to directly to /trunk.
  

Yes, please send patches if you have a change that isn't direclty
related to your performance improvements work.

The scope of the branch is not "stefan2 makes all of his commits there",
it's "this branch is for stefan2's performance-related work".


This was a special case because without the patch, my load
tests wouldn't run. So, I could at least kind of justify the process
violation to myself as "performance-related work".

If I'm not mistaken, there are no outstanding behavioral patches
left that I would need for my performance improvements.


By the way, please don't take all this as "everybody is jumping on
Stefan F."  We are really grateful for the performance improvements
going on on your branch and look forward to seeing them in Subversion.
 Your changes are just the first time this has happened in a while,
and we're using them as an opportunity to do a bit of group
re-education.  So please don't feel singled out. :)
  

Fair enough and no offense is taken. I'm used to criticism; it is
part of my job.

(We will shortly be chastising Daniel S. for a similar transgression. :P )
  

Lucky us that Leviticus doesn't mention that particular crime.
At times, all that stoning business gets a bit tiresome ... ;)

-- Stefan^2.


Re: svn commit: r984926 - in /subversion/trunk: ./ subversion/libsvn_subr/dirent_uri.c

2010-08-12 Thread Daniel Shahaf
stef...@apache.org wrote on Thu, Aug 12, 2010 at 19:29:24 -:
> Author: stefan2
> Date: Thu Aug 12 19:29:23 2010
> New Revision: 984926
> 
> URL: http://svn.apache.org/viewvc?rev=984926&view=rev
> Log:
> Merge r983764 from branches/performance. Approved by: danielsh
> http://svn.haxx.se/dev/archive-2010-08/0217.shtml

Usually, the "Approved by:" appears on a separate line, and an accompanying URL
is not necessary.  (See [1].)

I won't be bothered if these log messages aren't fixed, but please be aware
of the convention for next time.

Thanks again for your work!

Daniel


[1] http://subversion.apache.org/docs/community-guide/conventions.html#crediting


Re: svn commit: r983766 - /subversion/branches/performance/subversion/libsvn_client/export.c

2010-08-12 Thread Daniel Shahaf
Stefan Fuhrmann wrote on Thu, Aug 12, 2010 at 21:51:12 +0200:
> Merged in r984928.

Thanks.

>> (actually, this may be a good change to backport to 1.6.x too)
>>   
> It seems to work with the 1.6.x client (at least with the builds I used).
> So, this bug might have gotten triggered by some canonicalization
> usage cleanup in 1.7.
>

I'd nominated it for backport before seeing this ^^^

> -- Stefan^2.
>


Re: svn commit: r983766 - /subversion/branches/performance/subversion/libsvn_client/export.c

2010-08-12 Thread Stefan Fuhrmann

Daniel Shahaf wrote:

stef...@apache.org wrote on Mon, Aug 09, 2010 at 18:33:53 -:
  

Author: stefan2
Date: Mon Aug  9 18:33:53 2010
New Revision: 983766

URL: http://svn.apache.org/viewvc?rev=983766&view=rev
Log:
Fix the root cause of an assertion triggered by exporting KDE /trunk:
File names need to be canonicalized when forming URLs.

* subversion/libsvn_client/export.c
  (add_file): properly escape the file's URL

Modified:
subversion/branches/performance/subversion/libsvn_client/export.c

Modified: subversion/branches/performance/subversion/libsvn_client/export.c
URL: 
http://svn.apache.org/viewvc/subversion/branches/performance/subversion/libsvn_client/export.c?rev=983766&r1=983765&r2=983766&view=diff
==
--- subversion/branches/performance/subversion/libsvn_client/export.c (original)
+++ subversion/branches/performance/subversion/libsvn_client/export.c Mon Aug  
9 18:33:53 2010
@@ -708,7 +708,12 @@ add_file(const char *path,
   struct edit_baton *eb = pb->edit_baton;
   struct file_baton *fb = apr_pcalloc(pool, sizeof(*fb));
   const char *full_path = svn_dirent_join(eb->root_path, path, pool);
-  const char *full_url = svn_uri_join(eb->root_url, path, pool);
+
+  /* path is not canonicalized, i.e. it may still contain spaces etc. */
+  const char *full_url = svn_uri_canonicalize(svn_uri_join(eb->root_url, 
+   path, 
+   pool),

+  pool);
 



See svn_path_url_add_component2().
  

Yep, that would be the correct function to use. I didn't test my
variant with a repo root that required escaped chars. In r984927,
it uses svn_path_url_add_component2 and works fine also in that case.

Also, this isn't really related to performance; it belongs on /trunk.  Next
time, you could send this with a [PATCH] marker in the subject line, and
a full committer could +1 you to commit that to directly to /trunk.
  

Will do. But as far as I remember, my prototyping code didn't
require any further bug fixes.

In this case, I give my +1.
  

Merged in r984928.

Daniel
(actually, this may be a good change to backport to 1.6.x too)
  

It seems to work with the 1.6.x client (at least with the builds I used).
So, this bug might have gotten triggered by some canonicalization
usage cleanup in 1.7.

-- Stefan^2.



Re: svn commit: r983764 - /subversion/branches/performance/subversion/libsvn_subr/dirent_uri.c

2010-08-12 Thread Stefan Fuhrmann

Daniel Shahaf wrote:

+1 to merge this to trunk.

(Please specify "Approved by: danielsh" when doing so, since your commit
access does not currently include /trunk.  Thanks.)
  

Done in r984926.

-- Stefan^2.

stef...@apache.org wrote on Mon, Aug 09, 2010 at 18:27:49 -:
  

Author: stefan2
Date: Mon Aug  9 18:27:49 2010
New Revision: 983764

URL: http://svn.apache.org/viewvc?rev=983764&view=rev
Log:
Fix an obvious typo in the path validation code that is also present at /trunk.
It produces false negatives, i.e. certain malformed URIs won't be detected.

* subversion/libsvn_subr/dirent_uri.c
  (svn_uri_is_canonical): actually compare the chars following '%' instead
   of comparing '%'+1 and '%'+2.

Modified:
subversion/branches/performance/subversion/libsvn_subr/dirent_uri.c

Modified: subversion/branches/performance/subversion/libsvn_subr/dirent_uri.c
URL: 
http://svn.apache.org/viewvc/subversion/branches/performance/subversion/libsvn_subr/dirent_uri.c?rev=983764&r1=983763&r2=983764&view=diff
==
--- subversion/branches/performance/subversion/libsvn_subr/dirent_uri.c 
(original)
+++ subversion/branches/performance/subversion/libsvn_subr/dirent_uri.c Mon Aug 
 9 18:27:49 2010
@@ -1901,11 +1901,11 @@ svn_uri_is_canonical(const char *uri, ap
 
   /* Can't use apr_isxdigit() because lower case letters are

  not in our canonical format */
-  if (((*(ptr+1) < '0' || (*ptr+1) > '9')) 
-  && (*(ptr+1) < 'A' || (*ptr+1) > 'F'))
+  if (((*(ptr+1) < '0' || *(ptr+1) > '9')) 
+  && (*(ptr+1) < 'A' || *(ptr+1) > 'F'))

 return FALSE;
-  else if (((*(ptr+2) < '0' || (*ptr+2) > '9')) 
-  && (*(ptr+2) < 'A' || (*ptr+2) > 'F'))
+  else if (((*(ptr+2) < '0' || *(ptr+2) > '9')) 
+  && (*(ptr+2) < 'A' || *(ptr+2) > 'F'))

 return FALSE;
 
   digitz[0] = *(++ptr);






  




Re: [PATCH] neon capabilities exchange

2010-08-12 Thread Daniel Shahaf
Daniel Shahaf wrote on Fri, Jul 30, 2010 at 15:30:19 +0300:
> So, here's an updated patch.  I'll commit it in a few days if I don't hear
> objections.

0:% svn ci --cl part
Sendingsubversion/libsvn_ra_neon/options.c
Sendingsubversion/libsvn_ra_serf/options.c
Sendingsubversion/svnsync/main.c
Transmitting file data ...
Committed revision 984923.

> (btw, should the "partial replay capability present" check be done at every
> connection, rather than only in 'svnsync init'?  Well, maybe...)

Tabling.


Re: [PATCH] Revert to using zlib-provided compress bound

2010-08-12 Thread Peter Samuelson

[Gavin Beau Baumanis]
> Just thought I would follow up your last email and see if you're
> still working on this patch?

r984911.

Thanks for the reminder,
Peter


Re: RFC: WCNG and Issue #2915: Merge tracking and missing subtrees due to non-svn removal

2010-08-12 Thread Daniel Shahaf
Paul Burba wrote on Fri, Aug 06, 2010 at 10:30:50 -0400:
> As described in issue #2915, in 1.6 when a merge target is a missing
> subtree due to its removal by non-svn means, we try to record
> mergeinfo such that the missing subtree doesn't have (or inherit)
> mergeinfo describing the merge:
> 
> (If you already have a vague idea of how this works you can skip to
> 'You might suggest that it makes more sense' below for the RFC)
> 
> ### A file is missing in our merge target
> 
>   1.6.13-dev>svn st
>   !   A_COPY\D\H\psi
> 
> ### No initial mergeinfo
> 
>   1.6.13-dev>svn pg svn:mergeinfo -vR
> 
> ### Merge tries to apply change to missing file, but can't
> ### and reports it as skipped
> 
>   1.6.13-dev>svn merge ^^/A A_COPY -r2:4
>   --- Merging r3 through r4 into 'A_COPY':
>   UA_COPY\D\G\rho
>   Skipped missing target: 'A_COPY\D\H\psi'
>   Summary of conflicts:
> Skipped paths: 1
> 
> ### Merge target gets mergeinfo describing the merge
> ### performed and the missing file gets empty "override"
> ### mergeinfo so it doesn't inherit the target's mergeinfo
> 
>   1.6.13-dev>svn st
>M  A_COPY
>   M   A_COPY\D\G\rho
>   !M  A_COPY\D\H\psi
> 
>   1.6.13-dev>svn pg svn:mergeinfo -vR
>   Properties on 'A_COPY\D\H\psi':
> svn:mergeinfo
> 
>   Properties on 'A_COPY':
> svn:mergeinfo
>   /A:3-4
> 
> If the missing subtree was a directory we obviously can't set its
> properties, so we treat this as a tree conflict:
> 
>   1.6.13-dev>svn st
>   !   A_COPY\D\H
> 
>   1.6.13-dev>svn merge ^^/A A_COPY -r2:4
>   --- Merging r3 through r4 into 'A_COPY':
>   UA_COPY\D\G\rho
>  C A_COPY\D\H
>   Summary of conflicts:
> Tree conflicts: 1
> 
>   1.6.13-dev>svn st
>M  A_COPY
>   M   A_COPY\D\G\rho
>   ! C A_COPY\D\H
> >   local delete, incoming edit upon merge
> 
> ~
> 
> You might suggest that it makes more sense to simply throw an error
> when a mergeinfo aware merge hits a missing subtree rather than
> jumping through these hoops.  I wouldn't disagree, but an even better
> solution was suggested to me recently by Bert: Restore the missing
> subtrees.  With wcng's single DB this should be easy with
> svn_wc_restore().
> 
> Does anyone see a reason we should not restore missing subtrees
> encountered during a merge?
> 
> Assuming for a moment there is general agreement about the goodness of
> this approach, which sounds better:
> 
> A) Check for missing subtrees at the start of the merge and restore
> them prior to any editor drives.  This would be easy enough to do in
> libsvn_client/merge.c:get_mergeinfo_paths() which we call at the start
> of a merges to collect information about subtrees "interesting" from a
> merge-tracking perspective.  The drawback here is that we need to
> detect all the missing subtrees and that means a new call to
> svn_io_check_path/apr_stat for every path in the merge target.  Since
> 99.99%* of merges don't involve missing subtrees, we are imposing a
> needless performance penalty on most users.
> 

Agreed: stat() on every file on, say, our trunk during a merge from a
branch, is too expensive.

> B) Restore the missing subtrees on-the-fly when the svn_delta_editor_t
> callbacks (i.e. open_directory, open_file) actually encounter a
> missing subtree.  This keeps the svn_io_check_path() calls to a
> minimum.  The drawback is that missing subtrees untouched by the
> editor are still missing after the merge.
> 

*nod*

> Any preference or suggestions for a superior solution are appreciated.
> 

We could treat missing files as conflicts?  e.g., about the same as what
we'd do if someone truncated the file to zero length.

That way all information is present locally (so there will be no need to
repeat the merge).

> Thanks,
> 
> Paul
> 
> * Yes, completely invented fact, but I feel pretty safe saying it.


Re: [PATCH] Revert to using zlib-provided compress bound

2010-08-12 Thread Daniel Shahaf
Welcome back Gavin :-)

Gavin Beau Baumanis wrote on Fri, Aug 13, 2010 at 00:12:53 +1000:
> Hi Michael,
> 
> Just thought I would follow up your last email and see if you're still 
> working on this patch?
> There's no rush... I just like to bring patch submissions back to the top of 
> the mailing list after about 7 days or so - just to remind everyone.
> 
> 
> Gavin.
> 


Re: What C structure contains the HTTP Headers

2010-08-12 Thread C. Michael Pilato
On 08/12/2010 12:49 PM, Loren Cahlander wrote:
> I want to change the following function in subversion/mod_dav_svn/version.c 
> to get those two headers values and set the values for SVN_PROP_REVISION_LOG 
> and SVN_PROP_REVISION_AUTHOR.  Is there a simple way to get the request_rec 
> structure from the svn_fs_txn_t or apr_pool_t structures?  Another thing that 
> might help is where is the entry into mod_dav_svn for a HTTP PUT request?

You'll need to tweak the dav_svn__attach_auto_revprops() function's
signature to accept a 'request_rec *r' parameter, and tweak the callers to
provide it.

As for the entry point, mod_dav_svn is actually a DAV filesystem provider
that sits behind Apache's own mod_dav.  In the Apache source code sits a
modules/dav/main/ directory, and the source code in that directory contains
functions with names of the form 'dav_method_REQUESTTYPE' where REQUESTTYPE
is the HTTP request method (in lowercase).  So,
${HTTPD_SRC}/modules/dav/main/mod_dav.c:dav_method_put() is the entry point
into this whole stack of WebDAV-based version control.


-- 
C. Michael Pilato 
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand



signature.asc
Description: OpenPGP digital signature


[PATCH] Tweak description of BASE_NODE table

2010-08-12 Thread Julian Foad
Can someone check this before I commit?

[[[
Index: subversion/libsvn_wc/wc_db.h
===
--- subversion/libsvn_wc/wc_db.h(revision 984862)
+++ subversion/libsvn_wc/wc_db.h(working copy)
@@ -438,14 +438,19 @@ svn_wc__db_get_wcroot(const char **wcroo
 
 /* @defgroup svn_wc__db_base  BASE tree management
 
-   BASE should be what we get from the server. The *absolute* pristine copy.
-   Nothing can change it -- it is always a reflection of the repository.
+   BASE is what we get from the server.  It is the *absolute* pristine copy.
You need to use checkout, update, switch, or commit to alter your view of
the repository.
 
+   In the BASE tree, each node corresponds to a particular node-rev in the
+   repository.  It can be a mixed-revision tree.  Each node holds either a
+   copy of the node-rev as it exists in the repository (if presence =
+   'normal'), or a place-holder (if presence = 'absent' or 'excluded' or
+   'not-present').
+
@{
 */
]]]

- Julian




Re: What C structure contains the HTTP Headers

2010-08-12 Thread Loren Cahlander
On Aug 12, 2010, at 11:12 AM, Hyrum K. Wright wrote:

> On Thu, Aug 12, 2010 at 11:07 AM, C. Michael Pilato  
> wrote:
>> On 08/12/2010 12:04 PM, Loren Cahlander wrote:
>>> Hello there,
>>> 
>>> I am trying to make a change to mod_dav_svn on my server and need to get
>>> a value out of the HTTP headers. I have not developed in C in quite a
>>> while and would appreciate any help by pointing me in the right
>>> direction.  I can take it from there, but I do need some help.
>> 
>> Apache passes around the request structure (often just called 'r'), and
>> inside that is a table called 'headers_in'.  Grep for 'r->headers_in' in,
>> say, subversion/mod_dav_svn/repos.c.
> 
> I know next-to-nothing about mod_dav_svn, but why would an Apache
> request appear in libsvn_repos?  I thought all that stuff was handled
> higher up, and libsvn_repos was ra-agnostic.
> 
> -Hyrum

I am trying to make a hack in mod_dav_svn to get some additional headers from a 
HTTP put.

I have the following XQuery code:

declare function http:put-basic-auth($url, $content, $username, $password, 
$header) as node(){

let $credentials := concat($username, ':', $password)
let $encode := util:base64-encode($credentials)
let $value := concat('Basic ', $encode)
let $headers :=

   {$header//header}
   

let $response := httpclient:put($url, $content, false(), $headers)
return $response
};

Where $header will contain:








I want to change the following function in subversion/mod_dav_svn/version.c to 
get those two headers values and set the values for SVN_PROP_REVISION_LOG and 
SVN_PROP_REVISION_AUTHOR.  Is there a simple way to get the request_rec 
structure from the svn_fs_txn_t or apr_pool_t structures?  Another thing that 
might help is where is the entry into mod_dav_svn for a HTTP PUT request?

svn_error_t *
dav_svn__attach_auto_revprops(svn_fs_txn_t *txn,
  const char *fs_path,
  apr_pool_t *pool)
{
  const char *logmsg;
  svn_string_t *logval;
  svn_error_t *serr;

  logmsg = apr_psprintf(pool,
"Autoversioning commit:  a non-deltaV client made "
"a change to\n%s", fs_path);

  logval = svn_string_create(logmsg, pool);
  if ((serr = svn_repos_fs_change_txn_prop(txn, SVN_PROP_REVISION_LOG, logval,
   pool)))
return serr;

  /* Notate that this revision was created by autoversioning.  (Tools
 like post-commit email scripts might not care to send an email
 for every autoversioning change.) */
  if ((serr = svn_repos_fs_change_txn_prop(txn,
   SVN_PROP_REVISION_AUTOVERSIONED,
   svn_string_create("*", pool),
   pool)))
return serr;

  return SVN_NO_ERROR;
}


I thank you for replying to me.  I will keep code diving until I find the 
solution.  

Re: What C structure contains the HTTP Headers

2010-08-12 Thread C. Michael Pilato
On 08/12/2010 12:12 PM, Hyrum K. Wright wrote:
> On Thu, Aug 12, 2010 at 11:07 AM, C. Michael Pilato  
> wrote:
>> On 08/12/2010 12:04 PM, Loren Cahlander wrote:
>>> Hello there,
>>>
>>> I am trying to make a change to mod_dav_svn on my server and need to get
>>> a value out of the HTTP headers. I have not developed in C in quite a
>>> while and would appreciate any help by pointing me in the right
>>> direction.  I can take it from there, but I do need some help.
>>
>> Apache passes around the request structure (often just called 'r'), and
>> inside that is a table called 'headers_in'.  Grep for 'r->headers_in' in,
>> say, subversion/libsvn_repos/repos.c.
> 
> I know next-to-nothing about mod_dav_svn, but why would an Apache
> request appear in libsvn_repos?  I thought all that stuff was handled
> higher up, and libsvn_repos was ra-agnostic.

Doh!  I meant to say "subversion/mod_dav_svn/repos.c".

-- 
C. Michael Pilato 
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand



signature.asc
Description: OpenPGP digital signature


Re: [RFC] Change svn_wc_add_repos_file4(copyfrom_url) to (_root, _relpath)

2010-08-12 Thread Hyrum K. Wright
On Thu, Aug 12, 2010 at 8:43 AM, Julian Foad  wrote:
> svn_wc_add_repos_file4(..., copyfrom_url, ...) has a check that says,
>
>  "If copyfrom_url isn't in the same repository as the WC target parent
> dir, then SVN_ERR_UNSUPPORTED_FEATURE."
>
> Semantically, yes, copying from another repos isn't yet supported.  But
> the check for that should be at a higher and/or lower level.
>
> The only reason it checks *at this level* is because it wants to know
> the repos_root so it can calculate the repos_relpath which is required
> for the DB.  And this is a cheap way of finding the repos root.
>
> Proposal:
>
>  * Change this API to take copyfrom_repos_root and
> copyfrom_repos_relpath instead.
>
>  * Remove this check and relpath calculation from its implementation.
>
>  * Update the callers (two in libsvn_client, one in libsvn_wc).
>
>  * For back-compat, move the check and _relpath calculation into
> svn_wc_add_repos_file3().
>
> That wouldn't change any behaviour right now, but would remove an
> obstacle for later improvements.
>
> I'm thinking this change fits into a generally desirable pattern of
> moving towards keeping repos_root and repos_relpath separate.
> Especially in the RA and repository side APIs, I assume, but makes sense
> to do so in other parts of the libraries as well.
>
> Any concerns?  If not I'll try to do it.

+1


Re: What C structure contains the HTTP Headers

2010-08-12 Thread Hyrum K. Wright
On Thu, Aug 12, 2010 at 11:07 AM, C. Michael Pilato  wrote:
> On 08/12/2010 12:04 PM, Loren Cahlander wrote:
>> Hello there,
>>
>> I am trying to make a change to mod_dav_svn on my server and need to get
>> a value out of the HTTP headers. I have not developed in C in quite a
>> while and would appreciate any help by pointing me in the right
>> direction.  I can take it from there, but I do need some help.
>
> Apache passes around the request structure (often just called 'r'), and
> inside that is a table called 'headers_in'.  Grep for 'r->headers_in' in,
> say, subversion/libsvn_repos/repos.c.

I know next-to-nothing about mod_dav_svn, but why would an Apache
request appear in libsvn_repos?  I thought all that stuff was handled
higher up, and libsvn_repos was ra-agnostic.

-Hyrum


Re: What C structure contains the HTTP Headers

2010-08-12 Thread C. Michael Pilato
On 08/12/2010 12:04 PM, Loren Cahlander wrote:
> Hello there,
> 
> I am trying to make a change to mod_dav_svn on my server and need to get
> a value out of the HTTP headers. I have not developed in C in quite a
> while and would appreciate any help by pointing me in the right
> direction.  I can take it from there, but I do need some help.

Apache passes around the request structure (often just called 'r'), and
inside that is a table called 'headers_in'.  Grep for 'r->headers_in' in,
say, subversion/libsvn_repos/repos.c.

-- 
C. Michael Pilato 
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand



signature.asc
Description: OpenPGP digital signature


What C structure contains the HTTP Headers

2010-08-12 Thread Loren Cahlander
Hello there,

I am trying to make a change to mod_dav_svn on my server and need to get a 
value out of the HTTP headers. I have not developed in C in quite a while and 
would appreciate any help by pointing me in the right direction.  I can take it 
from there, but I do need some help.

Thank you,
Loren



Re: Looking to improve performance of svn annotate

2010-08-12 Thread Greg Hudson
On Thu, 2010-08-12 at 10:57 -0400, Julian Foad wrote:
> I'm wary of embedding any client functionality in the server, but I
> guess it's worth considering if it would be that useful.  If so, let's
> take great care to ensure it's only lightly coupled to the core server
> logic.

Again, it's possible that binary diffs between sequential revisions
could be used for blame purposes (not the binary deltas we have now, but
edit-stream-style binary diffs), which would decouple the
line-processing logic from the server.

(But again, I haven't thought through the problem in enough detail to be
certain.)




Re: NODE_DATA (2nd iteration)

2010-08-12 Thread Stefan Sperling
On Thu, Aug 12, 2010 at 02:28:39PM +0200, Neels J Hofmeyr wrote:
> I've had so many family and summer events in the past weeks... soon I'll
> start forgetting my passwords!

Just type "yes" at the "really save in plaintext?" prompt :)


Re: Looking to improve performance of svn annotate

2010-08-12 Thread Julian Foad
On Thu, 2010-08-12, C. Michael Pilato wrote:
> In times past, I've wondered if the server couldn't just store line-delta
> information -- as a comparison between each FS DAG node and its immediate
> predecessor -- similar to the way that CVS does (and in addition to the
> stuff it already stores, of course).  The line-delta info could be populated
> post-commit just as the BDB backend did deltafication, or perhaps also on
> demand (to rebuild this information for older servers) via some 'svnadmin'
> command.

The server could usefully calculate and store it - but on demand and
cached, not at commit time - see below.

>   But it shouldn't ever change once calculated, right?

Well ... that depends.  The line-ending style the client wants to use
can vary over time: in order to cope with the cases where it was not
specified correctly in earlier revisions, the client may want to use the
EOL style specified by HEAD (or the latest version being blamed).  It
may be possible to calculate and store a generic data set that will be
useful whatever EOL style the client eventually decides it should be
using.

> My only concern is in dealing with the definition of a "line".  The FS layer
> today is happily file-content agnostic.  All EOL translation stuffs are
> considered voodoo of interest only to the client layers.  We could, of
> course, choose to make the FS layer pay attention to the likes of the
> svn:eol-style property, but that feels mucho icky.
> 
> Thoughts?

If the decision to calculate and store linewise info for a given file is
made at commit time by the server, it would probably want to consider
svn:mime-type or the likes, to avoid doing it unnecessarily on all
files.  Clients might then never request blame info on a majority of
files.  Conversely, a client might request blame info for a file on
which the server thought the data would not be needed (perhaps because
it was marked 'application/octet-stream' for example).

I would think that having the server calculate the info on demand and
store it in a limited-lifetime cache is more sensible, if we take a
server-assisted approach at all.  A cache would handle the cases above
better, and could also be made to handle requests with varying
definitions of EOL style if that is necessary.

I'm wary of embedding any client functionality in the server, but I
guess it's worth considering if it would be that useful.  If so, let's
take great care to ensure it's only lightly coupled to the core server
logic.

- Julian




Log HTTP header for autoversioning

2010-08-12 Thread Loren Cahlander
I ran across this in the subversion archives:

http://thread.gmane.org/gmane.comp.version-control.subversion.devel/55794

I am also needing this feature.  Could someone supply me with a patch or help 
tell me how to get a HTTP header property while in version.c


svn_error_t *
dav_svn__attach_auto_revprops(svn_fs_txn_t *txn,
  const char *fs_path,
  apr_pool_t *pool)
{
  const char *logmsg;
  svn_string_t *logval;
  svn_error_t *serr;

  logmsg = apr_psprintf(pool,
"Autoversioning commit:  a non-deltaV client made "
"a change to\n%s", fs_path);

  logval = svn_string_create(logmsg, pool);
  if ((serr = svn_repos_fs_change_txn_prop(txn, SVN_PROP_REVISION_LOG, logval,
   pool)))
return serr;

  /* Notate that this revision was created by autoversioning.  (Tools
 like post-commit email scripts might not care to send an email
 for every autoversioning change.) */
  if ((serr = svn_repos_fs_change_txn_prop(txn,
   SVN_PROP_REVISION_AUTOVERSIONED,
   svn_string_create("*", pool),
   pool)))
return serr;

  return SVN_NO_ERROR;
}


Since I am in control of the subversion on our server, I can implement this.  I 
have not done C development in quite a long time.  I have been doing Java and 
XQuery.  I would appreciate any help that you might be able to provide.

Thank you,

Loren



Re: [RFC] Change svn_wc_add_repos_file4(copyfrom_url) to (_root, _relpath)

2010-08-12 Thread Ivan Zhakov
On Thu, Aug 12, 2010 at 17:43, Julian Foad  wrote:
> svn_wc_add_repos_file4(..., copyfrom_url, ...) has a check that says,
>
>  "If copyfrom_url isn't in the same repository as the WC target parent
> dir, then SVN_ERR_UNSUPPORTED_FEATURE."
>
> Semantically, yes, copying from another repos isn't yet supported.  But
> the check for that should be at a higher and/or lower level.
>
> The only reason it checks *at this level* is because it wants to know
> the repos_root so it can calculate the repos_relpath which is required
> for the DB.  And this is a cheap way of finding the repos root.
>
> Proposal:
>
>  * Change this API to take copyfrom_repos_root and
> copyfrom_repos_relpath instead.
>
>  * Remove this check and relpath calculation from its implementation.
>
>  * Update the callers (two in libsvn_client, one in libsvn_wc).
>
>  * For back-compat, move the check and _relpath calculation into
> svn_wc_add_repos_file3().
>
> That wouldn't change any behaviour right now, but would remove an
> obstacle for later improvements.
>
> I'm thinking this change fits into a generally desirable pattern of
> moving towards keeping repos_root and repos_relpath separate.
> Especially in the RA and repository side APIs, I assume, but makes sense
> to do so in other parts of the libraries as well.
>
> Any concerns?  If not I'll try to do it.
>
+1, makes sense to me.


-- 
Ivan Zhakov
VisualSVN Team


Re: [PATCH] Revert to using zlib-provided compress bound

2010-08-12 Thread Gavin Beau Baumanis
Hi Michael,

Just thought I would follow up your last email and see if you're still working 
on this patch?
There's no rush... I just like to bring patch submissions back to the top of 
the mailing list after about 7 days or so - just to remind everyone.


Gavin.


On 07/08/2010, at 8:11 AM, Michael Spang wrote:

> On Fri, Aug 6, 2010 at 1:51 PM, Peter Samuelson  wrote:
>> 
>> [Michael Spang]
>>> This reverts to using the zlib-provided version, since the old version
>>> of zlib that was missing this function should be quite rare these
>>> days.
>> 
>> Maybe I'm just old ... but I bet there's still some zlib 1.1.4
>> out there.  Maybe do the following instead?  (Untested.)
> 
> Yeah that does seem better.
> 
> Michael



RE: [RFC] Change svn_wc_add_repos_file4(copyfrom_url) to (_root, _relpath)

2010-08-12 Thread Bert Huijben


> -Original Message-
> From: Julian Foad [mailto:julian.f...@wandisco.com]
> Sent: donderdag 12 augustus 2010 15:43
> To: dev@subversion.apache.org
> Subject: [RFC] Change svn_wc_add_repos_file4(copyfrom_url) to (_root,
> _relpath)
> 
> svn_wc_add_repos_file4(..., copyfrom_url, ...) has a check that says,
> 
>   "If copyfrom_url isn't in the same repository as the WC target parent
> dir, then SVN_ERR_UNSUPPORTED_FEATURE."
> 
> Semantically, yes, copying from another repos isn't yet supported.  But
> the check for that should be at a higher and/or lower level.
> 
> The only reason it checks *at this level* is because it wants to know
> the repos_root so it can calculate the repos_relpath which is required
> for the DB.  And this is a cheap way of finding the repos root.
> 
> Proposal:
> 
>   * Change this API to take copyfrom_repos_root and
> copyfrom_repos_relpath instead.
> 
>   * Remove this check and relpath calculation from its implementation.
> 
>   * Update the callers (two in libsvn_client, one in libsvn_wc).
> 
>   * For back-compat, move the check and _relpath calculation into
> svn_wc_add_repos_file3().
> 
> That wouldn't change any behaviour right now, but would remove an
> obstacle for later improvements.
> 
> I'm thinking this change fits into a generally desirable pattern of
> moving towards keeping repos_root and repos_relpath separate.
> Especially in the RA and repository side APIs, I assume, but makes
> sense
> to do so in other parts of the libraries as well.
> 
> Any concerns?  If not I'll try to do it.

+1

Bert 




[RFC] Change svn_wc_add_repos_file4(copyfrom_url) to (_root, _relpath)

2010-08-12 Thread Julian Foad
svn_wc_add_repos_file4(..., copyfrom_url, ...) has a check that says,

  "If copyfrom_url isn't in the same repository as the WC target parent
dir, then SVN_ERR_UNSUPPORTED_FEATURE."

Semantically, yes, copying from another repos isn't yet supported.  But
the check for that should be at a higher and/or lower level.

The only reason it checks *at this level* is because it wants to know
the repos_root so it can calculate the repos_relpath which is required
for the DB.  And this is a cheap way of finding the repos root.

Proposal:

  * Change this API to take copyfrom_repos_root and
copyfrom_repos_relpath instead.

  * Remove this check and relpath calculation from its implementation.

  * Update the callers (two in libsvn_client, one in libsvn_wc).

  * For back-compat, move the check and _relpath calculation into
svn_wc_add_repos_file3().

That wouldn't change any behaviour right now, but would remove an
obstacle for later improvements.

I'm thinking this change fits into a generally desirable pattern of
moving towards keeping repos_root and repos_relpath separate.
Especially in the RA and repository side APIs, I assume, but makes sense
to do so in other parts of the libraries as well.

Any concerns?  If not I'll try to do it.

- Julian




Re: Looking to improve performance of svn annotate

2010-08-12 Thread C. Michael Pilato
In times past, I've wondered if the server couldn't just store line-delta
information -- as a comparison between each FS DAG node and its immediate
predecessor -- similar to the way that CVS does (and in addition to the
stuff it already stores, of course).  The line-delta info could be populated
post-commit just as the BDB backend did deltafication, or perhaps also on
demand (to rebuild this information for older servers) via some 'svnadmin'
command.  But it shouldn't ever change once calculated, right?

My only concern is in dealing with the definition of a "line".  The FS layer
today is happily file-content agnostic.  All EOL translation stuffs are
considered voodoo of interest only to the client layers.  We could, of
course, choose to make the FS layer pay attention to the likes of the
svn:eol-style property, but that feels mucho icky.

Thoughts?

-- 
C. Michael Pilato 
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand



signature.asc
Description: OpenPGP digital signature


Re: svn_wc_translated_file3

2010-08-12 Thread Julian Foad
On Mon, 2010-08-09, Stefan Küng wrote: 
> On 09.08.2010 16:44, Hyrum K. Wright wrote:
> > On Mon, Aug 9, 2010 at 7:46 AM, Julian Foad  
> > wrote:
> >> On Mon, 2010-08-09, Julian Foad wrote:
> >>> On Mon, 2010-08-09, Bert Huijben wrote:
>  Stefan Küng wrote:
> > This leads me to another request: would it be possible to 'un-deprecate'
> > the svn_wc_get_pristine_copy_path() API?
> >>>
>  One of the ideas of WC-NG (which will most likely not be part of 1.7, but
>  can be added later without really updating our current pristine api) was 
>  to
>  allow compressed and/or (partially) absent pristine storage.
> 
>  This specific api function requires that the file is available in the
>  pristine store as uncompressed file or that we store a copy of the file 
>  in a
>  tempfolder (see the documentation update from Julian in r983593).
> >>>
> >>> Ah, yes - I'd forgotton that.  When the pristine file already exists in
> >>> plain text, it's not a big problem to return its path even though we
> >>> can't promise how long it will live.  But if pristine texts are stored
> >>> compressed and so we have to create a temp file in order to support that
> >>> API, when would we delete the temp file?
> >>
> >> A good answer could be: let the caller of
> >> svn_wc_get_pristine_copy_path() specify one of:
> >>
> >>   - Give me a new copy of the file that I can keep indefinitely and
> >> delete when I wish.
> >>
> >>   - Give me a reference to a shared copy of the file; keep it available
> >> until pool clean-up.
> 
> Or just 'give me a reference to a file': if it's already available, just 
> return the path to that file. If it's not available yet, get a copy of 
> that file (uncompressed or fetched from a repository if it's not 
> available locally at all) and return the path to that temp file.

Yes, that's what I meant.  By saying "a shared copy" I just meant that
the caller must not modify or delete (or get an exclusive read lock,
etc.) the file, because it *may* be shared.

> Maybe with a flag to keep/remove the file on pool cleanup.

My point is that this "give me a reference" API *must* specify the
lifetime of the reference, otherwise the library implementation doesn't
know when it can delete the temp file.  The lifetime can be until pool
clean-up, or until something else, but it must be specified and it must
be finite.  Otherwise the temp dir will just grow and grow.

> >>   - Give me a reference to a shared copy of the file; keep it available
> >> until I call ... some new "unreference it" API?
> >>
> >> Would the second or third of those options work well, Stefan?
> >>
> >> We'll need to consider this within Subversion as well.  If we start
> >> supporting compressed bases, we might want to cache non-compressed
> >> copies of some of them for faster access.
> >>
> >> A caller of svn_wc_translated_file2() has some control over its output
> >> file lifetime with a default behaviour of delete-on-pool-cleanup and a
> >> SVN_WC_TRANSLATE_NO_OUTPUT_CLEANUP flag to avoid that.  However that
> >> only applies when it's making a new copy of the file.  The behaviour
> >> when it returns a reference to the shared file is undefined.  I've
> >> updated the doc string in r983593 to say so.  Thus this function also
> >> needs attention.
> >
> > The whole point of the pristine store is that the location and
> > encoding of pristine contents should be transparent to anybody above
> > libsvn_wc.  While making libsvn_wc much easier to maintain and extend,
> > going that route apparently decreases performance for clients such as
> > TortoiseSVN by taking away some of the shortcuts they were exploiting.
> >   We need to decide if exposing these shortcuts (and adding the
> > associated code complexity) is worth it.
> >
> > In the potential scenario where a user has opted for no pristines or
> > compressed pristines, how would TortoiseSVN maintain its current
> > functionality without waiting on decompression or network roundtrips?
> 
> I think users will understand if they decide to have their BASE 
> compressed or not stored locally, that such diffs will take longer.

Yes, exactly.

> If I may suggest how the API(s) should look and work:
> 
> * rename svn_wc_translated_file2() to svn_wc_translated_filestream() 
> since it doesn't return a file but a file stream. File streams can only 
> be used by svn clients directly, they can't be passed on to external 
> applications so their use is limited.

No, svn_wc_translated_file2() already creates a *file* and returns the
path to that file.  Maybe you're thinking of svn_wc_translated_stream()
which also exists and was also deprecated recently.

> * new API svn_wc_translated_file3():
>svn_client_translated_file(const char ** resultfile,
>svn_boolean_t * copyCreated,
>const char *  wcfile,
>svn_boolean_t deleteOnPoolCleanup,
>apr_uint32_t  flags,
>apr_pool_t*   pool)
> 
> the flags would be the same as now:

Re: NODE_DATA (2nd iteration)

2010-08-12 Thread Neels J Hofmeyr
I did draw some sort of chart of NODE_DATA in wc-metadata.sql...

[[[
   For illustration, with a scenario like this:

 # (0)
 svn rm foo
 svn cp ^/moo foo   # (1)
 svn rm foo/bar
 touch foo/bar
 svn add foo/bar# (2)

   , these are the NODE_DATA for the path foo/bar (before single-db, the
   numbering of op_depth is still a bit different):

   (0)  BASE_NODE ->  NODE_DATA (op_depth == 0)
   (1)NODE_DATA (op_depth == 1) ( <_ )
   (2)NODE_DATA (op_depth == 2)   <- WORKING_NODE

   0 is the original data for foo/bar before 'svn rm foo' (if it existed).
   1 is the data for foo/bar copied in from ^/moo/bar. (There would also be
 a WORKING_NODE for the path foo, with original_* pointing at ^/moo.)
   2 is the to-be-committed data for foo/bar, created by 'svn add foo/bar'.

   An 'svn revert foo/bar' would remove the NODE_DATA of (2) (and possibly
   rewire the WORKING_NODE to represent a child of the operation (1)).
   So foo/bar would be a copy of ^/moo/bar again.
]]]

So there's always at most one working node and at most one base node per
path. While working node rows get "overwritten" with operations done on
child paths, the intermediate node_data are kept until commit or revert...

That's about all I know of it.

Been thinking on those last_modtime and translated_size columns that are
kept in both BASE_NODE and WORKING_NODE. I believe this duplication shows
that detecting modifications in the local file system is a different concept
from base/working. We should have a separate table for that instead
of doing "is-there-a-working-node?-no?-then-look-in-the-base"
we do that kind of looking up anyway and having a little 'if' to
select the proper modtime/size doesn't hurt much

(Am not having a strong opinion, just brainstorming...)


On 2010-08-10 18:18, Julian Foad wrote:
> Any responses would be greatly appreciated.

I've had so many family and summer events in the past weeks... soon I'll
start forgetting my passwords!

~Holiday-Neels

> 
> - Julian
> 
> 
> On Tue, 2010-08-03, Julian Foad wrote:
>> On Mon, 2010-07-12, Erik Huelsmann wrote:
>>> After lots of discussion regarding the way NODE_DATA/4th tree should
>>> be working, I'm now ready to post a summary of the progress. In my
>>> last e-mail (http://svn.haxx.se/dev/archive-2010-07/0262.shtml) I
>>> stated why we need this; this post is about the conclusion of what
>>> needs to happen. Also included are the first steps there.
>>>
>>>
>>> With the advent of NODE_DATA, we distinguish node values specifically
>>> related to BASE nodes, those specifically related to "current" WORKING
>>> nodes and those which are to be maintained for multiple levels of
>>> WORKING nodes (not only the "current" view) (the latter category is
>>> most often also shared with BASE).
>>>
>>> The respective tables will hold the columns shown below.
>>>
>>>
>>> -
>>> TABLE WORKING_NODE (
>>>   wc_id  INTEGER NOT NULL REFERENCES WCROOT (id),
>>>   local_relpath  TEXT NOT NULL,
>>>   parent_relpath  TEXT,
>>>   moved_here  INTEGER,
>>>   moved_to  TEXT,
>>>   original_repos_id  INTEGER REFERENCES REPOSITORY (id),
>>>   original_repos_path  TEXT,
>>>   original_revnum  INTEGER,
>>>   translated_size  INTEGER,
>>>   last_mod_time  INTEGER,  /* an APR date/time (usec since 1970) */
>>>   keep_local  INTEGER,
>>>
>>>   PRIMARY KEY (wc_id, local_relpath)
>>>   );
>>>
>>> CREATE INDEX I_WORKING_PARENT ON WORKING_NODE (wc_id, parent_relpath);
>>> 
>>>
>>> The moved_* and original_* columns are typical examples of "WORKING
>>> fields only maintained for the visible WORKING nodes": the original_*
>>> and moved_* fields are inherited from the operation root by all
>>> children part of the operation. The operation root will be the visible
>>> change on its own level, meaning it'll have rows both in the
>>> WORKING_NODE and NODE_DATA tables. The fact that these columns are not
>>> in the WORKING_NODE table means that tree changes are not preserved
>>> accros overlapping changes. This is fully compatible with what we do
>>> today: changes to higher levels destroy changes to lower levels.
>>>
>>> The translated_size and last_mod_time columns exist in WORKING_NODE
>>> and BASE_NODE; they explicitly don't exist in NODE_DATA. The fact that
>>> they exist in BASE_NODE is a bit of a hack: it's to prevent creation
>>> of WORKING_NODE data for every file which has keyword expansion or eol
>>> translation properties set: these columns serve only to optimize
>>> working copy scanning for changes and as such only relate to the
>>> visible WORKING_NODEs.
>>>
>>
>> Can we come up with an English description of what each table will now
>> represent?
>>
>> "The BASE_NODE table lists the existing node-revs in the repository that
>> comprise the mixed-revision tree that was most recently updated/switched
>> to or checked out.  (The kind and content of these nodes is not here;

Re: svnrdump: The BIG update

2010-08-12 Thread Daniel Shahaf
Ramkumar Ramachandra wrote on Thu, Aug 12, 2010 at 12:17:34 +0530:
> > > The dump functionality is also complete- thanks to Stefan's review and
> > > MANY others for cleaning it up. It's however hit a brick wall now
> > > because of missing headers in the RA layer. Until I (or someone else)
> > > figures out how to fix the RA layer, we can't do better than the XFail
> > > copy-and-modify test I've committed.
> > 
> > Part of the diff there is lack of SHA-1 headers --- which is unavoidable
> > until editor is revved --- but part of it is a missing Text-copy-source-md5.
> > Why don't you output that information --- doesn't the editor give it to you?
> 
> Afaik, no. I don't see Text-copy-source-* anywhere in the RA
> layer. Maybe I'm not looking hard enough?
> 

Hmm.  It seems you're right.  So you might have to use two RA session in
parallel...

(and then, you might have to have the user authenticate twice?)

> > > - More optimizations. Since svnrdump is already so fast compared to
> > >   the other tools, I think we can squeeze some more speed out of it.
> > > - Huge documentation effort. svnrdump is a hack- I just did what I
> > >   felt like and got it to work somehow. It's very unlike svnmucc,
> > >   which does things by the book.
> > > - Build more infrastructure around svnrdump- I've mostly used existing
> > >   SVN API. Although a lot of new functions were suggested, I never
> > >   really got down to writing them.
> > 
> > Yep.  There was also talk of moving some of the logic into the libraries ---
> > where does that stand?
> 
> Yeah, I haven't started working on this yet. I'll need some guidance
> for this- I have to sketch out a roadmap and ask for access to the
> specified regions or branch; planning is something I'm not used to at
> all :p
> 

:-)

> > > - Make dumpfile v3 the de-facto standard and improve it for optimized
> > >   loading/ generation. The former part was suggested by Stefan.
> > > - Integrate it into svnadmin etc as appropriate. I think there's
> > >   enough work here for a mini-GSoC project?
> > 
> > How would it be integrated into svnadmin?  Do you want to push the logic
> > into the standard 'svnadmin dump' command?
> 
> This is something I haven't given thought either. I brought it up
> because of an earlier discussion in which everyone seemed to be in
> favor of NOT having a new command. It feels like we're stuffing a lot
> of functionality into one tool though.
> 

Personally I also like having svnadmin operates only locally (so it doesn't
even link against libsvn_ra), but that was hashed out already on that
moderately-long thread a few weeks ago.

> > > - GitHub support (?) -- I saw this discussed on IRC somewhere, but I
> > >   didn't understand this myself. Can someone clarify?
> > > 
> > 
> > Joke.  GitHub implemented a mod_dav_svn interface to their repositories [1],
> > so it's now possible (if their implementation is sound) to generate an svn
> > dump of a GitHub git repository.
> 
> Ah, yes. I'm aware. With the infrastructure I've written on the Git
> end (incomplete), the SVN <-> Git bidirectional bridge should be
> seamless and awesome :)
> 
> Note: I'll be visiting home this weekend (that means: mostly
> travelling). I'll be back to hack next week.
> 

> -- Ram