Any idea why public function like svn_fspath__dirname have double __ in its name?

2010-12-28 Thread Kamesh Jayachandran

Hi All,

I could see the following functions in our public header with '__' in 
their names,


svn_fspath__is_canonical
svn_fspath__dirname
svn_fspath__basename
svn_fspath__split
svn_fspath__join
svn_fspath__is_child
svn_fspath__skip_ancestor
svn_fspath__is_ancestor
svn_fspath__get_longest_ancestor
svn_error__locate

svn_error__malfunction

IIUC __ convention is used for semi-public functions where we may not 
keep BC promise.


Do I miss something?

With regards
Kamesh Jayachandran


Merge tracking between several parallel branches

2010-12-28 Thread Fredrik Orderud
This question has already been posted on the users list, but I did
not receive any answer there.

I have some questions regarding the status of the merge tracking
functionality in subversion 1.6.  Based on my understanding, it seems
like svn has good “automatic” support for merging between branches in
one direction, whereas cyclic/bidirectional merging requires a more
complex “reintegrate” + “record only” merge procedure. I am wondering
if this support also extends to scenarios with repeated merging
between several parallel release branches.

Does svn handle arbitrary directed acyclic merging between several
parallel branches, when merging is always conducted in the same
direction (e.g. from old to newer release branch)?

Does the “reintegrate” + “record only” merge procedure extend
gracefully to scenarios with arbitrary cyclic/bidirectional merging
between several parallel branches?

Thanks in advance,
Fredrik Orderud


Re: Default commandline args

2010-12-28 Thread Stefan Fuhrmann

On 20.12.2010 23:28, Hyrum K. Wright wrote:

In issue #3765, I suggest the possibility of a CVS-style config file
wherein a user could specify a set of default arguments for a given
subcommand.  For example, 'svn diff' would default to 'svn diff -x -p'
if the config file had an entry for such.  Thinking this would be an
SMOP, I jumped in...only to discover that it isn't.

The actual implementation isn't hard, I'm just wondering about where
the configuration should live.  We have a ~/.subversion/config file,
but these options aren't for the client library, they are specific to
the commandline client.  This makes me think that we should have a
separate file for the commandline client, and that it may want to be
something like ~/.svnrc or something equally commandline-ish.

Thoughts?

-Hyrum

http://subversion.tigris.org/issues/show_bug.cgi?id=3765

I think this kind of feature can be useful and since
it is a UI optimization specific / local to a particular
client, I don't see why it should not be implemented.

TSVN, for instance, uses its own implementation
to manage user preferences and these are often
quite specific to TSVN's UI needs. So, there is
little room for or potential benefit of reuse across
various client implementations.

Not to limit the kind of information a particular
client could store in its config file (e.g. multiple
sections per client instead of one), I would prefer
separate config files for each client.

The number of clients would be  1 and I would
not like to see my home folder being cluttered by
tons of .svnrc, .svnadminrc ... files. Keep them in
~/.subversion/ .

-- Stefan^2.



How is mixed authentication/anonymous access implemented

2010-12-28 Thread Avalon

Hi,

SVN features a mixed authentication/anonymous access (see 
http://svnbook.red-bean.com/nightly/en/svn.serverconfig.httpd.html#svn.serverconfig.httpd.authz.perdir.ex-3).

I want to achieve the same functionality using a PHP script: allow anonymous access until accessing some special content and than request authentification which should be checked according to a 
htaccess-file.

As far as i understand the SVN example the authentification is performed by the 
Apache modules.

I configured the .htaccess file to look similar:
  Order allow,deny
  Allow from all
  AuthType Basic
  AuthName Realm
  AuthUserFile /path/to/.htusers
  require valid-user
  Satisfy any

Additionally a PHP script is inside the same folder.
When you now browse to the URL of the PHP script, you can access it without any 
credentials requested.

At some point the PHP script decides that authentification is required (e.g. when 
passing a param like ?need-auth=1).
I suppose this is similar to how the mixed authentication/anonymous access in 
SVN works (?).

Therefore it sends the following two headers:
  WWW-Authenticate: Basic realm=Realm
  HTTP/1.x 401 Unauthorized

Then the user is asked to insert username/password for the basic auth.
But now comes the problem:
The apache will ALWAYS let the user pass as anonymous access is always granted.
I suppose the webserver does not even try to authenticate the user credentials.
Therefore it is not possible to decide in PHP if the user is anonymous or has 
been successfully authenticated.

How is this performed in SVN for the mixed authentication/anonymous access?

What i do not want is:
- check the credentials in PHP (due to the many different auth-methods which 
could be configured with Apache)
- have a dummy anonymous user like guest with password guest
- split anonymous and authenticated parts in separate folders (to use separate 
.htaccess-files)

I hope to get some enlightenment from the way SVN realizes this feature.

Any feedback is highly appreciated.

Thank you
Dirk


Re: Safe to add a field to svn_error_t?

2010-12-28 Thread Stefan Fuhrmann

On 24.12.2010 19:32, Blair Zajac wrote:

On 12/24/10 6:07 AM, Hyrum K Wright wrote:
On Fri, Dec 24, 2010 at 12:10 AM, Blair Zajacbl...@orcaware.com  
wrote:
The addition of tracing to svn_error_t in 1.7 currently uses a 
special
message string traced call.  Instead of doing this, can we safely 
add an

svn_boolean_t to svn_error_t the indicates if it is a traced error?

While svn_error_t is public, I'm hoping most people are creating 
them with
svn_error_create*(), so if that's the case, then code written and 
compiled

against svn= 1.6 would work fine when linked against a 1.7 library.

Thoughts?


We don't explicitly say only use approved methods to create this
object, but we are pretty implicit about it, with an entire
documentation group for Error creation and destruction.  I'm on the
fence on this, but would lean toward go ahead and extend the struct
if pressured.

What problem are you trying to solve?


There's this comment in svn_error__is_tracing_link():

svn_boolean_t
svn_error__is_tracing_link(svn_error_t *err)
{
#ifdef SVN_ERR__TRACING
  /* ### A strcmp()?  Really?  I think it's the best we can do unless
 ### we add a boolean field to svn_error_t that's set only for
 ### these placeholder error chain items.  Not such a bad idea,
 ### really...  */
  return (err  err-message  !strcmp(err-message, SVN_ERR__TRACED));
#else
  return FALSE;
#endif
}


What is the use-case for this function anyways?

IMHO,  an svn_error_t says There was problem
$message when executing the instruction at $file,
$line. To be more precise it was inner error info
follows.

The fact that most error messages are just traced
call simply means we did not bother to give more
specific descriptions. But I don't see a real semantic
difference because everybody along the calling
chain may have contributed to the problem.

At best one might use this function to reduce the
length of the error report hoping that all the default
text in between are not relevant to the particular
scenario.


I thought of two other solutions:

1) a sentinel message and we compare by address.
2) negative source file line numbers.


To me, they seems at least as hacky as the
current solution. I could easily live with them but
so can I with the current code.

-- Stefan^2.


Re: Merge tracking between several parallel branches

2010-12-28 Thread Stefan Sperling
On Tue, Dec 28, 2010 at 01:08:47PM +0100, Fredrik Orderud wrote:
 This question has already been posted on the users list, but I did
 not receive any answer there.

Hi Fredrik,

The dev@ list is not an escalation path for unanswered questions from
us...@.  I'd suggest that you try posting to users@ again. Maybe it
just happened that nobody had time to answer when you sent your
original question. You can reply to your own users@ post to raise
awareness of the fact that you're still looking for an answer.

Good luck,
Stefan


Re: [PATCH] fixes some path related deprecation warnings

2010-12-28 Thread Kamesh Jayachandran

Prabhu

--- subversion/libsvn_repos/commit.c(revision 1053267)
+++ subversion/libsvn_repos/commit.c(working copy)
@@ -231,7 +231,7 @@
   svn_node_kind_t kind;
   svn_revnum_t cr_rev;
   svn_repos_authz_access_t required = svn_authz_write;
-  const char *full_path = svn_path_join(eb-base_path, path, pool);

+  const char *full_path = svn_dirent_join(eb-base_path, path, pool);


I think you need to use svn_fspath__join

Same in other parts of this patch.

Because eb-basepath is a path inside fs as per the below snipped comment.

snip
  /* Location in fs where the edit will begin. */
  const char *base_path;
/snip

With regards
Kamesh Jayachandran



Re: Any idea why public function like svn_fspath__dirname have double __ in its name?

2010-12-28 Thread C. Michael Pilato
On 12/28/2010 04:30 AM, Kamesh Jayachandran wrote:
 Hi All,
 
 I could see the following functions in our public header with '__' in their
 names,
 
 svn_fspath__is_canonical
 svn_fspath__dirname
 svn_fspath__basename
 svn_fspath__split
 svn_fspath__join
 svn_fspath__is_child
 svn_fspath__skip_ancestor
 svn_fspath__is_ancestor
 svn_fspath__get_longest_ancestor
 svn_error__locate
 
 svn_error__malfunction
 
 IIUC __ convention is used for semi-public functions where we may not keep
 BC promise.
 
 Do I miss something?

While we've mandated that __ must be used for semi-public functions, we've
never said that it can't be used for private ones.  One could argue that the
use of __ for even private functions avoids a required rename if the
function is made semi-public, and adds information.  For example, I know
that most of the functions you list above are FS-related and probably
logically related.  But I wouldn't suggest a universal policy change at this
stage.

-- 
C. Michael Pilato cmpil...@collab.net
CollabNet  www.collab.net  Distributed Development On Demand



signature.asc
Description: OpenPGP digital signature


Re: [RFC] diff-optimizations-bytes branch: avoiding function call overhead (?)

2010-12-28 Thread Johan Corveleyn
On Fri, Dec 24, 2010 at 3:40 PM, Stefan Fuhrmann eq...@web.de wrote:
 On 20.12.2010 02:43, Johan Corveleyn wrote:

 On Wed, Dec 15, 2010 at 10:58 AM, Stefan Fuhrmanneq...@web.de  wrote:

 On 15.12.2010 02:30, Stefan Fuhrmann wrote:

 On 14.12.2010 23:35, Johan Corveleyn wrote:

 Thoughts?

 Two things you might try:
 * introduce a local variable for afile[i]
 * replace the for() loop with two nested ones, keeping
  calls to other functions out of the hot spot:

 for (i=0; i  file_len;)

 That should read:
 for (i=0; i  file_len; i++)

 {
  /* hot spot: */
  for(; i  file_len; i++)
  {
    curFile = afile[i];
    if (curFile-chunk==-1)
      curFile-chunk = 0;
    else if (curFile-curp != curFile-endp -1)
      curFile-curp++;
    else
      break;
  }

  if (i  file_len)
  {
    /* the complex, rarely used stuff goes here */
  }
 }

 Ok, I tried this, but it didn't really help. It's still only about as
 fast as before. While the macro version is about 10% faster (I made a
 cleaner macro version, only macro'ing the hotspot stuff, with a
 function call to the complex, rarely used stuff).

 For the inline version, I tried several variations (always with
 APR_INLINE in the function signature, and with local variable for
 file[i]): with or without the nested for loop, with or without the
 complex stuff factored out to a separate function, ... it made no
 difference.

 Maybe I'm still doing something wrong, keeping the compiler from
 inlining it (but I'm not really a compiler expert, maybe I have to add
 some compiler options or something, ...). OTOH, I now have a macro
 implementation which is quite readable (and which uses the do ...
 while(0) construct - thanks Peter), and which does the trick. So maybe
 I should simply stick with that option ...

 The key factor here is that file_len is only 2.
 Basically, that will make my optimization a
 pessimization.

 Assuming that for most calls, curp of *both*
 files will be somewhere *between* BOF and
 EOF, the unrolling the loop may help:

 #define INCREMENT_POINTERS

 /* special code for the common case. 10 .. 12 ticks per execution */

 static APR_INLINE svn_boolean_t
 quickly_increment_pointers(struct file_info *afile[])
 {
 struct file_info *file0 = afile[0];
 struct file_info *file1 = afile[1];
 if ((afile0-chunk != -1)  (afile1-chunk != -1))
  {
    /* suitable_type */ nextp0 = afile0-curp + 1;
    /* suitable_type */ nextp1 = afile1-curp + 1;
    if ((nextp0 != afile0-endp)  (nextp1 != afile1-endp))
      {
        afile0-curp = nextp0;
        afile1-curp = nextp1;
        return TRUE;
      }
  }
 return FALSE;
 }

 /* slow code covering all special cases */

 static svn_error_t*
 slowly_increment_pointers(struct file_info *file[], apr_size_t file_len,
 apr_pool_t *pool)
 {
  for (i = 0; i  file_len;i++)
 ...
 }

 /* maybe as macro: */
 return ((file_len == 2)  quickly_increment_pointers (file))
  ? SVN_NO_ERROR
  : slowly_increment_pointers(file, file_len, pool);

Nice! I will try this out sometime soon, but I'm not so sure it will
help more than the current macro version (only macro for the critical
section, calling a function for increment/decrement chunk - see
diff_file.c in HEAD of diff-optimizations-bytes branch). I guess I'll
just have to test it.

With the macro implementation in place my gut feeling is that the
prefix/suffix scanning is reasonably optimal, and that the remaining
time spent in the diff code (70 seconds for my example blame of big
xml file with 2200 changes) is almost all due to the original diff
algorithm (working on the remainder between prefix and suffix). But to
be sure I should probably measure that first.

(I should note here that, with my example file, the prefix/suffix
scanning doesn't always work optimally, because in about half of the
revisions there is a change both in one of the first lines (where an
author name is inserted automatically by XML Spy upon saving it), and
then somewhere in the middle where the actual change is done. It's
really a pity, because that means about 2-3 lines cannot be
skipped as prefix.)

 Minor remark on C style:

 static APR_INLINE svn_error_t *
 increment_pointers(struct file_info *file[], int file_len, apr_pool_t
 *pool)

 file_len should be apr_size_t unless it can assume negative
 values in which case it should be apr_ssize_t.That way, you
 know for sure it will never overflow. Int is potentially too short
 (although extrmly unlikely in this case) and it is signed
 (often not appropriate for an index value).

Thanks. I didn't know that. I'll change it.

 One more thing that might help speed up your code. The fact
 that the calling overhead of a single  simple function turns
 out to be a significant contribution to the overall runtime tells
 us that the execution is tightly CPU-bound and squeezing
 some extra cycles out of it could help. Try

 struct file_info file[] instead of struct file_info *file[]

 i.e. array of structs instead of array of pointers. The boring
 background 

Re: Any idea why public function like svn_fspath__dirname have double __ in its name?

2010-12-28 Thread Peter Samuelson

[C. Michael Pilato]
  svn_fspath__is_canonical
  svn_fspath__dirname
  svn_fspath__basename
  svn_fspath__split
  svn_fspath__join
  svn_fspath__is_child
  svn_fspath__skip_ancestor
  svn_fspath__is_ancestor
  svn_fspath__get_longest_ancestor
  svn_error__locate
  
  svn_error__malfunction

 While we've mandated that __ must be used for semi-public
 functions, we've never said that it can't be used for private ones.

Kamesh is talking about public, not private, functions.  I.e., ones
where we actually do have an ABI promise to keep.

I looked at svn_error__locate last week.  It's really only useful in
--enable-maintainer-mode, but the way it's implemented, it ends up
being used by macros in the public (non-maintainer) ABI, so even if we
eliminate those callers, we have to supply the function itself forever.
I've addressed this as best we can, in r1053469.

Supplying a given ABI forever is the sort of thing I thought we didn't
have to do for __ functions.  I think that's Kamesh's question too.
-- 
Peter Samuelson | org-tld!p12n!peter | http://p12n.org/


Re: svn commit: r1053461 - /subversion/site/publish/doap.rdf

2010-12-28 Thread Mark Phippard
On Tue, Dec 28, 2010 at 5:00 PM,  hwri...@apache.org wrote:
 Author: hwright
 Date: Tue Dec 28 22:00:37 2010
 New Revision: 1053461

 URL: http://svn.apache.org/viewvc?rev=1053461view=rev
 Log:
 First hack at a DOAP file for Subversion, as requested by the Powers That Be
 at the ASF.  (But I think it's a Good Thing, too. ;)

 This was autogenerated by http://projects.apache.org/create.html

 * publish/doap.rdf:
  New.

 Added:
    subversion/site/publish/doap.rdf   (with props)

 Added: subversion/site/publish/doap.rdf
 URL: 
 http://svn.apache.org/viewvc/subversion/site/publish/doap.rdf?rev=1053461view=auto
 ==
 --- subversion/site/publish/doap.rdf (added)
 +++ subversion/site/publish/doap.rdf Tue Dec 28 22:00:37 2010
 @@ -0,0 +1,51 @@
 +?xml version=1.0?
 +?xml-stylesheet type=text/xsl?
 +rdf:RDF xml:lang=en
 +         xmlns=http://usefulinc.com/ns/doap#;
 +         xmlns:rdf=http://www.w3.org/1999/02/22-rdf-syntax-ns#;
 +         xmlns:asfext=http://projects.apache.org/ns/asfext#;
 +         xmlns:foaf=http://xmlns.com/foaf/0.1/;
 +!--
 +    Licensed to the Apache Software Foundation (ASF) under one or more
 +    contributor license agreements.  See the NOTICE file distributed with
 +    this work for additional information regarding copyright ownership.
 +    The ASF licenses this file to You under the Apache License, Version 2.0
 +    (the License); you may not use this file except in compliance with
 +    the License.  You may obtain a copy of the License at
 +
 +         http://www.apache.org/licenses/LICENSE-2.0
 +
 +    Unless required by applicable law or agreed to in writing, software
 +    distributed under the License is distributed on an AS IS BASIS,
 +    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 +    See the License for the specific language governing permissions and
 +    limitations under the License.
 +--
 +  Project rdf:about=http://subversion.apache.org/;
 +    created2010-12-28/created
 +    license rdf:resource=http://usefulinc.com/doap/licenses/asl20; /
 +    nameApache Subversion/name
 +    homepage rdf:resource=http://subversion.apache.org/; /
 +    asfext:pmc rdf:resource=http://subversion.apache.org; /
 +    shortdescEnterprise-class centralized version control for the 
 masses/shortdesc
 +    descriptionSubversion exists to be universally recognized and adopted 
 as an open-source, centralized version control system characterized by its 
 reliability as a safe haven for valuable data; the simplicity of its model 
 and usage; and its ability to support the needs of a wide variety of users 
 and projects, from individuals to large-scale enterprise 
 operations./description
 +    bug-database rdf:resource=http://subversion.tigris.org/issues/; /

Couldn't we just use this URL for bug-database?
http://subversion.apache.org/issue-tracker.html


 +    mailing-list 
 rdf:resource=http://subversion.apache.org/mailing-lists.html; /
 +    download-page 
 rdf:resource=http://subversion.apache.org/source-code.html; /
 +    programming-languageC/programming-language
 +    category 
 rdf:resource=http://projects.apache.org/category/build-management; /

They do not have version control or SCM as a category to use?  I do
not think of Build Management as category for Subversion.



-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/


[Be|Na]gging for review

2010-12-28 Thread Stefan Fuhrmann

Hi devs,

I would really like the core improvement of my
work on SVN performance to become part of
1.7: the in-process fulltext cache.

While the risk is low (we already offer something
similar based on memcached), the caching code
is so fundamental to most other pending changes,
that I want to hear at least a second opinion:

^/subversion/branches/integrate-cache-membuffer

Take this as an opportunity for a last nightly deed
(Ny!) in 2010 or a new year's resolution.

-- Stefan^2.


Re: FSFS format 6 (was: Re: FSv2)

2010-12-28 Thread Johan Corveleyn
On Sun, Dec 12, 2010 at 4:23 PM, Stefan Fuhrmann
stefanfuhrm...@alice-dsl.de wrote:
 On 19.10.2010 15:10, Daniel Shahaf wrote:

 Greg Stein wrote on Tue, Oct 19, 2010 at 04:31:42 -0400:

 Personally, I see [FSv2] as a broad swath of API changes to align our
 needs with the underlying storage. Trowbridge noted that our current
 API makes it *really* difficult to implement an effective backend. I'd
 also like to see a backend that allows for parallel PUTs during the
 commit process. Hyrum sees FSv2 as some kind of super-key-value
 storage with layers on top, allowing for various types of high-scaling
 mechanisms.

 At the retreat, stefan2 also had some thoughts about this...

 [This is just a brain-dump for 1.8+]

 While working on the performance branch I made some
 observations concerning the way FSFS organizes data
 and how that could be changed for reduced I/O overhead.

 notes/fsfs-improvements.txt contains a summary of that
 could be done to improve FSFS before FS-NG. A later
 FS-NG implementation should then still benefit from the
 improvements.

+(number of fopen calls during a log operation)

I like this proposal a lot. As I already told before, we are running
our FSFS back-end on a SAN over NFS (and I suspect we're not the only
company doing this). In this environment, the server-side I/O of SVN
(especially the amount of random reads and fopen calls during e.g.
log) is often the major bottleneck.

There is one question going around in my head though: won't you have
to change/rearrange a lot of the FS layer code (and maybe repos
layer?) to benefit from this new format?

The current code is written in a certain way, not particularly
optimized for this new format (I seem to remember log does around 10
fopen calls for every interesting rev file, each time reading a
different part of it). Also, if an operation currently needs to access
many revisions (like log or blame), it doesn't take advantage at all
of the fact that they might be in a single packed rev file. The pack
file is opened and seeked in just as much as the sum of the individual
rev files.

So: how will the current code be able to take advantage of this new
format? Won't this require a major effort to restructure that code?

(This reminds me of the current difficulty (as I can see it, as an
innocent bystander) with the WC-NG rewrite: theoretically it should be
very fast, but the higher level code is still largely based upon the
old principles. So to take advantage of it, certain things have to be
changed at the higher level, making operations work dir-based or
tree-based, instead of file-based etc).

Cheers,
-- 
Johan


Re: svn commit: r1053461 - /subversion/site/publish/doap.rdf

2010-12-28 Thread Hyrum K Wright
On Tue, Dec 28, 2010 at 5:36 PM, Mark Phippard markp...@gmail.com wrote:
 On Tue, Dec 28, 2010 at 5:00 PM,  hwri...@apache.org wrote:
 Author: hwright
 Date: Tue Dec 28 22:00:37 2010
 New Revision: 1053461

 URL: http://svn.apache.org/viewvc?rev=1053461view=rev
 Log:
 First hack at a DOAP file for Subversion, as requested by the Powers That Be
 at the ASF.  (But I think it's a Good Thing, too. ;)

 This was autogenerated by http://projects.apache.org/create.html

 * publish/doap.rdf:
  New.

 Added:
    subversion/site/publish/doap.rdf   (with props)

 Added: subversion/site/publish/doap.rdf
 URL: 
 http://svn.apache.org/viewvc/subversion/site/publish/doap.rdf?rev=1053461view=auto
 ==
 --- subversion/site/publish/doap.rdf (added)
 +++ subversion/site/publish/doap.rdf Tue Dec 28 22:00:37 2010
 @@ -0,0 +1,51 @@
 +?xml version=1.0?
 +?xml-stylesheet type=text/xsl?
 +rdf:RDF xml:lang=en
 +         xmlns=http://usefulinc.com/ns/doap#;
 +         xmlns:rdf=http://www.w3.org/1999/02/22-rdf-syntax-ns#;
 +         xmlns:asfext=http://projects.apache.org/ns/asfext#;
 +         xmlns:foaf=http://xmlns.com/foaf/0.1/;
 +!--
 +    Licensed to the Apache Software Foundation (ASF) under one or more
 +    contributor license agreements.  See the NOTICE file distributed with
 +    this work for additional information regarding copyright ownership.
 +    The ASF licenses this file to You under the Apache License, Version 2.0
 +    (the License); you may not use this file except in compliance with
 +    the License.  You may obtain a copy of the License at
 +
 +         http://www.apache.org/licenses/LICENSE-2.0
 +
 +    Unless required by applicable law or agreed to in writing, software
 +    distributed under the License is distributed on an AS IS BASIS,
 +    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 +    See the License for the specific language governing permissions and
 +    limitations under the License.
 +--
 +  Project rdf:about=http://subversion.apache.org/;
 +    created2010-12-28/created
 +    license rdf:resource=http://usefulinc.com/doap/licenses/asl20; /
 +    nameApache Subversion/name
 +    homepage rdf:resource=http://subversion.apache.org/; /
 +    asfext:pmc rdf:resource=http://subversion.apache.org; /
 +    shortdescEnterprise-class centralized version control for the 
 masses/shortdesc
 +    descriptionSubversion exists to be universally recognized and adopted 
 as an open-source, centralized version control system characterized by its 
 reliability as a safe haven for valuable data; the simplicity of its model 
 and usage; and its ability to support the needs of a wide variety of users 
 and projects, from individuals to large-scale enterprise 
 operations./description
 +    bug-database rdf:resource=http://subversion.tigris.org/issues/; /

 Couldn't we just use this URL for bug-database?
 http://subversion.apache.org/issue-tracker.html

Sure could. :)

(I'll probably get around to that at some point, but if anybody else
wants to beat me to it, feel free.)



 +    mailing-list 
 rdf:resource=http://subversion.apache.org/mailing-lists.html; /
 +    download-page 
 rdf:resource=http://subversion.apache.org/source-code.html; /
 +    programming-languageC/programming-language
 +    category 
 rdf:resource=http://projects.apache.org/category/build-management; /

 They do not have version control or SCM as a category to use?  I do
 not think of Build Management as category for Subversion.

From http://projects.apache.org/categories.html:
build-managementProjects related to building/maintaining source 
code/websites.

That seems to fit the Subversion bill pretty well.  But if we'd like,
we can include additional categories as well.

-Hyrum


Re: svn commit: r1053461 - /subversion/site/publish/doap.rdf

2010-12-28 Thread Mark Phippard
On Tue, Dec 28, 2010 at 8:13 PM, Hyrum K Wright hy...@hyrumwright.org wrote:
 +    category 
 rdf:resource=http://projects.apache.org/category/build-management; /

 They do not have version control or SCM as a category to use?  I do
 not think of Build Management as category for Subversion.

 From http://projects.apache.org/categories.html:
 build-management        Projects related to building/maintaining source 
 code/websites.

 That seems to fit the Subversion bill pretty well.  But if we'd like,
 we can include additional categories as well.

Based on the list of categories, I can see why you chose that one, but
I think we need to create a new category.  Subversion is not a build
management tool and it does not make sense to lump it in with Maven,
Ant, Ivy etc.

I think version control is probably the best category, but if the goal
seems to be to find ways to group Apache projects together then maybe
we should add SCM?  There are other Apache projects like Maven that
could fit in that category.

We could also add httpd-module as one of our categories.  Technically
content and library would also probably be valid, but these are really
all secondary categories IMO.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/


Re: [Be|Na]gging for review

2010-12-28 Thread Stefan Fuhrmann

On 29.12.2010 01:33, Stefan Fuhrmann wrote:

Hi devs,

I would really like the core improvement of my
work on SVN performance to become part of
1.7: the in-process fulltext cache.

While the risk is low (we already offer something
similar based on memcached), the caching code
is so fundamental to most other pending changes,
that I want to hear at least a second opinion:

^/subversion/branches/integrate-cache-membuffer

Take this as an opportunity for a last nightly deed

*k*nightly (which stresses even more that _nightly_
is exactly what we probably don't want)

(Ny!) in 2010 or a new year's resolution.

-- Stefan^2.




Re: [PATCH] Correct documentation for svn_repos_node_editor and friends

2010-12-28 Thread Daniel Shahaf
Erik Johansson wrote on Wed, Dec 22, 2010 at 22:00:22 +0100:
 On Sat, Dec 18, 2010 at 20:43, Daniel Shahaf d...@daniel.shahaf.name wrote:
  Erik Johansson wrote on Tue, Dec 14, 2010 at 19:19:17 +0100:
  [[[
  r846201 added svn_repos_replay as a replacement not so much for
  svn_repos_dir_delta, but for at least a whole class of functionality that
  svn_repos_dir_delta was never intended to handle.
 
  One place where svn_repos_replay replaced svn_repos_dir_delta was in
  combination with svn_repos_node_editor (only used in svnlook). This commit
  updates the documentation for svn_repos_node_editor and friends to reflect
  this.
 
  * subversion/include/svn_repos.h
    (svn_repos_node_editor, svn_repos_node_from_baton): Doc update
  ]]]
 
  The patch basically does s/svn_repos_dir_delta2/svn_repos_replay2/g.
  However, despite the log message, I still don't understand why that is
  a correct change.
 
 According to the log message for r846201 it seems like
 svn_repos_replay is faster than svn_repos_dir_delta for some
 operations. I can only assume that this is one such case.
 

IOW, the patch recommends svn_repos_replay2() because that is newer API.

  Why are we mentioning a specific driver at all?  Can svn_repos_node_editor()
  be driven only by svn_repos_replay2()?
 
  (I don't see why that would be the case.)  If not, then its doc string
  should avoid mention that the returned editor can be driven only by %s,
  when in fact it can be driven by anything.
 
 I don't know the anything about the other drivers, but my assumption
 is that any driver works, but svn_repos_replay is the prefered one in
 this case.
 

*nod*

  We have three mechanisms: svn_repos_dir_delta2(), svn_repos_begin_report2(),
  and svn_repos_replay2().  I think it would be useful to just mention all of
  them --- after all, none of them is deprecated and all of them can represent
  a tree delta using an editor.
 
 Mention all on equal footing or say that svn_repos_replay2 is the preferred 
 one?
 
 I can fix the docs, but I don't know enough about svn api to actually
 determine the correct fix. E.g. do all the drivers call methods in the
 editor in the same order?
 

In theory, yes.  That's the core of the editor API: the
driver only knows the svn_delta_editor_t vtable, and the drivee also
knows only the vtable --- they need not (in general) know each other.

For example, 'svnrdump dump' drives a commit editor; but nothing in the
commit editor knows what dumpfiles are.

 // Erik
 
 -- 
 Erik Johansson
 Home Page: http://ejohansson.se/
 PGP Key: http://ejohansson.se/erik.asc


Re: [PATCH] support printing changed properties in svnlook

2010-12-28 Thread Daniel Shahaf
Erik Johansson wrote on Wed, Dec 22, 2010 at 23:00:43 +0100:
 On Fri, Dec 10, 2010 at 14:29, Daniel Shahaf d...@daniel.shahaf.name wrote:
   +  /** How this property entered the node tree: 'A'dd, 'D'elete, 
   'R'eplace */
   +  char action;
  
   This is copied from svn_repos_node_t-action.  There was recently
   a question about that field:
   http://mid.gmane.org/3abd28aa-a2fc-4d7d-a502-479d37995...@orcaware.com
  
   So, that asks whether 'C'hanged is a valid answer to the question that
   -action is meant to answer.  I'll also ask how this interacts with node
   changes: for example; if r5 replaces-with-history a node that has
   'fooprop' set with another node that also has 'fooprop' set, what would
   the value of 'action' be?
 
  What about this:
  When a node is deleted all the properties it had are listed in
  mod_prop with action D.
 
  When a node is added-with-history all the properties the source had
  are listed in mod_prop with action A and a new flag copyfrom = TRUE.
 
  A replace-with-history will result in two repos_nodes, each having a
  mod_prop list. If the same property exists in both it means it has
  been replaced.
 
 
  (will reply later)
 
   +  /** The name of the property */
   +  const char *name;
  
   Where is the value of the property?  How to get it?
 
  The idea was that the struct should indicate changes to properties,
  not their values. In the same way that svn_repos_node_t shows changes,
  not node content.
 
 
  Flawed analogy: we never store node content in memory, but we do have
  all property values in memory, so the cost is just to add an
  svn_string_t * member to the struct.
 
  My question was how would callers get the value if they cared about it.
  i.e., I assume that whoever calls this function already has a property
  hash (or, at least, an fs object) available that they can get the
  property values from?
 
 I agree that it is simple to add the value to the struct, but in
 svnlook (which is the only current in-tree user of this api) I can't
 see any need to get the value.
 

Yep; but we're designing an API, so I was trying to get it right (ie:
general) the first time.

IMO, if we're tweaking the API and we can have it put the value there
without too much effort, we should do that.  (But since svnlook doesn't
need that, I'm okay with punting on providing the value in the API if
it's more than some threshold effort.)

 How a caller would get the property value if they indeed cared about
 it is outside my knowledge of the api. Perhaps svn_repos_node_editor()
 is not the editor to use if you want the property value?
 

To be honest I don't recall the context exactly, but I imagine that
whoever called the API knows whose properties list it was just given, so
it could just use another API to get the values if needed.

Over RA there are sometimes calling order restrictions, but I don't
recall that we have such issues at the repos layer (which is lower).

   +  /** Pointer to the next sibling property */
   +  struct svn_repos_node_prop_t *sibling;
   +
  
   You use a linked list.  How about using apr_array_header_t *?  Or a hash
   of (prop_name - struct)?
 
  I guess anyone of those would work, but the reason I went for a linked
  list was that svn_repos_node_t did that and I wanted them to be
  similar.
 
 
  Hmm.  I haven't seen that struct before: svn_repos_node_t indeed uses an
  explicit tree structure.
 
  What do you think will be more useful to consumers of the API?  A hash
  allows both random access and iteration --- do APR arrays or linked
  lists have advantages that outweigh that?
 
 A hash seemed like overkill, but if you think it is better I have no
 problem using that.
 

I'm used to seeing 'apr_hash_t *props'.

I think an APR array (apr_array_header_t) would be better than an
explicit linked list, because we can use library functions rather than
rewrite them.  Would an apr_hash_t be better still?  Possibly, but I'm
not feeling strongly on that.  (It's not needed for svnlook's use case,
and it might be a bit costlier.)

 // Erik
 
 -- 
 Erik Johansson
 Home Page: http://ejohansson.se/
 PGP Key: http://ejohansson.se/erik.asc


Re: [PATCH] support printing changed properties in svnlook

2010-12-28 Thread Daniel Shahaf
[ The second hunk below has a discussion regarding the representation
in the API of 'svn ps k v iota; svn ci iota; svn cp iota iota2;
svn ps k2 v2 iota2; svn ci iota2', which I'd be happy to have further
input on. ]

Erik Johansson wrote on Wed, Dec 22, 2010 at 23:08:30 +0100:
 On Fri, Dec 10, 2010 at 15:08, Daniel Shahaf d...@daniel.shahaf.name wrote:
  [ I'm somewhat confused about this issue. ]
 
  Erik Johansson wrote on Thu, Dec 09, 2010 at 21:41:38 +0100:
  On Wed, Dec 8, 2010 at 23:22, Daniel Shahaf d...@daniel.shahaf.name 
  wrote:
   Erik Johansson wrote on Wed, Dec 08, 2010 at 17:17:47 +0100:
   +  /** How this property entered the node tree: 'A'dd, 'D'elete, 
   'R'eplace */
   +  char action;
 
  I should have caught this earlier, but replaced doesn't make sense
  for properties.  (A node can be replaced by another node at the same
  path, but for a fixed node properties can be added/modified/removed but
  not replaced.)
 
 I'll change it to 'A'dd, 'D'elete or 'U'pdate then?
 

'M'odify, not 'U'pdate, I think.  At least that's what the UI level
does, but if there's precedent for 'U'pdated then follow it.

  
   This is copied from svn_repos_node_t-action.  There was recently
   a question about that field:
   http://mid.gmane.org/3abd28aa-a2fc-4d7d-a502-479d37995...@orcaware.com
  
   So, that asks whether 'C'hanged is a valid answer to the question that
   -action is meant to answer.  I'll also ask how this interacts with node
   changes: for example; if r5 replaces-with-history a node that has
   'fooprop' set with another node that also has 'fooprop' set, what would
   the value of 'action' be?
 
  What about this:
  When a node is deleted all the properties it had are listed in
  mod_prop with action D.
 
  When a node is added-with-history all the properties the source had
  are listed in mod_prop with action A and a new flag copyfrom = TRUE.
 
  There is already a copyfrom indication in svn_repos_node_t-copyfrom_rev,
  I don't think we need another one.
 
 I was thinking that the copyfrom flag would be useful in the case
 where you copy a node with a propery and add another property before
 committing. Here you would have two property with action A and one
 would have copyfrom = TRUE.
 

In other words, you'd like to make a distinction between This node was
added-with-history, and this property came along and ditto, but the
property is new on this copy target and didn't exist in the copy
source.

I see, and I agree it's a good distinction to make.

I'm not sure how/whether our other API's (the editor, possibly the
reporter, others?) represent this.

  A replace-with-history will result in two repos_nodes, each having a
  mod_prop list. If the same property exists in both it means it has
  been replaced.
 
  Two nodes doesn't sound like a good idea; one node marked as 'replaced'
  should suffice.
 
 In the current api a replaced node is represented by two nodes. Should
 we change this?

Doesn't that mean that API callers who want to distinguish a replace
from a remove or an add have to walk through all the child entries (for
a given dir) before they can process any adds or removals?

I'm thinking that one entry is better, because callers can split it if they
wish; that's easier than two entries and letting callers figure out for
themselves how to recombine them.

 Should svnlook stay back-wards compatible and print a
 replaced node twice, once as Deleted and once as Added?
 

I could argue both ways: it should be back-compatible with itself, but
it should also be compatible with the cmdline client where the latter
prints just one R notification.

  In case of a replacement, do we want the API to provide the replaced
  node's properties? (at least their names)
 

I suppose we can agree to answer this as Yes, then? :)

 It would be pretty simple if there are two separate nodes for a
 replaced node. The deleted node will have all the properties it had
 marked as Deleted. The new node will have its properties marked as
 appropiate.
 

That's a straightforward representation, yes.  I suppose we could find
a straightforward in the single node case, though --- eg by adding
fields only used in the case of 'R'eplacements.

 // Erik
 
 -- 
 Erik Johansson
 Home Page: http://ejohansson.se/
 PGP Key: http://ejohansson.se/erik.asc


Re: [Be|Na]gging for review

2010-12-28 Thread Blair Zajac

On 12/28/2010 6:28 PM, Stefan Fuhrmann wrote:

On 29.12.2010 01:33, Stefan Fuhrmann wrote:

Hi devs,

I would really like the core improvement of my
work on SVN performance to become part of
1.7: the in-process fulltext cache.

While the risk is low (we already offer something
similar based on memcached), the caching code
is so fundamental to most other pending changes,
that I want to hear at least a second opinion:

^/subversion/branches/integrate-cache-membuffer

Take this as an opportunity for a last nightly deed

*k*nightly (which stresses even more that _nightly_
is exactly what we probably don't want)

(Ny!) in 2010 or a new year's resolution.


I'll take a look at it when I have a chance, but my first question is, 
does it work in a long running (months) persistent multithreaded process 
that LRU caches svn_fs_t, svn_repos_t and other objects for long periods 
of time?  I have a custom svn server that exposes the svn_fs.h API 
through an Ice layer (see www.zeroc.com).


Thanks,
Blair