Re: svn commit: r1004286 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c

2010-10-11 Thread Ramkumar Ramachandra
Hi Bert,

Bert Huijben writes:
> > -Original Message-
> > From: Ramkumar Ramachandra [mailto:artag...@gmail.com]
> >
> > Index: subversion/libsvn_subr/io.c
> > ===
> > --- subversion/libsvn_subr/io.c (revision 1005706)
> > +++ subversion/libsvn_subr/io.c (working copy)
> > @@ -1269,7 +1269,8 @@ static svn_error_t *
> >  get_default_file_perms(apr_fileperms_t *perms, apr_pool_t
> > *scratch_pool)
> >  {
> >/* the default permissions as read from the temp folder */
> > -  static apr_fileperms_t default_perms = 0;
> > +  static apr_fileperms_t default_perms;
> > +  apr_atomic_set32(&default_perms, 0);
> 
> This shouldn't change. This = 0 is applied at application/library load (just
> once). After your change it happens at every invocation.

Ok.

> > 
> >/* Technically, this "racy": Multiple threads may use enter here and
> >   try to figure out the default permisission concurrently. That's
> > fine
> > @@ -1303,7 +1304,7 @@ get_default_file_perms(apr_fileperms_t *perms,
> > apr
> >SVN_ERR(svn_io_file_close(fd, scratch_pool));
> > 
> >*perms = finfo.protection;
> > -  default_perms = finfo.protection;
> > +  apr_atomic_set32(&default_perms, finfo.protection);
> 
> Yes, assuming default_perms is a 32 bit integer (which I think it always is)
> this makes it safe on all platforms. Without this it relies on cpu
> architecture, alignment and possibly caching hints.
> (As noted by others, without this patch it should be safe on x86 and x64
> architectures (where compilers handle the proper alignment for the static
> variable). But we can't be sure about the other architectures)

Right, got it. Can I get a +1 for the following patch?

[[[
* subversion/libsvn_subr/io.c

  (get_default_file_perms): The 32-bit integer `default_perms` is only
  really atomic (and hence thread-safe) on x86 and x64 where compilers
  handle the proper alignment for static variables. Handle it in the
  more general case using `apr_atomic_set32`.

Suggested by: Bert
]]]

Index: subversion/libsvn_subr/io.c
===
--- subversion/libsvn_subr/io.c (revision 1005706)
+++ subversion/libsvn_subr/io.c (working copy)
@@ -1303,7 +1303,7 @@ get_default_file_perms(apr_fileperms_t *perms, apr
   SVN_ERR(svn_io_file_close(fd, scratch_pool));
 
   *perms = finfo.protection;
-  default_perms = finfo.protection;
+  apr_atomic_set32(&default_perms, finfo.protection);
 }
   else
 *perms = default_perms;


Re: svn commit: r1021511 - /subversion/site/publish/source-code.html

2010-10-11 Thread C. Michael Pilato
If you haven't already, please add this to the list of website spots that
need updatin' in the Community Guide's "releasing" section, please.


On 10/11/2010 05:18 PM, hwri...@apache.org wrote:
> Author: hwright
> Date: Mon Oct 11 21:18:24 2010
> New Revision: 1021511
> 
> URL: http://svn.apache.org/viewvc?rev=1021511&view=rev
> Log:
> One more bit of post-release house-keeping.  (How many places do we need this
> information?!?)
> 
> Found by: Mark Poole 
> 
> * publish/source-code.html:
>   Update the "best available" version to 1.6.13.
> 
> Modified:
> subversion/site/publish/source-code.html
> 
> Modified: subversion/site/publish/source-code.html
> URL: 
> http://svn.apache.org/viewvc/subversion/site/publish/source-code.html?rev=1021511&r1=1021510&r2=1021511&view=diff
> ==
> --- subversion/site/publish/source-code.html (original)
> +++ subversion/site/publish/source-code.html Mon Oct 11 21:18:24 2010
> @@ -19,7 +19,7 @@
>  
>  
>  The best available version of Subversion
> -   is: 1.6.12
> +   is: 1.6.13
>   
>  
>  You can install Subversion by compiling
> 
> 


-- 
C. Michael Pilato 
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: SVN client on Unix checks for parent dir "../.svn/wc.db"

2010-10-11 Thread C. Michael Pilato
[Cross-posting to d...@subversion.apache.org]

On 10/11/2010 06:07 PM, Anthony Falabella wrote:
> Why does the code need to look at all at the parent dir above where a
> checkout is being performed?  Not only might the parent dir be not
> writable, but in our case even looking within that dir is an expensive
> operation.  Subversion 1.6.6 through 1.6.12 are essentially not usable by
> my team until this is addressed.

This behavior was added to help folks who are using 1.6.x clients know that
they need to upgrade their software when they attempt to run Subversion on a
1.7-or-newer working copy.  (Because Subversion 1.7 won't have the .svn
subdirs anywhere except the very root directory of the working copy, it's
not sufficient to look only in the current directory.)

Why does checkout need to do this?  "Need" could be debatable.  Under the
hood, though, a checkout is just a special form of an update.  Update needs
to check parent directories (so it can integrated newly fetched
subdirectories into their parent working copy).  So it happens that checkout
behaves the same way.

I'm very unhappy with just how far-reaching this "for your own good" check
is.  I keep my home directory under version control, and because I'm a
Subversion dev, my working copies tend to be of the latest trunk format.  So
in my situation, I can't use Subversion 1.6 *at all* anywhere in my home
directory because it *always* searches parent directories and *always* finds
a 1.7-ish .svn directory in ~/.svn.  I pity folks that, say, keep their
system's root directories under version control.

Some time ago I fixed the trunk code to not be so similarly far-reaching --
as a result, I was able to finally create 1.7 working copies inside of 1.6
ones.  But I've not yet attempted to fix the reverse in the 1.6 line.

Just so you know, you can force 1.6.x to *not* perform this check by setting
the following obnoxiously long environment variable:

   SVN_I_LOVE_CORRUPTED_WORKING_COPIES_SO_DISABLE_CHECK_FOR_WC_NG=yes

Insert warnings, disclaimers, etc. here.

-- 
C. Michael Pilato 
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster

2010-10-11 Thread Johan Corveleyn
On Mon, Oct 11, 2010 at 11:53 AM, Julian Foad  wrote:
> On Sat, 2010-10-09, Johan Corveleyn wrote:
>> On Sat, Oct 9, 2010 at 2:57 AM, Julian Foad  wrote:
>> > So I wrote a patch - attached - that refactors this into an array of 4
>> > sub-structures, and simplifies all the code that uses them.
> [...]
>> Yes, great idea! That would indeed vastly simplify a lot of the code.
>> So please go ahead and commit the refactoring.
>
> OK, committed in r1021282.

Thanks, looks much more manageable now.

>> Also, maybe the last_chunk number could be included in the file_info
>> struct? Now it's calculated in several places: "last_chunk0 =
>> offset_to_chunk(file_baton->size[idx0])", or I have to pass it on
>> every time as an extra argument. Seems like the sort of info that
>> could be part of file_info.
>
> It looks to me like you only need to calculate it in exactly one place:
> in increment_pointer_or_chunk().  I wouldn't worry about pre-calculating
> for efficiency, as it's a trivial calculation.

Hm, I don't know. That's recalculating it for every byte in the
prefix. In the files I'm testing there could be 1 Mb or more of
prefix, so 1 million times this calculation (and that for every diff
in say 2000 diffs for a blame operation). Unless the compiler can
optimize that to a single calculation, because it never changes (I
don't know enough about compilers to know if this is possible), it
might have an impact, even if it's a trivial calculation.

OTOH, I might be in premature-optimization-land here, the impact might
be negligible. If I find some time I'll test it.

> And in a later email you wrote:
>> [...] Maybe I can give it a go,
>> first suffix then prefix, and see if I can find real-life problems ...
>
> There's no need to re-visit that.  I think it's fine as is it, using a
> separate pair of buffers.

I've been thinking some more about that. For relatively small files,
smaller than 1 chunk (128 Kb), this seems wasteful. If the file is 127
Kb, it all fits in one chunk, yet I still read it twice, once for
suffix and once for prefix (and further token processing). Ok, caching
by the OS might make this insignificant, but still.

Since most common source files in repositories are probably smaller
than 128 Kb I might be hurting the common case, in cases where it's
not compensated by performance gain from identical prefix/suffix.

Then again, maybe this can be solved with a specific optimization,
without changing the current code too much: if file size is smaller
than chunk size, I point the suffix buffer to the prefix buffer and
I'm done.

Cheers,
-- 
Johan


Re: Problem with external file

2010-10-11 Thread Bob Fletcher
Philip Martin  wandisco.com> writes:


> There is no assert on line 4603 in the current code, it's not possible
> for the current code to produce that error.  There was an assert on
> that line back in r993028 but that dates from more than a month ago.
> If you try a more up-to-date build then the problem may well have been
> fixed.

Hi Philip,

I was using recent binaries from
http://nightlybuilds.tortoisesvn.net/latest/win32/small/

For example, in response to svn --version, I would get
svn, version 1.7.0 (dev build)
   compiled Oct  9 2010, 00:09:42

As you suggested, I downloaded the binaries just posted today on the same site.
Now in response to svn --version, I get
svn, version 1.7.0 (dev build)
   compiled Oct 11 2010, 00:09:54
   
Based on the compiled dates, I was very skeptical that there 
would be any difference. But guess what - the problem went away!
I am not certain that everything works perfectly,
but I no longer get the crash with the bad error message.

Apparently, something was changed while I was adding noise to the forum.

Thanks,
Bob






Re: Problem with external file

2010-10-11 Thread Philip Martin
Bob Fletcher  writes:

> OK. I was able to create the error using the command line. The
> following are the steps I took. I sure hope this is helpful. (In the
> steps below, I changed the name of my company in the URL to xxx.)

Thanks for doing this.  I tried your recipe on my Linux machine and it
doesn't crash, however:

> 'C:\Users\kueng\nightlybuilds\latest\TortoiseSVN\ext\subversion\subversion\libsv
> n_wc\update_editor.c' line 4603: assertion failed (status == 
> svn_wc__db_status_added)

There is no assert on line 4603 in the current code, it's not possible
for the current code to produce that error.  There was an assert on
that line back in r993028 but that dates from more than a month ago.
If you try a more up-to-date build then the problem may well have been
fixed.

-- 
Philip


Re: Problem with external file

2010-10-11 Thread Bob Fletcher
OK. I was able to create the error using the command line. The following are 
the 
steps I took. I sure hope this is helpful. (In the steps below, I changed the 
name of my company in the URL to xxx.)

(1) Create a local TestLib directory with a single file libfile.h.

(2) Import this to repo.
C:\Dev\rhf>svn import TestLib http://svn1.xxx.com/svn/rhf/TestLib -m"Initial 
import"

(3) Delete the local TestLib and then checkout a WC.
C:\Dev\rhf> svn checkout http://svn1.xxx.com/svn/rhf/TestLib

At this point we have TestLib in the repo and a checked out WC.

(4) Now create another local directory, TestPrj, with a single file prjfile.txt 
and import this to the repository.
C:\Dev\rhf>svn import TestPrj http://svn1.xxx.com/svn/rhf/TestPrj -m"Initial 
import"

(5) Delete the local TestPrj directory and checkout a WC.
C:\Dev\rhf> svn checkout http://svn1.xxx.com/svn/rhf/TestPrj

(6) Make the file TestLib/libfile.h an external file in TestPrj. 
(The Subversion book was very confusing here and I finally found this syntax 
from a google search.)
C:\Dev\rhf\TestPrj>svn propset svn:externals 
"http://svn1.xxx.com/svn/rhf/TestLib/libfile.h libfile.h" .
property 'svn:externals' set on '.'

(7) Commit the change and update.
C:\Dev\rhf\TestPrj>svn update

Fetching external item into 'libfile.h'
Elibfile.h
Updated external to revision 3076.

Updated to revision 3076.

(8) Check the status.
C:\Dev\rhf\TestPrj>svn status -v
  3076 3076 rfletcher.
  3076 3075 rfletcherprjfile.txt
X 3076 3072 rfletcherlibfile.h

Everything looks fine at this point, here and in the repo browser.


(9) Now go to the WC of TestLib, modify and commit libfile.h.
C:\Dev\rhf\TestLib>svn commit -m"test change"
Sendinglibfile.h
Transmitting file data .
Committed revision 3077.

(11) Check the status of TestLib
C:\Dev\rhf\TestLib>svn status -v
  3071 3071 rfletcher.
  3077 3077 rfletcherlibfile.h


(12) Return to TestPrj and attempt to Update. This gives an error.
C:\Dev\rhf\TestPrj>svn update
svn: In file 
'C:\Users\kueng\nightlybuilds\latest\TortoiseSVN\ext\subversion\subversion\libsv
n_wc\update_editor.c' line 4603: assertion failed (status == 
svn_wc__db_status_added)
  
(13) Check the status of TestPrj - note the !.
C:\Dev\rhf\TestPrj>svn status -v
! L   3077 3076 rfletcher.
  3076 3075 rfletcherprjfile.txt
X 3076 3072 rfletcherlibfile.h


By the way, if I do a complete new checkout of TestPrj, 
libfile.h is modified as expected.

Bob



Re: svn commit: r1021431 - /subversion/trunk/subversion/libsvn_wc/adm_ops.c

2010-10-11 Thread Julian Foad
On Mon, 2010-10-11, julianf...@apache.org wrote:
> Author: julianfoad
> Date: Mon Oct 11 17:10:35 2010
> New Revision: 1021431
> 
> URL: http://svn.apache.org/viewvc?rev=1021431&view=rev
> Log:
> * subversion/libsvn_wc/adm_ops.c
>   (svn_wc_add4): Update a comment after r1021430.

Please excuse this spate of tiny patches.  I built up a load of changes
to this function last week and now need to commit the obvious parts so I
can concentrate on the non-trivial parts.

- Julian




Re: Performance optimization - svn_stringbuf_appendbyte()

2010-10-11 Thread Julian Foad
On Sun, 2010-10-10, Stefan Fuhrmann wrote:
> On 07.10.2010 16:07, Julian Foad wrote:
> >> New Revision: 997203
> >>
> >> URL: http://svn.apache.org/viewvc?rev=997203&view=rev
> >> Log:
> >> Merge r985037, r985046, r995507 and r995603 from the performance branch.
> >>
> >> These changes introduce the svn_stringbuf_appendbyte() function, which has
> >> significantly less overhead than svn_stringbuf_appendbytes(), and can be
> >> used in a number of places within our codebase.
> > Hi Stefan2.
> >
> > I have been wondering about the actual benefit of such tightly
> > hand-optimized code.  Today I got around to giving it a quick spin.
> >
> > In my first test, it made no difference whatsoever, because the
> > optimized code is never executed.  The recent merge r1002898 of r1001413
> > from the performance branch introduced a bug:
> >
> > -  if (str->blocksize>  old_len + 1)
> > +  if (str->blocksize<  old_len + 1)
> WTF how did that happen?
> Well, that's what reviews are for ;)
> > which totally disables the optimized code path.
> >
> > Fixed in r1005437.
> Thanks.
> > That solved, a loop doing a million simple appendbyte calls runs more
> > than twice as fast as calling appendbytes(..., 1).  That's fantastic.
> >
> > But what about that hand-optimization?  I wrote a simple version of the
> > function, and called it 'appendbyte0':
> >
> > svn_stringbuf_appendbyte0(svn_stringbuf_t *str, char byte)
> > {
> >if (str->blocksize>  str->len + 1)
> >  {
> >str->data[str->len++] = byte;
> >str->data[str->len] = '\0';
> >  }
> >else
> >  svn_stringbuf_appendbytes(str,&byte, 1);
> > }
> >
> > Here are the results (see full patch attached):
> >
> > Times:  appendbytes appendbyte0 appendbyte  (in ms)
> > run:  89  31  34
> > run:  88  30  35
> > run:  88  31  34
> > run:  88  30  34
> > run:  88  31  34
> > min:  88  30  34
> >
> > This tells me that the hand-optimization is actually harmful and the
> > compiler does a 10% better job by itself.
> >
> > Have I made a mistake?
> My guess is that you might have made two mistakes actually.

Heh, OK.

> First, the number of operations was relatively low - everything
> in the low ms range could be due to secondary effects (and
> still be repeatable).

I can't think of any reason why.  I ran the whole benchmark lots of
times.  Occasionally some of the times were a big chunk longer due to
other system activity.  Normally they were stable.  I also ran it
several times with 10 million ops instead of 1 million, and the results
were exactly 10x longer with the same degree of variability.

> The most important problem would be compiler optimizations
> or lack thereof. Since the numbers are very large (50..100
> ticks per byte, depending on your CPU clock), I would assume
> that you used a normal debug build for testing. In that case,

You're right.  I'll try again, with "--disable-debug -O2".

> the actual number of C statements has a large impact on
> the execution time. See my results below for details.
> > What are the results for your system?
> >
> > (I'm using GCC 4.4.1 on an Intel Centrino laptop CPU.)
> >
> Test code used (10^10 calls, newer re-allocate):
> 
>  int i;
>  unsigned char c;
> 
>  svn_stringbuf_t* s = svn_stringbuf_create_ensure (255, pool);

OK so you're avoiding any calls to the "re-alloc".  My tests included
re-allocs, but were long enough strings that this wasn't much overhead.
Nevertheless I'll avoid re-allocs so we can compare results fairly.

>  for (i = 0; i < 5000; ++i)
>  {
>  s->len = 0;
>  for (c = 0; c < 200; ++c)
>  svn_stringbuf_appendbyte (s, c);
>  }
> 
> 
> XEON 55xx / Core i7, hyper-threading on, 3GHz peak
> 64 bits Linux, GCC 4.3.3; ./configure --disable-debug
> 
> (1)  10,7s; IPC = 2.1
> (2)  8,11s; IPC = 1.4
> (3)  2,64s; IPC = 2.4
> (4)  2,43s; IPC = 2.3

> (1) use appendbytes gives 9 instructions in main, 59 in appbytes
> (2) handle count==1 directly in in appendbytes; 9 inst. in main, 26 in 
> appbytes
> (3) appendbyte0 (same compiler output as the the non-handtuned code);
>  13 inst. in appbyte, 6 in main
> (4) tr...@head appendbyte; 11 inst. in appbyte, 6 in main
> 
> Core2 2.4GHz, Win32, VS2010
> (not using valgrind to count instructions here; also not testing (2))
> 
> (1)  17,0s release, 20,2s debug
> (3)  10,6s release, 12,2s debug
> (4)  9,7s release, 13,0s debug

With a test harness more similar to yours, and built with
"--disable-debug -O2", here are my relative numbers (but only 1 million
outer loops, compared to your 50 million):

$ svn-c-test subr string 24
Times for 100 x 200 bytes, in seconds
  (1)appendbytes (3)appendbyte0 (4)tr...@head (5)macro
run:7.032.061.370.53
run:6.621.761.550.53
r

Re: [Merge request] Merge r985477 from performance branch

2010-10-11 Thread Philip Martin
Branko Čibej  writes:

>  Got into this a bit late, sorry, but I'm not at all happy about this
> change.
>
> If we ignore the issue with long-running SVN processes ... ok, let's
> assume that changing umask requires that you restart daemons ...
>
> I cannot ignore two issues:
>
> * The default perms will come from the temp directory, not the WC
>   directory. These perms may not be the same, defaults may be
>   different on different volumes.
> * Even if we took the perms off of the current WC and stored them, a
>   long-running process can work with multiple WCs at the same times
>   and there's no guarantee that the default file perms are the same
>   on all of them.

Those are valid concerns but not really an issue with patch which
merely caches the permissions and doesn't change the value that gets
cached.  Also it was introduced as a server optimisation so repository
permissions are affected as well as working copy permissions.

-- 
Philip


RE: svn commit: r1004286 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c

2010-10-11 Thread Bert Huijben


> -Original Message-
> From: Ramkumar Ramachandra [mailto:artag...@gmail.com]
> Sent: maandag 11 oktober 2010 14:54
> To: Bert Huijben
> Cc: dev@subversion.apache.org; 'Greg Stein'; 'Julian Foad'
> Subject: Re: svn commit: r1004286 - in /subversion/trunk: ./
> subversion/libsvn_subr/io.c
> 
> Hi Bert and Julian,
> 
> Bert Huijben writes:
> > > -Original Message-
> > > From: Julian Foad [mailto:julian.f...@wandisco.com]
> > > On Mon, 2010-10-11, Julian Foad wrote:
> > > >  of-
> > > an-int-atomic>.
> > > >  "On an IA32 a correctly aligned address will be an atomic
> operation.
> > > > [... otherwise... can't assume it is]."
> > >
> > > Sorry, I pressed "Send" too early.  That's not the most important
> bit of
> > > information.  (That paragraph talks mostly about <= 32-bit CPUs,
> where
> > > of course there will be problems.)  Bert explained to me on IRC
> that
> > > atomicity is not guaranteed even on >= 32 bit architectures, and
> the
> > > highest-ranked answer on that web page agrees.  I'm no expert in
> this so
> > > I'll go away now.
> >
> > Let me add that calling apr_atomic_set32() instead of the 'x =
> ' part of the pattern will fix this issue in the way that was
> documented by the comment in the code: all threads do the same thing
> and one of the results is left in the static variable.
> >
> > Another option is to use an atomic initialization function to
> initialize the value just once.
> 
> I see; I don't think I know enough to comment.
> Bert: So does this solve the issue or did I misunderstand something?
> 
> Index: subversion/libsvn_subr/io.c
> ===
> --- subversion/libsvn_subr/io.c   (revision 1005706)
> +++ subversion/libsvn_subr/io.c   (working copy)
> @@ -1269,7 +1269,8 @@ static svn_error_t *
>  get_default_file_perms(apr_fileperms_t *perms, apr_pool_t
> *scratch_pool)
>  {
>/* the default permissions as read from the temp folder */
> -  static apr_fileperms_t default_perms = 0;
> +  static apr_fileperms_t default_perms;
> +  apr_atomic_set32(&default_perms, 0);

This shouldn't change. This = 0 is applied at application/library load (just
once). After your change it happens at every invocation.
> 
>/* Technically, this "racy": Multiple threads may use enter here and
>   try to figure out the default permisission concurrently. That's
> fine
> @@ -1303,7 +1304,7 @@ get_default_file_perms(apr_fileperms_t *perms,
> apr
>SVN_ERR(svn_io_file_close(fd, scratch_pool));
> 
>*perms = finfo.protection;
> -  default_perms = finfo.protection;
> +  apr_atomic_set32(&default_perms, finfo.protection);

Yes, assuming default_perms is a 32 bit integer (which I think it always is)
this makes it safe on all platforms. Without this it relies on cpu
architecture, alignment and possibly caching hints.
(As noted by others, without this patch it should be safe on x86 and x64
architectures (where compilers handle the proper alignment for the static
variable). But we can't be sure about the other architectures)

Bert



Re: svn commit: r1004286 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c

2010-10-11 Thread Ramkumar Ramachandra
Hi Bert and Julian,

Bert Huijben writes:
> > -Original Message-
> > From: Julian Foad [mailto:julian.f...@wandisco.com]
> > On Mon, 2010-10-11, Julian Foad wrote:
> > >  > an-int-atomic>.
> > >  "On an IA32 a correctly aligned address will be an atomic operation.
> > > [... otherwise... can't assume it is]."
> > 
> > Sorry, I pressed "Send" too early.  That's not the most important bit of
> > information.  (That paragraph talks mostly about <= 32-bit CPUs, where
> > of course there will be problems.)  Bert explained to me on IRC that
> > atomicity is not guaranteed even on >= 32 bit architectures, and the
> > highest-ranked answer on that web page agrees.  I'm no expert in this so
> > I'll go away now.
> 
> Let me add that calling apr_atomic_set32() instead of the 'x = ' part 
> of the pattern will fix this issue in the way that was documented by the 
> comment in the code: all threads do the same thing and one of the results is 
> left in the static variable.
> 
> Another option is to use an atomic initialization function to initialize the 
> value just once.

I see; I don't think I know enough to comment.
Bert: So does this solve the issue or did I misunderstand something?

Index: subversion/libsvn_subr/io.c
===
--- subversion/libsvn_subr/io.c (revision 1005706)
+++ subversion/libsvn_subr/io.c (working copy)
@@ -1269,7 +1269,8 @@ static svn_error_t *
 get_default_file_perms(apr_fileperms_t *perms, apr_pool_t *scratch_pool)
 {
   /* the default permissions as read from the temp folder */
-  static apr_fileperms_t default_perms = 0;
+  static apr_fileperms_t default_perms;
+  apr_atomic_set32(&default_perms, 0);
 
   /* Technically, this "racy": Multiple threads may use enter here and
  try to figure out the default permisission concurrently. That's fine
@@ -1303,7 +1304,7 @@ get_default_file_perms(apr_fileperms_t *perms, apr
   SVN_ERR(svn_io_file_close(fd, scratch_pool));
 
   *perms = finfo.protection;
-  default_perms = finfo.protection;
+  apr_atomic_set32(&default_perms, finfo.protection);
 }
   else
 *perms = default_perms;


Re: [Merge request] Merge r985477 from performance branch

2010-10-11 Thread Branko Čibej
 Got into this a bit late, sorry, but I'm not at all happy about this
change.

If we ignore the issue with long-running SVN processes ... ok, let's
assume that changing umask requires that you restart daemons ...

I cannot ignore two issues:

* The default perms will come from the temp directory, not the WC
  directory. These perms may not be the same, defaults may be
  different on different volumes.
* Even if we took the perms off of the current WC and stored them, a
  long-running process can work with multiple WCs at the same times
  and there's no guarantee that the default file perms are the same
  on all of them.

-- Brane


Re: Problem with external file

2010-10-11 Thread Julian Foad
On Sun, 2010-10-10 at 15:23 +, Bob Fletcher wrote:
> > > Maybe I can get something this weekend.
> > 
> > Thanks.
> > 
> > By the way, we aren't trying to make life harder on you, or introduce
> > some difficult-to-clear bar for getting bugs fixed.  The developers as
> > just hard at work on actually shipping 1.7, so doing the bits you can
> > helps keep us from getting too distracted, and also helps get your bug
> > fixed sooner.
> 
> I started to see if I could get the external file error with the command line 
> client.  Before spending a lot of time, however, I would like to know if 
> there 
> could be a compatibility problem with the server version number.
> 
> The server repository, at my company, is using Subversion version 1.5.1. The 
> client code I have been experimenting with uses the 1.7 WC format. If these 
> are 
> known to be incompatible, then I would probably be wasting my time.

All 1.*.* versions of client are meant to be compatible with all 1.*.*
versions of server, so you are not wasting your time.  Thanks!

- Julian




Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster

2010-10-11 Thread Julian Foad
On Sat, 2010-10-09, Johan Corveleyn wrote:
> On Sat, Oct 9, 2010 at 2:57 AM, Julian Foad  wrote:
> > On Sat, 2010-10-09, Johan Corveleyn wrote:
> >> Ok, third iteration of the patch in attachment. It passes make check.
> >>
> >> As discussed in [1], this version keeps 50 lines of the identical
> >> suffix around, to give the algorithm a good chance to generate a diff
> >> output of good quality (in all but the most extreme cases, this will
> >> be the same as with the original svn_diff algorithm).
> >>
> >> That's about the only difference with the previous iteration. So for
> >> now, I'm submitting this for review. Any feedback is very welcome :-).
> >
> > Hi Johan.
> 
> Hi Julian,
> 
> Thanks for taking a look.
> 
> > I haven't reviewed it, but after seeing today's discussion I had just
> > scrolled quickly through the previous version of this patch.  I noticed
> > that the two main functions - find_identical_suffix and
> > find_identical_suffix - are both quite similar (but not quite similar
> > enough to make them easily share implementation) and both quite long,
> > and I noticed you wrote in an earlier email that you were finding it
> > hard to make the code readable.  I have a suggestion that may help.
[...]
> > So I wrote a patch - attached - that refactors this into an array of 4
> > sub-structures, and simplifies all the code that uses them.
[...]
> Yes, great idea! That would indeed vastly simplify a lot of the code.
> So please go ahead and commit the refactoring.

OK, committed in r1021282.


> Also, maybe the last_chunk number could be included in the file_info
> struct? Now it's calculated in several places: "last_chunk0 =
> offset_to_chunk(file_baton->size[idx0])", or I have to pass it on
> every time as an extra argument. Seems like the sort of info that
> could be part of file_info.

It looks to me like you only need to calculate it in exactly one place:
in increment_pointer_or_chunk().  I wouldn't worry about pre-calculating
for efficiency, as it's a trivial calculation.


> One more thing: you might have noticed that for find_identical_suffix
> I use other buffers, chunks, curp's, endp's, ... than for the prefix.
> For prefix scanning I can just use the stuff from the diff_baton,
> because after prefix scanning has finished, everything is buffered and
> pointing correctly for the normal algorithm to continue (i.e.
> everything points at the first byte of the first non-identical line).
> For suffix scanning I need to use other structures (newly alloc'd
> buffer etc), so as to not mess with those pointers/buffers from the
> diff_baton.

> Maybe I can give it a go,
> first suffix then prefix, and see if I can find real-life problems ...

> So: I think I'll need the file_info struct to be available out of the
> diff_baton_t struct as well, so I can use this in suffix scanning
> also.

Yes; I gave the struct type a name - "struct file_info" - so you can use
it in other places.

> (side-note: I considered first doing suffix scanning, then prefix
> scanning, so I could reuse the buffers/pointers from diff_baton all
> the time, and still have everything pointing correctly after
> eliminating prefix/suffix. But that could give vastly different
> results in some cases, for instance when original file is entirely
> identical to both the prefix and the suffix of the modified file. So I
> decided it's best to stick with first prefix, then suffix).

And in a later email you wrote:
> [...] Maybe I can give it a go,
> first suffix then prefix, and see if I can find real-life problems ...

There's no need to re-visit that.  I think it's fine as is it, using a
separate pair of buffers.


> > Responding to some of the other points you mentioned in a much earlier
> > mail:
> >
> >> 3) It's only implemented for 2 files. I'd like to generalize this for
> >> an array of datasources, so it can also be used for diff3 and diff4.
> >>
> >> 4) As a small hack, I had to add a flag "datasource_opened" to
> >> token.c#svn_diff__get_tokens, so I could have different behavior for
> >> regular diff vs. diff3 and diff4. If 3) gets implemented, this hack is
> >> no longer needed.
> >
> > Yes, I'd like to see 3), and so hack 4) will go away.
> 
> I'm wondering though how I should represent the datasources to pass
> into datasources_open. An array combined with a length parameter?
> Something like:
> 
> static svn_error_t *
> datasources_open(void *baton, apr_off_t *prefix_lines,
>  svn_diff_datasource_e[] datasources, int datasources_len)
> 
> ? And then use for loops everywhere I now do things twice for the two
> datasources?

I haven't looked at how datasources_open() API is used or what form of
API would be best.

For the internal functions, you can now pass around an array of
file_info pointers rather than indexes.


> >> 5) I've struggled with making the code readable/understandable. It's
> >> likely that there is still a lot of room for improvement. I also
> >> probably need to document it

RE: svn commit: r1004286 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c

2010-10-11 Thread Bert Huijben


> -Original Message-
> From: Julian Foad [mailto:julian.f...@wandisco.com]
> Sent: maandag 11 oktober 2010 11:46
> To: Bert Huijben
> Cc: 'Greg Stein'; dev@subversion.apache.org; artag...@apache.org
> Subject: RE: svn commit: r1004286 - in /subversion/trunk: ./
> subversion/libsvn_subr/io.c
> 
> On Mon, 2010-10-11, Julian Foad wrote:
> > On Mon, 2010-10-11, Bert Huijben wrote:
> > > > -Original Message-
> > > > From: Greg Stein [mailto:gst...@gmail.com]
> > > > On Mon, Oct 11, 2010 at 04:58, Bert Huijben  wrote:
> > > > >> -Original Message-
> > > > >> From: artag...@apache.org [mailto:artag...@apache.org]
> > > > >> * subversion/libsvn_subr/io.c
> > > > >>   (get_default_file_perms): Store the permissions of the created
> > > > >>   temporary file in a static variable and re-use it in subsequent
> > > > >>   calls instead of checking persmissions everytime. This has
> > > > >>   performance benefits.
> > > > >
> > > > > Shouldn't this function use some 'atomic initialization' handling?
> > > > >
> > > > > Currently it uses a static apr_fileperms_t (integer?) which can be
> > > > manipulated by multiple threads at the same time.
> > > > >
> > > > > This part of subversion is a library and inside tools like Subclipse,
> > > > TortoiseSVN, AnkhSVN and others it is used multithreaded.
> > > >
> > > > So what? Aren't all of those threads going to write the exact same
> > > > value into the variable?
> > > >
> > > > And if they *don't*, then I believe we have larger problems.
> > >
> > > No, they are taking the value that is being stored there at the same time
> > > (read: value is undefined) and use that as the umask.
> > >
> > > The pattern
> > > static long x;
> >
> > > if (x == 0)
> > > {
> > > 
> > >   x = 
> > > }
> > > 
> > >
> > > is not thread safe.
> 
> >  an-int-atomic>.
> >  "On an IA32 a correctly aligned address will be an atomic operation.
> > [... otherwise... can't assume it is]."
> 
> Sorry, I pressed "Send" too early.  That's not the most important bit of
> information.  (That paragraph talks mostly about <= 32-bit CPUs, where
> of course there will be problems.)  Bert explained to me on IRC that
> atomicity is not guaranteed even on >= 32 bit architectures, and the
> highest-ranked answer on that web page agrees.  I'm no expert in this so
> I'll go away now.

Let me add that calling apr_atomic_set32() instead of the 'x = ' part of 
the pattern will fix this issue in the way that was documented by the comment 
in the code: all threads do the same thing and one of the results is left in 
the static variable.

Another option is to use an atomic initialization function to initialize the 
value just once.

Bert 




RE: svn commit: r1004286 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c

2010-10-11 Thread Julian Foad
On Mon, 2010-10-11, Julian Foad wrote:
> On Mon, 2010-10-11, Bert Huijben wrote:
> > > -Original Message-
> > > From: Greg Stein [mailto:gst...@gmail.com]
> > > On Mon, Oct 11, 2010 at 04:58, Bert Huijben  wrote:
> > > >> -Original Message-
> > > >> From: artag...@apache.org [mailto:artag...@apache.org]
> > > >> * subversion/libsvn_subr/io.c
> > > >>   (get_default_file_perms): Store the permissions of the created
> > > >>   temporary file in a static variable and re-use it in subsequent
> > > >>   calls instead of checking persmissions everytime. This has
> > > >>   performance benefits.
> > > >
> > > > Shouldn't this function use some 'atomic initialization' handling?
> > > >
> > > > Currently it uses a static apr_fileperms_t (integer?) which can be
> > > manipulated by multiple threads at the same time.
> > > >
> > > > This part of subversion is a library and inside tools like Subclipse,
> > > TortoiseSVN, AnkhSVN and others it is used multithreaded.
> > > 
> > > So what? Aren't all of those threads going to write the exact same
> > > value into the variable?
> > > 
> > > And if they *don't*, then I believe we have larger problems.
> > 
> > No, they are taking the value that is being stored there at the same time
> > (read: value is undefined) and use that as the umask.
> > 
> > The pattern
> > static long x;
> 
> > if (x == 0)
> > {
> > 
> >   x = 
> > }
> > 
> > 
> > is not thread safe.

> .
>  "On an IA32 a correctly aligned address will be an atomic operation.
> [... otherwise... can't assume it is]."

Sorry, I pressed "Send" too early.  That's not the most important bit of
information.  (That paragraph talks mostly about <= 32-bit CPUs, where
of course there will be problems.)  Bert explained to me on IRC that
atomicity is not guaranteed even on >= 32 bit architectures, and the
highest-ranked answer on that web page agrees.  I'm no expert in this so
I'll go away now.

- Julian




RE: svn commit: r1004286 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c

2010-10-11 Thread Julian Foad
On Mon, 2010-10-11 at 11:11 +0200, Bert Huijben wrote:
> 
> > -Original Message-
> > From: Greg Stein [mailto:gst...@gmail.com]
> > Sent: maandag 11 oktober 2010 11:03
> > To: Bert Huijben
> > Cc: dev@subversion.apache.org; artag...@apache.org
> > Subject: Re: svn commit: r1004286 - in /subversion/trunk: ./
> > subversion/libsvn_subr/io.c
> > 
> > On Mon, Oct 11, 2010 at 04:58, Bert Huijben  wrote:
> > >> -Original Message-
> > >> From: artag...@apache.org [mailto:artag...@apache.org]
> > >> Sent: maandag 4 oktober 2010 17:27
> > >> To: comm...@subversion.apache.org
> > >> Subject: svn commit: r1004286 - in /subversion/trunk: ./
> > >> subversion/libsvn_subr/io.c
> > >>
> > >> Author: artagnon
> > >> Date: Mon Oct  4 15:26:44 2010
> > >> New Revision: 1004286
> > >>
> > >> URL: http://svn.apache.org/viewvc?rev=1004286&view=rev
> > >> Log:
> > >> Merge r985477 from subversion/branches/performance
> > >>
> > >> * subversion/libsvn_subr/io.c
> > >>   (get_default_file_perms): Store the permissions of the created
> > >>   temporary file in a static variable and re-use it in subsequent
> > >>   calls instead of checking persmissions everytime. This has
> > >>   performance benefits.
> > >>
> > >> Review by: artagnon
> > >> Approved by: julianfoad
> > >
> > > Delayed review:
> > >
> > > Shouldn't this function use some 'atomic initialization' handling?
> > >
> > > Currently it uses a static apr_fileperms_t (integer?) which can be
> > manipulated by multiple threads at the same time.
> > >
> > > This part of subversion is a library and inside tools like Subclipse,
> > TortoiseSVN, AnkhSVN and others it is used multithreaded.
> > 
> > So what? Aren't all of those threads going to write the exact same
> > value into the variable?
> > 
> > And if they *don't*, then I believe we have larger problems.
> 
> No, they are taking the value that is being stored there at the same time
> (read: value is undefined) and use that as the umask.
> 
> The pattern
> static long x;

> if (x == 0)
> {
> 
>   x = 
> }
> 
> 
> is not thread safe.



.
 "On an IA32 a correctly aligned address will be an atomic operation.
[... otherwise... can't assume it is]."

- Julian




RE: svn commit: r1004286 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c

2010-10-11 Thread Bert Huijben


> -Original Message-
> From: Greg Stein [mailto:gst...@gmail.com]
> Sent: maandag 11 oktober 2010 11:03
> To: Bert Huijben
> Cc: dev@subversion.apache.org; artag...@apache.org
> Subject: Re: svn commit: r1004286 - in /subversion/trunk: ./
> subversion/libsvn_subr/io.c
> 
> On Mon, Oct 11, 2010 at 04:58, Bert Huijben  wrote:
> >> -Original Message-
> >> From: artag...@apache.org [mailto:artag...@apache.org]
> >> Sent: maandag 4 oktober 2010 17:27
> >> To: comm...@subversion.apache.org
> >> Subject: svn commit: r1004286 - in /subversion/trunk: ./
> >> subversion/libsvn_subr/io.c
> >>
> >> Author: artagnon
> >> Date: Mon Oct  4 15:26:44 2010
> >> New Revision: 1004286
> >>
> >> URL: http://svn.apache.org/viewvc?rev=1004286&view=rev
> >> Log:
> >> Merge r985477 from subversion/branches/performance
> >>
> >> * subversion/libsvn_subr/io.c
> >>   (get_default_file_perms): Store the permissions of the created
> >>   temporary file in a static variable and re-use it in subsequent
> >>   calls instead of checking persmissions everytime. This has
> >>   performance benefits.
> >>
> >> Review by: artagnon
> >> Approved by: julianfoad
> >
> > Delayed review:
> >
> > Shouldn't this function use some 'atomic initialization' handling?
> >
> > Currently it uses a static apr_fileperms_t (integer?) which can be
> manipulated by multiple threads at the same time.
> >
> > This part of subversion is a library and inside tools like Subclipse,
> TortoiseSVN, AnkhSVN and others it is used multithreaded.
> 
> So what? Aren't all of those threads going to write the exact same
> value into the variable?
> 
> And if they *don't*, then I believe we have larger problems.

No, they are taking the value that is being stored there at the same time
(read: value is undefined) and use that as the umask.

The pattern
static long x;
if (x == 0)
{

  x = 
}


is not thread safe.

Bert



Re: svn commit: r1004286 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c

2010-10-11 Thread Greg Stein
On Mon, Oct 11, 2010 at 04:58, Bert Huijben  wrote:
>> -Original Message-
>> From: artag...@apache.org [mailto:artag...@apache.org]
>> Sent: maandag 4 oktober 2010 17:27
>> To: comm...@subversion.apache.org
>> Subject: svn commit: r1004286 - in /subversion/trunk: ./
>> subversion/libsvn_subr/io.c
>>
>> Author: artagnon
>> Date: Mon Oct  4 15:26:44 2010
>> New Revision: 1004286
>>
>> URL: http://svn.apache.org/viewvc?rev=1004286&view=rev
>> Log:
>> Merge r985477 from subversion/branches/performance
>>
>> * subversion/libsvn_subr/io.c
>>   (get_default_file_perms): Store the permissions of the created
>>   temporary file in a static variable and re-use it in subsequent
>>   calls instead of checking persmissions everytime. This has
>>   performance benefits.
>>
>> Review by: artagnon
>> Approved by: julianfoad
>
> Delayed review:
>
> Shouldn't this function use some 'atomic initialization' handling?
>
> Currently it uses a static apr_fileperms_t (integer?) which can be 
> manipulated by multiple threads at the same time.
>
> This part of subversion is a library and inside tools like Subclipse, 
> TortoiseSVN, AnkhSVN and others it is used multithreaded.

So what? Aren't all of those threads going to write the exact same
value into the variable?

And if they *don't*, then I believe we have larger problems.

Cheers,
-g


RE: svn commit: r1004286 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c

2010-10-11 Thread Bert Huijben
> -Original Message-
> From: artag...@apache.org [mailto:artag...@apache.org]
> Sent: maandag 4 oktober 2010 17:27
> To: comm...@subversion.apache.org
> Subject: svn commit: r1004286 - in /subversion/trunk: ./
> subversion/libsvn_subr/io.c
> 
> Author: artagnon
> Date: Mon Oct  4 15:26:44 2010
> New Revision: 1004286
> 
> URL: http://svn.apache.org/viewvc?rev=1004286&view=rev
> Log:
> Merge r985477 from subversion/branches/performance
> 
> * subversion/libsvn_subr/io.c
>   (get_default_file_perms): Store the permissions of the created
>   temporary file in a static variable and re-use it in subsequent
>   calls instead of checking persmissions everytime. This has
>   performance benefits.
> 
> Review by: artagnon
> Approved by: julianfoad

Delayed review:

Shouldn't this function use some 'atomic initialization' handling?

Currently it uses a static apr_fileperms_t (integer?) which can be manipulated 
by multiple threads at the same time.

This part of subversion is a library and inside tools like Subclipse, 
TortoiseSVN, AnkhSVN and others it is used multithreaded.

Bert