Re: svn merge operation extremely slow

2011-10-02 Thread Kyle Leber
Yup, trunk version has empty properties
branch version has:

svn:mime-type
application/octet-stream

On Sun, Oct 2, 2011 at 8:10 PM, Daniel Shahaf wrote:

> Kyle Leber wrote on Sun, Oct 02, 2011 at 20:05:19 -0400:
> > Johan,
> >
> > I did a little more digging.  There were a few different places where svn
> > seems to get hung up so I ran the gprof report on just the first one (the
> > merge takes hours otherwise).  In this particular case, svn prints out that
> > it is merging from a small text file while it is hanging for more than a
> > minute @ 100% CPU.  When I examine "lsof", however, I see it actually has a
> > different file open.  This one is a large (15 MB) "binary" file.  It turns
> > out this binary file did not have a property in the trunk (which I think
> > means it's treated as text, right?).  But in the branch it was marked as
> > octet stream.   So perhaps svn is doing a text-based diff on this binary
> > file because it used to be incorrectly marked as text?
> >
>
> If either side is marked as binary then svn will defer to the "Use
> merge-right if merge-left == base, else conflict" algorithm.
>
> Could you share the value of 'svn proplist --verbose' on both files?
>
> Thanks,
>
> Daniel
>


Re: svn merge operation extremely slow

2011-10-02 Thread Daniel Shahaf
Kyle Leber wrote on Sun, Oct 02, 2011 at 20:05:19 -0400:
> Johan,
> 
> I did a little more digging.  There were a few different places where svn
> seems to get hung up so I ran the gprof report on just the first one (the
> merge takes hours otherwise).  In this particular case, svn prints out that
> it is merging from a small text file while it is hanging for more than a
> minute @ 100% CPU.  When I examine "lsof", however, I see it actually has a
> different file open.  This one is a large (15 MB) "binary" file.  It turns
> out this binary file did not have a property in the trunk (which I think
> means it's treated as text, right?).  But in the branch it was marked as
> octet stream.   So perhaps svn is doing a text-based diff on this binary
> file because it used to be incorrectly marked as text?
> 

If either side is marked as binary then svn will defer to the "Use
merge-right if merge-left == base, else conflict" algorithm.

Could you share the value of 'svn proplist --verbose' on both files?

Thanks,

Daniel
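
The rule quoted above ("Use merge-right if merge-left == base, else conflict") can be sketched in a few lines. This is only an illustration of the rule as Daniel states it, not Subversion's actual code; the function name merge_binary, the byte-string arguments, and the returned tuple are assumptions made for the example.

```python
from typing import Optional, Tuple


def merge_binary(base: bytes, merge_left: bytes,
                 merge_right: bytes) -> Tuple[str, Optional[bytes]]:
    """Hypothetical helper: whole-file merge rule for binary content.

    No line-based diff is attempted. The file is either replaced
    wholesale by merge-right or flagged as a conflict.
    """
    if merge_left == base:
        # The target is unchanged relative to merge-left, so the
        # incoming version can simply be taken as the result.
        return ("merged", merge_right)
    # Any other situation cannot be resolved automatically.
    return ("conflict", None)
```

Under this rule the cost is one byte comparison per file, which is why a merge that spends minutes in the diff code suggests the file was not handled as binary.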


Re: svn merge operation extremely slow

2011-10-02 Thread Kyle Leber
Johan,

I did a little more digging.  There were a few different places where svn
seems to get hung up so I ran the gprof report on just the first one (the
merge takes hours otherwise).  In this particular case, svn prints out that
it is merging from a small text file while it is hanging for more than a
minute @ 100% CPU.  When I examine "lsof", however, I see it actually has a
different file open.  This one is a large (15 MB) "binary" file.  It turns
out this binary file did not have a property in the trunk (which I think
means it's treated as text, right?).  But in the branch it was marked as
octet stream.   So perhaps svn is doing a text-based diff on this binary
file because it used to be incorrectly marked as text?

Side-note: The contents of this 15MB file are actually ASCII, but we do want
it treated as binary b/c line-based merges are never valid.

Another snippet from the same gprof report is below.

Cheers,
Kyle

---
                0.00    144.03  27/27                     do_text_merge [10]
[11]    95.9    0.00    144.03  27                    svn_diff_file_diff3_2 [11]
                0.01    144.02  27/27                     svn_diff_diff3_2 [12]
                0.00      0.00  27/5723                   apr_pool_destroy [833]
                0.00      0.00  27/6430                   svn_pool_create_ex [1558]
---
                0.01    144.02  27/27                     svn_diff_file_diff3_2 [11]
[12]    95.9    0.01    144.02  27                    svn_diff_diff3_2 [12]
                8.64    128.73  54/56                     svn_diff(long, char, short) [13]
                0.01      5.09  21014/21014               svn_diff__resolve_conflict [15]
                0.03      1.51  81/81                     svn_diff__get_tokens [25]
                0.00      0.01  27/27                     datasources_open [272]
                0.01      0.00  81/85                     svn_diff__get_token_counts [284]
                0.00      0.00  42065/2476341             apr_palloc [136]
                0.00      0.00  27/27                     svn_diff__tree_create [1235]
                0.00      0.00  54/5723                   apr_pool_destroy [833]
                0.00      0.00  27/27                     token_discard_all [1282]
                0.00      0.00  54/6430                   svn_pool_create_ex [1558]
                0.00      0.00  27/27                     svn_diff__get_node_count [1911]
---
                0.32      4.77  2/56                      svn_diff__resolve_conflict [15]
                8.64    128.73  54/56                     svn_diff_diff3_2 [12]
[13]    94.8    8.96    133.49  56                    svn_diff(long, char, short) [13]
              133.49      0.00  2002891836/2002891836     svn_diff__snake [14]
                0.00      0.00  224/2476341               apr_palloc [136]
                0.00      0.00  64/64                     prepend_lcs [1103]
                0.00      0.00  56/56                     svn_diff__lcs_reverse [1875]
---
              133.49      0.00  2002891836/2002891836     svn_diff(long, char, short) [13]
[14]    88.9  133.49      0.00  2002891836            svn_diff__snake [14]
                0.00      0.00  168906/2476341            apr_palloc [136]

On Sun, Oct 2, 2011 at 5:58 PM, Johan Corveleyn wrote:

> On Sun, Oct 2, 2011 at 11:08 PM, Kyle Leber wrote:
> > I was able to capture a profile from svn (after remembering I have to link
> > statically).  I compiled with "-pg -O0" Here is the top of the file:
> >
> > Each sample counts as 0.01 seconds.
> >   %   cumulative   self                self     total
> >  time   seconds   seconds      calls   s/call   s/call  name
> >  88.88    133.49   133.49 2002891836     0.00     0.00  svn_diff__snake
> >   5.97    142.45     8.96         56     0.16     2.54  svn_diff(long, char, short)
> >   1.98    145.42     2.97    4163001     0.00     0.00  MD5Transform
> >   0.41    146.04     0.62    4163001     0.00     0.00  Decode
>
> What's it doing in svn_diff__snake (or svn_diff for that matter)? That
> should only be hit when svn is doing textual merges (in which case it
> must do rather expensive diff calculations --- I'm sure those
> calculations can go ballistic when being confronted with a large
> binary file, not consisting of text lines).
>
> Are you sure those files were actually marked as binary (svn:mime-type
> of application/octet-stream or something else non-texty)?
>
> --
> Johan
>


Re: svn merge operation extremely slow

2011-10-02 Thread Johan Corveleyn
On Sun, Oct 2, 2011 at 11:08 PM, Kyle Leber wrote:
> I was able to capture a profile from svn (after remembering I have to link
> statically).  I compiled with "-pg -O0" Here is the top of the file:
>
> Each sample counts as 0.01 seconds.
>   %   cumulative   self                self     total
>  time   seconds   seconds      calls   s/call   s/call  name
>  88.88    133.49   133.49 2002891836     0.00     0.00  svn_diff__snake
>   5.97    142.45     8.96         56     0.16     2.54  svn_diff(long, char, short)
>   1.98    145.42     2.97    4163001     0.00     0.00  MD5Transform
>   0.41    146.04     0.62    4163001     0.00     0.00  Decode

What's it doing in svn_diff__snake (or svn_diff for that matter)? That
should only be hit when svn is doing textual merges (in which case it
must do rather expensive diff calculations --- I'm sure those
calculations can go ballistic when being confronted with a large
binary file, not consisting of text lines).

Are you sure those files were actually marked as binary (svn:mime-type
of application/octet-stream or something else non-texty)?

-- 
Johan
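
To give a feel for how an LCS-style diff can end up calling its "snake" step billions of times, here is a minimal sketch of the textbook greedy O(ND) edit-distance loop with a counter on the diagonal-following step. It is not Subversion's implementation (the thread only tells us that the time is spent in svn_diff__snake); the function name and the counter are assumptions for illustration. The point it shows is that the number of snake steps grows roughly with the square of the number of differences between the two token sequences.

```python
from typing import Sequence, Tuple


def edit_distance_with_snake_count(a: Sequence, b: Sequence) -> Tuple[int, int]:
    """Hypothetical sketch: greedy O(ND) diff that counts snake steps.

    Returns (edit_distance, snake_calls), where edit_distance counts
    single-token insertions and deletions.
    """
    n, m = len(a), len(b)
    furthest = {1: 0}  # furthest x reached so far on each diagonal k = x - y
    snake_calls = 0
    for d in range(n + m + 1):
        for k in range(-d, d + 1, 2):
            # Choose whether this diagonal is reached by an insertion
            # (from k + 1) or by a deletion (from k - 1).
            if k == -d or (k != d and furthest[k - 1] < furthest[k + 1]):
                x = furthest[k + 1]
            else:
                x = furthest[k - 1] + 1
            y = x - k
            # The "snake": slide along the diagonal while tokens match.
            snake_calls += 1
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            furthest[k] = x
            if x >= n and y >= m:
                return d, snake_calls
    return n + m, snake_calls
```

With two sequences that share nothing, doubling their length roughly quadruples the snake count, so a large file whose tokens rarely match between the two sides is close to the worst case for this kind of algorithm.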


Re: svn merge operation extremely slow

2011-10-02 Thread Daniel Shahaf
Kyle Leber wrote on Sun, Oct 02, 2011 at 17:08:16 -0400:
> Is it OK to attach the full report to this user list?  The resulting text
> file is 1.3MB and I wasn't sure if the list would tolerate an attachment of
> that size.

It would be better to upload it somewhere and send a link to this list,
or to digest the report and post only the highlights to this list (as
you have done).


Re: svn merge operation extremely slow

2011-10-02 Thread David Chapman

On 10/2/2011 2:08 PM, Kyle Leber wrote:
I was able to capture a profile from svn (after remembering I have to 
link statically).  I compiled with "-pg -O0" Here is the top of the file:


Each sample counts as 0.01 seconds.
  %   cumulative   self                self     total
 time   seconds   seconds      calls   s/call   s/call  name
 88.88    133.49   133.49 2002891836     0.00     0.00  svn_diff__snake
  5.97    142.45     8.96         56     0.16     2.54  svn_diff(long, char, short)
  1.98    145.42     2.97    4163001     0.00     0.00  MD5Transform
  0.41    146.04     0.62    4163001     0.00     0.00  Decode

Is it OK to attach the full report to this user list?  The resulting 
text file is 1.3MB and I wasn't sure if the list would tolerate an 
attachment of that size.




It's a weekend, so you might not get a lot of replies from people who 
know SVN source code, but it's likely that the full report won't be 
needed.  There are two billion (!) calls to svn_diff__snake(), and the 
question is why there are so many.  It might help the devs if you pasted 
in the entries for functions which directly called svn_diff__snake() 
(quite possibly svn_diff() only) and perhaps the functions which 
svn_diff__snake() called directly (none of any significance, if I read 
the above report correctly).  This should be only a few dozen lines of 
the report.  Note that you'll have to trace through the report (the 
top-level function is listed first, followed by its children, 
grandchildren, etc.) to find the entries for these functions.


I have a suspicion that one of the devs will be able to identify the 
issue from just the above report, but a little more information might 
turn out to be helpful.  They certainly won't need to see information 
for all of the zillion functions in SVN.


--
David Chapman dcchap...@acm.org
Chapman Consulting -- San Jose, CA
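
One possible way to pull just those entries out of a large gprof report is sketched below. It assumes the call-graph entries are separated by lines made up only of dashes, and that the report has been saved as text; the helper name call_graph_entries is made up for this example and is not part of gprof or Subversion.

```python
import re
from typing import List


def call_graph_entries(report: str, function: str) -> List[str]:
    """Hypothetical helper: return the call-graph entries mentioning `function`.

    An entry mentions a function when it is the entry for that function
    itself, for one of its direct callers, or for one of its direct callees.
    """
    # Assumption: entries are delimited by lines consisting only of dashes.
    entries = re.split(r"(?m)^-{3,}[ \t]*$", report)
    return [entry.strip("\n") for entry in entries if function in entry]
```

For example, call_graph_entries(open("gprof.txt").read(), "svn_diff__snake") would return only the handful of entries worth posting, where gprof.txt is a placeholder for wherever the report was saved.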



Re: svn merge operation extremely slow

2011-10-02 Thread Kyle Leber
I was able to capture a profile from svn (after remembering I have to link
statically).  I compiled with "-pg -O0".  Here is the top of the file:

Each sample counts as 0.01 seconds.
  %   cumulative   self                self     total
 time   seconds   seconds      calls   s/call   s/call  name
 88.88    133.49   133.49 2002891836     0.00     0.00  svn_diff__snake
  5.97    142.45     8.96         56     0.16     2.54  svn_diff(long, char, short)
  1.98    145.42     2.97    4163001     0.00     0.00  MD5Transform
  0.41    146.04     0.62    4163001     0.00     0.00  Decode

Is it OK to attach the full report to this user list?  The resulting text
file is 1.3MB and I wasn't sure if the list would tolerate an attachment of
that size.

Cheers,
Kyle

On Sat, Oct 1, 2011 at 7:55 PM, Daniel Shahaf wrote:

> gprof is what I'm familiar with (nutshell: compile with 'gcc -pg' and
> read gmon.out).  There are no specific profiling docs for svn; if you
> need more specific advice please post to the dev@ list.  Thanks!
>
> Kyle Leber wrote on Sat, Oct 01, 2011 at 19:33:10 -0400:
> > What method of profiling do you recommend?  I have used gprof previously
> > (it's been a while) but am not familiar with the subversion project source
> > code and build setup.  Is there an online guide or wiki describing the
> > preferred setup for performing this?
> >
> > Kyle
> >
> > On Sat, Oct 1, 2011 at 3:10 PM, Daniel Shahaf wrote:
> >
> > > Johan Corveleyn wrote on Sat, Oct 01, 2011 at 20:47:29 +0200:
> > > > [ Please do not top-post on this list, i.e. please put your reply
> > > > below or inline. More below ... ]
> > > >
> > > > On Sat, Oct 1, 2011 at 6:49 PM, Kyle Leber wrote:
> > > > > On Fri, Sep 30, 2011 at 7:15 PM, Johan Corveleyn <jcor...@gmail.com> wrote:
> > > > >>
> > > > >> On Fri, Sep 30, 2011 at 3:29 PM, Kyle Leber wrote:
> > > > >> > I've encountered what I think is a problem with subversion, but I'm
> > > > >> > not completely sure (and according to the online instructions I
> > > > >> > should bring it up here prior to filing a bug).
> > > > >>
> > > > >> Actually, the instructions on
> > > > >> http://subversion.apache.org/issue-tracker.html say that you should
> > > > >> send your report to users@, not dev@. So I'm adding users@. Please
> > > > >> drop dev@ from any further replies.
> > > > >>
> > > > >> > Basically, we're trying to merge a rather large collection of
> > > > >> > fixes back in our trunk.  I check out a fresh copy of the trunk,
> > > > >> > then use the merge syntax: svn merge https://path/to/my/branch.
> > > > >> >
> > > > >> > This generally churns along just fine, but we occasionally get
> > > > >> > hung up on medium sized binary files where the svn client jumps
> > > > >> > to 100% cpu usage and sits on it for 3+ hours before moving on to
> > > > >> > the next file.  These files are anywhere from 3-10MB in size, so
> > > > >> > not ridiculously huge.  We generally have these files marked as
> > > > >> > octet stream, but changing to text did not help the situation
> > > > >> > when we tried that.
> > > > >> >
> > > > >> > I did find an old forum discussion about a potential issue that
> > > > >> > could be related.  I was wondering if this was ever addressed and
> > > > >> > could it still be the same problem.  Link is here:
> > > > >> > http://www.svnforum.org/threads/36123-Slow-SVN-merge
> > > > >> >
> > > > >> > I'm using svn client 1.6.12.  I looked at the online change log
> > > > >> > up through the 1.7 alphas and didn't see any bug fixes that
> > > > >> > sounded relevant.
> > > > >>
> > > > >> This could be a relevant change (listed in the 1.7 release notes, not
> > > > >> in the change log):
> > > > >>
> > > > >> http://subversion.apache.org/docs/release-notes/1.7.html#diff-optimizations
> > > > >>
> > > > >> Can you please try one of the 1.7 pre-release binaries, and see if it
> > > > >> helps? See http://subversion.apache.org/packages.html#pre-release
> > > > >>
> > > > > Thanks, Johan.  I tested with 1.7rc4 and it did not make any perceptible
> > > > > difference.  Anything else I can try?
> > > >
> > > > Hm, that's unfortunate.
> > > >
> > > > Actually, it was to be expected that this wouldn't help, because the
> > > > diff-optimizations in 1.7 only play a role when merging text files
> > > > (and diffing and blaming). And you said those
> > > > "files-that-make-merge-hang" are generally marked as octet-stream, and
> > > > changing them to text made no difference.
> > > >
> > > > That seems to indicate that the 100% cpu usage on the client isn't
> > > > spent in the diff code (unlike the forum thread that you linked to,
> > > > where the poster tracked it down to libsvn_diff/lcs.c --- he would
> > > > definitely have been helped by the 1.7 improvements).
> > > >
> > >
> > > What does 'svn merge' do for binary files?  I checked svn_wc__merge()
> > > a few months ago and for binary files all it knew to do was
> >

Re: File access control

2011-10-02 Thread Thorsten Schöning
Hello Grant,
on Sunday, October 2, 2011 at 03:07 you wrote:

> All of the big enterprise websites allow each of their developers to
> check out a full working copy of the company code with only an NDA/NCC
> to protect them?

Mostly, yes. Why do you think this is not enough?

> It would be so easy for any of them to use, sell, or
> give the code away, or even to accidentally allow an unauthorized
> person access to it.

No hard feelings, but nothing in your thread suggests that what you are
trying to protect is interesting enough to anyone out there to justify
the time you have spent thinking about how to protect your code against
God and the world. You speak of hiring a developer: if it's not the son
of a friend who worked with computers in school, but a professional,
he will act like a professional if he wants to get paid without
problems.

Kind regards,

Thorsten Schöning

-- 
Thorsten Schöning
AM-SoFT IT-Systeme - Hameln | Potsdam | Leipzig
 
Phone: Potsdam: 0331-743881-0
Email:  tschoen...@am-soft.de
Web: http://www.am-soft.de

AM-SoFT GmbH IT-Systeme, Konsumhof 1-5, 14482 Potsdam
Potsdam District Court (Amtsgericht Potsdam) HRB 21278 P, Managing Director: Andreas Muchow