Re: svn merge operation extremely slow

2011-10-03 Thread Johan Corveleyn
[ Again: please don't top-post on this list. I'm moving your reply to
the bottom. More below ... ]

On Mon, Oct 3, 2011 at 2:24 AM, Kyle Leber kyle.le...@gmail.com wrote:
 On Sun, Oct 2, 2011 at 8:10 PM, Daniel Shahaf d...@daniel.shahaf.name
 wrote:

 Kyle Leber wrote on Sun, Oct 02, 2011 at 20:05:19 -0400:
  Johan,
 
  I did a little more digging.  There were a few different places where svn
  seems to get hung up so I ran the gprof report on just the first one (the
  merge takes hours otherwise).  In this particular case, svn prints out that
  it is merging from a small text file while it is hanging for more than a
  minute @ 100% CPU.  When I examine lsof, however, I see it actually has a
  different file open.  This one is a large (15 MB) binary file.  It turns
  out this binary file did not have a property in the trunk (which I think
  means it's treated as text, right?).  But in the branch it was marked as
  octet stream.   So perhaps svn is doing a text-based diff on this binary
  file because it used to be incorrectly marked as text?
 

 If either side is marked as binary then svn will defer to the "Use
 merge-right if merge-left == base, else conflict" algorithm.

 Could you share the value of 'svn proplist --verbose' on both files?

 Yup, trunk version has empty properties
 branch version has:

 svn:mime-type
 application/octet-stream


What is the merge target? Is it a trunk working copy (the one without
mime-type), or a branch working copy (with
svn:mime-type=application/octet-stream)?

I think it's the mime-type of the merge target that determines if
merge will take the binary route, or the text route. See this
snippet from libsvn_wc/merge.c [1] (in the function
svn_wc__internal_merge):

[[[
  /* Decide if the merge target is a text or binary file. */
  if ((mimeprop = get_prop(&mt, SVN_PROP_MIME_TYPE))
      && mimeprop->value)
    is_binary = svn_mime_type_is_binary(mimeprop->value->data);
  else
    {
      const char *value = svn_prop_get_value(mt.actual_props,
                                             SVN_PROP_MIME_TYPE);

      is_binary = value && svn_mime_type_is_binary(value);
    }
]]]

(mt is the merge target)

I'm not terribly familiar with this part of the codebase. But at first
sight, this seems to say:

  (1) Look at the mime-type of the base version of the merge target.
If that's binary, then we'll go binary.

  (2) If the base of the merge target doesn't have a mime-type, look
if it has one on the actual node (the uncommitted local
modifications). If that's binary, then we'll go binary.

  (3) Else: text merge
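
As a rough illustration of steps (1) to (3), here is a minimal Python
sketch of that reading. It is not Subversion code: target_is_binary and
its two property dictionaries are made up for the example, and
mime_type_is_binary is a simplified stand-in for
svn_mime_type_is_binary (assumed here to mean "anything that is not
text/*").

```python
from typing import Mapping

MIME_PROP = "svn:mime-type"


def mime_type_is_binary(mime_type: str) -> bool:
    """Simplified stand-in (assumption): anything that is not text/* is binary."""
    return not mime_type.startswith("text/")


def target_is_binary(base_props: Mapping[str, str],
                     actual_props: Mapping[str, str]) -> bool:
    """Follow steps (1)-(3) as read from the snippet in this message."""
    base_value = base_props.get(MIME_PROP)
    if base_value:
        # (1) The base version has a mime-type, so it alone decides.
        return mime_type_is_binary(base_value)
    # (2) Otherwise look at the actual (locally modified) properties.
    actual_value = actual_props.get(MIME_PROP)
    # (3) No mime-type on either side means a text merge.
    return bool(actual_value) and mime_type_is_binary(actual_value)


# A target with no mime-type at all takes the text route.
print(target_is_binary({}, {}))
# A local, uncommitted property is enough to switch to the binary route.
print(target_is_binary({}, {MIME_PROP: "application/octet-stream"}))
```

Under this reading, a property set only as a local modification on the
merge target already changes the outcome.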

So I'm guessing that you're merging to trunk, the target without
mime-type property, which makes svn take the text route for merging.
Is that correct?

If that's the case, maybe you can simply set the mime-type on that
binary file in your merge target, as a local modification (I don't
think you need to even commit it). Can you try that?

-- 
Johan

[1] 
http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_wc/merge.c


Re: svn merge operation extremely slow

2011-10-03 Thread Kyle Leber

Johan,

Sorry for the top-post.  Hopefully this is better :)

I set the mime-type to application/octet-stream in the working copy prior
to merge and this fixed the problem.  No more heavy CPU usage or excessive
time spent on the file.
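
For anyone who wants to script this, the sequence can be sketched as
below. It is only an illustration: data/big.dat is a made-up path, the
URL is the placeholder used earlier in this thread, and the helper
merely builds the two command lines instead of running them.

```python
from typing import List


def workaround_commands(binary_path: str, branch_url: str) -> List[List[str]]:
    """Return the svn invocations, in order, as argv lists (nothing is run)."""
    return [
        # Mark the file as binary in the merge target (a local modification).
        ["svn", "propset", "svn:mime-type", "application/octet-stream",
         binary_path],
        # Then merge the branch into the current working copy.
        ["svn", "merge", branch_url, "."],
    ]


for argv in workaround_commands("data/big.dat", "https://path/to/my/branch"):
    print(" ".join(argv))
```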

Kyle


Re: svn merge operation extremely slow

2011-10-03 Thread Daniel Shahaf
Kyle Leber wrote on Mon, Oct 03, 2011 at 08:16:53 -0400:
 On Mon, Oct 3, 2011 at 4:10 AM, Johan Corveleyn jcor...@gmail.com wrote:
   (2) If the base of the merge target doesn't have a mime-type, look
  if it has one on the actual node (the uncommitted local
  modifications). If that's binary, then we'll go binary.
 
   (3) Else: text merge

I stand corrected.


Re: svn merge operation extremely slow

2011-10-03 Thread Johan Corveleyn
On Mon, Oct 3, 2011 at 2:16 PM, Kyle Leber kyle.le...@gmail.com wrote:


 Johan,

 Sorry for the top-post.  Hopefully this is better :)

Much better, thank you :).

 I set the mime-type to application/octet-stream in the working copy prior
 to merge and this fixed the problem.  No more heavy CPU usage or excessive
 time spent on the file.

I'm glad it helped. Apart from the performance, it's important that
svn does this merge the binary way, because as you said line-based
merges are not correct for this file.

-- 
Johan


Re: svn merge operation extremely slow

2011-10-03 Thread Johan Corveleyn
On Mon, Oct 3, 2011 at 2:35 PM, Johan Corveleyn jcor...@gmail.com wrote:

 I'm glad it helped. Apart from the performance, it's important that
 svn does this merge the binary way, because as you said line-based
 merges are not correct for this file.

It may also interest you (and other readers of this thread) that there
is an open enhancement request for making text-merges take the same
shortcut as binary-merges (if mine == merge-left then set merged :=
merge-right), to avoid expensive diffing [1]. But that hasn't been
addressed yet.


[1] http://subversion.tigris.org/issues/show_bug.cgi?id=4009 : Big
trivial text files merged MUCH slower than binary - pls optimize.
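
The shortcut mentioned above, sketched in Python under the assumption
that all three versions are available as byte strings. The function
name and the (merged, conflict) return shape are invented for the
example; this is not how Subversion implements it.

```python
from typing import Optional, Tuple


def shortcut_merge(mine: bytes, merge_left: bytes,
                   merge_right: bytes) -> Tuple[Optional[bytes], bool]:
    """Return (merged, conflict) without running any diff."""
    if mine == merge_left:
        # No local changes relative to merge-left: take merge-right as is.
        return merge_right, False
    # Otherwise the cheap path does not apply; the binary route reports
    # a conflict here, a text merge would fall back to a real diff3.
    return None, True


merged, conflict = shortcut_merge(b"old", b"old", b"new")
print(merged, conflict)
```

The comparison is a plain equality check on the contents, which is why
it stays cheap even for large files.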

-- 
Johan


Re: svn merge operation extremely slow

2011-10-03 Thread Daniel Shahaf
Johan Corveleyn wrote on Mon, Oct 03, 2011 at 14:59:25 +0200:
 It may also interest you (and other readers of this thread) that there
 is an open enhancement request for making text-merges take the same
 shortcut as binary-merges (if mine == merge-left then set merged :=
 merge-right), to avoid expensive diffing [1]. But that hasn't been
 addressed yet.
 
 
 [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4009 : Big
 trivial text files merged MUCH slower than binary - pls optimize.
 

Isn't "Set the svn:mime-type property locally, and revert it before
commit" a workaround for that?

 -- 
 Johan


Re: svn merge operation extremely slow

2011-10-03 Thread Stefan Sperling
On Mon, Oct 03, 2011 at 02:59:25PM +0200, Johan Corveleyn wrote:
 On Mon, Oct 3, 2011 at 2:35 PM, Johan Corveleyn jcor...@gmail.com wrote:
  On Mon, Oct 3, 2011 at 2:16 PM, Kyle Leber kyle.le...@gmail.com wrote:
  I set the mime-type to application/octet-stream in the working copy prior
  to merge and this fixed the problem.  No more heavy CPU usage or excessive
  time spent on the file.
 
  I'm glad it helped. Apart from the performance, it's important that
  svn does this merge the binary way, because as you said line-based
  merges are not correct for this file.
 
 It may also interest you (and other readers of this thread) that there
 is an open enhancement request for making text-merges take the same
 shortcut as binary-merges (if mine == merge-left then set merged :=
 merge-right), to avoid expensive diffing [1]. But that hasn't been
 addressed yet.
 
 
 [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4009 : Big
 trivial text files merged MUCH slower than binary - pls optimize.
 

I think we should also file an issue about the problem discussed
in this thread. svn should take properties on the left/right side of the
merge into account when determining whether to treat a file as binary.
I guess it should run the binary merge if any of left, right, or the
target are marked as binary.
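
A minimal sketch of that proposed rule, purely to make it concrete. The
function names are hypothetical, None stands for "no svn:mime-type
set", and mime_type_is_binary is again a simplified stand-in (anything
that is not text/* counts as binary).

```python
from typing import Optional


def mime_type_is_binary(mime_type: Optional[str]) -> bool:
    """Simplified stand-in (assumption): set and not text/* means binary."""
    return mime_type is not None and not mime_type.startswith("text/")


def should_merge_as_binary(left: Optional[str], right: Optional[str],
                           target: Optional[str]) -> bool:
    """Proposed rule: binary merge if any of left, right, target is binary."""
    return any(mime_type_is_binary(m) for m in (left, right, target))


# The case from this thread: binary on the branch, no property on trunk.
print(should_merge_as_binary("application/octet-stream",
                             "application/octet-stream", None))
```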


Re: svn merge operation extremely slow

2011-10-03 Thread Johan Corveleyn
On Mon, Oct 3, 2011 at 3:02 PM, Daniel Shahaf d...@daniel.shahaf.name wrote:
 Johan Corveleyn wrote on Mon, Oct 03, 2011 at 14:59:25 +0200:
 It may also interest you (and other readers of this thread) that there
 is an open enhancement request for making text-merges take the same
 shortcut as binary-merges (if mine == merge-left then set merged :=
 merge-right), to avoid expensive diffing [1]. But that hasn't been
 addressed yet.


 [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4009 : Big
 trivial text files merged MUCH slower than binary - pls optimize.


 Isn't "Set the svn:mime-type property locally, and revert it before
 commit" a workaround for that?

Yes, it would seem so. Though it may not be very helpful in lots of
situations (because people only discover the problem after they have
waited out a merge of several hours). Still, it's useful information
for working around it (maybe people can detect the problem in some wrapper
scripts, ahead of merging), so maybe you should add it to the issue
tracker.
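
To give an idea of what such a wrapper check might look like, here is a
small Python sketch. It assumes the svn:mime-type values for the merge
source and the merge target have already been collected into
dictionaries (for example from the output of svn propget); how they are
collected is left out, and all names are hypothetical.

```python
from typing import Dict, List, Optional


def mime_type_is_binary(mime_type: Optional[str]) -> bool:
    """Simplified stand-in (assumption): set and not text/* means binary."""
    return mime_type is not None and not mime_type.startswith("text/")


def paths_at_risk(source_mime: Dict[str, Optional[str]],
                  target_mime: Dict[str, Optional[str]]) -> List[str]:
    """Paths marked binary in the merge source but not in the merge target."""
    return sorted(
        path
        for path, mime in source_mime.items()
        if mime_type_is_binary(mime)
        and not mime_type_is_binary(target_mime.get(path))
    )


source = {"data/big.dat": "application/octet-stream", "README": None}
target = {"data/big.dat": None, "README": None}
print(paths_at_risk(source, target))
```

Any path it reports is a candidate for setting the property on the
target before starting the merge.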

-- 
Johan


Re: svn merge operation extremely slow

2011-10-03 Thread Johan Corveleyn
On Mon, Oct 3, 2011 at 3:04 PM, Stefan Sperling s...@elego.de wrote:
 On Mon, Oct 03, 2011 at 02:59:25PM +0200, Johan Corveleyn wrote:
 On Mon, Oct 3, 2011 at 2:35 PM, Johan Corveleyn jcor...@gmail.com wrote:
  On Mon, Oct 3, 2011 at 2:16 PM, Kyle Leber kyle.le...@gmail.com wrote:
  I set the mime-type to application/octet-stream in the working copy 
  prior
  to merge and this fixed the problem.  No more heavy CPU usage or excessive
  time spent on the file.
 
  I'm glad it helped. Apart from the performance, it's important that
  svn does this merge the binary way, because as you said line-based
  merges are not correct for this file.

 It may also interest you (and other readers of this thread) that there
 is an open enhancement request for making text-merges take the same
 shortcut as binary-merges (if mine == merge-left then set merged :=
 merge-right), to avoid expensive diffing [1]. But that hasn't been
 addressed yet.


 [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4009 : Big
 trivial text files merged MUCH slower than binary - pls optimize.


 I think we should also file an issue about the problem discussed
 in this thread. svn should take properties on the left/right side of the
 merge into account when determining whether to treat a file as binary.
 I guess it should run the binary merge if any of left, right, or the
 target are marked as binary.

Yes, maybe you're right. I don't know the specifics or history of
this behavior (maybe there is a reason for it?). But on the surface
it looks like it should indeed do a binary merge if any one of
left, right or target is marked as binary.

Even if #4009 were addressed, it would still make a difference in
the situation where the shortcut condition (mine == merge-left)
doesn't hold. In that case, I think the binary merge would always
flag a conflict (because it can't do a line-based merge). Is that also
the behavior we want, for instance, if only merge-left (or only merge-right)
were marked as binary, and all the other players are marked as text?
I guess it's the safest thing to do ...

-- 
Johan


Re: svn merge operation extremely slow

2011-10-02 Thread David Chapman

On 10/2/2011 2:08 PM, Kyle Leber wrote:
I was able to capture a profile from svn (after remembering I have to 
link statically).  I compiled with -pg -O0.  Here is the top of the file:


Each sample counts as 0.01 seconds.
  %   cumulative     self                self    total
  time   seconds  seconds      calls   s/call   s/call  name
 88.88    133.49   133.49 2002891836     0.00     0.00  svn_diff__snake
  5.97    142.45     8.96         56     0.16     2.54  svn_diff(long, char, short)
  1.98    145.42     2.97    4163001     0.00     0.00  MD5Transform
  0.41    146.04     0.62    4163001     0.00     0.00  Decode

Is it OK to attach the full report to this user list?  The resulting 
text file is 1.3MB and I wasn't sure if the list would tolerate an 
attachment of that size.




It's a weekend, so you might not get a lot of replies from people who 
know SVN source code, but it's likely that the full report won't be 
needed.  There are two billion (!) calls to svn_diff__snake(), and the 
question is why there are so many.  It might help the devs if you pasted 
in the entries for functions which directly called svn_diff__snake() 
(quite possibly svn_diff() only) and perhaps the functions which 
svn_diff__snake() called directly (none of any significance, if I read 
the above report correctly).  This should be only a few dozen lines of 
the report.  Note that you'll have to trace through the report (the 
top-level function is listed first, followed by its children, 
grandchildren, etc.) to find the entries for these functions.


I have a suspicion that one of the devs will be able to identify the 
issue from just the above report, but a little more information might 
turn out to be helpful.  They certainly won't need to see information 
for all of the zillion functions in SVN.


--
David Chapman dcchap...@acm.org
Chapman Consulting -- San Jose, CA



Re: svn merge operation extremely slow

2011-10-02 Thread Daniel Shahaf
Kyle Leber wrote on Sun, Oct 02, 2011 at 17:08:16 -0400:
 Is it OK to attach the full report to this user list?  The resulting text
 file is 1.3MB and I wasn't sure if the list would tolerate an attachment of
 that size.

It would be better to upload it somewhere and send a link to this list,
or to digest the report and post only the highlights to this list (as
you have done).


Re: svn merge operation extremely slow

2011-10-02 Thread Johan Corveleyn
On Sun, Oct 2, 2011 at 11:08 PM, Kyle Leber kyle.le...@gmail.com wrote:
 I was able to capture a profile from svn (after remembering I have to link
 statically).  I compiled with -pg -O0 Here is the top of the file:

 Each sample counts as 0.01 seconds.
   %   cumulative     self                self    total
   time   seconds  seconds      calls   s/call   s/call  name
  88.88    133.49   133.49 2002891836     0.00     0.00  svn_diff__snake
   5.97    142.45     8.96         56     0.16     2.54  svn_diff(long, char, short)
   1.98    145.42     2.97    4163001     0.00     0.00  MD5Transform
   0.41    146.04     0.62    4163001     0.00     0.00  Decode

What's it doing in svn_diff__snake (or svn_diff for that matter)? That
should only be hit when svn is doing textual merges (in which case it
must do rather expensive diff calculations --- I'm sure those
calculations can go ballistic when being confronted with a large
binary file, not consisting of text lines).

Are you sure those files were actually marked as binary (svn:mime-type
of application/octet-stream or something else non-texty)?

-- 
Johan


Re: svn merge operation extremely slow

2011-10-02 Thread Kyle Leber
Johan,

I did a little more digging.  There were a few different places where svn
seems to get hung up so I ran the gprof report on just the first one (the
merge takes hours otherwise).  In this particular case, svn prints out that
it is merging from a small text file while it is hanging for more than a
minute @ 100% CPU.  When I examine lsof, however, I see it actually has a
different file open.  This one is a large (15 MB) binary file.  It turns
out this binary file did not have a property in the trunk (which I think
means it's treated as text, right?).  But in the branch it was marked as
octet stream.   So perhaps svn is doing a text-based diff on this binary
file because it used to be incorrectly marked as text?

Side-note: The contents of this 15MB file are actually ASCII, but we do want
it treated as binary b/c line-based merges are never valid.

Another snippet from the same gprof report is below.

Cheers,
Kyle

-----------------------------------------------
                0.00  144.03      27/27          do_text_merge [10]
[11]    95.9    0.00  144.03      27         svn_diff_file_diff3_2 [11]
                0.01  144.02      27/27          svn_diff_diff3_2 [12]
                0.00    0.00      27/5723        apr_pool_destroy [833]
                0.00    0.00      27/6430        svn_pool_create_ex [1558]
-----------------------------------------------
                0.01  144.02      27/27          svn_diff_file_diff3_2 [11]
[12]    95.9    0.01  144.02      27         svn_diff_diff3_2 [12]
                8.64  128.73      54/56          svn_diff(long, char, short) [13]
                0.01    5.09   21014/21014       svn_diff__resolve_conflict [15]
                0.03    1.51      81/81          svn_diff__get_tokens [25]
                0.00    0.01      27/27          datasources_open [272]
                0.01    0.00      81/85          svn_diff__get_token_counts [284]
                0.00    0.00   42065/2476341     apr_palloc [136]
                0.00    0.00      27/27          svn_diff__tree_create [1235]
                0.00    0.00      54/5723        apr_pool_destroy [833]
                0.00    0.00      27/27          token_discard_all [1282]
                0.00    0.00      54/6430        svn_pool_create_ex [1558]
                0.00    0.00      27/27          svn_diff__get_node_count [1911]
-----------------------------------------------
                0.32    4.77       2/56          svn_diff__resolve_conflict [15]
                8.64  128.73      54/56          svn_diff_diff3_2 [12]
[13]    94.8    8.96  133.49      56         svn_diff(long, char, short) [13]
              133.49    0.00 2002891836/2002891836     svn_diff__snake [14]
                0.00    0.00     224/2476341     apr_palloc [136]
                0.00    0.00      64/64          prepend_lcs [1103]
                0.00    0.00      56/56          svn_diff__lcs_reverse [1875]
-----------------------------------------------
              133.49    0.00 2002891836/2002891836     svn_diff(long, char, short) [13]
[14]    88.9  133.49    0.00 2002891836         svn_diff__snake [14]
                0.00    0.00  168906/2476341     apr_palloc [136]




Re: svn merge operation extremely slow

2011-10-02 Thread Daniel Shahaf
Kyle Leber wrote on Sun, Oct 02, 2011 at 20:05:19 -0400:
 Johan,
 
 I did a little more digging.  There were a few different places where svn
 seems to get hung up so I ran the gprof report on just the first one (the
 merge takes hours otherwise).  In this particular case, svn prints out that
 it is merging from a small text file while it is hanging for more than a
 minute @ 100% CPU.  When I examine lsof, however, I see it actually has a
 different file open.  This one is a large (15 MB) binary file.  It turns
 out this binary file did not have a property in the trunk (which I think
 means it's treated as text, right?).  But in the branch it was marked as
 octet stream.   So perhaps svn is doing a text-based diff on this binary
 file because it used to be incorrectly marked as text?
 

If either side is marked as binary then svn will defer to the "Use
merge-right if merge-left == base, else conflict" algorithm.

Could you share the value of 'svn proplist --verbose' on both files?

Thanks,

Daniel


Re: svn merge operation extremely slow

2011-10-02 Thread Kyle Leber
Yup, trunk version has empty properties
branch version has:

svn:mime-type
application/octet-stream




Re: svn merge operation extremely slow

2011-10-01 Thread Johan Corveleyn
[ Please do not top-post on this list, i.e. please put your reply
below or inline. More below ... ]

On Sat, Oct 1, 2011 at 6:49 PM, Kyle Leber kyle.le...@gmail.com wrote:
 On Fri, Sep 30, 2011 at 7:15 PM, Johan Corveleyn jcor...@gmail.com wrote:

 On Fri, Sep 30, 2011 at 3:29 PM, Kyle Leber kyle.le...@gmail.com wrote:
  I've encountered what I think is a problem with subversion, but I'm not
  completely sure (and according to the online instructions I should bring it
  up here prior to filing a bug).

 Actually, the instructions on
 http://subversion.apache.org/issue-tracker.html say that you should
 send your report to users@, not dev@. So I'm adding users@. Please
 drop dev@ from any further replies.

  Basically, we're trying to merge a rather large collection of fixes back in
  our trunk.  I check out a fresh copy of the trunk, then use the merge
  syntax:
  svn merge https://path/to/my/branch .
 
  This generally churns along just fine, but we occasionally get hung up on
  medium sized binary files where the svn client jumps to 100% cpu usage and
  sits on it for 3+ hours before moving on to the next file.  These files are
  anywhere from 3-10MB in size, so not ridiculously huge.  We generally have
  these files marked as octet stream, but changing to text did not help the
  situation when we tried that.
 
  I did find an old forum discussion about a potential issue that could be
  related.  I was wondering if this was ever addressed and could it still be
  the same problem.  Link is here:
  http://www.svnforum.org/threads/36123-Slow-SVN-merge
 
  I'm using svn client 1.6.12.  I looked at the online change log up through
  the 1.7 alphas and didn't see any bug fixes that sounded relevant.

 This could be a relevant change (listed in the 1.7 release notes, not
 in the change log):

 http://subversion.apache.org/docs/release-notes/1.7.html#diff-optimizations

 Can you please try one of the 1.7 pre-release binaries, and see if it
 helps? See http://subversion.apache.org/packages.html#pre-release

 Thanks, Johan.  I tested with 1.7rc4 and it did not make any perceptible
 difference.  Anything else I can try?

Hm, that's unfortunate.

Actually, it was to be expected that this wouldn't help, because the
diff-optimizations in 1.7 only play a role when merging text files
(and diffing and blaming). And you said those
files-that-make-merge-hang are generally marked as octet-stream, and
changing them to text made no difference.

That seems to indicate that the 100% cpu usage on the client isn't
spent in the diff code (unlike the forum thread that you linked to,
where the poster tracked it down to libsvn_diff/lcs.c --- he would
definitely have been helped by the 1.7 improvements).

So there's another reason. Maybe it has something to do with (lots of)
subtree mergeinfo? Can you verify if there is a lot of svn:mergeinfo
on directories and files all over the place?
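
One rough way to check this, as a sketch only: run
'svn propget svn:mergeinfo -R --xml .' in the working copy and count the
paths that carry the property. The Python below only parses such XML. The
helper name and the sample data are made up for illustration, and the
element layout (properties / target / property) is an assumption about the
--xml output that should be verified against your client version.

```python
import xml.etree.ElementTree as ET
from typing import List


def paths_with_mergeinfo(propget_xml: str) -> List[str]:
    """Return every target path that carries an svn:mergeinfo property.

    Assumes the layout <properties><target path=...><property name=...>.
    """
    root = ET.fromstring(propget_xml)
    found = []
    for target in root.iter("target"):
        names = [prop.get("name") for prop in target.iter("property")]
        if "svn:mergeinfo" in names:
            found.append(target.get("path"))
    return found


# Made-up sample standing in for real output of the svn command.
SAMPLE = """<properties>
<target path="."><property name="svn:mergeinfo">/branches/fixes:100-250</property></target>
<target path="data/blob.bin"><property name="svn:mergeinfo">/branches/fixes/data/blob.bin:120</property></target>
</properties>"""

if __name__ == "__main__":
    paths = paths_with_mergeinfo(SAMPLE)
    # Anything other than the merge target root counts as subtree mergeinfo.
    subtree = [p for p in paths if p != "."]
    print(len(paths), "paths with mergeinfo,", len(subtree), "below the root")
```

If the list of paths below the root is long, subtree mergeinfo is at least
worth a closer look.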

Also: can you tell us what version is running on the server?

Maybe other people on this list have had similar experiences, and can
give some suggestions?

-- 
Johan


Re: svn merge operation extremely slow

2011-10-01 Thread Daniel Shahaf
Johan Corveleyn wrote on Sat, Oct 01, 2011 at 20:47:29 +0200:
 [ Please do not top-post on this list, i.e. please put your reply
 below or inline. More below ... ]
 
 On Sat, Oct 1, 2011 at 6:49 PM, Kyle Leber kyle.le...@gmail.com wrote:
  On Fri, Sep 30, 2011 at 7:15 PM, Johan Corveleyn jcor...@gmail.com wrote:
 
  On Fri, Sep 30, 2011 at 3:29 PM, Kyle Leber kyle.le...@gmail.com wrote:
   I've encountered what I think is a problem with subversion, but I'm not
   completely sure (and according to the online instructions I should bring
   it
   up here prior to filing a bug).
 
  Actually, the instructions on
  http://subversion.apache.org/issue-tracker.html say that you should
  send your report to users@, not dev@. So I'm adding users@. Please
  drop dev@ from any further replies.
 
   Basically, we're trying to merge a rather large collection of
   fixes back in our trunk.  I check out a fresh copy of the trunk,
   then use the merge syntax: svn merge https://path/to/my/branch .
  
   This generally churns along just fine, but we occasionally get
   hung up on medium sized binary files where the svn client jumps
   to 100% cpu usage and sits on it for 3+ hours before moving on to
   the next file.  These files are anywhere from 3-10MB in size, so
   not ridiculously huge.  We generally have these files marked as
   octet stream, but changing to text did not help the situation
   when we tried that.
  
   I did find an old forum discussion about a potential issue that
   could be related.  I was wondering if this was ever addressed and
   could it still be the same problem.  Link is here:
   http://www.svnforum.org/threads/36123-Slow-SVN-merge
  
   I'm using svn client 1.6.12.  I looked at the online change log
   up through the 1.7 alphas and didn't see any bug fixes that
   sounded relevant.
 
  This could be a relevant change (listed in the 1.7 release notes, not
  in the change log):
 
  http://subversion.apache.org/docs/release-notes/1.7.html#diff-optimizations
 
  Can you please try one of the 1.7 pre-release binaries, and see if it
  helps? See http://subversion.apache.org/packages.html#pre-release
 
  Thanks, Johan.  I tested with 1.7rc4 and it did not make any perceptible
  difference.  Anything else I can try?
 
 Hm, that's unfortunate.
 
 Actually, it was to be expected that this wouldn't help, because the
 diff-optimizations in 1.7 only play a role when merging text files
 (and diffing and blaming). And you said those
 files-that-make-merge-hang are generally marked as octet-stream, and
 changing them to text made no difference.
 
 That seems to indicate that the 100% cpu usage on the client isn't
 spent in the diff code (unlike the forum thread that you linked to,
 where the poster tracked it down to libsvn_diff/lcs.c --- he would
 definitely have been helped by the 1.7 improvements).
 

What does 'svn merge' do for binary files?  I checked svn_wc__merge()
a few months ago and for binary files all it knew to do was

(a) if mine == merge-left then set merged := merge-right
(b) invoke the configured diff3-cmd
(c) raise a conflict

but it didn't do any line-based merge (per Johan's second response).
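
As a sketch of the decision order in (a)-(c), and not the actual libsvn_wc
code: the function name merge_binary and the diff3_cmd callable below are
hypothetical, and the sketch assumes that the external diff3-cmd is only
tried when one is configured and that it reports failure by returning None.

```python
from typing import Callable, Optional, Tuple

# Hypothetical stand-in for an external three-way merge tool.
Diff3 = Callable[[bytes, bytes, bytes], Optional[bytes]]


def merge_binary(mine: bytes, merge_left: bytes, merge_right: bytes,
                 diff3_cmd: Optional[Diff3] = None
                 ) -> Tuple[str, Optional[bytes]]:
    """Return ("merged", content) or ("conflict", None) for a binary file."""
    # (a) The target is unchanged relative to merge-left: take merge-right.
    if mine == merge_left:
        return ("merged", merge_right)
    # (b) Otherwise hand the three versions to the configured diff3-cmd.
    if diff3_cmd is not None:
        merged = diff3_cmd(mine, merge_left, merge_right)
        if merged is not None:
            return ("merged", merged)
    # (c) Nothing else to try: flag a conflict.
    return ("conflict", None)
```

Nothing in this sketch computes a line-based diff; the only work on the file
contents is a byte-for-byte comparison.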

 So there's another reason. Maybe it has something to do with (lots of)
 subtree mergeinfo? Can you verify if there is a lot of svn:mergeinfo
 on directories and files all over the place?
 
 Also: can you tell us what version is running on the server?
 
 Maybe other people on this list have had similar experiences, and can
 give some suggestions?
 

Another line of thought: the algorithm for computing binary deltas
changed a few years ago, and I recall reading (on old bug reports?)
about some cases in which the delta combiner would be inefficient for
deltas generated by old servers --- i.e., it would be expensive to 'svn
cat' files that were committed to old servers in repositories that
haven't been dumped/loaded by a newer server.

In any case: can you run the merge under a profiler and tell us in what
function(s) time is spent?

Daniel

 -- 
 Johan


Re: svn merge operation extremely slow

2011-10-01 Thread Kyle Leber
Thanks, Johan.  I tested with 1.7rc4 and it did not make any perceptible
difference.  Anything else I can try?




Re: svn merge operation extremely slow

2011-10-01 Thread Kyle Leber
What method of profiling do you recommend?  I have used gprof previously
(it's been a while) but am not familiar with the subversion project source
code and build setup.  Is there an online guide or wiki describing the
preferred setup for performing this?

Kyle



Re: svn merge operation extremely slow

2011-10-01 Thread Daniel Shahaf
gprof is what I'm familiar with (nutshell: compile with 'gcc -pg' and
read gmon.out).  There are no specific profiling docs for svn; if you
need more specific advice please post to the dev@ list.  Thanks!

Kyle Leber wrote on Sat, Oct 01, 2011 at 19:33:10 -0400:
 What method of profiling do you recommend?  I have used gprof previously
 (it's been a while) but am not familiar with the subversion project source
 code and build setup.  Is there an online guide or wiki describing the
 preferred setup for performing this?
 
 Kyle
 


Re: svn merge operation extremely slow

2011-09-30 Thread Johan Corveleyn
On Fri, Sep 30, 2011 at 3:29 PM, Kyle Leber kyle.le...@gmail.com wrote:
 I've encountered what I think is a problem with subversion, but I'm not
 completely sure (and according to the online instructions I should bring it
 up here prior to filing a bug).

Actually, the instructions on
http://subversion.apache.org/issue-tracker.html say that you should
send your report to users@, not dev@. So I'm adding users@. Please
drop dev@ from any further replies.

 Basically, we're trying to merge a rather large collection of fixes back in
 our trunk.  I check out a fresh copy of the trunk, then use the merge
 syntax:
 svn merge https://path/to/my/branch .

 This generally churns along just fine, but we occasionally get hung up on
 medium sized binary files where the svn client jumps to 100% cpu usage and
 sits on it for 3+ hours before moving on to the next file.  These files are
 anywhere from 3-10MB in size, so not ridiculously huge.  We generally have
 these files marked as octet stream, but changing to text did not help the
 situation when we tried that.

 I did find an old forum discussion about a potential issue that could be
 related.  I was wondering if this was ever addressed and could it still be
 the same problem.  Link is here:
 http://www.svnforum.org/threads/36123-Slow-SVN-merge

 I'm using svn client 1.6.12.  I looked at the online change log up through
 the 1.7 alphas and didn't see any bug fixes that sounded relevant.

This could be a relevant change (listed in the 1.7 release notes, not
in the change log):
http://subversion.apache.org/docs/release-notes/1.7.html#diff-optimizations

Can you please try one of the 1.7 pre-release binaries, and see if it
helps? See http://subversion.apache.org/packages.html#pre-release

Cheers,
-- 
Johan