Re: svn merge operation extremely slow
Yup, the trunk version has empty properties. The branch version has:

  svn:mime-type application/octet-stream

On Sun, Oct 2, 2011 at 8:10 PM, Daniel Shahaf wrote:
> Kyle Leber wrote on Sun, Oct 02, 2011 at 20:05:19 -0400:
> > Johan,
> >
> > I did a little more digging. There were a few different places where
> > svn seems to get hung up, so I ran the gprof report on just the first
> > one (the merge takes hours otherwise). In this particular case, svn
> > prints out that it is merging from a small text file while it is
> > hanging for more than a minute @ 100% CPU. When I examine "lsof",
> > however, I see it actually has a different file open. This one is a
> > large (15 MB) "binary" file. It turns out this binary file did not
> > have a property in the trunk (which I think means it's treated as
> > text, right?). But in the branch it was marked as octet stream. So
> > perhaps svn is doing a text-based diff on this binary file because
> > it used to be incorrectly marked as text?
>
> If either side is marked as binary then svn will defer to the "Use
> merge-right if merge-left == base, else conflict" algorithm.
>
> Could you share the value of 'svn proplist --verbose' on both files?
>
> Thanks,
>
> Daniel
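As a rough illustration of the property check being discussed here, the sketch below assumes that a file counts as text when svn:mime-type is unset or starts with "text/", and as binary otherwise. The function names are hypothetical, and the real client may special-case a few more types, so this is not Subversion's actual code.

```python
from typing import Optional


def is_binary(mime_type: Optional[str]) -> bool:
    """Guess whether an svn:mime-type value marks a file as binary.

    Assumption: no property at all means "treat as text".
    """
    if mime_type is None:
        return False
    return not mime_type.startswith("text/")


def needs_binary_merge(trunk_mime: Optional[str],
                       branch_mime: Optional[str]) -> bool:
    """Binary handling applies if either side is marked as binary."""
    return is_binary(trunk_mime) or is_binary(branch_mime)


if __name__ == "__main__":
    # The situation reported in this thread: empty properties on trunk,
    # application/octet-stream on the branch.
    print(needs_binary_merge(None, "application/octet-stream"))
```

Under these assumptions the reported pair of files would be expected to take the binary path, which is why time spent in the text diff code is surprising.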
Re: svn merge operation extremely slow
Kyle Leber wrote on Sun, Oct 02, 2011 at 20:05:19 -0400:
> Johan,
>
> I did a little more digging. There were a few different places where
> svn seems to get hung up, so I ran the gprof report on just the first
> one (the merge takes hours otherwise). In this particular case, svn
> prints out that it is merging from a small text file while it is
> hanging for more than a minute @ 100% CPU. When I examine "lsof",
> however, I see it actually has a different file open. This one is a
> large (15 MB) "binary" file. It turns out this binary file did not
> have a property in the trunk (which I think means it's treated as
> text, right?). But in the branch it was marked as octet stream. So
> perhaps svn is doing a text-based diff on this binary file because
> it used to be incorrectly marked as text?

If either side is marked as binary then svn will defer to the "Use
merge-right if merge-left == base, else conflict" algorithm.

Could you share the value of 'svn proplist --verbose' on both files?

Thanks,

Daniel
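The binary rule quoted above can be sketched in a few lines. This is only an illustration of the rule as worded in this message, with hypothetical names (merge_binary, base, merge_left, merge_right); it is not taken from the Subversion source.

```python
from typing import Tuple


def merge_binary(base: bytes, merge_left: bytes,
                 merge_right: bytes) -> Tuple[str, bytes]:
    """Apply "use merge-right if merge-left == base, else conflict".

    Returns a (status, content) pair. No byte-level diff is attempted:
    the only work is one equality comparison.
    """
    if merge_left == base:
        # The local file still matches the left side of the merge source,
        # so the right side can replace it wholesale.
        return ("merged", merge_right)
    # Local content diverged: keep it and report a conflict.
    return ("conflict", base)


if __name__ == "__main__":
    print(merge_binary(b"v1", b"v1", b"v2"))       # takes merge-right
    print(merge_binary(b"local", b"v1", b"v2"))    # conflict
```

The point of the sketch is that this path costs at most a comparison of the file contents, so it should not be able to account for minutes of CPU time on a 15 MB file.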
Re: svn merge operation extremely slow
Johan,

I did a little more digging. There were a few different places where svn
seems to get hung up, so I ran the gprof report on just the first one (the
merge takes hours otherwise). In this particular case, svn prints out that
it is merging from a small text file while it is hanging for more than a
minute @ 100% CPU. When I examine "lsof", however, I see it actually has a
different file open. This one is a large (15 MB) "binary" file. It turns
out this binary file did not have a property in the trunk (which I think
means it's treated as text, right?). But in the branch it was marked as
octet stream. So perhaps svn is doing a text-based diff on this binary
file because it used to be incorrectly marked as text?

Side-note: The contents of this 15MB file are actually ASCII, but we do
want it treated as binary b/c line-based merges are never valid.

Another snippet from the same gprof report is below.

Cheers,
Kyle

---
                0.00  144.03         27/27          do_text_merge [10]
[11]    95.9    0.00  144.03         27         svn_diff_file_diff3_2 [11]
                0.01  144.02         27/27          svn_diff_diff3_2 [12]
                0.00    0.00         27/5723        apr_pool_destroy [833]
                0.00    0.00         27/6430        svn_pool_create_ex [1558]
---
                0.01  144.02         27/27          svn_diff_file_diff3_2 [11]
[12]    95.9    0.01  144.02         27         svn_diff_diff3_2 [12]
                8.64  128.73         54/56          svn_diff(long, char, short) [13]
                0.01    5.09      21014/21014       svn_diff__resolve_conflict [15]
                0.03    1.51         81/81          svn_diff__get_tokens [25]
                0.00    0.01         27/27          datasources_open [272]
                0.01    0.00         81/85          svn_diff__get_token_counts [284]
                0.00    0.00      42065/2476341     apr_palloc [136]
                0.00    0.00         27/27          svn_diff__tree_create [1235]
                0.00    0.00         54/5723        apr_pool_destroy [833]
                0.00    0.00         27/27          token_discard_all [1282]
                0.00    0.00         54/6430        svn_pool_create_ex [1558]
                0.00    0.00         27/27          svn_diff__get_node_count [1911]
---
                0.32    4.77          2/56          svn_diff__resolve_conflict [15]
                8.64  128.73         54/56          svn_diff_diff3_2 [12]
[13]    94.8    8.96  133.49         56         svn_diff(long, char, short) [13]
              133.49    0.00 2002891836/2002891836  svn_diff__snake [14]
                0.00    0.00        224/2476341     apr_palloc [136]
                0.00    0.00         64/64          prepend_lcs [1103]
                0.00    0.00         56/56          svn_diff__lcs_reverse [1875]
---
              133.49    0.00 2002891836/2002891836  svn_diff(long, char, short) [13]
[14]    88.9  133.49    0.00 2002891836         svn_diff__snake [14]
                0.00    0.00     168906/2476341     apr_palloc [136]

On Sun, Oct 2, 2011 at 5:58 PM, Johan Corveleyn wrote:
> On Sun, Oct 2, 2011 at 11:08 PM, Kyle Leber wrote:
> > I was able to capture a profile from svn (after remembering I have to
> > link statically). I compiled with "-pg -O0". Here is the top of the
> > file:
> >
> > Each sample counts as 0.01 seconds.
> >   %   cumulative   self                self     total
> >  time   seconds   seconds      calls  s/call   s/call  name
> >  88.88    133.49   133.49 2002891836    0.00     0.00  svn_diff__snake
> >   5.97    142.45     8.96         56    0.16     2.54  svn_diff(long, char, short)
> >   1.98    145.42     2.97    4163001    0.00     0.00  MD5Transform
> >   0.41    146.04     0.62    4163001    0.00     0.00  Decode
>
> What's it doing in svn_diff__snake (or svn_diff for that matter)? That
> should only be hit when svn is doing textual merges (in which case it
> must do rather expensive diff calculations --- I'm sure those
> calculations can go ballistic when being confronted with a large
> binary file, not consisting of text lines).
>
> Are you sure those files were actually marked as binary (svn:mime-type
> of application/octet-stream or something else non-texty)?
>
> --
> Johan
Re: svn merge operation extremely slow
On Sun, Oct 2, 2011 at 11:08 PM, Kyle Leber wrote:
> I was able to capture a profile from svn (after remembering I have to
> link statically). I compiled with "-pg -O0". Here is the top of the
> file:
>
> Each sample counts as 0.01 seconds.
>   %   cumulative   self                self     total
>  time   seconds   seconds      calls  s/call   s/call  name
>  88.88    133.49   133.49 2002891836    0.00     0.00  svn_diff__snake
>   5.97    142.45     8.96         56    0.16     2.54  svn_diff(long, char, short)
>   1.98    145.42     2.97    4163001    0.00     0.00  MD5Transform
>   0.41    146.04     0.62    4163001    0.00     0.00  Decode

What's it doing in svn_diff__snake (or svn_diff for that matter)? That
should only be hit when svn is doing textual merges (in which case it
must do rather expensive diff calculations --- I'm sure those
calculations can go ballistic when being confronted with a large
binary file, not consisting of text lines).

Are you sure those files were actually marked as binary (svn:mime-type
of application/octet-stream or something else non-texty)?

--
Johan
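To give a feel for why a line-based diff can "go ballistic" on inputs that share few lines, here is a small sketch. It uses the textbook Myers O(ND) shortest-edit-script search, swapped in purely for illustration; Subversion's libsvn_diff has its own LCS implementation, and the function name shortest_edit and the step counter are assumptions of this sketch, not svn internals.

```python
from typing import Dict, Sequence, Tuple


def shortest_edit(a: Sequence[str], b: Sequence[str]) -> Tuple[int, int]:
    """Return (edit distance, number of diagonal-extension steps).

    Each pass through the inner loop extends one diagonal as far as the
    two inputs match, loosely comparable to one "snake" call.
    """
    n, m = len(a), len(b)
    v: Dict[int, int] = {1: 0}   # furthest x reached on each diagonal k
    steps = 0
    for d in range(n + m + 1):
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]          # step down (insertion)
            else:
                x = v[k - 1] + 1      # step right (deletion)
            y = x - k
            # Follow matching lines along the diagonal.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            steps += 1
            v[k] = x
            if x >= n and y >= m:
                return d, steps
    return n + m, steps


if __name__ == "__main__":
    same = ["line %d" % i for i in range(200)]
    other = ["other %d" % i for i in range(200)]
    print(shortest_edit(same, same))    # identical inputs: one step
    print(shortest_edit(same, other))   # nothing in common: many steps
```

The step count grows roughly with the square of the number of differing lines in this variant, so two large files with little in common are the worst case for this kind of search.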
Re: svn merge operation extremely slow
Kyle Leber wrote on Sun, Oct 02, 2011 at 17:08:16 -0400:
> Is it OK to attach the full report to this user list? The resulting
> text file is 1.3MB and I wasn't sure if the list would tolerate an
> attachment of that size.

It would be better to upload it somewhere and send a link to this list,
or to digest the report and post only the highlights to this list (as
you have done).
Re: svn merge operation extremely slow
On 10/2/2011 2:08 PM, Kyle Leber wrote:
> I was able to capture a profile from svn (after remembering I have to
> link statically). I compiled with "-pg -O0". Here is the top of the
> file:
>
> Each sample counts as 0.01 seconds.
>   %   cumulative   self                self     total
>  time   seconds   seconds      calls  s/call   s/call  name
>  88.88    133.49   133.49 2002891836    0.00     0.00  svn_diff__snake
>   5.97    142.45     8.96         56    0.16     2.54  svn_diff(long, char, short)
>   1.98    145.42     2.97    4163001    0.00     0.00  MD5Transform
>   0.41    146.04     0.62    4163001    0.00     0.00  Decode
>
> Is it OK to attach the full report to this user list? The resulting
> text file is 1.3MB and I wasn't sure if the list would tolerate an
> attachment of that size.

It's a weekend, so you might not get a lot of replies from people who
know SVN source code, but it's likely that the full report won't be
needed. There are two billion (!) calls to svn_diff__snake(), and the
question is why there are so many.

It might help the devs if you pasted in the entries for functions which
directly called svn_diff__snake() (quite possibly svn_diff() only) and
perhaps the functions which svn_diff__snake() called directly (none of
any significance, if I read the above report correctly). This should be
only a few dozen lines of the report. Note that you'll have to trace
through the report (the top-level function is listed first, followed by
its children, grandchildren, etc.) to find the entries for these
functions.

I have a suspicion that one of the devs will be able to identify the
issue from just the above report, but a little more information might
turn out to be helpful. They certainly won't need to see information
for all of the zillion functions in SVN.

--
David Chapman dcchap...@acm.org
Chapman Consulting -- San Jose, CA
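Pulling the relevant entries out of a 1.3MB report by hand is tedious, so here is a minimal sketch of how it could be scripted. It assumes the usual gprof call-graph layout, where entries are separated by lines of dashes and the primary line of each entry starts with its "[index]"; the helper name entries_for is hypothetical.

```python
import re
from typing import List


def entries_for(report: str, func: str) -> List[str]:
    """Return the call-graph entries whose primary line names `func`.

    Within an entry, the lines above the primary line are the callers
    and the lines below it are the callees.
    """
    # Split the report on separator lines made only of dashes.
    blocks = re.split(r"^-{3,}[ \t]*$", report, flags=re.MULTILINE)
    name = re.compile(r"(?<!\w)" + re.escape(func) + r"(?!\w)")
    wanted = []
    for block in blocks:
        for line in block.splitlines():
            # Only the primary line starts with "[index]" in column 0.
            if line.startswith("[") and name.search(line):
                wanted.append(block.strip("\n"))
                break
    return wanted


if __name__ == "__main__":
    # "gprof-report.txt" is a placeholder for wherever the report was saved.
    with open("gprof-report.txt") as fh:
        for entry in entries_for(fh.read(), "svn_diff__snake"):
            print(entry)
            print("-" * 47)
```

Running it once for svn_diff__snake and once for svn_diff should yield roughly the few dozen lines suggested above, without posting the whole report.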
Re: svn merge operation extremely slow
I was able to capture a profile from svn (after remembering I have to
link statically). I compiled with "-pg -O0". Here is the top of the file:

Each sample counts as 0.01 seconds.
  %   cumulative   self                self     total
 time   seconds   seconds      calls  s/call   s/call  name
 88.88    133.49   133.49 2002891836    0.00     0.00  svn_diff__snake
  5.97    142.45     8.96         56    0.16     2.54  svn_diff(long, char, short)
  1.98    145.42     2.97    4163001    0.00     0.00  MD5Transform
  0.41    146.04     0.62    4163001    0.00     0.00  Decode

Is it OK to attach the full report to this user list? The resulting text
file is 1.3MB and I wasn't sure if the list would tolerate an attachment
of that size.

Cheers,
Kyle

On Sat, Oct 1, 2011 at 7:55 PM, Daniel Shahaf wrote:
> gprof is what I'm familiar with (nutshell: compile with 'gcc -pg' and
> read gmon.out). There are no specific profiling docs for svn; if you
> need more specific advice please post to the dev@ list. Thanks!
>
> Kyle Leber wrote on Sat, Oct 01, 2011 at 19:33:10 -0400:
> > What method of profiling do you recommend? I have used gprof
> > previously (it's been a while) but am not familiar with the
> > subversion project source code and build setup. Is there an online
> > guide or wiki describing the preferred setup for performing this?
> >
> > Kyle
> >
> > On Sat, Oct 1, 2011 at 3:10 PM, Daniel Shahaf wrote:
> >
> > > Johan Corveleyn wrote on Sat, Oct 01, 2011 at 20:47:29 +0200:
> > > > [ Please do not top-post on this list, i.e. please put your reply
> > > > below or inline. More below ... ]
> > > >
> > > > On Sat, Oct 1, 2011 at 6:49 PM, Kyle Leber wrote:
> > > > > On Fri, Sep 30, 2011 at 7:15 PM, Johan Corveleyn
> > > > > <jcor...@gmail.com> wrote:
> > > > >>
> > > > >> On Fri, Sep 30, 2011 at 3:29 PM, Kyle Leber wrote:
> > > > >> > I've encountered what I think is a problem with subversion,
> > > > >> > but I'm not completely sure (and according to the online
> > > > >> > instructions I should bring it up here prior to filing a
> > > > >> > bug).
> > > > >>
> > > > >> Actually, the instructions on
> > > > >> http://subversion.apache.org/issue-tracker.html say that you
> > > > >> should send your report to users@, not dev@. So I'm adding
> > > > >> users@. Please drop dev@ from any further replies.
> > > > >>
> > > > >> > Basically, we're trying to merge a rather large collection of
> > > > >> > fixes back in our trunk. I check out a fresh copy of the
> > > > >> > trunk, then use the merge syntax:
> > > > >> > svn merge https://path/to/my/branch.
> > > > >> >
> > > > >> > This generally churns along just fine, but we occasionally
> > > > >> > get hung up on medium sized binary files where the svn client
> > > > >> > jumps to 100% cpu usage and sits on it for 3+ hours before
> > > > >> > moving on to the next file. These files are anywhere from
> > > > >> > 3-10MB in size, so not ridiculously huge. We generally have
> > > > >> > these files marked as octet stream, but changing to text did
> > > > >> > not help the situation when we tried that.
> > > > >> >
> > > > >> > I did find an old forum discussion about a potential issue
> > > > >> > that could be related. I was wondering if this was ever
> > > > >> > addressed and could it still be the same problem. Link is
> > > > >> > here:
> > > > >> > http://www.svnforum.org/threads/36123-Slow-SVN-merge
> > > > >> >
> > > > >> > I'm using svn client 1.6.12. I looked at the online change
> > > > >> > log up through the 1.7 alphas and didn't see any bug fixes
> > > > >> > that sounded relevant.
> > > > >>
> > > > >> This could be a relevant change (listed in the 1.7 release
> > > > >> notes, not in the change log):
> > > > >>
> > > > >> http://subversion.apache.org/docs/release-notes/1.7.html#diff-optimizations
> > > > >>
> > > > >> Can you please try one of the 1.7 pre-release binaries, and see
> > > > >> if it helps? See
> > > > >> http://subversion.apache.org/packages.html#pre-release
> > > > >>
> > > > > Thanks, Johan. I tested with 1.7rc4 and it did not make any
> > > > > perceptible difference. Anything else I can try?
> > > >
> > > > Hm, that's unfortunate.
> > > >
> > > > Actually, it was to be expected that this wouldn't help, because
> > > > the diff-optimizations in 1.7 only play a role when merging text
> > > > files (and diffing and blaming). And you said those
> > > > "files-that-make-merge-hang" are generally marked as
> > > > octet-stream, and changing them to text made no difference.
> > > >
> > > > That seems to indicate that the 100% cpu usage on the client
> > > > isn't spent in the diff code (unlike the forum thread that you
> > > > linked to, where the poster tracked it down to libsvn_diff/lcs.c
> > > > --- he would definitely have been helped by the 1.7
> > > > improvements).
> > > >
> > >
> > > What does 'svn merge' do for binary files? I checked
> > > svn_wc__merge() a few months ago and for binary files all it knew
> > > to do was
> >
Re: File access control
Hello Grant,
on Sunday, October 2, 2011 at 03:07 you wrote:

> All of the big enterprise websites allow each of their developers to
> check out a full working copy of the company code with only an NDA/NCC
> to protect them?

Mostly, yes. And why do you think this is not enough?

> It would be so easy for any of them to use, sell, or give the code
> away, or even to accidentally allow an unauthorized person access to
> it.

No hard feelings, but nothing in your thread suggests that whatever you
are trying to protect is interesting enough to anyone out there to
justify the time you have spent thinking about how to protect your code
against God and the world. You speak of hiring a developer. If that is
not the son of a friend who worked with computers in school but a
professional, he will act like a professional if he wants to get paid
without problems.

Best regards,

Thorsten Schöning

--
Thorsten Schöning
AM-SoFT IT-Systeme - Hameln | Potsdam | Leipzig

Phone: Potsdam: 0331-743881-0
E-Mail: tschoen...@am-soft.de
Web: http://www.am-soft.de

AM-SoFT GmbH IT-Systeme, Konsumhof 1-5, 14482 Potsdam
Amtsgericht Potsdam HRB 21278 P, Managing Director: Andreas Muchow