Re: [serf-dev] Re: What stands between us and branching 1.7?
On Fri, Jan 21, 2011 at 7:59 AM, Justin Erenkrantz <jus...@erenkrantz.com> wrote:
> On Thu, Jan 20, 2011 at 3:18 PM, Lieven Govaerts <svn...@mobsol.be> wrote:
>>> Greg or Lieven, any thoughts here? -- justin
>> At least the one rev that fixes this issue; I don't know if the others already work in all scenarios. I'll look at it this weekend and make a release.
> Woohoo. Thanks. I'll be sure to test it. Let me know if you need any help. -- justin

I've prepared branch 0.7.x with the changes that are safe to go into serf 0.7.1. I'm running some tests on Linux, Windows 7 and Mac OS X. On the latter, svn trunk currently doesn't build for me (is_atomicity_error symbol not found), but that doesn't seem to be related to serf.

I'd appreciate some more testing before I make the release later this week. The code to get is: http://serf.googlecode.com/svn/branches/0.7.x at r1427.

Lieven
Re: [serf-dev] Re: What stands between us and branching 1.7?
On Sun, Jan 23, 2011 at 02:41:49PM +0100, Lieven Govaerts wrote:
> I've prepared branch 0.7.x with the changes that are safe to go into serf 0.7.1. [...]
> I'd appreciate some more testing before I make the release later this week. The code to get is: http://serf.googlecode.com/svn/branches/0.7.x at r1427.

The svn-openbsd-i386 buildbot is using serf 0.7.x. The working copy isn't updated automatically, but I've just removed it, so it should get a fresh checkout during the next build and run all tests with the newest serf.

Stefan
Re: [serf-dev] Re: What stands between us and branching 1.7?
Lieven Govaerts wrote on Sun, Jan 23, 2011 at 14:41:49 +0100:
> I'm running some tests on Linux, Windows 7 and Mac OS X. On the latter, svn trunk currently doesn't build for me (is_atomicity_error symbol not found), but that doesn't seem to be related to serf.

That's odd: in HEAD, that symbol only appears in ra/util.c, and it's declared static there.
Re: [PATCH] extend svn_subst_translate_string2() with a REPAIR parameter
Ping. I haven't yet received feedback on this patch.

On Fri, Jan 14, 2011 at 5:57 PM, Danny Trebbien <dtrebb...@gmail.com> wrote:
> On Fri, Jan 14, 2011 at 12:32 PM, Danny Trebbien <dtrebb...@gmail.com> wrote:
>> Attached are a patch and log message to extend svn_subst_translate_string2() with a new parameter, REPAIR. This allows the caller to customize whether the translation repairs line endings.
> I noticed a slight problem with the extra asserts in test_svn_subst_translate_string2() of subversion/tests/libsvn_subr/subst_translate-test.c that this patch adds. When REPAIR is FALSE and VALUE has inconsistent line endings, an SVN_ERR_IO_INCONSISTENT_EOL error is returned. If, however, a future change were to cause svn_subst_translate_string2() to return SVN_NO_ERROR, then the assert that the error was SVN_ERR_IO_INCONSISTENT_EOL would dereference a NULL pointer.
> Attached are the patch and log message for version 1.1 of this patch. The new patch corrects this flaw. The new log message corrects a grammar mistake.
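[Editorial note: the REPAIR semantics described above can be sketched in standalone C. This is an illustration of the behavior being proposed, not Subversion's actual implementation; translate_eol and its return convention are made up for the example.]

```c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

/* Illustration of the REPAIR semantics: translate every EOL marker
 * (\r\n, \r, \n) in `src` to `eol`. If `repair` is 0 and the input
 * mixes EOL styles, fail instead (analogous to returning
 * SVN_ERR_IO_INCONSISTENT_EOL). Returns a malloc'd string, or NULL
 * on inconsistent EOLs with repair disabled. */
static char *translate_eol(const char *src, const char *eol, int repair)
{
    char *out = malloc(strlen(src) * 2 + 1);  /* worst case: \n -> \r\n */
    char *w = out;
    const char *seen = NULL;                  /* first EOL style seen */

    for (const char *p = src; *p; ) {
        const char *style;
        if (p[0] == '\r' && p[1] == '\n') { style = "\r\n"; p += 2; }
        else if (p[0] == '\r')            { style = "\r";   p += 1; }
        else if (p[0] == '\n')            { style = "\n";   p += 1; }
        else { *w++ = *p++; continue; }

        if (seen == NULL)
            seen = style;
        else if (!repair && strcmp(seen, style) != 0) {
            free(out);
            return NULL;  /* inconsistent EOLs and repair disabled */
        }
        w += sprintf(w, "%s", eol);           /* emit the target EOL */
    }
    *w = '\0';
    return out;
}
```

With repair enabled, mixed input like "a\r\nb\rc\n" normalizes cleanly; with repair disabled, the same input is rejected, while input with consistent EOLs still translates.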
Re: [PATCH] allow svnsync to translate non-UTF-8 log messages to UTF-8 (v. 4)
Ping. I haven't yet received feedback on this patch.

On Fri, Jan 14, 2011 at 1:01 PM, Danny Trebbien <dtrebb...@gmail.com> wrote:
> Attached are version 4 of the patch and log message. Also attached are two TGZ archives containing the six DUMP files that the new svnsync and svnrdump tests depend upon. These TGZ archives can be extracted directly into your working copy of trunk.
> Note that this patch depends on http://thread.gmane.org/gmane.comp.version-control.subversion.devel/125056, which must be committed first.
Re: [serf-dev] Re: What stands between us and branching 1.7?
On 23.01.2011 14:41, Lieven Govaerts wrote:
> I've prepared branch 0.7.x with the changes that are safe to go into serf 0.7.1. [...]
> I'd appreciate some more testing before I make the release later this week. The code to get is: http://serf.googlecode.com/svn/branches/0.7.x at r1427.

I've used a build from serf trunk for a while now without problems. The serf log indicates that r1416 has been merged, but r1417 has not. Is there a reason why this memory leak fix hasn't been merged to 0.7.x?

Stefan
-- 
"De Chelonian Mobile" - TortoiseSVN - The coolest Interface to (Sub)Version Control - http://tortoisesvn.net
Re: [serf-dev] Re: What stands between us and branching 1.7?
On Sun, Jan 23, 2011 at 3:59 PM, Stefan Küng <tortoise...@gmail.com> wrote:
> I've used a build from serf trunk for a while now without problems. The serf log indicates that r1416 has been merged, but r1417 has not. Is there a reason why this memory leak fix hasn't been merged to 0.7.x?

r1417 is part of a refactoring on trunk that moves the connection-related code into a separate bucket (httpconn). I doubt that that code is really stable yet, so I'd rather not merge it to 0.7.x.

Lieven
Re: [serf-dev] Re: What stands between us and branching 1.7?
On Sun, Jan 23, 2011 at 5:41 AM, Lieven Govaerts <svn...@mobsol.be> wrote:
> I'd appreciate some more testing before I make the release later this week. The code to get is: http://serf.googlecode.com/svn/branches/0.7.x at r1427.

% uname -v
Darwin Kernel Version 10.5.0: Fri Nov 5 23:20:39 PDT 2010; root:xnu-1504.9.17~1/RELEASE_I386

(OS X 10.6.5)

% ln -s ~/work/serf-0.7.x/test/serftestca.pem test/serftestca.pem

(I use VPATH builds, so I had to bring serftestca.pem over into the build dir to get the 2 SSL tests to pass.)

% ./test/test_all
.
OK (17 tests)

And 0.7.x builds fine against the latest SVN 1.6.x branch, and basic co/ls functionality via ra_serf seems good. Looks good! Thanks! -- justin
Re: [serf-dev] Re: What stands between us and branching 1.7?
On 23.01.2011 16:09, Lieven Govaerts wrote:
>> I've used a build from serf trunk for a while now without problems. The serf log indicates that r1416 has been merged, but r1417 has not. Is there a reason why this memory leak fix hasn't been merged to 0.7.x?
> r1417 is part of a refactoring on trunk that moves the connection-related code into a separate bucket (httpconn). I doubt that that code is really stable yet, so I'd rather not merge it to 0.7.x.

Ok, tested with serf from the 0.7.x branch: memory use still rises more than with neon, indicating that there's still some (small) memory leak somewhere. But checkouts and updates of even larger projects now succeed without using up all available memory as they did before. To compare: a checkout of the TSVN source including all externals uses about 40MB more RAM with serf than with neon. I'd say that's ok.

Stefan
Re: [serf-dev] Re: What stands between us and branching 1.7?
On Sun, Jan 23, 2011 at 8:42 AM, Stefan Küng <tortoise...@gmail.com> wrote:
> Ok, tested with serf from the 0.7.x branch: memory use still rises more than with neon, indicating that there's still some (small) memory leak somewhere. But checkouts and updates of even larger projects succeed without using up all available memory as before. To compare: a checkout of the TSVN source including all externals uses about 40MB more RAM with serf than with neon. I'd say that's ok.

Does the memory keep going up? Or does it reach a steady point? I'd expect ra_serf to use a bit more memory than ra_neon, as it has to manage a lot more than neon does. As a point of reference, on Mac OS X 10.6, svn 1.6.x with ra_serf checking out svn trunk peaks at 81MB, while ra_neon peaks at 12MB. HTH. -- justin
Tuning recommendations for Apache HTTP Server w/ra_serf clients (MaxKeepAliveRequests)
I think this probably belongs in the book, but I don't know exactly where... so I'll ensure that it's noted again and let someone else figure out where it should live.

As we start to have more users on ra_serf, we should also recommend that httpd server admins bump up their MaxKeepAliveRequests from 100 (the default) to at least 1000. This'll help ra_serf utilize TCP connections more efficiently, as ra_serf issues 2 HTTP requests per file that it needs to check out or update. ra_serf has heuristics so that it'll work gracefully under the default settings (and determine what the value actually is), but you'll get better network performance if you can utilize fewer TCP connections. This way, ra_serf will only need to open a new connection every 500 files rather than every 50. (Honestly, there is really no reason that it couldn't be 1.)

If we also have a tuning section, we should remind folks to also enable mod_deflate (SetOutputFilter DEFLATE in their Location block), as that'll help a bit when transferring XML content back and forth. mod_deflate comes at a slight latency penalty, but for folks with slower connections that'll be offset by the responses taking less time to transfer overall.

FWIW, I've already asked ASF infra to bump up the MaxKeepAliveRequests setting for svn.apache.org. Thanks! -- justin
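[Editorial note: the two recommendations above can be sketched as an httpd config fragment. This is a hedged illustration; the Location path and SVNParentPath value are made up, and 1000 is just the floor suggested in the mail.]

```apache
# Allow many requests per TCP connection. ra_serf issues roughly
# 2 HTTP requests per file, so 1000 covers ~500 files per connection
# instead of the ~50 you get with the default of 100.
MaxKeepAliveRequests 1000

<Location /svn>
  DAV svn
  SVNParentPath /var/svn/repos

  # Compress responses; the XML REPORT bodies compress well.
  # Slight CPU/latency cost, offset on slower client connections.
  SetOutputFilter DEFLATE
</Location>
```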
Re: [serf-dev] Re: What stands between us and branching 1.7?
On 23.01.2011 18:06, Justin Erenkrantz wrote:
> Does the memory keep going up? Or does it reach a steady point? I'd expect ra_serf to use a bit more memory than ra_neon, as it has to manage a lot more than neon does. As a point of reference, on Mac OS X 10.6, svn 1.6.x with ra_serf checking out svn trunk peaks at 81MB, while ra_neon peaks at 12MB.

I can't really say for sure whether serf memory keeps going up or not. I was testing with TSVN, and there the memory has to keep going up, since all the output is stored in memory for the UI controls to show to the user. I can only say for sure that the memory for both neon and serf goes up at approximately the same rate. Only with serf, memory goes up very fast at the beginning, until it is about 40-50MB above neon, but then it keeps that 'distance' (well, almost: the distance increases a little, but not much - around 8 kbytes per two seconds) until the end.

Stefan
Re: [serf-dev] Re: What stands between us and branching 1.7?
On Sun, Jan 23, 2011 at 6:06 PM, Justin Erenkrantz <jus...@erenkrantz.com> wrote:
> Does the memory keep going up? Or does it reach a steady point? I'd expect ra_serf to use a bit more memory than ra_neon, as it has to manage a lot more than neon does. As a point of reference, on Mac OS X 10.6, svn 1.6.x with ra_serf checking out svn trunk peaks at 81MB, while ra_neon peaks at 12MB.

I have been looking at ra_serf's memory usage with svn trunk at the last SVN days in Berlin, and found that most memory was allocated and held in the sqlite code. Things have changed since then, though. Can anyone test with svn 1.6.x to see how it uses memory?

Lieven
Re: Tuning recommendations for Apache HTTP Server w/ra_serf clients (MaxKeepAliveRequests)
On Sun, Jan 23, 2011 at 09:27:18AM -0800, Justin Erenkrantz wrote:
> If we also have a tuning section, we should remind folks to also enable mod_deflate (SetOutputFilter DEFLATE in their Location block), as that'll help a bit when transferring XML content back and forth. mod_deflate comes at a slight latency penalty, but for folks with slower connections that'll be offset by the responses taking less time to transfer overall.

Has this problem been fixed, then? http://svn.haxx.se/dev/archive-2009-08/0274.shtml

I don't think we should recommend mod_deflate if that problem still exists. I don't think that thread came to a conclusion. Greg was implying the bug was in mod_deflate rather than svn: http://svn.haxx.se/dev/archive-2009-08/0290.shtml

Stefan
Re: [serf-dev] Re: What stands between us and branching 1.7?
On Sun, Jan 23, 2011 at 10:22 AM, Lieven Govaerts <svn...@mobsol.be> wrote:
> Things have changed since then though. Can anyone test with svn 1.6.x to see how it uses memory?

For ra_serf, I'm wondering if we're creating an additional pool that isn't necessary - namely the editor_pool. I've done some light testing with the ra_serf patch below, which does the following things:

- Create a separate pool hierarchy for files
- Combine the per-file editor pool with the regular pool for the file

For serf itself, I've also switched zlib to use the bucket allocator - which also helps with churn, as every call to inflateInit/inflateEnd invokes malloc/free - so we can save quite a bit there. Overall, this seems to reduce the number of 20KB allocations substantially... but I'm not sure it does much for the overall memory usage.

Not sure if/when I'll pick this up again... so patches below if someone else wants to run with it. -- justin

Index: update.c
===================================================================
--- update.c    (revision 1062462)
+++ update.c    (working copy)
@@ -107,6 +107,7 @@
   /* controlling dir baton - this is only created in open_dir() */
   void *dir_baton;
   apr_pool_t *dir_baton_pool;
+  apr_pool_t *file_pool;
 
   /* Our master update editor and baton. */
   const svn_delta_editor_t *update_editor;
@@ -202,9 +203,6 @@
   /* The properties for this file */
   apr_hash_t *props;
 
-  /* pool passed to update->add_file, etc. */
-  apr_pool_t *editor_pool;
-
   /* controlling file_baton and textdelta handler */
   void *file_baton;
   const char *base_checksum;
@@ -403,7 +401,7 @@
   report_info_t *new_info;
 
   new_info = apr_pcalloc(info_parent_pool, sizeof(*new_info));
-  apr_pool_create(&new_info->pool, info_parent_pool);
+  apr_pool_create(&new_info->pool, info->dir->file_pool);
   new_info->file_baton = NULL;
   new_info->lock_token = NULL;
   new_info->fetch_file = FALSE;
@@ -500,6 +498,7 @@
   if (dir->base_name[0] == '\0')
     {
       apr_pool_create(&dir->dir_baton_pool, dir->pool);
+      apr_pool_create(&dir->file_pool, dir->pool);
 
       if (dir->report_context->destination &&
           dir->report_context->sess->wc_callbacks->invalidate_wc_props)
@@ -519,6 +518,7 @@
       SVN_ERR(open_dir(dir->parent_dir));
 
       apr_pool_create(&dir->dir_baton_pool, dir->parent_dir->dir_baton_pool);
+      dir->file_pool = dir->parent_dir->file_pool;
 
       if (SVN_IS_VALID_REVNUM(dir->base_rev))
         {
@@ -632,7 +632,7 @@
       if (lock_val)
         {
           char *new_lock;
-          new_lock = apr_pstrdup(info->editor_pool, lock_val);
+          new_lock = apr_pstrdup(info->pool, lock_val);
           apr_collapse_spaces(new_lock, new_lock);
           lock_val = new_lock;
         }
@@ -641,7 +641,7 @@
         {
           svn_string_t *str;
 
-          str = svn_string_ncreate("", 1, info->editor_pool);
+          str = svn_string_ncreate("", 1, info->pool);
 
           svn_ra_serf__set_ver_prop(info->dir->removed_props, info->base_name,
                                     info->base_rev, "DAV:", "lock-token",
@@ -756,13 +756,11 @@
       return error_fetch(request, fetch_ctx, err);
     }
 
-  apr_pool_create(&info->editor_pool, info->dir->dir_baton_pool);
-
   /* Expand our full name now if we haven't done so yet. */
   if (!info->name)
     {
       info->name_buf = svn_stringbuf_dup(info->dir->name_buf,
-                                         info->editor_pool);
+                                         info->pool);
       svn_path_add_component(info->name_buf, info->base_name);
       info->name = info->name_buf->data;
     }
@@ -772,7 +770,7 @@
       err = info->dir->update_editor->open_file(info->name,
                                                 info->dir->dir_baton,
                                                 info->base_rev,
-                                                info->editor_pool,
+                                                info->pool,
                                                 &info->file_baton);
     }
   else
@@ -781,7 +779,7 @@
       err = info->dir->update_editor->add_file(info->name,
                                                info->dir->dir_baton,
                                                info->copyfrom_path,
                                                info->copyfrom_rev,
-                                               info->editor_pool,
+                                               info->pool,
                                                &info->file_baton);
     }
 
@@ -792,7 +790,7 @@
   err = info->dir->update_editor->apply_textdelta(info->file_baton,
                                                   info->base_checksum,
-                                                  info->editor_pool,
+                                                  info->pool,
                                                   &info->textdelta,
                                                   &info->textdelta_baton);
 
@@ -806,7 +804,7 @@
Re: Tuning recommendations for Apache HTTP Server w/ra_serf clients (MaxKeepAliveRequests)
On Sun, Jan 23, 2011 at 11:26 AM, Stefan Sperling <s...@elego.de> wrote:
> Has this problem been fixed, then? http://svn.haxx.se/dev/archive-2009-08/0274.shtml
> I don't think we should recommend mod_deflate if that problem still exists.

No, it doesn't look like it has been fixed yet. However, how many real-world HTTP clients are out there that don't speak zlib? =)

> I don't think that thread came to a conclusion. Greg was implying the bug was in mod_deflate rather than svn: http://svn.haxx.se/dev/archive-2009-08/0290.shtml

Greg is correct in that the patch doesn't go far enough - mod_dav_svn should change all uses of output to be a ** rather than a *. As for mod_deflate being broken: yah, it probably shouldn't be doing any memory allocation until it knows it is activated... but that's a topic for another list (dev@httpd). mod_deflate should probably memorize the fact that the checks have already run and get out of the way... probably by setting no-gzip in subprocess_env? -- justin
Re: My take on the diff-optimizations-bytes branch
On Sat, Jan 8, 2011 at 6:50 AM, Stefan Fuhrmann <stefanfuhrm...@alice-dsl.de> wrote:
> On 03.01.2011 02:14, Johan Corveleyn wrote:
>> It would be interesting to see where the biggest gains are coming from (I'm guessing from the per-machine-word reading/comparing). I'd like to try that first, maybe together with the allocating of the file[] on the stack. I'd like to avoid special-casing file_len==2 and splitting up functions into *_slow and *_fast, because it makes it a lot more difficult imho (so I'd only like to do that if it's a significant contributor). But if you want to leave that as an exercise for the reader, that's fine as well :-).
> Exercise is certainly not a bad thing ;) But I think the stack variable is certainly helpful and easy to do. While chunky operation gives a *big* boost, it is much more difficult to code if you need to compare multiple sources. It should be possible, though.

Ok, I finished the exercise :-).

- File info structures on stack instead of on heap: committed in r1060625. Gives a 10% boost.
- Chunky operation for multiple sources: committed in r1062532. Gives a ~250% boost on my machine.

For suffix scanning, I made a couple of changes to the logic you sent in your patch (as you remember, there was a problem with the implementation of suffix scanning [1]).
It was (in scan_backward_fast):

[[[
+#if SVN_UNALIGNED_ACCESS_IS_OK
+
+  if (   (file[0].curp - sizeof(apr_size_t) < file[0].curp)
+      && (file[1].curp - sizeof(apr_size_t) < file[1].curp))
+    {
+      do
+        {
+          file[0].curp -= sizeof(apr_size_t);
+          file[1].curp -= sizeof(apr_size_t);
+        }
+      while (   (file[0].curp >= min_curp0)
+             && (file[1].curp >= min_curp1)
+             && (   *(const apr_size_t*)(file[0].curp + 1)
+                 == *(const apr_size_t*)(file[1].curp + 1)));
+
+      file[0].curp += sizeof(apr_size_t);
+      file[1].curp += sizeof(apr_size_t);
+    }
]]]

I changed this to (the N-file equivalent of):

[[[
+#if SVN_UNALIGNED_ACCESS_IS_OK
+
+  if (   (file[0].curp - sizeof(apr_size_t) >= min_curp0)
+      && (file[1].curp - sizeof(apr_size_t) >= min_curp1))
+    {
+      do
+        {
+          file[0].curp -= sizeof(apr_size_t);
+          file[1].curp -= sizeof(apr_size_t);
+        }
+      while (   (file[0].curp - sizeof(apr_size_t) >= min_curp0)
+             && (file[1].curp - sizeof(apr_size_t) >= min_curp1)
+             && (   *(const apr_size_t*)(file[0].curp + 1)
+                 == *(const apr_size_t*)(file[1].curp + 1)));
+
+      file[0].curp += sizeof(apr_size_t);
+      file[1].curp += sizeof(apr_size_t);
+    }
]]]

(So: I changed the if-comparisons before the do-while loop, to make sure there is room to subtract a sizeof(apr_size_t) (the original didn't really make sense to me), and changed the comparisons with min_curpX in the while condition to check whether there is still room to subtract a sizeof(apr_size_t).) After that, all tests went fine.

Now, you suggested in a previous post in this thread that hardcoding file_len to 2 should speed it up by up to 50%. And you are correct (on my machine, it's sped up by ~40% if I replace all file_len's in find_identical_* with 2). So that leads me to conclude that I should probably throw away the generic code and replace it with 3 variants: find_identical_*2, find_identical_*3 and find_identical_*4 (once I write *2, the rest should be mostly copy-n-paste). As you said, this should also vastly simplify the code (drop the for loops, simpler conditionals, ...).
Cheers,
-- 
Johan

[1] http://svn.haxx.se/dev/archive-2011-01/0022.shtml
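[Editorial note: the "chunky" machine-word comparison discussed in this thread can be sketched in standalone C. This illustrates the technique only; common_prefix_len is a made-up name, not the actual svn_diff code, and it shows the forward (prefix) direction rather than suffix scanning.]

```c
#include <stddef.h>
#include <string.h>

/* Length of the common prefix of two buffers: compare one machine
 * word at a time while a full word fits in both, then finish byte by
 * byte to pin down the exact mismatch offset. memcpy keeps the word
 * reads safe even on platforms without unaligned access. */
static size_t common_prefix_len(const char *a, const char *b, size_t n)
{
    size_t i = 0;

    /* Chunky phase: whole words at a time. */
    while (i + sizeof(size_t) <= n)
      {
        size_t wa, wb;
        memcpy(&wa, a + i, sizeof(wa));
        memcpy(&wb, b + i, sizeof(wb));
        if (wa != wb)
          break;                /* mismatch somewhere inside this word */
        i += sizeof(size_t);
      }

    /* Byte phase: locate the first differing byte. */
    while (i < n && a[i] == b[i])
      i++;

    return i;
}
```

The big win comes from the word loop doing one comparison per sizeof(size_t) bytes on the (common) long identical stretches, while the byte loop only runs for at most one word's worth of bytes at the end.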
Re: My take on the diff-optimizations-bytes branch
On Sun, Jan 23, 2011 at 10:46 PM, Johan Corveleyn <jcor...@gmail.com> wrote:
> Ok, I finished the exercise :-).
> - File info structures on stack instead of on heap: committed in r1060625. Gives a 10% boost.
> - Chunky operation for multiple sources: committed in r1062532. Gives a ~250% boost on my machine.

Oops, as some people pointed out on IRC, that should be ~60% (I meant 2.5 times faster, going from ~50s to ~20s, so that's about 60% less time spent in datasources_open).

-- Johan
Assertion failure during update_tests.py 58 (XFAIL: update a nonexistent child of a copied dir)
Hi,

For some time now, update_tests.py 58 (XFAIL: update a nonexistent child of a copied dir) has been crashing on my machine:

svn: In file '..\..\..\subversion\libsvn_wc\update_editor.c' line 4877: assertion failed (repos_root != NULL && repos_uuid != NULL)

I understand that this test is XFAIL and that this isn't addressed yet, but is it supposed to fail an assert? On my system (Win XP) this causes an ugly popup to appear ("This application has requested the Runtime to terminate it in an unusual way..."), which I need to click away to continue, each time I run the full test suite.

The relevant excerpt from tests.log is in attachment (this was with trunk@1062600).

Cheers,
-- 
Johan
Re: FSFS format 6
On Wed, Dec 29, 2010 at 8:37 PM, Stefan Fuhrmann <eq...@web.de> wrote:
> On 29.12.2010 01:58, Johan Corveleyn wrote:
>> On Sun, Dec 12, 2010 at 4:23 PM, Stefan Fuhrmann <stefanfuhrm...@alice-dsl.de> wrote:
>>> On 19.10.2010 15:10, Daniel Shahaf wrote:
>>>> Greg Stein wrote on Tue, Oct 19, 2010 at 04:31:42 -0400:
>>>>> Personally, I see [FSv2] as a broad swath of API changes to align our needs with the underlying storage. Trowbridge noted that our current API makes it *really* difficult to implement an effective backend. I'd also like to see a backend that allows for parallel PUTs during the commit process.
>>>> Hyrum sees FSv2 as some kind of super-key-value storage with layers on top, allowing for various types of high-scaling mechanisms. At the retreat, stefan2 also had some thoughts about this...
>>> [This is just a brain-dump for 1.8+] While working on the performance branch I made some observations concerning the way FSFS organizes data and how that could be changed for reduced I/O overhead. notes/fsfs-improvements.txt contains a summary of what could be done to improve FSFS before FS-NG. A later FS-NG implementation should then still benefit from the improvements. +(number of fopen calls during a log operation)
>> I like this proposal a lot. As I already said before, we are running our FSFS back-end on a SAN over NFS (and I suspect we're not the only company doing this). In this environment, the server-side I/O of SVN (especially the amount of random reads and fopen calls during e.g. log) is often the major bottleneck. There is one question going around in my head though: won't you have to change/rearrange a lot of the FS layer code (and maybe the repos layer?) to benefit from this new format?
> Maybe. But as far as I understand the current FSFS structure, data access is mainly chasing pointers, i.e. reading relative or absolute byte offsets and moving there for the next piece of information.
> If everything goes well, none of that code needs to change; the revision packing algorithm will simply produce different offset values.
>> The current code is written in a certain way, not particularly optimized for this new format (I seem to remember log does around 10 fopen calls for every interesting rev file, each time reading a different part of it). Also, if an operation currently needs to access many revisions (like log or blame), it doesn't take advantage at all of the fact that they might be in a single packed rev file. The pack file is opened and seeked in just as much as the sum of the individual rev files.
> The fopen() calls should be eliminated by the file handle cache. IOW, they should already be addressed on the performance branch. Please let me know if that is not the case.

Ok, I finally got around to verifying this. You are completely correct: the performance branch avoids the vast number of repeated fopen() calls. With a simple test (a test file with 3 revisions, executing svn log on it; note: this is an unpacked 1.6 repository):

- trunk: opens each rev file between 19 and 21 times.
- performance branch: opens each rev file 2 times. (I don't know why it's not simply 1 time, but ok, 2 times is already a factor of 10 better than trunk :-).)

I tested this simply by adding one line of printf instrumentation inside libsvn_subr/io.c#svn_io_file_open (see the patch in attachment, as well as the output for trunk and for the perf branch). Now, if only that file-handle cache could be merged to trunk :-) ...
Cheers,
-- 
Johan

Index: subversion/libsvn_subr/io.c
===================================================================
--- subversion/libsvn_subr/io.c (revision 1062600)
+++ subversion/libsvn_subr/io.c (working copy)
@@ -2786,6 +2786,7 @@ svn_io_file_open(apr_file_t **new_file, const char
   const char *fname_apr;
   apr_status_t status;
 
+  fprintf(stderr, "OPEN(%d) %s\n", flag, fname);
   SVN_ERR(cstring_from_utf8(&fname_apr, fname, pool));
 
   status = file_open(new_file, fname_apr, flag | APR_BINARY, perm, TRUE, pool);

$ ../../client_build/svn_branch_performance/dist/bin/svnserve -d -r c:/research/svn/experiment/repos
OPEN(1) C:/research/svn/experiment/repos/format
OPEN(129) C:/research/svn/experiment/repos/db/fs-type
OPEN(129) C:/research/svn/experiment/repos/db/fs-type
OPEN(129) C:/research/svn/experiment/repos/db/format
OPEN(129) C:/research/svn/experiment/repos/db/uuid
OPEN(129) C:/research/svn/experiment/repos/db/min-unpacked-rev
OPEN(161) C:/research/svn/experiment/repos/db/fsfs.conf
OPEN(129) C:/research/svn/experiment/repos/db/min-unpacked-revprop
OPEN(129) C:/research/svn/experiment/repos/db/current
OPEN(161) C:/research/svn/experiment/repos/conf/svnserve.conf
OPEN(129) C:/research/svn/experiment/repos/db/revs/0/0
OPEN(129) C:/research/svn/experiment/repos/db/current
Re: [serf-dev] Re: What stands between us and branching 1.7?
On Sun, Jan 23, 2011 at 8:39 PM, Justin Erenkrantz <jus...@erenkrantz.com> wrote:
> For ra_serf, I'm wondering if we're creating an additional pool that isn't necessary - namely the editor_pool. I've done some light testing with the ra_serf patch below, which does the following things:
> - Create a separate pool hierarchy for files
> - Combine the per-file editor pool with the regular pool for the file
> [...]
> @@ -519,6 +518,7 @@
>        SVN_ERR(open_dir(dir->parent_dir));
> 
>        apr_pool_create(&dir->dir_baton_pool, dir->parent_dir->dir_baton_pool);
> +      dir->file_pool = dir->parent_dir->file_pool;

Linking the file pool to the pool of its parent dir has resulted in crashes earlier. I seem to remember it's because the files are received after the dir is already closed in the editor drive, but I don't remember all the details.

Lieven

[..]