./subversion/svnserve/svnserve -Tdr /media/stefan/033c7ee9-9980-45a8-b9c6-f911dc2ac664/f7-test/ -M 1000 -c 0 --client-speed 100000 stefan@macbook:~/develop/trunk$ time ./subversion/svn/svn diff --summarize svn://localhost/ruby-f6-packed -r0:HEAD > /dev/null
real 0m2.150s user 0m1.964s sys 0m0.104s stefan@macbook:~/develop/trunk$ time ./subversion/svn/svn diff --summarize svn://localhost/ruby-f6-packed -r0:HEAD > /dev/null real 0m1.960s user 0m1.764s sys 0m0.120s ==24293== I refs: 14,916,573,539 real 0m2.335s user 0m2.176s sys 0m0.100s stefan@macbook:~/develop/trunk$ time ./subversion/svn/svn diff --summarize svn://localhost/ruby-f6-packed -r0:HEAD > /dev/null real 0m2.132s user 0m1.916s sys 0m0.128s ==5689== I refs: 16,225,780,792 3.1Gb On Tue, Aug 25, 2015 at 3:23 PM, Branko Čibej <br...@wandisco.com> wrote: > On 25.08.2015 14:10, Branko Čibej wrote: > > On 06.08.2015 18:10, stef...@apache.org wrote: > >> Author: stefan2 > >> Date: Thu Aug 6 16:10:39 2015 > >> New Revision: 1694533 > >> > >> URL: http://svn.apache.org/r1694533 > >> Log: > >> Fix an alignment problem on machines with 32 bit pointers but atomic 64 > >> data access that may not be misaligned. > >> > >> * subversion/libsvn_ra_svn/marshal.c > >> (read_item): Ensure that the array contents are always have the APR > >> default alignment. > >> > >> Found by: Rainer Jung <rainer.jung{_AT_}kippdata.de> > >> > >> Modified: > >> subversion/trunk/subversion/libsvn_ra_svn/marshal.c > >> > >> Modified: subversion/trunk/subversion/libsvn_ra_svn/marshal.c > >> URL: > http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_ra_svn/marshal.c?rev=1694533&r1=1694532&r2=1694533&view=diff > >> > ============================================================================== > >> --- subversion/trunk/subversion/libsvn_ra_svn/marshal.c (original) > >> +++ subversion/trunk/subversion/libsvn_ra_svn/marshal.c Thu Aug 6 > 16:10:39 2015 > >> @@ -1190,14 +1190,20 @@ static svn_error_t *read_item(svn_ra_svn > >> } > >> else if (c == '(') > >> { > >> + /* On machines with 32 bit pointers, array headers are only 20 > bytes > >> + * which is not enough for our standard 64 bit alignment. > >> + * So, determine a suitable block size for the APR array header > that > >> + * keeps proper alignment for following structs. */ > >> + const apr_size_t header_size > >> + = APR_ALIGN_DEFAULT(sizeof(apr_array_header_t)); > >> + > >> /* Allocate an APR array with room for (initially) 4 items. > >> * We do this manually because lists are the most frequent > protocol > >> * element, often used to frame a single, optional value. We > save > >> * about 20% of total protocol handling time. */ > >> - char *buffer = apr_palloc(pool, sizeof(apr_array_header_t) > >> - + 4 * sizeof(svn_ra_svn_item_t)); > >> - svn_ra_svn_item_t *data > >> - = (svn_ra_svn_item_t *)(buffer + sizeof(apr_array_header_t)); > >> + char *buffer = apr_palloc(pool, > >> + header_size + 4 * > sizeof(svn_ra_svn_item_t)); > >> + svn_ra_svn_item_t *data = (svn_ra_svn_item_t *)(buffer + > header_size); > >> > >> item->kind = SVN_RA_SVN_LIST; > >> item->u.list = (apr_array_header_t *)buffer; > > > > How exactly is all this different from: > > > > apr_array_make(pool, 4, sizeof(svn_ra_svn_item_t)? > > > > Other than relying on magic knowledge of APR implementation details? > > All right, so I figured that the difference is that apr_array_make does > two allocations compared to one in this code. Still: relying on > knowledge of APR implementation details is a really bad idea. The structure definition of apr_array_header_t is part of the public API, i.e. will never change. > As it > stands, APR will correctly resize this array if necessary; but there's > no guarantee that, sometime in the future, resizing such arrays would > fail horribly. > Even if that was true, the resize is done outside APR as well - a few lines further down. > I very strongly suggest that we put a plain ol' apr_array_make here and, > if this is really perceived as such a huge *overall* performance > problem, put the allocation hack into APR itself. > > I suspect that apr_palloc is fast enough that we really don't need this > hack. > Is it absolutely necessary to use that hand-crafted section of code? Of course not. Does it have a significant impact on ra_svn throughput? Yes. The whole section of local array management saved 20% runtime in "command-heavy" reports like this one: $time svn diff --summarize svn://localhost/ruby -r0:HEAD > /dev/null real 0m1.960s user 0m1.764s sys 0m0.120s Going back to apr_array_make alone sets us back by ~10%: real 0m2.132s user 0m1.916s sys 0m0.128s The reason is that the protocol uses lists for every optional element of any struct being transmitted. Of the 36M protocol items passing through read_item(), 16M are lists; the rest is numbers and strings. Based on the outrage that a 3% and 5% slowdowns caused last year, I would expect nothing less than a cheering crowd for a 20% speedup ;) But seriously, I'd love to keep that section of code as it uses APR public API only and is fairly decoupled from the rest of the code. Alternatively, we could and maybe should replace the array in svn_ra_svn_item_t with plain C array but that requires us rev'ing a lot of deprecated public API in svn_ra_svn.h . -- Stefan^2.