Re: svn commit: r1001413 - in /subversion/branches/performance/subversion: include/svn_string.h libsvn_subr/svn_string.c
One correction (in the comments -- which are great, BTW): s/descression/discretion -Ed On Mon, Sep 27, 2010 at 5:05 PM, Hyrum K. Wright hyrum_wri...@mail.utexas.edu wrote: +1 to merge to trunk. On Sun, Sep 26, 2010 at 6:01 AM, stef...@apache.org wrote: Author: stefan2 Date: Sun Sep 26 11:01:03 2010 New Revision: 1001413 URL: http://svn.apache.org/viewvc?rev=1001413view=rev Log: Extensively document the benefits of svn_stringbuf_appendbyte and the rationals behind its implementation. To simplify the explanations, one statement had to be moved. * subversion/include/svn_string.h (svn_stringbuf_appendbyte): extend docstring to indicate that this method is cheaper to call and execute * subversion/libsvn_subr/svn_string.c (svn_stringbuf_appendbyte): reorder statements for easier description; add extensive description about the optimizations done Modified: subversion/branches/performance/subversion/include/svn_string.h subversion/branches/performance/subversion/libsvn_subr/svn_string.c Modified: subversion/branches/performance/subversion/include/svn_string.h URL: http://svn.apache.org/viewvc/subversion/branches/performance/subversion/include/svn_string.h?rev=1001413r1=1001412r2=1001413view=diff == --- subversion/branches/performance/subversion/include/svn_string.h (original) +++ subversion/branches/performance/subversion/include/svn_string.h Sun Sep 26 11:01:03 2010 @@ -259,6 +259,10 @@ void svn_stringbuf_fillchar(svn_stringbuf_t *str, unsigned char c); /** Append a single character @a byte onto @a targetstr. + * This is an optimized version of @ref svn_stringbuf_appendbytes + * that is much faster to call and execute. Gains vary with the ABI. + * The advantages extend beyond the actual call because the reduced + * register pressure allows for more optimization within the caller. * * reallocs if necessary. @a targetstr is affected, nothing else is. * @since New in 1.7. Modified: subversion/branches/performance/subversion/libsvn_subr/svn_string.c URL: http://svn.apache.org/viewvc/subversion/branches/performance/subversion/libsvn_subr/svn_string.c?rev=1001413r1=1001412r2=1001413view=diff == --- subversion/branches/performance/subversion/libsvn_subr/svn_string.c (original) +++ subversion/branches/performance/subversion/libsvn_subr/svn_string.c Sun Sep 26 11:01:03 2010 @@ -393,28 +393,60 @@ svn_stringbuf_ensure(svn_stringbuf_t *st } +/* WARNING - Optimized code ahead! + * This function has been hand-tuned for performance. Please read + * the comments below before modifying the code. + */ void svn_stringbuf_appendbyte(svn_stringbuf_t *str, char byte) { + char *dest; + apr_size_t old_len = str-len; + /* In most cases, there will be pre-allocated memory left * to just write the new byte at the end of the used section * and terminate the string properly. */ - apr_size_t old_len = str-len; - if (str-blocksize old_len + 1) + if (str-blocksize old_len + 1) { - char *dest = str-data; + /* The following read does not depend this write, so we + * can issue the write first to minimize register pressure: + * The value of old_len+1 is no longer needed; on most processors, + * dest[old_len+1] will be calculated implicitly as part of + * the addressing scheme. + */ + str-len = old_len+1; + + /* Since the compiler cannot be sure that *src-data and *src + * don't overlap, we read src-data *once* before writing + * to *src-data. Replacing dest with str-data would force + * the compiler to read it again after the first byte. + */ + dest = str-data; + /* If not already available in a register as per ABI, load + * byte into the register (e.g. the one freed from old_len+1), + * then write it to the string buffer and terminate it properly. + * + * Including the byte fetch, all operations so far could be + * issued at once and be scheduled at the CPU's descression. + * Most likely, no-one will soon depend on the data that will be + * written in this function. So, no stalls there, either. + */ dest[old_len] = byte; dest[old_len+1] = '\0'; - - str-len = old_len+1; } else { /* we need to re-allocate the string buffer * - let the more generic implementation take care of that part */ + + /* Depending on the ABI, byte is a register value. If we were + * to take its address directly, the compiler might decide to + * put in on the stack *unconditionally*, even if that would + * only be necessary for this block. + */ char b = byte; svn_stringbuf_appendbytes(str, b, 1); }
Re: svn commit: r1001413 - in /subversion/branches/performance/subversion: include/svn_string.h libsvn_subr/svn_string.c
+1 to merge to trunk. On Sun, Sep 26, 2010 at 6:01 AM, stef...@apache.org wrote: Author: stefan2 Date: Sun Sep 26 11:01:03 2010 New Revision: 1001413 URL: http://svn.apache.org/viewvc?rev=1001413view=rev Log: Extensively document the benefits of svn_stringbuf_appendbyte and the rationals behind its implementation. To simplify the explanations, one statement had to be moved. * subversion/include/svn_string.h (svn_stringbuf_appendbyte): extend docstring to indicate that this method is cheaper to call and execute * subversion/libsvn_subr/svn_string.c (svn_stringbuf_appendbyte): reorder statements for easier description; add extensive description about the optimizations done Modified: subversion/branches/performance/subversion/include/svn_string.h subversion/branches/performance/subversion/libsvn_subr/svn_string.c Modified: subversion/branches/performance/subversion/include/svn_string.h URL: http://svn.apache.org/viewvc/subversion/branches/performance/subversion/include/svn_string.h?rev=1001413r1=1001412r2=1001413view=diff == --- subversion/branches/performance/subversion/include/svn_string.h (original) +++ subversion/branches/performance/subversion/include/svn_string.h Sun Sep 26 11:01:03 2010 @@ -259,6 +259,10 @@ void svn_stringbuf_fillchar(svn_stringbuf_t *str, unsigned char c); /** Append a single character @a byte onto @a targetstr. + * This is an optimized version of @ref svn_stringbuf_appendbytes + * that is much faster to call and execute. Gains vary with the ABI. + * The advantages extend beyond the actual call because the reduced + * register pressure allows for more optimization within the caller. * * reallocs if necessary. @a targetstr is affected, nothing else is. * @since New in 1.7. Modified: subversion/branches/performance/subversion/libsvn_subr/svn_string.c URL: http://svn.apache.org/viewvc/subversion/branches/performance/subversion/libsvn_subr/svn_string.c?rev=1001413r1=1001412r2=1001413view=diff == --- subversion/branches/performance/subversion/libsvn_subr/svn_string.c (original) +++ subversion/branches/performance/subversion/libsvn_subr/svn_string.c Sun Sep 26 11:01:03 2010 @@ -393,28 +393,60 @@ svn_stringbuf_ensure(svn_stringbuf_t *st } +/* WARNING - Optimized code ahead! + * This function has been hand-tuned for performance. Please read + * the comments below before modifying the code. + */ void svn_stringbuf_appendbyte(svn_stringbuf_t *str, char byte) { + char *dest; + apr_size_t old_len = str-len; + /* In most cases, there will be pre-allocated memory left * to just write the new byte at the end of the used section * and terminate the string properly. */ - apr_size_t old_len = str-len; - if (str-blocksize old_len + 1) + if (str-blocksize old_len + 1) { - char *dest = str-data; + /* The following read does not depend this write, so we + * can issue the write first to minimize register pressure: + * The value of old_len+1 is no longer needed; on most processors, + * dest[old_len+1] will be calculated implicitly as part of + * the addressing scheme. + */ + str-len = old_len+1; + + /* Since the compiler cannot be sure that *src-data and *src + * don't overlap, we read src-data *once* before writing + * to *src-data. Replacing dest with str-data would force + * the compiler to read it again after the first byte. + */ + dest = str-data; + /* If not already available in a register as per ABI, load + * byte into the register (e.g. the one freed from old_len+1), + * then write it to the string buffer and terminate it properly. + * + * Including the byte fetch, all operations so far could be + * issued at once and be scheduled at the CPU's descression. + * Most likely, no-one will soon depend on the data that will be + * written in this function. So, no stalls there, either. + */ dest[old_len] = byte; dest[old_len+1] = '\0'; - - str-len = old_len+1; } else { /* we need to re-allocate the string buffer * - let the more generic implementation take care of that part */ + + /* Depending on the ABI, byte is a register value. If we were + * to take its address directly, the compiler might decide to + * put in on the stack *unconditionally*, even if that would + * only be necessary for this block. + */ char b = byte; svn_stringbuf_appendbytes(str, b, 1); }