Re: My take on the diff-optimizations-bytes branch

Johan Corveleyn Tue, 18 Jan 2011 03:57:34 -0800

On Mon, Jan 17, 2011 at 3:12 AM, Johan Corveleyn <jcor...@gmail.com> wrote:
> On Sat, Jan 8, 2011 at 6:50 AM, Stefan Fuhrmann
> <stefanfuhrm...@alice-dsl.de> wrote:


[ ... snip ... ]

>> But I think, the stack variable is certainly helpful
>> and easy to do.

Ok, I've done this (locally, still have to clean up a little and then
commit). It gives a nice 10% speedup of prefix/suffix scanning, which
is great.

I have a couple of small questions though, about other small
optimizations you did:

> Index: subversion/libsvn_diff/diff_file.c
> ===================================================================
> --- subversion/libsvn_diff/diff_file.c        (revision 1054044)
> +++ subversion/libsvn_diff/diff_file.c        (working copy)
...
> @@ -258,10 +259,10 @@
>                                                                               
> \
>      for (i = 0; i < files_len; i++)                                          
> \
>      {                                                                        
> \
> -      if (all_files[i]->curp < all_files[i]->endp - 1)                       
> \
> -        all_files[i]->curp++;                                                
> \
> +      if (all_files[i].curp + 1 < all_files[i].endp)                         
> \
> +        all_files[i].curp++;                                                 
> \

Just curious: why did you change that condition (from 'X < Y - 1' to
'X + 1 < Y')?

You mention in your log message: "minor re-arragements in arithmetic
expressions to maximize reuse of results (e.g. in
INCREMENT_POINTERS)".

Why does this help, and what's the potential impact?


Second question:

> +find_identical_prefix_slow(svn_boolean_t *reached_one_eof,
> +                           apr_off_t *prefix_lines,
> +                           struct file_info file[], apr_size_t file_len,
> +                           apr_pool_t *pool)
> +{
> +  svn_boolean_t had_cr = FALSE;
>    svn_boolean_t is_match, reached_all_eof;
>    apr_size_t i;
> +  apr_off_t lines = 0;
>
>    *prefix_lines = 0;
>    for (i = 1, is_match = TRUE; i < file_len; i++)
> -    is_match = is_match && *file[0]->curp == *file[i]->curp;
> +    is_match = is_match && *file[0].curp == *file[i].curp;
> +
>    while (is_match)
>      {
>        /* ### TODO: see if we can take advantage of
>           diff options like ignore_eol_style or ignore_space. */
>        /* check for eol, and count */
> -      if (*file[0]->curp == '\r')
> +      if (*file[0].curp == '\r')
>          {
> -          (*prefix_lines)++;
> +          lines++;

Here you added a local variable 'lines', which is incremented in the
function, and copied to *prefix_lines at the end (you do the same in
fine_identical_prefix_fast). The original code just sets and
increments *prefix_lines directly. So, out of curiousness: why does
this help, and how much?

Thanks,
-- 
Johan

Re: My take on the diff-optimizations-bytes branch

Reply via email to