Re: My take on the diff-optimizations-bytes branch

Stefan Fuhrmann Wed, 26 Jan 2011 13:15:47 -0800

On 26.01.2011 03:09, Johan Corveleyn wrote:

On Tue, Jan 25, 2011 at 1:37 AM, Stefan Fuhrmann <eq...@web.de> wrote:
[ ... snip ...]

And, as promised, here are some ideas on how to get more
speed from the generic code. Your latest commit:


+#if SVN_UNALIGNED_ACCESS_IS_OK
+
+      /* Skip quickly over the stuff between EOLs. */
+      for (i = 0, can_read_word = TRUE; i < file_len; i++)
+        can_read_word = can_read_word
+                        && (file[i].curp + sizeof(apr_size_t) < file[i].endp);
+      while (can_read_word)
+        {
+          for (i = 1, is_match = TRUE; i < file_len; i++)
+            is_match = is_match
+                       && (   *(const apr_size_t *)file[0].curp
+                           == *(const apr_size_t *)file[i].curp);
+
+          if (!is_match || contains_eol(*(const apr_size_t *)file[0].curp))
+            break;
+
+          for (i = 0; i < file_len; i++)
+            file[i].curp += sizeof(apr_size_t);
+          for (i = 0, can_read_word = TRUE; i < file_len; i++)
+            can_read_word = can_read_word
+                            && (file[i].curp + sizeof(apr_size_t) < file[i].endp);
+        }
+
+#endif

could be changed to something like the following.
Please note that I haven't tested any of this:

Thanks. There was one error in your suggestion, which I found out
after testing. See below.

I was afraid so. I just hacked the code directly
into Thunderbird and was unsure whether the
exit behavior was correct.

/* Determine how far we may advance with chunky ops without reaching
  * endp for any of the files.
  * Signedness is important here if curp gets close to endp.
  */
apr_ssize_t max_delta = file[0].endp - file[0].curp - sizeof(apr_size_t);
for (i = 1; i < file_len; i++)
{
    apr_ssize_t delta = file[i].endp - file[i].curp - sizeof(apr_size_t);
    if (delta < max_delta)
        max_delta = delta;
}

/* the former while() loop */
is_match = TRUE;
for (delta = 0; delta < max_delta && is_match; delta += sizeof(apr_size_t))
{
    apr_size_t chunk = *(const apr_size_t *)(file[0].curp + delta);
    if (contains_eol(chunk))
        break;

    for (i = 1; i < file_len; i++)
        if (chunk != *(const apr_size_t *)(file[i].curp + delta))
        {
            is_match = FALSE;

Here, I inserted:

             delta -= sizeof(apr_size_t);

because otherwise delta will be increased too far: the counting
expression of the outer for-loop still increments it once more before
the loop stops because of !is_match. Maybe there is a cleaner/clearer
way to break out of the outer for-loop here, without incrementing delta
again, but for now, I've committed it with this change (r1063565).

Yeah, I really wished for a "goto" there. The clean
solution, I guess, would be moving that section to
some sub-routine returning the final value of delta.

            break;
        }
}

/* We either found a mismatch or an EOL at or shortly after curp+delta
  * or we cannot proceed with chunky ops without exceeding endp.
  * In any case, everything up to curp + delta is equal and not an EOL.
  */
for (i = 0; i < file_len; i++)
    file[i].curp += delta;
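
For illustration, the sub-routine idea mentioned above (move the chunky scan into a helper that returns the final value of delta) might look roughly like the sketch below. It is untested against the branch and rests on several assumptions: `struct file_cursor`, `load_word()` and `matching_chunk_prefix()` are hypothetical names, plain `size_t`/`ptrdiff_t` stand in for `apr_size_t`/`apr_ssize_t`, the `contains_eol()` shown here is a stand-in built on the well-known "has a zero byte" bit trick rather than the branch's own definition, and `memcpy()` replaces the pointer cast so the sketch does not depend on SVN_UNALIGNED_ACCESS_IS_OK. Returning directly from the inner loop avoids both the `is_match` flag and the extra `delta -= sizeof(apr_size_t)` correction.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the branch's per-file cursor. */
struct file_cursor {
  const char *curp;  /* current read position */
  const char *endp;  /* one past the last readable byte */
};

/* Nonzero if any byte of W is zero (classic bit trick). */
static int has_zero_byte(size_t w)
{
  const size_t ones = (size_t)-1 / 0xFF;  /* 0x0101...01 */
  const size_t highs = ones * 0x80;       /* 0x8080...80 */
  return ((w - ones) & ~w & highs) != 0;
}

/* Assumed behaviour of contains_eol(): nonzero if any byte of W
 * is '\n' or '\r'. Not the branch's actual implementation. */
static int contains_eol(size_t w)
{
  const size_t ones = (size_t)-1 / 0xFF;
  return has_zero_byte(w ^ (ones * '\n'))
         || has_zero_byte(w ^ (ones * '\r'));
}

/* Read one word without relying on unaligned pointer casts. */
static size_t load_word(const char *p)
{
  size_t w;
  memcpy(&w, p, sizeof w);
  return w;
}

/* Return the number of leading bytes (a multiple of the word size) that
 * are equal in all files and contain no EOL, never reading past any endp. */
static ptrdiff_t matching_chunk_prefix(const struct file_cursor *file,
                                       size_t file_len)
{
  const ptrdiff_t word = (ptrdiff_t)sizeof(size_t);
  ptrdiff_t max_delta, delta;
  size_t i;

  assert(file_len >= 1);

  /* Signed arithmetic: the result is negative when fewer than one
   * word remains, in which case the loop below never runs. */
  max_delta = (file[0].endp - file[0].curp) - word;
  for (i = 1; i < file_len; i++)
    {
      ptrdiff_t d = (file[i].endp - file[i].curp) - word;
      if (d < max_delta)
        max_delta = d;
    }

  for (delta = 0; delta < max_delta; delta += word)
    {
      size_t chunk = load_word(file[0].curp + delta);
      if (contains_eol(chunk))
        return delta;            /* EOL somewhere inside this word */

      for (i = 1; i < file_len; i++)
        if (chunk != load_word(file[i].curp + delta))
          return delta;          /* mismatch: no post-hoc correction needed */
    }

  return delta;                  /* ran out of whole words */
}
```

The caller would then advance every `curp` by the returned value, just as the final loop above does, and fall back to the byte-wise code for the remainder.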

Thanks. This gives on my machine/example another 15-20% performance
increase (datasources_open time going down from ~21 s to ~17 s). We
should probably do the same for suffix scanning, but I'm too tired
right now :-) (and suffix scanning is more difficult to grok, so not a
good idea to do at 3 am).

Don't get yourself burned out ;)

-- Stefan^2.
