On 26.01.2011 03:09, Johan Corveleyn wrote:
On Tue, Jan 25, 2011 at 1:37 AM, Stefan Fuhrmann<eq...@web.de> wrote:
[ ... snip ...]
And, as promised, here some ideas how to get more
speed from the generic code. Your latest commit:
+#if SVN_UNALIGNED_ACCESS_IS_OK
+
+ /* Skip quickly over the stuff between EOLs. */
+ for (i = 0, can_read_word = TRUE; i< file_len; i++)
+ can_read_word = can_read_word
+&& (file[i].curp + sizeof(apr_size_t)< file[i].endp);
+ while (can_read_word)
+ {
+ for (i = 1, is_match = TRUE; i< file_len; i++)
+ is_match = is_match
+&& ( *(const apr_size_t *)file[0].curp
+ == *(const apr_size_t *)file[i].curp);
+
+ if (!is_match || contains_eol(*(const apr_size_t *)file[0].curp))
+ break;
+
+ for (i = 0; i< file_len; i++)
+ file[i].curp += sizeof(apr_size_t);
+ for (i = 0, can_read_word = TRUE; i< file_len; i++)
+ can_read_word = can_read_word
+&& (file[i].curp + sizeof(apr_size_t)< file[i].endp);
+ }
+
+#endif
could be changed to something like the following.
Please note that I haven't tested any of this:
Thanks. There was one error in your suggestion, which I found out
after testing. See below.
I was afraid so. I just hacked the code directly
into Thunderbird and was unsure whether the
exit behavior was correct.
/* Determine how far we may advance with chunky ops without reaching
* endp for any of the files.
* Signedness is important here if curp gets close to endp.
*/
apr_ssize_t max_delta = file[0].endp - file[0].curp - sizeof(apr_size_t);
for (i = 1; i< file_len; i++)
{
apr_ssize_t delta = file[i].endp - file[i].curp - sizeof(apr_size_t);
if (delta< max_delta)
max_delta = delta;
}
/* the former while() loop */
is_match = TRUE;
for (delta = 0; delta< max_delta&& is_match; delta += sizeof(apr_size_t))
{
apr_size_t chunk = *(const apr_size_t *)(file[0].curp + delta);
if (contains_eol(chunk))
break;
for (i = 1; i< file_len; i++)
if (chunk != *(const apr_size_t *)(file[i].curp + delta))
{
is_match = FALSE;
Here, I inserted:
delta -= sizeof(apr_size_t);
because otherwise, delta will be increased too far (it will still be
increased by the counting expression of the outer for-loop (after
which it will stop because of !is_match)). Maybe there is a
cleaner/clearer way to break out of the outer for-loop here, without
incrementing delta again, but for now, I've committed it with this
change (r1063565).
Yeah, I really wished for a "goto" there. The clean
solution, I guess, would be moving that section to
some sub-routine returning the final value of delta.
break;
}
}
/* We either found a mismatch or an EOL at or shortly behind curp+delta
* or we cannot proceed with chunky ops without exceeding endp.
* In any way, everything up to curp + delta is equal and not an EOL.
*/
for (i = 0; i< file_len; i++)
file[i].curp += delta;
Thanks. This gives on my machine/example another 15-20% performance
increase (datasources_open time going down from ~21 s to ~17 s). We
should probably do the same for suffix scanning, but I'm too tired
right now :-) (and suffix scanning is more difficult to grok, so not a
good idea to do at 3 am).
Don't get yourself burned out ;)
-- Stefan^2.