With these changes, 'busybox find .' and 'busybox find libbb' fail
entirely: 'libbb' becomes 'libb'. Running 'busybox find libbb/' works,
but the results contain double slashes: 'libbb//whatever.c'. The last
character of the path passed to 'find' is erased whenever that path has
no trailing slash.
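Both symptoms are consistent with the trailing-slash test being inverted, so that the length is shortened when no slash is found instead of when one is. A minimal standalone sketch of the intended behavior follows. The names join_path() and last_char_is_len() are hypothetical stand-ins, not the functions from the patch, and plain malloc()/strlen() are used so that it compiles outside BusyBox:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for last_char_is_fast(): returns a pointer to the
 * last character of s if it equals c, otherwise NULL. */
static const char *last_char_is_len(const char *s, int c, size_t len)
{
	if (!s || len == 0)
		return NULL;
	return ((unsigned char)s[len - 1] == (unsigned char)c) ? s + len - 1 : NULL;
}

/* Hypothetical helper: join path and name with exactly one '/' between
 * them. Returns a malloc()ed string, or NULL on allocation failure. */
static char *join_path(const char *path, const char *name)
{
	size_t pathlen = path ? strlen(path) : 0;
	size_t namelen;
	char *buf;

	assert(name != NULL);
	if (!path)
		path = "";
	/* Drop the trailing slash only when one is present (!= NULL, not ==). */
	if (last_char_is_len(path, '/', pathlen) != NULL)
		pathlen--;
	/* Skip any leading slashes in the name. */
	while (*name == '/')
		name++;
	namelen = strlen(name);

	buf = malloc(pathlen + 1 + namelen + 1);
	if (!buf)
		return NULL;
	memcpy(buf, path, pathlen);
	buf[pathlen] = '/';
	memcpy(buf + pathlen + 1, name, namelen);
	buf[pathlen + 1 + namelen] = '\0';
	return buf;
}
```

With the comparison written this way, 'libbb' and 'libbb/' should both produce 'libbb/whatever.c'.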
I left the old code commented instead of taking it out because there is
a good chance the maintainers will want to see this "bloat" gated behind
a config option.
On 2024-04-11 1:47 AM, Rolf Eike Beer wrote:
On Wednesday, 10 April 2024 21:36:06 CEST Jody Bruchon wrote:
This patch uses pre-calculated name lengths to massively speed up various
recursive operations. Three new *_fast variant functions are added along
with get_d_namlen copied from libjodycode. Passing lengths allows use of
memcpy() instead of strcpy()/strcat() and replacement of a particularly
hot xasprintf(). Cachegrind shows CPU instructions on Linux x86_64 drop
by 24% to 67% with similar reductions in data reads and writes.
Anything in BusyBox that uses a while(readdir()) loop or that calls
concat_*path_file() or last_char_is() might benefit from adopting this
optimization framework.
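As a rough illustration of the length-passing idea (a sketch only, not code from the patch; join_known_len() is a hypothetical name): once both lengths are known, the path can be assembled with two memcpy() calls and no rescanning of either string, which is what strcat() or a "%s/%s" format would otherwise do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: write "<dir>/<name>" into buf using lengths the
 * caller already knows. buf must hold at least dirlen + 1 + namelen + 1
 * bytes. Returns the length of the result, excluding the terminating NUL. */
static size_t join_known_len(char *buf, const char *dir, size_t dirlen,
			     const char *name, size_t namelen)
{
	assert(buf != NULL);
	memcpy(buf, dir, dirlen);                 /* no strlen(dir) needed */
	buf[dirlen] = '/';
	memcpy(buf + dirlen + 1, name, namelen);  /* no scan for the end of buf */
	buf[dirlen + 1 + namelen] = '\0';
	return dirlen + 1 + namelen;
}
```

In a readdir() loop the directory length would be computed once before the loop, and the name length would come from the dirent rather than from strlen(); the returned length can then be passed on to the next call.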
Completely untested, but how about this:
char* FAST_FUNC concat_path_file_fast(const char *path, const struct dirent *dirp)
{
	const char *filename = dirp->d_name;
	char *buf;
	int end_offset;
	int pathlen, namelen;

	if (!path) {
		path = "";
		pathlen = 0;
	} else {
		pathlen = strlen(path);
		if (last_char_is_fast(path, '/', pathlen) == NULL)
			pathlen--;
	}

	namelen = get_d_namlen(dirp);
	while (*filename == '/') {
		filename++;
		namelen--;
	}

	end_offset = pathlen + 1 + namelen;
	buf = (char *)malloc(end_offset + 1);
	if (!buf) return NULL;
	memcpy(buf, path, pathlen);
	*(buf + pathlen) = '/';
	memcpy(buf + pathlen + 1, filename, namelen);
	*(buf + end_offset) = '\0';
	return buf;
}
This avoids scanning an empty path for the trailing slash, and avoids the
check for a trailing slash later entirely, saving one instruction and 2
variables.
You also have quite a few places where the old code is still around, just
commented out. I would simply have removed it.
Generally, I wonder whether the length variables shouldn't be size_t or the
like, which would affect not only this function but all of them.
Eike