Hi, back in November 2013, following an idea by guenther@, i cooked up another optimization for seekdir(3), then failed to send out the patch. So here it is.
Philip Guenther wrote on Tue, Nov 05, 2013 at 07:57:43PM -0800: > On Wed, 6 Nov 2013, Ingo Schwarze wrote: >> * Worst case: >> opendir; telldir >> then 100.000 times (seekdir; readdir) at that position >> = The entry asked for is the very first entry in the buffer, >> which cannot be found, because each dirent only contains >> the address of the *following* dirent, so each readdir >> triggers getdents anyway, but we still search the whole >> buffer each time. > While caching the offset of the start of the buffer is possible, it's > not obvious that that case will happen often enough to be worth it. > > Hmm, rewinddir() will _always_ be this worst case. Perhaps rewinddir() > should not call seekdir() and just be the two assignments with lseek(), > skipping the scan of the current buffer. That wouldn't be very useful because after lseek(2), getdents(2) is required, which is *much* more expensive than searching the buffer. > It would be assisted by the start-o-buffer cache though. Exactly. Right now, each rewinddir(3) seeks the whole buffer in vain, wasting 5 microseconds on that, then proceeds to the extremely expensive getdents(2) anyway, wasting another 100 microseconds. The patch to cache the cookie of the first directory entry in addition to all the other entries that we already cache is relatively simple, see below. With this patch, rewinddir(3) changes from always being the worst case to often being the best case (now always the best case for directories with less than about 500 entries, or for rewinding before reading less than about 500 entries), no other case becomes any worse, and some other, less important edge cases improve, too. What the patch does is: * Let opendir(3) save the cookie of the first entry of the directory. * Let seekdir(3) use that saved cookie to move the internal pointers to that entry when asked to do so, for example on rewinddir(3). * Let seekdir(3) update that saved cookie when the buffer is exhausted and when consequently, an lseek(2) had to be done. * Let seekdir(3) invalidate the buffer by setting the pointer after its end, instead of to its beginning. * Have readdir(3) refrain from invalidating the buffer when asked to return the first element from it. This comes with only one price: Adding another off_t to the opaque struct _dirdesc (= DIR), which already contains another off_t, two long, two int, and two pointers. That's not much growth, especially since no sane code will allocate large numbers of DIRs. As explained last year, this patch reduced typical execution times of rewinddir(3) on my notebook for about 100 microseconds to about 0.05 microseconds. The /usr/src/regress/lib/libc/telldir/ test suite is still happy, and so are some other, manual tests i have done. Am i right that, the type DIR being opaque, this patch doesn't require any libc bump? Comments, OKs? Ingo Index: opendir.c =================================================================== RCS file: /cvs/src/lib/libc/gen/opendir.c,v retrieving revision 1.26 diff -u -p -r1.26 opendir.c --- opendir.c 6 Nov 2013 20:35:25 -0000 1.26 +++ opendir.c 6 Mar 2014 17:14:16 -0000 @@ -76,7 +76,7 @@ fdopendir(int fd) dirp = __fdopendir(fd); if (dirp != NULL) { /* Record current offset for immediate telldir() */ - dirp->dd_curpos = lseek(fd, 0, SEEK_CUR); + dirp->dd_bufpos = dirp->dd_curpos = lseek(fd, 0, SEEK_CUR); /* * POSIX doesn't require fdopendir() to set @@ -116,6 +116,7 @@ __fdopendir(int fd) dirp->dd_fd = fd; dirp->dd_lock = NULL; dirp->dd_curpos = 0; + dirp->dd_bufpos = 0; return (dirp); } Index: readdir.c =================================================================== RCS file: /cvs/src/lib/libc/gen/readdir.c,v retrieving revision 1.20 diff -u -p -r1.20 readdir.c --- readdir.c 6 Nov 2013 22:26:14 -0000 1.20 +++ readdir.c 6 Mar 2014 17:14:16 -0000 @@ -43,9 +43,8 @@ _readdir_unlocked(DIR *dirp, struct dire *result = NULL; for (;;) { - if (dirp->dd_loc >= dirp->dd_size) + if (dirp->dd_loc >= dirp->dd_size) { dirp->dd_loc = 0; - if (dirp->dd_loc == 0) { dirp->dd_size = getdents(dirp->dd_fd, dirp->dd_buf, dirp->dd_len); if (dirp->dd_size == 0) Index: seekdir.c =================================================================== RCS file: /cvs/src/lib/libc/gen/seekdir.c,v retrieving revision 1.11 diff -u -p -r1.11 seekdir.c --- seekdir.c 6 Nov 2013 20:35:25 -0000 1.11 +++ seekdir.c 6 Mar 2014 17:14:16 -0000 @@ -36,6 +36,13 @@ seekdir(DIR *dirp, long loc) */ _MUTEX_LOCK(&dirp->dd_lock); + if (dirp->dd_size && dirp->dd_bufpos == loc) { + dirp->dd_loc = 0; + dirp->dd_curpos = loc; + _MUTEX_UNLOCK(&dirp->dd_lock); + return; + } + for (dirp->dd_loc = 0; dirp->dd_loc < dirp->dd_size; dirp->dd_loc += dp->d_reclen) { @@ -61,7 +68,7 @@ seekdir(DIR *dirp, long loc) * In particular, invalidate dd_loc. */ - dirp->dd_loc = 0; - dirp->dd_curpos = lseek(dirp->dd_fd, loc, SEEK_SET); + dirp->dd_loc = dirp->dd_size; + dirp->dd_bufpos = dirp->dd_curpos = lseek(dirp->dd_fd, loc, SEEK_SET); _MUTEX_UNLOCK(&dirp->dd_lock); } Index: telldir.h =================================================================== RCS file: /cvs/src/lib/libc/gen/telldir.h,v retrieving revision 1.8 diff -u -p -r1.8 telldir.h --- telldir.h 12 Nov 2013 20:19:23 -0000 1.8 +++ telldir.h 6 Mar 2014 17:14:16 -0000 @@ -44,6 +44,7 @@ struct _dirdesc { char *dd_buf; /* data buffer */ int dd_len; /* size of data buffer */ off_t dd_curpos; /* current cookie */ + off_t dd_bufpos; /* cookie of the first entry in dd_buf */ void *dd_lock; /* mutex to protect struct */ };