Re: [Full-disclosure] Re: readdir_r considered harmful
>In practice, you're correct. In theory, however, consider the >following code >path. > > >> THREAD 1 THREAD 2 >> ---- >> DIR *d1 = opendir(dir1); >> DIR *d2 = opendir(dir2); >> dent1 = readdir(dir1); >> dent2 = readdir(dir2); >> use(dent1); >> > >In most implementations, dent1 != dent2. HOWEVER, there is no >guarantee that >they will not both point to the same statically allocated buffer, and >some >implementations may do so. For example, this is why ctime_r exists: >ctime >returns a pointer to a statically allocated buffer, and hence is not >thread >safe. The standard actually guarantees that the static storage is associated with the specific directory STREAM. So a system on which dent1 and dent2 point to the same buffer and reads from one stream affect the buffer returned by reads from another stream are not POSIX compliant. See: http://www.opengroup.org/onlinepubs/009695399/functions/readdir.html "The pointer returned by readdir() points to data which may be overwritten by another call to readdir() on the same directory stream. This data is not overwritten by another call to readdir() on a different directory stream." But is also goes on to say: "The readdir() function need not be reentrant. A function that is not required to be reentrant is not required to be thread-safe." which is the one thing I like POSIX to fix for thread safe implementations. Casper
Re: [Full-disclosure] Re: readdir_r considered harmful
[EMAIL PROTECTED] wrote: ... >Had they done so, we would never have had to use readdir_r() and progammers >would not have introduced bugs in the (mis)use of pathconf, over allocating, >etc. > >I would be interested in seeing any real-world use of readdir_r() in >a context where readdir_r() is required (multiple threads reading from >a single DIR *). > > Consider the following situation(I'm not sure if anyone actually does this): 1) You have a "spool" directory containing a large number of files, each which represents a task to process. 2) You have a number of worker threads. Each worker thread reads a file from the global DIR*, and then opens and reads the file for processing(and then loops on 2). Of course, you could always just put a mutex around every call to readdir(), and copy the filename somewhere safe, or you could invent a signalling system to ask one thread to do all the readdir()s. Whether this makes sense depends on how much of readdir_r has to be spent inside a global mutex/spinlock anyway, and how long the processing part takes compared with the readdir() part. Andrew
Re: [Full-disclosure] Re: readdir_r considered harmful
>On 11/6/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: >> I don't see how that is relevant; the typical use of readdir() is as follows: >> >> DIR *dirp = opendir(name); >> >> while ((dent = readdir(dirp)) != NULL) { >> ... >> } >> >> closedir(dirp); >> >> Nothing other threads do with readdir() on different dirp's will influence >> what "dent" points to. > >The issue is multiple threads using the same DIR. No, it isn't. I certainly limited the scope of my contribution to single threads reading from a DIR. All the 80-odd uses of readdir_r() in the Solaris core source code, all can (and should) be replaced with readdir(). All have a single thread reading and reusing the same "struct dirent", so readdir() could be used in POSIXly correct fashion if the following sentence in the open group's manual page was not present: "The readdir() function need not be reentrant. A function that is not required to be reentrant is not required to be thread-safe." I believe that this is an error in POSIX; when "threadedness" was added the manual page could have been changed to indicate that a single thread using the above idiom was safe. Had they done so, we would never have had to use readdir_r() and progammers would not have introduced bugs in the (mis)use of pathconf, over allocating, etc. I would be interested in seeing any real-world use of readdir_r() in a context where readdir_r() is required (multiple threads reading from a single DIR *). Casper
Re: [Full-disclosure] Re: readdir_r considered harmful
On 11/6/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > I don't see how that is relevant; the typical use of readdir() is as follows: > > DIR *dirp = opendir(name); > > while ((dent = readdir(dirp)) != NULL) { > ... > } > > closedir(dirp); > > Nothing other threads do with readdir() on different dirp's will influence > what "dent" points to. The issue is multiple threads using the same DIR.
Re: [Full-disclosure] Re: readdir_r considered harmful
>Then you never really understood the implementation, seems. Of course >all implementations keep the content of the directory as read with >getdents or so in the DIR descriptor. But it is usually not the case >that the whole content fits into the buffer allocated. One could, of >course, resize the buffer to fit the content of the directory read, >even if this means reserving hundreds or thousands of kBs. But this >is not how most implementations work. I don't see how that is relevant; the typical use of readdir() is as follows: DIR *dirp = opendir(name); while ((dent = readdir(dirp)) != NULL) { ... } closedir(dirp); Nothing other threads do with readdir() on different dirp's will influence what "dent" points to. I have *never* seen a program where multiple threads read from a single dirp; and I can't image the use. >Instead implementations keep work similar to every buffered file I/O >operation. But this means that buffer content is replaced. If this >happens and some thread uses readdir() instead of readdir_r(), the >returned string pointer suddenly becomes invalid since it points to >memory which has been replaced. Yes, the next call to readdir() *on the same dirp* may change what the previous call; but that's completely irrelevant for most uses of readdir(). Of course, an application may want to save all readdir() return values, but that is completely orthogonal to threads; there is no reason why the POSIX *thread* specification includes readdir_r(). >Next time, before you make such comments, ask Don Cragun to explain >things to you. Next time before you mail, you might want to engage your brain. There is NO reason for a thread-safe library to use readdir_r() over readdir(), with common readdir() implementations. Casper
Re: [Full-disclosure] Re: readdir_r considered harmful
On 11/5/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > Why not: > > 4. Require the readdir() implementation to use state local to dirp. > > I've never understood the rationale behind readdir_r; Then you never really understood the implementation, seems. Of course all implementations keep the content of the directory as read with getdents or so in the DIR descriptor. But it is usually not the case that the whole content fits into the buffer allocated. One could, of course, resize the buffer to fit the content of the directory read, even if this means reserving hundreds or thousands of kBs. But this is not how most implementations work. Instead implementations keep work similar to every buffered file I/O operation. But this means that buffer content is replaced. If this happens and some thread uses readdir() instead of readdir_r(), the returned string pointer suddenly becomes invalid since it points to memory which has been replaced. Next time, before you make such comments, ask Don Cragun to explain things to you.
Re: [Full-disclosure] Re: readdir_r considered harmful
On 06 Nov 05, at 01:00, [EMAIL PROTECTED] wrote: Then you never really understood the implementation, seems. Of course all implementations keep the content of the directory as read with getdents or so in the DIR descriptor. But it is usually not the case that the whole content fits into the buffer allocated. One could, of course, resize the buffer to fit the content of the directory read, even if this means reserving hundreds or thousands of kBs. But this is not how most implementations work. I don't see how that is relevant; the typical use of readdir() is as follows: DIR *dirp = opendir(name); while ((dent = readdir(dirp)) != NULL) { ... } closedir(dirp); Nothing other threads do with readdir() on different dirp's will influence what "dent" points to. I have *never* seen a program where multiple threads read from a single dirp; and I can't image the use. In practice, you're correct. In theory, however, consider the following code path. THREAD 1 THREAD 2 ---- DIR *d1 = opendir(dir1); DIR *d2 = opendir(dir2); dent1 = readdir(dir1); dent2 = readdir(dir2); use(dent1); In most implementations, dent1 != dent2. HOWEVER, there is no guarantee that they will not both point to the same statically allocated buffer, and some implementations may do so. For example, this is why ctime_r exists: ctime returns a pointer to a statically allocated buffer, and hence is not thread safe. You are correct, though, that the glibc implementation of readdir is thread-safe, so readdir_r is unnecessary in all common situations. PGP.sig Description: This is a digitally signed message part