On Tue, Apr 12, 2022 at 10:42:02AM +0100, Geoff Clare via austin-group-l at The 
Open Group wrote:
> Rob Landley wrote, on 11 Apr 2022:
> > A bunch of protocols (git, http, mbox, etc) start with lines of data
> > followed by a block of data, so it's natural to want to call
> > getline() and then handle the data block. But getline() takes a FILE
> > * and things like zlib and sendfile() take an integer file
> > descriptor.

> > Posix lets me get the file descriptor out of a FILE * with fileno(),
> > but the point of FILE * is to readahead and buffer. How do I get the
> > buffered data out without reading more from the file descriptor?

> > I can't find a portable way to do this?

> I tried this sequence of calls on a few systems, and it worked in the
> way you would expect:

>     fgets(buf, sizeof buf, fp);
>     int fd = dup(fileno(fp));
>     close(fileno(fp));
>     while ((ret = fread(buf, 1, sizeof buf, fp)) > 0) { ... }
>     read(fd, buf, sizeof buf);

> It relies on fread() not detecting EBADF until it tries to read more
> data from the underlying fd.

> It has some caveats:

> 1. It needs a file descriptor to be available.

> 2. The close() will remove any fcntl() locks that the calling process
>    holds for the file.

> 3. In a multi-threaded process it has the usual problem around fd
>    inheritance, but that's addressed in Issue 8 with the addition
>    of dup3().

There is another dangerous problem: if another thread or a signal
handler allocates another fd and it is assigned the number fileno(fp),
the while loop might read data from a completely unrelated file. This
could be avoided by dup2/dup3'ing /dev/null onto fileno(fp) instead of
closing it (at the cost of another file descriptor).
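
A minimal sketch of that variant, for illustration only. The helper name
detach_fd() is made up here, and the approach still assumes that stdio
hands out its already-buffered data before it touches the underlying
descriptor again, which the standard does not promise:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns a new descriptor for the file underlying
 * fp and atomically points fileno(fp) at /dev/null, so the descriptor
 * number is never free for another thread or signal handler to reuse.
 * Returns -1 on failure, in which case fp is left untouched.
 */
int detach_fd(FILE *fp)
{
    assert(fp != NULL);

    int fd = dup(fileno(fp));            /* keeps the open file description */
    if (fd < 0)
        return -1;

    int nullfd = open("/dev/null", O_RDONLY);
    if (nullfd < 0) {
        close(fd);
        return -1;
    }

    /* dup2() closes the old fileno(fp) and installs /dev/null in one step. */
    if (dup2(nullfd, fileno(fp)) < 0) {
        close(nullfd);
        close(fd);
        return -1;
    }
    close(nullfd);                       /* only the duplicate is needed */
    return fd;
}
```

After a call such as fd = detach_fd(fp), the fread() loop from the quoted
sequence drains whatever stdio had buffered and then sees EOF from
/dev/null rather than EBADF, and read(fd, ...) continues with the rest of
the file.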

> Also, for the standard to require it to work, I think we would need to
> tweak the EBADF error for fgetc() (which fread() references) to say:

>     The file descriptor underlying stream is not a valid file
>     descriptor open for reading and there is no buffered data
>     available to be returned.

Although I don't expect it to break in practice, the close(fileno(fp))
or dup2(..., fileno(fp)) violates the rules about the "active handle" in
XSH 2.5.1 Interaction of File Descriptors and Standard I/O Streams.

I believe the "correct" solution with a stdio implementation that
doesn't offer something like freadahead() is not to use stdio at all
but to implement your own buffering.
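
As a rough sketch of what such buffering could look like (struct fdbuf,
fdbuf_getline() and fdbuf_drain() are names invented for this example,
not an existing API): read lines through a small buffer that the caller
owns, then take out the read-ahead bytes before handing the descriptor
to something else.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical read buffer owned by the caller; zero-initialize pos/len. */
struct fdbuf {
    int fd;
    size_t pos, len;          /* unread bytes are data[pos..len) */
    char data[4096];
};

/*
 * Copies one line (including the '\n', if any) into out and
 * NUL-terminates it. Returns the length, 0 at EOF, or -1 on a read
 * error. A line longer than outsz-1 bytes is returned in pieces.
 */
ssize_t fdbuf_getline(struct fdbuf *b, char *out, size_t outsz)
{
    assert(b != NULL && out != NULL && outsz > 0);

    size_t n = 0;
    while (n + 1 < outsz) {
        if (b->pos == b->len) {          /* buffer empty: refill from fd */
            ssize_t r;
            do {
                r = read(b->fd, b->data, sizeof b->data);
            } while (r < 0 && errno == EINTR);
            if (r < 0)
                return -1;
            if (r == 0)
                break;                   /* EOF */
            b->pos = 0;
            b->len = (size_t)r;
        }
        char c = b->data[b->pos++];
        out[n++] = c;
        if (c == '\n')
            break;
    }
    out[n] = '\0';
    return (ssize_t)n;
}

/* Moves out bytes that were read ahead; never touches the descriptor. */
size_t fdbuf_drain(struct fdbuf *b, char *out, size_t outsz)
{
    size_t n = b->len - b->pos;
    if (n > outsz)
        n = outsz;
    memcpy(out, b->data + b->pos, n);
    b->pos += n;
    return n;
}
```

Once fdbuf_drain() returns 0 the buffer is empty, and b->fd can be passed
to read(), sendfile() or zlib directly without losing or duplicating
data.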

-- 
Jilles Tjoelker
