On Jan 27, 2004, at 3:47 PM, Cory Spencer wrote:

Perhaps someone with a bit more familiarity with the Parrot IO subsystem
could give me some guidance here. I'm currently trying to get a new
'peek' opcode working, and I'm having difficulties getting the io_unix
layer implemented correctly.


As far as I know, I'd get a call down into the io_unix layer when the
ParrotIO object isn't buffered. What I want to be able to do is to
read()/fread() a character off of the io->fd filedescriptor, copy it into
the buffer, then ungetc() it back onto the stream.

You can't push a character back onto a Unix file descriptor. To emulate this for Parrot, you'll need some storage hanging off of the ParrotIO structure to hold the "pushed back" characters, and then munge the read methods to pull data from there before reading from the real descriptor, if anything has been pushed back. This is essentially what the C std. lib. buffered IO API does--the core Unix IO API doesn't provide this functionality. For Parrot, I think that we should only do this for the io_buf layer (and maybe the io_stdio layer), which is the buffered IO layer and already has a buffer which can be used for this purpose. I don't think it's appropriate for the io_unix layer--I see that as a direct wrapper around the Unix API.
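
To make the "storage hanging off of the structure" idea concrete, here is a minimal sketch in C. It is not Parrot code: pb_io, pb_unread, pb_read, pb_peek and PUSHBACK_MAX are all hypothetical names made up for illustration, and a real version would live wherever the ParrotIO buffer already lives.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define PUSHBACK_MAX 16  /* hypothetical capacity, chosen arbitrarily */

/* Hypothetical stand-in for the relevant part of a ParrotIO structure. */
typedef struct {
    int           fd;                      /* the raw Unix descriptor */
    unsigned char pushback[PUSHBACK_MAX];  /* stack: last pushed is read first */
    size_t        npushed;                 /* number of bytes currently pushed back */
} pb_io;

void pb_init(pb_io *io, int fd)
{
    io->fd = fd;
    io->npushed = 0;
}

/* Push one byte back. Returns 0 on success, -1 if the storage is full. */
int pb_unread(pb_io *io, unsigned char c)
{
    if (io->npushed == PUSHBACK_MAX)
        return -1;
    io->pushback[io->npushed++] = c;
    return 0;
}

/* Read up to len bytes: drain pushed-back bytes first, and only go to
 * the real descriptor when nothing was pushed back. */
ssize_t pb_read(pb_io *io, void *buf, size_t len)
{
    unsigned char *out = buf;
    size_t got = 0;

    while (got < len && io->npushed > 0)
        out[got++] = io->pushback[--io->npushed];

    if (got > 0 || len == 0)
        return (ssize_t)got;  /* short read; avoids blocking on the descriptor */

    return read(io->fd, out, len);
}

/* Peek at the next byte without consuming it.
 * Returns 1 and stores the byte in *c, 0 at EOF, -1 on error. */
int pb_peek(pb_io *io, unsigned char *c)
{
    ssize_t n = pb_read(io, c, 1);
    if (n == 1) {
        int rc = pb_unread(io, *c);  /* cannot fail: we just freed or never used a slot */
        assert(rc == 0);
        (void)rc;
        return 1;
    }
    return (int)n;
}
```

With this in place, two pb_peek() calls in a row return the same byte, and the following pb_read() returns it once more before touching the descriptor again. The short read in pb_read() is a design choice of the sketch: a caller that wants exactly len bytes has to loop, the same as with a plain read().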


Unfortunately, however, ungetc requires a (FILE *), while the ParrotIO object carries around only a raw file descriptor (I think).

Yes, the C std. lib. IO API is a wrapper on top of the core OS IO routines (for Unix or Windows), and we're using the core IO routines to implement our IO functionality, rather than going through an extra layer. (And io_stdio is based on the C std. lib., and I believe is provided so that it can be used on systems for which none of the other base layers is available--non-Unix and non-Windows.)


I've seen some instances where people will cast the raw descriptor to a
(FILE *)

I can't imagine where that would ever work. A (FILE *) is a pointer to a struct which stores various bits of data, including the actual file descriptor. A file descriptor is just an integer, and isn't going to be interpretable as a pointer to such a struct--using the core Unix IO API, no such struct will have been created anywhere in memory. So you can't get a FILE* via any sort of casting, at least not on Unix platforms.


however the man page for ungetc warns ominously in its BUGS
section that:

       It is not advisable to mix calls to input functions from
       the stdio library with low-level calls to read() for the
       file descriptor associated with the input stream; the
       results will be undefined and very probably not what you
       want.

This is warning about something else. It's saying not to do IO on a FILE* through the C API while also using the Unix IO API on the descriptor which is the fileno() of that FILE*. But even this you wouldn't do by casting--you'd either get the descriptor from the FILE* using fileno(), or use fdopen() to create a FILE* from a descriptor.
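
For illustration, here is roughly what the non-casting route looks like with the standard POSIX calls fdopen() and fileno(). The helper names stream_from_fd and stdio_peek are made up for this sketch, and this is the stdio approach, so it is not something for the io_unix layer.

```c
#define _POSIX_C_SOURCE 200809L  /* expose fdopen() and fileno() under strict C modes */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Wrap an already-open descriptor in a stdio stream for reading.
 * Returns NULL on failure. After this, do all reads through the
 * FILE* and none through read() on the descriptor. */
FILE *stream_from_fd(int fd)
{
    return fdopen(fd, "r");
}

/* Peek at the next character of a stdio stream using getc()/ungetc().
 * Returns the character, or EOF at end of file or on error.
 * Only one character of pushback is guaranteed by the C standard. */
int stdio_peek(FILE *fp)
{
    int c = getc(fp);
    if (c != EOF)
        ungetc(c, fp);
    return c;
}
```

Going the other way, fileno(fp) hands back the integer descriptor that the FILE* wraps. Neither direction involves a cast: fdopen() allocates the struct that a cast would have to pretend already exists.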


But in any event, we don't want to use the C std. lib. IO API inside of the io_unix layer.

That being said, what is the best course for buffering such characters at
the io_unix layer? I apparently am not able to use the standard library
functions to do so (additionally, they only guarantee that you can peek
and replace a single character).

As I said above, I think we'd only want to do this for the io_buf layer, though others may disagree. If we do want to do it at the io_unix layer, then we can just copy down a bunch of code from io_buf, because we will be making the io_unix layer a buffered layer (with the difference being that the buffer would only be populated when read items are pushed back, and not during reads).


JEff


