This email contains follow-up on some old points on I/O, and a few new ones.
1. [Old] Re my proposal for STANDARD-{INPUT,OUTPUT,ERROR}-PORT. John Cowan (I
think) felt that these were useless. I'm not a big one for rebinding/mutating
current I/O streams; I prefer normally to use ports directly, or write small
blocks of code that use WITH-{INPUT-FROM,OUTPUT-TO}-FILE. However, in a messy
enough program that's constantly switching current ports all over the place,
it's convenient to be able to access the standard ports directly. Obviously,
three lines of code at the beginning of the program will capture them, but I'd
still like to see them brought into the standard. I don't feel strongly on this
matter, but thought I'd take a second kick at the can.
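A minimal sketch of the "three lines of code" mentioned above, assuming an R7RS-style system in which the current ports are parameter objects that PARAMETERIZE can rebind. The STANDARD-*-PORT names are the proposed ones, defined here as ordinary user variables; they are not part of any standard.

```scheme
(import (scheme base))

;; Proposed names, defined by hand. These must be evaluated before
;; anything rebinds the current ports.
(define standard-input-port  (current-input-port))
(define standard-output-port (current-output-port))
(define standard-error-port  (current-error-port))

;; Even while current output is redirected to a string port, the
;; captured port still reaches the original destination.
(define captured
  (let ((sink (open-output-string)))
    (parameterize ((current-output-port sink))
      (write-string "goes to the string port")
      (write-string "goes to the original output" standard-output-port)
      (newline standard-output-port))
    (get-output-string sink)))
```

Here CAPTURED holds only the text written to the redirected port; the second string bypasses the redirection.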
Also, are these ports always defined? Is it possible that CURRENT-INPUT-PORT
might not be set at all, or might have value #f, in some cases? If I'm not
mistaken, a Windows GUI (non-console) executable has no standard input or output.
I would also put in a weak suggestion for CONSOLE-INPUT-PORT and
CONSOLE-OUTPUT-PORT, for situations where the I/O has to be from/to the REAL
terminal, with a proviso that these might not be available under some
implementations (i.e., their value is #f). I don't feel strongly about this,
but thought I'd toss it in for consideration.
2. [Old] The fact that IEEE Scheme is required to be a subset of WG1 is
sufficient reason to include CHAR-READY? and U8-READY?. However, given the
difficulty of implementing them correctly in many environments, it's also
reasonable to discourage programs from using them. A careful reading of the
CHAR-READY? entry shows that it's possible that CHAR-READY? returns #f when
there actually is a character available [*], which exactly matches the case
where you can only find out whether there's any data by attempting to read.
This is either accidental or a brilliant example of VERY careful language
lawyering! I would suggest clarifying this point by adding some remark about
some environments making it extremely difficult to implement CHAR-READY?
reliably, so it might return #f when a character is available, and adding a
similar remark to the U8-READY? entry.
[*] Technically, CHAR-READY? is to return #f `otherwise', when no character is
available. However, nobody can distinguish the case where CHAR-READY? outright
lies, claiming there's nothing there when there is, from the case where there
really WAS no character available, and then 1 zeptosecond later one appeared.
It is not stated that, at the moment a character would be read successfully,
CHAR-READY?, if called, would have to return #t.
[I assume that few if any implementations would use non-blocking I/O just so
they can support CHAR-READY? correctly.]
3. [Old] I had suggested adding a remark that some implementations support
other kinds of sources and sinks besides files (and devices). John remarked that
this is addressed in the first paragraph of §6.7. That says that other kinds of
ports besides binary and character might be provided, which is a different
point. My remark was aimed at conveying that an implementation might provide
other kinds of binary/character ports that the procedures in §§6.7.2 and 6.7.3
will handle.
4. [Old] I had expressed confusion about the notion that binary ports
inherently support character operations. This morning I had an epiphany on this
subject. To me, a `binary port' is a port that is used to read or write
successive octets, while a `character port' contains additional encoding
support, even if it's just end-of-line translation. Thus in C-derived I/O
systems one might do a fopen(filnam, "r") for character reading, and
fopen(filnam, "rb") for binary reading.
This is NOT how these terms are used in the Report! A binary port is one whose
backing store (on disk or elsewhere) contains octets, while a character port
(e.g., a string port) has a backing store containing Scheme characters. The
term `binary' doesn't refer to reading or writing in binary mode, but to the
type of backing store the port uses. This is implied, but not stated, by the
current wording, leaving people like me relatively free to misunderstand the
point.
Short of changing the terminology, which may not be practical, perhaps a
sentence or two defining these terms more precisely could be added.
5. [New] §6.7.1, bottom of col 2, p. 45. WITH-INPUT-FROM-FILE and
WITH-OUTPUT-TO-FILE are defined, but should not WITH-ERROR-TO-FILE also be
added?
6. [New] Most implementations provide a procedure named something like
READ-LINE that reads the next line from an input port. Processing a file by
lines is an extremely common paradigm, and should therefore be supported. I
can rant at great length about why this should be here, but I'll spare you
unless you think it's needed :)
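As an illustration of the line-at-a-time paradigm, here is a small sketch. It assumes a procedure with the common (read-line port) signature that returns an end-of-file object once input is exhausted; the helper name PORT->LINES is made up for this example.

```scheme
(import (scheme base))

;; Collect every line from PORT into a list, in order.
;; Assumes READ-LINE strips the line terminator and returns an
;; eof object when no characters remain.
(define (port->lines port)
  (let loop ((acc '()))
    (let ((line (read-line port)))
      (if (eof-object? line)
          (reverse acc)
          (loop (cons line acc))))))

;; A string port stands in for a file so the example is self-contained.
(define example-lines
  (port->lines (open-input-string "alpha\nbeta\ngamma")))
```

Without READ-LINE, every program that works this way has to rebuild the inner loop from READ-CHAR and its own end-of-line handling.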
7. [New] What happens if both READ-CHAR and READ-U8 are used on the same port?
I can envision several possible answers.
A. legal
B. `it is an error'
C. `an error is signalled'
D. implementation-defined, might be an error in some or all cases
The example I've been thinking of is a UTF-8 encoded file in which one reads
the first octet of a character via READ-U8 and then attempts to do a READ-CHAR.
If those were the options, I'd vote for D, which allows the implementation to
provide additional ways of resynchronizing (e.g., by rewinding the file) that
are outside the scope of WG1. B is also fine. C is implementable, since one
just needs a tri-state variable (neutral/char/u8) in each port, but I'd question
the point of doing this. I'm not sure that A makes any sense.
I don't much care which option (or some other one) is selected, but it's
important to say what happens.
Writing doesn't suffer from this problem; I'm not sure whether symmetry is
important or not.
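To make option C concrete, the tri-state variable can be sketched in user code as a wrapper around a port. This is only an illustration under stated assumptions: TRACKED-PORT and its procedures are hypothetical names, and a real implementation would keep the flag inside the port object itself.

```scheme
(import (scheme base))

;; Hypothetical wrapper: an underlying port plus a mode flag that is
;; one of the symbols neutral, char, or u8.
(define-record-type tracked-port
  (make-tracked-port* port mode)
  tracked-port?
  (port tracked-port-port)
  (mode tracked-port-mode set-tracked-port-mode!))

(define (make-tracked-port port)
  (make-tracked-port* port 'neutral))

;; Claim the port for MODE, or signal an error if it has already
;; been used in the other mode.
(define (claim! tp mode)
  (let ((current (tracked-port-mode tp)))
    (cond ((eq? current 'neutral) (set-tracked-port-mode! tp mode))
          ((not (eq? current mode))
           (error "port already used in the other mode" current mode)))))

(define (tracked-read-char tp)
  (claim! tp 'char)
  (read-char (tracked-port-port tp)))

(define (tracked-read-u8 tp)
  (claim! tp 'u8)
  (read-u8 (tracked-port-port tp)))

;; Usage: the first read fixes the mode; the mixed read is rejected
;; before the underlying port is touched.
(define tp (make-tracked-port (open-input-string "abc")))
(define first-char (tracked-read-char tp))
(define mixed-result
  (guard (e ((error-object? e) 'rejected))
    (tracked-read-u8 tp)))
```

The check costs one comparison per read, which is cheap, but it only detects the mixing; it does nothing to help a program resynchronize.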
8. [New] §6.7.4: LOAD/INCLUDE. Some implementations use LOAD's argument to name
a file, others do some kind of path search, or do some other transformation on
the name. Gambit, for example, uses a prefix of ~~/ to signify looking in the
Gambit directory. I suggest replacing `_filename_ should be a string naming an
existing file containing Scheme source code' with `An implementation-dependent
operation is used to transform _filename_ into the name of an existing file
containing Scheme source code'. Whether the parameter name should still be
_filename_ is not for me to say.
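As a rough illustration of such an implementation-dependent transformation, here is a sketch of a prefix rewrite in the spirit of the ~~/ convention. It is not Gambit's actual code: the directory "/opt/scheme/" and the names INSTALL-DIRECTORY, HAS-PREFIX?, and RESOLVE-LOAD-NAME are all made up for the example.

```scheme
(import (scheme base))

;; Hypothetical installation directory, chosen only for illustration.
(define install-directory "/opt/scheme/")

;; True if STR begins with PREFIX.
(define (has-prefix? prefix str)
  (let ((n (string-length prefix)))
    (and (<= n (string-length str))
         (string=? prefix (substring str 0 n)))))

;; Map the argument given to LOAD onto the name of the file to open.
;; Names starting with "~~/" are looked up under INSTALL-DIRECTORY;
;; all other names are used unchanged.
(define (resolve-load-name filename)
  (if (has-prefix? "~~/" filename)
      (string-append install-directory
                     (substring filename 3 (string-length filename)))
      filename))

(define resolved (resolve-load-name "~~/lib/foo.scm"))  ; "/opt/scheme/lib/foo.scm"
(define untouched (resolve-load-name "foo.scm"))        ; "foo.scm"
```

Under the proposed wording, any mapping of this kind would conform, including a path search or no transformation at all.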
-- vincent