This is a supplementary proposal for I/O in the "small Scheme" language
beyond that of R5RS.  I am publishing this document to invite wide
comment.  There is nothing official about it.  I acknowledge the members
of the r6rs-discuss list for their assistance.  As before, I retain sole
responsibility for it, including all errors.

Those bindings formerly proposed in part 2 that still remain have
migrated to http://tinyurl.com/thing-one.  The purpose of this document,
therefore, is primarily to explain how the newly introduced binary I/O
interoperates (or does not) with textual I/O.  Textual I/O is essentially
unchanged from R5RS, except for the introduction of read-line, which
understands all types of newlines that the Scheme implementation can
support: at least CR, LF, CR+LF, and possibly NEL, CR+NEL, and LS as well.
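To make the newline handling concrete, here is a rough sketch in Python
(not Scheme, and not part of the proposal) of a read-line that accepts
all six terminators listed above.  The CharPort class and its peek_char
and read_char methods are hypothetical stand-ins for a textual port, and
returning None at end of file is an assumption of the sketch.

```python
CR, LF, NEL, LS = "\r", "\n", "\u0085", "\u2028"


class CharPort:
    """Hypothetical textual port over an in-memory string."""

    def __init__(self, text):
        self._text = text
        self._pos = 0

    def peek_char(self):
        # Return the next character without consuming it, or None at EOF.
        if self._pos < len(self._text):
            return self._text[self._pos]
        return None

    def read_char(self):
        # Return and consume the next character, or None at EOF.
        ch = self.peek_char()
        if ch is not None:
            self._pos += 1
        return ch


def read_line(port):
    """Return the next line without its terminator, or None at EOF."""
    ch = port.read_char()
    if ch is None:
        return None
    chars = []
    while ch is not None:
        if ch == CR:
            # CR alone ends the line; CR+LF and CR+NEL count as one terminator.
            if port.peek_char() in (LF, NEL):
                port.read_char()
            break
        if ch in (LF, NEL, LS):
            break
        chars.append(ch)
        ch = port.read_char()
    return "".join(chars)
```

An implementation that supports only CR, LF, and CR+LF would simply
drop NEL and LS from the two membership tests.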

In an ideal situation, there would be distinct binary and textual ports,
and textual ports would be created from binary ports by adding encoding
and related information.  This is sort of what happens in R6RS, except
that when a textual port is created from a binary port, the binary port
is side-effected so that it remains open but the program can do nothing
with it.

The rationale for this is that the textual port may do an arbitrary
amount of buffering (encoding conversion is more efficient if done
by blocks than by characters), leaving the state of the underlying
binary port completely indeterminate.  Given that the binary port is
being side-effected anyway, I don't see the R6RS ports library as an
acceptable implementation of the ideal.

Alternatively, textual and binary ports could be completely
separate with no way to derive or convert one to the other.  This,
however, means that many of the R5RS procedures must be duplicated: to
input-port? we add binary-input-port?, and to open-file-for-input we add
open-file-for-binary-input (which dynamically binds current-binary-input),
for example.  This duplication seems inappropriate in a small Scheme.
What's more, it will be a mere matter of history that the short names
will be assigned to the textual operations, and the longer names to the
binary operations (unless indeed we triple the number of names).

A third alternative is to omit binary I/O altogether.  One of the use
cases for small Scheme, however, is in embedded systems.  If you don't
have any files, but do process packets from the network, you need to be
able to transput those packets without interference from any encoding.

So I reluctantly came to the fourth alternative.  A newly opened port
is ambiguous.  If R5RS operations are done on it, it is marked textual
and remains so.  If binary operations are done on it, it is marked binary
and remains so.  This is a mess, but the need to remain compatible with
R5RS/IEEE and the desire to avoid duplicating many of its routines led me
to see it as the least bad alternative.

The binary operations provided are minimal: read-u8 (which is like
read-char), write-u8 (like write-char), and file-u8-position and
set-file-u8-position!.  The names are like those of R6RS, but use read
and write instead of get and put, and inject u8 into the file-position
names to make it clear that they only work on binary ports.
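The marking scheme and the minimal binary operations can be modelled
together as a small state machine.  The following Python sketch is only
an illustration: the Port and PortModeError names are hypothetical, and
several behaviours are assumptions that the text above does not settle,
namely that a mismatched operation raises an error, that the position
operations themselves mark an ambiguous port as binary, and that
read_char decodes one octet as one Latin-1 character.

```python
import io


class PortModeError(Exception):
    """Operation does not match the port's mark (assumed behaviour)."""


class Port:
    """Hypothetical port that starts ambiguous and is marked on first use."""

    def __init__(self, raw):
        self._raw = raw    # underlying byte stream
        self.mode = None   # None means still ambiguous

    def _mark(self, mode):
        # The first operation fixes the mode; it never changes afterwards.
        if self.mode is None:
            self.mode = mode
        elif self.mode != mode:
            raise PortModeError(f"{mode} operation on a {self.mode} port")

    def read_char(self):
        # Stand-in for the R5RS textual operations.
        self._mark("textual")
        octet = self._raw.read(1)
        return octet.decode("latin-1") if octet else None  # assumed encoding

    def read_u8(self):
        self._mark("binary")
        octet = self._raw.read(1)
        return octet[0] if octet else None

    def write_u8(self, octet):
        self._mark("binary")
        self._raw.write(bytes([octet]))

    def file_u8_position(self):
        self._mark("binary")
        return self._raw.tell()

    def set_file_u8_position(self, position):
        self._mark("binary")
        self._raw.seek(position)
```

For example, after Port(io.BytesIO(b"\x00\xff")).read_u8() the port is
marked binary, and a later read_char on it fails under the assumptions
stated above.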

I looked at R6RS file-options, buffer-modes, codecs, and newline-styles,
but thought they were unnecessary for small Scheme.  Implementations are
urged to apply system defaults, depending either on the OS (e.g., LF as
the default output for newline on Unix) or on the setting of an
OS-specific or implementation-specific locale.

-- 
"Well, I'm back."  --Sam        John Cowan <[email protected]>
