This is a supplementary proposal for I/O in the "small Scheme" language beyond that of R5RS. I am publishing this document to invite wide comment. There is nothing official about it. I acknowledge the members of the r6rs-discuss list for their assistance. As before I retain sole responsibility for it, including all errors.

The bindings formerly proposed in part 2, the ones that are left, have migrated to http://tinyurl.com/thing-one. The purpose of this document, therefore, is primarily to explain how the newly introduced binary I/O interoperates (or does not) with textual I/O. Textual I/O is essentially unchanged from R5RS, except for the introduction of read-line, which understands all types of newlines that the Scheme implementation can support: at least CR, LF, CR+LF, and possibly NEL, CR+NEL, and LS as well.

In an ideal situation, there would be distinct binary and textual ports, and textual ports would be created from binary ports by adding encoding and related information. This is sort of what happens in R6RS, except that when a textual port is created from a binary port, the binary port is side-effected so that it remains open but the program can do nothing with it. The rationale for this is that the textual port may do an arbitrary amount of buffering (encoding conversion is more efficient if done by blocks than by characters), leaving the state of the underlying binary port completely indeterminate. Given that the binary port is being side-effected anyway, I don't see the R6RS ports library as an acceptable implementation of the ideal.

Alternatively, textual and binary ports could be completely separate with no way to derive or convert one to the other. This, however, means that many of the R5RS procedures must be duplicated: to input-port? we add binary-input-port?, and to open-file-for-input we add open-file-for-binary-input (which dynamically binds current-binary-input), for example. This duplication seems inappropriate in a small Scheme. What's more, it will be a mere matter of history that the short names will be assigned to the textual operations, and the longer names to the binary operations (unless indeed we triple the number of names).

A third alternative is to omit binary I/O altogether. One of the use cases for small Scheme, however, is in embedded systems.
If you don't have any files, but do process packets from the network, you need to be able to transput those packets without interference from any encoding.

So I reluctantly came to the fourth alternative. A newly opened port is ambiguous. If R5RS operations are done on it, it is marked textual and remains so. If binary operations are done on it, it is marked binary and remains so. This is a mess, but the need to remain compatible with R5RS/IEEE and the desire to avoid duplicating many of its routines led me to see it as the least bad alternative.

The binary operations provided are minimal: read-u8 (which is like read-char), write-u8 (like write-char), and file-u8-position and set-file-u8-position!. The names are like those of R6RS, but use read and write instead of get and put, and inject u8 into the file-position names to make it clear that they only work on binary ports.

I looked at R6RS file-options, buffer-modes, codecs, and newline-styles, but thought they were unnecessary for small Scheme. Implementations are urged to apply system defaults, either depending on the OS (e.g., LF as the default output for newline on Unix) or on the setting of an OS-specific or implementation-specific locale.

--
"Well, I'm back." --Sam
John Cowan <[email protected]>
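
The marking rule for ambiguous ports can be sketched in a few lines of Scheme. This is only a model of the bookkeeping, not real I/O: the record type <port-state> and the procedures open-ambiguous and mark! are hypothetical names invented for illustration, and the sketch assumes an implementation that provides define-record-type and error (R7RS-style, or SRFI 9 and SRFI 23). The proposal does not say what happens when the wrong kind of operation is attempted on a marked port; signalling an error here is an assumption.

```scheme
(import (scheme base))

;; Sketch only: models the proposed port "marking" rule, not real I/O.
;; The record type and all procedure names below are hypothetical.
(define-record-type <port-state>
  (make-port-state mode)
  port-state?
  (mode port-state-mode set-port-state-mode!))

;; A newly opened port starts out ambiguous.
(define (open-ambiguous)
  (make-port-state 'ambiguous))

;; Record that an operation of the given kind ('textual or 'binary)
;; is being done on the port.  An ambiguous port takes on that kind;
;; a marked port keeps its mark.  Signalling an error on a mismatch
;; is an assumption, not something the proposal specifies.
(define (mark! port kind)
  (let ((mode (port-state-mode port)))
    (cond ((eq? mode 'ambiguous) (set-port-state-mode! port kind))
          ((eq? mode kind) #t)
          (else (error "port is already marked" mode)))))

(define p (open-ambiguous))
(mark! p 'binary)   ; e.g. the first read-u8 on the port
(mark! p 'binary)   ; further binary operations are fine
;; (mark! p 'textual) would signal an error here: p remains binary
```

In this model, read-u8 and write-u8 would call (mark! port 'binary) before touching the port, and the R5RS operations such as read-char and write-char would call (mark! port 'textual).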
