After a good deal of muttering and muddling and whining on #scheme, I have come up with a proposal for mildly extensible lexical syntax (known in R5RS as "lexical structure"), such that a newly introduced type can have appropriate constants specified for it in Scheme code. Historically, standard Scheme readers have not provided this feature, though some implementations have provided fully redefinable syntax using CL-style readtables.
Arbitrarily mutable lexical syntax, however, means that people trying to read Scheme code may not even be able to *recognize* it as such: such basic features as (...) can be defined away or redefined. This may be useful when constructing lexers at run time, but that doesn't have to be done through "read".

Readtables, furthermore, have problems in compiled implementations. To make sure that the compiler is aware of an extension, the procedure that installs it (a setter, or whatever) must run at compile time. This immediately drags in "eval-when" with all its complications. I will refer to this as "the phasing problem" below.

SRFI-10 uses #,<list> to invoke a reader constructor specified by the car of the list, which is looked up in a registry to determine which procedure to call. This provides extensibility without full redefinability, but it still has the phasing problem in compilers.

I am therefore proposing something weaker than SRFI-10 that is not implemented through the reader alone, but requires cooperation from the Scheme evaluator. It is closely analogous to the behavior of backquote, comma, and comma-atsign, which are expanded by a conformant reader to S-expressions involving the special identifiers quasiquote, unquote, and unquote-splicing. The user then relies on the Scheme evaluator to construct the expected value; this does not happen when backquotes are read at run time unless "eval" is called explicitly. (By contrast, the CL reader is free to expand backquote to any CL form, system-dependent or not, which will evaluate to the correct result.)

Similarly, when the reader sees #v [<digits>] <identifier> <datum>, it expands it to an S-expression of the form (construct-<identifier>-from-reader <digits> (quote <datum>)). If there are no <digits>, the reader substitutes #f. When this (sub)form is eventually evaluated, it hopefully constructs and returns an object of some appropriate type.
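To make the expansion concrete, here is a minimal sketch of what the evaluator would end up seeing. The type name "point", the tagged-list representation, and the meaning given to <digits> are all assumptions invented for illustration; only the shape of the expanded form comes from the proposal. The sketch uses "error", which is in R7RS and SRFI 23 but not in R5RS.

```scheme
;; Hypothetical constructor for a made-up "point" type.
;; The reader would pass DIGITS (or #f when absent) and the quoted DATUM.
(define (construct-point-from-reader digits datum)
  ;; Assumption: DIGITS, if present, is the expected number of coordinates.
  (if (and digits (not (= digits (length datum))))
      (error "point: wrong number of coordinates" digits datum)
      (cons 'point datum)))

;; Source text            Form the reader would produce
;; #v 2 point (3 4)  ==>  (construct-point-from-reader 2 (quote (3 4)))
;; #v point (3 4)    ==>  (construct-point-from-reader #f (quote (3 4)))

;; No current reader accepts #v, so the expanded forms are written by hand:
(define p1 (construct-point-from-reader 2 (quote (3 4))))
(define p2 (construct-point-from-reader #f (quote (3 4))))

;; The backquote analogy: a conformant reader turns `(a ,b) into the list
;; (quasiquote (a (unquote b))), so quoting it exposes the special identifier.
(define qq-head (car '`(a ,b)))
```

Because construct-point-from-reader is an ordinary procedure, nothing has to be registered with the reader or run at compile time; the constructor only needs to be bound wherever the expanded form is evaluated.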
The datum is quoted to make it easy for construct-<identifier>-from-reader to be a procedure rather than a macro.

The main limitation of this syntax is that it doesn't work in quoted datums, nor at run time. I see no way to get those features directly without re-creating the phasing problem. Quoted lists can be replaced with backquoted ones, with the #v syntax unquoted. The run time problem can also be worked around: see below.

The choice of #v is primarily for compatibility with the R6RS lexical syntax #vu8. In that context, it is presumably a mnemonic for "vector"; in this context, it has no particular meaning. If R6RS compatibility is rejected, #s would be another choice, similar to though not compatible with its use in CL. I also considered leaving out the "v" altogether, and going with simple "#<digits><identifier>", with <identifier> required to be at least two characters long. But existing Schemes interpret "#table" as "#t able", which led me to having a single-character flag.

With this feature, the hypothetical queue package I mentioned before would allow the creation of a literal queue in Scheme code with #v queue (1 7 111 10), and with the record printing feature, it could be printed back in the same way. The package could also provide a procedure that would accept an arbitrary S-expression and return it (or a copy of it) with all lists whose car is construct-queue-from-reader replaced with actual queues, for those who want to read in queue syntax at run time. (This could be generalized into a procedure that expands all such lists, but only with the use of eval, since there is no other way to get from a symbol to its definition at run time.)

Note that although the lexical syntax is not lexically scoped, its use can be. If you specify #v queue <datum> in a location where construct-queue-from-reader is not bound, the reader will not complain, but the Scheme evaluator will complain of an undefined variable in the ordinary way.
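The run-time workaround described above might look roughly like the following sketch. The record definition, the name expand-queue-forms, and the helper names are hypothetical, and <digits> is simply ignored here; a real queue package would make its own choices. It assumes define-record-type (R7RS or SRFI 9).

```scheme
;; Hypothetical queue representation: a record wrapping a list of items.
(define-record-type queue
  (make-queue items)
  queue?
  (items queue-items))

;; Hypothetical constructor; DIGITS is ignored in this sketch.
(define (construct-queue-from-reader digits datum)
  (make-queue datum))

;; True for (construct-queue-from-reader <digits> (quote <datum>)).
(define (queue-reader-form? x)
  (and (list? x)
       (= (length x) 3)
       (eq? (car x) 'construct-queue-from-reader)
       (let ((q (caddr x)))
         (and (list? q)
              (= (length q) 2)
              (eq? (car q) 'quote)))))

;; Return a copy of X with every such form replaced by an actual queue.
(define (expand-queue-forms x)
  (cond ((queue-reader-form? x)
         (construct-queue-from-reader (cadr x) (cadr (caddr x))))
        ((pair? x) (expand-elements x))
        (else x)))

;; Expand each element of a (possibly improper) list, without ever
;; treating a tail of the list as a form in its own right.
(define (expand-elements x)
  (if (pair? x)
      (cons (expand-queue-forms (car x)) (expand-elements (cdr x)))
      x))

;; What "read" would return for (a #v queue (1 7 111 10) b) under the proposal:
(define input '(a (construct-queue-from-reader #f (quote (1 7 111 10))) b))
(define result (expand-queue-forms input))
```

Here (cadr result) is a queue record whose items are (1 7 111 10), while a and b pass through unchanged. The sketch does not descend into vectors or into the quoted datum itself, and it needs no eval because it handles only the one constructor it knows about.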
-- 
John Cowan <[email protected]>    http://www.ccil.org/~cowan
Dream projects long deferred usually bite the wax tadpole.
        --James Lileks
