[r6rs-discuss] Proposed features for small Scheme, part 5: extensible syntax

John Cowan Wed, 09 Sep 2009 12:38:30 -0700

After a good deal of muttering and muddling and whining on #scheme,
I have come up with a proposal for mildly extensible lexical syntax
(known in R5RS as "lexical structure"), such that a newly introduced
type can have appropriate constants specified for it in Scheme code.
Historically, standard Scheme readers have not provided this feature,
though some implementations have provided fully redefinable syntax using
CL-style readtables.


Arbitrarily mutable lexical syntax, however, means that people trying to
read Scheme code may not even be able to *recognize* it as such: such
basic features as (...) can be defined away or redefined.  This may be
useful when constructing lexers at run time, but that doesn't have to
be done through "read".

Readtables, furthermore, have problems in compiled implementations.
In order to make sure that the compiler is aware of the extension,
it's necessary to make sure that the setter procedure, or whatever,
runs at compile time.  This immediately drags in "eval-when" with all
its complications.  I will refer to this as "the phasing problem" below.

SRFI-10 uses #,<list> to invoke a reader constructor specified by
the car of the list, which is looked up in a registry to determine
which procedure to call.  This provides extensibility without full
redefinability, but it still has the phasing problem in compilers.

I am therefore proposing something weaker than SRFI-10 that is not
implemented through the reader alone, but requires cooperation from the
Scheme evaluator.  It is closely analogous to the behavior of backquote,
comma, and comma-atsign, which are expanded by a conformant reader to
S-expressions involving the special identifiers quasiquote, unquote,
and unquote-splicing.  The user then relies on the Scheme evaluator to
construct the expected value; this does not happen when backquotes are
read at run time unless "eval" is called explicitly.  (By contrast, the
CL reader is free to expand backquote to any CL form, system-dependent
or not, which will evaluate to the correct result.)
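
As a small illustration of the backquote behavior described above (an
added sketch, using only standard Scheme): the reader produces a plain
list, and only evaluation builds the value.  The names form, b and c
are arbitrary.

```scheme
;; The reader turns the abbreviations into an ordinary list ...
(define form (quote `(a ,b ,@c)))
(equal? form
        '(quasiquote (a (unquote b) (unquote-splicing c))))  ; => #t

;; ... and only evaluation constructs the expected value.
(define b 2)
(define c '(3 4))
(define value `(a ,b ,@c))
value                                                        ; => (a 2 3 4)
```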

Similarly, when the reader sees #v [<digits>] <identifier> <datum>,
it is expanded to an S-expression of the form
(construct-<identifier>-from-reader <digits> (quote <datum>)).  If there
are no <digits>, the reader substitutes #f.  When this (sub)form is
eventually evaluated, it hopefully constructs and returns an object
of some appropriate type.  The datum is quoted to make it easy for
construct-<identifier>-from-reader to be a procedure rather than a macro.
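
To make the expansion rule concrete, here is a minimal sketch of the
form the reader would build, written as an ordinary procedure.  The name
expand-hash-v is hypothetical and not part of the proposal; a real
reader would do this work while parsing.

```scheme
;; Hypothetical helper: build the S-expression that the reader would
;; produce for  #v [<digits>] <identifier> <datum>.
;; digits is a number, or #f when no digits were written.
(define (expand-hash-v digits identifier datum)
  (list (string->symbol
         (string-append "construct-"
                        (symbol->string identifier)
                        "-from-reader"))
        digits
        (list 'quote datum)))

(expand-hash-v #f 'queue '(1 7 111 10))
;; => (construct-queue-from-reader #f (quote (1 7 111 10)))

(expand-hash-v 8 'queue '(1 2))
;; => (construct-queue-from-reader 8 (quote (1 2)))
```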

The main limitation of this syntax is that it doesn't work in quoted
datums, nor at run time.  I see no way to get those features directly
without re-creating the phasing problem.  Quoted lists can be replaced
with backquoted ones, with the syntax unquoted.  The run time problem
can also be worked around: see below.
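
The quoted-datum limitation and the backquote workaround can be
sketched as follows.  Since no existing reader is assumed to understand
#v, the forms are written out as the reader would expand them, and
construct-queue-from-reader is a stand-in that represents a "queue" as
a tagged vector; both are assumptions for illustration only.

```scheme
;; Stand-in constructor (hypothetical representation of a queue).
(define (construct-queue-from-reader digits datum)
  (vector 'queue digits datum))

;; '(a #v queue (1 2))  would be read as:
(define quoted
  '(a (construct-queue-from-reader #f (quote (1 2)))))

;; `(a ,#v queue (1 2))  would be read as:
(define backquoted
  `(a ,(construct-queue-from-reader #f (quote (1 2)))))

(cadr quoted)      ; => (construct-queue-from-reader #f (quote (1 2)))
(cadr backquoted)  ; => #(queue #f (1 2))
```

In the quoted version the constructor call is never evaluated, so the
list merely contains the unevaluated form; in the backquoted version
the evaluator runs the constructor.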

The choice of #v is primarily for compatibility with the R6RS lexical
syntax #vu8.  In that context, it is presumably a mnemonic for "vector";
in this context, it has no particular meaning.  If R6RS compatibility is
rejected, #s would be another choice, similar to though not compatible
with its use in CL.

I also considered leaving out the "v" altogether, and going with simple
"#<digits><identifier>", with <identifier> required to be at least two
characters long.  But existing Schemes interpret "#table" as "#t able",
which led me to having a single-character flag.

With this feature, the hypothetical queue package I mentioned before
would allow the creation of a literal queue in Scheme code with #v queue
(1 7 111 10), and with the record printing feature, it could be printed
back in the same way.  The package could also provide a procedure that
would accept an arbitrary S-expression and return it (or a copy of it)
with all lists whose car is construct-queue-from-reader replaced with
actual queues, for those who want to read in queue syntax at run time.
(This could be generalized into a procedure that expands all such lists,
but only with the use of eval, since there is no other way to get from
a symbol to its definition at run time.)
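
A rough sketch of such a queue-specific procedure follows.  The names
queue-form? and realize-queues are hypothetical, and the constructor is
again a stand-in that uses a tagged vector; the real queue package
would supply its own.  The sketch does not descend into vectors or into
the quoted datum itself.

```scheme
;; Stand-in constructor (hypothetical representation of a queue).
(define (construct-queue-from-reader digits datum)
  (vector 'queue digits datum))

;; Is x exactly (construct-queue-from-reader <digits> (quote <datum>))?
(define (queue-form? x)
  (and (pair? x)
       (eq? (car x) 'construct-queue-from-reader)
       (pair? (cdr x))
       (pair? (cddr x))
       (null? (cdddr x))
       (let ((q (caddr x)))
         (and (pair? q)
              (eq? (car q) 'quote)
              (pair? (cdr q))
              (null? (cddr q))))))

;; Copy the elements of a list (proper or improper), converting each
;; element but never treating a tail of the list as a form.
(define (realize-tail x)
  (if (pair? x)
      (cons (realize-queues (car x)) (realize-tail (cdr x)))
      x))

;; Return a copy of x with every queue form replaced by a queue.
(define (realize-queues x)
  (cond ((queue-form? x)
         (construct-queue-from-reader (cadr x) (cadr (caddr x))))
        ((pair? x) (realize-tail x))
        (else x)))

(realize-queues
 '(a (construct-queue-from-reader #f (quote (1 7 111 10))) b))
;; => (a #(queue #f (1 7 111 10)) b)
```

Because the walker knows which constructor to call, it needs no eval;
the fully general version would have to look up each
construct-<identifier>-from-reader symbol, which is where eval comes in.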

Note that although the lexical syntax is not lexically scoped,
its use can be.  If you specify #v queue <datum> in a location where
construct-queue-from-reader is not bound, the reader will not complain,
but the Scheme evaluator will complain of an undefined variable in the
ordinary way.
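
For instance, a local binding is enough to give the syntax a meaning in
one region of code.  The sketch below writes the expanded form by hand
and uses a throwaway list representation for the queue; both are
assumptions for illustration.

```scheme
(define local-result
  (let ((construct-queue-from-reader
         (lambda (digits datum) (list 'queue digits datum))))
    ;; What the reader would produce for  #v queue (1 7 111 10)
    (construct-queue-from-reader #f (quote (1 7 111 10)))))

local-result  ; => (queue #f (1 7 111 10))
```

Outside the let, with no other binding in scope, the same form would
be reported by the evaluator as a reference to an undefined variable.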

-- 
Dream projects long deferred             John Cowan <[email protected]>
usually bite the wax tadpole.            http://www.ccil.org/~cowan
        --James Lileks
