John,
I would like to make a counter-proposal to your module proposal. The
proposal is at
<http://my.opera.com/arcfide/blog/2009/10/14/a-philosophy-on-scheme-modules>
but I will include a version at the bottom as well for the archives. My
comments on your system, and how it compares with my proposal,
follow.
On Mon, 12 Oct 2009 00:23:33 -0400, John Cowan <[email protected]> wrote:
> The syntax I am proposing is mostly a subset, with two extensions, of
> the R6RS library syntax. However, I'm using the keyword "module" rather
> than "library". This is a temporary feature pending the acceptance of
> the extensions in large Scheme's module system; if they are accepted,
> the two systems will be directly upward compatible and can share the
> "library" keyword.
We might as well use the "library" keyword, because it is the one that is
already standardized. I also don't agree with choosing a subset of R6RS
libraries. I believe that we should try to maintain backwards
compatibility. My proposal is completely backwards compatible with R6RS.
The reason I say this is that I don't think the core module system should
be changed by WG2. If it is, then this hampers portability between the two
systems. I would prefer to assure implementations that at least some
level of communication is possible between the working groups here. I
don't think choosing only a subset is going to be good enough in this
case. The R6RS library form is simple enough that I think WG1 can deal
with it.
> A small Scheme module takes the general form:
>
> (module module-name
> [(export name ...)]
> [(import importspec ...)]
> [(feature-groups name ...)]
> form ...)
>
> This defines a module with the name module-name, which is either a simple
> identifier or a list of simple identifiers. The utility of hierarchical
> namespaces has long been established, but I propose that small Scheme
> allow simple identifiers for simple cases.
Yes, I agree with this, and my proposal does the same. However,
I also think that we should allow for anonymous modules,
which is important when we want to extend the module features and create
new module languages on top of the existing forms, or when they are used
for other purposes where a name should not be used.
> The export list specifies the names exported by this module. Renaming
> identifiers on export is not part of this proposal, though some form of
> "deprefixing" would be a Good Thing. I'm not sure just how to specify
> it, though.
My proposal deprefixes on import. Additionally, my proposal makes the
module forms syntactic, so that they can be generated by macros, making it
possible to enhance the module system with new export forms that will do
interesting things like renaming. It also has one additional feature in
the export list taken from
almost all other module systems: the identification of syntax as syntax,
making it possible to specify the exact dependencies of a macro, greatly
improving a library's compilability.
> The import list specifies the modules imported by this module.
> Import specs can be either module names, or else take the form (prefix
> module-name name). In the latter case, the exported identifiers of
> the specified module are prefixed with prefix (an identifier) before
> importing.
I see no reason not to go with the standard R6RS import forms plus a
drop-prefix and alias form. I also think that the import should be its own
form, so that it can appear in any definition context, just like modules
should be allowed to do so.
> The feature-groups list specifies the feature groups which this
> implementation must provide in order for the module to work. This is
> an extension to R6RS, which does not have feature groups. If the
> implementation cannot load the feature groups dynamically, it must
> fail to define the module. (Why aren't feature groups just modules?
> Because they don't define separate namespaces, and because they mostly
> reflect the limitations of particular small Scheme systems rather than
> components that can be loaded into them.)
I am going to argue that this is a new feature we should not implement in
the standard. We can try it as an experiment somewhere else, but I don't
think such an idea is ready for the standardization process yet, and
because no Scheme system does this currently, I do not have this in my
proposal. Instead, I propose that feature parameters can be used, or some
other set of forms not tied to the module system, until we have a tested
implementation.
> The forms constituting the body of the module are either Scheme
> definitions, or Scheme expressions, or (include "filename") forms in any
> order. The semantics of the ordering are the same as that of a REPL;
> in particular, any forward reference to an unknown name is presumed
> to be to an unknown variable rather than to an unknown syntax; macros
> cannot be used before they are defined. Include forms cause the direct
> incorporation of the contents of the form, and differ from load in two
> ways: they refer only to source code files, and they happen at macro
> expansion time rather than run time. These rules deviate from those
> of R6RS.
I don't think we should deviate here. There's no real need to do so
either. Most existing systems allow forward macro references, among other
things, and a module form will be evaluated as a whole on the REPL, so
there is no need to worry that we won't know for sure one way or the
other. This is a restriction that I don't like. I do suggest that include
should be a form, but it should be usable anywhere. I also propose the
addition of INCLUDE/CI for case-insensitive includes. This may not be
necessary, but in practice it makes a difference with code that doesn't use
the #!case-fold specifier (legacy code that you don't want to change).
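To make the forward-reference point concrete, here is a minimal sketch of
a standard R6RS library whose body uses a macro before defining it. The
library name (demo forward) and both definitions are invented for
illustration. As I read the R6RS expansion rules, the right-hand side of
double-all is not expanded until every definition in the body has been
seen, so the use of twice resolves to the macro:

```scheme
;; Hypothetical library; only the (rnrs) import is standard.
(library (demo forward)
  (export double-all)
  (import (rnrs))

  ;; Uses the macro `twice` before its definition appears in the body.
  (define (double-all xs)
    (map (lambda (x) (twice x)) xs))

  ;; Defined after its first use; the whole body is scanned for
  ;; definitions before any right-hand side is expanded.
  (define-syntax twice
    (syntax-rules ()
      [(_ e) (* 2 e)])))
```

Under REPL-style ordering, the same two definitions entered one at a time
would treat twice as an unknown variable when double-all is defined.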
> Importing a module causes any code in, or included in, the module to
> be loaded if the implementation can figure out where to load it from.
> In some cases this may not be possible, and the code will have to be
> loaded separately.
Be careful about the differences in visiting, evaluating, loading, &c.
What if I LOAD the library beforehand and it has some side-effects? When
should those side-effects take place? I don't know that we should specify
this any more than the R6RS did, to accommodate people who think that the
library should be instantiated the moment it is loaded, as well as those
who think it should be at the import time or even the usage time, some of
whom may instantiate it multiple times.
> In the top level (that is, outside any module), import and include
> are both available and do what you'd expect at macro expansion time.
> Unlike R6RS top-level programs, the small Scheme top level is not
> required to have all imports at the beginning.
There should be no need to specify a different treatment of IMPORT or
INCLUDE at the top-level. It can work just the same as it works inside, if
import and include are just their own forms. Include can work much like a
BEGIN. I suggest this because I like the idea of BEGIN making it possible
to have forward references and the like.
> Module names, like feature groups, are recognized by cond-expand (from
> SRFI 0), which makes it possible to conditionally control the actions
> of a program depending on what modules have been imported into it.
I don't want cond-expand brought into this at all.
> Other than as specified here, the provisions of R6RS section 7.1 apply.
> Since there are no macros except syntax-rules macros, implicit phasing
> is sufficient. What, if anything, to do about version numbers on modules
> remains an open issue: I don't believe they are necessary for small
> Scheme.
I agree that they probably aren't necessary, but I believe it is good to
keep things backwards and upwards compatible, meaning that WG2 doesn't
end up with a module system semantically different from WG1's. Let's just
go with one single general module system that is sufficiently simple, but
general enough. See my counter-proposal for my opinions on this.
Aaron W. Hsu
Counter-proposal follows:
A Philosophy on Scheme Modules
Tuesday, 13. October 2009, 23:26:36
Tags/Keywords: scheme, wg2, modules, wg1, tech, libraries, r6rs,
languages, r6rs-discuss, programming
Almost every Scheme system worth its salt has some means of encapsulating
a set of definitions and expressions and providing controlled visibility
to those bindings created in some fashion. Recent discussions on
r6rs-discuss have brought the classic debate about modules and
library/packaging systems to the forefront of the Scheme community yet
again.
In this article, I hope to address my own concerns regarding a module
system, make some observations about solutions, and propose a direction
for the development of a standardized module system. The end result is, I
hope, a module system that feels "Schemely;" that is, a general,
expressive construct that provides the means of satisfying the needs of
the entire Scheme community with regards to module systems. This goal may
never be fully achieved, but the ideas I present here, I hope, will make
it possible to move towards that goal.
A Brief Historical Examination
Before I tackle this issue in full, I would like to examine some of the
history of Scheme module systems.
Before module systems were widespread, forms like EVAL-WHEN and LOAD were
used to control the order of loading of files containing Scheme code. Note
that there was no control of visibility here, just controlling the order
of evaluation of Scheme code.
Later, a number of systems for managing Scheme code were developed. I will
mention the three systems most relevant to my discussion here: the
Scheme48 module system, the Chez Scheme module system, and the R6RS
Library form. These three systems represent two opposite approaches to
Scheme modules as well as the current module standard, which is, of
course, a compromise among the systems.
The Scheme48 package system (which I can only discuss at a high level,
since I have not actively used this packaging system for some time)
represents the philosophies of separate implementation and interface,
separate package declaration from code locations, and the idea that module
systems exist more as a metalanguage than as a syntax of the Scheme
language. In a sense, these modules represent a static description of the
relations between groups of definitions and expressions.
The Chez Scheme module system represents the other end of the spectrum,
where a module form is viewed as just another syntactic extension of the
language for controlling scope and visibility. There is a basic form for
encapsulating code and making only select identifiers visible to the
outside world. Additionally, there is an import form which may appear in
any place that a definition may.
The R6RS Library form represents a compromise of goals between the two
systems. It cannot be generated by macros, and the import form is a direct
part of the R6RS library form. However, there is no separation of
interface and the module form itself is tied to the source code unless
additional macros are used to separate the source code from the library
form.
Modules versus Packages
The term "module system" can confuse the discussion somewhat, because
people have different ideas of what should constitute a sufficient module
system. Generally, there are those who look at module systems as package
management systems, which are used as a "distribution format" for source
code. This is much like the Scheme48 approach. Others view modules as a
building block inside of Scheme, like the Chez Scheme module system.
Originally, the R6RS library form was touted as being a distribution
system by some. Unfortunately, it lacks some of the important features
that many consider necessary for that purpose. At the same time, it fails
to satisfy the needs of the syntactic module crowd, since the library
system is entirely top-level and static.
Desirables in a Module System
What important features should be in a module system? Again, the two camps
of module philosophies will have two different answers, some of which will
not agree.
The "packaging system" crowd's primary use of modules is to describe
interfaces to code, to make discoverable descriptions of that code, and to
make it easier to control the loading and evaluating of the various
components. Generally speaking, these systems benefit from the following
features:
  * Static, top-level metalanguages for package descriptions.
  * Separation of source code, implementation modules, and interfaces.
  * Some way to map libraries to files to make loading of software more
    automatic.
The syntactic module crowd favors the ability to use modules in a variety
of locations to do micro-packaging. This means they may be generated by
macros, and may not even have names that map directly to files at all. The
forms may also be more closely tied to the rest of the Scheme code,
because the code itself is generating modules. The features that tend to
be important here are:
  * Syntactic, dynamic module forms that can occur inside of code, and not
    just at the top-level.
  * The ability to create anonymous modules.
Generally speaking, the two crowds don't approve of the module systems
created by the other, because, obviously, they have conflicting goals.
If we are to design a module system that will work, I submit that we
cannot strictly follow either philosophy on the grounds that it is the
right one. That is, neither view is right, and we should figure out ways
to handle both.
Some have suggested (and I have generally agreed) that perhaps having two
different systems in Scheme will make it possible to have a packaging
system together with syntactic modules, both disjoint from each other and
neither requiring the other.
I have thought this was a good idea. After all, it would satisfy the needs
of almost everyone. However, doing this in the standard will create a much
greater number of constructs than necessary, and chances are, both
standards will not end up in the core Scheme. WG 1 requires a module
system, but if there are to be two of them, I doubt that the community
would support placing both in the Core Scheme document.
I began thinking, then, by going back and trying to approach the problem
from what I call my "Schemely Philosophy." Generally speaking, this is a
philosophy that tries to take away features and discover the most
expressive, practically useful construct that makes implementing the other
features in a standard unnecessary. Scheme has traditionally succeeded in
having a great many general features that allow you to express a great
deal of other things without having to require them in the standard.
With this in mind, I propose a different primary goal for a standard
module system: generality. It should be possible to create, from this
module system, an interface on top of it that will satisfy the needs of
either crowd. In other words, the module system itself need not satisfy
the needs of either crowd, but it should be possible to build systems on
top of it that will satisfy one crowd or the other. This is
generality.
Moreover, I believe that this initial module system should be simple. It
should not require a great deal to understand the core constructs.
It should also be backwards compatible, as far as can be done, with R6RS
libraries. The reason I put forth this requirement is because it doesn't
make sense to ignore the one standard library system that we have, unless
it really makes sense to do so.
How would such a system look?
The first conclusion to be made about this system is that it would have to
be syntactic. It is possible that a syntactic module system can be used as
the base to a standard, discoverable library description language, but it
is not possible for a static description of module systems to somehow
enable syntactic modules. In order for the two to cooperate, the core
forms must be syntactic.
In other words, examining the requirements of the two systems, it appears
clear to me that the syntactic approach is more general.
Nonetheless, there are faults with Chez Scheme's module system which make
it inadequate as a standard.
With the pre-release of Chez Scheme v8.0, the import form in Chez Scheme
can take either module names or R6RS library descriptions. This makes it
possible to search for the library, but the Chez Scheme module system's
naming convention doesn't make it easy to map names to files in a useful
way. Additionally, Chez Scheme's module system
uses a positional export form, so extending the naming convention will
result in ambiguous module forms. Clearly, while a syntactic module system
would be good, the existing example I cite here will not work.
Let's examine the R6RS library form however. Syntactically, it uses an
export keyword to identify the exports, and this allows the naming
convention to be the way it is. Nothing about it is inherently static, and
it would be easy to extend the naming of the libraries to handle single
identifiers.
So, the module system I propose consists of two forms:
<library> := (library [<name>] <exports> . <body>)
<name> := #identifier | <r6rs library name>
<exports> := (export <export-spec> ...)
<export-spec> := #identifier | (#identifier #identifier ...)
<body> := (#expr|#def #expr|#def ...)
In the above form, I have tried to avoid introducing anything new that
does not already exist in current module systems. This is almost like the
existing R6RS library form, and is backwards compatible with it. I provide
for simpler names, and I allow for the Chez Scheme method of specifying
syntactic dependencies in exports (which is something that should have
been there from the beginning in my opinion). This makes it possible to be
more efficient in handling libraries. I have allowed the intermingling of
expressions and definitions above, but I am not tied to this, and would be
willing to accept only definitions followed by expressions, since this is,
in fact, how Chez Scheme's module system does it, and is the current modus
operandi in R6RS. I have also made the name of the library optional. This
is to allow for anonymous modules, which is essential if they are to be
used effectively in macros and to be sufficiently general for implementing
packaging description languages.
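As a rough sketch of how the grammar above reads in practice, here are
four library forms. Nothing here is runnable today, since no system
implements the proposed syntax as written. All names are invented for
illustration, and two points are my own assumptions rather than part of
the grammar: that an export spec of the form (macro-name helper ...)
lists the identifiers the macro's expansion depends on, and that the
exports of an anonymous library become visible in the enclosing scope, as
with Chez Scheme's anonymous modules.

```scheme
;; 1. An ordinary R6RS library. Under the proposed grammar the
;;    (import ...) clause is simply the first form of the body.
(library (shapes square)
  (export square-area)
  (import (rnrs))
  (define (square-area side) (* side side)))

;; 2. The same idea with a simple identifier as the name.
(library circles
  (export circle-area)
  (import (rnrs))
  (define (circle-area r) (* 3.14159 r r)))

;; 3. An export spec naming a macro together with the identifiers its
;;    expansion refers to (assumed reading of the second export form).
(library doubling
  (export (with-doubled double))
  (import (rnrs))
  (define (double x) (* 2 x))
  (define-syntax with-doubled
    (syntax-rules ()
      [(_ x body) (let ([x (double x)]) body)])))

;; 4. An anonymous library: no name, so nothing to map to a file.
;;    Assumed here to make `helper` visible where the form appears.
(library
  (export helper)
  (import (rnrs))
  (define (hidden-step x) (+ x 1))
  (define (helper x) (hidden-step (hidden-step x))))
```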
The major difference here is not in the form, which is basically the same,
but in the fact that this form should be syntactic, in that you can
generate it from macros. It should be possible to nest these library
forms. I make no arguments about how they should map to files, since this
should be up to the implementation.
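To illustrate what generating the form from macros might look like, here
is a minimal sketch. It is written in the proposed syntax, so it is not
runnable in any current system, and define-constant-library, limits, and
max-users are hypothetical names chosen for the example.

```scheme
;; Hypothetical macro that expands into a proposed-style library form.
(define-syntax define-constant-library
  (syntax-rules ()
    [(_ lib-name const-name value)
     (library lib-name
       (export const-name)
       (import (rnrs))
       (define const-name value))]))

;; Expands into (library limits (export max-users) ...).
(define-constant-library limits max-users 100)

;; The generated library is then imported like any other.
(import limits)
```

With a static, top-level-only library form, this kind of expansion is not
available, which is the distinction the paragraph above draws.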
I have removed, in the above, the import form. This is because this import
form should be usable anywhere, and is, in this proposed system, its own
form, and not a component of the library syntax. I would like to define
two import forms which I believe are generally useful enough to be
included.
<import> := (import|import-only <import-spec> <import-spec> ...)
<import-spec> :=
<R6RS library reference> |
#identifier |
(only <import-spec> #identifier ...) |
(except <import-spec> #identifier ...) |
(prefix <import-spec> #identifier ...) |
(drop-prefix <import-spec> #identifier) |
(rename <import-spec> (#identifier #identifier) ...) |
(alias <import-spec> (#identifier #identifier) ...)
The above is a combination of R6RS and Chez Scheme module import forms.
Multiple specs may be listed in a single import form, but drop-prefix and
alias have been added. The use of import-only means that only those
identifiers imported from the import specs listed will be visible in the
scope that the import-only form affects. This is useful when you want to
generate these module forms.
These forms, import, library, and import-only can appear in any definition
context.
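A short sketch of the import forms in use follows. Again, this is the
proposed syntax and not something a current system accepts as written;
the geometry library and its geo: names are invented. I am assuming that
drop-prefix and alias behave as they do in Chez Scheme: drop-prefix
strips the given prefix from the imported names, and alias takes
(original new) pairs and keeps both names visible.

```scheme
;; Hypothetical library with prefixed exports.
(library geometry
  (export geo:area geo:perimeter)
  (import (rnrs))
  (define (geo:area w h) (* w h))
  (define (geo:perimeter w h) (* 2 (+ w h))))

;; import in a definition context: the bindings are local to this body.
(define (describe w h)
  (import (drop-prefix geometry geo:))
  (list (area w h) (perimeter w h)))

;; alias: both geo:area and rect-area name the same binding here.
(define (area-twice w h)
  (import (alias geometry (geo:area rect-area)))
  (+ (geo:area w h) (rect-area w h)))
```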
I am also proposing that include, and possibly include/ci, be a part of the
standard. This will ease the separation of modules from source code.
<include> := (include|include/ci <file-name-string>)
It should have the effect of expanding into the forms from the specified
file. The /ci variant should be a case-insensitive version.
It is possible to do this with macros, so these forms are not strictly
necessary, but they are of general interest, and make it much easier to
write a sophisticated macro system. Additionally, it is more likely that
Schemes will include useful positioning information if the include forms
are built in, rather than losing much positioning information, as happens
with the current R6RS implementations of include.
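For reference, the macro version looks roughly like the following, which
is adapted from the well-known syntax-case include example. It is named
my-include here only to avoid clashing with a built-in include, and it
assumes an implementation that provides syntax-case through (rnrs).

```scheme
(import (rnrs))

(define-syntax my-include
  (lambda (x)
    ;; Read every datum from the file and wrap each one as syntax in
    ;; the lexical context of the my-include keyword.
    (define (read-forms filename context)
      (let ([port (open-input-file filename)])
        (let loop ([datum (read port)])
          (if (eof-object? datum)
              (begin (close-input-port port) '())
              (cons (datum->syntax context datum)
                    (loop (read port)))))))
    (syntax-case x ()
      [(k filename)
       (let ([fn (syntax->datum #'filename)])
         (with-syntax ([(expr ...) (read-forms fn #'k)])
           ;; Splice the file's forms in place, as a begin would.
           #'(begin expr ...)))])))
```

Because the forms come from read and are converted with datum->syntax,
they carry no record of where in the file they came from, which is the
loss of positioning information mentioned above.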
But wait! Foul! Foul!
"This is just syntactic macros, you're just selling out the static package
folks," I hear you say! No, actually, I am not. I am suggesting a standard
module system that is general enough to be used by both crowds. The
syntactic module crowd won't have to develop any new syntax to use this
system, and the static crowd will have to do some extra macrology; this is
true. Nonetheless, at least it is possible for both crowds to use the same
underlying system! Otherwise, this is not possible.
An astute reader will also notice that I am proposing this be the module
system for both WG 1 and WG 2. In fact, I believe that this system is
simple enough for both crowds to use, is backwards compatible, and
general. However, said reader will observe that if no procedural macro
system is provided by the WG1, it will not be possible to create macros of
sufficient expressiveness to create the static package description
language. Yes, this is true.
The question then becomes, is the module system in WG1 supposed to be such
that it satisfies everyone? Should it be a compromise to cater only to a
select few, or really, cater to no one in particular, making no one happy
with it? In the end, I contend that WG1 should have a module system that
is simple and general, and not require additional work by WG2 to use it
for all module systems. The above system would satisfy these conditions. I
don't think it is necessary that a package description language be
available in WG1, just that the system specified there facilitates the
creation of one at the WG2 level.
The Evaluation of this System: Benefits and Drawbacks
Obviously, the drawbacks of this system are that by default, it ties the
library system to code evaluation, which many people consider a bad thing.
The R6RS library form does this as well. The syntactic base will also be
seen by some as making it difficult for people who care about the
"introspectability" or "discoverability" of modules.
Yes, the module system above will require some extra work to make a
suitably sophisticated system on top of it that will satisfy the needs of
the static description language crowd. It is, however, possible to do this,
even to the point that modules defined in this way may be introspected
procedurally, to discover their imports and exports, &c. Moreover, once
this is done in portable Scheme, the resulting system will be portable to all
compliant systems, making it more portable than existing solutions. This
makes it possible to have the best of both worlds while maintaining a
simple standard. WG2 may even want to develop an implementation of the
static language on top of this proposed system.
The system itself has more potential benefits, however, that I believe
outweigh the minor inconvenience presented by the above argument. Firstly,
it is general enough to handle the entire spectrum of module systems.
Secondly, it introduces no new concepts. All the above features exist
already in one module system or another, and the majority of the syntax
comes directly from the existing library standard.
The above system is also fully backwards compatible with the R6RS library
standard, making the transition to the new module system that much easier.
Thus, this system holds to tradition, promotes maximum backwards
compatibility, and is at the same time a very simple system.
Conclusion
The module system proposed above is simple, general, and backwards
compatible, with no new features introduced, and provides all of the
core features necessary to make an effective module system of
most any desirable shape. The system is simple enough to be incorporated
into WG1, and is expansive enough to require no changes for WG2, since it
is a full module system. It is, in my opinion, the right approach to
making a module system that is maximally applicable, while retaining the
simple qualities of a good Scheme solution.
This proposal is obviously only a draft, and I would readily accept
feedback on this issue.
--
Of all tyrannies, a tyranny sincerely exercised for the good of its
victims may be the most oppressive. -- C. S. Lewis