[r6rs-discuss] Counter-proposal (Re: Proposed features for small Scheme, part 9: modules)

Aaron W. Hsu Tue, 13 Oct 2009 22:00:59 -0700

John,

I would like to make a counter-proposal to your module proposal. The  
proposal is at

<http://my.opera.com/arcfide/blog/2009/10/14/a-philosophy-on-scheme-modules>

but I will include a version at the bottom as well for the archives. My
comments on your system, in comparison with my proposal, follow.

On Mon, 12 Oct 2009 00:23:33 -0400, John Cowan <[email protected]> wrote:

> The syntax I am proposing is mostly a subset, with two extensions, of
> the R6RS library syntax.  However, I'm using the keyword "module" rather
> than "library".  This is a temporary feature pending the acceptance of
> the extensions in large Scheme's module system; if they are accepted,
> the two systems will be directly upward compatible and can share the
> "library" keyword.

We might as well use the word "library", because it is the word that is
already standardized. I also don't agree with choosing a subset of R6RS
libraries. I believe that we should try to maintain backwards
compatibility. My proposal is completely backwards compatible with R6RS.
The reason I say this is that I don't think the core module system should
be changed by WG2. If it is, then this hampers portability between the two
systems. I would prefer to assure implementations that at least some
level of communication is possible between the working groups here. I
don't think choosing only a subset is going to be good enough in this case.
The R6RS library form is simple enough that I think WG1 can deal with it.

> A small Scheme module takes the general form:
>
>       (module module-name
>         [(export name ...)]
>         [(import importspec ...)]
>         [(feature-groups name ...)]
>         form ...)
>
> This defines a module with the name module-name, which is either a simple
> identifier or a list of simple identifiers.  The utility of hierarchical
> namespaces has long been established, but I propose that small Scheme
> allow simple identifiers for simple cases.

Yes, I agree with this, and so does my proposal. However,
I also think that we should allow for anonymous modules without names,
which is important when we want to extend the module features and create
new module languages on top of the existing forms, or when they are used
for other purposes where a name should not be used.

> The export list specifies the names exported by this module.  Renaming
> identifiers on export is not part of this proposal, though some form of
> "deprefixing" would be a Good Thing.  I'm not sure just how to specify
> it, though.

My proposal deprefixes on import. Additionally, my proposal requires that
the module forms be syntactic and generatable by macros, making it possible
to enhance the module system with new export forms that will do
interesting things like renaming. It also has one additional feature in the
export list taken from almost all other module systems: the identification
of syntax as syntax, making it possible to specify the exact dependencies
of a macro, greatly improving a library's compilability.

> The import list specifies the modules imported by this module.
> Import specs can be either module names, or else take the form (prefix
> module-name name).  In the latter case, the exported identifiers of
> the specified module are prefixed with prefix (an identifier) before
> importing.

I see no reason not to go with the standard R6RS import forms plus a  
drop-prefix and alias form. I also think that the import should be its own  
form, so that it can appear in any definition context, just like modules  
should be allowed to do so.

> The feature-groups list specifies the feature groups which this
> implementation must provide in order for the module to work.  This is
> an extension to R6RS, which does not have feature groups.  If the
> implementation cannot load the feature groups dynamically, it must
> fail to define the module.  (Why aren't feature groups just modules?
> Because they don't define separate namespaces, and because they mostly
> reflect the limitations of particular small Scheme systems rather than
> components that can be loaded into them.)

I am going to argue that this is a new feature we should not implement in
the standard. We can try it as an experiment somewhere else, but I don't  
think such an idea is ready for the standardization process yet, and  
because no Scheme system does this currently, I do not have this in my  
proposal. Instead, I propose that feature parameters can be used, or some  
other set of forms not tied to the module system, until we have a tested  
implementation.

> The forms constituting the body of the module are either Scheme
> definitions, or Scheme expressions, or (include "filename") forms in any
> order.  The semantics of the ordering are the same as that of a REPL;
> in particular, any forward reference to an unknown name is presumed
> to be to an unknown variable rather than to an unknown syntax; macros
> cannot be used before they are defined.  Include forms cause the direct
> incorporation of the contents of the form, and differ from load in two
> ways: they refer only to source code files, and they happen at macro
> expansion time rather than run time.  These rules deviate from those
> of R6RS.

I don't think we should deviate here. There's no real need to do so
either. Most existing systems allow forward macro references, among other
things, and a module form will be evaluated as a whole on the REPL, so
there is no need to worry that we won't know for sure one way or the
other. This is a restriction that I don't like. I do suggest that include
should be a form, but it should be usable anywhere. I also propose the
addition of INCLUDE/CI for case-insensitive includes. This may not be
necessary, but in practice it makes a difference with code that doesn't use
the #!fold-case specifier (legacy code that you don't want to change).

> Importing a module causes any code in, or included in, the module to
> be loaded if the implementation can figure out where to load it from.
> In some cases this may not be possible, and the code will have to be
> loaded separately.

Be careful about the differences in visiting, evaluating, loading, &c.
What if I LOAD the library beforehand and it has some side-effects? When
should those side-effects take place? I don't know that we should specify
this any more than the R6RS did, to accommodate people who think that the
library should be instantiated the moment it is loaded, as well as those
who think it should be at import time or even at usage time, some of
whom may instantiate it multiple times.

> In the top level (that is, outside any module), import and include
> are both available and do what you'd expect at macro expansion time.
> Unlike R6RS top-level programs, the small Scheme top level is not
> required to have all imports at the beginning.

There should be no need to specify a different treatment of IMPORT or  
INCLUDE at the top-level. It can work just the same as it works inside, if  
import and include are just their own forms. Include can work much like a  
BEGIN. I suggest this because I like the idea of BEGIN making it possible  
to have forward references and the like.

> Module names, like feature groups, are recognized by cond-expand (from
> SRFI 0), which makes it possible to conditionally control the actions
> of a program depending on what modules have been imported into it.

I don't want cond-expand brought into this at all.

> Other than as specified here, the provisions of R6RS section 7.1 apply.
> Since there are no macros except syntax-rules macros, implicit phasing
> is sufficient.  What, if anything, to do about version numbers on modules
> remains an open issue:  I don't believe they are necessary for small
> Scheme.

I agree that they probably aren't necessary, but I believe that keeping
things both backwards and upwards compatible, so that WG2 doesn't end up
with a semantically different module system than WG1, is good. Let's just
go with one single module system that is sufficiently simple, but general
enough. See my counter-proposal for my opinions on this.

        Aaron W. Hsu

Counter-proposal follows:

A Philosophy on Scheme Modules

Tuesday, 13. October 2009, 23:26:36

Tags/Keywords: scheme, wg2, modules, wg1, tech, libraries, r6rs,  
languages, r6rs-discuss, programming

Almost every Scheme system worth its salt has some means of encapsulating  
a set of definitions and expressions and providing controlled visibility  
to those bindings created in some fashion. Recent discussions on  
r6rs-discuss have brought the classic debate about modules and  
library/packaging systems to the forefront of the Scheme community yet  
again.

In this article, I hope to address my own concerns regarding a module  
system, make some observations about solutions, and propose a direction  
for the development of a standardized module system. The end result is, I  
hope, a module system that feels "Schemely;" that is, a general,  
expressive construct that provides the means of satisfying the needs of  
the entire Scheme community with regards to module systems. This goal may  
never be fully achieved, but the ideas I present here, I hope, will make  
it possible to move towards that goal.

A Brief Historical Examination

Before I tackle this issue in full, I would like to examine some of the
past history of Scheme module systems.

Before module systems were widespread, forms like EVAL-WHEN and LOAD were  
used to control the order of loading of files containing Scheme code. Note  
that there was no control of visibility here, just controlling the order  
of evaluation of Scheme code.

Later, a number of systems for managing Scheme code were developed. I will
mention the three systems most relevant to my discussion here: the
Scheme48 module system, the Chez Scheme module system, and the R6RS
Library form. These three systems represent two opposite approaches to
Scheme modules as well as the current module standard, which is, of
course, a compromise among the systems.

The Scheme48 package system (which I can only discuss at a high level,  
since I have not actively used this packaging system for some time)  
represents the philosophies of separate implementation and interface,  
separate package declaration from code locations, and the idea that module  
systems exist more as a metalanguage than as a syntax of the Scheme  
language. In a sense, these modules represent a static description of the  
relations between groups of definitions and expressions.

The Chez Scheme module system represents the other end of the spectrum,  
where a module form is viewed as just another syntactic extension of the  
language for controlling scope and visibility. There is a basic form for  
encapsulating code and making only select identifiers visible to the  
outside world. Additionally, there is an import form which may appear in  
any place that a definition may.
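
A rough sketch of what this looks like in Chez Scheme may help. The module
name `stack` and the identifiers inside it are made up for illustration:

```scheme
;; Chez Scheme-style module: only push! and pop! are visible outside.
(module stack (push! pop!)
  (define items '())                       ; private to the module
  (define (push! x) (set! items (cons x items)))
  (define (pop!)
    (let ([x (car items)])
      (set! items (cdr items))
      x)))

;; import may appear wherever a definition may, here at the head of a body.
(define (demo)
  (import stack)
  (push! 1)
  (push! 2)
  (pop!))
```

Outside of `demo`, neither `push!` nor `pop!` is bound unless `stack` is
imported there as well, and `items` is never visible outside the module.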

The R6RS Library form represents a compromise of goals between the two  
systems. It cannot be generated by macros, and the import form is a direct  
part of the R6RS library form. However, there is no separation of  
interface and the module form itself is tied to the source code unless  
additional macros are used to separate the source code from the library  
form.

Modules versus Packages

The term "module system" can confuse the discussion somewhat, because  
people have different ideas of what should constitute a sufficient module  
system. Generally, there are those who look at module systems as package  
management systems, which are used as a "distribution format" for source
code. This is much like the Scheme48 approach. Others view modules as a  
building block inside of Scheme, like the Chez Scheme module system.

Originally, the R6RS library form was touted as being a distribution  
system by some. Unfortunately, it lacks some of the important features  
that many consider necessary for that purpose. At the same time, it fails  
to satisfy the needs of the syntactic module crowd, since the library  
system is entirely top-level and static.

Desirables in a Module System

What important features should be in a module system? Again, the two camps  
of module philosophies will have two different answers, some of which will  
not agree.

The "packaging system" crowd's primary use of modules is to describe  
interfaces to code, to make discoverable descriptions of that code, and to  
make it easier to control the loading and evaluating of the various  
components. Generally speaking, these systems benefit from the following  
features:

* Static, top-level metalanguages for package descriptions.
* Separation of source code, implementation modules, and interfaces.
* Some way to map libraries to files to make loading of software more
  automatic.

The syntactic module crowd favors the ability to use modules in a variety  
of locations to do micro-packaging. This means they may be generated by  
macros, and may not even have names that map directly to files at all. The  
forms may also be more closely tied to the rest of the Scheme code,  
because the code itself is generating modules. The features that tend to  
be important here are:

* Syntactic, dynamic module forms that can occur inside of code, and not
  just at the top-level.
* The ability to create anonymous modules.

Generally speaking, the two crowds don't approve of the module systems  
created by the other, because, obviously, they have conflicting goals.

If we are to design a module system that will work, I submit that we
cannot strictly follow either philosophy as though it were the right one.
That is, neither view is right, and we should figure out ways to handle
both.

Some have suggested (and I have generally agreed) that perhaps having two  
different systems in Scheme will make it possible to have a packaging  
system together with syntactic modules, both disjoint from each other and  
neither requiring the other.

I have thought this was a good idea. After all, it would satisfy the needs  
of almost everyone. However, doing this in the standard will create a much  
greater number of constructs than necessary, and chances are, both  
standards will not end up in the core Scheme. WG 1 requires a module  
system, but if there are to be two of them, I doubt that the community  
would support placing both in the Core Scheme document.

I began thinking, then, by going back and trying to approach the problem
from what I call my "Schemely Philosophy." Generally speaking, this is a
philosophy that tries to take away features and discover the most  
expressive, practically useful construct that makes implementing the other  
features in a standard unnecessary. Scheme has traditionally succeeded in  
having a great many general features that allow you to express a great  
deal of other things without having to require them in the standard.

With this in mind, I propose a different primary goal for a standard  
module system: generality. It should be possible to create, from this  
module system, an interface on top of it that will satisfy the needs of  
either crowd. In other words, the module system itself need not satisfy the
needs of either crowd, but it should be possible to build systems on top
of the module system that will satisfy one crowd or the other. This is
generality.

Moreover, I believe that this initial module system should be simple. It  
should not require a great deal to understand the core constructs.

It should also be backwards compatible, as far as possible, with R6RS
libraries. I put forth this requirement because it doesn't make sense to
ignore the one standard library system that we have, unless it really
makes sense to do so.

How would such a system look?

The first conclusion to be made about this system is that it would have to  
be syntactic. It is possible that a syntactic module system can be used as  
the base to a standard, discoverable library description language, but it  
is not possible for a static description of module systems to somehow  
enable syntactic modules. In order for the two to cooperate, the core  
forms must be syntactic.

In other words, examining the requirements of the two systems, it appears  
clear to me that the syntactic approach is more general.

Nonetheless, there are faults with Chez Scheme's module system that make
it inadequate as a standard.

With the pre-release of Chez Scheme v8.0, the import form in Chez Scheme
can take either module names or R6RS library descriptions. This makes it
possible to search for the library, but the Chez Scheme module system's
naming convention doesn't make it easy to map names to files in a useful
way. Additionally, Chez Scheme's module system uses a positional export
form, so extending the naming convention will result in ambiguous module
forms. Clearly, while a syntactic module system would be good, the
existing example I cite here will not work.
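
The ambiguity can be sketched as follows. The first two forms are ordinary
Chez Scheme; the third is a hypothetical extension with list-shaped names,
and all module and identifier names are invented for illustration:

```scheme
;; The first subform is the module name when it is an identifier...
(module shapes (area)
  (define (area w h) (* w h)))

;; ...and the export list when it is a list (an anonymous module).
(module (perimeter)
  (define (perimeter w h) (* 2 (+ w h))))

;; If names could also be lists, as R6RS library names are, the reader of
;; the form below could not tell whether (geometry shapes) is a hierarchical
;; name or an export list:
;;
;;   (module (geometry shapes) (area) ...)
```

A keyword such as `export` in front of the export list removes the need to
decide by position.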

Let's examine the R6RS library form, however. Syntactically, it uses an
export keyword to identify the exports, and this allows the naming  
convention to be the way it is. Nothing about it is inherently static, and  
it would be easy to extend the naming of the libraries to handle single  
identifiers.

So, the module system I propose consists of two forms:

<library> := (library [<name>] <exports> . <body>)
<name> := #identifier | <r6rs library name>
<exports> := (export <export-spec> ...)
<export-spec> := #identifier | (#identifier #identifier ...)
<body> := (#expr|#def #expr|#def ...)

In the above form, I have tried to avoid introducing anything new that  
does not already exist in current module systems. This is almost like the  
existing R6RS library form, and is backwards compatible with it. I provide  
for simpler names, and I allow for the Chez Scheme method of specifying  
syntactic dependencies in exports (which is something that should have  
been there from the beginning in my opinion). This makes it possible to be  
more efficient in handling libraries. I have allowed the intermingling of  
expressions and definitions above, but I am not tied to this, and would be  
willing to accept only definitions followed by expressions, since this is  
in fact, how Chez Scheme's module system does it, and is the current modus  
operandi on R6RS. I have also made the name of the library optional. This  
is to allow for anonymous modules, which is essential if they are to be  
used effectively in macros and to be sufficiently general for implementing  
packaging description languages.
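
As a minimal sketch of how the grammar above might be used (this is proposed
syntax, not something an existing system is known to accept as written; the
names `counter`, `make-counter`, `with-counter`, and `helper` are
hypothetical):

```scheme
;; A library named by a simple identifier. Placing an import right after
;; the export list keeps the shape of an ordinary R6RS library.
(library counter
  (export make-counter
          (with-counter make-counter))  ; syntax export, listing the
                                        ; identifier its expansion refers to
  (import (rnrs base))
  (define (make-counter)
    (let ((n 0))
      (lambda () (set! n (+ n 1)) n)))
  (define-syntax with-counter
    (syntax-rules ()
      ((_ name body ...)
       (let ((name (make-counter))) body ...)))))

;; An anonymous library: the name is simply omitted.
(library
  (export helper)
  (import (rnrs base))
  (define (helper x) (* x 2)))
```

Reading `(with-counter make-counter)` as "the keyword `with-counter`, whose
expansion depends on `make-counter`" follows the Chez Scheme convention
mentioned above. How the exports of an anonymous library become visible is
not spelled out here; in Chez Scheme, an anonymous module's exports are
bound directly in the enclosing scope.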

The major difference here is not in the form, which is basically the same,  
but in the fact that this form should be syntactic, in that you can  
generate it from macros. It should be possible to nest these library  
forms. I make no arguments about how they should map to files, since this  
should be up to the implementation.
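
Two small sketches of what "syntactic" buys, again in the proposed syntax
and with invented names (`define-constant-library`, `limits`, `outer`,
`inner`):

```scheme
;; A syntax-rules macro that expands into a library form.
(define-syntax define-constant-library
  (syntax-rules ()
    ((_ lib-name const value)
     (library lib-name
       (export const)
       (import (rnrs base))
       (define const value)))))

(define-constant-library limits max-size 1024)
(import limits)                 ; max-size is now visible here

;; A library nested inside another; inner is visible only within outer.
(library outer
  (export run)
  (import (rnrs base))
  (library inner
    (export twice)
    (import (rnrs base))
    (define (twice x) (* 2 x)))
  (import inner)
  (define (run) (twice 21)))
```

Neither is expressible with the R6RS library form as standardized, since
that form may only appear at the top level of a file and cannot be produced
by macro expansion.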

I have removed, in the above, the import form. This is because this import  
form should be usable anywhere, and is, in this proposed system, its own  
form, and not a component of the library syntax. I would like to define  
two import forms which I believe are generally useful enough to be  
included.

<import> := (import|import-only <import-spec> <import-spec> ...)
<import-spec> :=
<R6RS library reference> |
#identifier |
(only <import-spec> #identifier ...) |
(except <import-spec> #identifier ...) |
(prefix <import-spec> #identifier ...) |
(drop-prefix <import-spec> #identifier) |
(rename <import-spec> (#identifier #identifier) ...) |
(alias <import-spec> (#identifier #identifier) ...)

The above is a combination of R6RS and Chez Scheme module import forms.  
Multiple specs may be listed in a single import form, but drop-prefix and  
alias have been added. The use of import-only means that only those  
identifiers imported from the import specs listed will be visible in the  
scope that the import-only form affects. This is useful when you want to  
generate these module forms.

These forms, import, library, and import-only can appear in any definition  
context.
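
A sketch of the import specs in use, assuming the grammar above. The
libraries `(my strings)` and `counter` and their exports are hypothetical,
and the semantics of drop-prefix and alias are assumed to follow Chez
Scheme's:

```scheme
(import (rnrs base))                           ; R6RS library reference
(import (only (rnrs lists) filter fold-left))  ; just these two names
(import (prefix (rnrs lists) list:))           ; filter becomes list:filter

;; Suppose (my strings) exports str:trim and str:split.
(import (drop-prefix (my strings) str:))       ; visible as trim and split

;; alias keeps the original name and adds a second one for it.
(import (alias counter (make-counter new-counter)))

;; import-only: nothing but the listed identifiers is visible in the scope
;; it affects, which is handy in generated code.
(library evens
  (export sum-evens)
  (import-only (only (rnrs) define even? + filter fold-left))
  (define (sum-evens xs)
    (fold-left + 0 (filter even? xs))))
```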

I am also proposing that include, and possibly include/ci, be a part of the
standard. This will ease the separation of modules from source code.

<include> := (include|include/ci <file-name-string>)

It should have the effect of expanding into the forms from the specified  
file. The /ci variant should be a case-insensitive version.
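
For instance, a library form can live apart from the code it wraps. This is
a sketch only; the file names are hypothetical:

```scheme
;; counter.sls: the interface, kept separate from the implementation.
(library counter
  (export make-counter)
  (import (rnrs base))
  (include "counter-impl.ss"))        ; splices in the definitions

;; Legacy code written without regard to case can be pulled in unchanged.
(library legacy-math
  (export square)
  (import (rnrs base))
  (include/ci "LEGACY-MATH.SCM"))     ; read as if case-folded
```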

It is possible to do this with macros, so these forms are not strictly
necessary, but they are of general interest, and make it much easier to
write a sophisticated macro system. Additionally, it is more likely that
Schemes will preserve useful source position information if the include
forms are built in, whereas the current R6RS implementations of include
lose much of it.

But wait! Foul! Foul!

"This is just syntactic macros, you're just selling out the static package  
folks," I hear you say! No, actually, I am not. I am suggesting a standard  
module system that is general enough to be used by both crowds. The  
syntactic module crowd won't have to develop any new syntax to use this  
system, and the static crowd will have to do some extra macrology; this is  
true. Nonetheless, at least it is possible for both crowds to use the same  
underlying system! Otherwise, this is not possible.

An astute reader will also notice that I am proposing this be the module  
system for both WG 1 and WG 2. In fact, I believe that this system is  
simple enough for both crowds to use, is backwards compatible, and  
general. However, said reader will observe that if no procedural macro
system is provided by the WG1, it will not be possible to create macros of  
sufficient expressiveness to create the static package description  
language. Yes, this is true.

The question then becomes, is the module system in WG1 supposed to be such  
that it satisfies everyone? Should it be a compromise to cater only to a  
select few, or really, cater to no one in particular, making no one happy  
with it? In the end, I contend that WG1 should have a module system that
is simple and general, and that does not require additional work by WG2 to
use it for all module systems. The above system would satisfy these
conditions. I
don't think it is necessary that a package description language be  
available in WG1, just that the system specified there facilitates the  
creation of one at the WG2 level.

The Evaluation of this System: Benefits and Drawbacks

Obviously, the drawbacks of this system are that by default, it ties the  
library system to code evaluation, which many people consider a bad thing.  
The R6RS library form does this as well. The syntactic base will also be  
seen by some as making it difficult for people who care about the  
"introspectability" or "discoverability" of modules.

Yes, the module system above will require some extra work to make a
suitably sophisticated system on top of it that will satisfy the needs of
the static description language crowd. It is, however, possible to do this,
even to the point that modules defined in this way may be introspected
procedurally to discover their imports and exports, &c. Moreover, once
this has been done in portable Scheme, the resulting system will run on all
compliant implementations, making it more portable than existing solutions.
This makes it possible to have the best of both worlds while maintaining a
simple standard. WG2 may even want to develop an implementation of the
static language on top of this proposed system.

The system itself has more potential benefits, however, that I believe  
outweigh the minor inconvenience presented by the above argument. Firstly,  
it is general enough to handle the entire spectrum of module systems.  
Secondly, it introduces no new concepts. All the above features exist
already in one module system or another, and the majority of the syntax
comes directly from the existing library standard.

The above system is also fully backwards compatible with the R6RS library  
standard, making the transition to the new module system that much easier.  
Thus, this system holds to tradition, promotes maximum backwards  
compatibility, and is at the same time a very simple system.

Conclusion

The module system proposed above is simple, general, and backwards
compatible, introduces no new features, and provides all of the core
features necessary to make an effective module system of almost any
desirable shape. The system is simple enough to be incorporated
into WG1, and is expansive enough to require no changes for WG2, since it
is a full module system. It is, in my opinion, the right approach to
making a module system that is maximally applicable, while retaining the
simple qualities of a good Scheme solution.

This proposal is obviously only a draft, and I would readily accept  
feedback on this issue.

-- 
Of all tyrannies, a tyranny sincerely exercised for the good of its  
victims may be the most oppressive. -- C. S. Lewis
