On Tue, 2009-08-25 at 00:23 -0400, John Cowan wrote:
> Thomas Lord scripsit:
> 
> > What is "string-set!" supposed to mean when applied to a
> > character-which-is-a-string-of-length-1?
> 
> I don't believe in mutable strings, any more than I believe in mutable
> bignums.  If you want vectors, YKWTFT.


I think you are confusing a few things.  

I think you're confusing a style of programming with
what the core language should provide.   I think you're
confused about how bignums, vectors, and strings are 
different.  I think you're confused about why to keep
characters and strings disjoint in small Scheme.

First, the differences between vectors, bignums, and
strings:

Vectors are heterogeneous arrays of objects.  In general,
that obligates implementations to carry type information
about each element of the array.   Therefore, vectors are
not suitable to represent strings in most environments.

Bignums might suggest some underlying array-like representation
and you might wonder: why not allow mutation of that?  The
answer is that the underlying representation need not be
array-like and, even if it is, there is no agreed-upon way
to encode a bignum.   Is it an array of bits?  Is it an 
array of integers that describes a continued fraction?
Bignums aren't mutable for the same reason there isn't a "-ref"
procedure for them.   On the other hand, if you were going
to implement bignums in Scheme, it isn't guaranteed, but it
wouldn't be surprising, that underneath them you'd want some
mutable representation.

Strings are like vectors and unlike bignums in that
"-ref" and "-set!" have a natural meaning for them
(although I acknowledge that you and I don't agree
on exactly what a character ought to be).   Strings 
are unlike vectors in that they are homogeneous arrays.
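
To make the contrast concrete, here is a small sketch using only
standard procedures.  It assumes an R6RS system, where string-set!
lives in (rnrs mutable-strings); the names v and s are purely
illustrative.

```scheme
(import (rnrs) (rnrs mutable-strings))

;; A vector may hold objects of any type, so every slot has
;; to carry (or point at) its own type information.
(define v (vector 1 #\a "two" 'three))
(vector-set! v 0 #\z)            ; any object may replace any other

;; A string holds characters and nothing else, so "-ref"
;; and "-set!" have an obvious meaning.
(define s (make-string 3 #\a))   ; freshly allocated, hence mutable
(string-set! s 1 #\b)
(string-ref s 1)                 ; a character
;; (string-set! s 1 42)          ; an error: 42 is not a character
```

Because every element of s is known to be a character, an
implementation is free to store them packed, without per-element
tags, which is not possible in general for v.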

Second: strings without mutation:

Using strings without mutation, whether or not the
system has such a thing as "immutable strings", is
a fine programming style for many situations but
certainly not all.   Perhaps the easiest-to-see examples
come from systems programming where, for example, a 
string is a natural representation for a buffer of
characters that is displayed directly on a terminal;
or, another example, a device driver wants to accumulate
characters in a string-like buffer before passing them
to a client program.   In situations like those examples
you very much want a stable, string-like object (representing
a fixed region of memory) with mutations on the content
of the string.
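
A minimal sketch of the second example, assuming an R6RS system.
The names buf, buf-fill, buf-put!, and buf-flush! are hypothetical,
and the 8-character size is arbitrary; a real driver would of course
look different.

```scheme
(import (rnrs) (rnrs mutable-strings))

;; A fixed-size character buffer: one string allocated once,
;; plus a count of how many positions are in use.
(define buf (make-string 8 #\space))
(define buf-fill 0)

;; Store one character in place.  Returns #f when the buffer
;; is full, #t otherwise.
(define (buf-put! c)
  (and (< buf-fill (string-length buf))
       (begin
         (string-set! buf buf-fill c)
         (set! buf-fill (+ buf-fill 1))
         #t)))

;; Hand the accumulated characters to a client as a fresh
;; string and start over, reusing the same storage.
(define (buf-flush!)
  (let ((out (substring buf 0 buf-fill)))
    (set! buf-fill 0)
    out))

(for-each buf-put! (string->list "hello"))
(buf-flush!)
```

The point is that buf itself never changes identity or size; only
its contents do.  Without string-set! the same program would have
to allocate a new string for every character received.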

If I were to build what we could call a "large implementation
of small Scheme" the type system would probably include, at
its lowest levels, fixnums, floats, 8-, 16-, and 32-bit code
points, cons-pairs, vectors, a facility for deriving a 
new disjoint type from any earlier constructed type,
a facility for creating mutable, homogeneous arrays of any
earlier constructed type, a facility for 
constructing a disjoint immutable type from any mutable type,
procedures, environments, locatives, symbols, low-level
macros, disjoint booleans, and nil.  (I hope I didn't
forget any.)

I don't suppose that small Scheme should require all of
those types; I'm just saying that a large implementation
of what I think of as small Scheme should permit them.

On the back of those things, plus lambda and two-argument
eval, I would implement generics and provide the primitives
of core small Scheme as generics.   The result would not be
the same as Common Lisp but would have some similarities to
it, with low-level concrete types, user-constructed
disjoint types of arbitrary representation, and some
abstract types (such as the primitive types of small Scheme).

In such a system I'd have a Scheme-class language suitable
for everything from low-level systems programming to very
high-level domain-specific languages.  The standard small
Scheme environment is then a kind of lingua franca.  For
example, if someone offers me a nice sort algorithm written
in small Scheme I can probably put it to good use.  Or, 
if I use my dialect as an extension language for a text editor,
anyone who knows Scheme can find a comfortable dialect in
which to write many conceivable extensions.

And *you* could, if so inclined, make a Scheme-like
environment with only immutable strings.  And if someone
wants to, they could write compiler optimizations that are
specific to that immutable-strings world.

The job of a small Scheme spec, in this way of looking
at things, is to specify useful invariants for certain 
abstract types and macros and lambda but without precluding
something like "a large implementation of small Scheme" as
described here.

I don't think there is just one job for "big Scheme"
dialects.   There is one big important job for *a* big
Scheme dialect: to provide a large "growable" dialect
that a lot of implementers happen to agree upon.  But
in general, I think people should experiment in defining
big Scheme dialects with the notion that in a large
small Scheme like I described, multiple big dialects 
can co-exist and usefully inter-operate with, at the
very least, small Scheme as a lingua franca between 
environments.

Finally: should characters and strings be disjoint,
or should characters be regarded as strings of length 1?

The argument for mutable strings is one way to answer
that question but not the only way.

Another argument is that:

As you've noted, implementations "want" to have a
special (usually immediate) representation
for characters.

Additionally, some programs "want" to make a distinction
between element types and sequence types.

Finally, it is easier to unify than pull apart:

Given disjoint characters, please by all means 
produce a library of "str-" functions that handles
only immutable strings and identifies characters 
with length-1 (in codepoints) strings.   Going the 
other way is not so easy, at least not cleanly.  To
go the other way we would have to wrap characters, strings,
or both in some disjoint type, leading to a needlessly
inefficient representation.
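
A rough sketch of the easy direction, assuming an R6RS system with
disjoint characters and strings.  Every "str-" name below, along
with the helper as-str, is hypothetical; this is one possible shape
for such a library, not a proposal for its exact interface.

```scheme
(import (rnrs))

;; Hypothetical "str-" layer: no mutators are provided, and a
;; character is treated as a string of length 1.
(define (as-str x)
  (cond ((string? x) x)
        ((char? x) (string x))
        (else (assertion-violation 'as-str
                                   "not a character or string" x))))

(define (str-length x) (string-length (as-str x)))

;; Indexing yields a length-1 str, never a bare character.
(define (str-ref x k)
  (let ((s (as-str x)))
    (substring s k (+ k 1))))

(define (str-append . xs)
  (apply string-append (map as-str xs)))

(define (str=? a b) (string=? (as-str a) (as-str b)))

(str=? #\a "a")                        ; true: the two are identified
(str=? (str-ref "abc" 1) #\b)          ; true
(str-append "ab" #\c (str-ref "xd" 1)) ; a str containing a, b, c, d
```

A dozen lines suffice because the core types underneath are
disjoint and cheap to tell apart with char? and string?.  Starting
instead from a core with no separate character type, there is no
predicate to build on, which is why recovering the distinction
takes a wrapper type.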

Disjointness is a good choice for the core standard.

-t



_______________________________________________
r6rs-discuss mailing list
[email protected]
http://lists.r6rs.org/cgi-bin/mailman/listinfo/r6rs-discuss
