On Tue, 2009-08-25 at 00:23 -0400, John Cowan wrote:
> Thomas Lord scripsit:
>
> > What is "string-set!" supposed to mean when applied to a
> > character-which-is-a-string-of-length-1?
>
> I don't believe in mutable strings, any more than I believe in mutable
> bignums.  If you want vectors, YKWTFT.

I think you are confusing a few things.  I think you're confusing
a style of programming with what the core language should provide.
I think you're confused about how bignums, vectors, and strings are
different.  I think you're confused about why to keep characters
and strings disjoint in small Scheme.

First, the differences between vectors, bignums, and strings:

Vectors are heterogeneous arrays of objects.  In general, that
obligates implementations to carry type information about each
element of the array.  Therefore, vectors are not suitable to
represent strings in most environments.

Bignums might suggest some underlying array-like representation,
and you might wonder: why not allow mutation of that?  The answer
is that the underlying representation need not be array-like and,
even if it is, there is no agreed-upon way to encode a bignum.
Is it an array of bits?  Is it an array of integers that describes
a continued fraction?  Bignums aren't mutable for the same reason
there isn't a "-ref" procedure for them.  On the other hand, if you
were going to implement bignums in Scheme, it isn't guaranteed, but
it wouldn't be surprising, that you'd want some mutable
representation underneath them.

Strings are like vectors and unlike bignums in that "-ref" and
"-set!" have a natural meaning for them (although I acknowledge
that you and I don't agree on exactly what a character ought to
be).  Strings are unlike vectors in that they are homogeneous
arrays.

Second, strings without mutation:

Using strings without mutation, whether or not the system has such
a thing as "immutable strings", is a fine programming style for
many situations but certainly not all.  Perhaps the easiest-to-see
examples come from systems programming where, for example, a string
is a natural representation for a buffer of characters that is
displayed directly on a terminal; or, as another example, a device
driver wants to accumulate characters in a string-like buffer
before passing them to a client program.
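
The device-driver case can be sketched in a few lines.  This is only
a minimal illustration, assuming R5RS-style mutable strings and
mutable pairs.  The names make-line-buffer, buffer-put!, and
buffer-flush! are hypothetical, invented for this example; only
make-string, string-set!, string-length, substring, and set-cdr!
are standard procedures.

```scheme
;; Hypothetical helper names; the string primitives are standard R5RS.

;; A buffer is a pair: (fixed-size string . number of characters stored).
(define (make-line-buffer size)
  (cons (make-string size #\space) 0))

;; Store ch at the next free index, mutating the string in place.
;; Returns #t on success and #f if the buffer is already full.
(define (buffer-put! buf ch)
  (let ((str (car buf))
        (n (cdr buf)))
    (if (< n (string-length str))
        (begin
          (string-set! str n ch)
          (set-cdr! buf (+ n 1))
          #t)
        #f)))

;; Copy out the accumulated characters for the client and reset the
;; fill count.  The underlying string object is reused, not reallocated.
(define (buffer-flush! buf)
  (let ((out (substring (car buf) 0 (cdr buf))))
    (set-cdr! buf 0)
    out))

(define buf (make-line-buffer 8))
(buffer-put! buf #\h)
(buffer-put! buf #\i)
(buffer-flush! buf)   ; => "hi"
```

The point of the sketch is that the same string object stays in
place across many writes; without "string-set!" each write would
have to build a new string.
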
In situations like those examples you very much want a stable,
string-like object (representing a fixed region of memory) whose
contents can be mutated.

If I were to build what we could call a "large implementation of
small Scheme", the type system would probably include, at its lowest
levels, fixnums, floats, 8-, 16-, and 32-bit code points, cons-pairs,
vectors, a facility for deriving a new disjoint type from any
earlier constructed type, a facility for creating mutable,
homogeneous arrays of any earlier constructed type, a facility for
constructing a disjoint immutable type from any mutable type,
procedures, environments, locatives, symbols, low-level macros,
disjoint booleans, and nil.  (I hope I didn't forget any.)  I don't
suppose that small Scheme should require all of those types; I'm
just saying that a large implementation of what I think of as small
Scheme should permit them.

On the back of those things, plus lambda and two-argument eval, I
would implement generics and provide the primitives of core small
Scheme as generics.  The result would not be the same as Common
Lisp but would have some similarities to it, with low-level
concrete types, user-constructed disjoint types of arbitrary
representation, and some abstract types (such as the primitive
types of small Scheme).

In such a system I'd have a Scheme-class language suitable for
everything from low-level systems programming to very high-level
domain-specific languages.  The standard small Scheme environment
is then a kind of lingua franca.  For example, if someone offers me
a nice sort algorithm written in small Scheme, I can probably put
it to good use.  Or, if I use my dialect as an extension language
for a text editor, anyone who knows Scheme can find a comfortable
dialect in which to write many conceivable extensions.

And *you* could, if so inclined, make a Scheme-like environment
with only immutable strings.
And if someone wants to, they could write compiler optimizations
that are specific to that immutable-strings world.

The job of a small Scheme spec, in this way of looking at things,
is to specify useful invariants for certain abstract types, macros,
and lambda, but without precluding something like "a large
implementation of small Scheme" as described here.

I don't think there is just one job for "big Scheme" dialects.
There is one big important job for *a* big Scheme dialect: to
provide a large "growable" dialect that a lot of implementers
happen to agree upon.  But in general, I think people should
experiment in defining big Scheme dialects with the notion that in
a large small Scheme like the one I described, multiple big
dialects can co-exist and usefully inter-operate with, at the very
least, small Scheme as a lingua franca between environments.

Finally: should characters and strings be disjoint, or should
characters be regarded as strings of length 1?

The argument for mutable strings is one way to answer that question
but not the only way.  Another argument runs as follows.  As you've
noted, implementations "want" to have a special (usually immediate)
representation for characters.  Additionally, some programs "want"
to make a distinction between element types and sequence types.
Finally, it is easier to unify than to pull apart: given disjoint
characters, please by all means produce a library of "str-"
functions that handles only immutable strings and identifies
characters with length-1 (in codepoints) strings.  Going the other
way is not so easy, at least not cleanly.  To go the other way we
would have to wrap either or both of characters and strings in some
disjoint type, leading to a needlessly inefficient representation.

Disjointness is a good choice for the core standard.

-t

_______________________________________________
r6rs-discuss mailing list
http://lists.r6rs.org/cgi-bin/mailman/listinfo/r6rs-discuss
