On Sat, 19 Sep 2009 00:24:58 -0500, John Cowan <[email protected]> wrote:
> This is a proposal for the removal of string-set! (and consequently
> string-fill!) from the R7RS small Scheme language. I am publishing this
> document to invite wide comment. There is nothing official about it.
> I very gratefully acknowledge the kind help of Alex Shinn, who provided
> the topic sentences for most of the paragraphs below. However, I retain
> sole responsibility for this document, including all errors.
>
> I believe that despite the prescription of the draft WG1 charter that
> no features of IEEE Scheme (a subset of R4RS) should be removed from
> R7RS small Scheme, an exception should be made for string-set!, for at
> least the following reasons:

[Snipped a list of points, most of which I agree with]

> 4) As currently designed, strings are functionally just vectors of
> characters. In an 8-bit world, using the traditional representation
> of strings carries a 4:1 storage advantage, making it worthwhile
> to distinguish them clearly from general vectors But 21-bit Unicode
> characters are a much better fit, if represented as immediate (unboxed)
> values, for general vectors using 32-bit pointers. Granted that not all
> small Scheme systems will provide full Unicode support, general vectors
> start to look much less expensive than they once were. In short: if
> you want something that behaves like a vector of characters, simply use
> a general vector that contains characters.

I don't think there's any point to using a general vector of characters
as a replacement for mutable strings. I'll add another reason for
getting rid of mutable strings, as well as a rejoinder to reason #4:

6) There's no general utility to `string-set!' without also the ability
to insert and delete characters in a string. Programs that work with
text generally want either an immutable string or an editable string
into which characters can be inserted and deleted. Editable strings are
typically represented as gap buffers.

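For readers unfamiliar with the term, here is a minimal sketch of a gap
buffer. It is written in Python purely for illustration and is not taken
from any Scheme implementation; the class name `GapBuffer` and its
method names are hypothetical. A Python list is used for the storage,
so the sketch shows only the insert/delete mechanics, not the compact
storage a real implementation would use.

```python
class GapBuffer:
    """Editable string sketch: text is stored around a movable gap."""

    def __init__(self, text="", gap_size=16):
        # Layout: [text before gap][unused gap slots][text after gap]
        self._buf = list(text) + [None] * gap_size
        self._gap_start = len(text)      # first unused slot
        self._gap_end = len(self._buf)   # first used slot after the gap

    def __len__(self):
        return len(self._buf) - (self._gap_end - self._gap_start)

    def __str__(self):
        return "".join(self._buf[:self._gap_start] + self._buf[self._gap_end:])

    def _move_gap(self, pos):
        # Slide characters across the gap until the gap starts at pos.
        if not 0 <= pos <= len(self):
            raise IndexError("position out of range")
        while self._gap_start > pos:
            self._gap_start -= 1
            self._gap_end -= 1
            self._buf[self._gap_end] = self._buf[self._gap_start]
        while self._gap_start < pos:
            self._buf[self._gap_start] = self._buf[self._gap_end]
            self._gap_start += 1
            self._gap_end += 1

    def _grow(self, needed):
        # Widen the gap in place by at least `needed` slots.
        extra = max(needed, len(self._buf))
        self._buf[self._gap_end:self._gap_end] = [None] * extra
        self._gap_end += extra

    def insert(self, pos, text):
        # Move the gap to pos, then fill it from the left.
        self._move_gap(pos)
        if self._gap_end - self._gap_start < len(text):
            self._grow(len(text))
        for ch in text:
            self._buf[self._gap_start] = ch
            self._gap_start += 1

    def delete(self, pos, count):
        # Deleting just widens the gap over the removed characters.
        if count < 0 or pos + count > len(self):
            raise IndexError("range out of bounds")
        self._move_gap(pos)
        self._gap_end += count


buf = GapBuffer("hello world", gap_size=2)
buf.insert(5, ",")
print(buf)          # hello, world
buf.delete(0, 7)
print(buf)          # world
```

Repeated edits near the same position are cheap because the gap is
already there; only moving the gap to a distant position costs time
proportional to the distance moved.
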
It's possible to use mutable strings to build up an editable string
representation, but this is not evidence of general utility of
fixed-length mutable strings in my opinion. Nothing will give you
immutable strings if you don't already have them. On the other hand, if
the core Scheme strings were immutable and an editable strings library
were provided, existing users of mutable strings could convert to using
the latter representation with little effort, and writing programs
which require an editable strings representation would be vastly
simplified.

It's much easier for an implementation that uses a variable-width
internal string representation to provide immutable strings and
editable strings than to provide only mutable fixed-length strings.
When represented as a gap buffer, editable strings retain the 4-to-1 or
8-to-1 compactness advantage of strings over general vectors of
characters. The implementation of editable strings is not significantly
more complex than the implementation of mutable fixed-length strings.
Complex algorithms expressed in terms of `string-set!' can be rewritten
in terms of insert and delete operations with a great increase in
clarity.

> As a consequence of removing string-set!, string-fill! (not in IEEE
> Scheme) becomes impossible and string-copy less useful. I do not propose
> to remove string-copy, however, because it can eliminate space leaks
> that are caused by taking a small shared substring of a large existing
> string: when the larger string should be GC'ed, it is retained as a
> whole because of the shared substring. Using string-copy judiciously
> can prevent such leaks.

I shouldn't have to use `string-copy' for this; my implementation
should do it for me. If there's no user-exposed backpointer from the
substring to the original string, the GC can dispose of the original
string and copy out the retaining displaced substrings when it makes
sense to do so.
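
The two substring strategies under discussion, sharing storage versus
copying, can be sketched as follows. This is an illustration in Python
using byte strings as a stand-in for Scheme strings, not a description
of any Scheme implementation; the function names `shared_substring` and
`copying_substring` are hypothetical.

```python
def shared_substring(data, start, end):
    # A memoryview slice shares storage with `data`; while the view is
    # alive, the whole underlying object stays reachable.
    return memoryview(data)[start:end]


def copying_substring(data, start, end):
    # Slicing a bytes object copies only the requested range, so the
    # result holds no reference to the original object.
    return bytes(data[start:end])


big = bytes(1_000_000)                 # stands in for a large string
view = shared_substring(big, 0, 10)
copy = copying_substring(big, 0, 10)

print(view.obj is big)                 # True: the view retains `big`
print(len(copy))                       # 10
```

Note that `view.obj` is exactly the kind of user-exposed backpointer
that would prevent a collector from discarding the original behind the
program's back.
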
Implementations which don't want to implement this level of GC
complexity can make `substring' always copy. There's really no sense to
providing a copy operation for an immutable type.

--
Brian Mastenbrook
[email protected]
http://brian.mastenbrook.net/

_______________________________________________
r6rs-discuss mailing list
[email protected]
http://lists.r6rs.org/cgi-bin/mailman/listinfo/r6rs-discuss
