I am posting this as an individual member of the Scheme community. I am not speaking for the R6RS editors, and this message should not be confused with the editors' eventual formal response.
This message is a collection of responses to comments made by two different authors during the past 24 hours. When the author of a quotation is not identified, it is the same as the author of the previous quotation.

John Cowan wrote:

> > Quibble: I think the historical view of strings should be
> > continued for backwards compatibility with Scheme tradition.
>
> In that case, you also have to make characters something other than
> Unicode scalar values, or else go to very tricky implementations.
> I'm trying to break as little of R5.92RS as possible.

The character and string API described in the current draft of R6RS will not break any portable R5RS code. Your claim that backwards compatibility requires characters to be "something other than Unicode scalar values" or "very tricky implementations" is false, as shown by several implementations that already implement characters as Unicode scalar values and several other implementations that will do so within months.

> I think the only way to make this fly is to introduce a CL-style
> distinction between characters (in texts) and basic characters
> (in strings), though keeping the name "character".

The question is why you think that.

> Alternatively,
> we could say that strings are sequences of Scheme character objects,

"Strings are sequences of characters." (Opening sentence of section 9.14 of the current draft R6RS.)

> but the
> atomic unit of texts is a text containing a single (Unicode) character.

That would be fine. The new data type is unconstrained by the old.

> However, I'd be pretty unhappy with doubling up like this. There is
> little or nothing that strings can do that texts cannot.

Aside from mutation if texts are immutable, or efficiency if you somehow manage to design an inherently inefficient data type of texts.

> For that matter, you can implement R5RS strings as vectors,
> provided you are allowed to redefine "vector?". If strings
> are massively more efficient than texts, though, people will
> go on using them.

The strings described by the current draft R6RS are quite efficient for the traditional uses of strings in Scheme. It would not be hard to design a text type that is slow enough to encourage people to continue to use strings. I would hope, however, that the designers of the text type would try to avoid that outcome.

Chris Hanson wrote:

> The
> historical string abstraction should remain vectors of 8-bit (or 7-bit)
> characters with side effects.

Neither the IEEE/ANSI standard nor the Scheme reports limited characters to 7 or 8 bits. Many implementations did so, but any code that relied upon such a limitation was never portable.

> Or we could say that strings contain only a small subset, e.g. ISO
> 8859-1 or US-ASCII.

That would be a more radical departure from past reports than is taken by the current draft R6RS.

> > In my opinion, texts should be written up as a SRFI, and
> > then be considered for inclusion in the R7RS.
>
> <flame>
>
> Yes, but the same could be said for many of the experiments currently
> being pushed into R6RS. Speaking only for myself and some
> as-yet-unidentified historical brethren, I would be much happier with a
> less radical and more evolutionary document. Why exactly is it
> necessary to change **everything** now? Either this process works, in
> which case there will be further revisions. Or it doesn't, in which
> case it doesn't matter.

That is a valid criticism of much of the current draft, but is not a valid criticism of its treatment of characters and strings. In the current draft, characters and strings are backwards compatible with the R5RS in the sense that no portable R5RS-conforming program will have to change its handling of characters and strings.
Furthermore, the current draft's handling of characters and strings was first put forward as SRFI 75 in July 2005, which received the benefit of the usual SRFI discussion and at least one actual implementation before it was withdrawn in May 2006 to clear the way for a revision of it in the first draft R6RS. You can't call it experimental when people have been using it in production code since December 2005.

> The editors should be in trying to make this document a success, rather
> than in packing it with all these new things. A more conservative
> document stands a much better chance of ratification and implementation.

Agreed, except that characters and strings are not new in Scheme, none of the character and string procedures described in (r6rs base) are new in Scheme, and the few additional character and string procedures in (r6rs unicode) can't break any existing code. You can't get much more conservative than that.

> That's important, because the way things are going I am very skeptical
> that R6RS will be implemented.

If the R6RS is ratified, it will be implemented. Whether the R6RS will be ratified is, however, an open question.

> Well, except for things like XML libraries that won't work with strings,
> or the fact that SYMBOL->STRING will signal an error on any symbol
> containing a character outside the subset.

Just to clarify: libraries written in other languages have never interoperated with Scheme strings without marshalling or dependence on implementation-specific representations. Furthermore, Chris Hanson may be the only person who thinks symbol->string should "signal an error on any symbol containing a character outside" Hanson's preferred set of characters. Neither can be blamed on the current draft R6RS.

Will

_______________________________________________
r6rs-discuss mailing list
http://lists.r6rs.org/cgi-bin/mailman/listinfo/r6rs-discuss
