> > It's possible that I misunderstand you, but I think we're on the same page.
> > I'm also concerned that R6RS, as currently written, seems to require
> > UCS-4/UTF-32 strings. The problem is that string-ref returns characters, and
> > characters can't be surrogates. Given that Windows, Mac, Java, and IBM's ICU
> > all use UTF-16, that would be a Bad Thing. In fact, my position would be
> > even more extreme: I lament the loss of single/multi byte strings in general
> > (which would include UTF-8). They're still useful for low-level work. In
> > fact, they'll still be needed--think of the various Scheme to C compilers,
> > for example, that will need a char equivalent--they just won't be
> > standardized anymore.
>
> Is there any reason why bytevectors will not fill the need for
> single-byte strings?

They can, but...

First, from a practical perspective, many useful operations (string<?, string-downcase, etc.) have been lost. (If they were replaced, they would have rather funny names--bytevector<?, bytevector-downcase, etc.!) Second, from a clarity perspective, bytevectors are meant to be much more general-purpose than strings; they have, for example, operations for getting and setting integer and floating-point numbers. Those are rather odd operations for a string!

Bytevectors are definitely a very useful low-level addition to Scheme. But single/multi-byte strings were, I think, an unnecessary loss, especially for those who do lots of operating system- and library-level work. (Automatic and unavoidable conversion to and from single/multi-byte strings isn't a good idea because a) there is a potentially avoidable performance hit and b) the conversion isn't guaranteed to succeed, so there are questions about what to do in the case of failure.)

_______________________________________________
r6rs-discuss mailing list
[email protected]
http://lists.r6rs.org/cgi-bin/mailman/listinfo/r6rs-discuss
