Re: [r6rs-discuss] Strings as codepoint-vectors: bad

Thomas Lord Thu, 15 Mar 2007 22:41:40 -0800

Jason Orendorff wrote:

> Perhaps we would like to hide the in-memory encoding of strings from
> users, but that's not really possible if you *also* wish to expose a
> fast low-level API with integer offsets.  The (string-ref) and
> (string-set!) APIs, as currently specified, hit a sweet spot of API
> badness: they're so low-level and essential that it's almost
> unthinkable that they be anything but O(1); yet they're sufficiently
> high-level that every actual O(1) implementation sacrifices efficiency
> somewhere else.
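
To make the trade-off in the quoted paragraph concrete, here is a rough sketch of two string representations. It is an illustration only, not how any particular Scheme implements strings: the class names `Utf32String` and `Utf8String` and their `ref`/`nbytes` methods are made up for this example. The fixed-width version indexes in O(1) but spends four bytes on every code point; the UTF-8 version is compact but has to scan to find the k-th code point.

```python
class Utf32String:
    """Hypothetical fixed-width storage: 4 bytes per code point, O(1) ref."""

    def __init__(self, text):
        self._data = text.encode("utf-32-le")

    def ref(self, k):
        # Every code point sits at a known byte offset, so no scan is needed.
        if not 0 <= k < len(self._data) // 4:
            raise IndexError(k)
        return self._data[4 * k:4 * k + 4].decode("utf-32-le")

    def nbytes(self):
        return len(self._data)


class Utf8String:
    """Hypothetical variable-width storage: compact, but ref is O(n)."""

    def __init__(self, text):
        self._data = text.encode("utf-8")

    def ref(self, k):
        if k < 0:
            raise IndexError(k)
        index = -1
        for pos, byte in enumerate(self._data):
            # Continuation bytes look like 10xxxxxx; anything else starts
            # a new code point.
            if byte & 0xC0 != 0x80:
                index += 1
                if index == k:
                    end = pos + 1
                    while end < len(self._data) and self._data[end] & 0xC0 == 0x80:
                        end += 1
                    return self._data[pos:end].decode("utf-8")
        raise IndexError(k)

    def nbytes(self):
        return len(self._data)


text = "naïve λ-calculus"
wide, narrow = Utf32String(text), Utf8String(text)
print(wide.ref(6), narrow.ref(6))      # same answer, different cost to get it
print(wide.nbytes(), narrow.nbytes())  # the fixed-width form is much larger
```

Either way something is given up: memory in the first case, constant-time indexing in the second.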



(With apologies, but....) this is an example of a category of error
that seems to show up whenever people start talking about Unicode
implementation, whether in the context of Scheme or not.  There is
a generic part to it and an R6-specific part:

The generic error is wishing for some "easy way out" that
makes Unicode as easy to hack as ASCII.  Won't happen.
Text is just not that simple.  Unicode does a fantastic job of
making it as simple as possible "... but no simpler".
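
A small illustration of why text is not that simple, using only Python's standard `unicodedata` module. The sample strings are my own choice for the example: the same visible letter can be one code point or two, and a case mapping can change the length of a string, so "index k" and "character k" are not the same idea.

```python
import unicodedata

# 'e' followed by COMBINING ACUTE ACCENT: two code points, one visible letter.
decomposed = "e\u0301"

# NFC normalization composes the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))  # code point counts differ
print(decomposed == composed)          # and the strings compare unequal

# Full case mapping is not one-to-one: uppercasing can lengthen a string.
word = "straße"
print(len(word), len(word.upper()))
```

None of this depends on how the string is stored; it holds even with O(1) code point access.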

The R6-specific thing is the draft's approach of designing CHAR and
STRING around text, rather than around a more abstract conception
of PORTs.  Among other problems, that leads people to go on
a quest for that "easy way out" (which, anyway, there are principled
reasons to reject a priori).  This ain't Python we're hacking here.


-t


_______________________________________________
r6rs-discuss mailing list
[email protected]
http://lists.r6rs.org/cgi-bin/mailman/listinfo/r6rs-discuss
