Jason Orendorff wrote:

> And most (but not all) Unicode string implementations use UTF-16.
> Among languages and libraries that are very widely used, the majority
> is overwhelming: Java, Microsoft's CLR, Python, JavaScript, Qt,
> Xerces-C, and on and on. 

(...and Windows and Mac and IBM's ICU and PHP 6 and...)

> Higher-level APIs are a fine approach.
> 
> The other solution is to standardize the implementation, so that the
> efficient algorithms don't differ.  I want to push this seriously one
> last time:  Unicode strings have been kicked around for a while now,
> and despite Will's link, real-world implementations do not vary much.
> I don't think it's premature to standardize.

I started looking into these issues a while ago, when we were faced with 
internationalizing an app. (The app runs on several platforms and under 
several web servers.) Before learning what's out there, I would have 
wanted to keep my options open; knowing what I know now, I agree with 
Jason. It would make sense to standardize on UTF-16 strings and UTF-32 
characters. (Note, btw, that this doesn't preclude UTF-8 strings. It just 
means that the built-in string type would be UTF-16.)
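
To make "UTF-16 strings and UTF-32 characters" concrete, here is a rough 
sketch in Python (chosen only for illustration; it is not a proposed 
Scheme API). The helper names to_utf16_units and code_points are made up 
for this example. The string is stored as 16-bit code units, and a 
character is a full code point, so anything outside the BMP takes two 
units (a surrogate pair).

```python
def to_utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints (hypothetical helper)."""
    data = text.encode("utf-16-be")  # big-endian, no byte-order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def code_points(units):
    """Decode a list of UTF-16 code units into code points (hypothetical helper)."""
    points = []
    i = 0
    while i < len(units):
        u = units[i]
        is_high = 0xD800 <= u <= 0xDBFF
        if is_high and i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
            # A high surrogate followed by a low surrogate encodes one
            # code point above U+FFFF.
            points.append(0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        else:
            # A BMP code unit (or an unpaired surrogate) is passed through as-is.
            points.append(u)
            i += 1
    return points


# 'a', e-acute, and U+1D11E (a character outside the BMP).
sample = "a\u00e9\U0001D11E"
units = to_utf16_units(sample)
print([hex(u) for u in units])
print([hex(p) for p in code_points(units)])
```

Three characters come out as four code units here, which is the 
distinction a code-unit-level API exposes and a character-level API hides.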

On a different note, I find this desire to shield programmers from code 
units odd and senseless. If R6RS intends Scheme to be a higher-level 
language that abstracts away representation issues, why is it adding 
fixnums and flonums? Why do bytevectors have operations that get and set 
singles and doubles?

_______________________________________________
r6rs-discuss mailing list
[email protected]
http://lists.r6rs.org/cgi-bin/mailman/listinfo/r6rs-discuss