Quoting Marco Antoniotti ([EMAIL PROTECTED]):
> following this thread, it seems to me that what would be extremely
> valuable would be a reasoned comparison of how different CL
> implementations (including the commercial ones) support Unicode w.r.t.
> the ANSI standard.
Yes.
Some possible questions (and partial answers for CMUCL, Allegro (ACL), and SBCL):
* Which range of characters does an implementation support?
CMUCL: 8 bit
Allegro: 8 bit / 16 bit (different images shipped)
SBCL: 8 bit / 21 bit (compilation option; with sb-unicode, char-code-limit is #x110000)
* Unicode support:
CMUCL: no
Allegro: Certainly supports 16-bit Unicode (I think that's
Unicode 1.0). Not sure about Unicode 2.0. To represent more
characters, some implementations use UTF-16 internally (Java does
that these days). I think Allegro allows use of surrogates in its
strings, but I'm not sure how much its string-related functions
actually understand about them.
SBCL: yes
* External formats:
CMUCL: n/a, the character code is written as an octet
Allegro: ISO-8859-1 to -9, ISO-8859-14 and -15, koi8-r, CP 1250-58, UTF-8,
big5, gb2312, euc, CP 874, 932, 936, 949, 950, jis, shiftjis
Support for composed external formats: CRLF line endings
SBCL: fill me in
* Routines to map octet sequences to strings and back.
CMUCL: no
Allegro: string-to-octets, octets-to-string, string-to-native,
SBCL: string-to-octets, octets-to-string
(albeit with fewer helpful keyword arguments)
flexi-streams provides an emulation for this
Compare and discuss different APIs here.
* Support for bivalent streams (particularly helpful on Unicode-aware
Lisps, because characters don't trivially map to octets any more):
CMUCL: don't know
Allegro: yes (simple streams)
SBCL: yes (fd streams; simple streams bivalent but not quite finished)
flexi-streams provides an emulation for this
* Support for (setf external-format) (like bivalent streams, this is
something that could help, e.g., XML parsers that don't know the
external format before having read the XML declaration):
CMUCL: no (?)
Allegro: yes
SBCL: yes (?)
flexi-streams provides this
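To make the XML case in the last two items concrete, here is a rough sketch of the "read octets first, pick the external format afterwards" pattern. It is written in Python only to stay neutral between the Lisps compared above; it is not the API of any of them. The helper read_xml_text and the pattern XML_DECL are made up for illustration, and a real parser would also have to handle BOMs and UTF-16 input.

```python
import io
import re

# Hypothetical pattern: picks the encoding name out of an XML declaration
# that is still in octet form (the declaration itself is ASCII-compatible).
XML_DECL = re.compile(rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']')

def read_xml_text(stream, default="utf-8"):
    """Hypothetical helper: sniff the declaration, then decode everything."""
    head = stream.read(128)                 # phase 1: treat the stream as octets
    match = XML_DECL.match(head)
    encoding = match.group(1).decode("ascii") if match else default
    rest = stream.read()                    # remaining octets
    # phase 2: only now do we know how octets map to characters
    return encoding, (head + rest).decode(encoding)

doc = '<?xml version="1.0" encoding="ISO-8859-1"?><p>café</p>'.encode("iso-8859-1")
encoding, text = read_xml_text(io.BytesIO(doc))
print(encoding)   # ISO-8859-1
print(text[-11:]) # <p>café</p>
```

With a bivalent stream or (setf external-format), the same two phases could run on one stream without buffering the whole document, which is the point of the items above.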
I'm sure there are other interesting features and issues. Perhaps
the sb-unicode people have something to say about this?
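On the surrogate question under "Unicode support" above, here is a small illustration of why a 16-bit string representation needs surrogate-aware string functions. Again it is Python used as a neutral notation, not any Lisp's API; to_surrogates is a made-up helper that just applies the standard UTF-16 arithmetic.

```python
ch = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the 16-bit range

def to_surrogates(code_point):
    """Hypothetical helper: split a code point above #xFFFF into a UTF-16 pair."""
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)

# The same pair, obtained by letting the codec do the work.
utf16 = ch.encode("utf-16-be")
units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]

print([hex(u) for u in to_surrogates(ord(ch))])  # ['0xd834', '0xdd1e']
print([hex(u) for u in units])                   # ['0xd834', '0xdd1e']
print(len(ch.encode("utf-8")))                   # 4 octets in UTF-8
```

So one character occupies two 16-bit string elements, and a length or char accessor that counts elements gives a different answer from one that counts characters.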
> Now: don't look at me for actually doing this. I have no time. I just
> think it is a good idea.
Me too, but perhaps the above helps.
d.
_______________________________________________
Gardeners mailing list
[email protected]
http://www.lispniks.com/mailman/listinfo/gardeners