Leopold Toetsch wrote:

1) The Parrot internal character type

«Strings in Parrot's native string format will probably be an array of "Parrot_Rune"s.»

or iso-8859-1 or UCS-2.

To be more accurate: Parrot has *no* native string format. It stores strings in whatever format you give it (iso-8859-1, UCS-2, ASCII, etc.), and it stores them as a string buffer, not as an array of any character type.
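To make the "buffer, not array of characters" distinction concrete, here is a minimal sketch in C. Every name in it (sketch_string, sketch_encoding, sketch_length) is hypothetical and chosen only for illustration; it is not Parrot's actual string structure, and the three encodings are just the ones mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout: an illustration, not Parrot's real layout. */
typedef enum {
    SKETCH_ENC_ASCII,
    SKETCH_ENC_ISO_8859_1,
    SKETCH_ENC_UCS2
} sketch_encoding;

typedef struct {
    const unsigned char *buf;   /* raw bytes, kept exactly as they were given */
    size_t               bytes; /* length of the buffer in bytes */
    sketch_encoding      enc;   /* tag saying how to interpret the bytes */
} sketch_string;

/* Number of code units in the buffer. The answer depends on the encoding
 * tag, which is why the buffer cannot be treated as an array of one fixed
 * character type. */
static size_t sketch_length(const sketch_string *s)
{
    assert(s != NULL);
    return s->enc == SKETCH_ENC_UCS2 ? s->bytes / 2 : s->bytes;
}
```

The same four bytes count as two code units when tagged UCS-2 and as four when tagged iso-8859-1; nothing is converted on the way in.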

2) the concept of Parrot_Rune or

<cite>
Unicode codepoint where values >= 0x80000000 are
       understood to be entries into the global "Parrot_grapheme_table" array.
</cite>

seems to imply that we are going to start:

a) rewriting / improving the ICU library we use now
b) inventing a new "standard"
c) doing a lot of future hiring to keep in sync with the Unicode folks ;-)

Basically, I have some concerns about who will implement and maintain it.

Agreed, that would be bad, but I don't think Simon intended that. Regardless, the current spec only adds one more normalization form on top of the existing Unicode Standard. It requires no changes to ICU; it is just another way of interacting with strings, whatever format they happen to be in.
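For illustration, the rule quoted above (values >= 0x80000000 refer to grapheme table entries, anything lower is a plain Unicode codepoint) could be sketched in C roughly as follows. The names are hypothetical, and taking the low 31 bits as the table index is my assumption; the quoted text does not say how the index is derived.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; only the 0x80000000 threshold comes from the spec text. */
#define SKETCH_GRAPHEME_FLAG 0x80000000u

typedef uint32_t sketch_rune;

/* True if the value refers to a grapheme table entry rather than being
 * an ordinary Unicode codepoint. */
static int sketch_is_grapheme(sketch_rune r)
{
    return r >= SKETCH_GRAPHEME_FLAG;
}

/* ASSUMPTION: the table index is the value with the top bit cleared. */
static uint32_t sketch_grapheme_index(sketch_rune r)
{
    assert(sketch_is_grapheme(r));
    return r & 0x7FFFFFFFu;
}
```

Ordinary codepoints pass through untouched, so the check sits on top of whatever Unicode handling is already there.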

Allison
