On Fri, Mar 25, 2005 at 07:38:10PM -0000, Chip Salzenberg wrote:
: Would this be a good time to ask for explanation for C<str> being
: never Unicode, while C<Str> is always Unicode, thus leading to an
: inability to box a non-Unicode string?

As Rod said, "str" is just a way of declaring a byte buffer, for which
"characters", "graphemes", "codepoints", and "bytes" all mean the same
thing.  Conversion or coercion to more abstract types must be specified
explicitly.

: And might I also ask why in Perl 6 (if not Parrot) there seems to be
: no type support for strings with known encodings which are not subsets
: of Unicode?

Well, because the main point of Unicode is that there *are* no encodings
that cannot be considered subsets of Unicode.  Perl 6 considers itself
to have abstract Unicode semantics regardless of the underlying
representation of the data, which could be Latin-1 or Big5 or UTF-76.

That being said, abstract Unicode itself has varying levels of
abstraction, which is how we end up with .codes, .graphs, and .chars
in addition to .bytes.

Larry
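
To make the levels of abstraction concrete, here is a rough sketch in Python rather than Perl 6. The function name counts and its dictionary keys are hypothetical, chosen only to echo .bytes, .codes, and .graphs. The grapheme count is an approximation that treats any code point with a nonzero combining class as part of the preceding character; it is not full Unicode grapheme segmentation, and it does not try to model .chars.

```python
import unicodedata

def counts(text: str, encoding: str = "utf-8") -> dict:
    """Count one string at several levels of abstraction (illustrative only)."""
    # Bytes depend on the underlying representation, i.e. the encoding chosen.
    n_bytes = len(text.encode(encoding))
    # A Python str is a sequence of code points, so len() counts code points.
    n_codes = len(text)
    # Approximate graphemes: count only code points whose canonical combining
    # class is 0, assuming combining marks attach to the preceding base.
    n_graphs = sum(1 for ch in text if unicodedata.combining(ch) == 0)
    return {"bytes": n_bytes, "codes": n_codes, "graphs": n_graphs}

# "e" followed by COMBINING ACUTE ACCENT: one visible character,
# two code points, three bytes in UTF-8.
print(counts("e\u0301"))

# The same abstract character stored two ways: only the byte count changes.
print(counts("\u00e9", "latin-1"))
print(counts("\u00e9", "utf-8"))
```

The last two calls show the point about representation: the code point and grapheme counts stay the same whether the data is held as Latin-1 or UTF-8, and only the byte count differs.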