NotFound wrote:
> To open another can of worms, I think that we can live without
> character set specification. We can establish that the character set is
> always Unicode, and deal only with encodings.

We had that discussion already, and the answer was "no" for several
reasons:

* Strings might contain binary data, and it doesn't make sense to view
  that as Unicode.
* Unicode isn't necessarily universal, or might stop being so in the
  future. If a character is not representable in Unicode, and you chose
  to use Unicode for everything, you're screwed.
* Related to the previous point, some other character encodings might
  not have a lossless round-trip conversion through Unicode.

> Ascii is an encoding
> that maps directly to codepoints and only allows 0-127 values.
> iso-8859-1 is the same with 0-255 range. Any other 8 bit encoding just
> need a translation table. The only point to solve is we need some
> special way to work with fixed-8 with no intended character
> representation.

Introducing the "no character set" character set is just a special case
of arbitrary character sets. I see no point in using the special case
over the generic one.

Here's the discussion we had on this subject:
http://irclog.perlgeek.de/parrot/2008-06-23#i_362697

Cheers,
Moritz

-- 
Moritz Lenz
http://moritz.faui2k3.org/ | http://perl-6.de/
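
For illustration, here is a minimal Python sketch of the encoding points
raised in this message. The helper names are hypothetical and nothing
here is Parrot's API; Python's built-in codecs are only a stand-in. It
checks which encodings map bytes directly to code points, and what
happens to binary data that is forced through a Unicode view.

```python
# Illustrative sketch only: helper names are hypothetical, not Parrot API.

def decodes_to_same_codepoints(data: bytes, encoding: str) -> bool:
    """True if every byte decodes to the code point with the same number."""
    try:
        text = data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return [ord(ch) for ch in text] == list(data)


def survives_unicode_round_trip(data: bytes, encoding: str) -> bool:
    """Decode (replacing undecodable bytes), re-encode, compare with input."""
    text = data.decode(encoding, errors="replace")
    return text.encode(encoding, errors="replace") == data


ascii_range = bytes(range(128))   # values 0-127
all_bytes = bytes(range(256))     # values 0-255
binary_blob = b"\xff\xfe\x00\x80"  # arbitrary binary data, not valid UTF-8

# ASCII and ISO-8859-1 map byte values straight to code points.
print(decodes_to_same_codepoints(ascii_range, "ascii"))
print(decodes_to_same_codepoints(all_bytes, "latin-1"))

# Another 8-bit encoding needs a translation table:
# in cp1252, byte 0x80 is U+20AC, not U+0080.
print(decodes_to_same_codepoints(b"\x80", "cp1252"))

# Binary data viewed as UTF-8 does not come back unchanged.
print(survives_unicode_round_trip(binary_blob, "utf-8"))
```

The first two checks print True and the last two print False: treating
the binary blob as UTF-8 replaces its bytes with U+FFFD, so the original
data is lost.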