In a message dated Tue, 24-08-2004, 20:20 +0300, Jarkko Hietaniemi wrote:
> > that with double recoding, or with $ARGV[0] not being equivalent to
> > substr($ARGV[0], 0).
>
> What substr() example you are referring to here?  I cannot find this
> in your recent messages.

$ perl -Mencoding=ISO-8859-2 -Mopen=:encoding\(ISO-8859-2\) -e '
eval {open F, "/etc/shadow"};
print "$ARGV[0]\n", substr($ARGV[0], 0), "\n"' Ą
Ą
"\x{00a1}" does not map to iso-8859-2 at -e line 1.
\x{00a1}

$ perl -Mencoding=ISO-8859-2 -Mopen=:encoding\(ISO-8859-2\) -e '
eval {open F, "/etc/shadow"};
print "$!\n", substr($!, 0), "\n"'
Brak dostępu
"\x{00ea}" does not map to iso-8859-2 at -e line 1.
Brak dost\x{00ea}pu

> > I hope the -C flag is considered a temporary hack, to be eventually
> > replaced with somethings which supports other encodings and not only
> > UTF-8.
>
> Possibly.  It was an explicit solution for much greater brokenness
> that resulting from assuming implicit UTF-8 from locales.

What breaks?

Maybe the problem is that Perl doesn't distinguish strings of text from
arrays of bytes. Do people expect print chr(255) to output a single
byte? That cannot work when the stdout encoding is UTF-8, no matter what.

Someone who works in a mostly-UTF-8 environment probably expects stdout
to be treated as UTF-8 text by default, which implies that some other
means must be used for outputting raw bytes. Maybe syswrite. The same
goes for input and for file contents in general.

> > use encoding files => "ISO-8859-2";
> > use encoding terminal => "UTF-8";
>
> What do you mean by "terminal"?  The STD* streams or /dev/tty?

Nothing precise; it is yet to be decided which classes of "places" that
need recoding should be distinguished. Perhaps one switch for all I/O
is enough. Or maybe only the STD* streams should be separated.

-- 
   __("<         Marcin Kowalczyk
   \__/       [EMAIL PROTECTED]
    ^^     http://qrnik.knm.org.pl/~qrczak/