Thank you, that was very helpful :)

-----Original Message-----
From: Bjoern Hoehrmann [mailto:[EMAIL PROTECTED]
Sent: Sunday, September 12, 2004 10:09 AM
To: Burak Gürsoy
Cc: perl-unicode
Subject: Re: Encode vs encoding

* Burak Gürsoy wrote:
>Can someone *please* explain me the
>difference between (except the scope) encoding and Encode::encode()?

encoding.pm is about how Perl should interpret your source file,
Encode.pm is about character encoding operations you may wish to
perform.

>#!/usr/bin/perl -w
>use strict;
>my $char = "\xFE";
>print ord $char; # prints 254

Perl assumes that $char is ISO-8859-1 encoded.

>#!/usr/bin/perl -w
>use strict;
>use Encode;
>my $char = "\xFE";
>   $char = encode 'ISO-8859-9', $char;
>print ord $char; # prints 63

As above, Perl treats $char as U+00FE, which is not available in
ISO-8859-9 and is thus replaced by a question mark (63).

>#!/usr/bin/perl -w
>use strict;
>use Encode;
>my $char = "\xFE";
>   $char = encode 'ISO-8859-9', $char, Encode::FB_CROAK();
># dies with: "\x{00fe}" does not map to iso-8859-9
>print ord $char;

As above, except that it does not replace the offending character but
croaks instead.

>#!/usr/bin/perl -w
>use strict;
>use encoding 'ISO-8859-9';
>my $char = "\xFE";
>print ord $char; # prints 351

You've told Perl to consider the source ISO-8859-9 encoded, which also
affects how string literals such as your $char are interpreted: byte
0xFE in ISO-8859-9 is U+015F, i.e. 351.

>How can I get that "351" with Encode.pm?

You need to decode the binary string into a character string using the
Encode::decode routine, e.g.

  perl -MEncode -e "print ord decode 'iso-8859-9'=>qq(\xFE)"

>Note: If I use the letter version (small s with a dot under it "ş")
>instead of the "\x" escape, I get the same results...

For the same reasons.
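For reference, the decode/encode distinction discussed above can be sketched as a small script. This is only a minimal illustration, assuming a Perl with the core Encode module (5.8 or later). The variable names ($bytes, $char, $round_trip, $wrong) are made up for the example and do not come from the original messages.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# The single byte 0xFE, as it might arrive from an ISO-8859-9 file.
my $bytes = "\xFE";

# decode: bytes -> characters. In ISO-8859-9, byte 0xFE is U+015F.
my $char = decode('ISO-8859-9', $bytes);
printf "decoded:     %d\n", ord $char;        # prints 351

# encode: characters -> bytes. This round-trips to the original byte.
my $round_trip = encode('ISO-8859-9', $char);
printf "re-encoded:  %d\n", ord $round_trip;  # prints 254

# Encoding the undecoded string treats it as U+00FE, which
# ISO-8859-9 lacks, so the default fallback substitutes "?".
my $undecoded = "\xFE";
my $wrong = encode('ISO-8859-9', $undecoded);
printf "mis-encoded: %d\n", ord $wrong;       # prints 63
```

In short, decode is the direction that yields 351. encode goes the other way and only makes sense on a string that already holds characters.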