Dana Sharvit - M <[EMAIL PROTECTED]> writes:
>Hi ,
>I am using the Encode module (perl 5.8)to convert a string from utf8 to big
>5.
>There is something that I do not understand that I thought you may help
>with:
>The input to the program is a file that contains a utf8 string,
>The encoding works properly only when I use the following code:
> use Encode qw(encode decode find_encoding from_to);
>
>$file = shift;
>open my $in,  $file;
>
>while ($str = <$in>) {
>   chomp;
>      open my $in1,  "<:encoding(utf8)", \$str;
>          while (<$in1>) {
>          $octet = encode("utf8", $_);
>          from_to($octet, "utf8","big5");
>          print "$octet\n";
>          }
>
>
>}
>close $in;

That code reads the file and decodes the UTF-8 to get characters
(due to :encoding). Since they are now characters, you re-encode them
to UTF-8 octets, then use from_to, which decodes those octets again
(internally) and then re-encodes them as big5.

This is a lot of pointless re-encoding!
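
To make the redundancy concrete, here is a small sketch (the sample
string and variable names are mine, not from the original program)
comparing the long way round with a single encode call. It assumes
your Encode installation provides big5, as the stock one does:

```perl
use strict;
use warnings;
use Encode qw(decode encode from_to);

# U+4E2D U+6587 as UTF-8 octets, i.e. what a raw read of the file returns.
my $utf8_octets = "\xe4\xb8\xad\xe6\x96\x87";

# What the :encoding(utf8) layer does on input: octets -> characters.
my $chars = decode("utf8", $utf8_octets);

# Long way round: characters -> UTF-8 octets, then from_to decodes
# and re-encodes them as Big5.
my $long = encode("utf8", $chars);
from_to($long, "utf8", "big5");

# Direct: characters -> Big5 octets in one step.
my $short = encode("big5", $chars);

print $long eq $short ? "same octets\n" : "different octets\n";
```

Both routes should give the same four octets; the second just does
less work.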

You should either read the file as octets, or keep the :encoding 
(which is safer with respect to locale effects) and just encode:

open my $in1,  "<:encoding(utf8)", $file or die "Cannot open $file: $!";
while (<$in1>) {
  chomp;
  $octet = encode("big5", $_);
  print "$octet\n";
}
close $in1;
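
If the big5 output is going to a file rather than the terminal, a
variation is to let an :encoding layer do the work on both handles, so
no explicit encode call is needed. This is only a sketch: the helper
name utf8_file_to_big5 and the two-argument command line are
assumptions of mine, not part of your program.

```perl
use strict;
use warnings;

# Hypothetical helper: copy lines from $src to $dst, letting PerlIO layers
# decode UTF-8 on the way in and encode Big5 on the way out.
sub utf8_file_to_big5 {
    my ($src, $dst) = @_;
    open my $in,  "<:encoding(utf8)", $src or die "Cannot read $src: $!";
    open my $out, ">:encoding(big5)", $dst or die "Cannot write $dst: $!";
    while (my $line = <$in>) {
        print {$out} $line;    # $line holds characters; the layer encodes
    }
    close $in;
    close $out or die "Cannot close $dst: $!";
}

# Run as: perl script.pl input-utf8.txt output-big5.txt
utf8_file_to_big5(@ARGV) if @ARGV == 2;
```

With explicit layers on both handles, the data is characters everywhere
inside the program, which avoids having to track which scalars hold
octets and which hold characters.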



>
>What I don't understand is two things:
>1.why do I need to read the string using IO ( open my $in1,
>"<:encoding(utf8)", \$str;)

If your environment expects big5 (as your use of a raw print suggests),
then something is probably assuming that data read from files is big5
rather than utf8, so you have to tell it otherwise.

>2.why do I need to use the encode function before the from_to
>function($octet = encode("utf8", $_);)

See above.

>
>I thought that the code below would convert correctly, but it does not:
>use Encode qw(encode decode find_encoding from_to);
>
>$file = shift;
>open my $in,  $file;
>
>while ($str = <$in>) {
>  chomp($str);
>  from_to($octet, "utf8","big5");
>  print "$octet\n";

Your loop calls from_to on $octet, which is never assigned. Presumably,
since from_to modifies its string argument "in place" (ugh), you meant:

$file = shift;
open my $in,  $file;
while (defined($str = <$in>)) {
   from_to($str, "utf8", "big5");
   print $str;
}

That should work unless you have something which causes open to assume 
some encoding.
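
As a small illustration of the in-place behaviour mentioned above
(the sample octets and variable names are mine): from_to rewrites the
scalar you pass it and returns the length of the result in octets, not
the converted string.

```perl
use strict;
use warnings;
use Encode qw(from_to);

my $str = "\xe4\xb8\xad";                   # UTF-8 octets for U+4E2D
my $len = from_to($str, "utf8", "big5");    # rewrites $str, returns new length

# Show the converted octets as hex.
printf "%d octets: %s\n", $len,
    join " ", map { sprintf "%02X", ord } split //, $str;
```

This should print "2 octets: A4 A4". Assigning the return value of
from_to back to the string is therefore a bug: you would end up with
the number 2 instead of the big5 data.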
 
>
>}
>close $in;
>
>Thank you
>Dana
