Rami Friedman <[EMAIL PROTECTED]> writes:
>I need to read files written in a variety of charsets (Big5, Arabic,
>Hebrew, etc) and write their contents to an oracle database. This
>problem is easy to solve in Java where each feed gets converted to a
>ucs-2 string, but, if possible, I need to write the code in perl. Can
>this be done?
I am actively working on this at present.
The development-track perl is very close to being able to do it.
>The third edition of Programming Perl says locales and
>unicode don't mix well yet.
Locales are unfortunately rather underspecified.
Knowing the locale name does not tell you what the encoding is.
(If you know of a way please let me know!)
>I guess that means I cannot convert from
>Big5 to utf-8, for instance.
Work is in progress so that you will be able to say:
open(my $fh,"<:encoding(big5)",$name) or die "Cannot open $name: $!";
and then read Unicode characters out of the stream.
If you write them to a stream opened thus:
open(my $oh,">:utf8",$outname) or die "Cannot open $outname: $!";
then you will have converted the file.
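Putting the two opens together, the whole conversion might look roughly
like the sketch below. It assumes a perl in which the :encoding() and
:utf8 layers are available (5.8 or later, once this work is released).
convert_file is a made-up helper name for illustration, not something
perl provides, and the encoding name is assumed to be one that the
Encode module recognises.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of perl): copy $in_name to $out_name,
# decoding from $encoding on input and writing UTF-8 on output.
sub convert_file {
    my ($in_name, $encoding, $out_name) = @_;

    # The input layer turns the file's bytes into Unicode characters.
    open(my $fh, "<:encoding($encoding)", $in_name)
        or die "Cannot open $in_name: $!";

    # The output layer turns characters back into UTF-8 bytes.
    open(my $oh, ">:utf8", $out_name)
        or die "Cannot open $out_name: $!";

    while (my $line = <$fh>) {
        print {$oh} $line;
    }

    close($fh) or die "Cannot close $in_name: $!";
    close($oh) or die "Cannot close $out_name: $!";
    return;
}
```

A call such as convert_file("feed.big5", "big5", "feed.utf8") would then
do the job for one feed; the file names here are placeholders.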
Alternatively, there will be mechanisms to get UTF-8 encoded data
for storing into a database.
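One such mechanism, as a sketch only: the Encode module's decode() and
encode() functions can convert a string in memory without going through
a file. This assumes Encode is installed (it ships with perl 5.8 and
later). to_utf8_bytes is a hypothetical name, and how the result is
then bound to an Oracle column depends on the driver and is not shown.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not part of Encode): take raw bytes in the
# named source encoding and return the same text as UTF-8 bytes.
sub to_utf8_bytes {
    my ($bytes, $encoding) = @_;

    # Bytes in the source charset -> Perl character string.
    my $chars = decode($encoding, $bytes);

    # Character string -> UTF-8 encoded bytes, ready to hand on.
    return encode("UTF-8", $chars);
}
```

The same function covers each feed by changing the encoding name, for
example "big5" for the Chinese files and "iso-8859-8" for the Hebrew
ones, provided that is the charset the files are really in.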
>Is that correct? Could I instead rely on
>the database driver to convert from the foreign charset to unicode?
>
>Thanks in advance to anyone who might be able to help.
--
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.