Christophe,
Thanks for your mail. So far Encode::Guess does not guess the
encoding of a filehandle, not because it is impossible but because
Encode::Guess takes a very conservative, even paranoid, approach.
For example, ambiguity raises an exception rather than falling back
to a preferred encoding.
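As a rough illustration of that conservatism, here is a minimal sketch. It assumes the sample bytes below (the UTF-8 encoding of "été") and adds latin1 as an extra suspect; both choices are mine, for illustration only. Those bytes are valid UTF-8 and valid Latin-1 alike, so guess_encoding should hand back a plain error string instead of an encoding object.

```perl
use strict;
use warnings;
use Encode::Guess;    # exports guess_encoding

# Assumed sample data: the UTF-8 bytes for "été". They also decode
# cleanly as Latin-1, so the two suspects cannot be told apart.
my $bytes = "\xC3\xA9t\xC3\xA9";

# Default suspects (ascii, utf8) plus latin1 as an additional suspect.
my $result = guess_encoding($bytes, 'latin1');

if (ref $result) {
    print "guessed: ", $result->name, "\n";
}
else {
    # On failure $result is a message string, not an object.
    print "no guess: $result\n";
}
```

This is why a caller typically checks `ref` on the return value and dies (or returns) when it is not an object.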
One way to enable guessing for a filehandle goes like this:
use Encode::Guess;    # exports guess_encoding

sub open_and_guess {
    my $filename = shift;
    open my $fh, "<:raw", $filename or return;    # or die if you like
    my $head = <$fh>;    # may not work for UTF-(16|32)
    my $enc  = guess_encoding($head);
    ref $enc or die $enc;    # $enc is an error message here; or return $enc
    my $encname = $enc->name;
    seek $fh, 0, 0;    # rewind so the caller reads from the start
    binmode $fh, ":encoding($encname)";
    return $fh;
}
Here we open, guess, and reopen, but this does not work in the
general case: not all files are seekable and reopenable (e.g. pipes
and sockets).
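For the UTF-16 versus UTF-32 question, the name reported by the guessed encoding object can be dropped straight into an :encoding() layer. The sketch below assumes the data starts with a BOM; layer_for() is a hypothetical helper name of mine, not part of Encode::Guess, and the sample strings are made up for the demonstration.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Encode::Guess;    # exports guess_encoding

# Hypothetical helper: map raw bytes to a PerlIO layer string,
# or return nothing if Encode::Guess cannot decide.
sub layer_for {
    my ($bytes) = @_;
    my $enc = guess_encoding($bytes);
    return unless ref $enc;    # on failure $enc is an error string
    return ':encoding(' . $enc->name . ')';
}

# Assumed sample data: encode() with the endian-neutral names
# "UTF-16" and "UTF-32" prepends a BOM to the output.
my $utf16 = encode('UTF-16', "hello\n");
my $utf32 = encode('UTF-32', "hello\n");

print layer_for($utf16), "\n";
print layer_for($utf32), "\n";
```

With a BOM present the reported name is the endian-neutral one, and the matching :encoding() layer uses the BOM to work out the byte order when reading.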
To learn more about Encode::Guess, see
http://search.cpan.org/~dankogai/Encode-2.12/lib/Encode/Guess.pm
Yours,
Dan the Encode Maintainer
On Oct 24, 2005, at 19:12, HERMIER Christophe wrote:
Hello,
I am using the Encode::Guess module to detect the encoding of a
file before opening it.
Basically, I believe that I can have various sorts of Unicode
encodings or Latin-1.
What I want to get is an encoding string to give back to "open".
My code goes like this:
my $codage = guess_encoding($debut);
if ( UNIVERSAL::isa( $codage, "Encode::utf8" ) )
{
    $codage = ":utf8";
}
elsif ( UNIVERSAL::isa( $codage, "Encode::Unicode" ) )
{
    $codage = ":encoding(utf-16)";
}
else
{
    $codage = ":encoding(iso-8859-1)";
}
The problem is with the "Encode::Unicode" case: I don't know whether
it is UTF-16LE or UTF-16BE, and it could even be UTF-32.
Is there a way to know that?
BTW, I checked your homepage (http://www.dan.co.jp/) first, but it
does not seem to work?
Regards,
Christophe.