On Fri, Oct 7, 2011 at 4:54 AM, marcos rebelo <ole...@gmail.com> wrote:
> Hi all
>
> I'm trying to get the info from a PDF with code like:
>
> #######################################################
>
> ...
> use Data::Dumper;
> use PDF::API2;
> ...
> my $pdf = PDF::API2->open('/home/.../PDF.pdf');
> print Dumper +{ $pdf->info() };
>
> #######################################################
>
> This code gets me something like:
>
> #######################################################
>
> $VAR1 = {
>     'Subject' => 'my subject',
>     'CreationDate' => 'D:20111006161347+02\'00\'',
>     'Producer' => 'LibreOffice 3.3',
>     'Creator' => 'Writer',
>     'Author' => 'Marcos Rebelo',
>     'Title' => 'my title',
>     'Keywords' => 'my keywords'
> };
>
> #######################################################
>
> Unfortunately someone has the code: < use encoding 'utf8'; >
>
> and now I get:
>
> #######################################################
>
> $VAR1 = {
>     'Subject' => "\x{fffd}\x{fffd}my subject",
>     'CreationDate' => 'D:20111006161347+02\'00\'',
>     'Producer' => "\x{fffd}\x{fffd}LibreOffice 3.3",
>     'Creator' => "\x{fffd}\x{fffd}Writer",
>     'Author' => "\x{fffd}\x{fffd}Marcos Rebelo",
>     'Title' => "\x{fffd}\x{fffd}my title",
>     'Keywords' => "\x{fffd}\x{fffd}my keywords"
> };
>
> #######################################################
>
> I can't remove the < use encoding 'utf8'; >, but I need to clean the hash.

Find a way. Seriously! use encoding ...; is broken. A (mostly) drop-in
replacement would be:

    use utf8;
    use open qw( :encoding(UTF-8) );

If that's not feasible, well... I haven't tried this, but localizing the
${^ENCODING} variable might do:

    {
        local ${^ENCODING};
        my $pdf = PDF::API2->open('/home/.../PDF.pdf');
    }

> How can I clean the hash?

    use charnames qw( :full );
    use PDF::API2;
    ...
    # info() returns a list of key/value pairs (hence the +{ ... } in your
    # Dumper call), so copy it into a hash before cleaning the values.
    my %info = $pdf->info();
    tr/\N{REPLACEMENT CHARACTER}//d for values %info;
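
If you want to check the cleanup step on its own, without a PDF at hand,
here is a minimal sketch. The %info hash is a hypothetical stand-in that I
filled with values copied from your dump; it is not what PDF::API2 itself
produces. Note that this deletes every U+FFFD in a value, not only the
leading pair.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the hash built from $pdf->info(); the values
# mimic the damaged output quoted above.
my %info = (
    Title        => "\x{fffd}\x{fffd}my title",
    Author       => "\x{fffd}\x{fffd}Marcos Rebelo",
    CreationDate => "D:20111006161347+02'00'",
);

# values() returns aliases to the hash elements, so tr///d with the /d
# flag removes U+FFFD from each value in place.
tr/\x{FFFD}//d for values %info;

binmode STDOUT, ':encoding(UTF-8)';
print "$_ => $info{$_}\n" for sort keys %info;
```

I wrote the character as \x{FFFD} here so the sketch does not depend on
charnames; it is the same code point as \N{REPLACEMENT CHARACTER}.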