On Fri, Oct 7, 2011 at 4:54 AM, marcos rebelo <ole...@gmail.com> wrote:

> Hi all
>
> I'm trying to get the info from a PDF with code like:
>
> #######################################################
>
> ...
> use Data::Dumper;
> use PDF::API2;
> ...
> my $pdf = PDF::API2->open('/home/.../PDF.pdf');
> print Dumper +{ $pdf->info() };
>
> #######################################################
>
> This code gets me something like:
>
> #######################################################
>
> $VAR1 = {
>          'Subject' => 'my subject',
>          'CreationDate' => 'D:20111006161347+02\'00\'',
>          'Producer' => 'LibreOffice 3.3',
>          'Creator' => 'Writer',
>          'Author' => 'Marcos Rebelo',
>          'Title' => 'my title',
>          'Keywords' => 'my keywords'
>        };
>
> #######################################################
>
> Unfortunately, someone has the code: < use encoding 'utf8'; >
>
> and now I get:
>
> #######################################################
>
> $VAR1 = {
>          'Subject' => "\x{fffd}\x{fffd}my subject",
>          'CreationDate' => 'D:20111006161347+02\'00\'',
>          'Producer' => "\x{fffd}\x{fffd}LibreOffice 3.3",
>          'Creator' => "\x{fffd}\x{fffd}Writer",
>          'Author' => "\x{fffd}\x{fffd}Marcos Rebelo",
>          'Title' => "\x{fffd}\x{fffd}my title",
>          'Keywords' => "\x{fffd}\x{fffd}my keywords"
>        };
>
> #######################################################
>
> I can't remove the < use encoding 'utf8'; >, but I need to clean the hash.
>
>
Find a way. Seriously! 'use encoding ...;' is broken. A (mostly) drop-in
replacement would be 'use utf8; use open qw( :encoding(UTF-8) );'
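
For illustration, here is a minimal sketch of what that replacement looks like
at the top of a script. The ':std' flag and the sample string are my own
additions, not part of the suggestion above; ':std' also applies the layer to
STDIN/STDOUT/STDERR, which 'use open' on its own does not do.

```perl
use strict;
use warnings;
use utf8;                              # string literals in this file are decoded as UTF-8
use open qw( :std :encoding(UTF-8) );  # default layer for open() in this scope, plus the std handles

# Hypothetical sample value, only here to show the effect of 'use utf8'.
my $name = "Zoë";

# length() counts characters rather than bytes once the literal is decoded.
printf "%s has %d characters\n", $name, length $name;
```

Unlike 'use encoding', neither pragma changes how byte strings coming from
other modules get upgraded, which is the part that bites here.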

If that's not feasible, well.. I haven't tried this, but localizing the
${^ENCODING} variable might do the trick:

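my $pdf;    # declared outside the block so it is still usable afterwards
{
    # Untested: hide the encoding set by 'use encoding' while the PDF is read.
    local ${^ENCODING};
    $pdf = PDF::API2->open('/home/.../PDF.pdf');
}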


> How can I clean the hash?
>

use charnames qw( :full );
use PDF::API2;
...

# info() returns a list of key/value pairs, so copy it into a hash first.
my %info = $pdf->info();
tr/\N{REPLACEMENT CHARACTER}//d for values %info;
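
The cleanup can be tried without a PDF at hand. The sketch below uses a
hand-built hash as a stand-in for the info hash (values copied from the dump
quoted above), and writes the replacement character as \x{fffd} so that it does
not depend on charnames. The name %info is just for illustration.

```perl
use strict;
use warnings;

# Stand-in for the info hash; in the real script this would come from $pdf->info().
my %info = (
    Title        => "\x{fffd}\x{fffd}my title",
    Author       => "\x{fffd}\x{fffd}Marcos Rebelo",
    CreationDate => "D:20111006161347+02'00'",
);

# 'values' yields aliases to the hash elements, so tr/// edits them in place.
# The /d flag deletes every U+FFFD REPLACEMENT CHARACTER it finds.
tr/\x{fffd}//d for values %info;

print "$_ => $info{$_}\n" for sort keys %info;
```

This only removes the symptom; values that contained real non-ASCII text may
still have been mangled on the way in.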
