Chas. Owens wrote:
Data::Dumper is dumping the internal format.  To ensure compatibility, it is 
using the \x{df} escape to represent LATIN SMALL LETTER SHARP S. To see it 
rendered as a character, just print it:

Thanks!  That kinda works:


#!/usr/bin/perl

use strict;
use warnings;

use feature 'say';
use utf8;

use XML::Simple;
use Data::Dumper;


my $xml = XML::Simple->new;
my $data = $xml->XMLin("test.xml");

open my $fh, ">", 'pout' or die "Cannot open pout: $!";
binmode $fh, ":encoding(UTF-8)";

print $fh Dumper($data);

print Dumper($data);
print $fh $data->{'Regaletikett_ausgeben'};
close $fh;


if($data->{'Regaletikett_ausgeben'} eq 'groß') {
  say 'ist groß';
} else {
  say 'nicht groß';
}

say 'ok';

say 'test-1: äöüÄÖÜß';
say "test-2: äöüÄÖÜß";
print "test-3: äöüÄÖÜß\n";


exit 0;


Output is:


$VAR1 = {
          'Regaletikett_ausgeben' => "gro\x{df}",
          'Mitarbeiter_inv' => '5449000134264',
          'Bezeichnung1' => {},
          'Stationsnummer' => 'Infostand',
          'Erfassung' => {
                         'Lagerstaette' => '5449000134264',
                         'Artikel_erfassen' => {},
                         'Artikelstapel' => {
                                            'Etikettentyp' => {},
                                            'EAN_Artikel' => '5449000134264',
                                            'Menge' => '20',
                                            'Preis' => '10.0'
                                          }
                       },
          'meta' => {
                    'instanceID' => 'uuid:ee1bd852-37ee-4965-a097-50130cf6dac7',
                    'http-equiv' => 'content-type',
                    'content' => 'text/html; charset=UTF-8'
                  },
          'id' => 'build_Inventur_1469705446'
        };
ist gro
       ok
test-1: �
test-2: �
test-3: �
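
For what it's worth, the "gro\x{df}" in the dump seems to be just the escaped spelling of the same four-character string, and 0xDF is the 223 from the &#223; in the XML. A minimal sketch, assuming the script file is saved as UTF-8 (the variable names are made up for illustration):

```perl
use strict;
use warnings;
use utf8;    # string literals in this file are UTF-8

my $escaped = "gro\x{df}";    # spelling used in the Dumper output
my $literal = 'groß';         # spelling typed in the source

# Both spell the same characters; U+00DF is decimal 223, as in &#223;
my $same = $escaped eq $literal ? 1 : 0;
my $code = ord substr $literal, 3, 1;

print "same string: $same, code point: $code\n";
```

This only prints ASCII, so it does not depend on how STDOUT is set up.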


In case you can't see it:  The test-printing shows a single unknown
character instead of äöüÄÖÜß.  Now 'env' says:


[...]
LANG=de_DE.utf8
[...]


I'm looking at an xterm window which is connected via ssh to a remote host
on which an instance of tmux is running, to which I'm attached.  I can type
all the above letters on the command line just fine.  'file' says:


xmlread-4.pl: Perl script, UTF-8 Unicode text executable
pout: UTF-8 Unicode text


When I load pout into emacs, the ß shows up correctly.  When I 'cat pout',
the ß is displayed correctly in the terminal.

So which character encoding on STDOUT does perl use by default?  That should
be utf-8 without any further ado, shouldn't it?  When I add


binmode STDOUT, ":encoding(utf-8)";


the characters are displayed correctly in the terminal.  Why would perl use
something other than utf-8 by default?
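
As far as I understand it, perl puts no encoding layer on STDOUT unless told to, and characters below 0x100 are then written as single bytes, which a UTF-8 terminal cannot decode. A small sketch of the difference, using the core Encode module (the helper hex_bytes and the variable names are made up for illustration):

```perl
use strict;
use warnings;
use utf8;    # string literals in this file are UTF-8
use Encode qw(encode);

my $str = 'groß';

# Render a byte string as space-separated hex.
sub hex_bytes { join ' ', map { sprintf '%02x', ord } split //, shift }

# One byte per character, as written when no layer is set ...
my $latin1_hex = hex_bytes(encode('ISO-8859-1', $str));
# ... versus the two-byte UTF-8 sequence the terminal expects for ß.
my $utf8_hex   = hex_bytes(encode('UTF-8', $str));

print "single bytes: $latin1_hex\n";
print "UTF-8 bytes:  $utf8_hex\n";
```

The lone byte df at the end of the first line is not valid UTF-8 on its own, which would explain the unknown character in the terminal.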



#!/usr/bin/perl

use strict;
use feature 'say';

use XML::Simple;

#warnings should come last to handle any registered warnings in previous modules
use warnings;

binmode STDOUT, ":encoding(UTF-8)";

my $xml = XML::Simple->new;
my $data = $xml->XMLin("test.xml");

say $data->{Regaletikett_ausgeben};
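
An alternative to calling binmode on each handle is the open pragma, which can set the layer for STDIN, STDOUT and STDERR in one go. A sketch only, assuming a UTF-8 terminal and a script file saved as UTF-8:

```perl
use strict;
use warnings;
use utf8;                              # literals in this file are UTF-8
use open qw(:std :encoding(UTF-8));    # encode/decode the standard handles

print "test: äöüÄÖÜß\n";
```

Files opened later in the same lexical scope pick up the same default layer, so the explicit binmode on $fh would no longer be needed there.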


On Thu, Jul 28, 2016 at 9:05 AM hw <h...@gc-24.de> wrote:


    Hi,

    I would like to read XML files which look like this:


    <?xml version='1.0' ?>
    <data id="build_Inventur_1469705446">
        <meta
            http-equiv="content-type" content="text/html; charset=UTF-8">
          <instanceID>uuid:ee1bd852-37ee-4965-a097-50130cf6dac7</instanceID>
        </meta>
        <Stationsnummer>Infostand</Stationsnummer>
        <Mitarbeiter_inv>5449000134264</Mitarbeiter_inv>
        <Bezeichnung1/>
        <Regaletikett_ausgeben>gro&#223;</Regaletikett_ausgeben>
        <Erfassung>
          <Artikel_erfassen/>
          <Lagerstaette>5449000134264</Lagerstaette>
          <Artikelstapel>
            <EAN_Artikel>5449000134264</EAN_Artikel>
            <Preis>10.0</Preis>
            <Menge>20</Menge>
            <Etikettentyp/>
          </Artikelstapel>
        </Erfassung>
    </data>


    There is supposed to be a special character, ß, at


    <Regaletikett_ausgeben>gro&#223;</Regaletikett_ausgeben>



    which is apparently impossible to read.  The following program ...


    #!/usr/bin/perl

    use strict;
    use warnings;

    use feature 'say';

    use XML::Simple;
    use Data::Dumper;


    my $xml = XML::Simple->new;
    my $data = $xml->XMLin("test.xml");

    open my $fh, ">", 'pout';
    print $fh Dumper($data);
    close $fh;

    print Dumper($data);


    exit 0;


    ... gives me this output:


    $VAR1 = {
                'Bezeichnung1' => {},
                'id' => 'build_Inventur_1469705446',
                'Stationsnummer' => 'Infostand',
                'meta' => {
                          'content' => 'text/html; charset=UTF-8',
                          'http-equiv' => 'content-type',
                          'instanceID' => 'uuid:ee1bd852-37ee-4965-a097-50130cf6dac7'
                        },
                'Mitarbeiter_inv' => '5449000134264',
                'Regaletikett_ausgeben' => "gro\x{df}",
                'Erfassung' => {
                               'Artikelstapel' => {
                                                  'Menge' => '20',
                                                  'Preis' => '10.0',
                                                  'EAN_Artikel' => '5449000134264',
                                                  'Etikettentyp' => {}
                                                },
                               'Artikel_erfassen' => {},
                               'Lagerstaette' => '5449000134264'
                             }
              };


    I'm not getting any better results when adding an encoding tag to the
    XML file and when writing the Dumper output to a file.

    Is it impossible to use umlauts in XML files?

    --
    To unsubscribe, e-mail: beginners-unsubscr...@perl.org
    For additional commands, e-mail: beginners-h...@perl.org
    http://learn.perl.org/



