Chas. Owens wrote:
Data::Dumper is dumping the internal format. To ensure compatibility, it is
using the \x{df} escape to represent LATIN SMALL LETTER SHARP S. To see it
rendered as a character, just print it:
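A minimal sketch of that idea (not the code from the original message), assuming the value is a decoded character string such as XMLin returns. The variable names are made up for illustration, and the exact escaping Dumper chooses may differ between Data::Dumper versions:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use Encode qw(decode);

# Hypothetical stand-in for the value XMLin returns: the UTF-8 bytes of
# the word decoded into a four-character string ending in U+00DF.
my $word = decode('UTF-8', "gro\xc3\x9f");

# Dumper produces Perl source text for debugging; it may escape
# non-ASCII characters instead of emitting them literally.
my $dumped = Dumper($word);

# Encode characters as UTF-8 on the way out to the terminal.
binmode STDOUT, ':encoding(UTF-8)';

print $dumped;      # the debugging representation
print "$word\n";    # the string itself, rendered as characters
```

The string in memory is the same in both cases; only the representation chosen for output differs.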
Thanks! That kinda works:
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
use utf8;
use XML::Simple;
use Data::Dumper;
my $xml = XML::Simple->new;
my $data = $xml->XMLin("test.xml");
open my $fh, ">", 'pout' or die "Cannot open pout: $!";
binmode $fh, ":encoding(utf-8)";
print $fh Dumper($data);
print Dumper($data);
print $fh $data->{'Regaletikett_ausgeben'};
close $fh;
if($data->{'Regaletikett_ausgeben'} eq 'groß') {
say 'ist groß';
} else {
say 'nicht groß';
}
say 'ok';
say 'test-1: äöüÄÖÜß';
say "test-2: äöüÄÖÜß";
print "test-3: äöüÄÖÜß\n";
exit 0;
Output is:
$VAR1 = {
'Regaletikett_ausgeben' => "gro\x{df}",
'Mitarbeiter_inv' => '5449000134264',
'Bezeichnung1' => {},
'Stationsnummer' => 'Infostand',
'Erfassung' => {
'Lagerstaette' => '5449000134264',
'Artikel_erfassen' => {},
'Artikelstapel' => {
'Etikettentyp' => {},
'EAN_Artikel' => '5449000134264',
'Menge' => '20',
'Preis' => '10.0'
}
},
'meta' => {
'instanceID' => 'uuid:ee1bd852-37ee-4965-a097-50130cf6dac7',
'http-equiv' => 'content-type',
'content' => 'text/html; charset=UTF-8'
},
'id' => 'build_Inventur_1469705446'
};
ist gro
ok
test-1: �
test-2: �
test-3: �
In case you can't see it: the test output shows a single unknown
character instead of äöüÄÖÜß. Now 'env' says:
[...]
LANG=de_DE.utf8
[...]
I'm looking at an xterm window connected via ssh to a remote host, on
which an instance of tmux is running that I'm attached to. I can type
all the above letters on the command line just fine. 'file' says:
xmlread-4.pl: Perl script, UTF-8 Unicode text executable
pout: UTF-8 Unicode text
When I load pout into emacs, the ß shows up correctly. When I 'cat pout',
the ß is displayed correctly in the terminal.
So which character encoding does perl use on STDOUT by default? That should
be utf-8 without any further ado, shouldn't it? When I add
binmode STDOUT, ":encoding(utf-8)";
the characters are displayed correctly in the terminal. Why would perl use
anything other than utf-8 by default?
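For what it is worth, perl does not derive an output encoding from LANG on its own. Unless a layer is requested (with binmode, the open pragma, the -C switch or PERL_UNICODE), STDOUT has no encoding layer, and characters below 256 are written as single Latin-1 bytes, which a UTF-8 terminal cannot decode. A small sketch of the difference, using in-memory filehandles as a stand-in for STDOUT (all names here are made up for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Render a byte string as space-separated hex values.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf('%02x', ord($_)) } split //, $bytes;
}

my $word = "gro\x{df}";    # four characters, the last one is U+00DF

# Handle without an encoding layer, like STDOUT before binmode.
my $raw = '';
open my $plain, '>', \$raw or die "Cannot open in-memory handle: $!";
print {$plain} $word;
close $plain;

# Handle with a UTF-8 encoding layer, like STDOUT after binmode.
my $encoded = '';
open my $utf8, '>:encoding(UTF-8)', \$encoded
    or die "Cannot open in-memory handle: $!";
print {$utf8} $word;
close $utf8;

print 'no layer:    ', hex_bytes($raw),     "\n";    # 67 72 6f df
print 'UTF-8 layer: ', hex_bytes($encoded), "\n";    # 67 72 6f c3 9f
```

The lone byte df is not valid UTF-8, which would explain the replacement character in the terminal, while c3 9f is the UTF-8 encoding of ß.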
#!/usr/bin/perl
use strict;
use feature 'say';
use XML::Simple;
# warnings should come last to handle any warnings registered by the modules above
use warnings;
binmode STDOUT, ":encoding(UTF-8)";
my $xml = XML::Simple->new;
my $data = $xml->XMLin("test.xml");
say $data->{Regaletikett_ausgeben};
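An alternative to calling binmode on each handle is the open pragma, which can set the layer for the standard handles in one line. This is only a sketch under the assumption that everything the script reads and writes should be UTF-8; the variable names are illustrative:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Apply a UTF-8 encoding layer to STDIN, STDOUT and STDERR, and make it
# the default for handles opened later in this lexical scope.
use open ':std', ':encoding(UTF-8)';

my $word = "gro\x{df}";    # character string ending in U+00DF
print "$word\n";           # encoded as UTF-8 on output

# Inspect which layers are now active on STDOUT.
my @layers = PerlIO::get_layers(STDOUT);
print "STDOUT layers: @layers\n";
```

With this in place the explicit binmode STDOUT line is no longer needed, though keeping it does no harm.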
On Thu, Jul 28, 2016 at 9:05 AM hw <[email protected]> wrote:
Hi,
I would like to read XML files which look like this:
<?xml version='1.0' ?>
<data id="build_Inventur_1469705446">
<meta
http-equiv="content-type" content="text/html; charset=UTF-8">
<instanceID>uuid:ee1bd852-37ee-4965-a097-50130cf6dac7</instanceID>
</meta>
<Stationsnummer>Infostand</Stationsnummer>
<Mitarbeiter_inv>5449000134264</Mitarbeiter_inv>
<Bezeichnung1/>
<Regaletikett_ausgeben>groß</Regaletikett_ausgeben>
<Erfassung>
<Artikel_erfassen/>
<Lagerstaette>5449000134264</Lagerstaette>
<Artikelstapel>
<EAN_Artikel>5449000134264</EAN_Artikel>
<Preis>10.0</Preis>
<Menge>20</Menge>
<Etikettentyp/>
</Artikelstapel>
</Erfassung>
</data>
There is an Umlaut, ß, supposed to be at
<Regaletikett_ausgeben>groß</Regaletikett_ausgeben>
which is apparently impossible to read. The following program ...
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
use XML::Simple;
use Data::Dumper;
my $xml = XML::Simple->new;
my $data = $xml->XMLin("test.xml");
open my $fh, ">", 'pout' or die "Cannot open pout: $!";
print $fh Dumper($data);
close $fh;
print Dumper($data);
exit 0;
... gives me this output:
$VAR1 = {
'Bezeichnung1' => {},
'id' => 'build_Inventur_1469705446',
'Stationsnummer' => 'Infostand',
'meta' => {
'content' => 'text/html; charset=UTF-8',
'http-equiv' => 'content-type',
'instanceID' =>
'uuid:ee1bd852-37ee-4965-a097-50130cf6dac7'
},
'Mitarbeiter_inv' => '5449000134264',
'Regaletikett_ausgeben' => "gro\x{df}",
'Erfassung' => {
'Artikelstapel' => {
'Menge' => '20',
'Preis' => '10.0',
'EAN_Artikel' =>
'5449000134264',
'Etikettentyp' => {}
},
'Artikel_erfassen' => {},
'Lagerstaette' => '5449000134264'
}
};
I'm not getting any better results when adding an encoding tag to the
XML file or when writing the Dumper output to a file.
Is it impossible to use umlauts in XML files?
--
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
http://learn.perl.org/