Chas. Owens wrote:
Data::Dumper is dumping the internal format. To ensure compatibility, it is using the \x{df} escape to represent LATIN SMALL LETTER SHARP S. To see it rendered as a character, just print it:
Thanks! That kinda works:

#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
use utf8;
use XML::Simple;
use Data::Dumper;

my $xml = new XML::Simple;
my $data = $xml->XMLin("test.xml");

open my $fh, ">", 'pout';
binmode $fh, ":encoding(utf-8)";
print $fh Dumper($data);
print Dumper($data);
print $fh $data->{'Regaletikett_ausgeben'};
close $fh;

if($data->{'Regaletikett_ausgeben'} eq 'groß') {
  say 'ist groß';
} else {
  say 'nicht groß';
}
say 'ok';

say 'test-1: äöüÄÖÜß';
say "test-2: äöüÄÖÜß";
print "test-3: äöüÄÖÜß\n";

exit 0;

Output is:

$VAR1 = {
  'Regaletikett_ausgeben' => "gro\x{df}",
  'Mitarbeiter_inv' => '5449000134264',
  'Bezeichnung1' => {},
  'Stationsnummer' => 'Infostand',
  'Erfassung' => {
    'Lagerstaette' => '5449000134264',
    'Artikel_erfassen' => {},
    'Artikelstapel' => {
      'Etikettentyp' => {},
      'EAN_Artikel' => '5449000134264',
      'Menge' => '20',
      'Preis' => '10.0'
    }
  },
  'meta' => {
    'instanceID' => 'uuid:ee1bd852-37ee-4965-a097-50130cf6dac7',
    'http-equiv' => 'content-type',
    'content' => 'text/html; charset=UTF-8'
  },
  'id' => 'build_Inventur_1469705446'
};
ist gro
ok
test-1: �
test-2: �
test-3: �

In case you can't see it: the test printing shows a single unknown character instead of äöüÄÖÜß.

Now 'env' says:

[...] LANG=de_DE.utf8 [...]

I'm looking at an xterm window which is connected via ssh to a remote host on which an instance of tmux is running, to which I'm attached. I can type all the above letters on the command line just fine.

'file' says:

xmlread-4.pl: Perl script, UTF-8 Unicode text executable
pout: UTF-8 Unicode text

When I load pout into emacs, the ß shows up correctly. When I 'cat pout', the ß is displayed correctly in the terminal.

So which character encoding does perl use on STDOUT by default? That should be UTF-8 without any further ado, shouldn't it?

When I add

binmode STDOUT, ":encoding(utf-8)";

the characters are displayed correctly in the terminal. Why would perl use something other than UTF-8 by default?
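As a rough illustration of the STDOUT question above: a handle with no encoding layer writes a character such as ß (U+00DF) as the single byte 0xDF, which is not valid UTF-8 on its own, so a UTF-8 terminal shows a placeholder. The sketch below only compares the byte sequences; the variable names are made up for the example and it assumes nothing beyond the core Encode module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# U+00DF (LATIN SMALL LETTER SHARP S) as a one-character string.
my $char = "\x{df}";

# What a handle without an encoding layer would emit: one byte per
# character for code points below 256.
my $raw_hex = join ' ', map { sprintf '%02x', ord } split //, $char;

# What a UTF-8 terminal expects: the explicitly encoded byte sequence.
my $utf8_hex = join ' ', map { sprintf '%02x', ord }
               split //, encode('UTF-8', $char);

print "characters: ", length($char), "\n";    # characters: 1
print "unencoded byte(s): $raw_hex\n";        # unencoded byte(s): df
print "UTF-8 bytes: $utf8_hex\n";             # UTF-8 bytes: c3 9f
```

Besides a per-handle binmode, putting `use open qw(:std :encoding(UTF-8));` near the top of the script, or running it as `perl -CS script.pl`, should apply the same UTF-8 layer to the standard handles.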
#!/usr/bin/perl
use strict;
use feature 'say';
use XML::Simple;
#warnings should come last to handle any registered warnings in previous modules
use warnings;

binmode STDOUT, ":encoding(UTF-8)";

my $xml = XML::Simple->new;
my $data = $xml->XMLin("test.xml");

say $data->{Regaletikett_ausgeben};

On Thu, Jul 28, 2016 at 9:05 AM hw <h...@gc-24.de> wrote:

Hi,

I would like to read XML files which look like this:

<?xml version='1.0' ?>
<data id="build_Inventur_1469705446">
  <meta http-equiv="content-type" content="text/html; charset=UTF-8">
    <instanceID>uuid:ee1bd852-37ee-4965-a097-50130cf6dac7</instanceID>
  </meta>
  <Stationsnummer>Infostand</Stationsnummer>
  <Mitarbeiter_inv>5449000134264</Mitarbeiter_inv>
  <Bezeichnung1/>
  <Regaletikett_ausgeben>groß</Regaletikett_ausgeben>
  <Erfassung>
    <Artikel_erfassen/>
    <Lagerstaette>5449000134264</Lagerstaette>
    <Artikelstapel>
      <EAN_Artikel>5449000134264</EAN_Artikel>
      <Preis>10.0</Preis>
      <Menge>20</Menge>
      <Etikettentyp/>
    </Artikelstapel>
  </Erfassung>
</data>

There is an Umlaut, ß, supposed to be at

<Regaletikett_ausgeben>groß</Regaletikett_ausgeben>

which is apparently impossible to read. The following program ...

#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
use XML::Simple;
use Data::Dumper;

my $xml = new XML::Simple;
my $data = $xml->XMLin("test.xml");

open my $fh, ">", 'pout';
print $fh Dumper($data);
close $fh;

print Dumper($data);

exit 0;

...
gives me this output:

$VAR1 = {
  'Bezeichnung1' => {},
  'id' => 'build_Inventur_1469705446',
  'Stationsnummer' => 'Infostand',
  'meta' => {
    'content' => 'text/html; charset=UTF-8',
    'http-equiv' => 'content-type',
    'instanceID' => 'uuid:ee1bd852-37ee-4965-a097-50130cf6dac7'
  },
  'Mitarbeiter_inv' => '5449000134264',
  'Regaletikett_ausgeben' => "gro\x{df}",
  'Erfassung' => {
    'Artikelstapel' => {
      'Menge' => '20',
      'Preis' => '10.0',
      'EAN_Artikel' => '5449000134264',
      'Etikettentyp' => {}
    },
    'Artikel_erfassen' => {},
    'Lagerstaette' => '5449000134264'
  }
};

I'm not getting any better results when adding an encoding tag to the XML file or when writing the Dumper output to a file.

Is it impossible to use umlauts in XML files?

--
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/
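A minimal sketch of why the "gro\x{df}" in the Dumper output does not mean the character was lost: Data::Dumper prints an escaped representation, while the string itself still holds four characters and compares equal to 'groß'. The decoded byte string here is only a stand-in for what an XML parser would return (an assumption, so the example runs without test.xml), and the variable names are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                       # the literal 'groß' below is UTF-8 source text
use Encode qw(decode);
use Data::Dumper;

binmode STDOUT, ':encoding(UTF-8)';

# Stand-in for a parsed XML value (assumption): the UTF-8 bytes of
# "groß" decoded into a character string.
my $value = decode('UTF-8', "gro\xc3\x9f");

# Dumper shows an escaped, ASCII-safe representation of the string.
local $Data::Dumper::Terse = 1;
print Dumper($value);

# The data itself is intact: four characters, equal to the literal.
my $matches = $value eq 'groß' ? 1 : 0;
print "length: ", length($value), "\n";
print $matches ? "ist groß\n" : "nicht groß\n";
```

Printing `$value` directly, rather than through Dumper, is what renders the ß as a character once STDOUT has a UTF-8 layer.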