Re: XML::Feed.pm perl on CentOS 7
Hi Lars,

On Tue, 3 Sep 2019 08:06:36 +0300 Lars Noodén wrote:

> I'm not finding CPAN's XML::Feed.pm for perl 5 for centos 7 via yum.
>
> $ yum -q search all XML-Feed
> Warning: No matches found for: XML-Feed
>
> $ grep PRETTY /etc/os-release
> PRETTY_NAME="CentOS Linux 7 (Core)"
>
> Is there an additional package repository containing CPAN material for
> CentOS 7 which I can add?

See https://pkgs.org/download/perl-XML-Feed . Perhaps these pages will also help:

* https://perl-begin.org/topics/cpan/
* https://perl-begin.org/topics/cpan/wrappers-for-distributions/

> /Lars

--
Shlomi Fish http://www.shlomifish.org/

--
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/
XML::Feed.pm perl on CentOS 7
I'm not finding CPAN's XML::Feed.pm for perl 5 for centos 7 via yum.

$ yum -q search all XML-Feed
Warning: No matches found for: XML-Feed

$ grep PRETTY /etc/os-release
PRETTY_NAME="CentOS Linux 7 (Core)"

Is there an additional package repository containing CPAN material for CentOS 7 which I can add?

/Lars
Re: Making use of XML::Feed->parse() more robust
> Is this something which should be handled differently in the module itself?

Possibly. But what the author considers to be fatal is up to them, so I wouldn’t say “should” but “could” ... that’s why Perl has eval ;-). You can go into the module code, find the “die” and maybe change it to “warn”. It’d just be a mod you’d have to maintain. You could also go to CPAN and offer a suggestion to the author(s), or even a patch. TIMTOWTDI

On Thu, Aug 29, 2019 at 11:20 AM Lars Noodén wrote:

> On 8/28/19 7:33 PM, Andy Bach wrote:
> > Look at eval blocks - lets you trap fatal errors from other code and not
> > die/abort yourself.
> > https://perldoc.perl.org/functions/eval.html
>
> Thanks. I went with an eval block since it was very quick to set up.
>
> > You can also write your own signal handling code
> > https://www.perl.com/article/37/2013/8/18/Catch-and-Handle-Signals-in-Perl/
>
> Is this something which should be handled differently in the module itself?
>
> /Lars

--
Andy Bach afb...@gmail.com
Not at my desk
Re: Making use of XML::Feed->parse() more robust
On 8/28/19 7:33 PM, Andy Bach wrote:
> Look at eval blocks - lets you trap fatal errors from other code and not
> die/abort yourself.
> https://perldoc.perl.org/functions/eval.html

Thanks. I went with an eval block since it was very quick to set up.

> You can also write your own signal handling code
> https://www.perl.com/article/37/2013/8/18/Catch-and-Handle-Signals-in-Perl/

Is this something which should be handled differently in the module itself?

/Lars
Re: Making use of XML::Feed->parse() more robust
Look at eval blocks - they let you trap fatal errors from other code and not die/abort yourself.
https://perldoc.perl.org/functions/eval.html

You can also write your own signal handling code
https://www.perl.com/article/37/2013/8/18/Catch-and-Handle-Signals-in-Perl/

On Wed, Aug 28, 2019 at 8:42 AM Lars Noodén wrote:

> I've been using the CPAN module XML::Feed to parse Atom and RSS feeds.
> Some of the feeds it fetches are a little broken from time to time and
> when that happens the parser produces an error and stops the program.
> I'd like it to just keep going.
>
> I am invoking the parser inside a subroutine like this:
>
>     my $feed = XML::Feed->parse(URI->new($uri)) or return(0);
>
> which I thought would allow the subroutine to simply return failure
> and let the program keep going. But it does not. Instead it shows an
> error and quits. Here is an error from a feed which is broken today but
> not yesterday and probably will be ok again tomorrow:
>
>     not well-formed (invalid token) at line 142, column 76,
>     byte 30070 at /usr/lib/x86_64-linux-gnu/perl5/5.24/XML/Parser.pm
>     line 187.
>
>     ... foo.pl: exited with status 255; aborting
>
> I have no control over the feeds and their formats or contents. So, are
> there instead any recommendations on how I can have perl trap the error
> or otherwise prevent malformed XML from bringing the whole program to a
> halt?
>
> Thanks,
> Lars

--
Andy Bach afb...@gmail.com
Not at my desk
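The eval suggestion above can be sketched roughly as follows. This is only a minimal illustration, not the code Lars ended up with: parse_feed() is a hypothetical stand-in for XML::Feed->parse(URI->new($uri)) so that the sketch runs with core Perl only, and the example.org URIs are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for XML::Feed->parse(URI->new($uri)).
# It dies on "broken" input, much as the XML parser does on malformed XML.
sub parse_feed {
    my ($uri) = @_;
    die "not well-formed (invalid token) at line 142\n" if $uri =~ /broken/;
    return { uri => $uri };    # pretend this is the feed object
}

# Returns the feed on success, or nothing after logging the error.
sub fetch_feed {
    my ($uri) = @_;

    # eval traps the die; the value of the block is the parser's result.
    my $feed = eval { parse_feed($uri) };
    if (!defined $feed) {
        my $error = $@ || "parser returned nothing\n";
        warn "skipping $uri: $error";
        return;
    }
    return $feed;
}

for my $uri ('http://example.org/ok.xml', 'http://example.org/broken.xml') {
    my $feed = fetch_feed($uri) or next;    # a broken feed no longer ends the run
    print "parsed $uri\n";
}
```

With the real module, only the body of parse_feed() would change; the "or return(0)" from the original question never gets a chance to run because the die happens before the call returns.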
Making use of XML::Feed->parse() more robust
I've been using the CPAN module XML::Feed to parse Atom and RSS feeds. Some of the feeds it fetches are a little broken from time to time and when that happens the parser produces an error and stops the program. I'd like it to just keep going.

I am invoking the parser inside a subroutine like this:

    my $feed = XML::Feed->parse(URI->new($uri)) or return(0);

which I thought would allow the subroutine to simply return failure and let the program keep going. But it does not. Instead it shows an error and quits. Here is an error from a feed which is broken today but not yesterday and probably will be ok again tomorrow:

    not well-formed (invalid token) at line 142, column 76,
    byte 30070 at /usr/lib/x86_64-linux-gnu/perl5/5.24/XML/Parser.pm
    line 187.

    ... foo.pl: exited with status 255; aborting

I have no control over the feeds and their formats or contents. So, are there instead any recommendations on how I can have perl trap the error or otherwise prevent malformed XML from bringing the whole program to a halt?

Thanks,
Lars
RE: XML::LibXML and comments
Hi Lawrence,

Works great, thanks! I never thought of appending the comment directly to the $doc; I just assumed I needed to set the root first.

Regards,
John

From: Lawrence Statton
Sent: Monday, September 10, 2018 9:44 PM
To: beginners@perl.org
Subject: Re: XML::LibXML and comments

On Sep 10, 2018, at 6:33 AM, John Cortland Morgan <johncortland.mor...@ericsson.com> wrote:

Hi,

I'm trying to place a comment directly after the XML declaration using XML::LibXML, but cannot seem to manage, always receiving the error:

    setDocumentElement: ELEMENT node required at .../LibXML.pm line 1393

What I would like:

    testing

My relevant code thus far:

    my $dom  = XML::LibXML::Document->new( "1.0", "UTF-8");
    my $root = XML::LibXML::Comment->new( "test comment" );
    $dom->setDocumentElement($root);

Any help would be greatly appreciated. I'm kinda stuck with using XML::LibXML though.

John

The root of a document cannot be a comment, however you can add a comment with

    $dom->addChild($dom->createComment('test comment'))

#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML::Document->new(qw/1.0 utf-8/);
$doc->appendChild($doc->createComment('test comment'));
$doc->setDocumentElement(my $e_products = $doc->createElement('products'));
$e_products->appendTextChild(field => 'testing');
print $doc->toString(1);
Re: XML::LibXML and comments
> On Sep 10, 2018, at 6:33 AM, John Cortland Morgan wrote:
>
> Hi,
>
> I'm trying to place a comment directly after the XML declaration using
> XML::LibXML, but cannot seem to manage, always receiving the error:
>
>     setDocumentElement: ELEMENT node required at .../LibXML.pm line 1393
>
> What I would like:
>
>     testing
>
> My relevant code thus far:
>
>     my $dom  = XML::LibXML::Document->new( "1.0", "UTF-8");
>     my $root = XML::LibXML::Comment->new( "test comment" );
>     $dom->setDocumentElement($root);
>
> Any help would be greatly appreciated. I'm kinda stuck with using XML::LibXML
> though.
>
> John

The root of a document cannot be a comment, however you can add a comment with

    $dom->addChild($dom->createComment('test comment'))

#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML::Document->new(qw/1.0 utf-8/);
$doc->appendChild($doc->createComment('test comment'));
$doc->setDocumentElement(my $e_products = $doc->createElement('products'));
$e_products->appendTextChild(field => 'testing');
print $doc->toString(1);
Re: XML::LibXML and comments
Take a look at:
https://stackoverflow.com/questions/19411152/libxml-inserting-a-comment

    use XML::LibXML;

    my $doc  = XML::LibXML::Document->new;
    my $root = $doc->createElement("doc");
    $doc->setDocumentElement($root);
    $root->appendChild($doc->createElement("JJ"));
    $root->appendChild($doc->createComment("comment"));
    print $doc->toString(1);

So, maybe your comment needs an element to be a child of, not the root node itself.

On Mon, Sep 10, 2018 at 6:50 AM John Cortland Morgan <johncortland.mor...@ericsson.com> wrote:

> Hi,
>
> I'm trying to place a comment directly after the XML declaration using
> XML::LibXML, but cannot seem to manage, always receiving the error:
>
>     setDocumentElement: ELEMENT node required at .../LibXML.pm line 1393
>
> What I would like:
>
>     testing
>
> My relevant code thus far:
>
>     my $dom  = XML::LibXML::Document->new( "1.0", "UTF-8");
>     my $root = XML::LibXML::Comment->new( "test comment" );
>     $dom->setDocumentElement($root);
>
> Any help would be greatly appreciated. I'm kinda stuck with using
> XML::LibXML though.
>
> John

--
Andy Bach, afb...@gmail.com
608 658-1890 cell
608 261-5738 wk
XML::LibXML and comments
Hi,

I'm trying to place a comment directly after the XML declaration using XML::LibXML, but cannot seem to manage, always receiving the error:

    setDocumentElement: ELEMENT node required at .../LibXML.pm line 1393

What I would like:

    testing

My relevant code thus far:

    my $dom  = XML::LibXML::Document->new( "1.0", "UTF-8");
    my $root = XML::LibXML::Comment->new( "test comment" );
    $dom->setDocumentElement($root);

Any help would be greatly appreciated. I'm kinda stuck with using XML::LibXML though.

John
Re: Using __DATA__ with XML::LibXML
Hi,

I didn't understand what you did with the 'XML_Check' package that you defined. The use of the "special literal" __DATA__ is not clear, since I didn't see any usage of it in the 'XML_Check' package; I mean, you never read from it there. I didn't try running the script, but I can give you some hints to work with:

First: did you try printing the lines from the filehandle, to see whether you are accessing it?

    while (<DATA>) { print; }

Second: if you are importing the package into a script, for example, the text after __DATA__ in the package 'XML_Check' is accessible via the XML_Check::DATA filehandle, according to the documentation:

    "Text after __DATA__ may be read via the filehandle "PACKNAME::DATA", where "PACKNAME" is the package that was current when the __DATA__ token was encountered."

Maybe this will work.

Third: if you use a __DATA__ section in your main script, you can use it as you did. You can also reference it as <main::DATA>.

Hope that helps. Good luck

*Khalil Zakaria Zemmoura*
*Visiteur Médical EST*
*Laboratoire NOVOMEDIS*

On Thu, Jun 22, 2017 at 10:27 PM, SSC_perl <p...@surfshopcart.com> wrote:

> I think I'm losing it. I'm trying to do something simple here but
> I can't get it to work.
>
> I'm using XML::LibXML to verify XML sitemaps and it's working fine
> with an external XSD file. However, I'd like to add the schema to a
> __DATA__ section so that I only need a single file. However, no matter
> what I do, I get error messages***. I've tried the following:
>
>     my @data = <DATA>;
>     my $data = join ('', @data);
>     my $schema = XML::LibXML::Schema->new(string => $data);
>     -
>     my $schema = XML::LibXML::Schema->new(string => *DATA);
>     -
>     my $schema = XML::LibXML::Schema->new(string => \*DATA);
>
> but the only thing that works is:
>
>     my $schema_file = '/home/user/cron/sitemaps/xsd-schema.xsd';
>     my $schema = XML::LibXML::Schema->new(location => $schema_file);
>
> Why can't I get the DATA solution to work?
Using __DATA__ with XML::LibXML
I think I'm losing it. I'm trying to do something simple here but I can't get it to work.

I'm using XML::LibXML to verify XML sitemaps and it's working fine with an external XSD file. However, I'd like to add the schema to a __DATA__ section so that I only need a single file. However, no matter what I do, I get error messages***. I've tried the following:

    my @data = <DATA>;
    my $data = join ('', @data);
    my $schema = XML::LibXML::Schema->new(string => $data);
    -
    my $schema = XML::LibXML::Schema->new(string => *DATA);
    -
    my $schema = XML::LibXML::Schema->new(string => \*DATA);

but the only thing that works is:

    my $schema_file = '/home/user/cron/sitemaps/xsd-schema.xsd';
    my $schema = XML::LibXML::Schema->new(location => $schema_file);

Why can't I get the DATA solution to work?

Thanks,
Frank

*** Some of the error messages I've seen:

    Schemas parser error : Failed to parse the XML resource 'in_memory_buffer'.
    Entity: line 1: parser error : Start tag expected, '<' not found

---

The full script is as follows:

package XML_Check;

use XML::LibXML;
use FindBin qw($Bin);
use lib "$Bin/../../perl/Modules";
require EmailSender;

#
## Load YAML Config File
#
use YAML qw(LoadFile);
my $config       = LoadFile('/home/user/conf/config.yaml');
my $to_address   = $config->{'email'}{'report_to'};
my $from_address = $config->{'email'}{'report_from'};
##

sub verify_xml {
    my $document    = shift;
    my $schema_file = '/home/user/cron/sitemaps/xsd-schema.xsd';
    my $schema      = XML::LibXML::Schema->new(location => $schema_file);

    my $parser = XML::LibXML->new;
    my $doc    = $parser->parse_file($document);

    eval { $schema->validate($doc) };
    if ($@) {
        my $subject = 'XML Sitemap Error';
        my $message = "There was an error in the $document sitemap.";
        EmailSender::send_mail($subject, $message, $to_address, $from_address, 'text/plain');
    }

    return $@;
}

1;

__DATA__
http://www.w3.org/2001/XMLSchema; xmlns="http://www.sitemaps.org/schemas/sitemap/0.9; targetNamespace="http://www.sitemaps.org/schemas/sitemap/0.9;
elementFormDefault="qualified"> XML Schema for Sitemap files. Last Modifed 2008-03-26 Container for a set of up to 50,000 document elements. This is the root element of the XML file. Container for the data needed to describe a document to crawl. REQUIRED: The location URI of a document. The URI must conform to RFC 2396 (http://www.ietf.org/rfc/rfc2396.txt). OPTIONAL: The date the document was last modified. The date must conform to the W3C DATETIME format (http://www.w3.org/TR/NOTE-datetime). Example: 2005-05-10 Lastmod may also contain a timestamp. Example: 2005-05-10T17:33:30+08:00 OPTIONAL: Indicates how frequently the content at a particular URL is likely to change. The value "always" should be used to describe documents that change each time they are accessed. The value "never" should be used to describe archived URLs. Please note that web crawlers may not necessarily crawl pages marked "always" more often. Consider this element as a friendly suggestion and not a command. OPTIONAL: The priority of a particular URL relative to other pages on the same site. The value for this element is a number between 0.0 and 1.0 where 0.0 identifies the lowest priority page(s). The default priority of a page is 0.5. Priority is used to select between pages on your site. Setting a priority of 1.0 for all URLs will not help you, as the relative priority of pages on your site is what will be considered. -- To unsubscribe, e-mail: beginners-unsubscr...@perl.org For additional commands, e-mail: beginners-h...@perl.org http://learn.perl.org/
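As a rough illustration of the hints in this thread, the sketch below slurps a __DATA__ section into a string with core Perl only. The tiny schema after __DATA__ is a made-up placeholder, not the sitemap schema, and the XML::LibXML call is left as a comment so the sketch runs without that module installed. It also prints what a bare glob stringifies to, which may explain the "Start tag expected, '<' not found" message when *DATA or \*DATA is passed as the string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Slurp everything after __DATA__ into one string. DATA can only be read
# once without a seek, so keep the result in a variable.
my $xsd = do { local $/; <DATA> };

# A bare glob is not the schema text: it stringifies to its own name.
print "glob stringifies to: ", *DATA, "\n";
print "schema text starts with: ", substr($xsd, 0, 5), "\n";

# With XML::LibXML installed, the string could then be handed over as:
#   my $schema = XML::LibXML::Schema->new(string => $xsd);

__DATA__
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="urlset" type="xs:string"/>
</xs:schema>
```

Inside a module such as XML_Check the same read would go through the XML_Check::DATA handle, and a second call to a sub that reads it would find the handle already at end of file, so reading once at load time and caching the string seems the safer design.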
Re: XML::Simple Umlaute
Take a look at the -C argument for perl and the PERL_UNICODE environment variable in http://perldoc.perl.org/perlrun.html

Examine the difference between

    perl -E 'say "\x{df}"'

and

    PERL_UNICODE=O perl -E 'say "\x{df}"'

That said, if you are working with the web, why in the world are you sending UTF-8? HTML has entities for a reason. I would suggest using HTML::Entities instead of trying to send non-ASCII characters through who knows how many layers of things that can screw up UTF-8:

    perl -MHTML::Entities -E 'say encode_entities "\x{df}"'

On Tue, Aug 9, 2016 at 7:34 AM hw wrote:

> Chas. Owens wrote:
> > On Thu, Jul 28, 2016 at 10:55 AM Paul Johnson wrote:
> > > On Thu, Jul 28, 2016 at 10:23:19AM -0400, Chas. Owens wrote:
> > > snip
> > > > Also, this answer on StackOverflow by tchrist (Tom Christiansen, who I
> > > > would say knows the most about the intersection of Perl and Unicode)
> > > > is a good resource: http://stackoverflow.com/a/6163129/78259
> > >
> > > Quite. And utf8::all tries to encapsulate as much of that boilerplate
> > > as it can.
> >
> > I have always read that answer as a bit of an indictment of the idea of
> > "you should be able to load this module and everything will be fine".
> > Unicode is complex and trying to treat it like just another list of
> > characters is doomed to teeth gnashing and crying. Of course, even
> > treating it the way it should be leads to teeth gnashing and crying, but at
> > least that will be over the fact that humans suck (we can't even agree on
> > where þ should be sorted) as opposed to Perl sucking.
>
> When I have something like
>
>     print $cgi->p('Gebäudefläche:');
>
> in my source, which is correctly displayed everywhere else, I also
> need it correctly displayed in the web browser --- even particularly
> there because that is what the users are looking at.
>
> And that's all there is to it. It's really that simple.
Re: XML::Simple Umlaute
Chas. Owens wrote:
> On Thu, Jul 28, 2016 at 10:05 AM, hw wrote:
> snip
> > So which character encoding on STDOUT does perl use by default? That should
> > be utf-8 without any further ado, shouldn't it? When I add
> >
> >     binmode STDOUT, ":encoding(utf-8)";
> >
> > the characters are displayed correctly in the terminal. Why would perl use
> > something else than utf-8 by default?
>
> Take the following with a grain of salt. My knowledge is mostly
> hearsay and supposition with a dash of cargo cultism on this matter.
>
> Perl predates even Unicode (they both came out in '87). Unicode did
> not get much traction until the mid-nineties when people started
> realizing that UTF-8 (created in '92) was a good thing. So, for most
> of its early history, Perl used Latin1. It still does to a large
> extent for backwards compatibility reasons. To make Perl 5 a proper
> UTF-8 environment there are a number of knobs to pull and buttons to
> poke.
>
> You may find this video from YAPC NA 2016 enlightening:
> https://www.youtube.com/watch?v=TmTeXcEixEg
>
> Others that may be helpful (I haven't watched them, but I trust the speaker):
> https://www.youtube.com/watch?v=iZgqhVu72zc
> https://www.youtube.com/watch?v=X2FQHUHjo8M
>
> Also, this answer on StackOverflow by tchrist (Tom Christiansen, who I
> would say knows the most about the intersection of Perl and Unicode)
> is a good resource: http://stackoverflow.com/a/6163129/78259
>
> Hope this helps.

Thanks! That makes it really complicated to write applications which display data from a database via a web browser --- yet people have been doing this for a pretty long time now.

But no matter what I do, Umlaute are not displayed correctly throughout the whole web page: they are either wrong in the data from the database or in print statements or in the output of the CGI::FormBuilder.

There's probably no way to get it right :(
Re: XML::Simple Umlaute
Chas. Owens wrote:
> On Thu, Jul 28, 2016 at 10:55 AM Paul Johnson wrote:
> > On Thu, Jul 28, 2016 at 10:23:19AM -0400, Chas. Owens wrote:
> > snip
> > > Also, this answer on StackOverflow by tchrist (Tom Christiansen, who I
> > > would say knows the most about the intersection of Perl and Unicode)
> > > is a good resource: http://stackoverflow.com/a/6163129/78259
> >
> > Quite. And utf8::all tries to encapsulate as much of that boilerplate
> > as it can.
>
> I have always read that answer as a bit of an indictment of the idea of
> "you should be able to load this module and everything will be fine".
> Unicode is complex and trying to treat it like just another list of
> characters is doomed to teeth gnashing and crying. Of course, even
> treating it the way it should be leads to teeth gnashing and crying, but at
> least that will be over the fact that humans suck (we can't even agree on
> where þ should be sorted) as opposed to Perl sucking.

When I have something like

    print $cgi->p('Gebäudefläche:');

in my source, which is correctly displayed everywhere else, I also need it correctly displayed in the web browser --- even particularly there because that is what the users are looking at.

And that's all there is to it. It's really that simple.
Re: XML::Simple Umlaute
Paul Johnson wrote:
> On Thu, Jul 28, 2016 at 10:23:19AM -0400, Chas. Owens wrote:
> > On Thu, Jul 28, 2016 at 10:05 AM, hw wrote:
> > snip
> > > So which character encoding on STDOUT does perl use by default? That should
> > > be utf-8 without any further ado, shouldn't it? When I add
> > >
> > >     binmode STDOUT, ":encoding(utf-8)";
> > >
> > > the characters are displayed correctly in the terminal. Why would perl use
> > > something else than utf-8 by default?
>
> As a general rule, use "utf8::all" instead of just "utf8" and a lot of
> the problems go away.
>
> > Also, this answer on StackOverflow by tchrist (Tom Christiansen, who I
> > would say knows the most about the intersection of Perl and Unicode)
> > is a good resource: http://stackoverflow.com/a/6163129/78259
>
> Quite. And utf8::all tries to encapsulate as much of that boilerplate
> as it can.

Maybe that would work, but I can't very well go through all the programs and adjust them and experiment every time there is a problem like this. I need some sort of general switch to make perl use utf8 by default, as it should to begin with ...
Re: XML::Simple Umlaute
I'm not sure if it is possible to use Umlaute in XML files or not. Maybe this post will help you:
http://stackoverflow.com/questions/11772468/reading-xml-files-with-umlaut-chars

Is there a way to change the encoding to "iso-8859-1"?

Mike

On 7/28/2016 8:03 AM, beginners-digest-h...@perl.org wrote:

> Hi,
>
> I would like to read XML files which look like this:
>
>     uuid:ee1bd852-37ee-4965-a097-50130cf6dac7
>     Infostand
>     5449000134264
>     gro
>     5449000134264
>     5449000134264
>     10.0
>     20
>
> There is an Umlaut, ß, supposed to be at
>
>     gro
>
> which is apparently impossible to read. The following program ...
>
>     #!/usr/bin/perl
>     use strict;
>     use warnings;
>     use feature 'say';
>     use XML::Simple;
>     use Data::Dumper;
>
>     my $xml = new XML::Simple;
>     my $data = $xml->XMLin("test.xml");
>
>     open my $fh, ">", 'pout';
>     print $fh Dumper($data);
>     close $fh;
>
>     print Dumper($data);
>
>     exit 0;
>
> ... gives me this output:
>
>     $VAR1 = {
>       'Bezeichnung1' => {},
>       'id' => 'build_Inventur_1469705446',
>       'Stationsnummer' => 'Infostand',
>       'meta' => {
>         'content' => 'text/html; charset=UTF-8',
>         'http-equiv' => 'content-type',
>         'instanceID' => 'uuid:ee1bd852-37ee-4965-a097-50130cf6dac7'
>       },
>       'Mitarbeiter_inv' => '5449000134264',
>       'Regaletikett_ausgeben' => "gro\x{df}",
>       'Erfassung' => {
>         'Artikelstapel' => {
>           'Menge' => '20',
>           'Preis' => '10.0',
>           'EAN_Artikel' => '5449000134264',
>           'Etikettentyp' => {}
>         },
>         'Artikel_erfassen' => {},
>         'Lagerstaette' => '5449000134264'
>       }
>     };
>
> I'm not getting any better results when adding an encoding tag to the
> XML file and when writing the Dumper output to a file. Is it impossible
> to use Umlaute in XML files?
Re: XML::Simple Umlaute
On Thu, Jul 28, 2016 at 10:55 AM Paul Johnson wrote:

> On Thu, Jul 28, 2016 at 10:23:19AM -0400, Chas. Owens wrote:
> snip
> > Also, this answer on StackOverflow by tchrist (Tom Christiansen, who I
> > would say knows the most about the intersection of Perl and Unicode)
> > is a good resource: http://stackoverflow.com/a/6163129/78259
>
> Quite. And utf8::all tries to encapsulate as much of that boilerplate
> as it can.

I have always read that answer as a bit of an indictment of the idea of "you should be able to load this module and everything will be fine". Unicode is complex and trying to treat it like just another list of characters is doomed to teeth gnashing and crying. Of course, even treating it the way it should be leads to teeth gnashing and crying, but at least that will be over the fact that humans suck (we can't even agree on where þ should be sorted) as opposed to Perl sucking.
Re: XML::Simple Umlaute
On Thu, Jul 28, 2016 at 10:23:19AM -0400, Chas. Owens wrote:
> On Thu, Jul 28, 2016 at 10:05 AM, hw wrote:
> snip
> > So which character encoding on STDOUT does perl use by default? That should
> > be utf-8 without any further ado, shouldn't it? When I add
> >
> >     binmode STDOUT, ":encoding(utf-8)";
> >
> > the characters are displayed correctly in the terminal. Why would perl use
> > something else than utf-8 by default?

As a general rule, use "utf8::all" instead of just "utf8" and a lot of the problems go away.

> Also, this answer on StackOverflow by tchrist (Tom Christiansen, who I
> would say knows the most about the intersection of Perl and Unicode)
> is a good resource: http://stackoverflow.com/a/6163129/78259

Quite. And utf8::all tries to encapsulate as much of that boilerplate as it can.

--
Paul Johnson - p...@pjcj.net
http://www.pjcj.net
Re: XML::Simple Umlaute
On Thu, Jul 28, 2016 at 10:05 AM, hw wrote:
snip
> So which character encoding on STDOUT does perl use by default? That should
> be utf-8 without any further ado, shouldn't it? When I add
>
>     binmode STDOUT, ":encoding(utf-8)";
>
> the characters are displayed correctly in the terminal. Why would perl use
> something else than utf-8 by default?

Take the following with a grain of salt. My knowledge is mostly hearsay and supposition with a dash of cargo cultism on this matter.

Perl predates even Unicode (they both came out in '87). Unicode did not get much traction until the mid-nineties when people started realizing that UTF-8 (created in '92) was a good thing. So, for most of its early history, Perl used Latin1. It still does to a large extent for backwards compatibility reasons. To make Perl 5 a proper UTF-8 environment there are a number of knobs to pull and buttons to poke.

You may find this video from YAPC NA 2016 enlightening:
https://www.youtube.com/watch?v=TmTeXcEixEg

Others that may be helpful (I haven't watched them, but I trust the speaker):
https://www.youtube.com/watch?v=iZgqhVu72zc
https://www.youtube.com/watch?v=X2FQHUHjo8M

Also, this answer on StackOverflow by tchrist (Tom Christiansen, who I would say knows the most about the intersection of Perl and Unicode) is a good resource: http://stackoverflow.com/a/6163129/78259

Hope this helps.

--
Chas. Owens
http://github.com/cowens
The most important skill a programmer can have is the ability to read.
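Two of the "knobs" mentioned above can be sketched with core Perl only. This is a minimal illustration under the assumption that the script file itself is saved as UTF-8 and the terminal expects UTF-8; it does not cover databases, CGI output or the other layers discussed in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';

use utf8;                                # string literals in this file are UTF-8
use open qw( :encoding(UTF-8) :std );    # encode STDOUT/STDERR, decode STDIN

my $word = 'groß';

say $word;                               # encoded as UTF-8 on the way out
say length $word;                        # 4, because the literal was decoded to characters
say sprintf 'U+%04X', ord substr($word, 3, 1);    # U+00DF, LATIN SMALL LETTER SHARP S
```

Without "use utf8" the literal would be seen as five bytes rather than four characters, and without the output layer a character such as ß goes out as a single Latin-1 byte, which a UTF-8 terminal shows as a replacement glyph.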
Re: XML::Simple Umlaute
Chas. Owens wrote:
> Data::Dumper is dumping the internal format. To ensure compatibility, it is using the \x{df} escape to represent LATIN SMALL LETTER SHARP S. To see it rendered as a character, just print it:

Thanks! That kinda works:

#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
use utf8;
use XML::Simple;
use Data::Dumper;

my $xml = new XML::Simple;
my $data = $xml->XMLin("test.xml");

open my $fh, ">", 'pout';
binmode $fh, ":encoding(utf-8)";
print $fh Dumper($data);
print Dumper($data);
print $fh $data->{'Regaletikett_ausgeben'};
close $fh;

if($data->{'Regaletikett_ausgeben'} eq 'groß') {
    say 'ist groß';
} else {
    say 'nicht groß';
}

say 'ok';
say 'test-1: äöüÄÖÜß';
say "test-2: äöüÄÖÜß";
print "test-3: äöüÄÖÜß\n";

exit 0;

Output is:

$VAR1 = {
  'Regaletikett_ausgeben' => "gro\x{df}",
  'Mitarbeiter_inv' => '5449000134264',
  'Bezeichnung1' => {},
  'Stationsnummer' => 'Infostand',
  'Erfassung' => {
    'Lagerstaette' => '5449000134264',
    'Artikel_erfassen' => {},
    'Artikelstapel' => {
      'Etikettentyp' => {},
      'EAN_Artikel' => '5449000134264',
      'Menge' => '20',
      'Preis' => '10.0'
    }
  },
  'meta' => {
    'instanceID' => 'uuid:ee1bd852-37ee-4965-a097-50130cf6dac7',
    'http-equiv' => 'content-type',
    'content' => 'text/html; charset=UTF-8'
  },
  'id' => 'build_Inventur_1469705446'
};
ist gro
ok
test-1: �
test-2: �
test-3: �

In case you can't see it: the test-printing shows a single unknown character instead of äöüÄÖÜß.

Now 'env' says:

[...]
LANG=de_DE.utf8
[...]

I'm looking at an xterm window which is connected via ssh to a remote host on which an instance of tmux is running to which I'm attached. I can type all the above letters on the command line just fine.

'File' says:

xmlread-4.pl: Perl script, UTF-8 Unicode text executable
pout: UTF-8 Unicode text

When I load pout into emacs, the ß shows up correctly. When I 'cat pout', the ß is displayed correctly in the terminal.

So which character encoding on STDOUT does perl use by default? That should be utf-8 without any further ado, shouldn't it? When I add

    binmode STDOUT, ":encoding(utf-8)";

the characters are displayed correctly in the terminal. Why would perl use something else than utf-8 by default?

> #!/usr/bin/perl
> use strict;
> use feature 'say';
> use XML::Simple;
> #warnings should come last to handle any registered warnings in previous modules
> use warnings;
>
> binmode STDOUT, ":encoding(UTF-8)";
>
> my $xml = XML::Simple->new;
> my $data = $xml->XMLin("test.xml");
> say $data->{Regaletikett_ausgeben};
>
> On Thu, Jul 28, 2016 at 9:05 AM hw <h...@gc-24.de> wrote:
>
> Hi,
>
> I would like to read XML files which look like this:
>
>     uuid:ee1bd852-37ee-4965-a097-50130cf6dac7
>     Infostand
>     5449000134264
>     gro
>     5449000134264
>     5449000134264
>     10.0
>     20
>
> There is an Umlaut, ß, supposed to be at
>
>     gro
>
> which is apparently impossible to read. The following program ...
>
>     #!/usr/bin/perl
>     use strict;
>     use warnings;
>     use feature 'say';
>     use XML::Simple;
>     use Data::Dumper;
>
>     my $xml = new XML::Simple;
>     my $data = $xml->XMLin("test.xml");
>
>     open my $fh, ">", 'pout';
>     print $fh Dumper($data);
>     close $fh;
>
>     print Dumper($data);
>
>     exit 0;
>
> ... gives me this output:
>
>     $VAR1 = {
>       'Bezeichnung1' => {},
>       'id' => 'build_Inventur_1469705446',
>       'Stationsnummer' => 'Infostand',
>       'meta' => {
>         'content' => 'text/html; charset=UTF-8',
>         'http-equiv' => 'content-type',
>         'instanceID' => 'uuid:ee1bd852-37ee-4965-a097-50130cf6dac7'
>       },
>       'Mitarbeiter_inv' => '5449000134264',
>       'Regaletikett_ausgeben' => "gro\x{df}",
>       'Erfassung' => {
>         'Artikelstapel' => {
>           'Menge' => '20',
>           'Preis' => '10.0',
>           'EAN_Arti
Re: XML::Simple Umlaute
Data::Dumper is dumping the internal format. To ensure compatibility, it is using the \x{df} escape to represent LATIN SMALL LETTER SHARP S. To see it rendered as a character, just print it: #!/usr/bin/perl use strict; use feature 'say'; use XML::Simple; #warnings should come last to handle any registered warnings in previous modules use warnings; binmode STDOUT, ":encoding(UTF-8)"; my $xml = XML::Simple->new; my $data = $xml->XMLin("test.xml"); say $data->{Regaletikett_ausgeben}; On Thu, Jul 28, 2016 at 9:05 AM hw <h...@gc-24.de> wrote: > > Hi, > > I would like to read XML files which look like this: > > > > >http-equiv="content-type" content="text/html; charset=UTF-8"> > uuid:ee1bd852-37ee-4965-a097-50130cf6dac7 > >Infostand >5449000134264 > >gro > > > 5449000134264 > >5449000134264 >10.0 >20 > > > > > > > There is an Umlaut, ß, supposed to be at > > > gro > > > > which is apparently impossible to read. The following program ... > > > #!/usr/bin/perl > > use strict; > use warnings; > > use feature 'say'; > > use XML::Simple; > use Data::Dumper; > > > my $xml = new XML::Simple; > my $data = $xml->XMLin("test.xml"); > > open my $fh, ">", 'pout'; > print $fh Dumper($data); > close $fh; > > print Dumper($data); > > > exit 0; > > > ... gives me this output: > > > $VAR1 = { >'Bezeichnung1' => {}, >'id' => 'build_Inventur_1469705446', >'Stationsnummer' => 'Infostand', >'meta' => { > 'content' => 'text/html; charset=UTF-8', > 'http-equiv' => 'content-type', > 'instanceID' => > 'uuid:ee1bd852-37ee-4965-a097-50130cf6dac7' >}, >'Mitarbeiter_inv' => '5449000134264', >'Regaletikett_ausgeben' => "gro\x{df}", >'Erfassung' => { > 'Artikelstapel' => { > 'Menge' => '20', > 'Preis' => '10.0', > 'EAN_Artikel' => > '5449000134264', > 'Etikettentyp' => {} >}, > 'Artikel_erfassen' => {}, > 'Lagerstaette' => '5449000134264' > } > }; > > > I´m not getting any better results when adding an encoding tag to the > XML file and when writing the Dumper output to a file. 
> > Is it impossible to use Umlaute in XML Files?
XML::Simple Umlaute
Hi, I would like to read XML files which look like this: uuid:ee1bd852-37ee-4965-a097-50130cf6dac7 Infostand 5449000134264 gro 5449000134264 5449000134264 10.0 20 There is an Umlaut, ß, supposed to be at gro which is apparently impossible to read. The following program ... #!/usr/bin/perl use strict; use warnings; use feature 'say'; use XML::Simple; use Data::Dumper; my $xml = new XML::Simple; my $data = $xml->XMLin("test.xml"); open my $fh, ">", 'pout'; print $fh Dumper($data); close $fh; print Dumper($data); exit 0; ... gives me this output: $VAR1 = { 'Bezeichnung1' => {}, 'id' => 'build_Inventur_1469705446', 'Stationsnummer' => 'Infostand', 'meta' => { 'content' => 'text/html; charset=UTF-8', 'http-equiv' => 'content-type', 'instanceID' => 'uuid:ee1bd852-37ee-4965-a097-50130cf6dac7' }, 'Mitarbeiter_inv' => '5449000134264', 'Regaletikett_ausgeben' => "gro\x{df}", 'Erfassung' => { 'Artikelstapel' => { 'Menge' => '20', 'Preis' => '10.0', 'EAN_Artikel' => '5449000134264', 'Etikettentyp' => {} }, 'Artikel_erfassen' => {}, 'Lagerstaette' => '5449000134264' } }; I´m not getting any better results when adding an encoding tag to the XML file and when writing the Dumper output to a file. Is it impossible to use Umlaute in XML Files? -- To unsubscribe, e-mail: beginners-unsubscr...@perl.org For additional commands, e-mail: beginners-h...@perl.org http://learn.perl.org/
Re: XML Simple + parsing inner loop elements + help
tl;dr I'm not answering your specific question here.

On Dec 8, 2015 1:26 AM, "perl kamal" <kamal.p...@gmail.com> wrote:
>
> Hi,
>
> I am trying to parse the inner loop elements of the attached input xml elements.

Just FYI, I've found it easier to use XSLT as an ETL preprocessor for Perl. I'm not sure how you intend to use the output, but you might look into BaseX if you intend to store and parse lots of XML.

> use strict;
> use XML::Simple;
> use Data::Dumper;
>

You'll find that for lots of uses, ::Simple is too simple. Also note that, in my experience, no Perl modules support the newer XML features (specifically, I've found XPath support to be lacking), which is another reason to prefer XSLT (Python has richer modules for XML, but I can't stand Python, so...).
Re: XML Simple + parsing inner loop elements + help
On 8 December 2015 at 19:25, perl kamal <kamal.p...@gmail.com> wrote:
> I am trying to parse the inner loop elements of the attached input xml
> elements.
> The below code doesn't retrieve the inner loop() elements if
> the properties tag contains more than one item. Will you please point
> the error and correct me.
> Please find the attached input xml file. Thanks.

A quick glance suggests you're getting bitten by one of the known problems of XML::Simple: that it's completely inconsistent.

Two seemingly identically structured XML files can be decoded completely differently from each other, so you need to have special cases everywhere in your code *just in case* that happens.

Take for instance this simple code and its simple XML:

use strict;
use warnings;
use utf8;

my $sample_a = <<"EOF";
EOF

my $sample_b = <<"EOF";
EOF

use XML::Simple;
use Data::Dump qw(pp);

my $sample_a_dec = XMLin($sample_a);
my $sample_b_dec = XMLin($sample_b);

pp {
    a => $sample_a_dec,
    b => $sample_b_dec,
};

It looks simple, it looks like a and b have similar enough data structures, and you expect the pretty printed output to also be similar, right? Right?

Nope!

Here, XML::Simple went a bit special snowflake.

{
  a => { subgroup => { item => { name => "bruce" } } },
  b => { subgroup => { item => { mary => {}, sue => {} } } },
}

At first glance you might overlook how these two entries are completely different.

One is a hash mapping:
    "somevalue" => hash
The other is a hash mapping:
    "name" => some value

Either it should be:

  a => { subgroup => { item => { bruce => {} } } },
  b => { subgroup => { item => { mary => {}, sue => {} } } },

or it should be:

  a => { subgroup => { item => [{ name => "bruce" }] } },
  b => { subgroup => { item => [{ name => "mary" }, { name => "sue" }] } }

But XML::Simple gave you the worst of both worlds.

For this reason, XML::Simple is not recommended for real world work. XML::Twig may be more what you're looking for.
Even its maintainer and author for 15 says "Hey, please don't use this" :)

--
Kent

KENTNL - https://metacpan.org/author/KENTNL
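If switching modules is not an option right away, the second of the two consistent shapes described above can usually be requested from XML::Simple itself with its `ForceArray` and `KeyAttr` options. The sketch below is an illustration only: the two XML strings are hypothetical stand-ins, since the sample documents from the message did not survive in the archive, and the element names are chosen to match the printed output.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Hypothetical stand-ins for the two samples discussed above.
my $sample_a = '<group><subgroup><item name="bruce"/></subgroup></group>';
my $sample_b = '<group><subgroup><item name="mary"/><item name="sue"/></subgroup></group>';

# ForceArray => ['item'] : always return <item> as an array ref, even for one element.
# KeyAttr    => []       : switch off folding of arrays into hashes keyed on "name".
my %opts = (ForceArray => ['item'], KeyAttr => []);

my $dec_a = XMLin($sample_a, %opts);
my $dec_b = XMLin($sample_b, %opts);

# Both documents can now be walked with the same code.
my @names_a = map { $_->{name} } @{ $dec_a->{subgroup}{item} };
my @names_b = map { $_->{name} } @{ $dec_b->{subgroup}{item} };

print "a: @names_a\n";
print "b: @names_b\n";
```

This removes the special cases for the elements listed in `ForceArray`, but every repeatable element has to be named there, which is part of why the thread suggests XML::Twig instead.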
Re: XML Simple + parsing inner loop elements + help
Hi, Thanks you for your valuable comments,let me try the Twig module. On 12/8/15, Kent Fredric <kentfred...@gmail.com> wrote: > On 8 December 2015 at 19:25, perl kamal <kamal.p...@gmail.com> wrote: >> I am trying to parse the inner loop elements of the attached input xml >> elements. >> The below code doesn't retrieve the inner loop() elements if >> the properties tag contains more than one item. Will you please point >> the error and correct me. >> Please find the attached input xml file. Thanks. > > A quick glance suggests you're getting bitten by one of the known > problems of XML::Simple: That its completely inconsistent. > > 2 seemingly identally strucutred XML files can be decoded completely > different to each other, so you need to have special cases everywhere > in your code *just in case* that happens. > > Take for instance this simple code and its simple XML > > use strict; > use warnings; > use utf8; > > my $sample_a = <<"EOF"; > > > > > > EOF > > my $sample_b = <<"EOF"; > > > > > > > EOF > > use XML::Simple; > use Data::Dump qw(pp); > my $sample_a_dec = XMLin($sample_a); > my $sample_b_dec = XMLin($sample_b); > > pp { > a => $sample_a_dec, > b => $sample_b_dec, > }; > > It looks simple, it looks like a and be have similar enough data > structures, and you expect the pretty printed output to also be > similar, right? Right? > > Nope! > > Here, XML::Simple went a bit special snowflake. > > { > a => { subgroup => { item => { name => "bruce" } } }, > b => { subgroup => { item => { mary => {}, sue => {} } } }, > } > > At first glance you might overlook how these 2 entries are completely > different. 
> > One is a hash mapping: > "somevalue" => hash > The other is a has mapping: >"name" => some value > > either it should be: > > a => { subgroup => { item => { bruce => {} } } }, > b => { subgroup => { item => { mary => {}, sue => {} } } }, > > or it should be > > a => { subgroup => { item => [{ name => "bruce" }] }}, > b => { subgroup => { item => [{ name => "mary" }, { name => "sue" }] } } > > > But XML::Simple gave you the worst of both worlds. > > For this reason, XML::Simple is not recommended for real world work. > XML::Twig may be more what you're looking for. > > Even its maintainer and author for 15 says "Hey, please don't use this" :) > > -- > Kent > > KENTNL - https://metacpan.org/author/KENTNL > -- To unsubscribe, e-mail: beginners-unsubscr...@perl.org For additional commands, e-mail: beginners-h...@perl.org http://learn.perl.org/
XML Simple + parsing inner loop elements + help
Hi, I am trying to parse the inner loop elements of the attached input xml elements. The below code doesn't retrieve the inner loop() elements if the properties tag contains more than one item. Will you please point the error and correct me. Please find the attached input xml file. Thanks. use strict; use XML::Simple; use Data::Dumper; die "Usage: perl $0 path output_file\n" unless @ARGV == 2; my $path = shift; my $output_file = shift; my ($name,$url,$value,$password,$user,$prop_name,$string); opendir (DIR, "$path") or die "Can't open the dir: $!\n"; my @files = grep (/\.xml$/, readdir(DIR)); open(CSV, '>', "$output_file") or die "Can't open the file:$!\n"; my $header = "Name,Data Source,URL,Password\n"; print CSV $header; foreach my $file(@files) { print "Processing the file:$file\n"; ($name,$url,$value,$password) = undef; my $file_path = $path .'/'. $file; my $jdbc_data_source = XMLin($file_path); $name = "$jdbc_data_source->{'name'}"; $url = "$jdbc_data_source->{'jdbc-driver-params'}->{'url'}"; $password = "$jdbc_data_source->{'jdbc-driver-params'}->{'password-encrypted'}\n"; if (ref($jdbc_data_source->{'jdbc-driver-params'}->{'properties'}) =~ /ARRAY/) { #print Dumper($jdbc_data_source); foreach my $property (@{$jdbc_data_source->{'jdbc-driver-params'}->{'properties'}}) { #print Dumper($property); # $prop_name = {$property}->[0]; #$prop_name = {$property}->[1]; #No need to capture the value if pop_name eq user; # next if $prop_name eq 'user'; $string .= "$name,$value,$url,$password\n"; print "$string"; } } else { $prop_name = $jdbc_data_source->{'jdbc-driver-params'}{'properties'}{'property'}{'name'} ; #unless ($prop_name eq 'user'){ $value = $jdbc_data_source->{'jdbc-driver-params'}{'properties'}{'property'}{'value'} ; #} $string .= "$name,$value,$url,$password\n"; } #print "$string"; print CSV $string; } http://xmlns.oracle.com/weblogic/jdbc-data-source; xmlns:sec="http://xmlns.oracle.com/weblogic/security; 
xmlns:wls="http://xmlns.oracle.com/weblogic/security/wls; xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance; xsi:schemaLocation="http://xmlns.oracle.com/weblogic/jdbc-data-source http://xmlns.oracle.com/weblogic/jdbc-data-source/1.2/jdbc-data-source.xsd;> AllocationImport Data Source test url 1 name 1 den_alien ben1 v$session.program ALIMP-OSB-11G user ben_alimp_sb v$session.program ALIMP-OSB-11G {encrypted password 1}= 600 true SQL SELECT 1 FROM DUAL jdbc/AllocationImportServiceDS TwoPhaseCommit -- To unsubscribe, e-mail: beginners-unsubscr...@perl.org For additional commands, e-mail: beginners-h...@perl.org http://learn.perl.org/
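One way to cope with the varying shapes is to normalise the `properties` node before using it. The helper below is a sketch under assumptions: `property_pairs` is a made-up name, and the three shapes it handles are inferred from how XML::Simple behaves with its default settings (a single `property` stays a plain hash, several are folded into a hash keyed on `name`, and an array appears when folding is switched off), not from the attached file, which is only partly visible here.

```perl
use strict;
use warnings;

# Hypothetical helper: return a list of [name, value] pairs for the
# "properties" node, whichever shape the parser produced.
sub property_pairs {
    my ($properties) = @_;
    my $p = $properties->{property};
    return () unless defined $p;

    my @pairs;
    if (ref $p eq 'ARRAY') {
        # Shape 1: array of { name => ..., value => ... } hashes.
        push @pairs, [ $_->{name}, $_->{value} ] for @$p;
    }
    elsif (exists $p->{name} && !ref $p->{name}) {
        # Shape 2: a single property, left as a plain hash.
        push @pairs, [ $p->{name}, $p->{value} ];
    }
    else {
        # Shape 3: several properties folded into a hash keyed on name.
        push @pairs, [ $_, $p->{$_}{value} ] for sort keys %$p;
    }
    return @pairs;
}

# Illustrative data only, using names that appear in the posted XML.
my $folded = {
    property => {
        'user'              => { value => 'ben_alimp_sb' },
        'v$session.program' => { value => 'ALIMP-OSB-11G' },
    },
};

for my $pair (property_pairs($folded)) {
    next if $pair->[0] eq 'user';    # skip the user entry, as the original intended
    print join(',', @$pair), "\n";
}
```

With this in place the loop that builds the CSV line no longer needs a separate branch for the single-property and multi-property cases.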
XML Structure creation not working
Hi, I am facing an issue with my script while creating a XML . Below is required structure :- xyz true Distri 123 However I am getting below o/p xyz true Distri 123 My current code is :- #!/usr/bin/perl use strict; use warnings; use FindBin qw(); use lib "$FindBin::Bin/../lib"; use IO::Handle; use XML::Simple; use Data::Dumper ; my $xmls = XML::Simple->new(ForceArray => 1); my $contents = { rule =>{appliedToList=> { appliedTo => [ ] } } }; push @{ $contents->{rule}->{appliedToList}->{appliedTo} }, { name => ['test'], value => ['123'], name => [ 'xyz' ], type => ['Distri'], isValid => [ 'true' ], }; open my $xml, '>', "output.xml" or die $!; $xml->print($xmls->XMLout($contents)); $xml->close(); Any insights on what I am doing wrong ? Regards, Punit
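One thing that is visible in the code as posted, whatever the intended structure is: the key `name` appears twice in the same hash literal. A Perl hash holds one value per key, so the second entry silently replaces the first before XMLout ever sees the data. A minimal sketch (the variable name is illustrative):

```perl
use strict;
use warnings;

# Same keys as in the posted hash literal.
my %applied_to = (
    name    => ['test'],
    value   => ['123'],
    name    => ['xyz'],      # same key again: this replaces ['test']
    type    => ['Distri'],
    isValid => ['true'],
);

my @keys = sort keys %applied_to;
print join(',', @keys), "\n";        # isValid,name,type,value
print $applied_to{name}[0], "\n";    # xyz
```

If two name elements are really wanted, one possibility is a single key holding both values, as in `name => ['test', 'xyz']`; whether that matches the required structure cannot be told from the message, because the XML in it lost its tags.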
Re: XML::Rabbit and utf8
Hi, Martin!

First, set a UTF-8 binmode on STDOUT; it's good practice if you are printing Unicode characters.

Second, and this is the main point: the problem here is that your umlaut character does not have ord 195. Moreover, the way you construct the umlaut character gives you not a single character but a Unicode grapheme. You can test it with this simple program:
https://gist.github.com/elcamlost/e44616785cf475bea10d

This problem is described accurately in the Effective Perl Programming book (see http://www.effectiveperlprogramming.com/2011/06/treat-unicode-strings-as-grapheme-clusters/ ).

So your tests are correct, and they fail for a reason. If you construct your umlaut symbol as suggested in the gist (my $CHAR_UMLAUT = "\N{LATIN SMALL LETTER U WITH DIAERESIS}";), your tests will work as expected. You can find a complete example in this gist:
https://gist.github.com/elcamlost/007c398c901881763c0b

On Wed, 23 Sep 2015 at 12:26, Martin Barth <mar...@senfdax.de> wrote:
> Hello,
>
> i'm struggling around with umlauts in my xml files, which i want to
> parse with XML::Rabbit.
> I've got the same behaviour with __DATA__ or when i'm reading a xml file
> via MyNode->new(file => );
>
> And i've got non idea what i am doing wrong :(
> (ps: yes, the testcase is utf8 encoded acording to the file command)
>
> % perl xml_rabbit.t
> #
> # 195
> not ok 1 - umlaut in xml
> # Failed test 'umlaut in xml'
> # at xml_rabbit.t line 18.
> # got: '�'
> # expected: 'ü'
> not ok 2 - ord of umlaut
> # Failed test 'ord of umlaut'
> # at xml_rabbit.t line 19.
> # got: '195'
> # expected: '252'
> 1..2
> # Looks like you failed 2 tests of 2.
> > > % cat xml_rabbit.t > #!/usr/bin/env perl > > package MyNode; > use XML::Rabbit::Root; > has_xpath_value umlaut => '/x/umlaut'; > > package main; > use Test::More; > > my $xml = do{local $/; }; > my $node = MyNode->new(xml => $xml); > > diag $node->umlaut; > diag ord "ü"; > is($node->umlaut, "ü", "umlaut in xml"); > is(ord("ü"), ord($node->umlaut), "ord of umlaut"); > > done_testing(2); > > __DATA__ > > > ü > > % perl -v > > This is perl 5, version 20, subversion 1 (v5.20.1) built for x86_64-linux > > > > -- > To unsubscribe, e-mail: beginners-unsubscr...@perl.org > For additional commands, e-mail: beginners-h...@perl.org > http://learn.perl.org/ > > >
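The two numbers in the failing test, 195 and 252, can be reproduced without XML::Rabbit. One possible reading of the output is that the "ü" written in the test script is taken as two UTF-8 bytes, because the script has no `use utf8`, while the value coming back from the parser is a single decoded character. The sketch below only demonstrates the arithmetic; it uses escapes instead of a literal umlaut so that it does not depend on the encoding of the file.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The two bytes that "ü" occupies in a UTF-8 encoded source file.
my $bytes = "\xc3\xbc";

# The same text decoded into a one-character string.
my $char = decode('UTF-8', $bytes);

printf "length of bytes:   %d\n", length $bytes;   # 2
printf "ord of first byte: %d\n", ord $bytes;      # 195
printf "length of char:    %d\n", length $char;    # 1
printf "ord of char:       %d\n", ord $char;       # 252

# "\x{fc}" and "\N{U+00FC}" both name the single character directly.
my $same = $char eq "\x{fc}" && $char eq "\N{U+00FC}";
print "decoded value matches U+00FC\n" if $same;
```

Under that reading, adding `use utf8;` to the test script, or spelling the character with `\N{...}` as suggested in the reply, makes the literal a single character with ord 252.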
XML::Rabbit and utf8
Hello, i'm struggling around with umlauts in my xml files, which i want to parse with XML::Rabbit. I've got the same behaviour with __DATA__ or when i'm reading a xml file via MyNode->new(file => ); And i've got non idea what i am doing wrong :( (ps: yes, the testcase is utf8 encoded acording to the file command) % perl xml_rabbit.t # # 195 not ok 1 - umlaut in xml # Failed test 'umlaut in xml' # at xml_rabbit.t line 18. # got: '�' # expected: 'ü' not ok 2 - ord of umlaut # Failed test 'ord of umlaut' # at xml_rabbit.t line 19. # got: '195' # expected: '252' 1..2 # Looks like you failed 2 tests of 2. % cat xml_rabbit.t #!/usr/bin/env perl package MyNode; use XML::Rabbit::Root; has_xpath_value umlaut => '/x/umlaut'; package main; use Test::More; my $xml = do{local $/; }; my $node = MyNode->new(xml => $xml); diag $node->umlaut; diag ord "ü"; is($node->umlaut, "ü", "umlaut in xml"); is(ord("ü"), ord($node->umlaut), "ord of umlaut"); done_testing(2); __DATA__ ü % perl -v This is perl 5, version 20, subversion 1 (v5.20.1) built for x86_64-linux -- To unsubscribe, e-mail: beginners-unsubscr...@perl.org For additional commands, e-mail: beginners-h...@perl.org http://learn.perl.org/
Re: Printing dir into XML
On Thu, Jul 9, 2015 at 6:01 AM, Nagy Tamas (TVI-GmbH) tamas.n...@tvi-gmbh.de wrote:
> Hi,
>
> The following code doesn't recognize dirs. As I list the dir into the XML, it shows dirs as ordinary files. Like the -d would not work.
> If I add an extra branch to recognize files with -f, it prints neither files nor dirs.
>
> sub Traverse {
>     opendir(DIR, $dir) or die "Cannot open directory $dir: $!\n";
>     my @files = readdir(DIR);
>     closedir(DIR);
>     foreach my $file (@files) {
>         # generate XML here
>         next if (($file eq '.') || ($file eq '..'));
>         print $file;
>         if ((-d $file) and ($file !~ /^\.\.?$/) and ($file ne ".") and ($file ne "..")) {
>             # make dir branch
>             $writer->startTag("Folder", "Name" => $file);
>             Traverse($file);
>             $writer->endTag("Folder");
>         } else {
>             $writer->emptyTag("Object", "Name" => $file);
>             # make file branch
>         }
>     }
> }
>
> Tamas

From the readdir documentation:

If you're planning to filetest the return values out of a readdir, you'd better prepend the directory in question. Otherwise, because we didn't chdir there, it would have been testing the wrong file.

    opendir(my $dh, $some_dir) || die "can't opendir $some_dir: $!";
    @dots = grep { /^\./ && -f "$some_dir/$_" } readdir($dh);
    closedir $dh;

HTH, Ken
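Applied to the recursive case, the advice from the readdir documentation means building the full path before every file test and before every recursive call. The following is a sketch rather than a drop-in replacement: `traverse` and its callback interface are made up for illustration, and the callback is where calls such as `$writer->startTag`, `$writer->endTag` and `$writer->emptyTag` would go.

```perl
use strict;
use warnings;
use File::Spec;

# Walk $dir recursively and report each entry to $visit as
# ($kind, $name, $depth), where $kind is 'dir', 'enddir' or 'file'.
sub traverse {
    my ($dir, $visit, $depth) = @_;
    $depth //= 0;

    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!\n";
    my @names = sort grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);

    for my $name (@names) {
        # readdir returns bare names, so test the full path, not $name.
        my $path = File::Spec->catfile($dir, $name);
        if (-d $path) {
            $visit->('dir', $name, $depth);       # e.g. startTag("Folder", ...)
            traverse($path, $visit, $depth + 1);
            $visit->('enddir', $name, $depth);    # e.g. endTag("Folder")
        }
        else {
            $visit->('file', $name, $depth);      # e.g. emptyTag("Object", ...)
        }
    }
    return;
}

# Usage: print an indented listing of the directory given on the command line.
if (@ARGV) {
    traverse($ARGV[0], sub {
        my ($kind, $name, $depth) = @_;
        return if $kind eq 'enddir';
        print '  ' x $depth, ($kind eq 'dir' ? "[$name]" : $name), "\n";
    });
}
```

Using a lexical directory handle (`my $dh`) instead of the bareword `DIR` also matters here, because a bareword handle is global and would be shared by every level of the recursion.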
Printing dir into XML
Hi,

The following code doesn't recognize dirs. As I list the dir into the XML, it shows dirs as ordinary files. Like the -d would not work.
If I add an extra branch to recognize files with -f, it prints neither files nor dirs.

sub Traverse {
    opendir(DIR, $dir) or die "Cannot open directory $dir: $!\n";
    my @files = readdir(DIR);
    closedir(DIR);
    foreach my $file (@files) {
        # generate XML here
        next if (($file eq '.') || ($file eq '..'));
        print $file;
        if ((-d $file) and ($file !~ /^\.\.?$/) and ($file ne ".") and ($file ne "..")) {
            # make dir branch
            $writer->startTag("Folder", "Name" => $file);
            Traverse($file);
            $writer->endTag("Folder");
        } else {
            $writer->emptyTag("Object", "Name" => $file);
            # make file branch
        }
    }
}

Tamas
generate XML from recursion
Hi, Thanks for the solutions of the recent problems. I would like to traverse a dir on the HDD at a part of the XML generation. So I generated about 75% of an xml file with xml::writer, and it comes a Projectstructure tag. Between these tag I simply list a dir into empty tags Object attr=./. If recursion is ready, the closing Projectstructure tag closes it. And non recursive XML generation continues. $output = IO::File-new(default.xml); $writer = XML::Writer-new(OUTPUT = $output, DATA_MODE = 1, DATA_INDENT = , ENCODING = utf-8, NEWLINES = 0 ); Traverse($dir, $writer); sub Traverse { opendir(DIR, $dir) or die Cannot open directory $dir: $!\n; my @files = readdir(DIR); closedir(DIR); foreach my $file (@files) { # generate XML here print $file; if(-d $file and ($file !~ /^\.\.?$/) ) { # make dir branch Traverse($file); } $writer-emptyTag(Object, Name = $file); # make file branch } } ... $writer-startTag(ProjectStructure); Traverse(C:\\ph); $writer-endTag(ProjectStructure); But it gives a hideous error: Attempt to insert empty tag after close of document element at t2tot3Project line 72. How can it be? I call the subrutine in the middle of a tag. How can be the document closed? Tamas
Re: generate XML from recursion
On Wed, Jul 8, 2015 at 10:51 AM, Nagy Tamas (TVI-GmbH) tamas.n...@tvi-gmbh.de wrote:
> How can it be? I call the subroutine in the middle of a tag. How can be the document closed?

Well, I don't get that (could it be that call to Traverse($dir, $writer); before the sub def?), but to get your code to work I had to keep the dir involved:

sub Traverse {
    my $dir = shift;
    opendir(DIR, $dir) or die "Cannot open directory $dir: $!\n";
    my @files = readdir(DIR);
    closedir(DIR);
    foreach my $file (@files) {
        # generate XML here
        if (-d "$dir/$file" and ($file !~ /^\.\.?$/)) {
            # make dir branch
            Traverse("$dir/$file");
        }

--
a
Andy Bach, afb...@gmail.com
608 658-1890 cell
608 261-5738 wk
Re: How to parse XML content using lwp agent or XML Smart
Hi Rob, I am sorry, I somehow missed your response until this evening and in the mean time have figured out lots of new ways to have this fail shrug. Anyway, I looked at your sample and came up with the following updates and it works but I am interested to hear your input on how I can improve it. #!/opt/csw/bin/perl use strict; use warnings; use LWP; use XML::Smart; use CGI; my $ua = LWP::UserAgent-new; my $q = new CGI; my $doc = XML::Smart-new(); $doc-{PHC_LOGIN}{USERID} = ‘john.doe'; $doc-{PHC_LOGIN}{USERID}-set_node(1); $doc-{PHC_LOGIN}{USERPASSWORD} = ‘FAKEPASSWORD; $doc-{PHC_LOGIN}{USERPASSWORD}-set_node(1); $doc-{PHC_LOGIN}{PARTNERID} = ‘111'; $doc-{PHC_LOGIN}{PARTNERID}-set_node(1); $doc-save('passport.xml'); my $sendXML = passport.xml; my $message=; open (XML,$sendXML); while (XML) { $message .=$_; } close XML; my $webpage =https://example.com/members/LoginNet/AutoLogin.aspx;; $ua-proxy('https', 'http://proxy.exampe.net:8080/'); my $resp = $ua-post($webpage, Content_Type = 'text/xml', Content = $message); my $xml = XML::Smart-new($resp-decoded_content); my $dest = $xml-{PHC_LOGIN}{REDIRECTURL}; print Location: $dest\n\n; On Aug 22, 2014, at 10:38 AM, Rob Dixon rob.di...@gmx.com wrote: On 22/08/2014 11:27, angus wrote: Hi, Hi, I have some sample code that posts some XML content to an URL using LWP agent and XML Smart. The response to this comes back as some custom XML. I now need to try and parse the response and find the REDIRECTURL value and redirect the client to that destination. I am having issues parsing the output here is what I have tried. 
my $response = $ua-post($webpage, Content_Type = 'text/xml', Content = $message); $response-decoded_content; if ($response-is_success) { my $xml = new XML::Smart( $resonse-content , 'XML::Smart::HTMLParser' ); my $xml = new XML::Smart( $resonse-content , 'html' ); print \n\n FOUND:$xml\n; # comes back with a null value ## # another attempt below ### my $response = $ua-post($webpage, Content_Type = 'text/xml', Content = $message); $response-decoded_content; if ($response-is_success) { my $html = $response-content; while ( $html =~/REDIRECT/g ) { print I found: $1\n; } else { print “nothing found\n”; }; The find the value REDIRECT as the script prints out “I found: “ and not “nothing found” but I need the url value…. Here is a sample of what the xml looks like that I am trying to parse. PHC_LOGIN\r VERSION3.0/VERSION\r PARTNERID11/PARTNERID\r USERLOGINJohn.doe/USERLOGIN\r ERROR_CODE0/ERROR_CODE\r ERROR_DESCRIPTIONLogin Successful/ERROR_DESCRIPTION\r REDIRECTURLhttps://example.com/cust/login.asp/REDIRECTURL\r /PHC_LOGIN There are a few problems here. I am concentrating on your first attempt, as regular expressions are very rarely an appropriate way of processing XML. - You need to save the content of the incoming message so that you can pass it to the parser, but you have written $response-decoded_content in void context, so the result will be just thrown away - In the line my $xml = new XML::Smart( $resonse-content , 'XML::Smart::HTMLParser' ); you mustn't put the method call in quotation marks.It will try to stringify `$response` (which, incidentally, you have misspelled) and result in something like HTTP::Response=HASH(0x2c68544)-content Plus, to be safe, you should be calling `decoded_content` instead of just `content`. 
- There is no need to specify a parser in the second parameter unless you need specific behaviour, so that line should be

    my $xml = XML::Smart->new( $resp->decoded_content );

  after which you can simply access the element you need using

    $xml->{PHC_LOGIN}{REDIRECTURL}

I have put a semi-complete program below. It needs values for $webpage and $message, but the rest of the code is in place, and it runs fine.

HTH, Rob

use strict;
use warnings;
use 5.010;

use LWP;
use XML::Smart;

my $ua = LWP::UserAgent->new;

my $webpage = '';
my $message = '';

my $resp = $ua->post($webpage, Content_Type => 'text/xml', Content => $message);
my $xml = XML::Smart->new( $resp->decoded_content );
say $xml->{PHC_LOGIN}{REDIRECTURL};

**output**

https://example.com/cust/login.asp
How to parse XML content using lwp agent or XML Smart
Hi, Hi, I have some sample code that posts some XML content to an URL using LWP agent and XML Smart. The response to this comes back as some custom XML. I now need to try and parse the response and find the REDIRECTURL value and redirect the client to that destination. I am having issues parsing the output here is what I have tried. my $response = $ua-post($webpage, Content_Type = 'text/xml', Content = $message); $response-decoded_content; if ($response-is_success) { my $xml = new XML::Smart( $resonse-content , 'XML::Smart::HTMLParser' ); my $xml = new XML::Smart( $resonse-content , 'html' ); print \n\n FOUND:$xml\n; # comes back with a null value ## # another attempt below ### my $response = $ua-post($webpage, Content_Type = 'text/xml', Content = $message); $response-decoded_content; if ($response-is_success) { my $html = $response-content; while ( $html =~/REDIRECT/g ) { print I found: $1\n; } else { print “nothing found\n”; }; The find the value REDIRECT as the script prints out “I found: “ and not “nothing found” but I need the url value…. Here is a sample of what the xml looks like that I am trying to parse. PHC_LOGIN\r VERSION3.0/VERSION\r PARTNERID11/PARTNERID\r USERLOGINJohn.doe/USERLOGIN\r ERROR_CODE0/ERROR_CODE\r ERROR_DESCRIPTIONLogin Successful/ERROR_DESCRIPTION\r REDIRECTURLhttps://example.com/cust/login.asp/REDIRECTURL\r /PHC_LOGIN Thanks in advance for any tips you can provide. -angus
Re: How to parse XML content using lwp agent or XML Smart
On 22/08/2014 11:27, angus wrote: Hi, Hi, I have some sample code that posts some XML content to an URL using LWP agent and XML Smart. The response to this comes back as some custom XML. I now need to try and parse the response and find the REDIRECTURL value and redirect the client to that destination. I am having issues parsing the output here is what I have tried. my $response = $ua-post($webpage, Content_Type = 'text/xml', Content = $message); $response-decoded_content; if ($response-is_success) { my $xml = new XML::Smart( $resonse-content , 'XML::Smart::HTMLParser' ); my $xml = new XML::Smart( $resonse-content , 'html' ); print \n\n FOUND:$xml\n; # comes back with a null value ## # another attempt below ### my $response = $ua-post($webpage, Content_Type = 'text/xml', Content = $message); $response-decoded_content; if ($response-is_success) { my $html = $response-content; while ( $html =~/REDIRECT/g ) { print I found: $1\n; } else { print “nothing found\n”; }; The find the value REDIRECT as the script prints out “I found: “ and not “nothing found” but I need the url value…. Here is a sample of what the xml looks like that I am trying to parse. PHC_LOGIN\r VERSION3.0/VERSION\r PARTNERID11/PARTNERID\r USERLOGINJohn.doe/USERLOGIN\r ERROR_CODE0/ERROR_CODE\r ERROR_DESCRIPTIONLogin Successful/ERROR_DESCRIPTION\r REDIRECTURLhttps://example.com/cust/login.asp/REDIRECTURL\r /PHC_LOGIN There are a few problems here. I am concentrating on your first attempt, as regular expressions are very rarely an appropriate way of processing XML. 
- You need to save the content of the incoming message so that you can pass it to the parser, but you have written

    $response->decoded_content;

  in void context, so the result will just be thrown away.

- In the line

    my $xml = new XML::Smart( "$resonse->content" , 'XML::Smart::HTMLParser' );

  you mustn't put the method call in quotation marks. It will try to stringify `$response` (which, incidentally, you have misspelled) and result in something like

    HTTP::Response=HASH(0x2c68544)->content

  Plus, to be safe, you should be calling `decoded_content` instead of just `content`.

- There is no need to specify a parser in the second parameter unless you need specific behaviour, so that line should be

    my $xml = XML::Smart->new( $resp->decoded_content );

  after which you can simply access the element you need using

    $xml->{PHC_LOGIN}{REDIRECTURL}

I have put a semi-complete program below. It needs values for $webpage and $message, but the rest of the code is in place, and it runs fine.

HTH, Rob

use strict;
use warnings;
use 5.010;

use LWP;
use XML::Smart;

my $ua = LWP::UserAgent->new;

my $webpage = '';
my $message = '';

my $resp = $ua->post($webpage, Content_Type => 'text/xml', Content => $message);
my $xml = XML::Smart->new( $resp->decoded_content );
say $xml->{PHC_LOGIN}{REDIRECTURL};

**output**

https://example.com/cust/login.asp
posting XML to a url and encoding issues.
Hi, I think I may have issues with the encoding in the perl below. Trying to monitor the http request/response using Fiddler I see some content length warnings at times. I also think there may be an encoding problems because the web server I am posting to seems to return an error. I believe XML::Smart defaults to ASCII encoding or iso-8859-1 which I think are the same thing. I wonder though if LWP is using UTF-8 by default, and maybe that is causing me issues? Thanks in advance for any input you can provide. -angus #!/usr/bin/perl use LWP::UserAgent; use HTTP::Request; use XML::Smart; my $doc = XML::Smart-new(); $doc-{header} {data} {struct} {var}[0] = {name = 'USERID'} ; $doc-{header} {data} {struct} {var}[0] {string} = ‘john.doe'; $doc-{header} {data} {struct} {var}[0] {string}-set_node(1); $doc-{header} {data} {struct} {var}[1] = {name = 'USERPASSWORD'}; $doc-{header} {data} {struct} {var}[1] {string} = ‘FAKEPASSWORD'; $doc-{header} {data} {struct} {var}[1] {string}-set_node(1); $doc-{header} {data} {struct} {var}[2] = {name = 'PARTNERID'}; $doc-{header} {data} {struct} {var}[2] {string} = ‘1'; $doc-{header} {data} {struct} {var}[2] {string}-set_node(1); $doc-save('passport.xml'); my $sendXML=passport.xml; my $resultXML=passport_result.xml; my $webpage=https://example.com/login/autologin_xml.asp;; my $message=; open (XML,$sendXML); while (XML) { $message .=$_; } close XML; my $ua = LWP::UserAgent-new; print what we sent\n; $ua-add_handler(request_send, sub { shift-dump; return }); print what we got back\n; $ua-add_handler(response_done, sub { shift-dump; return }); $ua-proxy('https', 'http://proxy.example.net:8080/'); my $response = $ua-post($webpage, Content_Type = 'text/xml', Content = $message); #my $response = $ua-request(POST $webpage, Content_Type = 'text/xml', Content = $message); if ($response-is_success) { #print $response-decoded_content; # or whatever } else { die print $response-status_line;
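On the encoding question: ASCII and ISO-8859-1 are not quite the same thing, since ASCII covers only the 7-bit range that ISO-8859-1 extends to 8 bits. Content-length mismatches typically show up when the body contains a character outside that range, because the number of characters and the number of bytes then differ. The sketch below only illustrates that difference; the payload is hypothetical, and the curly quote in it stands in for the typographic quotes visible in the posted script, which may or may not be present in the real file.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical payload fragment containing U+2018 (LEFT SINGLE QUOTATION MARK).
my $message = "<string>\x{2018}john.doe</string>";

# Encode explicitly so that the body is a byte string of known encoding.
my $body = encode('UTF-8', $message);

printf "characters: %d\n", length $message;   # 26
printf "bytes:      %d\n", length $body;      # 28, U+2018 takes three bytes in UTF-8
```

Posting the encoded bytes and stating the charset in the header, for example `Content_Type => 'text/xml; charset=UTF-8'`, keeps the declared encoding and the transmitted length consistent; whether that is the cause of the server error here is an assumption.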
creating an XML document using xml::dom
I am trying to recreate some code written in C# using Perl. The goal is to create an XML formatted login packet, post it to an HTTP web server, and then parse the response. My first hurdle, I think, is to create the XML login packet. Here is a sample of what the login packet is supposed to look like:

    <header>
    </header>
    <data>
      <struct>
        <var name='USERID'><string>john.doe</string></var>
        <var name='USERPASSWORD'><string>FAKEPASSWD</string></var>
        <var name='PARTNERID'><string>1</string></var>
        <var name='HOMEURL'><string></string></var>
        <var name='ERRORURL'><string></string></var>
      </struct>
    </data>

I have been looking for a Perl module to help me create this and I think using XML::DOM may solve this need. I have the following code so far, which almost provides the first attribute and string value.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::DOM;

    my $doc    = XML::DOM::Document->new;
    my $xml_pi = $doc->createXMLDecl('1.0');
    my $root   = $doc->createElement('data');
    my $body   = $doc->createElement('struct');
    $root->appendChild($body);
    my $var = $doc->createElement('var');
    $var->setAttribute('name', 'USERID');
    $body->appendChild($var);
    my $text = $doc->createTextNode('ghct.test');
    $body->appendChild($text);
    print $xml_pi->toString;
    print $root->toString;
    print "\n";

When I run this code I get back the following output.

    <?xml version="1.0"?><data><struct><var name="USERID"/>john.doe</struct></data>

Note I am missing the string element. Here is the C# code used for this purpose.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Net;
    using System.IO;
    using System.Diagnostics;
    using System.Xml;

    namespace LoginSSOSample
    {
        class Program
        {
            static void Main(string[] args)
            {
                string url = "https://example.com/external/login/autologin_xml.asp";
                HttpWebRequest httpWReq = (HttpWebRequest)WebRequest.Create(url);
                ASCIIEncoding encoding = new ASCIIEncoding();
                XmlDocument xmlDoc = new XmlDocument();
                xmlDoc.LoadXml("<TST_LOGIN></TST_LOGIN>");
                var root = xmlDoc.DocumentElement;
                root.SetAttribute("USERID", "john.doe");
                root.SetAttribute("USERPASSWORD", "FAKEPASSWD");
                root.SetAttribute("USERAUTH", "Client");
                root.SetAttribute("PARTNERID", "1");
                root.SetAttribute("ERRORURL", "");
                root.SetAttribute("HOMEURL", "");
                string postData = "TST_LOGIN=" + Uri.EscapeUriString(xmlDoc.OuterXml);
                byte[] data = encoding.GetBytes(postData);
                httpWReq.Method = "POST";
                httpWReq.ContentType = "application/x-www-form-urlencoded";
                httpWReq.ContentLength = data.Length;
                using (Stream stream = httpWReq.GetRequestStream())
                {
                    stream.Write(data, 0, data.Length);
                }
                HttpWebResponse response = (HttpWebResponse)httpWReq.GetResponse();
                string responseString = new StreamReader(response.GetResponseStream()).ReadToEnd();
                Debug.WriteLine(responseString);
                xmlDoc.LoadXml(responseString);
                root = xmlDoc.DocumentElement;
                string rurl = root.GetAttribute("REDIRECTURL");
                Process.Start(rurl);
            }

            static byte[] GetBytes(string str)
            {
                byte[] bytes = new byte[str.Length * sizeof(char)];
                System.Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
                return bytes;
            }
        }
    }

Thanks in advance for any assistance you may be able to provide.
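A minimal sketch of one way the missing string element could be produced with the same XML::DOM calls the question already uses. It only assumes that the text belongs inside a string element which in turn belongs inside var; the variable name $string is introduced here for illustration.

```perl
use strict;
use warnings;
use XML::DOM;

my $doc  = XML::DOM::Document->new;
my $root = $doc->createElement('data');
my $body = $doc->createElement('struct');
$root->appendChild($body);

my $var = $doc->createElement('var');
$var->setAttribute('name', 'USERID');
$body->appendChild($var);

# The missing step: create <string>, put the text node inside it,
# and attach it to <var> instead of appending the text to <struct>.
my $string = $doc->createElement('string');
$string->appendChild($doc->createTextNode('john.doe'));
$var->appendChild($string);

my $xml = $root->toString;
print "$xml\n";
```

The remaining var elements would presumably follow the same three steps, for instance in a loop over name/value pairs.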
Re: Problem printing double quotes in XML using XML::DOM
Dear Shlomi,

I tried to install https://metacpan.org/release/XML-LibXML in my Windows setup (using Strawberry Perl 5.18.1 on Windows 7) but it failed. Please suggest where I am going wrong.

Here is the output from the console (command prompt):

    C:\demo>cpanm XML::LibXML
    --> Working on XML::LibXML
    Fetching http://www.cpan.org/authors/id/S/SH/SHLOMIF/XML-LibXML-2.0107.tar.gz ... OK
    Configuring XML-LibXML-2.0107 ... N/A
    ! Configure failed for XML-LibXML-2.0107. See C:\Users\SHAJIK~1\.cpanm\work\1385878181.5848\build.log for details.

Here is the content of the build.log file:

[build.log]
cpanm (App::cpanminus) 1.7001 on perl 5.018001 built for MSWin32-x64-multi-thread
Work directory is C:\Users\SHAJIK~1/.cpanm/work/1385878277.6120
You have make G:\strawberry\c\bin\dmake.exe
You have LWP 6.05
Falling back to Archive::Tar 1.92
Searching XML::LibXML on cpanmetadb ...
--> Working on XML::LibXML
Fetching http://www.cpan.org/authors/id/S/SH/SHLOMIF/XML-LibXML-2.0107.tar.gz
-> OK
Unpacking XML-LibXML-2.0107.tar.gz
Entering XML-LibXML-2.0107
Checking configure dependencies from META.json
Checking if you have ExtUtils::MakeMaker 0 ... Yes (6.72)
Configuring XML-LibXML-2.0107
Running Makefile.PL
enable native perl UTF8
Checking for ability to link against xml2...no
Checking for ability to link against libxml2...libxml2, zlib, and/or the Math library (-lm) have not been found.
Try setting LIBS and INC values on the command line
Or get libxml2 from http://xmlsoft.org/
If you install via RPMs, make sure you also install the -devel RPMs, as this is where the headers (.h files) are.
Also, you may try to run perl Makefile.PL with the DEBUG=1 parameter to see the exact reason why the detection of libxml2 installation failed or why Makefile.PL was not able to compile a test program.
-> N/A
-> FAIL Configure failed for XML-LibXML-2.0107. See C:\Users\SHAJIK~1\.cpanm\work\1385878277.6120\build.log for details.
[/build.log]

Please help.

Sincerely,
Shaji
--- Your talent is God's gift to you. What you do with it is your gift back to God. ---

On Saturday, 30 November 2013 12:07 PM, Shaji Kalidasan <shajiin...@yahoo.com> wrote:

[ ... ]
Re: Problem printing double quotes in XML using XML::DOM
On Sun, Dec 1, 2013 at 7:28 AM, Shaji Kalidasan <shajiin...@yahoo.com> wrote:

> I tried to install https://metacpan.org/release/XML-LibXML in my windows setup (using Strawberry Perl 5.18.1 on Windows 7) but it failed.
> [ ... ]
> Checking for ability to link against libxml2...libxml2, zlib, and/or the Math library (-lm) have not been found.
> Try setting LIBS and INC values on the command line
> Or get libxml2 from http://xmlsoft.org/
> [ ... ]
> Please help.

That build log has some suggestions. Have you tried them?
Re: Problem printing double quotes in XML using XML::DOM
Hello Shaji,

On Fri, 29 Nov 2013 13:32:49 +0800 (SGT) Shaji Kalidasan <shajiin...@yahoo.com> wrote:

> Dear Perlers,
>
> I am trying to print double quotes in the output (output.xml) but it is printing &quot; instead of ". How can I include double quotes in the output. Please help.

In XML, «&quot;» is an XML entity (see https://en.wikipedia.org/wiki/Character_entity_reference ) which is the same as giving double quotes - «"» - and is often required if specified as the value of a double quotes-enclosed attribute (e.g: «<mytag myattr="Hello &quot;Foo" />»), so you should not worry about it being output instead.

Furthermore, I should note that according to http://cpanratings.perl.org/dist/XML-DOM , XML::DOM has been under-maintained and largely superseded by https://metacpan.org/release/XML-LibXML (which I should note that I currently maintain).

Regards,

Shlomi Fish

--
Shlomi Fish http://www.shlomifish.org/
Beginners Site for the Vim text editor - http://vim.begin-site.org/

Only two things are infinite: the universe, and Chuck Norris’s destruction ability. And we cannot be sure about the former thanks to the latter.
-- http://www.shlomifish.org/humour/bits/facts/Chuck-Norris/

Please reply to list if it's a mailing list post - http://shlom.in/reply .
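To illustrate the point about the entity, here is a small round-trip sketch. It uses XML::LibXML rather than XML::DOM, and the element name, attribute name, and value are made up for the example: a double quote inside an attribute value is written out in escaped form and comes back as a plain double quote when the document is parsed again.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML::Document->new('1.0', 'UTF-8');
my $el  = $doc->createElement('mytag');
$doc->setDocumentElement($el);

# The attribute value itself contains double quotes.
my $value = 'Hello "Foo"';
$el->setAttribute('myattr', $value);

# The serializer escapes the quotes inside the attribute value ...
my $xml = $doc->toString;
print $xml;

# ... and parsing the result gives back the original, unescaped value.
my $parsed     = XML::LibXML->load_xml(string => $xml);
my $round_trip = $parsed->documentElement->getAttribute('myattr');
print "$round_trip\n";
```

Any consumer that reads the file with an XML parser should therefore see the same value either way.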
Re: Problem printing double quotes in XML using XML::DOM
Dear Shlomi,

I want the XML output to include double quotes instead of &quot;.

Example: in the following code snippet

    $test_method->setAttribute("name","\"$count.$attribute_act\" duration-ms=\"0\" started-at=\"0\"");

I want the output (XML) to be

    duration-ms="0" started-at="0"

One more issue: how can I include a newline after each end element so that the XML output is formatted? Please explain how I can do it.

Thank you.

best,
Shaji
--- Your talent is God's gift to you. What you do with it is your gift back to God. ---

On Friday, 29 November 2013 1:53 PM, Shlomi Fish <shlo...@shlomifish.org> wrote:

[ ... ]
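A hedged sketch of how the desired output might be reached: each attribute gets its own setAttribute call, so no quotes need to be embedded in a value, and the serializer is asked to indent. This uses XML::LibXML (the module suggested elsewhere in this thread) instead of XML::DOM; the values of $count and $attribute_act are sample data.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc  = XML::LibXML::Document->new('1.0', 'UTF-8');
my $test = $doc->createElement('test');
$doc->setDocumentElement($test);

my ($count, $attribute_act) = (1, 'NE_Login');   # sample values

# One setAttribute call per attribute; the serializer adds the quotes.
my $method = $doc->createElement('test-method');
$method->setAttribute('name',        "$count.$attribute_act");
$method->setAttribute('duration-ms', '0');
$method->setAttribute('started-at',  '0');
$test->appendChild($method);

# A format argument of 1 asks for indentation and newlines.
my $xml = $doc->toString(1);
print $xml;
```

XML::DOM elements accept separate setAttribute calls in the same way, so the first half of the idea should carry over to the original script unchanged.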
Problem printing double quotes in XML using XML::DOM
Dear Perlers,

I am trying to print double quotes in the output (output.xml) but it is printing &quot; instead of ". How can I include double quotes in the output. Please help.

Here is the code

[code]
#!/usr/bin/perl
use strict;
use warnings;
use XML::DOM;

my $parser = XML::DOM::Parser->new();
my $doc = $parser->parsefile("input.xml");

# create an XML::DOM::Document object
my $new_doc = XML::DOM::Document->new();

# create the root element of the document
my $root = $new_doc->createElement('testng-results');
my $suite = $new_doc->createElement('suite');
$suite->setAttribute("name","Suite1");
my $test = $new_doc->createElement('test');
$test->setAttribute("name", "EML_CLOG_TSS320_3_6");
my $class = $new_doc->createElement('class');
$class->setAttribute("name","1350OMS_9.6.5.0AD1_UX+IRP619UX.EML_CLOG_TSS320_3_6");
$root->appendChild($suite);
$suite->appendChild($test);
$test->appendChild($class);

my $count = 0;
foreach my $node ($doc->getElementsByTagName('log')) {
    $count++; #Counter which indicates the record number

    #Getting the value of attribute nodes
    my $attribute_act = $node->getAttributeNode("act")->getValue();
    #Filling the empty space with underscore
    $attribute_act =~ tr/ /_/;

    #Getting the value of elements
    my $app = $node->getElementsByTagName('app')->item(0)->getFirstChild->getNodeValue;
    my $subsys = $node->getElementsByTagName('subsys')->item(0)->getFirstChild->getNodeValue;
    my $host = $node->getElementsByTagName('host')->item(0)->getFirstChild->getNodeValue;
    my $operator = $node->getElementsByTagName('operator')->item(0)->getFirstChild->getNodeValue;
    my $time = $node->getElementsByTagName('time')->item(0)->getFirstChild->getNodeValue;
    $time =~ tr/ /_/;
    my $status = $node->getElementsByTagName('status')->item(0)->getFirstChild->getNodeValue;
    #my $userLabel = $node->getElementsByTagName('userLabel')->item(0)->getFirstChild->getNodeValue;
    my $involved_object = $node->getElementsByTagName('involved_object')->item(0)->getFirstChild->getNodeValue;
    #my $command = $node->getElementsByTagName('command')->item(0)->getFirstChild->getNodeValue;
    my $client_host = $node->getElementsByTagName('client_host')->item(0)->getFirstChild->getNodeValue;

    #Printing the information to screen
    print '*' x 40, "\n";
    print "app : $app\n";
    print "act : $attribute_act\n";
    print "subsys : $subsys\n";
    print "host : $host\n";
    print "operator : $operator\n";
    print "time : $time\n";
    print "status : $status\n";
    #print "userLabel : $userLabel\n";
    print "involved_object : $involved_object\n";
    #print "command : $command\n";
    print "client host : $client_host\n";
    print '*' x 40, "\n";

    #create a new test-method element
    my $test_method = $new_doc->createElement('test-method');
    if($status eq 'SUCCESS') {
        $test_method->setAttribute("status","PASS");
    } else {
        $test_method->setAttribute("status","FAIL");
    }
    $test_method->setAttribute("signature","=$count)$involved_object" . "_" . $attribute_act . "(app=$app,subsys=$subsys,host=$host,operator=$operator,time=$time,status=$status,involved_object=$involved_object,client_host=$client_host);\n");
    $test_method->setAttribute("name","\"$count.$attribute_act\" duration-ms=\"0\" started-at=\"0\"");
    $test_method->setAttribute("description","$attribute_act(log_act=$attribute_act,app=$app,subsys=$subsys,host=$host,operator=$operator,time=$time,status=$status,involved_object=$involved_object,client_host=$client_host);\" finished-at=\"0\"");
    $test->appendChild($test_method);
}

$root->printToFile("output.xml");
[/code]

[input.xml]
<logger>
<log act="NE Login">
<app>ZIC</app>
<seq>0</seq>
<subsys>1850TSS320ZIC</subsys>
<host>sindhu</host>
<operator>CHANDRA</operator>
<time>2013-11-19 16:17:44</time>
<utc_time>2013-11-19 10:47:44</utc_time>
<status>SUCCESS</status>
<userLabel>TSS320-BA-04</userLabel>
<involved_object>TSS320-BA-04</involved_object>
<request_id>KWeCoZClHWuE7iJAnYGpEQ</request_id>
<command>ACT-USER</command>
<clog_private_db_insertion_flag>1</clog_private_db_insertion_flag>
<client_host>135.244.1.157</client_host>
<username>CHANDRA</username>
<params>::CHANDRA:V25537::*:HOSTIP=135-244-1-157</params>
</log>
<log act="AddCard">
<app>ZIC</app>
<seq>1</seq>
<subsys>1850TSS320ZIC</subsys>
<host>sindhu</host>
<operator>CHANDRA</operator>
<time>2013-11-19 16:17:44</time>
<utc_time>2013-11-19 10:47:44</utc_time>
<status>FAILED</status>
<involved_object>TSS320-BA-04</involved_object>
<request_id>WLAlMy4XMt9dysWL0xFM9g</request_id>
<slotList>
<slot>
<slotLabel>XFP-1-1-8-1</slotLabel>
</slot>
</slotList>
<clog_private_db_insertion_flag>1</clog_private_db_insertion_flag>
<client_host>135.244.1.157</client_host>
<username>CHANDRA</username>
<neList>
<ne>
<neLabel>TSS320-BA-04</neLabel>
</ne>
</neList>
<type>XI-641</type>
</log>
<log act="CreateXC">
<app>ZIC</app>
<seq>3</seq>
<subsys>1850TSS320ZIC</subsys>
<host>sindhu</host>
<operator>CHANDRA</operator>
<time>2013-11-19 16:17:44</time>
<utc_time>2013-11-19 10:47:44</utc_time>
<status>SUCCESS</status>
<involved_object>TSS320-BA
Re: Handling special characters in peoples names in XML
From: Gregory Machin <g...@linuxpro.co.za>

> Thanks Terry for responding.
>
> The files are very big and contain data I'd prefer not to be out in the wild. What parts of the file would be helpful? I can provide the lines with the text and, say, the header part of the XML?
>
> Thanks
> G

Yep, that should be enough. The first line or two and a bit with the text, as attached files. Just make sure you do not change the encoding in the process. You may check that with a hex editor.

Jenda
= je...@krynicky.cz === http://Jenda.Krynicky.cz =
When it comes to wine, women and song, wizards are allowed to get drunk and croon as much as they like.
-- Terry Pratchett in Sourcery
Re: Handling special characters in peoples names in XML
Thanks Terry for responding.

The files are very big and contain data I'd prefer not to be out in the wild. What parts of the file would be helpful? I can provide the lines with the text and, say, the header part of the XML?

Thanks
G

On Thu, Jun 20, 2013 at 7:42 PM, Jenda Krynicky <je...@krynicky.cz> wrote:

> [ ... ]
> Looks like the data already is utf8, but the header of the XML specifies otherwise. How do you parse the data? Can you give us a short example file?
>
> Jenda
Re: Handling special characters in peoples names in XML
On Wed, 26 Jun 2013 12:36:01 +1200, Gregory Machin wrote:

> Looks like the data already is utf8, but the header of the XML specifies otherwise. How do you parse the data? Can you give us a short example file?
>
> Jenda

This is a bit of code I adapt to whichever encoding I require.

    use open ":encoding(UTF-16le)";
    while( <> ) {
        s/\x{FF}\x{FE}|\x{}//;   # Remove BOM.
        s/[\x0A\x0D]+$//;        # Remove CR LF

If you can get the data into a text editor which has a convert option, you can use it to either find out the encoding and/or change it to utf8. If you have a file with mixed encodings, you have my sympathies.

--
Peter Gordon, pete...@netspace.net.au on 06/26/2013
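A self-contained variant of the same idea, as a sketch only: it reads from an in-memory buffer instead of a file so that it can be run as is, and it assumes the byte order mark in question is U+FEFF, which is what a UTF-16LE BOM decodes to. The sample data and the variable names are invented.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical input: UTF-16LE bytes with a leading BOM and a CR LF line ending.
my $raw = encode('UTF-16LE', "\x{FEFF}Gr\x{e9}goire\r\n");

# Decode while reading by putting an encoding layer on the handle.
open my $in, '<:encoding(UTF-16LE)', \$raw or die "Cannot open buffer: $!";

my @lines;
while (my $line = <$in>) {
    $line =~ s/^\x{FEFF}//;       # drop the decoded BOM, if present
    $line =~ s/[\x0A\x0D]+$//;    # drop CR and LF
    push @lines, $line;
}
close $in;

binmode STDOUT, ':encoding(UTF-8)';
print "$_\n" for @lines;
```

For a real file the reference to $raw would be replaced by the file name, and the encoding name by whatever the file actually uses.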
Re: Handling special characters in peoples names in XML
On Wed, Jun 26, 2013 at 7:45 AM, Peter Gordon <pete...@netspace.net.au> wrote:

> [ ... ]
> If you can get the data into a text editor which has a convert option, you can use it to either find out the encoding and/or change it to utf8. If you have a file with mixed encodings, you have my sympathies.

Encode::Guess may occasionally be useful:

    use Encode::Guess;
    my $decoder = Encode::Guess->guess("Grégoire");
    die $decoder unless ref $decoder;   # on failure guess() returns an error string
    print $decoder->name;               #---> utf8

--
Charles DeRykus
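A slightly longer sketch of Encode::Guess, using explicit byte strings so the result does not depend on how the script file itself is saved. The two sample strings are assumptions chosen for illustration; note that Latin-1 is only considered when it is named as an extra suspect.

```perl
use strict;
use warnings;
use Encode::Guess;

my $utf8_bytes   = "Gr\xC3\xA9goire";   # e-acute as two UTF-8 bytes
my $latin1_bytes = "Gr\xE9goire";       # e-acute as one Latin-1 byte

# With the default suspects (ascii, utf8) the UTF-8 bytes are recognised.
my $guess_utf8 = guess_encoding($utf8_bytes);

# Latin-1 has to be passed as an additional suspect.
my $guess_latin1 = guess_encoding($latin1_bytes, 'latin1');

for my $guess ($guess_utf8, $guess_latin1) {
    # On failure guess_encoding returns an error string, not an object.
    print ref($guess) ? $guess->name : "failed: $guess", "\n";
}
```

Guessing is heuristic: many byte sequences are valid in several single-byte encodings, so the result is best treated as a hint rather than proof.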
Re: Handling special characters in peoples names in XML
From: Gregory Machin <g...@linuxpro.co.za>

> I'm debugging an application written in Perl that converts data exported from the Nessus security scanner in xml format. I have narrowed down the bug to an issue with special characters in names that are in the file such as Fr~A©d~A©ric and Gr~A©goire , thus ~A© are most likely the guilty parties. What is the best and most simple way to handle this ? From a quick google it looks like I should convert the file to UTF8 format , would this be correct ?
>
> Thanks
> Greg

Looks like the data already is utf8, but the header of the XML specifies otherwise. How do you parse the data? Can you give us a short example file?

Jenda
= je...@krynicky.cz === http://Jenda.Krynicky.cz =
When it comes to wine, women and song, wizards are allowed to get drunk and croon as much as they like.
-- Terry Pratchett in Sourcery
Handling special characters in peoples names in XML
Hi. I'm debugging an application written in Perl that converts data exported from the Nessus security scanner in XML format. I have narrowed down the bug to an issue with special characters in names in the file, such as Frédéric and Grégoire, so the é characters are most likely the guilty parties. What is the best and simplest way to handle this? From a quick Google search it looks like I should convert the file to UTF-8; would this be correct? Thanks, Greg
Re: xml find text with wildcard
On Sat, Nov 10, 2012 at 05:33:31PM +, shawn wilson wrote:

[ ... ]
> my $xml_data = <<XML;
> <?xml version="1.0" encoding="UTF-8"?>
> <TEST xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="null.xsd">
>   <A>
>     <B>Find Me</B>
>     <C>Some Data</C>
>   </A>
>   <A>
>     <B>Leave Me Alone</B>
>     <C>Unimportant Data</C>
>   </A>
>   <A>
>     <B>Find Me!</B>
>     <C>Some More Data</C>
>   </A>
> </TEST>
> XML
[ ... ]
> my $nodes = $doc->findnodes("//*[text()]");
> my $i = 1;
> foreach my $node (@$nodes) {
>     print $i++ . " [" . $node->textContent . "]\n" if $node->textContent =~ 'Find Me';
> }

From the docs:

    textContent
        $content = $node->textContent;
    this function returns the content of all text nodes in the descendants of the given node as specified in DOM.

I guess your XPath is wrong. I've changed it to '//*/text()' and the result is the following:

    1 [Find Me]
    2 [Find Me!]

Cheers,
Gerhard
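As a possible alternative, the filtering can be pushed into the XPath expression itself with the XPath 1.0 contains() function, so that only elements whose own text nodes contain the string are returned. A sketch, using the sample document from the question with the xsi attributes left out for brevity:

```perl
use strict;
use warnings;
use XML::LibXML;

my $xml_data = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<TEST>
  <A><B>Find Me</B><C>Some Data</C></A>
  <A><B>Leave Me Alone</B><C>Unimportant Data</C></A>
  <A><B>Find Me!</B><C>Some More Data</C></A>
</TEST>
XML

my $doc = XML::LibXML->load_xml(string => $xml_data);

# Select elements that have a text node of their own containing the string.
my @hits = $doc->findnodes('//*[text()[contains(., "Find Me")]]');

my $i = 1;
for my $node (@hits) {
    print $i++ . " [" . $node->textContent . "]\n";
}
```

Because the match is made against each element's own text nodes, the enclosing A and TEST elements, whose direct text children are whitespace only, are not selected.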
xml find text with wildcard
what is the best way to find nodes containing a certain string (in this example: /Find Me/) ? in the example below, i'm getting too much output - i'd only expect 2 and 4. i had read there's a way of doing this with a newer LibXML, but i was unable to get it to work nor find documentation to suggest the perl module supports the newer xpath features.

also, if there's a way to give LibXML a subref to parse with and i could do something like:

    my $xparse = sub { return if $_->textContent =~ /Find Me/; };

maybe i could do it that way?

# CODE

    #!/usr/bin/perl
    use strict;
    use warnings;
    #use Carp::Always;
    use Data::Dumper;
    use XML::LibXML;

    my $xml_data = <<XML;
    <?xml version="1.0" encoding="UTF-8"?>
    <TEST xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="null.xsd">
      <A>
        <B>Find Me</B>
        <C>Some Data</C>
      </A>
      <A>
        <B>Leave Me Alone</B>
        <C>Unimportant Data</C>
      </A>
      <A>
        <B>Find Me!</B>
        <C>Some More Data</C>
      </A>
    </TEST>
    XML

    my $parser = XML::LibXML->new();
    my $doc = $parser->parse_string($xml_data);
    my $nodes = $doc->findnodes("//*[text()]");
    my $i = 1;
    foreach my $node (@$nodes) {
        print $i++ . " [" . $node->textContent . "]\n" if $node->textContent =~ 'Find Me';
    }

# OUTPUT

    1 [ Find Me Some Data Leave Me Alone Unimportant Data Find Me! Some More Data ]
    2 [ Find Me Some Data ]
    3 [Find Me]
    4 [ Find Me! Some More Data ]
    5 [Find Me!]
Re: Fast XML parser?
From: "Jenda Krynicky" <je...@krynicky.cz>

> From: "Octavian Rasnita" <orasn...@gmail.com>
> To: beginners@perl.org
> Subject: Fast XML parser?
> Date sent: Thu, 25 Oct 2012 14:33:15 +0300
>
> > Hi,
> >
> > Can you recommend an XML parser which is faster than XML::Twig?
> >
> > I need to use an XML parser that can parse the XML files chunk by chunk and which works faster (much faster) than XML::Twig, because I tried using this module but it is very slow.
> >
> > I tried something like the code below, but I have also tried a version that just opens the file and parses it using regular expressions, however the unelegant regexp version is 25 times faster than the one which uses XML::Twig, and it also uses less memory.
> >
> > If you think there is a module for parsing XML which would work faster than regular expressions, or if I can substantially improve the program which uses XML::Twig then please tell me about it. If regexp will still be faster, I will use regexp.
>
> You did not specify what you want to do with the lexemes; anyway, you might try something like this:
>
>     use strict;
>     use XML::Rules;
>     use Data::Dumper;
>
>     my $parser = XML::Rules->new(
>         stripspaces => 7,
>         rules => {
>             _default => 'content',
>             InflectedForm => 'as array',
>             Lexem => sub {
>                 #print Dumper($_[1]);
>                 print "$_[1]->{Form}\n";
>                 foreach (@{$_[1]->{InflectedForm}}) {
>                     print "$_->{InflectionId}: $_->{Form}\n";
>                 }
>             },
>         }
>     );
>     $parser->parse(\*DATA);
>
>     __DATA__
>     <?xml version="1.0" encoding="UTF-8"?>
>     <Lexems>
>     <Lexem id="1">
>     ...
>
> XML::Rules sits on top of XML::Parser::Expat so I would not expect this to be 25 times faster than XML::Twig, but it might be a bit quicker. Or not.
>
> Jenda

Hi Jenda,

I tried your program above, modified as below, but it gives the error:

    Free to wrong pool 3967d8 not 20202020 at e:/usr/lib/XML/Parser/Expat.pm line 470.

I was able to install XML::Rules under Windows using cpanm with no problems, so it should be working...

The program:

    use strict;
    use XML::Rules;
    use Data::Dumper;

    my $parser = XML::Rules->new(
        stripspaces => 7,
        rules => {
            _default => 'content',
            InflectedForm => 'as array',
            Lexem => sub {
                #print Dumper($_[1]);
                #print "$_[1]->{Form}\n";
                foreach (@{$_[1]->{InflectedForm}}) {
                    #print "$_->{InflectionId}: $_->{Form}\n";
                }
            },
        }
    );

    my $file = '/path/to/file.xml';
    open my $xml, '<:utf8', $file or die "Cannot open $file: $!";
    $parser->parse( $xml );

Thanks.

Octavian
Re: Fast XML parser?
From: "Jenda Krynicky" <je...@krynicky.cz>

> [ ... ]
> XML::Rules sits on top of XML::Parser::Expat so I would not expect this to be 25 times faster than XML::Twig, but it might be a bit quicker. Or not.
>
> Jenda

I forgot to say that the script I previously sent to the list also crashed Perl and it popped up an error window with:

    perl.exe - Application Error
    The instruction at 0x7c910f20 referenced memory at 0x0004. The memory could not be read.
    Click on OK to terminate the program

I have created a smaller XML file with only ~ 100 lines and I ran that script again, and it worked fine. But it doesn't work with the entire XML file, which is more than 200 MB, because it crashes Perl and I don't know why.

And, strangely, I've seen that now it just crashes Perl, but it doesn't return that "Free to wrong pool" error.

Octavian
Re: Fast XML parser?
From: Octavian Rasnita orasn...@gmail.com
> I forgot to say that the script I previously sent to the list also crashed Perl and it popped an error window with:
>
>     perl.exe - Application Error
>     The instruction at 0x7c910f20 referenced memory at 0x0004. The memory could not be read.
>     Click on OK to terminate the program
>
> I have created a smaller XML file with only ~ 100 lines and I ran again that script, and it worked fine. But it doesn't work with the entire xml file which has more than 200 MB, because it crashes Perl and I don't know why.
>
> And strange, but I've seen that now it just crashes Perl, but it doesn't return that "Free to wrong pool" error.
>
> Octavian

That must be something either within your perl or the XML::Parser::Expat. What versions of those two do you have? Any chance you could update?

Jenda
= je...@krynicky.cz === http://Jenda.Krynicky.cz =
When it comes to wine, women and song, wizards are allowed to get drunk and croon as much as they like.
    -- Terry Pratchett in Sourcery
Re: Fast XML parser?
On Wed, Oct 31, 2012 at 5:39 PM, Jenda Krynicky je...@krynicky.cz wrote:
> From: Octavian Rasnita orasn...@gmail.com
> > I forgot to say that the script I previously sent to the list also crashed Perl and it popped an error window with:
> >
> >     perl.exe - Application Error
> >     The instruction at 0x7c910f20 referenced memory at 0x0004. The memory could not be read.
> >     Click on OK to terminate the program
> >
> > I have created a smaller XML file with only ~ 100 lines and I ran again that script, and it worked fine. But it doesn't work with the entire xml file which has more than 200 MB, because it crashes Perl and I don't know why.
> >
> > And strange, but I've seen that now it just crashes Perl, but it doesn't return that "Free to wrong pool" error.
> >
> > Octavian
>
> That must be something either within your perl or the XML::Parser::Expat. What versions of those two do you have? Any chance you could update?
>
> Jenda
> = je...@krynicky.cz === http://Jenda.Krynicky.cz =
> When it comes to wine, women and song, wizards are allowed to get drunk and croon as much as they like.
>     -- Terry Pratchett in Sourcery

The memory issue is really an issue of the module itself. I have had those problems as well: the more complex the XML structure, the more memory it takes up and the faster you will run out. I simply moved on to other modules, as I could not afford to spend my time trying to figure out a workaround.

Regards,
Rob Coops
Re: Fast XML parser?
From: Jenda Krynicky je...@krynicky.cz
> From: Octavian Rasnita orasn...@gmail.com
> > I forgot to say that the script I previously sent to the list also crashed Perl and it popped an error window with:
> >
> >     perl.exe - Application Error
> >     The instruction at 0x7c910f20 referenced memory at 0x0004. The memory could not be read.
> >     Click on OK to terminate the program
> >
> > I have created a smaller XML file with only ~ 100 lines and I ran again that script, and it worked fine. But it doesn't work with the entire xml file which has more than 200 MB, because it crashes Perl and I don't know why.
> >
> > And strange, but I've seen that now it just crashes Perl, but it doesn't return that "Free to wrong pool" error.
> >
> > Octavian
>
> That must be something either within your perl or the XML::Parser::Expat. What versions of those two do you have? Any chance you could update?
>
> Jenda

    perl -v
    This is perl 5, version 14, subversion 2 (v5.14.2) built for MSWin32-x86-multi-thread
    (with 1 registered patch, see perl -V for more detail)
    Copyright 1987-2011, Larry Wall
    Binary build 1402 [295342] provided by ActiveState http://www.ActiveState.com
    Built Oct 7 2011 15:49:44
    ...

    cpanm XML::Parser::Expat
    Set up gcc environment - 3.4.5 (mingw-vista special r3)
    XML::Parser::Expat is up to date. (2.41)

I think Perl is also new enough...

Anyway, I solved the problem by parsing the XML content using regular expressions, and it works very fast this way. And the regexp solution is no uglier or harder to maintain than using an XML parser...

Octavian
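The regexp version itself was never posted to the list, so the following is only a guess at its general shape: a minimal sketch that reads the file one `</Lexem>`-terminated chunk at a time by changing the input record separator. The helper names `parse_lexem_chunk` and `read_lexems` are hypothetical, and the patterns assume the flat, regular layout of the sample file shown elsewhere in this thread (no attributes on child elements, no CDATA, no comments). Entities such as `&#039;` are left undecoded. It is not a general XML parser.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one "...<Lexem id=...>...</Lexem>" chunk into a hash.
# Returns nothing if the chunk holds no <Lexem> start tag.
sub parse_lexem_chunk {
    my ($chunk) = @_;
    my ($id) = $chunk =~ /<Lexem\s+id="([^"]*)"/ or return;
    my ($timestamp) = $chunk =~ m{<Timestamp>([^<]*)</Timestamp>};
    # The <Form> that directly follows <Timestamp> belongs to the Lexem itself.
    my ($form) = $chunk =~ m{</Timestamp>\s*<Form>([^<]*)</Form>};
    my @inflected;
    while ( $chunk =~ m{<InflectedForm>\s*<InflectionId>([^<]*)</InflectionId>\s*<Form>([^<]*)</Form>}g ) {
        push @inflected, { InflectionId => $1, Form => $2 };
    }
    return { id => $id, Timestamp => $timestamp, Form => $form, InflectedForm => \@inflected };
}

# Hypothetical helper: read a filehandle chunk by chunk and call $callback per Lexem.
sub read_lexems {
    my ( $fh, $callback ) = @_;
    local $/ = '</Lexem>';    # each readline now ends at a closing Lexem tag
    while ( my $chunk = <$fh> ) {
        my $lexem = parse_lexem_chunk($chunk) or next;
        $callback->($lexem);
    }
    return;
}

# Demo on an in-memory filehandle; a real run would open the XML file instead.
my $demo_xml = <<'XML';
<Lexems>
<Lexem id="1">
<Timestamp>1346826989</Timestamp>
<Form>aa</Form>
<InflectedForm>
<InflectionId>84</InflectionId>
<Form>aa</Form>
</InflectedForm>
</Lexem>
</Lexems>
XML
open my $demo_fh, '<', \$demo_xml or die "Cannot open in-memory file: $!";
read_lexems( $demo_fh, sub { print "$_[0]{id}: $_[0]{Form}\n" } );
close $demo_fh;
```

Only one chunk is held in memory at a time, which is probably where the low memory use comes from; the price is that any change in the file layout silently breaks the patterns.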
Re: Fast XML parser?
From: Octavian Rasnita orasn...@gmail.com
To: beginners@perl.org
Subject: Fast XML parser?
Date sent: Thu, 25 Oct 2012 14:33:15 +0300

> Hi,
>
> Can you recommend an XML parser which is faster than XML::Twig?
>
> I need to use an XML parser that can parse the XML files chunk by chunk and which works faster (much faster) than XML::Twig, because I tried using this module but it is very slow.
>
> I tried something like the code below, but I have also tried a version that just opens the file and parses it using regular expressions, however the inelegant regexp version is 25 times faster than the one which uses XML::Twig, and it also uses less memory.
>
> If you think there is a module for parsing XML which would work faster than regular expressions, or if I can substantially improve the program which uses XML::Twig then please tell me about it. If regexp will still be faster, I will use regexp.

You did not specify what you want to do with the lexemes; anyway, you might try something like this:

    use strict;
    use XML::Rules;
    use Data::Dumper;

    my $parser = XML::Rules->new(
        stripspaces => 7,
        rules => {
            _default => 'content',
            InflectedForm => 'as array',
            Lexem => sub {
                #print Dumper($_[1]);
                print "$_[1]->{Form}\n";
                foreach (@{$_[1]->{InflectedForm}}) {
                    print "$_->{InflectionId}: $_->{Form}\n";
                }
            },
        }
    );

    $parser->parse(\*DATA);

    __DATA__
    <?xml version="1.0" encoding="UTF-8"?>
    <Lexems>
    <Lexem id="1">
    ...

XML::Rules sits on top of XML::Parser::Expat so I would not expect this to be 25 times faster than XML::Twig, but it might be a bit quicker. Or not.

Jenda
= je...@krynicky.cz === http://Jenda.Krynicky.cz =
When it comes to wine, women and song, wizards are allowed to get drunk and croon as much as they like.
    -- Terry Pratchett in Sourcery
Re: Fast XML parser?
Hi Octavian,

On Sun, 28 Oct 2012 17:45:15 +0200 Octavian Rasnita orasn...@gmail.com wrote:
> From: Shlomi Fish shlo...@shlomifish.org
> > Hi Octavian,
>
> Hi Shlomi,
>
> I tried to use XML::LibXML::Reader which uses the pull parser, and I read that:
>
> "However, it is also possible to mix Reader with DOM. At every point the user may copy the current node (optionally expanded into a complete sub-tree) from the processed document to another DOM tree, or to instruct the Reader to collect sub-document in form of a DOM tree"
>
> So I tried:
>
>     use XML::LibXML::Reader;
>
>     my $xml = 'path/to/xml/file.xml';
>
>     my $reader = XML::LibXML::Reader->new( location => $xml ) or die "cannot read $xml";
>
>     while ( $reader->nextElement( 'Lexem' ) ) {
>         my $id = $reader->getAttribute( 'id' ); #works fine
>
>         my $doc = $reader->document;
>
>         my $timestamp = $doc->getElementsByTagName( 'Timestamp' ); #Doesn't work well
>         my @lexem_text = $doc->getElementsByTagName( 'Form' ); #Doesn't work fine
>     }

I'm not sure you should do ->document. I cannot tell you off-hand how to do it right, but I can try to investigate when I have some spare cycles.

Regards,

Shlomi Fish

--
-
Shlomi Fish http://www.shlomifish.org/
Funny Anti-Terrorism Story - http://shlom.in/enemy

What does “IDK” stand for? I don’t know.

Please reply to list if it's a mailing list post - http://shlom.in/reply .
Re: Fast XML parser?
On Mon, 29 Oct 2012 10:09:53 +0200 Shlomi Fish shlo...@shlomifish.org wrote:
> Hi Octavian,
>
> On Sun, 28 Oct 2012 17:45:15 +0200 Octavian Rasnita orasn...@gmail.com wrote:
> > From: Shlomi Fish shlo...@shlomifish.org
> > > Hi Octavian,
> >
> > Hi Shlomi,
> >
> > I tried to use XML::LibXML::Reader which uses the pull parser, and I read that:
> >
> > "However, it is also possible to mix Reader with DOM. At every point the user may copy the current node (optionally expanded into a complete sub-tree) from the processed document to another DOM tree, or to instruct the Reader to collect sub-document in form of a DOM tree"
> >
> > So I tried:
> >
> >     use XML::LibXML::Reader;
> >
> >     my $xml = 'path/to/xml/file.xml';
> >
> >     my $reader = XML::LibXML::Reader->new( location => $xml ) or die "cannot read $xml";
> >
> >     while ( $reader->nextElement( 'Lexem' ) ) {
> >         my $id = $reader->getAttribute( 'id' ); #works fine
> >
> >         my $doc = $reader->document;
> >
> >         my $timestamp = $doc->getElementsByTagName( 'Timestamp' ); #Doesn't work well
> >         my @lexem_text = $doc->getElementsByTagName( 'Form' ); #Doesn't work fine
> >     }
>
> I'm not sure you should do ->document. I cannot tell you off-hand how to do it right, but I can try to investigate when I have some spare cycles.

OK, after a short amount of investigation, I found that this program works:

[CODE]
use strict;
use warnings;

use XML::LibXML::Reader;

my $xml = 'Lexems.xml';

my $reader = XML::LibXML::Reader->new( location => $xml ) or die "cannot read $xml";

while ( $reader->nextElement( 'Lexem' ) ) {
    my $id = $reader->getAttribute( 'id' ); #works fine

    my $doc = $reader->copyCurrentNode(1);

    my $timestamp = $doc->getElementsByTagName( 'Timestamp' );
    my @lexem_text = $doc->getElementsByTagName( 'Form' );
}
[/CODE]

Note that you can also use XPath for looking up XML information.

Regards,

Shlomi Fish

--
-
Shlomi Fish http://www.shlomifish.org/
List of Text Processing Tools - http://shlom.in/text-proc

Sophie: Let’s suppose you have a table with 2^n cups…
Jack: Wait a second! Is ‘n’ a natural number?

Please reply to list if it's a mailing list post - http://shlom.in/reply .
Re: Fast XML parser?
On Mon, Oct 29, 2012 at 9:18 AM, Shlomi Fish shlo...@shlomifish.org wrote:
> On Mon, 29 Oct 2012 10:09:53 +0200 Shlomi Fish shlo...@shlomifish.org wrote:
> > Hi Octavian,
> >
> > On Sun, 28 Oct 2012 17:45:15 +0200 Octavian Rasnita orasn...@gmail.com wrote:
> > > From: Shlomi Fish shlo...@shlomifish.org
> > > > Hi Octavian,
> > >
> > > Hi Shlomi,
> > >
> > > I tried to use XML::LibXML::Reader which uses the pull parser, and I read that:
> > >
> > > "However, it is also possible to mix Reader with DOM. At every point the user may copy the current node (optionally expanded into a complete sub-tree) from the processed document to another DOM tree, or to instruct the Reader to collect sub-document in form of a DOM tree"
> > >
> > > So I tried:
> > >
> > >     use XML::LibXML::Reader;
> > >
> > >     my $xml = 'path/to/xml/file.xml';
> > >
> > >     my $reader = XML::LibXML::Reader->new( location => $xml ) or die "cannot read $xml";
> > >
> > >     while ( $reader->nextElement( 'Lexem' ) ) {
> > >         my $id = $reader->getAttribute( 'id' ); #works fine
> > >
> > >         my $doc = $reader->document;
> > >
> > >         my $timestamp = $doc->getElementsByTagName( 'Timestamp' ); #Doesn't work well
> > >         my @lexem_text = $doc->getElementsByTagName( 'Form' ); #Doesn't work fine
> > >     }
> >
> > I'm not sure you should do ->document. I cannot tell you off-hand how to do it right, but I can try to investigate when I have some spare cycles.
>
> OK, after a short amount of investigation, I found that this program works:
>
> [CODE]
> use strict;
> use warnings;
>
> use XML::LibXML::Reader;
>
> my $xml = 'Lexems.xml';
>
> my $reader = XML::LibXML::Reader->new( location => $xml ) or die "cannot read $xml";
>
> while ( $reader->nextElement( 'Lexem' ) ) {
>     my $id = $reader->getAttribute( 'id' ); #works fine
>
>     my $doc = $reader->copyCurrentNode(1);
>
>     my $timestamp = $doc->getElementsByTagName( 'Timestamp' );
>     my @lexem_text = $doc->getElementsByTagName( 'Form' );
> }
> [/CODE]
>
> Note that you can also use XPath for looking up XML information.
>
> Regards,
>
> Shlomi Fish
>
> --
> -
> Shlomi Fish http://www.shlomifish.org/
> List of Text Processing Tools - http://shlom.in/text-proc
>
> Sophie: Let’s suppose you have a table with 2^n cups…
> Jack: Wait a second! Is ‘n’ a natural number?
>
> Please reply to list if it's a mailing list post - http://shlom.in/reply .

A little late I know but still...

Last year I was asked to process a large number of XML files (2x 1.6M files) that needed to be compared element by element and, allowing for some fuzzy logic, had to be the same. Things like floating point precision could change (1.00 = 1) and in some cases data could show up in a different order (repeating elements for multiple items on an order). The whole idea was that system A, which took flat text output from a mainframe and translated it to XML for consumption by a web service, was being replaced by system B, which did the same thing but on an entirely different software stack.

Of course this needed to go as fast as possible, as we simply could not sit around for a few days while the computer did its thing.

LibXML was my saviour and using XPath was the fastest solution. Though it is possible to do the DOM thing, you end up with the DOM being translated to XPath under the hood (at least the performance seemed to indicate that). After a lot of testing with pretty much every XML parser I could find, LibXML with XPath was really the fastest. If you are going for speed then you will want to avoid any copy operations you can, and you will want to use references as much as possible. Even though a memory copy of some 100 bytes is a very fast operation, over a few million files the little time it takes adds up to a lot longer than you would like.

When you are looking at speed, first and foremost try to avoid anything that would slow you down. A copy of information is slow, so don't do it if you can avoid it. A reference to a memory location is slightly harder to work with in programming but a lot faster. A translation from DOM to XPath would take you time to do; the computer needs the same time. If it is pure speed you are after, avoid this as well.

If you are sure you are as fast as you can be, add a benchmark to the code and try individual optimisations that might or might not be faster... you would be surprised how the perl internals are sometimes a lot faster with some operations than with others, even though intuitively you would not have expected this to be the case.

For my case, as it was a once-in-every-25-years kind of major change, I didn't do too much benchmarking, as the code would be discarded at the end of the project (well, stored in a dusty old SVN repository for others to reuse and, realistically, never to be looked at again). I got it to go fast enough
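The advice above about measuring copies against references can be tried with the core Benchmark module. This is only an illustrative sketch, not the code from that project: the data set and the two helper subs (`count_by_copy`, `count_by_ref`) are made up for the example, and real numbers depend heavily on the perl build and the data.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up data: 1000 small records, standing in for parsed XML items.
my @records = map { +{ id => $_, form => "form$_" } } 1 .. 1_000;

# Passing the list by value: the sub copies every element into @copy.
sub count_by_copy {
    my @copy = @_;
    return scalar @copy;
}

# Passing a reference: only one scalar is handed over, nothing is copied.
sub count_by_ref {
    my ($ref) = @_;
    return scalar @$ref;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    copy      => sub { count_by_copy(@records) },
    reference => sub { count_by_ref( \@records ) },
} );
```

The same pattern works for comparing two parsing strategies: wrap each one in a sub that does the same amount of work and let `cmpthese` report the relative rates.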
Re: Fast XML parser?
From: Rob Coops rco...@gmail.com
> On Mon, Oct 29, 2012 at 9:18 AM, Shlomi Fish shlo...@shlomifish.org wrote:
> > On Mon, 29 Oct 2012 10:09:53 +0200 Shlomi Fish shlo...@shlomifish.org wrote:
> > > Hi Octavian,
> > >
> > > On Sun, 28 Oct 2012 17:45:15 +0200 Octavian Rasnita orasn...@gmail.com wrote:
> > > > From: Shlomi Fish shlo...@shlomifish.org
> > > > > Hi Octavian,
> > > > ...
> >
> > OK, after a short amount of investigation, I found that this program works:
> >
> > [CODE]
> > use strict;
> > use warnings;
> >
> > use XML::LibXML::Reader;
> >
> > my $xml = 'Lexems.xml';
> >
> > my $reader = XML::LibXML::Reader->new( location => $xml ) or die "cannot read $xml";
> >
> > while ( $reader->nextElement( 'Lexem' ) ) {
> >     my $id = $reader->getAttribute( 'id' ); #works fine
> >
> >     my $doc = $reader->copyCurrentNode(1);
> >
> >     my $timestamp = $doc->getElementsByTagName( 'Timestamp' );
> >     my @lexem_text = $doc->getElementsByTagName( 'Form' );
> > }
> > [/CODE]
> >
> > Note that you can also use XPath for looking up XML information.
> >
> > Regards,
> >
> > Shlomi Fish
> >
> > --
> > -
> > Shlomi Fish http://www.shlomifish.org/
>
> A little late I know but still...

Unfortunately it is not so late. :-)

> LibXML was my saviour and using XPath was the fastest solution. Though it is possible to do the DOM thing you end up with the DOM being translated to XPath under the hood (at least the performance seemed to indicate that). After a lot of testing and using pretty much any XML parser I could find using LibXML and XPath was really the fastest. If you are going for speed then you will want to avoid any copy operations you can and you will want to as much as possible use references. Because even though a memory copy of some 100 bytes is a very fast operation on a few million files the little time it takes kind of adds up to a lot longer than you would like it to.

Can you give me, or point me to, some examples of using XPath with XML::LibXML?

I tried to use XML::XPath, but it tries to load the entire document into memory, so it is not a good way.

Octavian
Re: Fast XML parser?
From: Shlomi Fish shlo...@shlomifish.org
> On Mon, 29 Oct 2012 10:09:53 +0200 Shlomi Fish shlo...@shlomifish.org wrote:
> > Hi Octavian,
> >
> > On Sun, 28 Oct 2012 17:45:15 +0200 Octavian Rasnita orasn...@gmail.com wrote:
> > > From: Shlomi Fish shlo...@shlomifish.org
> > > > Hi Octavian,
> > >
> > > Hi Shlomi,
> > >
> > > I tried to use XML::LibXML::Reader which uses the pull parser, and I read that:
> > >
> > > "However, it is also possible to mix Reader with DOM. At every point the user may copy the current node (optionally expanded into a complete sub-tree) from the processed document to another DOM tree, or to instruct the Reader to collect sub-document in form of a DOM tree"
> > >
> > > So I tried:
> > >
> > >     use XML::LibXML::Reader;
> > >
> > >     my $xml = 'path/to/xml/file.xml';
> > >
> > >     my $reader = XML::LibXML::Reader->new( location => $xml ) or die "cannot read $xml";
> > >
> > >     while ( $reader->nextElement( 'Lexem' ) ) {
> > >         my $id = $reader->getAttribute( 'id' ); #works fine
> > >
> > >         my $doc = $reader->document;
> > >
> > >         my $timestamp = $doc->getElementsByTagName( 'Timestamp' ); #Doesn't work well
> > >         my @lexem_text = $doc->getElementsByTagName( 'Form' ); #Doesn't work fine
> > >     }
> >
> > I'm not sure you should do ->document. I cannot tell you off-hand how to do it right, but I can try to investigate when I have some spare cycles.
>
> OK, after a short amount of investigation, I found that this program works:
>
> [CODE]
> use strict;
> use warnings;
>
> use XML::LibXML::Reader;
>
> my $xml = 'Lexems.xml';
>
> my $reader = XML::LibXML::Reader->new( location => $xml ) or die "cannot read $xml";
>
> while ( $reader->nextElement( 'Lexem' ) ) {
>     my $id = $reader->getAttribute( 'id' ); #works fine
>
>     my $doc = $reader->copyCurrentNode(1);
>
>     my $timestamp = $doc->getElementsByTagName( 'Timestamp' );
>     my @lexem_text = $doc->getElementsByTagName( 'Form' );
> }
> [/CODE]
>
> Note that you can also use XPath for looking up XML information.
>
> Regards,
>
> Shlomi Fish
>
> --
> -
> Shlomi Fish http://www.shlomifish.org/

I followed the way you suggested, and it works fine, however it is very slow.

I've done:

    while ( $reader->nextElement( 'Lexem' ) ) {
        my $id = $reader->getAttribute( 'id' );
        my $doc = $reader->copyCurrentNode(1);
        my $timestamp = $doc->findnodes( 'Timestamp' );
        my $lexem_text = $doc->findnodes( 'Form' );

        my $inflected_forms = $doc->findnodes( 'InflectedForm' );

        for my $inflected_form ( $inflected_forms->get_nodelist ) {
            my $inflection_id = $inflected_form->findnodes( './InflectionId' );
            my $inflection_dia = $inflected_form->findnodes( './Form' );
        }
    }

I tried to find a way of using XPath but I couldn't find a good one, and it seems that copying that node takes a pretty long time.

Octavian
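One way to avoid the per-element copy is to stay at the Reader level and collect text as the events go by. The sketch below was not proposed in the thread; the helper name `walk_lexems` and the callback interface are hypothetical, and it assumes the simple structure of the sample file (no mixed content, no empty elements). Whether it is actually faster than `copyCurrentNode` would have to be measured on the real file.

```perl
use strict;
use warnings;
use XML::LibXML::Reader;    # exports the XML_READER_TYPE_* constants by default

# Hypothetical helper: walk reader events and hand one hash per <Lexem>
# to $callback, without building a DOM sub-tree for each element.
sub walk_lexems {
    my ( $reader, $callback ) = @_;
    my ( $lexem, $inflected, $element );
    while ( $reader->read ) {
        my $type = $reader->nodeType;
        my $name = $reader->name;
        if ( $type == XML_READER_TYPE_ELEMENT ) {
            $element = $name;
            if ( $name eq 'Lexem' ) {
                $lexem = { id => $reader->getAttribute('id'), InflectedForm => [] };
            }
            elsif ( $name eq 'InflectedForm' ) {
                $inflected = {};
            }
        }
        elsif ( $type == XML_READER_TYPE_TEXT && $lexem && defined $element ) {
            # Text inside <InflectedForm> belongs to that record,
            # anything else belongs to the Lexem itself.
            my $target = $inflected || $lexem;
            $target->{$element} .= $reader->value;
        }
        elsif ( $type == XML_READER_TYPE_END_ELEMENT ) {
            undef $element;
            if ( $name eq 'InflectedForm' && $lexem ) {
                push @{ $lexem->{InflectedForm} }, $inflected;
                undef $inflected;
            }
            elsif ( $name eq 'Lexem' ) {
                $callback->($lexem);
                undef $lexem;
            }
        }
    }
    return;
}

# Demo on a string; for the real file use: new( location => 'path/to/xml/file.xml' )
my $walk_demo = XML::LibXML::Reader->new(
    string => '<Lexems><Lexem id="1"><Timestamp>1346826989</Timestamp><Form>aa</Form>'
            . '<InflectedForm><InflectionId>84</InflectionId><Form>aa</Form></InflectedForm>'
            . '</Lexem></Lexems>'
) or die "cannot create reader";
walk_lexems( $walk_demo, sub { print "$_[0]{id}: $_[0]{Form}\n" } );
```

The trade-off is more bookkeeping in Perl in exchange for no DOM copy; the XPath convenience is lost.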
Re: Fast XML parser?
From: Shlomi Fish shlo...@shlomifish.org
> Hi Octavian,
>
> On Thu, 25 Oct 2012 14:33:15 +0300 Octavian Rasnita orasn...@gmail.com wrote:
> > Hi,
> >
> > Can you recommend an XML parser which is faster than XML::Twig?
> >
> > I need to use an XML parser that can parse the XML files chunk by chunk and which works faster (much faster) than XML::Twig, because I tried using this module but it is very slow.
>
> XML::LibXML contains several event-based parsers including the SAX parser and the pull-parser. Can you try using them?
>
> Regards,
>
> Shlomi Fish

Hi Shlomi,

I tried to use XML::LibXML::Reader, which uses the pull parser, and I read that:

"However, it is also possible to mix Reader with DOM. At every point the user may copy the current node (optionally expanded into a complete sub-tree) from the processed document to another DOM tree, or to instruct the Reader to collect sub-document in form of a DOM tree"

So I tried:

    use XML::LibXML::Reader;

    my $xml = 'path/to/xml/file.xml';

    my $reader = XML::LibXML::Reader->new( location => $xml ) or die "cannot read $xml";

    while ( $reader->nextElement( 'Lexem' ) ) {
        my $id = $reader->getAttribute( 'id' ); #works fine

        my $doc = $reader->document;

        my $timestamp = $doc->getElementsByTagName( 'Timestamp' ); #Doesn't work well
        my @lexem_text = $doc->getElementsByTagName( 'Form' ); #Doesn't work fine
    }

So I could get that attribute well, but I couldn't get the rest of the sub-elements: for example, when I printed the variable $timestamp, it sometimes printed its value two or three times together.

I couldn't find an example of using XML::LibXML to read the XML file element by element and then read each element's child elements directly.

The XML I want to parse looks like the one below. It is just much bigger.

I want to read each Lexem element one by one (and I've done this successfully), then read its id attribute (also done this well), but I also want to read its sub-elements, using something like:

    $reader->read_some_element('Form')

or

    $reader->{Form}

which should read just the Form element right below the Lexem element, but not the Form elements below InflectedForm, and then read the elements under the InflectedForm element using something like:

    $reader->read_another_element( '/InflectedForm/Form' )

or like

    $reader->{InflectedForm}{Form}

or using the $doc object...

I tried a lot of methods for reading the elements of the current Lexem element, but with no good results.

    <?xml version="1.0" encoding="UTF-8"?>
    <Lexems>
      <Lexem id="1">
        <Timestamp>1346826989</Timestamp>
        <Form>aa</Form>
        <InflectedForm>
          <InflectionId>84</InflectionId>
          <Form>aa</Form>
        </InflectedForm>
      </Lexem>
      <Lexem id="2">
        <Timestamp>1346826989</Timestamp>
        <Form>aaa</Form>
        <InflectedForm>
          <InflectionId>84</InflectionId>
          <Form>aaa</Form>
        </InflectedForm>
      </Lexem>
      <Lexem id="3">
        <Timestamp>1346826989</Timestamp>
        <Form>aaleni&#039;an</Form>
        <InflectedForm>
          <InflectionId>25</InflectionId>
          <Form>aaleni&#039;an</Form>
        </InflectedForm>
        <InflectedForm>
          <InflectionId>26</InflectionId>
          <Form>aaleni&#039;an</Form>
        </InflectedForm>
      </Lexem>
    </Lexems>

Thanks.

Octavian.
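The "Form directly below Lexem" versus "Form below InflectedForm" distinction maps naturally onto relative XPath expressions evaluated against the copied node. The following is a minimal sketch, assuming the sample structure above; the helper name `lexem_records` is hypothetical, and it relies on `copyCurrentNode(1)` as used elsewhere in this thread, so it carries the same copy cost.

```perl
use strict;
use warnings;
use XML::LibXML::Reader;

# Hypothetical helper: return one hash per <Lexem>, using relative XPath
# on a deep copy of the current node.
sub lexem_records {
    my ($reader) = @_;
    my @records;
    while ( $reader->nextElement('Lexem') ) {
        my $node = $reader->copyCurrentNode(1);    # deep copy of this Lexem
        push @records, {
            id        => $reader->getAttribute('id'),
            timestamp => $node->findvalue('./Timestamp'),
            # './Form' matches only a direct child of Lexem,
            # not the Form elements nested in InflectedForm.
            form      => $node->findvalue('./Form'),
            inflected => [
                map { +{
                    id   => $_->findvalue('./InflectionId'),
                    form => $_->findvalue('./Form'),
                } } $node->findnodes('./InflectedForm')
            ],
        };
    }
    return @records;
}

# Demo on a string; for the real file use: new( location => 'path/to/xml/file.xml' )
my $xpath_demo = XML::LibXML::Reader->new( string => <<'XML' ) or die "cannot create reader";
<?xml version="1.0" encoding="UTF-8"?>
<Lexems>
  <Lexem id="1">
    <Timestamp>1346826989</Timestamp>
    <Form>aa</Form>
    <InflectedForm><InflectionId>84</InflectionId><Form>aa</Form></InflectedForm>
  </Lexem>
</Lexems>
XML
for my $record ( lexem_records($xpath_demo) ) {
    print "$record->{id}: $record->{form} (", scalar @{ $record->{inflected} }, " inflected)\n";
}
```

`findvalue` returns the string value directly, which avoids the node-list stringification that can make a value appear several times when more than one node matches.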
Re: XML::Twig installation fails
Hi Bob,

On Thu, 25 Oct 2012 16:28:02 + Bob McConnell r...@cbord.com wrote:
> Can anyone tell me how to get this module installed on Win7? It is a requirement for ODF::lpOD. I get the following error message:
>
> --
> Checking if your kit is complete...
> Looks good
> Writing Makefile for XML::Twig
> malformed JSON string, neither array, object, number, string or atom, at character offset 0 (before "(end of string)") at Makefile.PL line 147.
> Warning: No success on command[C:\strawberry\perl\bin\perl.exe Makefile.PL]
> MIROD/XML-Twig-3.41.tar.gz
> C:\strawberry\perl\bin\perl.exe Makefile.PL -- NOT OK
> Running make test
> Make had some problems, won't test
> Running make install
> Make had some problems, won't install
> --

Seems like your ExtUtils::MakeMaker version is too low. In addition, perl-5.10.1 is very old, and should no longer be used. Please upgrade to perl-5.14.x or perl-5.16.x.

Regards,

Shlomi Fish

> This is Strawberry Perl, version information:
>
> --
> perl --version
>
> This is perl, v5.10.1 (*) built for MSWin32-x86-multi-thread
>
> Copyright 1987-2009, Larry Wall
>
> Perl may be copied only under the terms of either the Artistic License or the GNU General Public License, which may be found in the Perl 5 source kit.
>
> Complete documentation for Perl, including FAQ lists, should be found on this system using "man perl" or "perldoc perl". If you have access to the Internet, point your browser at http://www.perl.org/, the Perl Home Page.
>
> ver
>
> Microsoft Windows [Version 6.1.7601]
> --
>
> I found descriptions of this problem by searching on Google, but nothing to help me resolve it.
>
> Bob McConnell
>
> This message and any attachments are intended only for the use of the addressee and may contain information that is privileged and confidential. If the reader of this message is not the intended recipient or an authorized representative of the intended recipient, any dissemination of this communication is strictly prohibited. If you have received this communication in error, please notify the sender immediately by replying to this e-mail message, then delete this message and any attachments from your system.

--
-
Shlomi Fish http://www.shlomifish.org/
List of Networking Clients - http://shlom.in/net-clients

Java is a DSL (= Domain Specific Language) to transform big XML documents into long exception stack traces.
    — Scott Bellware

Please reply to list if it's a mailing list post - http://shlom.in/reply .
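Before upgrading anything, it can help to see which versions are actually installed. A small sketch using only core modules follows; the thread does not say which ExtUtils::MakeMaker version XML::Twig 3.41 needs, so no minimum is checked here.

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker ();    # load the module without importing anything

# Report the running perl and the installed ExtUtils::MakeMaker.
printf "perl %vd\n", $^V;
printf "ExtUtils::MakeMaker %s\n", ExtUtils::MakeMaker->VERSION;

# If the module turns out to be old, it can usually be upgraded from CPAN
# with the stock client, for example:
#   cpan ExtUtils::MakeMaker
```

Upgrading the perl itself, as suggested above, normally brings a newer ExtUtils::MakeMaker along with it.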
Fast XML parser?
Hi,

Can you recommend an XML parser which is faster than XML::Twig?

I need an XML parser that can parse XML files chunk by chunk and that works faster (much faster) than XML::Twig; I tried using this module, but it is very slow.

I tried something like the code below, but I have also tried a version that just opens the file and parses it using regular expressions. The inelegant regexp version is 25 times faster than the one which uses XML::Twig, and it also uses less memory.

If you think there is a module for parsing XML which would work faster than regular expressions, or if I can substantially improve the program which uses XML::Twig, then please tell me about it. If regexp is still faster, I will use regexp.

Thanks.

    use XML::Twig;

    my $xml = 'path/to/xml/file.xml';

    my $t = XML::Twig->new(
        twig_handlers => {
            Lexem => sub {
                my( $t, $lexem ) = @_;

                my $id = $lexem->att( 'id' );
                my $timestamp = $lexem->first_child( 'Timestamp' )->text;
                my $lexem_text = $lexem->first_child( 'Form' )->text;

                my @inflected_form = $lexem->children( 'InflectedForm' );

                for my $inflected_form ( @inflected_form ) {
                    my $inflection_id = $inflected_form->first_child( 'InflectionId' )->text;
                    my $inflection_text = $inflected_form->first_child( 'Form' )->text;
                }

                $t->purge;
                return 1;
            },
        }
    );

    $t->safe_parsefile( $xml );
    $t->purge;

--Octavian
Re: Fast XML parser?
Hi Octavian,

On Thu, 25 Oct 2012 14:33:15 +0300 Octavian Rasnita orasn...@gmail.com wrote:
> Hi,
>
> Can you recommend an XML parser which is faster than XML::Twig?
>
> I need to use an XML parser that can parse the XML files chunk by chunk and which works faster (much faster) than XML::Twig, because I tried using this module but it is very slow.

XML::LibXML contains several event-based parsers including the SAX parser and the pull-parser. Can you try using them?

Regards,

Shlomi Fish

> I tried something like the code below, but I have also tried a version that just opens the file and parses it using regular expressions, however the inelegant regexp version is 25 times faster than the one which uses XML::Twig, and it also uses less memory.
>
> If you think there is a module for parsing XML which would work faster than regular expressions, or if I can substantially improve the program which uses XML::Twig then please tell me about it. If regexp will still be faster, I will use regexp.
>
> Thanks.
>
>     use XML::Twig;
>
>     my $xml = 'path/to/xml/file.xml';
>
>     my $t = XML::Twig->new(
>         twig_handlers => {
>             Lexem => sub {
>                 my( $t, $lexem ) = @_;
>
>                 my $id = $lexem->att( 'id' );
>                 my $timestamp = $lexem->first_child( 'Timestamp' )->text;
>                 my $lexem_text = $lexem->first_child( 'Form' )->text;
>
>                 my @inflected_form = $lexem->children( 'InflectedForm' );
>
>                 for my $inflected_form ( @inflected_form ) {
>                     my $inflection_id = $inflected_form->first_child( 'InflectionId' )->text;
>                     my $inflection_text = $inflected_form->first_child( 'Form' )->text;
>                 }
>
>                 $t->purge;
>                 return 1;
>             },
>         }
>     );
>
>     $t->safe_parsefile( $xml );
>     $t->purge;
>
> --Octavian

--
-
Shlomi Fish http://www.shlomifish.org/
Interview with Ben Collins-Sussman - http://shlom.in/sussman

Modern Perl — the 3‐D Movie. In theatres near you.

Please reply to list if it's a mailing list post - http://shlom.in/reply .
Re: Fast XML parser?
Hi Octavian,

On Thu, Oct 25, 2012 at 1:33 PM, Octavian Rasnita orasn...@gmail.com wrote:
> Can you recommend an XML parser which is faster than XML::Twig?

Did you try XML::LibXML? https://www.metacpan.org/module/XML::LibXML

--
Michiel
Re: Fast XML parser?
I'm sorry, I did not see Shlomi's reply; it was in my spam folder for some reason.

On Thu, Oct 25, 2012 at 5:30 PM, Michiel Beijen michiel.bei...@gmail.com wrote:
> Hi Octavian,
>
> On Thu, Oct 25, 2012 at 1:33 PM, Octavian Rasnita orasn...@gmail.com wrote:
> > Can you recommend an XML parser which is faster than XML::Twig?
>
> Did you try XML::LibXML? https://www.metacpan.org/module/XML::LibXML
>
> --
> Michiel
XML::Twig installation fails
Can anyone tell me how to get this module installed on Win7? It is a requirement for ODF::lpOD. I get the following error message:

--
Checking if your kit is complete...
Looks good
Writing Makefile for XML::Twig
malformed JSON string, neither array, object, number, string or atom, at character offset 0 (before "(end of string)") at Makefile.PL line 147.
Warning: No success on command[C:\strawberry\perl\bin\perl.exe Makefile.PL]
MIROD/XML-Twig-3.41.tar.gz
C:\strawberry\perl\bin\perl.exe Makefile.PL -- NOT OK
Running make test
Make had some problems, won't test
Running make install
Make had some problems, won't install
--

This is Strawberry Perl, version information:

--
perl --version

This is perl, v5.10.1 (*) built for MSWin32-x86-multi-thread

Copyright 1987-2009, Larry Wall

Perl may be copied only under the terms of either the Artistic License or the GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on this system using "man perl" or "perldoc perl". If you have access to the Internet, point your browser at http://www.perl.org/, the Perl Home Page.

ver

Microsoft Windows [Version 6.1.7601]
--

I found descriptions of this problem by searching on Google, but nothing to help me resolve it.

Bob McConnell
XML file error
Hi,

I have an XML file that produces an error when opened in any browser, although the XML source looks correct when I view it. There seems to be a truncated character somewhere in the file that prevents it from displaying properly in the browser.

Please tell me how I can find out whether the XML is well-formed or not.

Regards,
Ajay
Re: XML file error
On 10/07/2012 10:37 AM, Ajaykumar Upadhyay wrote:
> Hi, There is one XML file, while opening at any browser it will error. And while seeing the source of XML, all are correct. There is some truncated character in XML file who does not allow to display properly on browser.

What do you mean there are some truncated characters?

> Please tell me how I can find out that XML is proper or not.

I have been known to use xmllint (part of the libxml project, http://xmlsoft.org/) for that.

> Regards, Ajay
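The same check can be done from Perl. This is only a minimal sketch, assuming XML::LibXML is installed; the helper name check_well_formed and the two sample strings are made up for illustration. load_xml dies on a malformed document, so an eval block traps the error, and the message normally says where the parser gave up.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: returns an empty string if the XML parses,
# otherwise the parser's error message.
sub check_well_formed {
    my ($xml_string) = @_;
    my $ok = eval { XML::LibXML->load_xml(string => $xml_string); 1 };
    return $ok ? '' : "$@";
}

my $good = '<root><item>ok</item></root>';
my $bad  = '<root><item>truncated</root>';   # <item> is never closed

for my $sample ($good, $bad) {
    my $error = check_well_formed($sample);
    print $error eq '' ? "well-formed\n" : "NOT well-formed: $error\n";
}
```

For a file on disk, load_xml(location => $path) can be used in the same way, and on the command line xmllint --noout file.xml reports the same class of errors.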
Re: XML::Twig Question
On Wed, Sep 05, 2012 at 08:26:30PM +0530, Anirban Adhikary wrote:
> Hi List,

Hello,

(Note: the first (unquoted) snippet is missing a single-quote :))

> Now in the case of following XML file
>
> <xn:VsDataContainer id="20408112016662">
>   <xn:attributes>
>     <xn:vsDataType>vsMscServerCell</xn:vsDataType>
>     <xn:vsDataFormatVersion>vsData1.0</xn:vsDataFormatVersion>
>     <xn:vsMscServerCell>
>       <xn:callSourceName>RADIO-IU</xn:callSourceName>
>       [...]
>       <xn:gci_sai>20408112016662</xn:gci_sai>
>       [...]
>       <xn:locationNumber>11704700700</xn:locationNumber>
>       [...]
>     </xn:vsMscServerCell>
>   </xn:attributes>
> </xn:VsDataContainer>

(Note: indentation mine)

> I am using the same code to print the values it shows nothing in the screen
>
> use strict;
> use warnings;
> use XML::Twig;
>
> my $twig = XML::Twig->new(TwigHandlers => {VsDataContainer => \&on_VsDataContainer});
>
> sub on_VsDataContainer {
>     my($twig, $dc)= @_;
>     print $dc->id, "\n";
>     my $gci_sai = $dc->field('gci_sai');
>     print $gci_sai,"\n";
>     my $locationNumber = $dc->field('locationNumber');
>     print $locationNumber,"\n";
>     $twig->purge;
> }
>
> $twig->parsefile("C:/Users/eamasar/Desktop/xnm/data/WA07B/input/MSCServerCell_201209050400.E2G.xml");
>
> And when I am removing the xn: from the begining of the line from XML file it prints the values on screen.

Disclaimer: I am not familiar with XML::Twig.

The xn: prefix is called a namespace prefix. Many XML parsers handle these in a special way, requiring you to register the namespaces with the parser. Your XML snippet doesn't even contain an xmlns attribute to define the namespace. I don't even know if that is still considered well-formed XML.

I played around with your code a little bit and managed to get some output. :) First, I added the following attribute to the root element of the XML data:

    xmlns:xn="bar"

Normally the xmlns is a URI so that it's easy to be globally unique. For my purposes anything will do. In your case, you should determine what the xn: prefix represents and use the appropriate URI. If it's your own namespace then make something up. :) If the XML data is coming from a third party then you may need to tell them to add the xmlns attribute. :) I don't know if XML::Twig can work around it or not (again, I'm not familiar with it).

I think you'll have to do a bit more work to get at the descendants xn:gci_sai and xn:locationNumber. For now I wrote a separate handler for each, which suffices to get the desired output. :) You can continue on from there..

    use strict;
    use warnings;
    use XML::Twig;

    # Note the $config{map_xmlns} element. XML::Twig expects
    # URI => prefix pairs here.
    my %config = (
        map_xmlns => {
            bar => 'foo'
    #       ^      ^ This can be anything we want. We use it in our code.
    #       |
    #       \ This needs to match the URI part in the XML document.
        },
        TwigHandlers => {
            'foo:VsDataContainer' => \&on_VsDataContainer,
    #        ^ This matches the registered namespace prefix above.
            'foo:gci_sai' => \&print_text,
            'foo:locationNumber' => \&print_text,
        }
    );

    my $twig = XML::Twig->new(%config);

    sub on_VsDataContainer {
        my ($twig, $dc) = @_;
        print $dc->id, "\n";
    }

    sub print_text {
        my ($twig, $element) = @_;
        print trim($element->text()), "\n";
    }

    $twig->parsefile("bar.xml");

Regards,

--
Brandon McCaig bamcc...@gmail.com bamcc...@castopulence.org
Castopulence Software https://www.castopulence.org/
Blog http://www.bamccaig.com/
perl -E '$_=q{V zrna gur orfg jvgu jung V fnl. }. q{Vg qbrfa'\''g nyjnlf fbhaq gung jnl.}; tr/A-Ma-mN-Zn-z/N-Zn-zA-Ma-m/;say'
Re: XML::Twig Question
On Thu, Sep 06, 2012 at 03:09:28PM -0400, Brandon McCaig wrote:
> print trim($element->text()), "\n";

Sorry, I seem to have left out the definition of trim from my example.. For completeness, it would be something along these lines:

    sub trim {
        my ($string) = @_;
        $string =~ s/\A\s+//;
        $string =~ s/\s+\z//;
        return $string;
    }

I only did that because the output had a bunch of extra whitespace lines, which I didn't bother to investigate beyond trimming the elements' text...

Regards,

--
Brandon McCaig bamcc...@gmail.com bamcc...@castopulence.org
Castopulence Software https://www.castopulence.org/
Blog http://www.bamccaig.com/
perl -E '$_=q{V zrna gur orfg jvgu jung V fnl. }. q{Vg qbrfa'\''g nyjnlf fbhaq gung jnl.}; tr/A-Ma-mN-Zn-z/N-Zn-zA-Ma-m/;say'
XML::Twig Question
Hi List,

I have an XML file which looks as follows:

<ISProducts>
<StoreInfo>
<BSC id="AMIBRB1">
<ALPHA>10</ALPHA>
<AMRCSFR3MODE>1,3,4,7</AMRCSFR3MODE>
<AMRCSFR3THR>12,16,21</AMRCSFR3THR>
<AMRCSFR3HYST>2,3,3</AMRCSFR3HYST>
<AMRCSFR4MODE>1,3,6,8</AMRCSFR4MODE>
<AMRCSFR4THR>12,17,25</AMRCSFR4THR>
<PAGBUNDLE>50</PAGBUNDLE>
<USERDATA>AMI_BRANLY_B_1</USERDATA>
</BSC>
.
.

And I am able to parse it using XML::Twig:

use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new(TwigHandlers => { BSC => \&on_BSC });

sub on_BSC {
    my($twig, $bsc)= @_;
    print $bsc->id, "\n";
    my $alpha = $bsc->field('ALPHA');
    print $alpha, "\n";
    $twig->purge;
}

$twig->parsefile(ISProducts.xml');

I am able to print the value of the ALPHA tag.

Now in the case of the following XML file:

<xn:VsDataContainer id="20408112016662">
<xn:attributes>
<xn:vsDataType>vsMscServerCell</xn:vsDataType>
<xn:vsDataFormatVersion>vsData1.0</xn:vsDataFormatVersion>
<xn:vsMscServerCell>
<xn:callSourceName>RADIO-IU</xn:callSourceName>
<xn:cellGrpName>INVALID</xn:cellGrpName>
<xn:cellType>3GCell</xn:cellType>
<xn:gci_sai>20408112016662</xn:gci_sai>
<xn:iDPLNAA>IDN</xn:iDPLNAA>
<xn:ifCallIn>NO</xn:ifCallIn>
<xn:ifCallOut>NO</xn:ifCallOut>
<xn:ifRoamAnalysis>NO</xn:ifRoamAnalysis>
<xn:isEarlyAssign>EARLYASN</xn:isEarlyAssign>
<xn:laDegree>0</xn:laDegree>
<xn:laMinute>0</xn:laMinute>
<xn:laSecond>0</xn:laSecond>
<xn:laiCategory>SAI</xn:laiCategory>
<xn:laiType>HVLR</xn:laiType>
<xn:latitudeType>NOR</xn:latitudeType>
<xn:lgDegree>0</xn:lgDegree>
<xn:lgMinute>0</xn:lgMinute>
<xn:lgSecond>0</xn:lgSecond>
<xn:locationIDName>INVALID</xn:locationIDName>
<xn:locationNumber>11704700700</xn:locationNumber>
<xn:locationNumberName>INVALID</xn:locationNumberName>
<xn:longitudeType>EAST</xn:longitudeType>
<xn:mnc>FFF</xn:mnc>
<xn:mscNumber>31653032</xn:mscNumber>
<xn:multiAreaStatName>AHPTMS1</xn:multiAreaStatName>
<xn:radius>0</xn:radius>
<xn:rncId1>112</xn:rncId1>
<xn:svrName>AHPTMS1</xn:svrName>
<xn:tZDSTName>INVALID</xn:tZDSTName>
<xn:toneName>INVALID</xn:toneName>
<xn:vlrNumber>31653032</xn:vlrNumber>
</xn:vsMscServerCell>
</xn:attributes>
</xn:VsDataContainer>

I am using the same code to print the values, but it shows nothing on the screen:

use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new(TwigHandlers => {VsDataContainer => \&on_VsDataContainer});

sub on_VsDataContainer {
    my($twig, $dc)= @_;
    print $dc->id, "\n";
    my $gci_sai = $dc->field('gci_sai');
    print $gci_sai,"\n";
    my $locationNumber = $dc->field('locationNumber');
    print $locationNumber,"\n";
    $twig->purge;
}

$twig->parsefile("C:/Users/eamasar/Desktop/xnm/data/WA07B/input/MSCServerCell_201209050400.E2G.xml");

And when I remove the xn: from the beginning of each line of the XML file, it prints the values on screen.

Thanks and regards in advance,
Anirban Adhikary.
Re: XML::Twig Question
One thing I forgot to mention: in the second case, after removing the xn: prefix from the beginning of each line, I am only able to print the value of id. Nothing is printed for gci_sai and locationNumber.

On Wed, Sep 5, 2012 at 8:26 PM, Anirban Adhikary anirban.adhik...@gmail.com wrote:
> Hi List, I have a XML file which looks like as follows
> [...]
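Two things appear to be going on in the follow-up above, and the sketch below illustrates both. It assumes the file is parsed without namespace processing (as far as I know that is what XML::Twig does when map_xmlns is not given), so the handler key has to be the tag name as written, prefix included. Also, field() only returns the text of a direct child, while gci_sai and locationNumber sit further down, so first_descendant() is used instead. The inline sample is a cut-down copy of the data from the original post, and the <root> wrapper is made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Cut-down sample; the xn: prefix is deliberately left undeclared,
# as in the original file. The <root> wrapper is an assumption.
my $xml = <<'XML';
<root>
  <xn:VsDataContainer id="20408112016662">
    <xn:attributes>
      <xn:vsMscServerCell>
        <xn:gci_sai>20408112016662</xn:gci_sai>
        <xn:locationNumber>11704700700</xn:locationNumber>
      </xn:vsMscServerCell>
    </xn:attributes>
  </xn:VsDataContainer>
</root>
XML

my @rows;   # collected [id, gci_sai, locationNumber] triples

sub on_VsDataContainer {
    my ($twig, $dc) = @_;
    my @values = ($dc->id);
    for my $name ('xn:gci_sai', 'xn:locationNumber') {
        # first_descendant searches the whole subtree, not just children
        my $elt = $dc->first_descendant($name);
        push @values, defined $elt ? $elt->text : '';
    }
    push @rows, \@values;
    $twig->purge;
}

my $twig = XML::Twig->new(
    twig_handlers => { 'xn:VsDataContainer' => \&on_VsDataContainer },
);
$twig->parse($xml);

print join(' ', @$_), "\n" for @rows;
```

With a real file, $twig->parsefile($path) replaces the parse() call; the rest stays the same.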
xml parsing
hi,

i need to parse the xml file and store the data in an array. here is the code:

use XML::Simple;

my $ItemGroup = XMLin('C:\Users\bvcontrolbuild\Desktop\data.xml');

foreach my $BuildProject (@{$ItemGroup->{BuildProject}}) {
    print $BuildProject->{Include} . "\n";
}

and the xml file is:

<ItemGroup>
  <BuildProject Include="AssemblyInfo.csproj" />
  <BuildProject Include="Assembly.csproj" />
</ItemGroup>

but when i run the code, it throws the error: 'Not an ARRAY reference'

i need the output in an array:

@a = ("AssemblyInfo.csproj", "Assembly.csproj");

please suggest

regards
irfan
Re: xml parsing
Hi Irfan,

On Thu, 16 Aug 2012 04:55:33 -0700 (PDT) Irfan Sayed irfan_sayed2...@yahoo.com wrote:
> hi, i need to parse the xml file and store the data in array : here is the code:
>
> use XML::Simple;
> my $ItemGroup = XMLin('C:\Users\bvcontrolbuild\Desktop\data.xml');
> foreach my $BuildProject (@{$ItemGroup->{BuildProject}}) {
>     print $BuildProject->{Include} . "\n";
> }

Please don't use XML::Simple to parse XML. See:

* http://perl-begin.org/uses/xml/

<rindolf> xml
<perlbot> Don't parse XML with regex! Use a real parser. Avoid XML::Simple (see the xml::simple factoid). Choices are ::Easy, ::Smart, ::TreeBuilder, ::Twig for simple stuff. LibXML is a good general purpose starting point. See also XML::All. http://perl-xml.sf.net/faq/
<rindolf> xml::simple
<perlbot> XML::Simple commits the fatal flaw of trying to massage complicated and often irregular XML into the simple and highly regular world of perl data structures. Irregularities cause "not a hashref" sort of errors in your program. Use a real parser. see: xml

In a post to the perl-xml mailing list, XML::Simple's originator and maintainer noted he can no longer recommend XML::Simple either. So please use a different and more robust alternative.

Regards,
Shlomi Fish

--
Shlomi Fish http://www.shlomifish.org/
Please reply to list if it's a mailing list post - http://shlom.in/reply .
Re: xml parsing
can you please give me sample code to store the xml contents in a perl array using LibXML?

let's say the xml file is:

<ItemGroup>
  <BuildProject Include="AssemblyInfo.csproj" />
  <BuildProject Include="Assembly.csproj" />
</ItemGroup>

regards
irfan

From: Shlomi Fish shlo...@shlomifish.org
To: Irfan Sayed irfan_sayed2...@yahoo.com
Cc: beginners@perl.org
Sent: Thursday, August 16, 2012 6:07 PM
Subject: Re: xml parsing

> Please don't use XML::Simple to parse XML. See:
> * http://perl-begin.org/uses/xml/
> [...]
> So please use a different and more robust alternative.
Re: xml parsing
On 08/16/2012 07:46 AM, Irfan Sayed wrote:
> can you please give me sample code to store the xml contents to perl array using LibXML
> lets say xml files is as :
>
> <ItemGroup>
>   <BuildProject Include="AssemblyInfo.csproj" />
>   <BuildProject Include="Assembly.csproj" />
> </ItemGroup>
>
> regards irfan

I'm going to assume what you want is the list of Included filenames...

#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

my $XML = <<';';
<ItemGroup>
  <BuildProject Include="AssemblyInfo.csproj" />
  <BuildProject Include="Assembly.csproj" />
</ItemGroup>
;

my $document = XML::LibXML->load_xml(string => $XML);

my @include = map $_->findvalue('@Include'),
    $document->documentElement->findnodes('/ItemGroup/BuildProject');

use Data::Dumper;
print Data::Dumper->Dump([\@include],[qw/*include/]);
Re: xml parsing
On Thu, Aug 16, 2012 at 04:55:33AM -0700, Irfan Sayed wrote:
> hi, i need to parse the xml file and store the data in array : here is the code:
>
> use XML::Simple;
> my $ItemGroup = XMLin('C:\Users\bvcontrolbuild\Desktop\data.xml');
> foreach my $BuildProject (@{$ItemGroup->{BuildProject}}) {
>     print $BuildProject->{Include} . "\n";
> }
>
> and xml file is :
>
> <ItemGroup>
>   <BuildProject Include="AssemblyInfo.csproj" />
>   <BuildProject Include="Assembly.csproj" />
> </ItemGroup>
>
> but when i compile the code, it throws the error : Not an ARRAY reference'
> i need output in array : @a = (AssemblyInfo.csproj,Assembly.csproj);
> please suggest

Hi Irfan

1. I see no

    use strict;
    use warnings;

in your code. If you are not using them please do so, as they will give syntax errors and warnings about your code.

2. use Data::Dumper; and then print Dumper($ItemGroup); will show the data structure returned by XMLin.

FWIW I got

$VAR1 = {
          'BuildProject' => [
                              { 'Include' => 'AssemblyInfo.csproj' },
                              { 'Include' => 'Assembly.csproj' }
                            ]
        };

AssemblyInfo.csproj
Assembly.csproj

from

#!/usr/bin/env perl
use Modern::Perl 2011;
use autodie;
use strict;
use warnings;
use XML::Simple;
use Data::Dumper;

my $ItemGroup = XMLin('./data.xml');
say Dumper($ItemGroup);

foreach my $BuildProject (@{$ItemGroup->{BuildProject}}) {
    say $BuildProject->{Include};
}

Kind Regards
Lesley
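A note on where 'Not an ARRAY reference' tends to come from: XML::Simple folds a repeated element into an array reference only when it actually occurs more than once, so a file with a single BuildProject yields a plain hash reference instead. As far as I know, passing ForceArray => ['BuildProject'] to XMLin is the documented way around that. The sketch below shows the same idea in plain Perl; the two structures are hand-written stand-ins for what XMLin might return (assumed shapes, not real parser output), and as_list is a made-up helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: always hand back a list, whether the parser
# produced an array reference, a single value, or nothing at all.
sub as_list {
    my ($value) = @_;
    return ()        unless defined $value;
    return @{$value} if ref $value eq 'ARRAY';
    return ($value);
}

# Stand-ins for XMLin() results (assumed shapes, written by hand).
my $two_projects = { BuildProject => [ { Include => 'AssemblyInfo.csproj' },
                                       { Include => 'Assembly.csproj' } ] };
my $one_project  = { BuildProject => { Include => 'Assembly.csproj' } };

for my $item_group ($two_projects, $one_project) {
    my @a = map { $_->{Include} } as_list($item_group->{BuildProject});
    print join(',', @a), "\n";
}
```

This only papers over one irregularity; the advice elsewhere in the thread to switch to a real parser still stands.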
Re: xml parsing
thanks. it worked. however, can't i give the xml file path instead of putting all the contents at the start of the script?

regards,
irfan

From: Lawrence Statton lawre...@cluon.com
To: beginners@perl.org
Sent: Thursday, August 16, 2012 6:24 PM
Subject: Re: xml parsing

> I'm going to assume what you wanbt is the list of Included filenames...
> [...]
Re: xml parsing
On 08/16/2012 08:09 AM, Irfan Sayed wrote:
> thanks. it worked however, i cant give xml file path instead of all the contents in the start tag
> regards, irfan

(BTW: The custom on this list is NOT to top post -- trim, and put your replies at the BOTTOM of the email you are responding to)

Carefully read the perldoc for XML::LibXML ... XML is a huge and complicated thing and there is complete (if a bit overwhelming) documentation for Documents and Nodes and Elements and Attributes - and then multiply it all by "oh Glory Be" when it starts talking about namespaces (which for trivial XML like you've demonstrated can often be ignored)

What you want is in the beginning of perldoc XML::LibXML::Parser, where it says ...

    $dom = XML::LibXML->load_xml(
        location => $file_or_url
        # parser options ...
    );
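Put together with the script from the earlier message, the location form might look like the sketch below. The sample document is written to a temporary file first only so that the example is self-contained; with a real file, its path goes straight into location.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::LibXML;

# Write the sample document to a temporary file (a stand-in for data.xml).
my ($fh, $file) = tempfile(SUFFIX => '.xml', UNLINK => 1);
print {$fh} <<'XML';
<ItemGroup>
  <BuildProject Include="AssemblyInfo.csproj" />
  <BuildProject Include="Assembly.csproj" />
</ItemGroup>
XML
close $fh or die "Cannot close $file: $!";

# Same query as in the string-based script, but the parser reads from a path.
my $document = XML::LibXML->load_xml(location => $file);
my @include  = map { $_->findvalue('@Include') }
               $document->documentElement->findnodes('/ItemGroup/BuildProject');

print "$_\n" for @include;
```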
Re: xml parsing
> please find the attached xml file. please suggest.
> regards irfan

The file attached does not match the sample XML file you included in your email.

The string passed to findnodes() is called an XPath Selector - you will need to adjust that to match the actual path of the elements you want to retrieve from your XML Document.

W3Schools has a tutorial on XPath at http://www.w3schools.com/xpath/

You want to find the path expression that it describes as "Selects nodes in the document from the current node that match the selection no matter where they are"

--L
Re: xml parsing
Okay -- I've looked at the attachment -- remember when I mentioned namespaces a while back? This document uses one, so things get more complicated.

The solution I always use for this is to put the root element into an XPath Context object and assign a prefix for the default namespace (I want oh-so-much to believe there's an easier solution than this, but I found this one late at night some years ago and have stuck with it ever since)

This code will do what you want:

#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

my $XML = '/tmp/carrerabuild1.proj';

# this loads the XML document, then passes the root element into the
# XPathContext constructor -- perldoc XML::LibXML::XPathContext
my $document = XML::LibXML::XPathContext->new(XML::LibXML->load_xml(location => $XML)->documentElement);

# this adds a prefix to the default namespace from your document so we
# can find its elements
$document->registerNs(p => 'http://schemas.microsoft.com/developer/msbuild/2003');

my @include = map $_->value(), $document->findnodes('//p:ItemGroup/p:BuildProject/@Include');

use Data::Dumper;
print Data::Dumper->Dump([\@include],[qw/*include/]);
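One other route, offered as a sketch rather than a recommendation: XPath's local-name() function compares element names with the namespace ignored, so no prefix has to be registered. The expression is more verbose, and it would also match same-named elements from any other namespace. The inline document is a made-up fragment that merely reuses the MSBuild namespace URI from the code above; it is not the attached file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Made-up fragment with a default namespace, standing in for the real file.
my $xml = <<'XML';
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <ItemGroup>
    <BuildProject Include="AssemblyInfo.csproj" />
    <BuildProject Include="Assembly.csproj" />
  </ItemGroup>
</Project>
XML

my $document = XML::LibXML->load_xml(string => $xml);

# Match on local names only; unprefixed attributes are in no namespace,
# so @Include needs no special treatment.
my @include = map { $_->value }
    $document->findnodes(
        '//*[local-name()="ItemGroup"]/*[local-name()="BuildProject"]/@Include'
    );

print "$_\n" for @include;
```

The XPathContext approach remains the more precise of the two, since it ties the match to one specific namespace.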
Re: xml parsing
From: Lawrence Statton lawre...@cluon.com
To: Irfan Sayed irfan_sayed2...@yahoo.com
Cc: beginners@perl.org
Sent: Thursday, August 16, 2012 7:28 PM
Subject: Re: xml parsing

> Okay -- I've looked at the attachment -- remember when I mentioned namespaces a while back? This document uses one, so things get more complicated.
> [...]
> This code will do what you want:
> [...]

truly awesome. it worked. thanks.

regards,
irfan
problem install XML::LibXSLT on RHEL
Hi Gurus,

I have written a program that uses XML::LibXSLT. When I run the program it complains that it cannot find XML/LibXSLT in @INC. I have installed perl-libxml and perl-lib-xslt successfully, but my Perl program still complains that XML::LibXSLT is not installed.

1) When I run perl Makefile.PL to install XML::LibXSLT, I get the error below:

running xslt-config... failed
XML::LibXSLT needs libxslt version 1.1.18 or higher

2) Now I am trying to install libxslt. When I run ./configure it gives me the following error, even though I have the 2.6.28 lib available in /usr/local/lib:

checking for libxml libraries >= 2.6.27... configure: error: Version 2.6.26 found. You need at least libxml2 2.6.27 for this version of libxslt

3) But I have installed libxml2 2.6.28 and it resides in /usr/local/lib (see below):

-rwxr-xr-x 1 root root 4162762 May 29 10:59 libxml2.so.2.7.8
-rwxr-xr-x 1 root root 4076119 Jun  4 00:06 libxml2.so.2.6.28
lrwxrwxrwx 1 root root      17 Jun  4 00:06 libxml2.so -> libxml2.so.2.6.28
-rwxr-xr-x 1 root root     807 Jun  4 00:06 libxml2.la
lrwxrwxrwx 1 root root      16 Jun  4 00:06 libxml2.so.2 -> libxml2.so.2.7.8
-rw-r--r-- 1 root root 6441062 Jun  4 00:06 libxml2.a
-rw-r--r-- 1 root root     211 Jun  4 00:06 xml2Conf.sh
drwxr-xr-x 2 root root    4096 Jun  4 00:06 pkgconfig

Can somebody help me resolve this issue? I ran ldconfig too, but to no avail.

Thanks,
c
Re: problem install XML::LibXSLT on RHEL
> 2) Now I am trying to install libxslt when I run ./configure
> It gives me the following error even though i have 2.6.28 lib avaiable on /usr/local/lib.
>
> checking for libxml libraries >= 2.6.27... configure: error: Version 2.6.26 found. You need at least libxml2 2.6.27 for this version of libxslt
>
> 3) But I have install libxml2 2.6.28 and it resides ins /usr/loca/lib (see below)

Hi,

Run echo $LD_LIBRARY_PATH on your host. You may see that /usr/lib comes before /usr/local/lib, so the system finds version 2.6.26 in /usr/lib even though you have installed the other version in /usr/local/lib. To resolve this problem you may need to reinstall the newest libxml2 in /usr/lib.

HTH.
Re: Book Recommendation - XPath, XML,SQL, XQuery
Just an update: I also found this book.

http://www.amazon.com/Beginning-Perl-Web-Development-Professional/dp/1590595319

Beginning Perl Web Development: From Novice to Professional (Beginning: From Novice to Professional) [Paperback], Steve Suehring

It's a good overview book that covers a lot of ground. I haven't finished it yet but am liking it so far.

Sayth
Re: Book Recommendation - XPath, XML,SQL, XQuery
From: David Christensen dpchr...@holgerdanske.com
> On 04/17/2012 11:04 PM, flebber wrote:
> > I also saw Perl XML - http://shop.oreilly.com/product/9780596002053.do
> > These will probably both be good references
>
> I haven't read Perl and XML (nor XML and Perl). However, I am working on a project that could benefit from XML. So, I'll need to explore that path soon enough.
>
> > but was a little concern that the publication dates were 2000 and 2002. Perl and the core technologies involved have had significant advancement in that time.
>
> Don't let the publication dates put you off. Good ideas and good books are enduring. Yes, there are updates; but you won't understand the updates if you don't understand what is being updated in the first place.

While generally this is a good idea, in this particular case the target was moving so fast that I would not recommend that book. Quite a few modules described in the book are long gone, others appeared, the Unicode handling in Perl has undergone big changes in the meantime, ...

Jenda

= je...@krynicky.cz === http://Jenda.Krynicky.cz =
When it comes to wine, women and song, wizards are allowed to get drunk and croon as much as they like.
  -- Terry Pratchett in Sourcery
Re: Book Recommendation - XPath, XML,SQL, XQuery
On Apr 18, 2:50 pm, dpchr...@holgerdanske.com (David Christensen) wrote:
> On 04/17/2012 08:28 PM, flebber wrote:
> > Can anyone recommend a good book/s for Perl and learning how to handle XPath,XML,SQL, XQuery. I would like to know better how to handle and retrieve text formats and utilise database storage of the data.
> > Sorry I should have pointed out in ref to above I meant to add using perl tools DBI etc. Which perl tools and how to use them.
> > All help appreciated
>
> "Programming the Perl DBI; Database programming with Perl" is the canonical book on Perl/ DBI:
>
> http://shop.oreilly.com/product/9781565926998.do
>
> But, the MySQL and PostgreSQL books I've read also included chapters or sections for various programming languages, including Perl.
>
> If you're doing simple database stuff and/or want to interface with various data stores (plain text/ CSV, SQLite, MS Excel/ Access/ SQL Server, MySQL, PostgreSQL, Oracle, etc.), the first book would be better.
>
> If you're going to do hard-core database stuff (large amounts of data, vendor-specific optimizations/ features, internal programming languages/ stored procedures, etc.), choose your engine and find a matching book with a Perl chapter.
>
> HTH,
> David

Thanks David.

I need to create an XML feed from website data and store it in an MSSQL database, then create queries and stored procedures on that data and present the data in a web format. I will also be using MySQL, but as part of my main project.

I also saw Perl XML - http://shop.oreilly.com/product/9780596002053.do

These will probably both be good references, but I was a little concerned that the publication dates were 2000 and 2002. Perl and the core technologies involved have had significant advancement in that time.

Sayth
Re: Book Recommendation - XPath, XML,SQL, XQuery
David Christensen wrote:
> On 04/17/2012 08:28 PM, flebber wrote:
> > Can anyone recommend a good book/s for Perl and learning how to handle XPath,XML,SQL, XQuery.
> > [...]
>
> "Programming the Perl DBI; Database programming with Perl" is the canonical book on Perl/ DBI:
>
> http://shop.oreilly.com/product/9781565926998.do
>
> [...]
>
> HTH,
> David

There is also the slightly newer (2003) Perl Database Programming (Wiley), ISBN 978-0-7645-4956-4. Though it is more directed to web programming, it covers all the fundamentals. I used it as an introduction to DB programming.

K.D.J.
Re: Book Recommendation - XPath, XML,SQL, XQuery
Hi flebber, thanks for bottom-posting. :-) On Tue, 17 Apr 2012 23:04:27 -0700 (PDT) flebber flebber.c...@gmail.com wrote: On Apr 18, 2:50 pm, dpchr...@holgerdanske.com (David Christensen) wrote: On 04/17/2012 08:28 PM, flebber wrote: Can anyone recommend a good book/s for Perl and learning how to handle XPath,XML,SQL, XQuery. I would like to know better how to handle and retrieve text formats and utilise database storage of the data. Sorry I should have pointed out in ref to above I meant to add using perl tools DBI etc. Which perl tools and how to use them. All help appreciated Programming the Perl DBI; Database programming with Perl is the canonical book on Perl/ DBI: http://shop.oreilly.com/product/9781565926998.do But, the MySQL and PostgreSQL books I've read also included chapters or sections for various programming languages, including Perl. If you're doing simple database stuff and/or want to interface with various data stores (plain text/ CSV, SQLite, MS Excel/ Access/ SQL Server, MySQL, PostgreSQL, Oracle, etc.), the first book would be better. If you're going to do hard-core database stuff (large amounts of data, vendor-specific optimizations/ features, internal programming languages/ stored procedures, etc.), choose your engine and find a matching book with a Perl chapter. HTH, David Thanks David. I need to create an XML feed from Website data and store it in a MSSQL database. Then create queries stored procedures on that data and present the data in a web format. I will also be using MySQL but as part of my main project. I also saw Perl XML - http://shop.oreilly.com/product/9780596002053.do These will probably both be good references but was a little concern that the publication dates were 2000 and 2002. Perl and the core technologies involved have had significant advancement in that time. I've read the Perl XML book and it's a nice book - not long yet informative. 
Here is my review of it: http://perl.org.il/books/059600205X.html

Regarding Perl databases, I'm quoting one of my replies to Boston-pm, whose archive appears to be private:

[QUOTE]

Well, as far as I know the CPAN DBI module has not broken backwards compatibility since the year 2000 (though naturally there probably were many bug fixes and possibly some new features). What has changed considerably are the various layers above DBI.pm - from https://metacpan.org/release/DBIx-Simple (which gives you relatively little, but it's still pretty useful) to https://metacpan.org/release/DBIx-Class and its various dependencies and extensions (which is a full-fledged Object-Relational Mapper (ORM)) and maybe also https://metacpan.org/release/KiokuDB , which is an Object Graph storage engine, which can use several backends including DBI. There are many other DBI extensions in the DBIx:: and SQL:: namespaces.

So will a book be appropriate? I'm not aware of any printed book that covers all of that in a comprehensive manner, and even if you learn about SQL databases, SQL and then about DBI (which is time consuming by itself), you should at least play with DBIx-Class, to see if it's good enough for you.

I've written a little about Databases in Perl here: http://perl-begin.org/uses/databases/

Now, There's this one : (for PHP) http://www.amazon.com/PHP-MySQL-Web-Development-Edition/dp/0672329166/ref=sr_1_2?s=books&ie=UTF8&qid=1334202498&sr=1-2 Along with the very good reviews it gets, (yes, I know, online reviews), I flipped through it for 20 minutes or so, in a book store, and it looks like solid and well written information. can it be that there's not a current equivalent for Perl ?

Well, someone might take the initiative and write or compile a book about the current state of the art with the DBI ecosystem in Perl, and possibly publish it as an E-book and/or on http://lulu.com/ .
But it is bound to become out-of-date and I feel that the online documentation may be good enough for people who don't want or need a printed book. Now that I think of it, it is possible that some of the books about Catalyst (see http://www.catalystframework.org/ ) cover DBIx-Class to some extent, but they may assume some prior knowledge. Good luck! [/QUOTE] Regards, Shlomi Fish -- - Shlomi Fish http://www.shlomifish.org/ http://www.shlomifish.org/humour/ways_to_do_it.html Real programmers use a nice editor and a nice programming language and get it done in less than O(N!). — vanguard on Freenode’s ##programming Please reply to list if it's a mailing list post - http://shlom.in/reply . -- To unsubscribe, e-mail: beginners-unsubscr...@perl.org For additional commands, e-mail: beginners-h...@perl.org http://learn.perl.org/
XML::Mini question
Hi there,

I've got a question about XML::Mini. When parsing an XML document, I want to preserve white space for various reasons. However, it doesn't really work.

Minimal example:

#! /usr/bin/perl

use strict;
use warnings;
use Data::Dumper;
use XML::Mini::Document;

my $XMLString = "<book> Learning Perl </book>";

my $xmlDoc = XML::Mini::Document->new();
$XML::Mini::IgnoreWhitespaces = 0;

# init the doc from an XML string
$xmlDoc->parse($XMLString);
my $xmlHash = $xmlDoc->toHash();
print Dumper($xmlHash);

I get the following output:

$VAR1 = { 'book' => 'Learning Perl ' };

I would have expected to get 'book' => ' Learning Perl ' instead.

Any idea what's going wrong?

-- Manfred
Re: XML::Mini question
> Hi there,
>
> I've got a question about XML::Mini. When parsing an xml document for some reasons I want to preserve white space. However, it doesn't work really.
>
> Minimal example:
>
> #! /usr/bin/perl
>
> use strict;
> use warnings;
> use Data::Dumper;
> use XML::Mini::Document;
>
> my $XMLString = "<book> Learning Perl </book>";
>
> my $xmlDoc = XML::Mini::Document->new();
> $XML::Mini::IgnoreWhitespaces = 0;
>
> # init the doc from an XML string
> $xmlDoc->parse($XMLString);
> my $xmlHash = $xmlDoc->toHash();
> print Dumper($xmlHash);
>
> I get the following output:
>
> $VAR1 = { 'book' => 'Learning Perl ' };
>
> I would have expected to have 'book' => ' Learning Perl ' instead.
>
> Any idea, what's going wrong?

What happens if you set $XML::Mini::IgnoreWhitespaces = 1

Seems to me that 1 = yes

What does the documentation say?

-- Owen
Re: XML::Mini question
On Thu, 19 Apr 2012 06:15:47 +1000 Owen <rc...@pcug.org.au> wrote:
>> Hi there,
>>
>> I've got a question about XML::Mini. When parsing an xml document for some reasons I want to preserve white space. However, it doesn't work really.
>>
>> Minimal example:
>>
>> #! /usr/bin/perl
>>
>> use strict;
>> use warnings;
>> use Data::Dumper;
>> use XML::Mini::Document;
>>
>> my $XMLString = "<book> Learning Perl </book>";
>>
>> my $xmlDoc = XML::Mini::Document->new();
>> $XML::Mini::IgnoreWhitespaces = 0;
>>
>> # init the doc from an XML string
>> $xmlDoc->parse($XMLString);
>> my $xmlHash = $xmlDoc->toHash();
>> print Dumper($xmlHash);
>>
>> I get the following output:
>>
>> $VAR1 = { 'book' => 'Learning Perl ' };
>>
>> I would have expected to have 'book' => ' Learning Perl ' instead.
>>
>> Any idea, what's going wrong?
>
> What happens if you set $XML::Mini::IgnoreWhitespaces = 1
>
> Seems to me that 1 = yes

This is true.

> What does the documentation say?

If I set it to 1 then I get 'book' => 'Learning Perl', which is even worse. Please note that I don't want white space to be ignored.

-- Manfred
Re: Book Recommendation - XPath, XML,SQL, XQuery
On 04/17/2012 11:04 PM, flebber wrote:
> Thanks David.

YW. :-)

> I need to create an XML feed from Website data and store it in a MSSQL database. Then create queries stored procedures on that data and present the data in a web format. I will also be using MySQL but as part of my main project.

SQL queries via Perl DBI are one thing, stored procedures are another. Beware that the latter may require you to learn yet another computer language (a vendor-specific stored procedure language, such as PL/pgSQL).

> I also saw Perl XML - http://shop.oreilly.com/product/9780596002053.do These will probably both be good references

I haven't read Perl and XML (nor XML and Perl). However, I am working on a project that could benefit from XML. So, I'll need to explore that path soon enough.

> but was a little concern that the publication dates were 2000 and 2002. Perl and the core technologies involved have had significant advancement in that time.

Don't let the publication dates put you off. Good ideas and good books are enduring. Yes, there are updates; but you won't understand the updates if you don't understand what is being updated in the first place.

I have been deliberately reading older Perl books that I missed back when. I'm currently reading Network Programming with Perl -- good stuff. :-)

David
Re: XML::Mini question
On Wed, 18 Apr 2012 22:23:37 +0200 Manfred Lotz <manfred.l...@arcor.de> wrote:
> On Thu, 19 Apr 2012 06:15:47 +1000 Owen <rc...@pcug.org.au> wrote:
>>> Hi there,
>>>
>>> I've got a question about XML::Mini. When parsing an xml document for some reasons I want to preserve white space. However, it doesn't work really.
>>>
>>> Minimal example:
>>>
>>> #! /usr/bin/perl
>>>
>>> use strict;
>>> use warnings;
>>> use Data::Dumper;
>>> use XML::Mini::Document;
>>>
>>> my $XMLString = "<book> Learning Perl </book>";
>>>
>>> my $xmlDoc = XML::Mini::Document->new();
>>> $XML::Mini::IgnoreWhitespaces = 0;
>>>
>>> # init the doc from an XML string
>>> $xmlDoc->parse($XMLString);
>>> my $xmlHash = $xmlDoc->toHash();
>>> print Dumper($xmlHash);
>>>
>>> I get the following output:
>>>
>>> $VAR1 = { 'book' => 'Learning Perl ' };
>>>
>>> I would have expected to have 'book' => ' Learning Perl ' instead.
>>>
>>> Any idea, what's going wrong?
>>
>> What happens if you set $XML::Mini::IgnoreWhitespaces = 1
>>
>> Seems to me that 1 = yes
>
> This is true.
>
>> What does the documentation say?
>
> If I set it to 1 then I get 'book' => 'Learning Perl' which is even worse. Please note that I don't want to have ignored white space.

Hm, I had no other idea but to look up the source code. I guess I found what happens.

if ($XMLString =~ m/^\s*(<\s*([^\s>]+)([^>]+)\/\s*>|    # <unary \/>
    <\?\s*([^\s>]+)\s*([^>]*)\?>|                        # <? headers ?>
    <!--(.+?)-->|                                        # <!-- comments -->
    <!\[CDATA\s*\[(.*?)\]\]\s*>\s*|                      # CDATA
    <!DOCTYPE\s*([^\[]*)(\[.*?\])?\s*>\s*|               # DOCTYPE
    <!ENTITY\s*([^"'>]+)\s*(["'])([^\11]+)\11\s*>\s*|    # ENTITY
    ([^<]+))(.*)/xogsmi)                                 # plain text

IMHO, here is the bug. The leading ^\s* sits in front of the whole alternation, so leading white space is deleted in every case. That is fine for markup, but not for plain text.

I changed it like this:

if ($XMLString =~ m/(^\s*<\s*([^\s>]+)([^>]+)\/\s*>|    # <unary \/>
    ^\s*<\?\s*([^\s>]+)\s*([^>]*)\?>|                    # <? headers ?>
    ^\s*<!--(.+?)-->|                                    # <!-- comments -->
    ^\s*<!\[CDATA\s*\[(.*?)\]\]\s*>\s*|                  # CDATA
    ^\s*<!DOCTYPE\s*([^\[]*)(\[.*?\])?\s*>\s*|           # DOCTYPE
    ^\s*<!ENTITY\s*([^"'>]+)\s*(["'])([^\11]+)\11\s*>\s*|  # ENTITY
    ([^<]+))(.*)/xogsmi)                                 # plain text

Now leading space is deleted in all cases except plain text.

$VAR1 = { 'book' => ' Learning Perl ' };

-- Manfred
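The effect described in this message can be reproduced without XML::Mini at all. The following is only a minimal sketch: the two patterns are simplified stand-ins with a single markup branch (comments) and a plain-text branch, not the module's real regex, and the variable names `$rest`, `$stripped` and `$kept` are made up for illustration. It assumes the parser is looking at the text that follows an opening `<book>` tag.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input: what is left after the opening <book> tag.
# The leading and trailing spaces are the ones that should survive.
my $rest = ' Learning Perl </book>';

# Stand-in for the original pattern: ^\s* is in front of the whole
# alternation, so it consumes white space before plain text as well.
my ($stripped) = $rest =~ m/^\s*(<!--.+?-->|[^<]+)/s;

# Stand-in for the patched pattern: ^\s* is only part of the markup
# branch, so the plain-text branch starts matching at the first character.
my ($kept) = $rest =~ m/(^\s*<!--.+?-->|[^<]+)/s;

printf "prefix outside alternation: [%s]\n", $stripped;  # [Learning Perl ]
printf "prefix inside markup only:  [%s]\n", $kept;      # [ Learning Perl ]
```

The first capture loses the leading space and the second keeps it, which matches the two Data::Dumper outputs reported in the thread. Whether moving the prefix has side effects on other branches of the real pattern is not something this sketch checks.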
Book Recommendation - XPath, XML,SQL, XQuery
Can anyone recommend a good book/s for Perl and learning how to handle XPath,XML,SQL, XQuery. I would like to know better how to handle and retrieve text formats and utilise database storage of the data. Sayth