From: "Jenda Krynicky" <je...@krynicky.cz>

> From:           "Octavian Rasnita" <orasn...@gmail.com>
> To:             <beginners@perl.org>
> Subject:        Fast XML parser?
> Date sent:      Thu, 25 Oct 2012 14:33:15 +0300
> 
>> Hi,
>> 
>> Can you recommend an XML parser which is faster than XML::Twig?
>> 
>> I need to use an XML parser that can parse the XML files chunk by chunk and 
>> which works faster (much faster) than XML::Twig, because I tried using this 
>> module but it is very slow.
>> 
>> I tried something like the code below, and I also tried a version
>> that just opens the file and parses it using regular expressions.
>> The inelegant regexp version is 25 times faster than the one
>> which uses XML::Twig, and it also uses less memory.
>> 
>> If you think there is a module for parsing XML which would work faster
>> than regular expressions, or if I can substantially improve the
>> program which uses XML::Twig, then please tell me about it. If regexps
>> are still faster, I will use regexps.
> 
> You did not specify what you want to do with the lexemes. Anyway,
> you might try something like this:
> 
> use strict;
> use XML::Rules;
> use Data::Dumper;
> 
> my $parser = XML::Rules->new(
>     stripspaces => 7,
>     rules => {
>         _default      => 'content',
>         InflectedForm => 'as array',
>         Lexem => sub {
>             #print Dumper($_[1]);
>             print "$_[1]->{Form}\n";
>             foreach (@{$_[1]->{InflectedForm}}) {
>                 print "  $_->{InflectionId}: $_->{Form}\n";
>             }
>         },
>     }
> );
> 
> $parser->parse(\*DATA);
> 
> __DATA__
> <?xml version="1.0" encoding="UTF-8"?>
> <Lexems>
>  <Lexem id="1">
> ...
> 
> XML::Rules sits on top of XML::Parser::Expat so I would not expect 
> this to be 25 times faster than XML::Twig, but it might be a bit 
> quicker. Or not.
> 
> Jenda



I forgot to say that the script I previously sent to the list also crashed Perl
and popped up an error window with:

perl.exe - Application Error
The instruction at "0x7c910f20" referenced memory at "0x00000004". The memory 
could not be "read".  Click on OK to terminate the program 

I created a smaller XML file with only ~100 lines and ran that script again,
and it worked fine.

But it doesn't work with the entire XML file, which is larger than 200 MB:
it crashes Perl and I don't know why.

Strangely, it now just crashes Perl without returning that
"Free to wrong pool" error.
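
For reference, the chunk-by-chunk regexp approach mentioned above could look
roughly like the sketch below. It is only an illustration under loud
assumptions: the real file layout is not shown in this thread, so the sample
data, the child elements (Form, InflectedForm, InflectionId) and the helper
name each_lexem are all hypothetical. The idea is to set Perl's input record
separator $/ to the closing </Lexem> tag so that only one record is held in
memory at a time.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real element layout is not shown in the thread.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<Lexems>
 <Lexem id="1">
  <Form>go</Form>
  <InflectedForm><InflectionId>1</InflectionId><Form>goes</Form></InflectedForm>
  <InflectedForm><InflectionId>2</InflectionId><Form>went</Form></InflectedForm>
 </Lexem>
 <Lexem id="2">
  <Form>be</Form>
 </Lexem>
</Lexems>
XML

# Hypothetical helper: read one <Lexem>...</Lexem> record at a time and
# pass a small hash describing it to the callback.
sub each_lexem {
    my ($fh, $callback) = @_;
    local $/ = '</Lexem>';    # a "line" now ends at the closing tag
    while (defined(my $chunk = <$fh>)) {
        # Skip chunks without a complete record (e.g. the file's tail).
        next unless $chunk =~ m{<Lexem\b[^>]*>(.*)</Lexem>}s;
        my $body = $1;
        my @inflected;
        # Cut out the nested InflectedForm elements first, so that the
        # <Form> left in $body is the lexem's own form.
        while ($body =~ s{<InflectedForm>(.*?)</InflectedForm>}{}s) {
            my $inner = $1;
            my ($id)   = $inner =~ m{<InflectionId>(.*?)</InflectionId>}s;
            my ($form) = $inner =~ m{<Form>(.*?)</Form>}s;
            push @inflected, { InflectionId => $id, Form => $form };
        }
        my ($form) = $body =~ m{<Form>(.*?)</Form>}s;
        $callback->({ Form => $form, InflectedForm => \@inflected });
    }
    return;
}

# An in-memory filehandle stands in for the real 200 MB file here.
open my $fh, '<', \$xml or die "Cannot open in-memory handle: $!";
my @lines;
each_lexem($fh, sub {
    my ($lexem) = @_;
    push @lines, $lexem->{Form};
    push @lines, "  $_->{InflectionId}: $_->{Form}"
        for @{ $lexem->{InflectedForm} };
});
close $fh;
print "$_\n" for @lines;
```

A sketch like this ignores entities, CDATA sections, comments and attributes,
so it only works when the file's layout is known and regular; a real parser
remains the safer choice when it does not crash.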

Octavian

