Re: Fast XML parser?
From: Jenda Krynicky je...@krynicky.cz
> From: Octavian Rasnita orasn...@gmail.com
> To: beginners@perl.org
> Subject: Fast XML parser?
> Date sent: Thu, 25 Oct 2012 14:33:15 +0300
>
>> Hi,
>>
>> Can you recommend an XML parser which is faster than XML::Twig?
>>
>> I need an XML parser that can parse XML files chunk by chunk and that works faster (much faster) than XML::Twig, because I tried using this module but it is very slow.
>>
>> I tried something like the code below, but I have also tried a version that just opens the file and parses it using regular expressions. The inelegant regexp version is 25 times faster than the one which uses XML::Twig, and it also uses less memory.
>>
>> If you think there is a module for parsing XML which would work faster than regular expressions, or if I can substantially improve the program which uses XML::Twig, then please tell me about it. If regexp is still faster, I will use regexp.
>
> You did not specify what you want to do with the lexemes. Anyway, you might try something like this:
>
> use strict;
> use XML::Rules;
> use Data::Dumper;
>
> my $parser = XML::Rules->new(
>     stripspaces => 7,
>     rules => {
>         _default => 'content',
>         InflectedForm => 'as array',
>         Lexem => sub {
>             #print Dumper($_[1]);
>             print "$_[1]->{Form}\n";
>             foreach (@{$_[1]->{InflectedForm}}) {
>                 print "$_->{InflectionId}: $_->{Form}\n";
>             }
>         },
>     }
> );
>
> $parser->parse(\*DATA);
>
> __DATA__
> <?xml version="1.0" encoding="UTF-8"?>
> <Lexems>
> <Lexem id="1">
> ...
>
> XML::Rules sits on top of XML::Parser::Expat, so I would not expect this to be 25 times faster than XML::Twig, but it might be a bit quicker. Or not.
>
> Jenda

Hi Jenda,

I tried your program above, modified as below, but it gives the error:

Free to wrong pool 3967d8 not 20202020 at e:/usr/lib/XML/Parser/Expat.pm line 470.

I was able to install XML::Rules under Windows using cpanm with no problems, so it should be working...

The program:

use strict;
use XML::Rules;
use Data::Dumper;

my $parser = XML::Rules->new(
    stripspaces => 7,
    rules => {
        _default => 'content',
        InflectedForm => 'as array',
        Lexem => sub {
            #print Dumper($_[1]);
            #print "$_[1]->{Form}\n";
            foreach (@{$_[1]->{InflectedForm}}) {
                #print "$_->{InflectionId}: $_->{Form}\n";
            }
        },
    }
);

my $file = '/path/to/file.xml';
open my $xml, '<:utf8', $file or die "Cannot open $file: $!";
$parser->parse( $xml );

Thanks.

Octavian

--
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/
Re: Fast XML parser?
I forgot to say that the script I previously sent to the list also crashed Perl, and it popped up an error window with:

perl.exe - Application Error
The instruction at 0x7c910f20 referenced memory at 0x0004. The memory could not be read.
Click on OK to terminate the program

I created a smaller XML file with only ~100 lines and ran that script again, and it worked fine.

But it doesn't work with the entire XML file, which is more than 200 MB: it crashes Perl and I don't know why.

Strangely, it now just crashes Perl without reporting that "Free to wrong pool" error.

Octavian
my first useful program...any corrections/suggestions?
I made a small program that displays an X::Osd bar showing my volume percentage (on a GNU/Linux box). It works, but I'd like any suggestions or corrections (I'm not confident about my skills, I suppose).

So, here is how it works:
1) Have a named pipe defined in the $OSD_VOLUME environment variable.
2) Run the program in the background.
3) When 'up', 'down' or 'toggle' is echoed into the named pipe, it raises the volume, lowers it, or toggles the mute state using the amixer program.

I have the following questions:
1) Is the code OK? (I mean, is there anything I should avoid or add?)
2) Is there a better way to make a perl program and a shell script and/or window manager communicate? (I really didn't love the named pipe solution, but I didn't know of anything else.)

Enough words, here is the code:

#!/usr/bin/perl

use strict;
use warnings;

# extra modules
use X::Osd;

# create osd bar (two output lines)
my $osd = X::Osd->new(2);

# osd bar properties
$osd->set_font("-*-terminus-bold-*-*-*-18-*-*-*-*-*-*-*");
$osd->set_shadow_offset(1);
$osd->set_pos(XOSD_bottom);
$osd->set_align(XOSD_center);
$osd->set_horizontal_offset(0);
$osd->set_vertical_offset(30);
$osd->set_timeout(5);

# locate (and if missing create) named pipe
my $fifo_file = (defined $ENV{OSD_VOLUME}) ? $ENV{OSD_VOLUME} : glob("~/.osd-volume.fifo");
unless (-p $fifo_file) {
    # delete non-named pipe file (risky)
    unlink $fifo_file or die "cannot remove $fifo_file: $!";
    # create the named pipe
    require POSIX;
    POSIX::mkfifo($fifo_file, 0600) or die "cannot mkfifo $fifo_file: $!";
}

# open named pipe
open(FIFO, "+<", $fifo_file) or die "cannot open $fifo_file: $!";

# constantly read from it
my $vol;
while (chomp(my $fifo_line = <FIFO>)) {
    if ($fifo_line eq 'up') {
        $vol = '3%+';
    } elsif ($fifo_line eq 'down') {
        $vol = '3%-';
    } elsif ($fifo_line eq 'toggle') {
        $vol = 'toggle';
    } else {
        die "invalid input: $fifo_line";
    }
    # set new volume value and read the output
    my $amixer = `amixer sset Master,0 $vol` or die "error: $!";
    # get new volume value
    $vol = $1 if ($amixer =~ m/(\d{1,3})(?:%)/);
    # change output color if volume is muted
    if ($amixer =~ m/\[off\]/) {
        $osd->set_colour("#DD");
    } else {
        $osd->set_colour("#1E90FF");
    }
    # print volume bar
    $osd->string(0, 'Master Volume:'.$vol.'%');
    $osd->percentage(1, $vol);
}

# close pipe and exit (with error)
# (impossible to get here)
close(FIFO);
exit(0);
Re: my first useful program...any corrections/suggestions?
Hi Thanos,

some comments about your code.

On Wed, 31 Oct 2012 13:27:05 +0200 Thanos Zygouris athanasios.zygou...@gmail.com wrote:

> I made a small program that displays an X::Osd bar showing my volume percentage (on a GNU/Linux box). It works, but I'd like any suggestions or corrections (I'm not confident about my skills, I suppose).
>
> So, here is how it works:
> 1) Have a named pipe defined in the $OSD_VOLUME environment variable.
> 2) Run the program in the background.
> 3) When 'up', 'down' or 'toggle' is echoed into the named pipe, it raises the volume, lowers it, or toggles the mute state using the amixer program.
>
> I have the following questions:
> 1) Is the code OK? (I mean, is there anything I should avoid or add?)
> 2) Is there a better way to make a perl program and a shell script and/or window manager communicate? (I really didn't love the named pipe solution, but I didn't know of anything else.)
>
> Enough words, here is the code:
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;

strict and warnings are a good idea. Well done.

> # extra modules
> use X::Osd;

The comment is not really needed. It doesn't hurt much though.

> # create osd bar (two output lines)
> my $osd = X::Osd->new(2);
>
> # osd bar properties
> $osd->set_font("-*-terminus-bold-*-*-*-18-*-*-*-*-*-*-*");
> $osd->set_shadow_offset(1);
> $osd->set_pos(XOSD_bottom);
> $osd->set_align(XOSD_center);
> $osd->set_horizontal_offset(0);
> $osd->set_vertical_offset(30);
> $osd->set_timeout(5);
>
> # locate (and if missing create) named pipe
> my $fifo_file = (defined $ENV{OSD_VOLUME}) ? $ENV{OSD_VOLUME} : glob("~/.osd-volume.fifo");

Better to use exists instead of defined here, and if you're on perl 5.10 or later you might wish to use the // (defined-or) operator. The glob can be replaced by $ENV{HOME}.

> unless (-p $fifo_file) {
>     # delete non-named pipe file (risky)
>     unlink $fifo_file or die "cannot remove $fifo_file: $!";
>     # create the named pipe
>     require POSIX;
>     POSIX::mkfifo($fifo_file, 0600) or die "cannot mkfifo $fifo_file: $!";
> }
>
> # open named pipe
> open(FIFO, "+<", $fifo_file) or die "cannot open $fifo_file: $!";

You should use lexical file handles instead of typeglobs.

> # constantly read from it
> my $vol;
> while (chomp(my $fifo_line = <FIFO>)) {
>     if ($fifo_line eq 'up') {
>         $vol = '3%+';
>     } elsif ($fifo_line eq 'down') {
>         $vol = '3%-';
>     } elsif ($fifo_line eq 'toggle') {
>         $vol = 'toggle';
>     } else {
>         die "invalid input: $fifo_line";
>     }

I would do that using a dispatch table.

>     # set new volume value and read the output
>     my $amixer = `amixer sset Master,0 $vol` or die "error: $!";

You should put $vol in a more inner scope. With `...` you risk shell-variable injection; maybe look at IPC::Run.

>     # get new volume value
>     $vol = $1 if ($amixer =~ m/(\d{1,3})(?:%)/);
>     # change output color if volume is muted
>     if ($amixer =~ m/\[off\]/) {

This can be done using index (see perldoc -f index).

>         $osd->set_colour("#DD");
>     } else {
>         $osd->set_colour("#1E90FF");
>     }
>     # print volume bar
>     $osd->string(0, 'Master Volume:'.$vol.'%');
>     $osd->percentage(1, $vol);
> }
>
> # close pipe and exit (with error)
> # (impossible to get here)
> close(FIFO);
> exit(0);

Regards,

Shlomi Fish

--
-----------------------------------------------------------------
Shlomi Fish http://www.shlomifish.org/
Escape from GNU Autohell - http://www.shlomifish.org/open-source/anti/autohell/

XSLT is the number one cause of programmers’ suicides since Visual Basic 1.0.

Please reply to list if it's a mailing list post - http://shlom.in/reply .
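
A minimal sketch of how the suggestions above might fit together, assuming Perl 5.10 or later and a Unix-like system: defined-or for the FIFO path, a lexical filehandle, a lookup table in place of the if/elsif chain (a plain hash is enough here because each command only maps to an amixer argument), and a list-form piped open so that no shell ever sees the argument. The names %AMIXER_ARG, amixer_arg_for, run_amixer and main are hypothetical, the X::Osd calls are left out, and skipping unknown commands rather than dying is a design choice of this sketch, not part of the original script.

```perl
use strict;
use warnings;
use 5.010;    # for the // (defined-or) operator

# Lookup table: FIFO command => amixer argument (values from the original script).
my %AMIXER_ARG = (
    up     => '3%+',
    down   => '3%-',
    toggle => 'toggle',
);

# Return the amixer argument for one line read from the FIFO,
# or undef when the command is not known.
sub amixer_arg_for {
    my ($line) = @_;
    chomp $line;
    return $AMIXER_ARG{$line};
}

# Run amixer without a shell: the list form of a piped open hands each
# argument to the program directly, so nothing is interpreted by /bin/sh.
sub run_amixer {
    my ($arg) = @_;
    open my $pipe, '-|', 'amixer', 'sset', 'Master,0', $arg
        or die "cannot run amixer: $!";
    local $/;    # slurp the whole output at once
    my $output = <$pipe>;
    close $pipe;
    return $output // '';
}

# Read commands from the FIFO forever, using a lexical filehandle.
sub main {
    my $fifo_file = $ENV{OSD_VOLUME} // "$ENV{HOME}/.osd-volume.fifo";
    open my $fifo, '+<', $fifo_file or die "cannot open $fifo_file: $!";
    while (defined(my $line = <$fifo>)) {
        my $arg = amixer_arg_for($line);
        next unless defined $arg;    # skip unknown commands
        print run_amixer($arg);
    }
    return;
}
```

Calling main() at the end of the script starts the loop. Testing the line with defined and chomping it afterwards keeps the loop alive even when a writer sends a line without a trailing newline, which the chomp-in-the-condition form would treat as end of input.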
Re: my first useful program...any corrections/suggestions?
Hello:

I'm not familiar with X::Osd so I'm just assuming that is all happy. :)

/pedantic on

On Wed, Oct 31, 2012 at 01:27:05PM +0200, Thanos Zygouris wrote:
>     # delete non-named pipe file (risky)
>     unlink $fifo_file or die "cannot remove $fifo_file: $!";

You might as well just die instead of deleting an existing file. I would consider it user error for an existing file to not be a fifo, which means that your program should just report it and refuse to function. It is then possible for the user to decide what to do.

unless(-p $fifo_file) {
    if(-e $fifo_file) {
        die "Fatal: '$fifo_file' is not a fifo.";
    }

    # ...
}

>     if ($fifo_line eq 'up') {
>         $vol = '3%+';
>     } elsif ($fifo_line eq 'down') {
>         $vol = '3%-';
>     } elsif ($fifo_line eq 'toggle') {
>         $vol = 'toggle';
>     } else {
>         die "invalid input: $fifo_line";
>     }

It seems strange to die here. You might want to log the error to a file instead. You probably don't want your background process to die because of an invalid request from an external program. :) It can just do nothing instead.

    else {
        # Optionally log somewhere...
        next;
    }

>     $vol = $1 if ($amixer =~ m/(\d{1,3})(?:%)/);

As Shlomi Fish pointed out, using an outer scope for $vol and conditionally setting it here is unnecessary. Also, the m in m// is unnecessary when the delimiters are slashes. That's a personal preference, but some might find that it clutters the code. It also seems that a cluster group '(?:pattern)' is unnecessary around the % symbol.

Also, I note that you're fetching the volume here, but not using it until after the next if...else. I'd probably rearrange the code to keep it closer to where it's needed.

Also, I believe that \d can match much more than [0-9]. You might prefer to use [0-9] instead, assuming that is what you meant.

>     # change output color if volume is muted
>     if ($amixer =~ m/\[off\]/) {
>         $osd->set_colour("#DD");
>     } else {
>         $osd->set_colour("#1E90FF");
>     }

I see the two calls to X::Osd::set_colour as redundant. I'd probably normalize them into a single call with a variable colour.

>     # print volume bar
>     $osd->string(0, 'Master Volume:'.$vol.'%');
>     $osd->percentage(1, $vol);

It seems to me that there's no point in updating the string or percentage if the volume hasn't been updated.

I might write that like this:

    my $colour = $amixer =~ /\[off\]/ ? '#DD' : '#1E90FF';

    $osd->set_colour($colour);

    my ($vol) = $amixer =~ /([0-9]{1,3})%/;

    if(defined $vol) {
        $osd->string(0, "Master Volume: ${vol}%");
        $osd->percentage(1, $vol);
    }

That's my two cents. :)

Regards,

--
Brandon McCaig bamcc...@gmail.com bamcc...@castopulence.org
Castopulence Software https://www.castopulence.org/
Blog http://www.bamccaig.com/
perl -E '$_=q{V zrna gur orfg jvgu jung V fnl. }.
q{Vg qbrfa'\''g nyjnlf fbhaq gung jnl.};
tr/A-Ma-mN-Zn-z/N-Zn-zA-Ma-m/;say'
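
The parsing step discussed above can be tried without X::Osd or a sound card. The following is a small sketch, assuming Perl 5.10 or later; parse_amixer is a hypothetical helper name, and the sample line only imitates the general shape of amixer output (the exact text varies between sound cards, so treat it as an assumption).

```perl
use strict;
use warnings;
use 5.010;    # for the // (defined-or) operator

# Extract the volume percentage and the mute state from amixer's output.
# Returns (undef, $muted) when no percentage is present.
sub parse_amixer {
    my ($output) = @_;
    my ($vol) = $output =~ /([0-9]{1,3})%/;       # list context: first capture or nothing
    my $muted = $output =~ /\[off\]/ ? 1 : 0;
    return ($vol, $muted);
}

# Assumed sample line; real amixer output may look different.
my $sample = "  Mono: Playback 23 [74%] [-13.50dB] [off]\n";
my ($vol, $muted) = parse_amixer($sample);
printf "volume=%s muted=%d\n", $vol // 'unknown', $muted;
```

Keeping the parsing in one function means the colour choice and the bar update can both be driven from its return values, and the function can be checked against captured amixer output before it is wired to the display.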
Re: Fast XML parser?
From: Octavian Rasnita orasn...@gmail.com
> I forgot to say that the script I previously sent to the list also crashed Perl, and it popped up an error window with:
>
> perl.exe - Application Error
> The instruction at 0x7c910f20 referenced memory at 0x0004. The memory could not be read.
> Click on OK to terminate the program
>
> I created a smaller XML file with only ~100 lines and ran that script again, and it worked fine.
>
> But it doesn't work with the entire XML file, which is more than 200 MB: it crashes Perl and I don't know why.
>
> Strangely, it now just crashes Perl without reporting that "Free to wrong pool" error.
>
> Octavian

That must be something either within your perl or within XML::Parser::Expat. What versions of those two do you have? Any chance you could update?

Jenda
= je...@krynicky.cz === http://Jenda.Krynicky.cz =
When it comes to wine, women and song, wizards are allowed
to get drunk and croon as much as they like.
    -- Terry Pratchett in Sourcery
Re: Fast XML parser?
On Wed, Oct 31, 2012 at 5:39 PM, Jenda Krynicky je...@krynicky.cz wrote:

> From: Octavian Rasnita orasn...@gmail.com
>> I forgot to say that the script I previously sent to the list also crashed Perl, and it popped up an error window with:
>>
>> perl.exe - Application Error
>> The instruction at 0x7c910f20 referenced memory at 0x0004. The memory could not be read.
>> Click on OK to terminate the program
>>
>> I created a smaller XML file with only ~100 lines and ran that script again, and it worked fine.
>>
>> But it doesn't work with the entire XML file, which is more than 200 MB: it crashes Perl and I don't know why.
>>
>> Strangely, it now just crashes Perl without reporting that "Free to wrong pool" error.
>>
>> Octavian
>
> That must be something either within your perl or within XML::Parser::Expat. What versions of those two do you have? Any chance you could update?
>
> Jenda

The memory issue is really an issue of the module itself. I have had those problems as well: the more complex the XML structure, the more memory it takes up and the faster you will run out. I simply moved on to other modules, as I could not afford to spend my time trying to figure out a workaround.

Regards,

Rob Coops
Re: Fast XML parser?
From: Jenda Krynicky je...@krynicky.cz
> From: Octavian Rasnita orasn...@gmail.com
>> I forgot to say that the script I previously sent to the list also crashed Perl, and it popped up an error window with:
>>
>> perl.exe - Application Error
>> The instruction at 0x7c910f20 referenced memory at 0x0004. The memory could not be read.
>> Click on OK to terminate the program
>>
>> I created a smaller XML file with only ~100 lines and ran that script again, and it worked fine.
>>
>> But it doesn't work with the entire XML file, which is more than 200 MB: it crashes Perl and I don't know why.
>>
>> Strangely, it now just crashes Perl without reporting that "Free to wrong pool" error.
>>
>> Octavian
>
> That must be something either within your perl or within XML::Parser::Expat. What versions of those two do you have? Any chance you could update?
>
> Jenda

perl -v

This is perl 5, version 14, subversion 2 (v5.14.2) built for MSWin32-x86-multi-thread
(with 1 registered patch, see perl -V for more detail)

Copyright 1987-2011, Larry Wall

Binary build 1402 [295342] provided by ActiveState http://www.ActiveState.com
Built Oct 7 2011 15:49:44
...

cpanm XML::Parser::Expat
Set up gcc environment - 3.4.5 (mingw-vista special r3)
XML::Parser::Expat is up to date. (2.41)

I think Perl is also new enough...

Anyway, I solved the problem by parsing the XML content using regular expressions, and it works very fast this way. And the regexp solution is no uglier or harder to maintain than using an XML parser...

Octavian
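
Octavian's regexp program was not posted, so the following is only a sketch of one way such a chunk-by-chunk regexp reader might look; it is not his code. It assumes that every record is a <Lexem> element with a <Form> child and optional <InflectedForm> children that each contain a <Form> (the element names come from the rules posted earlier in the thread; their nesting is a guess), and each_lexem is a hypothetical helper name. Regular expressions like these break on CDATA sections, comments, entities and nested elements of the same name, so this only suits files whose layout is known and regular.

```perl
use strict;
use warnings;

# Call $callback once per <Lexem> record. The closing tag is used as the
# input record separator, so only one record is in memory at a time.
sub each_lexem {
    my ($fh, $callback) = @_;
    local $/ = '</Lexem>';
    while (defined(my $chunk = <$fh>)) {
        # Skip trailing text that holds no complete record.
        next unless $chunk =~ /<Lexem\b[^>]*>(.*)<\/Lexem>/s;
        my $body = $1;
        # Collect the inflected forms, then remove them so that the
        # remaining <Form> is the lexem's own.
        my @inflected = $body =~ /<InflectedForm>.*?<Form>(.*?)<\/Form>.*?<\/InflectedForm>/sg;
        $body =~ s/<InflectedForm>.*?<\/InflectedForm>//sg;
        my ($form) = $body =~ /<Form>(.*?)<\/Form>/s;
        $callback->($form, \@inflected);
    }
    return;
}

# Assumed sample data, read through an in-memory filehandle.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<Lexems>
<Lexem id="1"><Form>go</Form>
<InflectedForm><InflectionId>1</InflectionId><Form>goes</Form></InflectedForm>
<InflectedForm><InflectionId>2</InflectionId><Form>went</Form></InflectedForm>
</Lexem>
<Lexem id="2"><Form>be</Form></Lexem>
</Lexems>
XML

open my $fh, '<', \$xml or die "cannot open in-memory file: $!";
each_lexem($fh, sub {
    my ($form, $inflected) = @_;
    print "$form: @$inflected\n";
});
close $fh;
```

For a real file, the in-memory handle would be replaced by a handle opened on the file itself; the record separator trick is what keeps memory use flat regardless of file size.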