Re: Fast XML parser?

2012-10-31 Thread Octavian Rasnita
From: Jenda Krynicky je...@krynicky.cz

 From:   Octavian Rasnita orasn...@gmail.com
 To: beginners@perl.org
 Subject:Fast XML parser?
 Date sent:  Thu, 25 Oct 2012 14:33:15 +0300
 
 Hi,
 
 Can you recommend an XML parser which is faster than XML::Twig?
 
 I need to use an XML parser that can parse the XML files chunk by chunk and 
 which works faster (much faster) than XML::Twig, because I tried using this 
 module but it is very slow.
 
 I tried something like the code below, and I have also tried a version
 that just opens the file and parses it using regular expressions.
 The inelegant regexp version is 25 times faster than the one
 which uses XML::Twig, and it also uses less memory. 
 
 If you think there is a module for parsing XML which would work faster
 than regular expressions, or if I can substantially improve the
 program which uses XML::Twig  then please tell me about it. If regexp
 will still be faster, I will use regexp. 
 
 You did not specify what you want to do with the lexemes. Anyway,
 you might try something like this:
 
 use strict;
 use XML::Rules;
 use Data::Dumper;
 
 my $parser = XML::Rules->new(
     stripspaces => 7,
     rules => {
         _default => 'content',
         InflectedForm => 'as array',
         Lexem => sub {
             #print Dumper($_[1]);
             print "$_[1]->{Form}\n";
             foreach (@{$_[1]->{InflectedForm}}) {
                 print "  $_->{InflectionId}: $_->{Form}\n";
             }
         },
     }
 );
 
 $parser->parse(\*DATA);
 
 __DATA__
 <?xml version="1.0" encoding="UTF-8"?>
 <Lexems>
  <Lexem id="1">
 ...
 
 XML::Rules sits on top of XML::Parser::Expat so I would not expect 
 this to be 25 times faster than XML::Twig, but it might be a bit 
 quicker. Or not.
 
 Jenda



Hi Jenda,

I tried your program above, modified as below, but it gives the error:

Free to wrong pool 3967d8 not 20202020 at e:/usr/lib/XML/Parser/Expat.pm line 
470.

I was able to install XML::Rules under Windows using cpanm with no problems, so 
it should be working...

The program:

use strict;
use XML::Rules;
use Data::Dumper;

my $parser = XML::Rules->new(
    stripspaces => 7,
    rules => {
        _default => 'content',
        InflectedForm => 'as array',
        Lexem => sub {
            #print Dumper($_[1]);
            #print "$_[1]->{Form}\n";
            foreach (@{$_[1]->{InflectedForm}}) {
                #print "  $_->{InflectionId}: $_->{Form}\n";
            }
        },
    }
);

my $file = '/path/to/file.xml';

open my $xml, '<:utf8', $file or die "Cannot open $file: $!";

$parser->parse( $xml );


Thanks.

Octavian


--
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/




Re: Fast XML parser?

2012-10-31 Thread Octavian Rasnita

I forgot to say that the script I previously sent to the list also crashed Perl 
and it popped an error window with:

perl.exe - Application Error
The instruction at 0x7c910f20 referenced memory at 0x0004. The memory 
could not be read.  Click on OK to terminate the program 

I created a smaller XML file with only ~100 lines and ran that script
again, and it worked fine.

But it doesn't work with the entire XML file, which is larger than 200 MB:
it crashes Perl and I don't know why.

Strangely, it now just crashes Perl without returning that
"Free to wrong pool" error.

Octavian






my first useful program...any corrections/suggestions?

2012-10-31 Thread Thanos Zygouris
I made a small program that displays an X::Osd bar showing my volume
percentage (on a GNU/Linux box). It works, but I'd like to have any suggestions
or corrections about it (I'm not confident about my skills, I suppose).

So, here is how it works:
1) Have a named pipe defined at $OSD_VOLUME environmental variable.
2) Run the program in the background.
3) When echoing 'up', 'down' or 'toggle' in the named pipe, it raises,
   lowers or toggles mute state using amixer program.

I have the following questions:
1) Is the code OK? (I mean is there anything I should avoid or add?)
2) Is there a better solution to make a perl program and a shell script
   and/or window manager communicate? (I really didn't love that named
   pipe solution, but I didn't know of anything else)

Enough words, here is the code:

#!/usr/bin/perl

use strict;
use warnings;
# extra modules
use X::Osd;

# create osd bar (two output lines)
my $osd = X::Osd->new(2);
# osd bar properties
$osd->set_font("-*-terminus-bold-*-*-*-18-*-*-*-*-*-*-*");
$osd->set_shadow_offset(1);
$osd->set_pos(XOSD_bottom);
$osd->set_align(XOSD_center);
$osd->set_horizontal_offset(0);
$osd->set_vertical_offset(30);
$osd->set_timeout(5);

# locate (and if missing create) named pipe
my $fifo_file = (defined $ENV{OSD_VOLUME}) ? $ENV{OSD_VOLUME} :
    glob("~/.osd-volume.fifo");
unless (-p $fifo_file) {
    # delete non-named pipe file (risky)
    unlink $fifo_file or die "cannot remove $fifo_file: $!";
    # create the named pipe
    require POSIX;
    POSIX::mkfifo($fifo_file, 0600) or die "cannot mkfifo $fifo_file: $!";
}

# open named pipe
open(FIFO, "+<", $fifo_file) or die "cannot open $fifo_file: $!";
# constantly read from it
my $vol;
while (chomp(my $fifo_line = <FIFO>)) {
    if ($fifo_line eq 'up') {
        $vol = '3%+';
    } elsif ($fifo_line eq 'down') {
        $vol = '3%-';
    } elsif ($fifo_line eq 'toggle') {
        $vol = 'toggle';
    } else {
        die "invalid input: $fifo_line";
    }
    # set new volume value and read the output
    my $amixer = `amixer sset Master,0 $vol` or die "error: $!";
    # get new volume value
    $vol = $1 if ($amixer =~ m/(\d{1,3})(?:%)/);
    # change output color if volume is muted
    if ($amixer =~ m/\[off\]/) {
        $osd->set_colour("#DD");
    } else {
        $osd->set_colour("#1E90FF");
    }
    # print volume bar
    $osd->string(0, 'Master Volume:'.$vol.'%');
    $osd->percentage(1, $vol);
}
# close pipe and exit (with error)
# (impossible to get here)
close(FIFO);
exit(0);






Re: my first useful program...any corrections/suggestions?

2012-10-31 Thread Shlomi Fish
Hi Thanos,

some comments about your code.

On Wed, 31 Oct 2012 13:27:05 +0200
Thanos Zygouris athanasios.zygou...@gmail.com wrote:

 I made a small program that displays an X::Osd bar showing my volume
 percentage (on a GNU/Linux box). It works, but I'd like to have any suggestions
 or corrections about it (I'm not confident about my skills, I suppose).
 
 So, here is how it works:
 1) Have a named pipe defined at $OSD_VOLUME environmental variable.
 2) Run the program in the background.
 3) When echoing 'up', 'down' or 'toggle' in the named pipe, it raises,
lowers or toggles mute state using amixer program.
 
 I have the following questions:
 1) Is the code OK? (I mean is there anything I should avoid or add?)
 2) Is there a better solution to make a perl program and a shell script
and/or window manager communicate? (I really didn't love that named
pipe solution, but I didn't know of anything else)
 
 Enough words, here is the code:
 
 #!/usr/bin/perl
 
 use strict;
 use warnings;

strict and warnings are a good idea. Well done.

 # extra modules
 use X::Osd;

The comment is not really needed. It doesn't hurt much though.

 
 # create osd bar (two output lines)
 my $osd = X::Osd->new(2);
 # osd bar properties
 $osd->set_font("-*-terminus-bold-*-*-*-18-*-*-*-*-*-*-*");
 $osd->set_shadow_offset(1);
 $osd->set_pos(XOSD_bottom);
 $osd->set_align(XOSD_center);
 $osd->set_horizontal_offset(0);
 $osd->set_vertical_offset(30);
 $osd->set_timeout(5);
 
 # locate (and if missing create) named pipe
 my $fifo_file = (defined $ENV{OSD_VOLUME}) ? $ENV{OSD_VOLUME} :
 glob("~/.osd-volume.fifo");

Better to use exists instead of defined here, and if you're on perl 5.10 or
later you might wish to use the // (defined-or) operator. The glob can be
replaced by $ENV{HOME}.
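
A minimal sketch of that suggestion, assuming perl 5.10 or later. The helper
name fifo_path is made up for illustration; the default path mirrors the one
in the original program.

```perl
use strict;
use warnings;
use 5.010;    # the // (defined-or) operator needs perl 5.10 or later

# Return the FIFO path: OSD_VOLUME if it is defined, otherwise a default
# under the home directory (no glob needed to expand ~).
# fifo_path is a hypothetical helper, not part of the original program.
sub fifo_path {
    my ($env) = @_;    # hash reference, normally \%ENV
    return $env->{OSD_VOLUME} // "$env->{HOME}/.osd-volume.fifo";
}

say fifo_path(\%ENV);
```

Passing the environment in as a hash reference is only there to make the
helper easy to try with different values.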

 unless (-p $fifo_file) {
 # delete non-named pipe file (risky)
 unlink $fifo_file or die "cannot remove $fifo_file: $!";
 # create the named pipe
 require POSIX;
 POSIX::mkfifo($fifo_file, 0600) or die "cannot mkfifo $fifo_file: $!";
 }
 
 # open named pipe
 open(FIFO, "+<", $fifo_file) or die "cannot open $fifo_file: $!";

You should use lexical file handles instead of typeglobs.
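
A rough sketch of the same open with a lexical handle. A temporary file
stands in for the named pipe here (an assumption, so that the example runs
anywhere); the real program would pass the FIFO path instead.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for the FIFO so the sketch runs anywhere.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "up\ndown\n";
close($tmp_fh) or die "cannot close $path: $!";

# Lexical handle ($fifo) instead of the bareword FIFO.
open(my $fifo, '+<', $path) or die "cannot open $path: $!";
my @lines;
while (defined(my $line = <$fifo>)) {
    chomp $line;
    push @lines, $line;
}
close($fifo) or die "cannot close $path: $!";

print "$_\n" for @lines;
```

Testing defined(...) rather than the return value of chomp also keeps the
loop going on a final line that lacks a trailing newline, since chomp returns
the number of characters it removed.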

 # constantly read from it
 my $vol;
 while (chomp(my $fifo_line = <FIFO>)) {
 if ($fifo_line eq 'up') {
 $vol = '3%+';
 } elsif ($fifo_line eq 'down') {
 $vol = '3%-';
 } elsif ($fifo_line eq 'toggle') {
 $vol = 'toggle';
 } else {
 die "invalid input: $fifo_line";
 }

I would do that using a dispatch table.
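
Something along these lines, as a sketch only. Because every branch just
picks a string, a plain lookup hash is enough here (a dispatch table in the
loose sense; the values could become code references if the actions ever
diverge). The names %amixer_arg_for and amixer_arg are made up for
illustration.

```perl
use strict;
use warnings;

# Map each accepted command to the argument passed to amixer.
my %amixer_arg_for = (
    up     => '3%+',
    down   => '3%-',
    toggle => 'toggle',
);

# Return the amixer argument, or undef for an unknown command.
sub amixer_arg {
    my ($command) = @_;
    return $amixer_arg_for{$command};
}

for my $command (qw(up down toggle bogus)) {
    my $arg = amixer_arg($command);
    my $message = defined $arg
        ? "$command => $arg\n"
        : "invalid input: $command\n";
    print $message;
}
```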

 # set new volume value and read the output
 my $amixer = `amixer sset Master,0 $vol` or die "error: $!";

You should put $vol in a more inner scope.

With `...` you risk shell-variable injection - maybe look at IPC::Run.
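
IPC::Run is not a core module, so the sketch below swaps in a different
technique: core Perl's list form of a piped open, which also bypasses the
shell. It assumes a Unix-like system (as in the original post); the helper
name capture is made up, and the running perl stands in for amixer so the
example does not depend on a sound card.

```perl
use strict;
use warnings;

# Run a command without going through the shell and return its output.
# With the list form of open, each argument reaches the program as-is,
# so shell metacharacters in it are not interpreted.
# capture is a hypothetical helper name.
sub capture {
    my @command = @_;
    open(my $out, '-|', @command) or die "cannot run $command[0]: $!";
    my $output = do { local $/; <$out> };
    close($out) or die "$command[0] failed: status $?";
    return $output;
}

# In the real program this would be something like:
#   my $amixer = capture('amixer', 'sset', 'Master,0', $vol);
# Here perl itself echoes its argument back instead.
my $echoed = capture($^X, '-e', 'print "@ARGV"', '3%+; echo injected');
print "$echoed\n";
```

The argument containing "; echo injected" comes back as plain text instead
of being run as a second command.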

 # get new volume value
 $vol = $1 if ($amixer =~ m/(\d{1,3})(?:%)/);
 # change output color if volume is muted
 if ($amixer =~ m/\[off\]/) {

This can be done using perldoc -f index.
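
For instance, as a small sketch (the helper name is_muted and the sample
lines are made up for illustration, not captured amixer output):

```perl
use strict;
use warnings;

# True if the text contains the literal substring "[off]".
# index returns -1 when the substring is not found.
sub is_muted {
    my ($amixer_output) = @_;
    return index($amixer_output, '[off]') >= 0;
}

for my $sample ('Playback 20 [65%] [off]', 'Playback 20 [65%] [on]') {
    printf "%-26s muted: %s\n", $sample, is_muted($sample) ? 'yes' : 'no';
}
```

No escaping of the square brackets is needed, because index compares literal
text rather than a pattern.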

 $osd->set_colour("#DD");
 } else {
 $osd->set_colour("#1E90FF");
 }
 # print volume bar
 $osd->string(0, 'Master Volume:'.$vol.'%');
 $osd->percentage(1, $vol);
 }
 # close pipe and exit (with error)
 # (impossible to get here)
 close(FIFO);
 exit(0);
 
 

Regards,

Shlomi Fish

-- 
-
Shlomi Fish   http://www.shlomifish.org/
Escape from GNU Autohell - http://www.shlomifish.org/open-source/anti/autohell/

XSLT is the number one cause of programmers’ suicides since Visual Basic 1.0.

Please reply to list if it's a mailing list post - http://shlom.in/reply .





Re: my first useful program...any corrections/suggestions?

2012-10-31 Thread Brandon McCaig
Hello:

I'm not familiar with X::Osd so I'm just assuming that is all
happy. :)

/pedantic on

On Wed, Oct 31, 2012 at 01:27:05PM +0200, Thanos Zygouris wrote:
 # delete non-named pipe file (risky)
 unlink $fifo_file or die "cannot remove $fifo_file: $!";

You might as well just die instead of deleting an existing file.
I would consider it user error for an existing file to not be a
fifo, which means that your program should just report it and
refuse to function. It is then possible for the user to decide
what to do.

unless(-p $fifo_file) {
    if(-e $fifo_file) {
        die "Fatal: '$fifo_file' is not a fifo.";
    }
    # ...
}

 if ($fifo_line eq 'up') {
 $vol = '3%+';
 } elsif ($fifo_line eq 'down') {
 $vol = '3%-';
 } elsif ($fifo_line eq 'toggle') {
 $vol = 'toggle';
 } else {
 die "invalid input: $fifo_line";
 }

It seems strange to die here. You might want to log the error to
a file instead. You probably don't want your background process
to die because of an invalid request from an external program. :)
It can just do nothing instead.

else {
# Optionally log somewhere...
next;
}

 $vol = $1 if ($amixer =~ m/(\d{1,3})(?:%)/);

As Shlomi Fish pointed out, using an outer scope for $vol and
conditionally setting it here is unnecessary. Also, the m prefix
is unnecessary when the delimiters are slashes (m// versus //).
That's a personal preference, but some might find that it clutters
the code. It also seems that a cluster group '(?:pattern)' is
unnecessary around the % symbol. Also I note that you're fetching
the volume here, but not using it until after the next if...else.
I'd probably rearrange the code to keep it closer to where it's
needed. Also, I believe that \d can match much more than [0-9].
You might prefer to use [0-9] instead, assuming that is what you
meant.
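
A quick way to see the difference between \d and [0-9], as a sketch. The
Arabic-Indic digit is just one example of a non-ASCII character that Unicode
classifies as a digit, and the variable names are made up.

```perl
use strict;
use warnings;

# U+0663 ARABIC-INDIC DIGIT THREE is a digit to Unicode, but not ASCII.
my $arabic_three = "\x{0663}";

my $matches_backslash_d = $arabic_three =~ /\d/    ? 1 : 0;
my $matches_ascii_class = $arabic_three =~ /[0-9]/ ? 1 : 0;

print "\\d matched:    $matches_backslash_d\n";
print "[0-9] matched: $matches_ascii_class\n";
```

amixer output is unlikely to contain such characters, so this matters more
as a habit than as a bug in this particular program.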

 # change output color if volume is muted
 if ($amixer =~ m/\[off\]/) {
 $osd->set_colour("#DD");
 } else {
 $osd->set_colour("#1E90FF");
 }

I see the two calls to X::Osd::set_colour as redundant. I'd
probably normalize them into a single call with a variable
colour.

 # print volume bar
 $osd->string(0, 'Master Volume:'.$vol.'%');
 $osd->percentage(1, $vol);

It seems to me that there's no point to updating the string or
percentage if the volume hasn't been updated. I might write that
like this:

my $colour = $amixer =~ /\[off\]/ ? '#DD' : '#1E90FF';

$osd->set_colour($colour);

my ($vol) = $amixer =~ /([0-9]{1,3})%/;

if(defined $vol)
{
    $osd->string(0, "Master Volume: ${vol}%");
    $osd->percentage(1, $vol);
}

That's my two cents. :)

Regards,


-- 
Brandon McCaig bamcc...@gmail.com bamcc...@castopulence.org
Castopulence Software https://www.castopulence.org/
Blog http://www.bamccaig.com/
perl -E '$_=q{V zrna gur orfg jvgu jung V fnl. }.
q{Vg qbrfa'\''g nyjnlf fbhaq gung jnl.};
tr/A-Ma-mN-Zn-z/N-Zn-zA-Ma-m/;say'





Re: Fast XML parser?

2012-10-31 Thread Jenda Krynicky
From: Octavian Rasnita orasn...@gmail.com
 I forgot to say that the script I previously sent to the list also crashed 
 Perl and it popped an error window with:
 
 perl.exe - Application Error
 The instruction at 0x7c910f20 referenced memory at 0x0004. The memory 
 could not be read.  Click on OK to terminate the program 
 
 I have created a smaller XML file with only ~ 100 lines and I ran that 
 script again, and it worked fine.
 
 But it doesn't work with the entire xml file which has more than 200 MB, 
 because it crashes Perl and I don't know why.
 
 And strange, but I've seen that now it just crashes Perl, but it doesn't 
 return that Free to wrong pool error.
 
 Octavian

That must be something either within your perl or the 
XML::Parser::Expat. What versions of those two do you have? Any 
chance you could update?


Jenda
= je...@krynicky.cz === http://Jenda.Krynicky.cz =
When it comes to wine, women and song, wizards are allowed 
to get drunk and croon as much as they like.
-- Terry Pratchett in Sourcery






Re: Fast XML parser?

2012-10-31 Thread Rob Coops
On Wed, Oct 31, 2012 at 5:39 PM, Jenda Krynicky je...@krynicky.cz wrote:

 That must be something either within your perl or the
 XML::Parser::Expat. What versions of those two do you have? Any
 chance you could update?





The memory issue is really an issue with the module itself. I have had those
problems as well: the more complex the XML structure, the more memory it
takes up and the sooner you will run out. I simply moved on to other
modules, as I could not afford to spend my time trying to figure out a
workaround.

Regards,

Rob Coops


Re: Fast XML parser?

2012-10-31 Thread Octavian Rasnita

From: Jenda Krynicky je...@krynicky.cz


That must be something either within your perl or the
XML::Parser::Expat. What versions of those two do you have? Any
chance you could update?


Jenda





perl -v
This is perl 5, version 14, subversion 2 (v5.14.2) built for 
MSWin32-x86-multi-thread

(with 1 registered patch, see perl -V for more detail)
Copyright 1987-2011, Larry Wall
Binary build 1402 [295342] provided by ActiveState 
http://www.ActiveState.com

Built Oct  7 2011 15:49:44
...


cpanm XML::Parser::Expat

Set up gcc environment - 3.4.5 (mingw-vista special r3)
XML::Parser::Expat is up to date. (2.41)


I think Perl is also new enough...

Anyway, I solved the problem by parsing the XML content using regular 
expressions, and it works very fast this way.
And the regexp solution is no uglier or harder to maintain than using an 
XML parser...
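
The regexp code itself was not posted, so the following is only a sketch of
what chunk-by-chunk regexp parsing could look like. The sample document is
hypothetical: the real file's layout is not shown in this thread, and the
sketch assumes Form and InflectionId are child elements. It reads one Lexem
at a time by using the closing tag as the input record separator.

```perl
use strict;
use warnings;

# Hypothetical sample; the real file's layout is not shown in the thread.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<Lexems>
 <Lexem id="1">
  <Form>walk</Form>
  <InflectedForm><InflectionId>1</InflectionId><Form>walks</Form></InflectedForm>
  <InflectedForm><InflectionId>2</InflectionId><Form>walked</Form></InflectedForm>
 </Lexem>
</Lexems>
XML

# An in-memory filehandle stands in for the 200 MB file.
open(my $in, '<', \$xml) or die "cannot open in-memory file: $!";

# Each read returns the text up to and including one closing tag,
# so only a single record is held in memory at a time.
local $/ = '</Lexem>';

my @lexems;
while (defined(my $chunk = <$in>)) {
    # Skip the trailing chunk that holds no Lexem element.
    my ($id) = $chunk =~ /<Lexem\s+id="([^"]*)"/ or next;
    my ($form) = $chunk =~ /<Form>([^<]*)<\/Form>/;
    # With /g in list context, all (InflectionId, Form) pairs come back flat.
    my @pairs = $chunk =~
        /<InflectedForm>\s*<InflectionId>([^<]*)<\/InflectionId>\s*<Form>([^<]*)<\/Form>/g;
    push @lexems, { id => $id, form => $form, inflected => {@pairs} };
}
close($in);

for my $lexem (@lexems) {
    print "$lexem->{form}\n";
    print "  $_: $lexem->{inflected}{$_}\n" for sort keys %{ $lexem->{inflected} };
}
```

Patterns like these do not handle entities, CDATA sections, comments, or a
different element order, which is the usual trade-off against a real parser.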


Octavian


