Re: Parsing XML
On Sat, Jul 19, 2008 at 6:44 AM, Rob Dixon [EMAIL PROTECTED] wrote:
> Epanda wrote:
>> I would like to know if we can parse XML with a regexp faster than
>> with the MSXML or Xerces library.
>>
>> I just want to parse an XML file, and I have read that Perl's
>> XML::Parser, which is based on Expat, is the fastest in the world.
>> I don't know Twig.
>>
>> My XML is classical:
>>
>>     <?xml version='1.0' encoding='ISO-8859-1'?>
>>     <!DOCTYPE CONF_INST SYSTEM "dtd_conf_inst.dtd">
>>     <ROOT_NODE VERS="1.0">
>>       <NODE1 TAG="VD/N1" SERIAL="3HHE">
>>         <C> <ID>OM</ID> <VAL>SAT</VAL> </C>
>>         <C> <ID>TPS</ID> <VAL>3E+01</VAL> </C>
>>       </NODE1>
>>     </ROOT_NODE>
>>
>> but can be very big.
>
> XML::Twig is built on Expat, and is especially good at processing large
> files one element at a time instead of loading the whole file into
> memory first. For instance, if your data consists of multiple
> independent NODE1 elements, XML::Twig can be set up to process them
> individually and so save memory. Take a look here:
>
>     http://www.xmltwig.com/xmltwig/
>
> But if you are hoping to write something that is faster than MSXML or
> Xerces you may be unsuccessful. Perl also has XML::LibXML and
> XML::Xerces modules if you want to try those.
>
> What do you need to do with the data? It may be possible with regular
> expressions if the data is consistently formatted.
>
> Rob
>
> --
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> http://learn.perl.org/

Hi!

I am using XML::SAX::ParserFactory. I would like to know whether it is a
better alternative than the modules suggested so far in this thread. If
not, please suggest one.

Regards,
Amit Saxena
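Rob's point about handling one NODE1 element at a time can be sketched roughly as follows. This is only an illustration under assumptions: the sample document is a reading of the XML posted above (its DOCTYPE line is left out so that no external DTD is involved), and the @records structure is invented for the example.

```perl
use strict;
use warnings;
use XML::Twig;

# Sample document: an assumed reconstruction of the XML posted in this thread.
my $xml = <<'XML';
<?xml version='1.0' encoding='ISO-8859-1'?>
<ROOT_NODE VERS="1.0">
  <NODE1 TAG="VD/N1" SERIAL="3HHE">
    <C><ID>OM</ID><VAL>SAT</VAL></C>
    <C><ID>TPS</ID><VAL>3E+01</VAL></C>
  </NODE1>
</ROOT_NODE>
XML

my @records;    # hypothetical result structure, one hash per <C> element

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called once for each NODE1 element, as soon as its end tag is parsed.
        NODE1 => sub {
            my ($t, $node) = @_;
            for my $c ($node->children('C')) {
                push @records, {
                    serial => $node->att('SERIAL'),
                    id     => $c->first_child_text('ID'),
                    val    => $c->first_child_text('VAL'),
                };
            }
            $t->purge;    # release the part of the tree parsed so far
        },
    },
);

$twig->parse($xml);    # for a file on disk, $twig->parsefile($path) instead

printf "%s: %s = %s\n", @{$_}{qw(serial id val)} for @records;
```

Because the handler purges the twig after each NODE1, memory use should depend on the size of one NODE1 rather than on the size of the whole file.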
Foreach with array of hashes
Hi all,

I am getting a syntax error with the following code:

    Line 10: @snmpSessions = (%snmpSession1, %snmpSession2);
    Line 11: foreach %snmpSession (@snmpSessions)
    Line 12: {
    Line 13:     # The following is a call to my own subroutine, which has been tested and works (:-)
    Line 14:     ethernetGlobalMode (\%snmpSession);
    Line 15: }

I get the following error:

    syntax error at C:\SVN\qa-tests\2084\testcase\TC-Ether-Type-Test.pl line 11, near "foreach %snmpSession"
    syntax error at C:\SVN\qa-tests\2084\testcase\TC-Ether-Type-Test.pl line 15, near "}"
    Execution of C:\SVN\qa-tests\2084\testcase\TC-Ether-Type-Test.pl aborted due to compilation errors.

Thanks for your comments and feedback.

Regards,
Bobby
Re: Foreach with array of hashes
On Sat, Jul 19, 2008 at 2:45 PM, Bobby Jafari [EMAIL PROTECTED] wrote:
> I am getting a syntax error with the following code:
> [...]

Hi!

The following statement is causing the problem:

    Line 10: @snmpSessions = (%snmpSession1, %snmpSession2);

The statement should be

    Line 10: @snmpSessions = (\%snmpSession1, \%snmpSession2);

Remember, arrays can contain only scalar values, so if you want to keep
hashes in an array you must store references to them. Written without
the backslashes, the two hashes are flattened into one long list of keys
and values.

Having said that, the following lines

    Line 11: foreach %snmpSession (@snmpSessions)
    Line 12: {
    Line 13:     # The following is a call to my own subroutine, which has been tested and works (:-)
    Line 14:     ethernetGlobalMode (\%snmpSession);
    Line 15: }

should change to

    Line 11: foreach $snmpS (@snmpSessions)
    Line 12: {
    Line 13:     # The following is a call to my own subroutine, which has been tested and works (:-)
    Line 14:     ethernetGlobalMode ($snmpS);
    Line 15: }

because the loop variable of foreach must be a scalar. Here it holds one
hash reference per iteration, which can be passed to the subroutine as it
is.

Let us know if you run into more problems.

Regards,
Amit Saxena
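The difference between flattening two hashes into an array and storing references to them can be seen in a small self-contained sketch. The hash contents and the body of ethernetGlobalMode below are stand-ins invented for illustration; the real subroutine was not posted.

```perl
use strict;
use warnings;

# Hypothetical session data; the real hashes were not shown in the thread.
my %snmpSession1 = (host => '192.0.2.1', community => 'public');
my %snmpSession2 = (host => '192.0.2.2', community => 'public');

# Without backslashes both hashes are flattened into a single list of
# keys and values: 2 hashes x 2 pairs x 2 scalars = 8 elements.
my @flattened = (%snmpSession1, %snmpSession2);

# With backslashes the array holds two scalars, each a hash reference.
my @snmpSessions = (\%snmpSession1, \%snmpSession2);

# Stand-in for the poster's subroutine: it receives one hash reference.
sub ethernetGlobalMode {
    my ($session) = @_;
    return "configuring $session->{host}";
}

my @results;
foreach my $session (@snmpSessions) {    # the loop variable is a scalar
    push @results, ethernetGlobalMode($session);
}

print scalar(@flattened), " flattened elements\n";
print "$_\n" for @results;
```

Inside the subroutine, `$session->{host}` reads a single value and `%$session` gives access to the whole hash again.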
Re: Parsing XML
2008/7/19 Amit Saxena [EMAIL PROTECTED]:
> I am using XML::SAX::ParserFactory. I would like to know whether it is
> a better alternative than the modules suggested so far in this thread.
> If not, please suggest one.

To a large extent the choice of parser is a matter of preference. I would
say that if you have a large amount of data in your XML, then
XML::LibXML is the best option. I haven't seen anything like it for
sheer speed.

Dp.
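For comparison, a minimal XML::LibXML sketch of reading the ID/VAL pairs might look like the following. The sample document is again an assumed reconstruction of the XML posted earlier in the thread (without its DOCTYPE line), and %value_for is a name made up for the example. Note that load_xml builds the whole document tree in memory, unlike the element-at-a-time approach described for XML::Twig.

```perl
use strict;
use warnings;
use XML::LibXML;

# Assumed reconstruction of the sample posted in this thread.
my $xml = <<'XML';
<ROOT_NODE VERS="1.0">
  <NODE1 TAG="VD/N1" SERIAL="3HHE">
    <C><ID>OM</ID><VAL>SAT</VAL></C>
    <C><ID>TPS</ID><VAL>3E+01</VAL></C>
  </NODE1>
</ROOT_NODE>
XML

# Parse the string into a DOM tree.
my $doc = XML::LibXML->load_xml(string => $xml);

# Walk every <C> element with XPath and read its two children.
my %value_for;
for my $c ($doc->findnodes('/ROOT_NODE/NODE1/C')) {
    $value_for{ $c->findvalue('ID') } = $c->findvalue('VAL');
}

print "$_ = $value_for{$_}\n" for sort keys %value_for;
```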
RE: Foreach with array of hashes
Thanks heaps. Problem solved. Code changed to:

    @snmpSessions = (\%snmpSession1, \%snmpSession2);
    foreach $snmpS (@snmpSessions)
    {
        ethernetGlobalMode ($snmpS);
    }
regular expression to exclude a phrase
Hi netters,

I was wondering if there's a way to specify "not including this phrase"
in a Perl regexp. Take these two strings:

    axbcabcd
    axbcacbd

If I say "not including the letter a or the letter b" (=~/[^a^b]/) then
neither of them will be matched. However, now I want to say "not
including the phrase 'ab'", so that string 2 is matched. I can't figure
out how to specify the second condition with a regexp.

Could anyone help me?

Thanks a lot!

Zhihua Li
Re: Parsing XML
I would try XML::LibXML, or something better such as Twig or Expat, just
to see whether it is faster, and then implement it in a C++ class.

Do you have a sample that uses Expat or Twig?

Are those libraries available in C++, and are Twig or Expat faster than
MSXML or Xerces?

Thanks

On 19 Jul, 13:15, [EMAIL PROTECTED] (Dermot) wrote:
> To a large extent the choice of parser is a matter of preference. I
> would say that if you have a large amount of data in your XML, then
> XML::LibXML is the best option. I haven't seen anything like it for
> sheer speed.
>
> Dp.
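As one possible answer to the request for an Expat sample: XML::Parser is Perl's interface to Expat, and Expat itself is a C library, so the same event-driven style is available from C or C++. The sketch below is only an illustration; the sample document is an assumed reconstruction of the XML posted earlier, and @pairs, $text and $id are names invented for the example.

```perl
use strict;
use warnings;
use XML::Parser;

# Assumed reconstruction of the sample posted in this thread.
my $xml = <<'XML';
<ROOT_NODE VERS="1.0">
  <NODE1 TAG="VD/N1" SERIAL="3HHE">
    <C><ID>OM</ID><VAL>SAT</VAL></C>
    <C><ID>TPS</ID><VAL>3E+01</VAL></C>
  </NODE1>
</ROOT_NODE>
XML

my @pairs;                        # collected [ID, VAL] pairs
my ($text, $id) = ('', undef);    # text of the current element, last ID seen

my $parser = XML::Parser->new(
    Handlers => {
        # A new element starts: forget any text seen so far.
        Start => sub { $text = '' },
        # Character data may arrive in several chunks, so accumulate it.
        Char  => sub { my ($expat, $chunk) = @_; $text .= $chunk },
        # An element ends: $text now holds its character data.
        End   => sub {
            my ($expat, $tag) = @_;
            if    ($tag eq 'ID')  { $id = $text }
            elsif ($tag eq 'VAL') { push @pairs, [ $id, $text ] }
            $text = '';
        },
    },
);

$parser->parse($xml);

print "$_->[0] = $_->[1]\n" for @pairs;
```

Nothing is kept in memory here except what the handlers store themselves, which is why this style suits very large files.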
Re: Parsing XML
Hi Rob,

I only have to read values from nodes, but I am amazed by what you say.
The XML file can be as large as 10 MB.

I thought that Expat was the fastest parser because I found this link:
http://www.xml.com/lpt/a/37. Maybe that was true in 1999, but perhaps it
no longer is.

Note that the parser has to be a modular one that I can call from a C++
method.

Could a regexp be more efficient and faster than a classical parser such
as Expat or MSXML?

Also, I have heard that there is a newer generation of parser that
combines the DOM and SAX approaches. I think it is from Microsoft, but I
have forgotten the name.

Thanks

On 19 Jul, 03:14, [EMAIL PROTECTED] (Rob Dixon) wrote:
> But if you are hoping to write something that is faster than MSXML or
> Xerces you may be unsuccessful. Perl also has XML::LibXML and
> XML::Xerces modules if you want to try those.
>
> What do you need to do with the data? It may be possible with regular
> expressions if the data is consistently formatted.
>
> Rob
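Whether a regexp is a reasonable tool here depends on how rigid the file format is. The sketch below shows the idea for the ID/VAL pairs only. It assumes the reconstruction of the sample posted earlier, and it assumes every <C> element is written exactly as <ID> followed by <VAL>, with no attributes, comments, CDATA sections or entities. It is a pattern match, not an XML parser, and %value_for is a name invented for the example.

```perl
use strict;
use warnings;

# Assumed reconstruction of the sample posted in this thread.
my $xml = <<'XML';
<ROOT_NODE VERS="1.0">
  <NODE1 TAG="VD/N1" SERIAL="3HHE">
    <C> <ID>OM</ID> <VAL>SAT</VAL> </C>
    <C> <ID>TPS</ID> <VAL>3E+01</VAL> </C>
  </NODE1>
</ROOT_NODE>
XML

# Collect ID => VAL pairs. [^<]* stops at the next tag, and \s* allows
# optional whitespace or line breaks between the tags.
my %value_for;
while ($xml =~ m{<C>\s*<ID>([^<]*)</ID>\s*<VAL>([^<]*)</VAL>\s*</C>}g) {
    $value_for{$1} = $2;
}

print "$_ = $value_for{$_}\n" for sort keys %value_for;
```

If the producer of the file ever changes the element order, adds an attribute to <C>, or escapes a character, the pattern silently skips that record, which is the main argument for using a real parser when the input is not under your control.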
perlpacktut ... not sure what I am doing wrong while following the example.. Explanation please
I am just trying to learn the pack/unpack functions in Perl:

    http://perldoc.perl.org/perlpacktut.html

I thought I followed the tutorial pretty much to the letter, but when I
typed the example into my Linux box it didn't work the way I expected.
What am I doing wrong?

    [EMAIL PROTECTED] ~]# cat -A pack_test2.pl
    #!/usr/bin/perl$
    $
    use strict;$
    use warnings;$
    $
    my $tot_income;$
    my $tot_expend;$
    use POSIX;$
    $
    my $date = POSIX::strftime("%m/%d/%Y", localtime);$
    $
    while (<DATA>) {$
        my($date, $desc, $income, $expend) = unpack("A10 A27 A7 A*", $_);$
        $tot_income += $income;$
        $tot_expend += $expend;$
        print_this($tot_income,$tot_expend);$
    }$
    $
    sub print_this {$
        my($tot_income,$tot_expend) = @_;$
        $tot_income = sprintf("%.2f", $tot_income); # Get them into $
        $tot_expend = sprintf("%.2f", $tot_expend); # financial format$
        print pack("A10 A27 A7 A*", $date, "Totals", $tot_income, $tot_expend);$
    }$
    $
    __END__$
    01/24/2001 Ahmed's Camel Emporium 1147.99$
    01/28/2001 Flea spray24.99$
    01/29/2001 Camel rides to tourists 235.00$

But it's not working out well:

    [EMAIL PROTECTED] ~]# ./!$
    ./././././././pack_test2.pl
    Argument "" isn't numeric in addition (+) at ./././././././pack_test2.pl line 14, <DATA> line 1.
    Argument "" isn't numeric in addition (+) at ./././././././pack_test2.pl line 14, <DATA> line 2.
    07/19/2008Totals 0.00 1147.9907/19/2008Totals 0.00 1172.9807/19/2008Totals 235.00 [EMAIL PROTECTED] ~]#
Re: perlpacktut ... not sure what I am doing wrong while following the example.. Explanation please
Richard Lee wrote:
> I thought I followed the tutorial pretty much to the letter, but when I
> typed the example into my Linux box it didn't work the way I expected.
> What am I doing wrong?
> [...]
>     my($date, $desc, $income, $expend) = unpack("A10 A27 A7 A*", $_);$
> [...]
> Argument "" isn't numeric in addition (+) at ./././././././pack_test2.pl line 14, <DATA> line 1.

I think you're spacing your data out with tab characters. A tab is a
single character as far as unpack is concerned, so things are getting
misaligned.

The "Argument isn't numeric" warnings are because the income and
expenditure fields are blank. If you want them to default to zero you
need to write

    defined and /\S/ or $_ = 0 for $income, $expend;

after the unpack.

You know that a literal space in the unpack pattern is ignored? If you
want to skip over a character use 'x'.

HTH,

Rob
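The three points in this reply (a tab counts as one character, spaces in the template are ignored, blank fields need a default) can be tried out with a short sketch. The sample record and its column widths are assumptions made for illustration, since the exact layout of the posted data did not survive; the rows are built with sprintf so that the widths are unambiguous.

```perl
use strict;
use warnings;

# Hypothetical record: date, description, income, (blank) expenditure.
my @fields = ('01/29/2001', 'Camel rides to tourists', '235.00', '');

# Space-padded, fixed-width version: widths 10, 27, 7, 7 with one space between.
my $fixed  = sprintf '%-10s %-27s %7s %7s', @fields;
# The same fields separated by single tab characters.
my $tabbed = join "\t", @fields;

# 'x' skips one byte between fields. It dies if it runs past the end of
# the string, so this template expects fully padded rows.
my @from_fixed  = unpack 'A10 x A27 x A7 x A*', $fixed;

# With tabs, every field after the first starts in the wrong place,
# because each tab occupies only one position.
my @from_tabbed = unpack 'A10 A27 A7 A*', $tabbed;

# Spaces inside a template are ignored: both templates mean the same thing.
my @spaced   = unpack 'A3 A3', 'abcdef';
my @unspaced = unpack 'A3A3',  'abcdef';

# Default blank (or undefined) numeric fields to zero before adding them up.
my ($date, $desc, $income, $expend) = @from_fixed;
defined($_) && /\S/ or $_ = 0 for $income, $expend;

print "fixed:  [", join('][', @from_fixed),  "]\n";
print "tabbed: [", join('][', @from_tabbed), "]\n";
print "income + expend = ", $income + $expend, "\n";
```

The bracketed output makes it easy to see where each field begins and ends, including any leading whitespace or stray tab that ended up inside a field.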
Re: regular expression to exclude a phrase
zhihuali wrote:
> I was wondering if there's a way to specify "not including this phrase"
> in a Perl regexp. Take these two strings:
>
>     axbcabcd
>     axbcacbd
>
> If I say "not including the letter a or the letter b" (=~/[^a^b]/)
> then neither of them will be matched. However, now I want to say "not
> including the phrase 'ab'", so that string 2 is matched. I can't
> figure out how to specify the second condition with a regexp. Could
> anyone help me?

/[^ab]/ will match if the string contains any character other than a or
b, so it will match both of those strings. You need to negate the whole
match, and say

    not $str =~ /[ab]/

or

    $str !~ /[ab]/

and to do the same with a string you can write

    not $str =~ /ab/

or

    $str !~ /ab/

For instance

    foreach my $str (qw/axbcabcd axbcacbd/) {
        print "$str\n" if $str !~ /ab/;
    }

outputs just

    axbcacbd

HTH,

Rob
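When the "does not contain ab" condition has to live inside the pattern itself, for example because it is combined with other conditions in one regexp, a negative lookahead is one way to express it. The sketch below compares it with the negated match from the reply above; the variable names are invented for the example.

```perl
use strict;
use warnings;

my @strings = qw(axbcabcd axbcacbd);

# /[^ab]/ asks "is there any character that is not a or b?": true for both.
my @has_other_char = grep { /[^ab]/ } @strings;

# Negating the whole match: strings that do not contain "ab".
my @without_ab = grep { $_ !~ /ab/ } @strings;

# Same condition inside the pattern: at the start of the string, assert
# that "ab" cannot be found anywhere further along.
my @lookahead = grep { /^(?!.*ab)/ } @strings;

# Variant that consumes the string one character at a time, refusing to
# step onto any position where "ab" begins.
my @tempered = grep { /^(?:(?!ab).)*$/ } @strings;

print "without ab: @without_ab\n";
print "lookahead:  @lookahead\n";
print "tempered:   @tempered\n";
```

All three filters keep only axbcacbd. The plain `!~` form is the easiest to read, so the lookahead versions are mainly useful when the exclusion must be part of a larger pattern.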
Re: perlpacktut ... not sure what I am doing wrong while following the example.. Explanation please
Richard Lee wrote:
> I am just trying to learn the pack/unpack functions in Perl:
> http://perldoc.perl.org/perlpacktut.html

You *should* have the documentation for Perl installed on your hard
drive, which will be more relevant to the version of Perl you are using.

> I thought I followed the tutorial pretty much to the letter, but when I
> typed the example into my Linux box it didn't work the way I expected.
> What am I doing wrong?
>
> [EMAIL PROTECTED] ~]# cat -A pack_test2.pl

cat -A? Are you really trying to make it difficult to just copy and run
your code?

> #!/usr/bin/perl$
> $
> use strict;$
> use warnings;$
> $
> my $tot_income;$
> my $tot_expend;$
> use POSIX;$
> $
> my $date = POSIX::strftime("%m/%d/%Y", localtime);$
> $
> while (<DATA>) {$
>     my($date, $desc, $income, $expend) = unpack("A10 A27 A7 A*", $_);$

Your unpack() format defines the fields as:

    01/24/2001 Ahmed's Camel Emporium 1147.99
    01/28/2001 Flea spray24.99
    01/29/2001 Camel rides to tourists 235.00
    ^         ^                          ^      ^
    123456789012345678901234567890123456789012345678901234567890

The second field includes leading whitespace which you don't need. The
third field is not long enough and cuts off the decimal portion of the
number. The fourth field therefore begins with the decimal portion cut
from the third field, and in numeric context that fragment is what gets
used as the fourth field's value instead of the actual fourth field.

You probably want something like this instead:

    unpack 'A11 A26 A9 A*', $_;

>     $tot_income += $income;$
>     $tot_expend += $expend;$

To suppress the "isn't numeric" warning you can do this:

    $tot_income += $income || 0;
    $tot_expend += $expend || 0;

>     print_this($tot_income,$tot_expend);$
> }$
> $
> sub print_this {$
>     my($tot_income,$tot_expend) = @_;$
>     $tot_income = sprintf("%.2f", $tot_income); # Get them into $
>     $tot_expend = sprintf("%.2f", $tot_expend); # financial format$
>     print pack("A10 A27 A7 A*", $date, "Totals", $tot_income, $tot_expend);$

And don't forget to add a newline at the end of each output line:

    print pack( 'A10 A27 A7 A*', $date, 'Totals', $tot_income, $tot_expend ), "\n";

John
--
Perl isn't a toolbox, but a small machine shop where you can
special-order certain sorts of tools at low cost and in short order.
                                                        -- Larry Wall
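Putting the suggestions from this thread together, a complete version of the script might look like the sketch below. Several things here are assumptions rather than the poster's actual code: the column layout (10, 27, 7 and 7 characters, one space apart) follows the tutorial rather than the widths John suggests, because the real spacing of the posted data is not recoverable; the rows are generated with sprintf so that they are guaranteed to be padded; and the subroutine names totals and totals_line are invented.

```perl
use strict;
use warnings;
use POSIX ();

# Fixed-width sample rows: date (10), description (27), income (7),
# expenditure (7), separated by single spaces. The layout is an assumption.
my @rows = map { sprintf '%-10s %-27s %7s %7s', @$_ } (
    [ '01/24/2001', q{Ahmed's Camel Emporium},  '',       '1147.99' ],
    [ '01/28/2001', 'Flea spray',               '',       '24.99'   ],
    [ '01/29/2001', 'Camel rides to tourists',  '235.00', ''        ],
);

# Sum the income and expenditure columns of the given rows.
sub totals {
    my @lines = @_;
    my ($tot_income, $tot_expend) = (0, 0);
    for my $line (@lines) {
        # 'x' skips the separator byte between columns.
        my ($date, $desc, $income, $expend)
            = unpack 'A10 x A27 x A7 x A*', $line;
        # A blank column unpacks as '', so fall back to 0 to avoid the
        # "isn't numeric" warning.
        $tot_income += $income || 0;
        $tot_expend += $expend || 0;
    }
    return ($tot_income, $tot_expend);
}

# Format one totals line, ending with a newline.
sub totals_line {
    my ($tot_income, $tot_expend) = @_;
    my $date = POSIX::strftime('%m/%d/%Y', localtime);
    # When packing, widen each field by one instead of using 'x',
    # because 'x' would write a null byte rather than a space.
    return pack('A11 A28 A8 A*', $date, 'Totals',
                sprintf('%.2f', $tot_income),
                sprintf('%.2f', $tot_expend)) . "\n";
}

print totals_line(totals(@rows));
```

Unlike the original, this prints a single totals line after all rows have been read rather than a running total after each row; calling totals_line inside the loop would restore the original behaviour.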
Re: perlpacktut ... not sure what I am doing wrong while following the example.. Explanation please
John W. Krahn wrote:
> The second field includes leading whitespace which you don't need. The
> third field is not long enough and cuts off the decimal portion of the
> number.
> [...]
> You probably want something like this instead:
>
>     unpack 'A11 A26 A9 A*', $_;
> [...]
> And don't forget to add a newline at the end of each output line:
> [...]

Thanks John and others for the explanation. I understand now.