On Wed, Dec 24, 2008 at 7:11 PM, Mr. Shawn H. Corey <shawnhco...@magma.ca> wrote:
> On Wed, 2008-12-24 at 11:40 +0530, Amit Saxena wrote:
> > Hi all,
> >
> > I am trying to use a recursive regular expression in Perl.
> >
> > I am using an example from
> > http://www.perl.com/pub/a/2003/06/06/regexps.html.
> >
> > Whenever I try to execute the program, it hangs and I have to press
> > Ctrl-C to break it.
> >
> > Please let me know where I am going wrong.
> >
> > *# cat t_r.pl*
> > #! /usr/bin/perl
> >
> > use warnings;
> > use strict;
> >
> > my $paren = qr/ \( [^)]+ \) /x;
> >
> > "Some (parenthesized) text" =~ /($paren)/;
> > print $1; # parenthesized
> >
> > $paren = qr/
> >     \(
> >     (
> >         [^()]+          # Not parens
> >       |
> >         (??{ $paren })  # Another balanced group (not interpolated yet)
> >     )+?
> >     \)
> > /x;
> >
> > print "\n";
> >
> > "Some (parenthesised and (gratuitously) sub-parenthesised text" =~ /($paren)/;
> > print $1; # parenthesized
> >
> > print "\n";
> >
> > *# perl t_r.pl*
> > (parenthesized)
> > <<Control-Break>>
> > *# perl -v*
> >
> > This is perl, v5.8.5 built for i386-linux-thread-multi
> >
> > Copyright 1987-2004, Larry Wall
> >
> > Perl may be copied only under the terms of either the Artistic License or
> > the GNU General Public License, which may be found in the Perl 5 source kit.
> >
> > Complete documentation for Perl, including FAQ lists, should be found on
> > this system using `man perl' or `perldoc perl'. If you have access to the
> > Internet, point your browser at http://www.perl.com/, the Perl Home Page.
> >
> > *#*
>
> I don't think you're doing anything wrong. I think the guy who wrote
> the web page didn't test his regex with a string with unbalanced
> parentheses. Try:
>
> "Some (parenthesised and (gratuitously) sub-parenthesised) text" =~ /($paren)/;
>
> You cannot parse unbounded nested contexts with regular expressions. You
> need a finite-state automaton (FSA) with a push-down stack.
>
>
> --
> Just my 0.00000002 million dollars worth,
> Shawn
>
> Believe in the Gods but row away from the rocks.
> -- ancient Hindu proverb
>

Thanks for the response, Shawn.

However, even when I run it with properly balanced text, I get different
output from what I anticipated.

*# cat t_r.pl*
#! /usr/bin/perl

use warnings;
use strict;

my $paren = qr/ \( [^)]+ \) /x;

"Some (parenthesized) text" =~ /($paren)/;
print $1; # parenthesized

$paren = qr/
    \(
    (
        [^()]+          # Not parens
      |
        (??{ $paren })  # Another balanced group (not interpolated yet)
    )+?
    \)
/x;

print "\n";

"Some (parenthesised and (gratuitously) sub-parenthesised text)" =~ /($paren)/;
print "$1\n";
print "$2\n";

print "\n";

*# perl t_r.pl*
(parenthesized)
(parenthesised and (gratuitously) sub-parenthesised text)
 sub-parenthesised text

*#*

I was expecting $2 to be "gratuitously" instead of " sub-parenthesised text".

Please let me know whether I have misunderstood something here.

Regards,
Amit Saxena
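
A note on both symptoms, offered as a sketch and not as a definitive answer. The hang on unbalanced input is most likely catastrophic backtracking: `( [^()]+ | ... )+?` lets the engine split each run of non-paren characters in exponentially many ways before it gives up. Wrapping the run in an atomic group `(?> ... )` should prevent that. As for $2, a capture group that repeats keeps only the text of its last iteration, and groups captured inside `(??{ ... })` are not visible to the outer pattern, so $2 cannot hold "gratuitously". One way to collect the nested groups is to re-match the inside of each group. The helper name `balanced_groups` below is hypothetical, introduced only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Balanced-parentheses pattern. The atomic group (?> ... ) means a run of
# non-paren characters is never re-split once matched, which avoids the
# exponential backtracking on unbalanced input.
my $paren;
$paren = qr/
    \(
    (?:
        (?> [^()]+ )      # a run of non-parens, matched atomically
      |
        (??{ $paren })    # a nested balanced group, evaluated at match time
    )*
    \)
/x;

# Hypothetical helper: return every balanced group found in $text,
# each outer group followed by the groups nested inside it.
sub balanced_groups {
    my ($text) = @_;
    my @found;
    while ($text =~ /($paren)/g) {
        my $group = $1;                 # copy before recursing
        push @found, $group;
        # Strip the outer parentheses and search the inside for nested groups.
        push @found, balanced_groups(substr($group, 1, -1));
    }
    return @found;
}

print "$_\n"
    for balanced_groups("Some (parenthesised and (gratuitously) sub-parenthesised text)");
```

With the balanced string this should print the whole outer group followed by `(gratuitously)`. With the closing parenthesis removed it should return quickly and report only `(gratuitously)`, since that is the only balanced group left.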