On Wed, Dec 24, 2008 at 7:11 PM, Mr. Shawn H. Corey <shawnhco...@magma.ca> wrote:

> On Wed, 2008-12-24 at 11:40 +0530, Amit Saxena wrote:
> > Hi all,
> >
> > I am trying to use a recursive regular expression in Perl.
> >
> > I am using an example from
> > http://www.perl.com/pub/a/2003/06/06/regexps.html.
> >
> > Whenever I try to execute the program, it hangs and I have to press
> > Ctrl-C to break it.
> >
> > Please let me know where I am wrong.
> >
> > *# cat t_r.pl*
> > #! /usr/bin/perl
> >
> > use warnings;
> > use strict;
> >
> > my $paren = qr/ \( [^)]+ \) /x;
> >
> > "Some (parenthesized) text" =~ /($paren)/;
> > print $1; # parenthesized
> >
> >     $paren = qr/
> >       \(
> >         (
> >            [^()]+  # Not parens
> >          |
> >            (??{ $paren })  # Another balanced group (not interpolated yet)
> >         )+?
> >       \)
> >     /x;
> >
> > print "\n";
> >
> > "Some (parenthesised and (gratuitously) sub-parenthesised text" =~
> > /($paren)/;
> > print $1; # parenthesized
> >
> > print "\n";
> >
> > *# perl t_r.pl*
> > (parenthesized)
> > <<Control-Break>>
> > *# perl -v*
> >
> > This is perl, v5.8.5 built for i386-linux-thread-multi
> >
> > Copyright 1987-2004, Larry Wall
> >
> > Perl may be copied only under the terms of either the Artistic License or the
> > GNU General Public License, which may be found in the Perl 5 source kit.
> >
> > Complete documentation for Perl, including FAQ lists, should be found on
> > this system using `man perl' or `perldoc perl'.  If you have access to the
> > Internet, point your browser at http://www.perl.com/, the Perl Home Page.
> >
> > *#*
>
> I don't think you're doing anything wrong.  I think the guy who wrote
> the web page didn't test his regex with a string with unbalanced
> parentheses.  Try:
>
> "Some (parenthesised and (gratuitously) sub-parenthesised) text" =~
> /($paren)/;
>
> You cannot parse unbounded nested contexts with regular expressions.  You
> need a finite-state automaton (FSA) with a push-down stack.
>
>
> --
> Just my 0.00000002 million dollars worth,
>  Shawn
>
> Believe in the Gods but row away from the rocks.
>  -- ancient Hindu proverb
>
>
Thanks for the response, Shawn.

However, when I execute it even with the properly balanced text, I get
different output than I anticipated.

*# cat t_r.pl*
#! /usr/bin/perl

use warnings;
use strict;

my $paren = qr/ \( [^)]+ \) /x;

"Some (parenthesized) text" =~ /($paren)/;
print $1; # parenthesized

    $paren = qr/
      \(
        (
           [^()]+  # Not parens
         |
           (??{ $paren })  # Another balanced group (not interpolated yet)
        )+?
      \)
    /x;

print "\n";

"Some (parenthesised and (gratuitously) sub-parenthesised text)" =~
/($paren)/;
print "$1\n";
print "$2\n";

print "\n";

*# perl t_r.pl*
(parenthesized)
(parenthesised and (gratuitously) sub-parenthesised text)
 sub-parenthesised text

*#*

I was expecting $2 to be "gratuitously" instead of " sub-parenthesised
text".

Please let me know whether I have misunderstood something here.
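
For comparison, the same behaviour can be reproduced without any
recursion. This is only a minimal sketch: the string "a1b2c3" and the
variable names $text, $last and @all are made up for illustration and do
not come from the article. It relies on the fact that a capture group
placed under a quantifier is overwritten on each iteration, so it ends up
holding only the text of the final iteration, which is consistent with $2
holding the last chunk in the output above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "a1b2c3";    # made-up sample string, not from the article

# Group 1 sits under a quantifier, so each iteration overwrites it;
# after the match it holds only the letter from the final iteration.
my ($last) = $text =~ /^(?:([a-z])\d)+$/;
print "last: $last\n";

# A global match in list context returns the capture from every match,
# which is one way to see all the pieces instead of just the last one.
my @all = $text =~ /([a-z])\d/g;
print "all: @all\n";
```

Here $last ends up as "c", while @all holds a, b and c.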

Regards,
Amit Saxena
