On Mon, 2008-10-06 at 12:12 +0100, Rob Dixon wrote:
> Mr. Shawn H. Corey wrote:
> > On Fri, 2008-10-03 at 12:38 +0100, Rob Dixon wrote:
> >> Mr. Shawn H. Corey wrote:
> >>>
> >>> Note that if these structures can be nested, you will have to use a FSA
> >>> with a push-down stack.
> >>
> >> That will match a line like
> >>
> >>   [wrong) and (wrong]
> >>
> >> Rob
> >>
> > 
> > Note that if these structures can be nested, you will have to use a FSA
> > with a push-down stack.
> 
> To rework the adage, "When your only tool is an FSA, every problem looks like
> it's nested."
> 
> My example wasn't a nested one. It was just an example of two incorrect brace
> matches as the OP described them.
> 
> Rob
> 

You make it sound like I just discovered them and haven't been using
them for the past 30 years.

On Fri, 2008-10-03 at 12:24 +0100, Rob Dixon wrote:
> Vyacheslav Karamov wrote:
> > Hi All!
> > 
> > I need to capture something in braces using regular expressions.
> > But I don't need to capture wrong data:
> > 
> > [Some text] - correct
> > (Some text) - also correct
> > [Some text) - wrong
> > (Some text] - also wrong
> 
> HTH,
> 
> Rob
> 
> 
> use strict;
> use warnings;
> 
> while (<DATA>) {
>    while ( / ( \[[^])]+\] | \([^])]+\) ) /xg ) {
>      print $1, "\n";
>    }
> }
> 
> __DATA__
> [correct] - correct
> (also correct) - also correct
> [wrong) - wrong
> (also wrong] - also wrong

Let's add some more data:

__DATA__
[correct] - correct
(also correct) - also correct
[wrong) - wrong
(also wrong] - also wrong
(correct (and) nested) - correct, matches:  (correct (and)
(wrong [and] nested) - wrong, matches:  [and]
(correct [and) nested] - correct, matches:  (correct [and)
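
One way to check what the pattern really captures is to run it over the
new lines with the annotations stripped off. This is only a sketch: the
pattern is Rob's, unchanged, but the captures() helper and the inline
list of samples are my own names, used here in place of the __DATA__
section.

```perl
use strict;
use warnings;

# Rob's pattern from the quoted message, unchanged.
my $pattern = qr/ ( \[[^])]+\] | \([^])]+\) ) /x;

# Hypothetical helper: return every substring the pattern captures.
sub captures {
    my ($text) = @_;
    my @found;
    while ( $text =~ /$pattern/g ) {
        push @found, $1;
    }
    return @found;
}

# The three new sample lines, without their annotations.
for my $line (
    '(correct (and) nested)',
    '(wrong [and] nested)',
    '(correct [and) nested]',
) {
    print "$line => ", join( ' ', captures($line) ), "\n";
}
```

Each capture starts at an opening bracket and stops at the first closer
of the same kind, because the character class excludes both closers and
nothing keeps track of how many brackets are still open.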


The problem is that the OP has specified a data format with nested
contexts, even though they may not realize it.  If there are nested
contexts, or the meaning of the symbols changes with context, it cannot
be parsed with just regular expressions; you need an FSA to parse it.
If it has unbounded recursion, you need an FSA with a push-down stack.
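
For what it's worth, the push-down stack part can be sketched in a few
lines. This is a minimal example and not a full parser. It assumes that
"correct" means every closer pairs with the most recent unclosed opener,
which may not be the rule the OP has in mind; under that assumption a
line like (wrong [and] nested) counts as balanced. The names is_balanced
and %opener_for are made up for the illustration.

```perl
use strict;
use warnings;

# Map each closing bracket to the opener it must pair with.
my %opener_for = ( ')' => '(', ']' => '[' );

# Return true when every bracket in $text is closed by the right
# partner, in the right order, and nothing is left open.
sub is_balanced {
    my ($text) = @_;
    my @stack;    # the push-down stack of currently open brackets

    for my $char ( split //, $text ) {
        if ( $char eq '(' or $char eq '[' ) {
            push @stack, $char;
        }
        elsif ( exists $opener_for{$char} ) {
            # A closer must match the most recent unclosed opener.
            my $open = pop @stack;
            return 0 unless defined $open and $open eq $opener_for{$char};
        }
    }
    return @stack == 0;    # leftover openers mean an unclosed bracket
}

for my $line (
    '[Some text]',
    '(Some text)',
    '[Some text)',
    '(Some text]',
    '[wrong) and (wrong]',
    '(correct (and) nested)',
) {
    printf "%-24s %s\n", $line,
        is_balanced($line) ? 'balanced' : 'not balanced';
}
```

The stack is what lets the nesting go to any depth; a fixed set of
states can only count to a limit chosen in advance.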

The sad thing is that no one teaches how to recognize the different
formats so that the correct code can be written.


-- 
Just my 0.00000002 million dollars worth,
  Shawn

Linux is obsolete.
-- Andrew Tanenbaum

