[EMAIL PROTECTED] wrote:

> Hi Eric, 5-Jan-2000 you wrote:
>
> [...]
> >I think backtracking is an essential part of regular expressions. To quote
> >from Mastering Regular Expressions by Jeffrey Friedl (O'Reilly, p. 102),
>
> >    The essence of an NFA engine [the regex matching engine of the kind
> >    that Perl uses] is this: it considers each subexpression or component
> >    in turn, and whenever it needs to decide between two equally viable
> >    options, it selects one and remembers the other to return to later if
> >    need be. ....
>
> >Suppose you want to find a word ending in "ing". In Perl you could write:
>
> >   /\b[a-z_]+ing\b/i
>
> >Backtracking is crucial here, because "[a-z_]+" first matches to the end of
> >the word, and if you didn't backtrack, "ing" would never match. With a parse
> >rule, you have to do something like this:
>
> >    pp: copy []
> >    alpha: make bitset!  [#"a" - #"z" #"A" - #"Z" #"_"]
> >    non-alpha: complement alpha
>
> >    parse/all string [ some [
> >        p: some [ "ing" [non-alpha | end ] (append pp p) | alpha ] |
> >        some non-alpha
> >    ]]
>
> >This will leave you with the block PP holding pointers to the beginning of
> >all the words ending in "ing". It isn't quite equivalent to the Perl regex,
> >but it's close enough. Here backtracking is accomplished in the following
> >way: Suppose you were scanning through the word "ringing". The string "ing"
> >would match "ringing" first, and then attempt to match a non-alphabet
> >              ~~~
> >character or the string end. That wouldn't work, but the following | says
> >that we should try to jump back to the second letter of "ringing" and just
> >match any alphabet character, and keep going through the word. This is the
> >kind of thing that regexes do for you automatically.
>
> How about this:
>
> >> a: [skip a | "ing"]
> == [skip a | "ing"]
> >> parse "ringin" a
> == false
> >> parse "ringing" a
> == true
>
> It should do the job. So actually backtracking in Parse can be achieved by use
> of the "|" operator in Parse blocks.

What? | is simple OR or am I wrong? You will end up in recursion imho.
Try this:

a: [skip mark: (print mark) a | "ing"]

you will get following result:

ingin
ngin
gin
in
n

== false

"ing" is never going to be considered imho, unless skip fails, and that's when?
When we reach end. Then it fails, as at the end, "ing" is not applicable.

How's that your second example works? Let's extend the rule:

 a: [skip mark1: (print ["mark1: " mark1 index? mark1]) a | mark2: (print ["mark2:
" mark2 index? mark2]) "ing"]

mark1:  inging 2
mark1:  nging 3
mark1:  ging 4
mark1:  ing 5
mark1:  ng 6
mark1:  g 7
mark1:   8
mark2:   8
mark2:  g 7
mark2:  ng 6
mark2:  ing 5
== true

Hmm, strange, at mark1: 8 skip fails as we are at the end of the string, parse
tries to aply "ing" - not a succes, it then returns from recursion one level upon.
How's that 'skip fails here (mark2: 7)? In real - it doesn't, as the rule is:

[skip a | "ing"]

and we are just behind the skip, back from the nested 'a, which failed, so it
tries to aply "ing" ....

Whee ya, backtracking as a wine? :-)

I am just starting to learn some simple parse rules, so take my comments easy, if
wrong :-)

See ya,

PS: I am somehow tired today to study and comment another examples, sorry :-)

-pekr-

> Kind regards,
> --
> Ole Friis <[EMAIL PROTECTED]>
>
> "Ignorance is bliss"
> (Cypher, The Matrix)

Reply via email to