On 2018-10-29 08:02, Karsten Hilbert wrote:
On Sun, Oct 28, 2018 at 11:14:15PM +0000, MRAB wrote:

> > - lines can contain several placeholders
> > - placeholders start and end with '$'
> > - placeholders are parsed in three passes
> > - the pass in which a placeholder is parsed is denoted by the number
> >   of '<' and '>' next to the '$':
> >
> >        $<...>$ / $<<...>>$ / $<<<...>>>$
> >
> > - placeholders for different parsing passes must be nestable:
> >
> >        $<<<...$<...>$...>>>$
> >        ....
> >        (lower=earlier parsing passes will be inside)
> >
> > - the internal structure is "name::options::range"
> >
> >        $<name::options::range>$
> >
> > - name will *not* contain '$' '<' '>' ':'
> > - range can be either a length or a "from-until"
> > - a length will be a positive integer (no bounds checking)
> > - "from-until" is: a positive integer, a '-', and a positive integer
> >   (no sanity checking)
> > - options needs to be able to contain nearly anything, except '::'
> >
> > Is that sufficiently defined and helpful to design the regular expression?
>
> How can they be nested inside one another?
> Is the string scanned, placeholders filled in for that level, and then the
> string scanned again for the next level? (That would mean that the fill
> value itself will be scanned in the next pass.)

Exactly. But *different* levels can be nested inside each other.

> You could try matching the top level, for each match then match the next
> level, and for each of those matches then match for the final level.

So I do.

> Trying to do it all in one regex is usually a bad idea.

Right, I am not trying to do that. I was, however, worried
that I need to make the expression not "trip over" fragments
of what might seem to constitute part of another placeholder.

        $<<ph_1::option=$<ph_2::option=3::10>$::15>>$

Pass 1 might fill in to:

        $<<ph_1::option=3 '>s'::15>>$

and I wanted to make sure the second pass does not stop here:

        $<<ph_1::option=3 '>s'::15>>$
                        ^

Logically it should not because

        >s'::15>>$

does not match

        ::\d*>>$

but I am not sure how to tell it that :-)
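
One way to tell it that, as a minimal sketch in Python: build one pattern
per pass and accept the closing run of '>' only when it directly follows
'::' plus an optional range. The names placeholder_regex, fill_pass and
lookup are made up for this illustration, and allowing an empty range is an
assumption taken from the \d* in the fragment above.

```python
import re

def placeholder_regex(level):
    """Build a regex for the placeholders of one parsing pass (1, 2 or 3)."""
    return re.compile(
        r"\$" + "<" * level + r"(?!<)"   # '$' followed by exactly `level` '<'
        r"(?P<name>[^$<>:]+)::"          # name: no '$', '<', '>' or ':'
        r"(?P<options>(?:(?!::).)*)::"   # options: anything except '::'
        r"(?P<range>(?:\d+(?:-\d+)?)?)"  # length, from-until, or empty (assumed)
        + ">" * level + r"\$"            # `level` '>' right after the range, then '$'
    )

def fill_pass(text, level, lookup):
    """Replace each placeholder of this pass with lookup(name, options, range)."""
    def repl(match):
        return lookup(match.group("name"),
                      match.group("options"),
                      match.group("range"))
    # A function replacement avoids backslash processing of the fill value.
    return placeholder_regex(level).sub(repl, text)

template = "$<<ph_1::option=$<ph_2::option=3::10>$::15>>$"

# Pass 1 fills the inner placeholder with a value containing a stray '>'.
after_pass_1 = fill_pass(template, 1, lambda name, options, rng: "3 '>s'")

# Pass 2 now matches the whole outer placeholder, stray '>' included.
outer = placeholder_regex(2).fullmatch(after_pass_1)
```

Here the stray '>' in the fill value is skipped because it is not preceded
by '::' and a range, and the pass-2 pattern finds nothing in the template
until pass 1 has filled in the inner placeholder.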

For something like that, I'd use parsing by recursive descent.

It might be worth looking at pyparsing.
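
A hand-written recursive descent for this format could look roughly like
the sketch below. It is an illustration under assumptions, not a drop-in
replacement: parse, render and the fill callback are hypothetical names, an
empty range is allowed, and unlike the three-pass rescanning described
earlier it parses the template once and fills inner placeholders first, so
fill values are never scanned again.

```python
import re

# '$', then 1 to 3 '<' (and no further '<'), then the name and '::'
_OPEN = re.compile(r"\$(<{1,3})(?!<)([^$<>:]+)::")

def _closer(level):
    # '::', an optional range, then the closing delimiter of this level
    return re.compile(r"::(\d+(?:-\d+)?)?" + ">" * level + r"\$")

def _parse(text, pos, max_level, closer):
    """Parse literals and nested placeholders starting at pos.

    Returns (items, range_text, new_pos), or None if `closer` was expected
    but never found. Items are strings or (level, name, options, range)
    tuples, where options is itself a list of items.
    """
    items, buf = [], []
    while pos < len(text):
        if closer is not None:
            end = closer.match(text, pos)
            if end:
                items.append("".join(buf))
                return items, end.group(1) or "", end.end()
        start = _OPEN.match(text, pos)
        if start and len(start.group(1)) <= max_level:
            level = len(start.group(1))
            # Only lower levels may nest inside this placeholder.
            sub = _parse(text, start.end(), level - 1, _closer(level))
            if sub is not None:
                options, rng, pos = sub
                items.append("".join(buf))
                buf = []
                items.append((level, start.group(2), options, rng))
                continue
        # Not a (complete) placeholder here: keep the character as literal.
        buf.append(text[pos])
        pos += 1
    if closer is not None:
        return None  # ran off the end without finding the closer
    items.append("".join(buf))
    return items, "", pos

def parse(text):
    return _parse(text, 0, 3, None)[0]

def render(items, fill):
    """Fill placeholders inside-out; fill(name, options, range) returns text."""
    out = []
    for item in items:
        if isinstance(item, str):
            out.append(item)
        else:
            _level, name, options, rng = item
            out.append(fill(name, render(options, fill), rng))
    return "".join(out)
```

With the example template from earlier in the thread, fill would be called
first for ph_2 and then for ph_1, whose options already contain the value
returned for ph_2.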
--
https://mail.python.org/mailman/listinfo/python-list
