Re: regular expression problem

Karsten Hilbert Mon, 29 Oct 2018 04:07:39 -0700

On Sun, Oct 28, 2018 at 11:57:48PM +0100, Brian Oney wrote:

> On Sun, 2018-10-28 at 22:04 +0100, Karsten Hilbert wrote:
> > [^<:]
> 
> Would a simple regex work?


This brought about the solution.

However, not this way:

> >>> import re
> >>> t = '$<name::options::range>$'
> >>> re.findall('[^<>:$]+', t)
> ['name', 'options', 'range']

because I am not trying to parcel out the placeholder *parts*
(but rather the placeholders from a given line).

I eventually figured that denoting the parsing stages
differently made for easier matching. Rather than

        $<>$
        $<<>>$
        $<<<>>>$

do this

        $1<>1$
        $2<>2$
        $3<>3$

which makes it way less ambiguous, and more matchable:

regexen = [
        r'\$1{0,1}<[^<].*?>1{0,1}\$',
        r'\$2<[^<].*?>2\$',
        r'\$3<[^<].*?>3\$'
]

The [^<] part ("the single < is NOT to be followed directly
by another <") is strictly speaking superfluous with the new
notation, but it does protect against legacy document
templates still having $<<(<)...(>)>>$ in them.
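
To illustrate how the three patterns pull whole placeholders (rather
than their parts) out of a line, here is a minimal sketch. The regexen
are copied from above; the helper name placeholders_by_stage and the
sample line are made up for illustration and are not taken from the
actual code base.

```python
import re

# The three stage patterns, copied verbatim from the message above.
regexen = [
    r'\$1{0,1}<[^<].*?>1{0,1}\$',
    r'\$2<[^<].*?>2\$',
    r'\$3<[^<].*?>3\$',
]

def placeholders_by_stage(line):
    """Return one list of whole placeholders per parsing stage.

    Hypothetical helper, only meant to show what each regex matches.
    """
    return [re.findall(regex, line) for regex in regexen]

# Made-up template line: one placeholder per stage plus one
# legacy-style $<<...>>$ that none of the patterns should pick up.
sample = ('a $<first::options::range>$ b $2<second::options::range>2$ '
          'c $3<third::options::range>3$ d $<<legacy::options::range>>$')

for stage, found in enumerate(placeholders_by_stage(sample), start=1):
    print(stage, found)
```

With this sample each stage should report exactly one placeholder, and
the $<<...>>$ one is skipped by all three because of the [^<] guard.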

$<>$ is still retained as an alias for $1<>1$ because there are
A LOT of them in existing document templates. It is
normalized explicitly inside Python before fill-in values are
generated.
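
The normalization of the $<>$ alias might look roughly like the
following. This is only a sketch under assumptions: the function name
normalize_stage1 and the pattern are hypothetical and not the code
actually in use; they just show one way to rewrite $<...>$ into
$1<...>1$ while leaving everything else alone.

```python
import re

# Assumed pattern for the legacy stage-1 alias: "$<", then something
# that does not start with another "<", then ">$".
_stage1_alias = re.compile(r'\$<([^<].*?)>\$')

def normalize_stage1(line):
    """Rewrite $<...>$ as $1<...>1$ (hypothetical helper)."""
    # \g<1> re-inserts the placeholder body; "$" is a literal in the
    # replacement string of re.sub().
    return _stage1_alias.sub(r'$1<\g<1>>1$', line)

print(normalize_stage1('a $<name::options::range>$ b $2<other>2$'))
```

Placeholders already carrying a stage number, as well as legacy
$<<...>>$ ones, pass through unchanged in this sketch.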

Karsten
-- 
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B
-- 
https://mail.python.org/mailman/listinfo/python-list
