On 5 March 2012 17:23, Robert Haas <robertmh...@gmail.com> wrote:
> This is different from what Perl does, but I think Perl's behavior
> here is batty: given a+|a+b+ and the string aaabbb, it picks the first
> branch and matches only aaa.

Yeah, this is sometimes referred to as "ordered alternation": the
branches of the alternation are prioritised in the order in which
they are written.  It is fairly commonplace among regex
implementations.
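
To make that concrete, here is a small sketch using Python's re
module, which also does ordered alternation.  The helper name
first_match is made up for illustration; it is not from any library.

```python
import re

def first_match(pattern, text):
    """Return the text matched at the start of `text`, or None."""
    m = re.match(pattern, text)
    return m.group(0) if m else None

# Branches are tried left to right, and the first branch that lets the
# overall match succeed wins, regardless of how long its match is.
print(first_match(r"a+|a+b+", "aaabbb"))   # aaa
print(first_match(r"a+b+|a+", "aaabbb"))   # aaabbb
```

Swapping the order of the branches is all it takes to get the longer
match.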

> apparently, it selects the syntactically first
> branch that can match, regardless of the length of the match, which
> strikes me as nearly pure evil.

As long as it's documented that alternation prioritises in this way, I
don't feel upset about it.  At least it still gives you a sensible way
to get whatever you want from your RE: if you want a shorter
alternative to be preferred, put it first.  Ordered alternation also
lets you specify which of several same-length alternatives you would
prefer to be matched, which can come in handy.  It also means you can
put less-complex alternatives before more-complex ones, which can have
performance advantages.
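
As a rough illustration of the same-length case, again in Python
(whose re module does ordered alternation): the group names keyword
and ident and the helper winning_branch are invented for this sketch.

```python
import re

def winning_branch(pattern, text):
    """Return the name of the named group that matched all of `text`."""
    m = re.fullmatch(pattern, text)
    return m.lastgroup if m else None

KEYWORD_FIRST = r"(?P<keyword>if|else)|(?P<ident>[a-z]+)"
IDENT_FIRST = r"(?P<ident>[a-z]+)|(?P<keyword>if|else)"

# Both branches can match "if" with the same length, so branch order
# alone decides which one is reported.
print(winning_branch(KEYWORD_FIRST, "if"))    # keyword
print(winning_branch(IDENT_FIRST, "if"))      # ident

# When the first branch cannot match the whole string, the engine
# falls back to the next branch.
print(winning_branch(KEYWORD_FIRST, "iffy"))  # ident
```

With a longest-match rule alone there would be nothing to break the tie
on "if"; here the order of the branches does it.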

I do agree with you that if you *don't* do ordered alternation, then
it is right to treat alternation as greedy by default.

Cheers,
BJ
