Andrew Dunstan wrote:
Tom Lane wrote:
Andrew Dunstan <[EMAIL PROTECTED]> writes:
... It turns out (according to the analysis) that the only time we
actually need to use NextChar is when we are matching an "_" in a
like/ilike pattern.
I thought we'd determined that advancing bytewise for "%" was also
risky, in two cases:
1. Multibyte character set that is not UTF8 (more specifically, does not
have a guarantee that first bytes and not-first bytes are distinct)
I thought we disposed of the idea that there was a problem with charsets
that didn't do first byte special.
And Dennis said:
Tom Lane skrev:
You could imagine trying to do
% a byte at a time (and indeed that's what I'd been thinking it did)
but that gets you out of sync which breaks the _ case.
It is only when you have a pattern like '%_' that this is a problem
and we could detect this and do byte by byte when it's not. Now we
check (*p == '\\') || (*p == '_') in each iteration when we scan over
characters for '%', and we could do it once and have different loops
for the two cases.
That's pretty much what the patch does now - it never tries to match a
single byte when it sees "_", whether or not preceded by "%".
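
To make the idea concrete, here is a rough sketch of that matching strategy.
It is not the patch itself: like_utf8, match_here and next_char are made-up
names, it assumes valid NUL-terminated UTF-8 input, and it leaves out
backslash escapes and ilike entirely. "_" always consumes a whole character,
while the scan for "%" moves a byte at a time, which I believe is safe in
UTF-8 only because a first byte can never equal a continuation byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Step past one UTF-8 character: one byte, plus any continuation
 * bytes (10xxxxxx) that follow it. Never steps past the NUL. */
static const unsigned char *next_char(const unsigned char *s)
{
    s++;
    while ((*s & 0xC0) == 0x80)
        s++;
    return s;
}

/* Hypothetical simplified matcher: '%' and '_' only, no escapes. */
static bool match_here(const unsigned char *t, const unsigned char *p)
{
    while (*p != '\0')
    {
        if (*p == '%')
        {
            /* Fold a run of '%' and '_' together; every '_' in the run
             * still has to eat one whole character of the text. */
            for (p++; *p == '%' || *p == '_'; p++)
            {
                if (*p == '_')
                {
                    if (*t == '\0')
                        return false;
                    t = next_char(t);
                }
            }
            if (*p == '\0')
                return true;    /* trailing '%' matches the rest */

            /* The next pattern byte is a literal, so scan the text
             * bytewise for a position where it could start. */
            for (; *t != '\0'; t++)
            {
                if (*t == *p && match_here(t, p))
                    return true;
            }
            return false;
        }
        else if (*p == '_')
        {
            /* Never match a single byte here, always a character. */
            if (*t == '\0')
                return false;
            t = next_char(t);
            p++;
        }
        else
        {
            /* Literal bytes are compared one at a time. */
            if (*t != *p)
                return false;
            t++;
            p++;
        }
    }
    return *t == '\0';
}

static bool like_utf8(const char *text, const char *pattern)
{
    assert(text != NULL && pattern != NULL);
    return match_here((const unsigned char *) text,
                      (const unsigned char *) pattern);
}
```

With this, like_utf8("\xC3\xA9", "%__") is false, since the text is two
bytes but only one character; treating "_" as a single byte after "%" would
wrongly report a match.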
cheers
andrew