Re: [PATCHES] UTF8MatchText

Dennis Bjorklund Sun, 20 May 2007 03:09:53 -0700

Tom Lane wrote:

> You could imagine trying to do
> % a byte at a time (and indeed that's what I'd been thinking it did)
> but that gets you out of sync which breaks the _ case.

This is only a problem when you have a pattern like '%_', and we could detect that and go byte by byte when it's not the case. Right now we check (*p == '\\') || (*p == '_') in each iteration when we scan over characters for '%'; we could do that check once and have different loops for the two cases.
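A rough sketch of the two-loop idea follows. It is not the code from the patch: like_match() and next_char() are hypothetical names, the input is assumed to be valid NUL-terminated UTF-8, and error handling for a trailing escape is left out. The test for '_' or '\\' is done once after the '%', and the literal case then scans bytes, which should be safe because a UTF-8 lead byte can never equal a continuation byte.

```c
#include <stddef.h>

/* Step to the start of the next UTF-8 character by skipping
 * continuation bytes (10xxxxxx). Assumes *s != '\0' on entry. */
static const char *next_char(const char *s)
{
    s++;
    while (((unsigned char) *s & 0xC0) == 0x80)
        s++;
    return s;
}

/* Hypothetical LIKE-style matcher: returns 1 on match, 0 otherwise. */
int like_match(const char *t, const char *p)
{
    while (*p != '\0')
    {
        if (*p == '%')
        {
            while (*p == '%')
                p++;
            if (*p == '\0')
                return 1;       /* trailing % matches the rest */

            /* Decide once which loop to use, not per iteration. */
            if (*p == '_' || *p == '\\')
            {
                /* Must stay on character boundaries. */
                for (; *t != '\0'; t = next_char(t))
                    if (like_match(t, p))
                        return 1;
            }
            else
            {
                /* Literal follows: scan byte by byte for its first byte. */
                for (; *t != '\0'; t++)
                    if (*t == *p && like_match(t, p))
                        return 1;
            }
            return 0;
        }
        if (*p == '_')
        {
            if (*t == '\0')
                return 0;
            t = next_char(t);   /* _ consumes one whole character */
            p++;
            continue;
        }
        if (*p == '\\' && p[1] != '\0')
            p++;                /* escaped character is matched literally */
        if (*t != *p)
            return 0;
        t++;
        p++;
    }
    return *t == '\0';
}
```

With this shape, a pattern such as '%abc' never calls next_char() while skipping text, and only patterns where '%' is directly followed by '_' (or an escape) pay for the character-by-character walk.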

Other than this part, which I think can be optimized, I don't see anything wrong with the idea behind the patch. Making the '%' case fast might be an important optimization for a lot of use cases. It's not uncommon that '%' matches a bigger part of the string than the rest of the pattern.

It's easy to make a mistake when one is used to thinking about simple fixed-size characters like ASCII. Strange that this simple topic can be so difficult to think about... :-)


/Dennis

