On Thu, May 24, 2007 at 11:20:51PM -0400, Tom Lane wrote:
> I wrote:
> > Andrew Dunstan <[EMAIL PROTECTED]> writes:
> >> Yes, I agree completely. However it looks to me like IsFirstByte will in
> >> fact always be true when we get to call NextChar for matching "_" for UTF8.
> > If that's true, the patch is failing to achieve its goal of treating %
> > bytewise ...
> OK, I studied it a bit more and now see what you're driving at: in this
> form of the patch, we treat % bytewise unless it is followed by _, in
> which case we treat it char-wise.  That seems a good tradeoff,
> considering that such a pattern is probably pretty uncommon --- we
> should be willing to handle it a bit slower to simplify other cases.

Is it worth the effort to pre-process the pattern? For example:

    %% -> %
    %_ -> _%

If applied recursively, this would automatically cover:

    %_% -> _%
    _%_ -> __%

The benefit would be that the pattern matching code would not need an
inner if statement.

Also, I didn't see a response to my query about treating UTF-8 as a
two-pass match: the first pass treats the data as bytes and, if the first
pass matches, the second pass does a full analysis. In the case of low
selectivity, this will be a win, as the primary filter would be the
full-speed byte-based matching.

I had also asked why the focus would be on high selectivity. Why would
the primary filter criterion for a properly designed select statement be
a LIKE with high selectivity? The only time I have ever used LIKE is
when I expect low selectivity. Is there a reasonable case I am missing?

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED]
Neighbourhood Coder
Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/
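
To make the pre-processing idea above concrete, here is a minimal sketch in Python rather than in the backend's C. The function name normalize_like and the default backslash escape are assumptions made for illustration; this is not code from the patch under discussion. Within each run of adjacent wildcards it emits the underscores first, followed by at most one %, which is where applying the two rewrite rules repeatedly ends up.

```python
def normalize_like(pattern: str, escape: str = "\\") -> str:
    """Rewrite each run of adjacent wildcards as all '_' then at most one '%'.

    Hypothetical helper for illustration only. Assumes `escape` makes the
    following character literal, as a LIKE escape character does.
    """
    out = []
    underscores = 0       # number of '_' seen in the current wildcard run
    saw_percent = False   # whether the current run contains any '%'

    def flush() -> None:
        # Emit the pending wildcard run in canonical form and reset it.
        nonlocal underscores, saw_percent
        out.append("_" * underscores)
        if saw_percent:
            out.append("%")
        underscores = 0
        saw_percent = False

    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%":
            saw_percent = True
        elif ch == "_":
            underscores += 1
        else:
            flush()
            out.append(ch)
            # An escaped character is copied verbatim and never treated
            # as a wildcard, even if it is '%' or '_'.
            if ch == escape and i + 1 < len(pattern):
                i += 1
                out.append(pattern[i])
        i += 1
    flush()
    return "".join(out)


if __name__ == "__main__":
    for p in ["%%", "%_", "%_%", "_%_", "a%_b"]:
        print(p, "->", normalize_like(p))
```

With the pattern in this form, a % is never directly followed by _, so the case Tom describes (% followed by _) would not reach the matcher at all. Whether the extra pass over the pattern pays for itself is a separate question.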