Re: [PATCHES] [HACKERS] LIKE optimization in UTF-8 and locale-C

2007-03-22 Thread Tom Lane
ITAGAKI Takahiro writes:
> I found that LIKE operators are slower on multi-byte encoding databases
> than on single-byte encoding ones. This comes from the difference between
> MatchText() and MBMatchText().
>
> We already have an optimization for single-byte encodings using the
> pg_database_encoding_max_length() == 1 test. I propose extending it
> to the UTF-8 with locale C case.

If this works for UTF8, won't it work for all the backend-legal
encodings?

regards, tom lane


Re: [PATCHES] [HACKERS] LIKE optimization in UTF-8 and locale-C

2007-03-22 Thread Hannu Krosing
On Thu, 2007-03-22 at 11:08, Tom Lane wrote:
> ITAGAKI Takahiro writes:
>> I found that LIKE operators are slower on multi-byte encoding databases
>> than on single-byte encoding ones. This comes from the difference between
>> MatchText() and MBMatchText().
>>
>> We already have an optimization for single-byte encodings using the
>> pg_database_encoding_max_length() == 1 test. I propose extending it
>> to the UTF-8 with locale C case.
>
> If this works for UTF8, won't it work for all the backend-legal
> encodings?

I guess it works well for % but not for _; the latter has to know how
many bytes the current (multibyte) character covers.

The length is still easy to find out for UTF8 encoding, so it may be
feasible to write UTF8MatchText() that is still faster than
MBMatchText().
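
A minimal sketch of that idea, assuming valid UTF-8 input and no
escape-character handling. The names utf8_char_len(), utf8_like() and
utf8_like_cstr() are hypothetical and are not PostgreSQL functions; this
is not the proposed patch, only an illustration of why _ needs the
character length while literal bytes can be compared directly.

```c
#include <stddef.h>
#include <string.h>

/* Byte length of the UTF-8 character whose lead byte is c.
 * Assumption: input is valid UTF-8; any unexpected byte counts as 1. */
static int utf8_char_len(unsigned char c)
{
    if (c < 0x80) return 1;            /* 0xxxxxxx: ASCII */
    if ((c & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((c & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((c & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 1;
}

/* Hypothetical LIKE matcher: returns 1 if the tlen bytes at t match the
 * plen-byte pattern at p, else 0. Supports % and _ only. */
static int utf8_like(const char *t, size_t tlen, const char *p, size_t plen)
{
    while (plen > 0) {
        if (*p == '%') {
            p++; plen--;
            if (plen == 0) return 1;   /* trailing % matches the rest */
            for (;;) {
                if (utf8_like(t, tlen, p, plen)) return 1;
                if (tlen == 0) return 0;
                /* Step by whole characters so that a following _
                 * never starts in the middle of a character. */
                size_t n = (size_t) utf8_char_len((unsigned char) *t);
                if (n > tlen) n = tlen;
                t += n; tlen -= n;
            }
        }
        if (tlen == 0) return 0;
        if (*p == '_') {
            /* _ consumes one character, which may be several bytes. */
            size_t n = (size_t) utf8_char_len((unsigned char) *t);
            if (n > tlen) n = tlen;
            t += n; tlen -= n;
        } else {
            /* Literal bytes compare directly, as in the single-byte case. */
            if (*p != *t) return 0;
            t++; tlen--;
        }
        p++; plen--;
    }
    return tlen == 0;
}

/* Convenience wrapper for NUL-terminated strings. */
static int utf8_like_cstr(const char *t, const char *p)
{
    return utf8_like(t, strlen(t), p, strlen(p));
}
```

With this sketch, utf8_like_cstr("\xC3\xA9", "_") matches because the
two-byte character is consumed as a unit, while the pattern "__" does
not match the same text.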

-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia
