I have created a patch to enable the Boyer-More-Horspool search algorithm (B-M-H) for LIKE queries.
B-M-H needs to initialize the skip table and keep it during SQL execution. In this patch, flinfo->fn_extra is used to keep the skip table. The conditions under which B-M-H can be used are as follows. (1) B-M-H in LIKE search supports only single-byte character sets and UTF8. Multibyte character sets does not support, because it may contain another characters in the byte sequence. For UTF-8, it works fine, because in UTF-8 the byte sequence of one character cannot contain another character. (2) The pattern string should be stable parameter, because B-M-H needs to keep the skip table generated from the pattern string during the execution of the query. (3) The pattern string should be at least 4 characters. For example, '%AB%' can use B-M-H. (4) The first and last character of the pattern string should be '%'. (5) Characters other than the first and last of the pattern string should not be '%', '_'. However, escaped characters such as '\%', '\_' are available. Also, this patch changes the collation validity check in functions (such as textlike) to be performed at the first execution of the query, instead of each function execution. I have measured the performance with the following query. ---------- ---------- ---------- ---------- ---------- ---------- ---------- SET client_min_messages TO notice; \timing DO $$ DECLARE cnt integer := 0; total integer := 0; BEGIN FOR i IN 1..500 LOOP select count(*) into cnt from pg_catalog.pg_description d where d.description like '%greater%'; total := total + cnt; END LOOP; RAISE NOTICE 'TOTAL: %', total; END $$ ; ---------- ---------- ---------- ---------- ---------- ---------- ---------- Result Without patch: 257.504ms With patch: 191.638ms Regards, Atsushi Ogawa
like_bmh.patch
Description: Binary data