Re: [PERFORM] wildcard search performance with "like"
As far as I know the index is only used when you do a prefix search, for example col like 'xyz%' I think that if you are looking for expressions such as 'A%B', you could rephrase them like this: col like 'A%' AND col like 'A%B' So the database could use the index to narrow down the result and then do a sequential search for the second condition. Mike Yantao Shi schrieb: Hi, I have a postges 8.1.1 table with over 29 million rows in it. The colunm (file_name) that I need to search on has entries like the following: MOD04_L2.A2005311.1400.004.2005312013848.hdf MYD04_L2.A2005311.0700.004.2005312013437.hdf I have an index on this column. But an index search is performance only when I give the full file_name for search: testdbspc=# explain select file_name from catalog where file_name = 'MOD04_L2.A2005311.1400.004.2005312013848.hdf'; QUERY PLAN Index Scan using catalog_pk_idx on catalog (cost=0.00..6.01 rows=1 width=404) Index Cond: (file_name = 'MOD04_L2.A2005311.1400.004.2005312013848.hdf'::bpchar) (2 rows) What I really need to do most of the time is a multi-wildcard search on this column, which is now doing a whole table scan without using the index at all: testdbspc=# explain select file_name from catalog where file_name like 'MOD04_L2.A2005311.%.004.2005312013%.hdf'; QUERY PLAN Seq Scan on catalog (cost=0.00..429.00 rows=1 width=404) Filter: (file_name ~~ 'MOD04_L2.A2005311.%.004.2005312013%.hdf'::text) (2 rows) Obviously, the performance of the table scan on such a large table is not acceptable. I tried full-text indexing and searching. It did NOT work on this column because all the letters and numbers are linked together with "." and considered one big single word by to_tsvector. Any solutions for this column to use an index search with multiple wild cards? Thanks a lot, Yantao Shi ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] wildcard search performance with "like"
Yantao Shi <[EMAIL PROTECTED]> writes: > testdbspc=# explain select file_name from catalog where file_name like > 'MOD04_L2.A2005311.%.004.2005312013%.hdf'; > QUERY PLAN > Seq Scan on catalog (cost=0.00..429.00 rows=1 width=404) > Filter: (file_name ~~ 'MOD04_L2.A2005311.%.004.2005312013%.hdf'::text) > (2 rows) I'm betting you are using a non-C locale. You need either to run the database in C locale, or to create a special index type that is compatible with LIKE searches. See http://www.postgresql.org/docs/8.1/static/indexes-opclass.html regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[PERFORM] wildcard search performance with "like"
Hi, I have a postges 8.1.1 table with over 29 million rows in it. The colunm (file_name) that I need to search on has entries like the following: MOD04_L2.A2005311.1400.004.2005312013848.hdf MYD04_L2.A2005311.0700.004.2005312013437.hdf I have an index on this column. But an index search is performance only when I give the full file_name for search: testdbspc=# explain select file_name from catalog where file_name = 'MOD04_L2.A2005311.1400.004.2005312013848.hdf'; QUERY PLAN Index Scan using catalog_pk_idx on catalog (cost=0.00..6.01 rows=1 width=404) Index Cond: (file_name = 'MOD04_L2.A2005311.1400.004.2005312013848.hdf'::bpchar) (2 rows) What I really need to do most of the time is a multi-wildcard search on this column, which is now doing a whole table scan without using the index at all: testdbspc=# explain select file_name from catalog where file_name like 'MOD04_L2.A2005311.%.004.2005312013%.hdf'; QUERY PLAN Seq Scan on catalog (cost=0.00..429.00 rows=1 width=404) Filter: (file_name ~~ 'MOD04_L2.A2005311.%.004.2005312013%.hdf'::text) (2 rows) Obviously, the performance of the table scan on such a large table is not acceptable. I tried full-text indexing and searching. It did NOT work on this column because all the letters and numbers are linked together with "." and considered one big single word by to_tsvector. Any solutions for this column to use an index search with multiple wild cards? Thanks a lot, Yantao Shi ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq