Actually, the index returns pointers (a page number plus an offset within that page) into the table on disk, each of which may point to a relevant row. Postgres then has to fetch the whole row from the table to find the email_id and any other information, including whether the row is visible to your current transaction (concurrency control complicates it all). Just having a page number isn't much use to you!
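You can see the two steps in a query plan. The sketch below assumes a table named email with a GiST index on its fts column; the plan would typically show a Bitmap Index Scan (finding matching heap locations in the index) feeding a Bitmap Heap Scan (fetching each row to read email_id and check visibility):

```sql
-- Sketch: show how the planner uses the fts index and then
-- visits the table itself for every candidate match.
EXPLAIN ANALYZE
SELECT email_id
FROM email
WHERE to_tsquery('default', 'howard') @@ fts;
```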

Matthew

Out of interest, if I could create a multicolumn index with both the primary key and the fts key (I don't think I can create a multi-column index using GIST with both the email_id and the fts field), would this reduce access to the table due to the primary key being part of the index?
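For what it's worth, the btree_gist contrib module provides GiST operator classes for scalar types such as int8, so a multicolumn GiST index over both columns may be possible. A sketch, assuming btree_gist is installed and the table/column names above:

```sql
-- Sketch, assuming the btree_gist contrib module is available
-- (on 9.1+ it can be installed with CREATE EXTENSION; older
-- versions load it via a SQL script from contrib/).
CREATE EXTENSION btree_gist;

-- Multicolumn GiST index over the primary key and the fts column.
CREATE INDEX email_id_fts_idx ON email USING gist (email_id, fts);
```

Note, though, that this alone may not avoid the table visits: until Postgres gained index-only scans (in 9.2), the heap row still had to be fetched to check visibility, even when every requested column was present in the index.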

More importantly, are there other ways that I can improve performance on this? I am guessing that a lot of the problem is that the email table is so big. If I move the text fields that are not needed by the search into another table, presumably the searched table will shrink enough to reduce the number of disk reads and speed the query up.

So I could split the table into two parts:

create table email_part1 (
email_id serial8 primary key,
cc text,
bcc text,
...
);

create table email_part2 (
email_id int8 references email_part1 (email_id),
fts ...,
email_directory_id ...
);

and the query will be
select email_id from email_part2 where to_tsquery('default', 'howard') @@ fts;
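When the full details are needed, the narrow table can be joined back to the wide one on the primary key, so only the matching rows are fetched from the big table. A sketch against the two-table layout above:

```sql
-- Sketch: search the narrow table, then pull full rows for
-- the matches only, joining on the shared primary key.
SELECT p1.*
FROM email_part2 p2
JOIN email_part1 p1 USING (email_id)
WHERE to_tsquery('default', 'howard') @@ p2.fts;
```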

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)