Re: [PERFORM] Tsearch2 Initial Search Speed

2008-06-20 Thread Howard Cole
PFC wrote: Hi, I am looking to improve the initial query speed for the following query: Try Xapian full text search engine, it behaves much better than tsearch when the dataset exceeds your memory cache size. __ Information from ESET NOD32 Antivirus, version of virus signature d

Re: [PERFORM] Tsearch2 Initial Search Speed

2008-06-19 Thread Howard Cole
Good idea. Note that Postgres is already doing this to some extent with TOAST - read http://www.postgresql.org/docs/8.3/interactive/storage-toast.html - unfortunately, there doesn't seem to be an option to always move particular columns out to TOAST. Your idea will produce an even smaller ta

Re: [PERFORM] Tsearch2 Initial Search Speed

2008-06-18 Thread Matthew Wakeling
On Wed, 18 Jun 2008, Howard Cole wrote: Out of interest, if I could create a multicolumn index with both the primary key and the fts key (I don't think I can create a multi-column index using GIST with both the email_id and the fts field), would this reduce access to the table due to the primar

Re: [PERFORM] Tsearch2 Initial Search Speed

2008-06-18 Thread Howard Cole
Actually, the index returns page numbers in the table on disc which may contain one or more rows that are relevant. Postgres has to fetch the whole row to find out the email_id and any other information, including whether the row is visible in your current transaction (concurrency control c

Re: [PERFORM] Tsearch2 Initial Search Speed

2008-06-17 Thread Matthew Wakeling
On Tue, 17 Jun 2008, Alan Hodgson wrote: On Tuesday 17 June 2008, Howard Cole <[EMAIL PROTECTED]> wrote: Incidentally, how can I clear the cache in between queries? Stop PostgreSQL, unmount the filesystem it's on, remount it, restart PostgreSQL. Works under Linux. If it's on a filesystem you

Re: [PERFORM] Tsearch2 Initial Search Speed

2008-06-17 Thread Alan Hodgson
On Tuesday 17 June 2008, Howard Cole <[EMAIL PROTECTED]> wrote: > Incidentally, how can I clear the cache in between queries? Stop PostgreSQL, unmount the filesystem it's on, remount it, restart PostgreSQL. Works under Linux. If it's on a filesystem you can't unmount hot, you'll need to reboot.

Re: [PERFORM] Tsearch2 Initial Search Speed

2008-06-17 Thread Alan Hodgson
On Tuesday 17 June 2008, Howard Cole <[EMAIL PROTECTED]> wrote: > This misses out the random access of the email table, turning my 27 > second query into 6 seconds. It took less time because it retrieved a lot less data - it still has to look at the table. -- Alan -- Sent via pgsql-performanc

Re: [PERFORM] Tsearch2 Initial Search Speed

2008-06-17 Thread Howard Cole
Actually, the index returns page numbers in the table on disc which may contain one or more rows that are relevant. Postgres has to fetch the whole row to find out the email_id and any other information, including whether the row is visible in your current transaction (concurrency control com

Re: [PERFORM] Tsearch2 Initial Search Speed

2008-06-17 Thread Matthew Wakeling
On Tue, 17 Jun 2008, Howard Cole wrote: If I do a query that uses another index, then it uses the index only and does not scan the email table. Not true. It only looks a little bit like that from the explain output. However, if you look closely: Index Scan using email_email_directory_id_idx

Re: [PERFORM] Tsearch2 Initial Search Speed

2008-06-17 Thread Howard Cole
Matthew Wakeling wrote: On Tue, 17 Jun 2008, Howard Cole wrote: They both appear to do a scan on the email table (Why?). The indexes don't contain copies of the row data. They only contain pointers to the rows in the table. So once the index has been consulted, Postgres still needs to look a

Re: [PERFORM] Tsearch2 Initial Search Speed

2008-06-17 Thread Matthew Wakeling
On Tue, 17 Jun 2008, Howard Cole wrote: They both appear to do a scan on the email table (Why?). The indexes don't contain copies of the row data. They only contain pointers to the rows in the table. So once the index has been consulted, Postgres still needs to look at the table to fetch the

Re: [PERFORM] Tsearch2 Initial Search Speed

2008-06-17 Thread Howard Cole
As far as I can see, that shouldn't make any difference. Both queries still do the bitmap heap scan, and have almost exactly the same cost. Matthew You may have a point there Matthew, they both appear to do a scan on the email table (Why?). But for whatever reason, I swear the second method

Re: [PERFORM] Tsearch2 Initial Search Speed

2008-06-17 Thread Matthew Wakeling
On Tue, 17 Jun 2008, Howard Cole wrote: I think I may have answered my own question partially, the problem may be how I structure the query. Original statement: "Nested Loop (cost=4.40..65.08 rows=16 width=8)" " -> Function Scan on q (cost=0.00..0.01 rows=1 width=32)" " -> Bitmap Heap Sc

Re: [PERFORM] Tsearch2 Initial Search Speed

2008-06-17 Thread Matthew Wakeling
On Tue, 17 Jun 2008, Howard Cole wrote: Alan Hodgson wrote: It's because everything is cached, in particular the relevant rows from the "email" table (accessing which took 22 of the original 27 seconds). Thanks Alan, I guessed that the caching was the difference, but I do not understand why t

Re: [PERFORM] Tsearch2 Initial Search Speed

2008-06-17 Thread Howard Cole
I think I may have answered my own question partially, the problem may be how I structure the query. I always structured my tsearch queries as follows following my initial read of the tsearch2 instructions... select email_id from email, to_tsquery('default', 'howard') as q where q@@fts; Ho

Re: [PERFORM] Tsearch2 Initial Search Speed

2008-06-17 Thread Howard Cole
Alan Hodgson wrote: It's because everything is cached, in particular the relevant rows from the "email" table (accessing which took 22 of the original 27 seconds). The plan looks good for what it's doing. I don't see that query getting much faster unless you could add a lot more cache RAM; 30

Re: [PERFORM] Tsearch2 Initial Search Speed

2008-06-16 Thread Alan Hodgson
On Monday 16 June 2008, Howard Cole <[EMAIL PROTECTED]> wrote: > Hi, I am looking to improve the initial query speed for the following > query: > > select email_id from email, to_tsquery('default','example') as q where > q@@fts; > > This is running on 8.2.4 on Windows Server 2K3. > > The initial ou

[PERFORM] Tsearch2 Initial Search Speed

2008-06-16 Thread Howard Cole
Hi, I am looking to improve the initial query speed for the following query: select email_id from email, to_tsquery('default','example') as q where q@@fts; This is running on 8.2.4 on Windows Server 2K3. The initial output from explain analyse is as follows. "Nested Loop (cost=8.45..76.70 r