PFC wrote:
Hi, I am looking to improve the initial query speed for the following
query:
Try Xapian full text search engine, it behaves much better than
tsearch when the dataset exceeds your memory cache size.
__ Information from ESET NOD32 Antivirus, version of virus
signature d
Good idea. Note that Postgres is already doing this to some extent
with TOAST - read
http://www.postgresql.org/docs/8.3/interactive/storage-toast.html -
unfortunately, there doesn't seem to be an option to always move
particular columns out to TOAST. Your idea will produce an even
smaller ta
On Wed, 18 Jun 2008, Howard Cole wrote:
Out of interest, if I could create a multicolumn index with both the primary
key and the fts key (I don't think I can create a multi-column index using
GIST with both the email_id and the fts field), would this reduce access to
the table due to the primar
Actually, the index returns page numbers in the table on disc which
may contain one or more rows that are relevant. Postgres has to fetch
the whole row to find out the email_id and any other information,
including whether the row is visible in your current transaction
(concurrency control c
On Tue, 17 Jun 2008, Alan Hodgson wrote:
On Tuesday 17 June 2008, Howard Cole <[EMAIL PROTECTED]> wrote:
Incidentally, how can I clear the cache in between queries?
Stop PostgreSQL, unmount the filesystem it's on, remount it, restart
PostgreSQL. Works under Linux.
If it's on a filesystem you
On Tuesday 17 June 2008, Howard Cole <[EMAIL PROTECTED]> wrote:
> Incidentally, how can I clear the cache in between queries?
Stop PostgreSQL, unmount the filesystem it's on, remount it, restart
PostgreSQL. Works under Linux.
If it's on a filesystem you can't unmount hot, you'll need to reboot.
On Tuesday 17 June 2008, Howard Cole <[EMAIL PROTECTED]> wrote:
> This misses out the random access of the email table, turning my 27
> second query into 6 seconds.
It took less time because it retrieved a lot less data - it still has to
look at the table.
--
Alan
--
Sent via pgsql-performanc
Actually, the index returns page numbers in the table on disc which
may contain one or more rows that are relevant. Postgres has to fetch
the whole row to find out the email_id and any other information,
including whether the row is visible in your current transaction
(concurrency control com
On Tue, 17 Jun 2008, Howard Cole wrote:
If I do a query that uses another index, then it uses the index only and
does not scan the email table.
Not true. It only looks a little bit like that from the explain output.
However, if you look closely:
Index Scan using email_email_directory_id_idx
Matthew Wakeling wrote:
On Tue, 17 Jun 2008, Howard Cole wrote:
They both appear to do a scan on the email table (Why?).
The indexes don't contain copies of the row data. They only contain
pointers to the rows in the table. So once the index has been
consulted, Postgres still needs to look a
On Tue, 17 Jun 2008, Howard Cole wrote:
They both appear to do a scan on the email table (Why?).
The indexes don't contain copies of the row data. They only contain
pointers to the rows in the table. So once the index has been consulted,
Postgres still needs to look at the table to fetch the
As far as I can see, that shouldn't make any difference. Both queries
still do the bitmap heap scan, and have almost exactly the same cost.
Matthew
You may have a point there Matthew, they both appear to do a scan on the
email table (Why?). But for whatever reason, I swear the second method
On Tue, 17 Jun 2008, Howard Cole wrote:
I think I may have answered my own question partially, the problem may be how
I structure the query.
Original statement:
"Nested Loop (cost=4.40..65.08 rows=16 width=8)"
" -> Function Scan on q (cost=0.00..0.01 rows=1 width=32)"
" -> Bitmap Heap Sc
On Tue, 17 Jun 2008, Howard Cole wrote:
Alan Hodgson wrote:
It's because everything is cached, in particular the relevant rows from the
"email" table (accessing which took 22 of the original 27 seconds).
Thanks Alan, I guessed that the caching was the difference, but I do not
understand why t
I think I may have answered my own question partially, the problem may
be how I structure the query.
I always structured my tsearch queries as follows following my initial
read of the tsearch2 instructions...
select email_id from email, to_tsquery('default', 'howard') as q where
q@@fts;
Ho
Alan Hodgson wrote:
It's because everything is cached, in particular the relevant rows from
the "email" table (accessing which took 22 of the original 27 seconds).
The plan looks good for what it's doing.
I don't see that query getting much faster unless you could add a lot more
cache RAM; 30
On Monday 16 June 2008, Howard Cole <[EMAIL PROTECTED]> wrote:
> Hi, I am looking to improve the initial query speed for the following
> query:
>
> select email_id from email, to_tsquery('default','example') as q where
> q@@fts;
>
> This is running on 8.2.4 on Windows Server 2K3.
>
> The initial ou
Hi, I am looking to improve the initial query speed for the following query:
select email_id from email, to_tsquery('default','example') as q where
q@@fts;
This is running on 8.2.4 on Windows Server 2K3.
The initial output from explain analyse is as follows.
"Nested Loop (cost=8.45..76.70 r
18 matches
Mail list logo