Re: [GENERAL] Query not using the indexes properly.

2009-10-01 Thread David Wilson
On Thu, Oct 1, 2009 at 10:21 PM, Tim Uckun  wrote:

>
> Interesting. I would have thought the order of the fields would not
> matter. I don't have to rewrite the query do I?
>
>
No. For multi-column indices, however, postgres can, starting at the
leftmost in the index, use as many columns as match equality comparisons
plus one column using an inequality comparison.

>From our fine manual, section 11.3:

"A multicolumn B-tree index can be used with query conditions that involve
any subset of the index's columns, but the index is most efficient when
there are constraints on the leading (leftmost) columns. The exact rule is
that equality constraints on leading columns, plus any inequality
constraints on the first column that does not have an equality constraint,
will be used to limit the portion of the index that is scanned. Constraints
on columns to the right of these columns are checked in the index, so they
save visits to the table proper, but they do not reduce the portion of the
index that has to be scanned. For example, given an index on (a, b, c) and a
query condition WHERE a = 5 AND b >= 42 AND c < 77, the index would have to
be scanned from the first entry with a = 5 and b = 42 up through the last
entry with a = 5. Index entries with c >= 77 would be skipped, but they'd
still have to be scanned through. This index could in principle be used for
queries that have constraints on b and/or c with no constraint on a — but
the entire index would have to be scanned, so in most cases the planner
would prefer a sequential table scan over using the index."

-- 
- David T. Wilson
david.t.wil...@gmail.com


Re: [GENERAL] Query not using the indexes properly.

2009-10-01 Thread Tim Uckun
>
> If you try the multi-column index (which is a good idea), be sure that "id"
> is the last of the three columns, since that's the column on which you have
> an inequality test rather than an equality test; eg,
> (company_id,source_model_name,id).
>


Interesting. I would have thought the order of the fields would not
matter. I don't have to rewrite the query do I?

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Query not using the indexes properly.

2009-10-01 Thread Tim Uckun
> It probably thinks the id check is going to be better to limit the result
> set.
>
> How many records are there for id > 1935759 ?

About 40 million or so.

> vs
> How many records for company_id = 4 and source_model_name =
> 'CommissionedVisit' ?
>
> If this is a common query you could probably do a multi-column index on all
> 3 columns (id, company_id, source_model_name) - but if company_id and
> source_model_name have a low number of distinct values, then it's not going
> to help.


Both of those will indeed have a pretty low number of distinct values.

Looks like I need to figure out something else.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Query not using the indexes properly.

2009-10-01 Thread David Wilson
On Thu, Oct 1, 2009 at 10:04 PM, Chris  wrote:

> Tim Uckun wrote:
>
> If this is a common query you could probably do a multi-column index on all
> 3 columns (id, company_id, source_model_name) - but if company_id and
> source_model_name have a low number of distinct values, then it's not going
> to help.
>

If you try the multi-column index (which is a good idea), be sure that "id"
is the last of the three columns, since that's the column on which you have
an inequality test rather than an equality test; eg,
(company_id,source_model_name,id).

-- 
- David T. Wilson
david.t.wil...@gmail.com


Re: [GENERAL] Query not using the indexes properly.

2009-10-01 Thread Chris

Tim Uckun wrote:

I have a pretty simple query on a pretty simple table with about 60
million records in it.

This is the query.

SELECT * FROM "changes" WHERE (id > 1935759 and company_id = 4 and
source_model_name = 'CommissionedVisit') ORDER BY id ASC LIMIT 1


The id field is the primary key. The other fields are indexed
(company_id and source_model_name).

This query takes about 30 seconds to run on a pretty beefy machine.

Here is the explain.

"Limit  (cost=0.00..7.46 rows=1 width=45) (actual
time=28799.712..28799.712 rows=0 loops=1)"
"  ->  Index Scan using changes_pkey on changes
(cost=0.00..2331939.52 rows=312519 width=45) (actual
time=28799.710..28799.710 rows=0 loops=1)"
"Index Cond: (id > 1935759)"
"Filter: ((company_id = 4) AND ((source_model_name)::text =
'CommissionedVisit'::text))"
"Total runtime: 28799.749 ms"


It seem to me that it's ignoring the indexes on the text fields. Is
that right?


It probably thinks the id check is going to be better to limit the 
result set.


How many records are there for id > 1935759 ?
vs
How many records for company_id = 4 and source_model_name = 
'CommissionedVisit' ?


If this is a common query you could probably do a multi-column index on 
all 3 columns (id, company_id, source_model_name) - but if company_id 
and source_model_name have a low number of distinct values, then it's 
not going to help.


--
Postgresql & php tutorials
http://www.designmagick.com/


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Query not using the indexes properly.

2009-10-01 Thread Tim Uckun
I have a pretty simple query on a pretty simple table with about 60
million records in it.

This is the query.

SELECT * FROM "changes" WHERE (id > 1935759 and company_id = 4 and
source_model_name = 'CommissionedVisit') ORDER BY id ASC LIMIT 1


The id field is the primary key. The other fields are indexed
(company_id and source_model_name).

This query takes about 30 seconds to run on a pretty beefy machine.

Here is the explain.

"Limit  (cost=0.00..7.46 rows=1 width=45) (actual
time=28799.712..28799.712 rows=0 loops=1)"
"  ->  Index Scan using changes_pkey on changes
(cost=0.00..2331939.52 rows=312519 width=45) (actual
time=28799.710..28799.710 rows=0 loops=1)"
"Index Cond: (id > 1935759)"
"Filter: ((company_id = 4) AND ((source_model_name)::text =
'CommissionedVisit'::text))"
"Total runtime: 28799.749 ms"


It seem to me that it's ignoring the indexes on the text fields. Is
that right?

I did a vacuum analyze on the table but that didn't seem to help at all.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general