I'm interested to know a little bit more about the postgres implementation of indexes. I'm specifically wondering what it means in the output of EXPLAIN when a filter is applied.

I'm trying to decide whether it makes sense to use indexes on expressions rather than relying on a left-anchored LIKE for date filtering.

Here's what I've got:

WHERE some_date LIKE '<year>-<month>%' *

And what I'm wondering is whether it would be faster to add indexes on expressions for something like:

WHERE EXTRACT( year from some_date ) = '<year>'
AND EXTRACT( month from some_date ) = '<month>'

In practice, the point in the implementation has other parameters, so it ends up looking something like:

SELECT COUNT( * )
FROM some_table AS st, some_other_table AS sot
WHERE st.id = '<id>'
AND st.some_date LIKE '<year>-<month>%'
AND st.other_id = sot.other_id

Here's what I get from an EXPLAIN:

QUERY PLAN
------------------------------------------------------------------------ ----------------------------------------------------------------
Aggregate (cost=4749.19..4749.19 rows=1 width=0)
-> Nested Loop (cost=0.00..4749.17 rows=9 width=0)
-> Index Scan using st_id_idx on emma_mailings em (cost=0.00..147.56 rows=1 width=8)
Index Cond: (emma_account_id = 1::bigint)
Filter: ((some_date)::text ~~ '2004-06%'::text)
-> Index Scan using sot_other_id_idx on some_other_table sot (cost=0.00..3164.42 rows=114975 width=8)
Index Cond: ("outer".other_id = sot.other_id)



Then I try:

SELECT COUNT( * )
FROM some_table AS st, some_other_table AS sot
WHERE st.id = '<id>'
AND EXTRACT( year FROM st.some_date ) = '<year>'
AND EXTRACT( month FROM st.some_date  ) = '<month>'
AND st.other_id = sot.other_id

When I first added indexes on the EXTRACT expressions on some_table.some_date, I basically (the numbers are fudged because now the new indexes are working) saw:

QUERY PLAN
------------------------------------------------------------------------ ----------------------------------------------------------------
Aggregate (cost=4749.19..4749.19 rows=1 width=0)
-> Nested Loop (cost=0.00..4749.17 rows=9 width=0)
-> Index Scan using st_id_idx on emma_mailings em (cost=0.00..147.56 rows=1 width=8)
Index Cond: (emma_account_id = 1::bigint)
Filter: ((date_part('year'::text, emma_mailing_start_ts) = 2004::double precision) AND (date_part('year'::text, emma_mailing_start_ts) = 6::double precision))
-> Index Scan using sot_other_id_idx on some_other_table sot (cost=0.00..3164.42 rows=114975 width=8)
Index Cond: ("outer".other_id = sot.other_id)


Now, though, it seems to be using the expression indexes. I guess my main question is what it means to apply a Filter to an Index Scan, and whether the LIKE filter would be slower than the EXTRACT filter. In the queries I've tested, EXPLAIN ANALYZE resulted in almost identical runtimes until the indexes on expressions kicked in. Once that happened, the new indexes were much faster.

A related question might be what might've happened between the times I tested when the indexes weren't working and when they were. I've been testing on a low-traffic development server, and I ANALYZED after adding the indexes. It was in a new session, though, that the indexes on EXTRACT actually kicked in.

-tfo

* <> = generic pseudocode placeholder for variable/constant data


---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Reply via email to