Hi,
I'm seeing a slow running query. After some optimization with
indexes,
it appears that the query plan is correct, it's just slow. Running the
query twice, not surprisingly, is very fast, due to OS caching or
shared_buffers caching. If the parameters for the query are different,
however, the query is slow until run a second time. In our usage
pattern, this query must run fast the first time it runs.
A couple of general stats: this is a linode machine with a single
3GB DB
with 4GBs of ram. Shared buffers = 1024mb, effective_cache_size=2048MB.
We are running with postgres
9.1.
The machine is otherwise dormant when
this query runs. Here is the schema:
Table
"public.premiseaccount"
Column | Type
| Modifiers
------------------------+--------------------------+-------------------------------------------------------------------------
id | integer | not null
default
nextval('premiseaccount_id_seq'::regclass)
created | timestamp with time zone | not null
default
'2013-09-25 07:00:00+00'::timestamp with time zone
modified | timestamp with time zone | not null
default
'2013-09-25 07:00:00+00'::timestamp with time zone
account_id | integer | not null
premise_id | integer | not null
bucket | character varying(255) |
Indexes:
"premiseaccount_pkey" PRIMARY KEY, btree (id)
"premiseaccount_account_id" btree (account_id)
"premiseaccount_bucket" btree (bucket)
"premiseaccount_bucket_58c70392619aa36f" btree (bucket, id)
"premiseaccount_bucket_like" btree (bucket varchar_pattern_ops)
"premiseaccount_premise_id" btree (premise_id)
Foreign-key constraints:
"account_id_refs_id_529631edfff28022" FOREIGN KEY (account_id)
REFERENCES utilityaccount(id) DEFERRABLE INITIALLY DEFERRED
"premise_id_refs_id_5ecea2842007328b" FOREIGN KEY (premise_id)
REFERENCES premise(id) DEFERRABLE INITIALLY DEFERRED
Table "public.electricusage"
Column | Type
| Modifiers
--------------------+--------------------------+------------------------------------------------------------------------
id | integer | not null default
nextval('electricusage_id_seq'::regclass)
created | timestamp with time zone | not null default
'2013-09-25 07:00:00+00'::timestamp with time zone
modified | timestamp with time zone | not null default
'2013-09-25 07:00:00+00'::timestamp with time zone
from_date | timestamp with time zone | not null
to_date | timestamp with time zone | not null
usage | numeric(9,2) |
demand | numeric(9,2) |
bill_amount | numeric(9,2) |
premise_account_id | integer | not null
Indexes:
"electricusage_pkey" PRIMARY KEY, btree (id)
"electricusage_premise_account_id" btree (premise_account_id)
"electricusage_covered_id_from_date_usage" btree
(premise_account_id, from_date, usage)
Foreign-key constraints:
"premise_account_id_refs_id_4c39e54406369128" FOREIGN KEY
(premise_account_id) REFERENCES premiseaccount(id) DEFERRABLE INITIALLY
DEFERRED
For reference, premiseaccount has about 1 million rows, one for each
account, grouped in buckets with an average of 5000 accounts per bucket.
electricusage has 10 million rows, about 10 rows per premiseaccount.
Here is the query:
explain analyze
SELECT premiseaccount.id, SUM(electricusage.usage) AS total_kwh
FROM premiseaccount
LEFT OUTER JOIN electricusage
ON premiseaccount.id = electricusage.premise_account_id
WHERE premiseaccount.bucket = 'XXX' AND electricusage.from_date
>=
'2012-11-20
00:00:00'
GROUP BY premiseaccount.id
HAVING SUM(electricusage.usage) BETWEEN 3284 and 3769
LIMIT 50;
Limit (cost=0.00..1987.24 rows=50 width=8) (actual
time=931.524..203631.435 rows=50 loops=1)
-> GroupAggregate (cost=0.00..179487.78 rows=4516 width=8)
(actual
time=931.519..203631.275 rows=50 loops=1)
Filter: ((sum(electricusage.usage) >= 3284::numeric) AND
(sum(electricusage.usage) <= 3769::numeric))
-> Nested Loop (cost=0.00..179056.87 rows=36317
width=8)
(actual time=101.934..203450.761 rows=30711 loops=1)
-> Index Scan using
premiseaccount_bucket_58c70392619aa36f on premiseaccount premiseaccount
(cost=0.00..14882.30 rows=4516 width=4) (actual time=77.620..7199.527
rows=3437 loops=1)
Index Cond: ((bucket)::text =
'XXX'::text)
-> Index Scan using
electricusage_premise_account_id_36bc8999ced10059 on electricusage
electricusage (cost=0.00..36.24 rows=9 width=8) (actual
time=8.607..57.055 rows=9 loops=3437)
Index Cond: ((premise_account_id =
premiseaccount.id) AND (from_date >= '2012-11-20
00:00:00+00'::timestamp
with time zone))
Total runtime: 203631.666 ms
(see online at:
http://explain.depesz.com/s/zeC)
What am I missing here? It seems like this is a relatively
straightforward foreign key join, using the correct indexes, that is
just slow. Warming the OS cache seems like the only way to make this
actually perform reasonably, but it seems like it's masking the
underlying problem. I could probably do some denormalization, like
putting the bucket field on the electricusage table, but that seems like
a shoddy solution.
Before running my query, I clear the os and postgres cache. I know
that
in theory some subset of the data will be in the os cache and postgres
cache. But in the real world, what we see is that a number of people get
timeouts waiting for this query to finish, because it's not in the
cache. e.g., running the query with where bucket='xxx' will be fast a
second time, but where bucket='yyy' will not.
Here are things I've tried:
+ Clustering premiseaccount on
premiseaccount_bucket_58c70392619aa36f
and electricusage on electricusage_covered_id_from_date_usage. This
provided modest improvement
+ Fit the whole DB in memory. Since this query is fast if the data
is
cached, try to fit everything into the cache. Bumping the ram to 8GBs,
and warming up the os cache by taring the postgres data directory. On
the 4GB machine, I still run into slow performance, but at 8GBs, the
entire DB can fit in the OS cache. This made a huge improvement, but
will only work for a little while as the database grows.
+ Tried on other linode servers - same results.
+ Removing the from_date part of the query. It uses a different
index
but same performance characteristics.
+ Looked at kern, sys, and postgres logs. Nothing interesting.
I would really appreciate any help I could get! Thanks!
Bryce