On Wed, Feb 8, 2012 at 3:03 PM, David Yeu <david....@skype.net> wrote:
> Thankfully, the types of queries that we perform against this table are
> pretty constrained. We never update rows and we never join against other
> tables. The table essentially looks like this:
>
> | id | group_id | created_at | everything else...
...
> Our queries essentially fall into the following cases:
>
>  * ... WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
>  * ... WHERE group_id = ? AND id > ? ORDER BY created_at DESC;
>  * ... WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT 20;
>  * ... WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;

I think you have something to gain from partitioning.
You could partition on group_id, which is akin to sharding, only on a
single server, and that would significantly decrease each partition's
index size. Since the performance of those queries depends heavily on
index size, and since your table seems to be huge, I would expect such
partitioning to help keep the indexes performant.
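
As a rough sketch of what I mean, using inheritance-based partitioning:
the table name "messages", the column types, the range boundaries and
the index definition are all made up for illustration, since only the
column names id, group_id and created_at are known from your mail.

```sql
-- Hypothetical parent table; only the three column names come from the thread.
CREATE TABLE messages (
    id         bigint      NOT NULL,
    group_id   integer     NOT NULL,
    created_at timestamptz NOT NULL
    -- ... everything else
);

-- One child table per range of group_id (boundaries are assumptions).
-- The CHECK constraints let the planner skip children that cannot match.
CREATE TABLE messages_g0 (
    CHECK (group_id >= 0 AND group_id < 100000)
) INHERITS (messages);

CREATE TABLE messages_g1 (
    CHECK (group_id >= 100000 AND group_id < 200000)
) INHERITS (messages);

-- Each child carries its own, much smaller index (assumed index layout).
CREATE INDEX messages_g0_group_created
    ON messages_g0 (group_id, created_at DESC);
CREATE INDEX messages_g1_group_created
    ON messages_g1 (group_id, created_at DESC);
```

Note that with this scheme new rows have to be routed to the right
child, either by inserting into the child directly or with a trigger on
the parent, and that constraint exclusion is decided at plan time, so
it helps most when group_id is visible to the planner as a constant.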

Now, we do need statistics. How many groups are there? Does their
number grow with your table, or is it constant? Which offset values do
you use? (OFFSET is quite expensive: the skipped rows still have to be
fetched and discarded.)
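
Something along these lines would give the numbers I'm asking about.
Again, "messages" is only a placeholder for your real table name, and
the exact counts will be slow on a table that size, so the planner's
estimate from pg_stats may be the cheaper starting point.

```sql
-- Planner's estimate of distinct group_id values (cheap, needs a prior ANALYZE).
-- A negative n_distinct means a fraction of the row count, not an absolute number.
SELECT n_distinct
FROM pg_stats
WHERE tablename = 'messages' AND attname = 'group_id';

-- Exact number of groups (scans the whole table).
SELECT count(DISTINCT group_id) AS group_count
FROM messages;

-- The largest groups, to see how skewed the distribution is.
SELECT group_id, count(*) AS row_count
FROM messages
GROUP BY group_id
ORDER BY row_count DESC
LIMIT 20;
```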

And of course... EXPLAIN ANALYZE output for those queries.
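
For instance, for the OFFSET variant (the table name "messages" and the
literal values 42 and 1000 are placeholders, so substitute a real group
and a typical offset):

```sql
-- Runs the query for real and reports the actual plan, row counts and timings.
EXPLAIN ANALYZE
SELECT *
FROM messages
WHERE group_id = 42
ORDER BY created_at DESC
LIMIT 20 OFFSET 1000;
```

Running it once with a small offset and once with a large one should
show how much of the time goes into rows that are read and thrown away.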

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
