Hi, I'm getting low performance on SUM and GROUP BY queries. How can I improve my database to perform such queries.
Here is my table schema: => \d acct_2010_25 Tabela "public.acct_2010_25" Coluna | Tipo | Modificadores ----------------+-----------------------------+------------------------------------------------------------------------ ip_src | inet | not null default '0.0.0.0'::inet ip_dst | inet | not null default '0.0.0.0'::inet as_src | bigint | not null default 0 as_dst | bigint | not null default 0 port_src | integer | not null default 0 port_dst | integer | not null default 0 tcp_flags | smallint | not null default 0 ip_proto | smallint | not null default 0 packets | integer | not null flows | integer | not null default 0 bytes | bigint | not null stamp_inserted | timestamp without time zone | not null default '0001-01-01 00:00:00 BC'::timestamp without time zone stamp_updated | timestamp without time zone | Índices: "acct_2010_25_pk" PRIMARY KEY, btree (stamp_inserted, ip_src, ip_dst, port_src, port_dst, ip_proto) "ibytes_acct_2010_25" btree (bytes) Here is my one query example (could add pk to flow and packet fields): => explain analyze SELECT ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto,SUM("bytes"),SUM("packets"),SUM("flows") FROM "acct_2010_25" WHERE "stamp_inserted">='2010-06-20 10:10' AND "stamp_inserted"<'2010-06-21 10:10' GROUP BY ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto order by SUM(bytes) desc LIMIT 50 OFFSET 0; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=3998662.81..3998662.94 rows=50 width=50) (actual time=276981.107..276981.133 rows=50 loops=1) -> Sort (cost=3998662.81..4001046.07 rows=953305 width=50) (actual time=276981.105..276981.107 rows=50 loops=1) Sort Key: sum(bytes) -> GroupAggregate (cost=3499863.27..3754872.33 rows=953305 width=50) (actual time=165468.257..182677.580 rows=8182616 loops=1) -> Sort (cost=3499863.27..3523695.89 rows=9533049 width=50) (actual time=165468.022..168908.828 rows=9494165 loops=1) Sort Key: ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto -> Seq Scan on acct_2010_25 (cost=0.00..352648.10 rows=9533049 width=50) (actual time=0.038..50860.391 rows=9494165 loops=1) Filter: ((stamp_inserted >= '2010-06-20 10:10:00'::timestamp without time zone) AND (stamp_inserted < '2010-06-21 10:10:00'::timestamp without time zone)) Total runtime: 278791.661 ms (9 registros) Another one just summing bytes (still low): => explain analyze SELECT ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto,SUM("bytes") FROM "acct_2010_25" WHERE "stamp_inserted">='2010-06-20 10:10' AND "stamp_inserted"<'2010-06-21 10:10' GROUP BY ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto LIMIT 50 OFFSET 0; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=3395202.50..3395213.12 rows=50 width=42) (actual time=106261.359..106261.451 rows=50 loops=1) -> GroupAggregate (cost=3395202.50..3602225.48 rows=974226 width=42) (actual time=106261.357..106261.435 rows=50 loops=1) -> Sort (cost=3395202.50..3419558.14 rows=9742258 width=42) (actual time=106261.107..106261.169 rows=176 loops=1) Sort Key: ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto -> Seq Scan on acct_2010_25 (cost=0.00..367529.72 rows=9742258 width=42) (actual time=0.073..8058.598 rows=9494165 loops=1) Filter: ((stamp_inserted >= '2010-06-20 10:10:00'::timestamp without time zone) AND (stamp_inserted < '2010-06-21 10:10:00'::timestamp without time zone)) Total runtime: 109911.882 ms (7 registros) The server has 2 Intel(R) Xeon(R) CPU E5430 @ 2.66GHz and 16GB RAM. I'm using PostgreSQL 8.1.18 default config from Centos 5.5 (just increased checkpoint_segments to 50). What can I change to increase performance? Thanks in advance. Cheers. -- Sergio Roberto Charpinel Jr.