Hi, One more question about two specifics query behavior: If I add "AND (ip_dst = x.x.x.x)", it uses another plan and take a much more time. In both of them, I'm using WHERE clause. Why this behavior?
=> explain analyze SELECT ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto, bytes, packets, flows FROM "acct_2010_26" WHERE "stamp_inserted">='2010-06-28 09:07' AND "stamp_inserted"<'2010-06-29 08:07' AND (ip_dst = '8.8.8.8') ORDER BY bytes DESC LIMIT 50 OFFSET 0; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=496332.56..496332.69 rows=50 width=50) (actual time=125390.523..125390.540 rows=50 loops=1) -> Sort (cost=496332.56..496351.35 rows=7517 width=50) (actual time=125390.520..125390.525 rows=50 loops=1) Sort Key: bytes -> Index Scan using acct_2010_26_pk on acct_2010_26 (cost=0.00..495848.62 rows=7517 width=50) (actual time=0.589..125385.680 rows=1011 loops=1) Index Cond: ((stamp_inserted >= '2010-06-28 09:07:00'::timestamp without time zone) AND (stamp_inserted < '2010-06-29 08:07:00'::timestamp without time zone) AND (ip_dst = '8.8.8.8'::inet)) Total runtime: 125390.711 ms (6 registros) => explain analyze SELECT ip_src, port_src, ip_dst, port_dst, tcp_flags, ip_proto, bytes, packets, flows FROM "acct_2010_26" WHERE "stamp_inserted">='2010-06-28 09:07' AND "stamp_inserted"<'2010-06-29 08:07' ORDER BY bytes DESC LIMIT 50 OFFSET 0; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..268.25 rows=50 width=50) (actual time=0.150..70.780 rows=50 loops=1) -> Index Scan Backward using ibytes_acct_2010_26 on acct_2010_26 (cost=0.00..133240575.70 rows=24835384 width=50) (actual time=0.149..70.762 rows=50 loops=1) Filter: ((stamp_inserted >= '2010-06-28 09:07:00'::timestamp without time zone) AND (stamp_inserted < '2010-06-29 08:07:00'::timestamp without time zone)) Total runtime: 70.830 ms (4 registros) Thanks in advance. 2010/6/23 Sergio Charpinel Jr. <sergiocharpi...@gmail.com> > Craig, Russel, > > I appreciate your help. > > Thanks. > > 2010/6/22 Russell Smith <mr-r...@pws.com.au> > > On 22/06/10 00:42, Sergio Charpinel Jr. wrote: >> > Hi, >> > >> [snip] >> > >> > => explain analyze SELECT ip_src, port_src, ip_dst, port_dst, >> > tcp_flags, ip_proto,SUM("bytes"),SUM("packets"),SUM("flows") FROM >> > "acct_2010_25" WHERE "stamp_inserted">='2010-06-20 10:10' AND >> > "stamp_inserted"<'2010-06-21 10:10' GROUP BY ip_src, port_src, ip_dst, >> > port_dst, tcp_flags, ip_proto order by SUM(bytes) desc LIMIT 50 OFFSET >> 0; >> > >> > QUERY PLAN >> > >> > >> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- >> > Limit (cost=3998662.81..3998662.94 rows=50 width=50) (actual >> > time=276981.107..276981.133 rows=50 loops=1) >> > -> Sort (cost=3998662.81..4001046.07 rows=953305 width=50) >> > (actual time=276981.105..276981.107 rows=50 loops=1) >> > Sort Key: sum(bytes) >> > -> GroupAggregate (cost=3499863.27..3754872.33 rows=953305 >> > width=50) (actual time=165468.257..182677.580 rows=8182616 loops=1) >> > -> Sort (cost=3499863.27..3523695.89 rows=9533049 >> > width=50) (actual time=165468.022..168908.828 rows=9494165 loops=1) >> > Sort Key: ip_src, port_src, ip_dst, port_dst, >> > tcp_flags, ip_proto >> >> You are having to sort and aggregate a large number of rows before you >> can get the top 50. That's 9 million rows in this case, width 50 = >> 400MB+ sort. That's going to be slow as you are going to have to sort >> it on disk unless you bump up sort mem to 500Mb (bad idea). So unless >> you have really fast storage for temporary tables it's going to take a >> while. About 2.5 minutes you are experiencing at the moment is probably >> not too bad. >> >> I'm sure improvements have been made in the area since 8.1 and if you >> are able to upgrade to 8.4 which is also offered by Centos5 now, you >> might get benefit there. I can't remember the specific benefits, but I >> believe sorting speed has improved, your explain analyze will also give >> you more information about what's going on with disk/memory sorting. >> >> > -> Seq Scan on acct_2010_25 >> > (cost=0.00..352648.10 rows=9533049 width=50) (actual >> > time=0.038..50860.391 rows=9494165 loops=1) >> > Filter: ((stamp_inserted >= '2010-06-20 >> > 10:10:00'::timestamp without time zone) AND (stamp_inserted < >> > '2010-06-21 10:10:00'::timestamp without time zone)) >> > Total runtime: 278791.661 ms >> > (9 registros) >> > >> > Another one just summing bytes (still low): >> > >> > => explain analyze SELECT ip_src, port_src, ip_dst, port_dst, >> > tcp_flags, ip_proto,SUM("bytes") FROM "acct_2010_25" WHERE >> > "stamp_inserted">='2010-06-20 10:10' AND "stamp_inserted"<'2010-06-21 >> > 10:10' GROUP BY ip_src, port_src, ip_dst, port_dst, tcp_flags, >> > ip_proto LIMIT 50 OFFSET 0; >> > >> > QUERY PLAN >> > >> > >> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- >> > Limit (cost=3395202.50..3395213.12 rows=50 width=42) (actual >> > time=106261.359..106261.451 rows=50 loops=1) >> > -> GroupAggregate (cost=3395202.50..3602225.48 rows=974226 >> > width=42) (actual time=106261.357..106261.435 rows=50 loops=1) >> > -> Sort (cost=3395202.50..3419558.14 rows=9742258 width=42) >> > (actual time=106261.107..106261.169 rows=176 loops=1) >> > Sort Key: ip_src, port_src, ip_dst, port_dst, >> > tcp_flags, ip_proto >> > -> Seq Scan on acct_2010_25 (cost=0.00..367529.72 >> > rows=9742258 width=42) (actual time=0.073..8058.598 rows=9494165 >> loops=1) >> > Filter: ((stamp_inserted >= '2010-06-20 >> > 10:10:00'::timestamp without time zone) AND (stamp_inserted < >> > '2010-06-21 10:10:00'::timestamp without time zone)) >> > Total runtime: 109911.882 ms >> > (7 registros) >> > >> > >> > The server has 2 Intel(R) Xeon(R) CPU E5430 @ 2.66GHz and 16GB RAM. >> > I'm using PostgreSQL 8.1.18 default config from Centos 5.5 (just >> > increased checkpoint_segments to 50). >> >> Checkpoint segments won't help you as the number of segments is about >> writing to the database and how fast that can happen. >> >> > >> > What can I change to increase performance? >> >> Increasing sort-memory (work_mem) will give you speed benefits even >> though you are going to disk. I don't know how much spare memory you >> have, but trying other values between 8MB and 128MB may be useful just >> for the specific query runs. If you can afford 512Mb for each of the >> two sorts, go for that, but it's dangerous as mentioned due to the risk >> of using more RAM than you have. work_mem allocates that amount of >> memory per sort. >> >> If you are running these queries all the time, a summary table the >> produces there reports on a regular basis, maybe daily or even hourly >> would be useful. Basically the large amount of information that needs >> to be processed and sorted is what's taking all the time here. >> >> Regards >> >> Russell >> > > > > -- > Sergio Roberto Charpinel Jr. > -- Sergio Roberto Charpinel Jr.