Re: [GENERAL] postgres FROM clause problem
Alle 18:53, giovedì 11 marzo 2004, hai scritto: > Paolo Tavalazzi <[EMAIL PROTECTED]> writes: > > [ query plans after updating to 7.4.2 ] > > Okay, they're certainly a lot closer than before, so I think I was right > that you were getting bitten somehow by the pg_statistic alignment > problem. It seems like there may still be a bug lurking though. The > rows-out estimates are 7 versus 9 for the middle nested-loop join. > That might just be roundoff error, or there may be something else going > on. Also the estimates for the top join are 7 versus 1 rows, which > seems too large a ratio to be explainable as roundoff error. > > Could I pester you to send me a pg_dump dump of this database off-list? > I'd like to trace through the planner with a debugger and see exactly > where the results are diverging. > > regards, tom lane > > ---(end of broadcast)--- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to [EMAIL PROTECTED] so that your > message can get through to the mailing list cleanly I have reconstructed the database from zero and i have tried the usual two query in various conditions : NO ANALYZE : similar execution time but different plan 1) FROM seat,spettacoli,tran,teatri QUERY PLAN --- HashAggregate (cost=51.00..51.00 rows=1 width=37) (actual time=4163.191..4164.883 rows=1317 loops=1) -> Nested Loop (cost=0.00..50.99 rows=1 width=37) (actual time=0.253..3868.225 rows=67218 loops=1) -> Nested Loop (cost=0.00..46.15 rows=1 width=50) (actual time=0.215..2903.309 rows=68167 loops=1) -> Nested Loop (cost=0.00..41.31 rows=1 width=38) (actual time=0.124..508.361 rows=68174 loops=1) -> Index Scan using time_idx on tran (cost=0.00..17.11 rows=5 width=32) (actual time=0.068..42.747 rows=24923 loops=1) Index Cond: (("time" >= ('2004-03-01 00:00:00'::timestamp without time zone)::timestamp with time zone) AND ("time" <= ('2004-03-08 23:59:59'::timestamp without time zone)::timestamp with time zone)) -> Index Scan using id_system_idx on seat (cost=0.00..4.83 rows=1 width=14) (actual time=0.007..0.010 rows=3 loops=24923) Index Cond: (("outer".id = seat.bt_id) AND ("outer".system = seat.system)) -> Index Scan using spe_sys_tea_perf on spettacoli (cost=0.00..4.83 rows=1 width=31) (actual time=0.031..0.032 rows=1 loops=68174) Index Cond: ((spettacoli.system = "outer".system) AND ("outer".thea_code = spettacoli.teatro) AND (("outer".perf_code)::text = (spettacoli.code)::text)) -> Index Scan using teatri_pkey on teatri (cost=0.00..4.82 rows=1 width=13) (actual time=0.011..0.012 rows=1 loops=68167) Index Cond: (teatri.code = "outer".thea_code) Total runtime: 4165.522 ms 2) FROM seat,teatri,tran,spettacoli QUERY PLAN --- HashAggregate (cost=51.00..51.00 rows=1 width=37) (actual time=4161.159..4162.811 rows=1317 loops=1) -> Nested Loop (cost=0.00..50.99 rows=1 width=37) (actual time=0.251..3865.205 rows=67218 loops=1) -> Nested Loop (cost=0.00..46.14 rows=1 width=51) (actual time=0.186..1512.814 rows=67225 loops=1) -> Nested Loop (cost=0.00..41.31 rows=1 width=38) (actual time=0.121..507.694 rows=68174 loops=1) -> Index Scan using time_idx on tran (cost=0.00..17.11 rows=5 width=32) (actual time=0.066..40.742 rows=24923 loops=1) Index Cond: (("time" >= ('2004-03-01 00:00:00'::timestamp without time zone)::timestamp with time zone) AND ("time" <= ('2004-03-08 23:59:59'::timestamp without time zone)::timestamp with time zone)) -> Index Scan using id_system_idx on seat (cost=0.00..4.83 rows=1 width=14) (actual time=0.008..0.011 rows=3 loops=24923) Index Cond: (("outer".id = seat.bt_id) AND ("outer".system = seat.system)) -> Index Scan using teatri_pkey on teatri (cost=0.00..4.82 rows=1 width=13) (actual time=0.011..0.012 rows=1 loops=68174) Index Cond: (teatri.code = "outer".thea_code) -> Index Sc
Re: [GENERAL] postgres FROM clause problem (GROUP BY subquestion)
Alle 20:14, giovedì 11 marzo 2004, hai scritto: > On Thu, Mar 11, 2004 at 09:43:57 +0100, > > Paolo Tavalazzi <[EMAIL PROTECTED]> wrote: > > Alle 19:12, mercoledì 10 marzo 2004, hai scritto: > > > On Wed, Mar 10, 2004 at 18:33:41 +0100, > > > > > > Paolo Tavalazzi <[EMAIL PROTECTED]> wrote: > > > > I don't know what to make, help me please! > > > > > > > > Another question, in 7.4 GROUP BY clause not guarantee the ordering > > > > of result. Can I desable the new group by algorithm to maintain the > > > > coherence whit the programs that I have in production with pg_7.3 > > > > > > GROUP BY never guarenteed an order. That this happened in 7.3 was an > > > implementation detail. If you want a specific order you need to use an > > > ORDER BY clause. > > > > Seeing that GROUP BY in 7.3 required sorting the input data to bring > > group members together, this guaranteed to me also the order of the > > attributes in GROUP BY clause. > > This might be a language difference. You weren't guarenteed (or promised) > that the output would be sorted. It just happened that way because of how > the database implemented the group by. By reading the source code for > that version it would be possible to determine that the output would > always be sorted. However, that could change at any upgrade or if you > switched databases. You have reason. My deduction comes only from the reading of the code. I thought to have read it also in the postgres documentation,but it is not true. I have seen that the algorithm of group by in 7.4 is various then 7.3,and it seems more efficent. Therefore, i have changed my program to manage the order by clause. Thank you ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] postgres FROM clause problem (GROUP BY subquestion)
On Thu, Mar 11, 2004 at 09:43:57 +0100, Paolo Tavalazzi <[EMAIL PROTECTED]> wrote: > Alle 19:12, mercoledì 10 marzo 2004, hai scritto: > > On Wed, Mar 10, 2004 at 18:33:41 +0100, > > > > Paolo Tavalazzi <[EMAIL PROTECTED]> wrote: > > > I don't know what to make, help me please! > > > > > > Another question, in 7.4 GROUP BY clause not guarantee the ordering of > > > result. Can I desable the new group by algorithm to maintain the > > > coherence whit the programs that I have in production with pg_7.3 > > > > GROUP BY never guarenteed an order. That this happened in 7.3 was an > > implementation detail. If you want a specific order you need to use an > > ORDER BY clause. > > > Seeing that GROUP BY in 7.3 required sorting the input data to bring group members > together, > this guaranteed to me also the order of the attributes in GROUP BY clause. This might be a language difference. You weren't guarenteed (or promised) that the output would be sorted. It just happened that way because of how the database implemented the group by. By reading the source code for that version it would be possible to determine that the output would always be sorted. However, that could change at any upgrade or if you switched databases. ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] postgres FROM clause problem
Paolo Tavalazzi <[EMAIL PROTECTED]> writes: > [ query plans after updating to 7.4.2 ] Okay, they're certainly a lot closer than before, so I think I was right that you were getting bitten somehow by the pg_statistic alignment problem. It seems like there may still be a bug lurking though. The rows-out estimates are 7 versus 9 for the middle nested-loop join. That might just be roundoff error, or there may be something else going on. Also the estimates for the top join are 7 versus 1 rows, which seems too large a ratio to be explainable as roundoff error. Could I pester you to send me a pg_dump dump of this database off-list? I'd like to trace through the planner with a debugger and see exactly where the results are diverging. regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] postgres FROM clause problem
Alle 19:40, mercoledì 10 marzo 2004, hai scritto: > Paolo Tavalazzi <[EMAIL PROTECTED]> writes: > > I have applied the procedure for fixing pg_statistic as you had said, > > but the result is the same! > > Hm. It could be a planner bug. Can you reproduce the misbehavior if > you dump and load the tables into a fresh database? If so, could you > send me the dump so I can look at the problem with a debugger? > > (Also, you might try updating to 7.4.2 first and see if that changes > anything. We did fix quite a number of bugs already in 7.4.2...) > > > Another question, in 7.4 GROUP BY clause not guarantee the ordering of > > result. Can I desable the new group by algorithm to maintain the > > coherence whit the programs that I have in production with pg_7.3 > > As Bruno said, your programs are broken because they are assuming > something not guaranteed by the SQL spec. But until you get around to > adding the ORDER BY clauses they should have, see enable_hashagg. > > regards, tom lane I update to 7.4.2 (dump - initdb -restore) and I have had this result : QUERY 1 explain analyze SELECT teatri.code, spettacoli.code, sum(seat.price), sum(seat.prev) FROM seat, spettacoli, tran , teatri WHERE tran.time >= timestamp '2004-02-20 00:00:00' AND tran.time <= timestamp '2004-03-08 23:59:59' AND tran.perf_code = spettacoli.code AND tran.thea_code = spettacoli.teatro AND tran.id = seat.bt_id AND tran.system = seat.system AND spettacoli.system= tran.system AND spettacoli.teatro= teatri.code GROUP BY 1,2 EXPLAIN output : HashAggregate (cost=8057.62..8057.63 rows=1 width=32) (actual time=2711.656..2713.337 rows=1317 loops=1) -> Nested Loop (cost=1538.40..8057.61 rows=1 width=32) (actual time=648.011..2425.314 rows=67218 loops=1) -> Nested Loop (cost=1538.40..8015.80 rows=9 width=45) (actual time=647.968..1494.797 rows=68167 loops=1) -> Merge Join (cost=1538.40..7920.98 rows=16 width=45) (actual time=647.909..1087.032 rows=24919 loops=1) Merge Cond: (("outer".system = "inner".system) AND ("outer".teatro = "inner".thea_code) AND (("outer".code)::text = "inner"."?column5?")) -> Index Scan using spe_sys_tea_perf on spettacoli (cost=0.00..5836.07 rows=56079 width=26) (actual time=0.049..64.118 rows=55565 loops=1) -> Sort (cost=1538.40..1569.84 rows=12576 width=30) (actual time=647.374..653.048 rows=24923 loops=1) Sort Key: tran.system, tran.thea_code, (tran.perf_code)::text -> Index Scan using time_idx on tran (cost=0.00..682.08 rows=12576 width=30) (actual time=0.082..73.057 rows=24923 loops=1) Index Cond: (("time" >= ('2004-03-01 00:00:00'::timestamp without time zone)::timestamp with time zone) AND ("time" <= ('2004-03-08 23:59:59'::timestamp without time zone)::timestamp with time zone)) -> Index Scan using id_system_idx on seat (cost=0.00..5.91 rows=1 width=14) (actual time=0.007..0.010 rows=3 loops=24919) Index Cond: (("outer".id = seat.bt_id) AND ("outer".system = seat.system)) -> Index Scan using teatri_pkey on teatri (cost=0.00..4.63 rows=1 width=13) (actual time=0.010..0.011 rows=1 loops=68167) Index Cond: (teatri.code = "outer".thea_code) Total runtime: 2717.573 ms QUERY 2 explain analyze SELECT teatri.code, spettacoli.code, sum(seat.price), sum(seat.prev) FROM seat, teatri, tran , spettacoli WHERE tran.time >= timestamp '2004-03-01 00:00:00' AND tran.time <= timestamp '2004-03-08 23:59:59' AND tran.perf_code = spettacoli.code AND tran.thea_code = spettacoli.teatro AND tran.id = seat.bt_id AND tran.system = seat.system AND spettacoli.system= tran.system AND spettacoli.teatro= teatri.code GROUP BY 1,2 EXPLAIN output : HashAggregate (cost=8057.62..8057.63 rows=1 width=32) (actual time=2711.656..2713.337 rows=1317 loops=1) -> Nested Loop (cost=1538.40..8057.61 rows=1 width=32) (actual time=648.011..2425.314 rows=67218 loops=1) -> Nested Loop (cost=1538.40..8015.80 rows=9 width=45) (actual time=647.968..1494.797 rows=68167 loops=1) -> Merge Join (cost=1538.40..7920.98 rows=16 width=45) (actual time=647.909..1087.032 rows=24919 loops=1) Merge Cond: (("outer".system = "inner".system) AND ("outer".teatro = "inner".thea_code) AND (("outer".code)::text = "inner"."?column5?")) -> Index Scan using spe_sys_tea_perf on spettacoli (cost=0.00..5836.07 rows=56079 width=26) (actual time=0.049..64.118 rows=55565 loops=1)
Re: [GENERAL] postgres FROM clause problem
Paolo Tavalazzi <[EMAIL PROTECTED]> writes: > But the query plans are still various!! I think you made a copy-and-paste mistake, because the explain results you posted are exactly the same ... regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] postgres FROM clause problem
Alle 18:03, giovedì 11 marzo 2004, hai scritto: > Paolo Tavalazzi <[EMAIL PROTECTED]> writes: > > But the query plans are still various!! > > I think you made a copy-and-paste mistake, because the explain results > you posted are exactly the same ... > > regards, tom lane Excuse I have mistaken!! QUERY 1 explain analyze SELECT teatri.code, spettacoli.code, sum(seat.price), sum(seat.prev) FROM seat, spettacoli, tran , teatri WHERE tran.time >= timestamp '2004-02-20 00:00:00' AND tran.time <= timestamp '2004-03-08 23:59:59' AND tran.perf_code = spettacoli.code AND tran.thea_code = spettacoli.teatro AND tran.id = seat.bt_id AND tran.system = seat.system AND spettacoli.system= tran.system AND spettacoli.teatro= teatri.code GROUP BY 1,2 EXPLAIN : HashAggregate (cost=8042.89..8042.92 rows=7 width=32) (actual time=2069.895..2071.505 rows=1317 loops=1) -> Hash Join (cost=8015.82..8042.82 rows=7 width=32) (actual time=1538.771..1779.257 rows=67218 loops=1) Hash Cond: ("outer".code = "inner".teatro) -> Seq Scan on teatri (cost=0.00..22.62 rows=862 width=13) (actual time=0.008..0.572 rows=862 loops=1) -> Hash (cost=8015.80..8015.80 rows=7 width=45) (actual time=1538.652..1538.652 rows=0 loops=1) -> Nested Loop (cost=1538.40..8015.80 rows=7 width=45) (actual time=652.105..1486.577 rows=68167 loops=1) -> Merge Join (cost=1538.40..7920.98 rows=16 width=45) (actual time=652.045..1095.559 rows=24919 loops=1) Merge Cond: (("outer".system = "inner".system) AND ("outer".teatro = "inner".thea_code) AND (("outer".code)::text = "inner"."?column5?")) -> Index Scan using spe_sys_tea_perf on spettacoli (cost=0.00..5836.07 rows=56079 width=26) (actual time=0.049..63.556 rows=55565 loops=1) -> Sort (cost=1538.40..1569.84 rows=12576 width=30) (actual time=651.509..656.391 rows=24923 loops=1) Sort Key: tran.system, tran.thea_code, (tran.perf_code)::text -> Index Scan using time_idx on tran (cost=0.00..682.08 rows=12576 width=30) (actual time=0.083..69.887 rows=24923 loops=1) Index Cond: (("time" >= ('2004-03-01 00:00:00'::timestamp without time zone)::timestamp with time zone) AND ("time" <= ('2004-03-08 23:59:59'::timestamp without time zone)::timestamp with time zone)) -> Index Scan using id_system_idx on seat (cost=0.00..5.91 rows=1 width=14) (actual time=0.006..0.010 rows=3 loops=24919) Index Cond: (("outer".id = seat.bt_id) AND ("outer".system = seat.system)) Total runtime: 2076.726 ms QUERY 2 : explain anlyze SELECT teatri.code, spettacoli.code, sum(seat.price), sum(seat.prev) FROM seat, teatri, tran , spettacoli WHERE tran.time >= timestamp '2004-03-01 00:00:00' AND tran.time <= timestamp '2004-03-08 23:59:59' AND tran.perf_code = spettacoli.code AND tran.thea_code = spettacoli.teatro AND tran.id = seat.bt_id AND tran.system = seat.system AND spettacoli.system= tran.system AND spettacoli.teatro= teatri.code GROUP BY 1,2 EXPLAIN output : HashAggregate (cost=8057.62..8057.63 rows=1 width=32) (actual time=2728.066..2729.738 rows=1317 loops=1) -> Nested Loop (cost=1538.40..8057.61 rows=1 width=32) (actual time=665.122..2438.275 rows=67218 loops=1) -> Nested Loop (cost=1538.40..8015.80 rows=9 width=45) (actual time=665.078..1509.890 rows=68167 loops=1) -> Merge Join (cost=1538.40..7920.98 rows=16 width=45) (actual time=665.018..1101.716 rows=24919 loops=1) Merge Cond: (("outer".system = "inner".system) AND ("outer".teatro = "inner".thea_code) AND (("outer".code)::text = "inner"."?column5?")) -> Index Scan using spe_sys_tea_perf on spettacoli (cost=0.00..5836.07 rows=56079 width=26) (actual time=0.046..63.772 rows=55565 loops=1) -> Sort (cost=1538.40..1569.84 rows=12576 width=30) (actual time=664.481..669.947 rows=24923 loops=1) Sort Key: tran.system, tran.thea_code, (tran.perf_code)::text -> Index Scan using time_idx on tran (cost=0.00..682.08 rows=12576 width=30) (actual time=0.080..70.663 rows=24923 loops=1) Index Cond: (("time" >= ('2004-03-01 00:00:00'::timestamp without time zone)::timestamp with time zone) AND ("time" <= ('2004-03-08 23:59:59'::timestamp without time zone)::timestamp with time zone)) -> Index Scan using id_system_id
Re: [GENERAL] postgres FROM clause problem
On Wed, Mar 10, 2004 at 06:33:41PM +0100, Paolo Tavalazzi wrote: > Another question, in 7.4 GROUP BY clause not guarantee the ordering of result. > Can I desable the new group by algorithm to maintain the coherence whit the programs > that I have in production with pg_7.3 Uf you're expecting ordered output, perhaps you should add an ORDER BY clause? -- Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/ > If the Catholic church can survive the printing press, science fiction > will certainly weather the advent of bookwarez. >http://craphound.com/ebooksneitherenorbooks.txt - Cory Doctorow pgp0.pgp Description: PGP signature
Re: [GENERAL] postgres FROM clause problem (GROUP BY subquestion)
On Wed, Mar 10, 2004 at 18:33:41 +0100, Paolo Tavalazzi <[EMAIL PROTECTED]> wrote: > > I don't know what to make, help me please! > > Another question, in 7.4 GROUP BY clause not guarantee the ordering of result. > Can I desable the new group by algorithm to maintain the coherence whit the programs > that I have in production with pg_7.3 GROUP BY never guarenteed an order. That this happened in 7.3 was an implementation detail. If you want a specific order you need to use an ORDER BY clause. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] postgres FROM clause problem
Paolo Tavalazzi <[EMAIL PROTECTED]> writes: > I have applied the procedure for fixing pg_statistic as you had said, but the > result is the same! Hm. It could be a planner bug. Can you reproduce the misbehavior if you dump and load the tables into a fresh database? If so, could you send me the dump so I can look at the problem with a debugger? (Also, you might try updating to 7.4.2 first and see if that changes anything. We did fix quite a number of bugs already in 7.4.2...) > Another question, in 7.4 GROUP BY clause not guarantee the ordering of result. > Can I desable the new group by algorithm to maintain the coherence whit the programs > that I have in production with pg_7.3 As Bruno said, your programs are broken because they are assuming something not guaranteed by the SQL spec. But until you get around to adding the ORDER BY clauses they should have, see enable_hashagg. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] postgres FROM clause problem
Alle 16:54, mercoledì 10 marzo 2004, hai scritto: > Paolo Tavalazzi <[EMAIL PROTECTED]> writes: > > I have two query that they are different only for order of the tables > > in FROM lclause , but give back different query plan : > > Hm, seems like the planner is making wacko estimates in the second case. > You didn't say what data types are involved in this query --- are any of > the join columns int8 or float8 or timestamp? If so you might be > getting bitten by the 7.4 pg_statistic alignment bug. Please follow the > repair procedures suggested in the 7.4.2 release notes: > http://developer.postgresql.org/docs/postgres/release.html#RELEASE-7-4-2 > and see if that improves matters. > > regards, tom lane I have applied the procedure for fixing pg_statistic as you had said, but the result is the same! Only tran.time in the query is a timestamp , no int8 or float8. The OS is FEDORA 0.96 x86_64 and the flag --enable-integer-datetimes is false,it could be a problem?? I don't know what to make, help me please! Another question, in 7.4 GROUP BY clause not guarantee the ordering of result. Can I desable the new group by algorithm to maintain the coherence whit the programs that I have in production with pg_7.3 Thank you !!! ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] postgres FROM clause problem
Paolo Tavalazzi <[EMAIL PROTECTED]> writes: > I have two query that they are different only for order of the tables > in FROM lclause , but give back different query plan : Hm, seems like the planner is making wacko estimates in the second case. You didn't say what data types are involved in this query --- are any of the join columns int8 or float8 or timestamp? If so you might be getting bitten by the 7.4 pg_statistic alignment bug. Please follow the repair procedures suggested in the 7.4.2 release notes: http://developer.postgresql.org/docs/postgres/release.html#RELEASE-7-4-2 and see if that improves matters. regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend