Re: [HACKERS] about explain analyze
did you happen to take a look at the other set of queries I sent you, that were showing 39k and 41k explain results? do they show pretty much the same 'lag' in gettimeofday()? On Fri, 28 Nov 2003, Tom Lane wrote: > > ... if its just me mis-reading the numbers, let me > > know ... it just "feels" off > > Here's what I see: > > > time psql -c "explain analyze SELECT count(rec_id) FROM url" 186_archives > QUERY PLAN > > Aggregate (cost=11177.19..11177.19 rows=1 width=4) (actual time=3539.180..3539.184 > rows=1 loops=1) >-> Seq Scan on url (cost=0.00..10741.55 rows=174255 width=4) (actual > time=0.111..2292.234 rows=215552 loops=1) > Total runtime: 3539.922 ms > (3 rows) > > 0.006u 0.000s 0:03.57 0.0% 0+0k 0+0io 0pf+0w > > So there doesn't seem to be any discrepancy between what EXPLAIN says > and what time(1) says. It is true that the overhead of EXPLAIN ANALYZE > looks a bit high: > > > time psql -c "SELECT count(rec_id) FROM url" 186_archives > count > > 215169 > (1 row) > > 0.000u 0.004s 0:00.77 0.0% 0+0k 0+0io 0pf+0w > > If I'm doing the arithmetic correctly this makes the measurement > overhead about 13 microseconds per row. Presumably that's almost > entirely spent on the two gettimeofday() calls, so they are costing > upwards of 6 microseconds apiece, which seems like a lot for a modern > processor. Might be worth griping to the BSD kernel folk... > > Still though I think we could be proud that we've gotten the price of > a seqscan down to the point where a couple of gettimeofday() calls per > row are dominating the runtime. > > regards, tom lane > Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] about explain analyze
"Marc G. Fournier" <[EMAIL PROTECTED]> writes: > did you happen to take a look at the other set of queries I sent you, that > were showing 39k and 41k explain results? You mean this one? > time psql -c "explain analyze select max(intag) from ndict3" 186_archives QUERY PLAN Aggregate (cost=40168.96..40168.96 rows=1 width=4) (actual time=48953.823..48953.827 rows=1 loops=1) -> Seq Scan on ndict3 (cost=0.00..34560.57 rows=2243357 width=4) (actual time=4.903..-666785.605 rows=3516680 loops=1) Total runtime: 48982.514 ms (3 rows) 0.000u 0.005s 0:49.06 0.0% 0+0k 0+0io 0pf+0w > time psql -c "select max(intag) from ndict3" 186_archives max 2147418368 (1 row) 0.000u 0.005s 0:03.06 0.0% 0+0k 0+0io 0pf+0w This looks like 46 seconds of overhead for 3516680 rows, or still right about 13 microseconds per row, so that's consistent. The negative "actual time" measurement for the Seq Scan row seems pretty broken though :-(. Now that I recall, didn't you complain of something similar with a beta? regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] about explain analyze
Tom, I am afraid that I don't understand. My first example: wow=# explain analyze select max(click.accesses) from click; QUERY PLAN - Aggregate (cost=1103.70..1103.70 rows=1 width=8) (actual time=289.391..289.393 rows=1 loops=1) -> Seq Scan on click (cost=0.00..971.36 rows=52936 width=8) (actual time=0.013..133.943 rows=52936 loops=1) Total runtime: 289.498 ms (3 rows) Time: 290,695 ms select max() returns only one row... If you mean rows in Seq Scan than gettimeofday was called (52936+1)*2. Huge value :( Tom Lane wrote: Teodor Sigaev <[EMAIL PROTECTED]> writes: How many times is gettimeofday called? Twice per plan node visit, if you are doing EXPLAIN ANALYZE. (The number of "visits" is one more than the number of rows returned.) regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings -- Teodor Sigaev E-mail: [EMAIL PROTECTED] ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] about explain analyze
On Fri, 28 Nov 2003, Tom Lane wrote: > "Marc G. Fournier" <[EMAIL PROTECTED]> writes: > > did you happen to take a look at the other set of queries I sent you, that > > were showing 39k and 41k explain results? > > You mean this one? Yup ... > > time psql -c "explain analyze select max(intag) from ndict3" 186_archives > QUERY PLAN > > Aggregate (cost=40168.96..40168.96 rows=1 width=4) (actual > time=48953.823..48953.827 rows=1 loops=1) >-> Seq Scan on ndict3 (cost=0.00..34560.57 rows=2243357 width=4) (actual > time=4.903..-666785.605 rows=3516680 loops=1) > Total runtime: 48982.514 ms > (3 rows) > > 0.000u 0.005s 0:49.06 0.0% 0+0k 0+0io 0pf+0w > > time psql -c "select max(intag) from ndict3" 186_archives > max > > 2147418368 > (1 row) > > 0.000u 0.005s 0:03.06 0.0% 0+0k 0+0io 0pf+0w > > This looks like 46 seconds of overhead for 3516680 rows, or still right > about 13 microseconds per row, so that's consistent. The negative > "actual time" measurement for the Seq Scan row seems pretty broken > though :-(. > > Now that I recall, didn't you complain of something similar with a beta? Yup ... and I bet its not reproducible yet again, is it? :) That would make for twice though, with v7.4, that I've come up with - results :) Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] about explain analyze
Teodor Sigaev <[EMAIL PROTECTED]> writes: > Tom, I am afraid that I don't understand. My first example: > wow=# explain analyze select max(click.accesses) from click; > QUERY PLAN > - > Aggregate (cost=1103.70..1103.70 rows=1 width=8) (actual > time=289.391..289.393 rows=1 loops=1) > -> Seq Scan on click (cost=0.00..971.36 rows=52936 width=8) (actual > time=0.013..133.943 rows=52936 loops=1) > Total runtime: 289.498 ms > (3 rows) > Time: 290,695 ms > select max() returns only one row... If you mean rows in Seq Scan than > gettimeofday was called (52936+1)*2. Huge value :( Right, that many times at the Seq Scan plan node, and (1+1)*2 more times at the Aggregate node. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] about explain analyze
"Marc G. Fournier" <[EMAIL PROTECTED]> writes: > On Fri, 28 Nov 2003, Tom Lane wrote: >> Now that I recall, didn't you complain of something similar with a beta? > Yup ... and I bet its not reproducible yet again, is it? :) That would > make for twice though, with v7.4, that I've come up with - results :) Well, it's not reproducibly negative, but it seems reproducibly wrong: Aggregate (cost=40168.96..40168.96 rows=1 width=4) (actual time=49641.603..49641.611 rows=1 loops=1) -> Seq Scan on ndict3 (cost=0.00..34560.57 rows=2243357 width=4) (actual time=34.854..724754.474 rows=3570252 loops=1) Total runtime: 49688.524 ms Aggregate (cost=40168.96..40168.96 rows=1 width=4) (actual time=36625.013..36625.018 rows=1 loops=1) -> Seq Scan on ndict3 (cost=0.00..34560.57 rows=2243357 width=4) (actual time=0.128..-676113.173 rows=3572298 loops=1) Total runtime: 36625.779 ms Aggregate (cost=40168.96..40168.96 rows=1 width=4) (actual time=41380.881..41380.885 rows=1 loops=1) -> Seq Scan on ndict3 (cost=0.00..34560.57 rows=2243357 width=4) (actual time=0.091..718200.092 rows=3575264 loops=1) Total runtime: 41381.504 ms (3 rows) I'm wondering if there's an actual bug in gettimeofday() in this platform, such that once in a while it returns a value that's off a minute or so ... regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] about explain analyze
None that I've heard of ... Sean? Tom, any way of writing a quite test C program for this? Something to 'simulate' the same thing, but without having to build the whole postmaster? On Fri, 28 Nov 2003, Tom Lane wrote: > "Marc G. Fournier" <[EMAIL PROTECTED]> writes: > > On Fri, 28 Nov 2003, Tom Lane wrote: > >> Now that I recall, didn't you complain of something similar with a beta? > > > Yup ... and I bet its not reproducible yet again, is it? :) That would > > make for twice though, with v7.4, that I've come up with - results :) > > Well, it's not reproducibly negative, but it seems reproducibly wrong: > > Aggregate (cost=40168.96..40168.96 rows=1 width=4) (actual > time=49641.603..49641.611 rows=1 loops=1) >-> Seq Scan on ndict3 (cost=0.00..34560.57 rows=2243357 width=4) (actual > time=34.854..724754.474 rows=3570252 loops=1) > Total runtime: 49688.524 ms > > Aggregate (cost=40168.96..40168.96 rows=1 width=4) (actual > time=36625.013..36625.018 rows=1 loops=1) >-> Seq Scan on ndict3 (cost=0.00..34560.57 rows=2243357 width=4) (actual > time=0.128..-676113.173 rows=3572298 loops=1) > Total runtime: 36625.779 ms > > Aggregate (cost=40168.96..40168.96 rows=1 width=4) (actual > time=41380.881..41380.885 rows=1 loops=1) >-> Seq Scan on ndict3 (cost=0.00..34560.57 rows=2243357 width=4) (actual > time=0.091..718200.092 rows=3575264 loops=1) > Total runtime: 41381.504 ms > (3 rows) > > I'm wondering if there's an actual bug in gettimeofday() in this > platform, such that once in a while it returns a value that's off > a minute or so ... > > regards, tom lane > Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] about explain analyze
> ... if its just me mis-reading the numbers, let me > know ... it just "feels" off Here's what I see: > time psql -c "explain analyze SELECT count(rec_id) FROM url" 186_archives QUERY PLAN Aggregate (cost=11177.19..11177.19 rows=1 width=4) (actual time=3539.180..3539.184 rows=1 loops=1) -> Seq Scan on url (cost=0.00..10741.55 rows=174255 width=4) (actual time=0.111..2292.234 rows=215552 loops=1) Total runtime: 3539.922 ms (3 rows) 0.006u 0.000s 0:03.57 0.0% 0+0k 0+0io 0pf+0w So there doesn't seem to be any discrepancy between what EXPLAIN says and what time(1) says. It is true that the overhead of EXPLAIN ANALYZE looks a bit high: > time psql -c "SELECT count(rec_id) FROM url" 186_archives count 215169 (1 row) 0.000u 0.004s 0:00.77 0.0% 0+0k 0+0io 0pf+0w If I'm doing the arithmetic correctly this makes the measurement overhead about 13 microseconds per row. Presumably that's almost entirely spent on the two gettimeofday() calls, so they are costing upwards of 6 microseconds apiece, which seems like a lot for a modern processor. Might be worth griping to the BSD kernel folk... Still though I think we could be proud that we've gotten the price of a seqscan down to the point where a couple of gettimeofday() calls per row are dominating the runtime. regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] about explain analyze
Teodor Sigaev <[EMAIL PROTECTED]> writes: > How many times is gettimeofday called? Twice per plan node visit, if you are doing EXPLAIN ANALYZE. (The number of "visits" is one more than the number of rows returned.) regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] about explain analyze
How many times is gettimeofday called? wow=# explain analyze select 1; QUERY PLAN Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.008..0.010 rows=1 loops=1) Total runtime: 0.047 ms (2 rows) Several thousands? Tom Lane wrote: Teodor Sigaev <[EMAIL PROTECTED]> writes: Explain analyze takes 3 times more time for execution. Why? Measurement overhead. It would seem your platform has a particularly slow version of gettimeofday() though ... I've never noticed such a large discrepancy myself. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings -- Teodor Sigaev E-mail: [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] about explain analyze
On Fri, 28 Nov 2003, Tom Lane wrote: > "Marc G. Fournier" <[EMAIL PROTECTED]> writes: > > Actually, I'm noticing similar "problems" with v7.4 ... "EXPLAIN ANALYZE" > > seems to be showing some *very* high ms for execution time, but if you run > > the actual query, it doesn't seem to take even 1/10th the time reported > > ... > > Example? I don't see anything out of line here. pgsql74# /usr/local/pgsql/bin/psql -U pgsql -c "SELECT count(rec_id) FROM url" 186_archives count 209872 (1 row) pgsql74# /usr/local/pgsql/bin/psql -U pgsql -c "EXPLAIN ANALYZE SELECT max(rec_id) FROM url" 186_archives QUERY PLAN Aggregate (cost=11177.19..11177.19 rows=1 width=4) (actual time=2400.579..2400.585 rows=1 loops=1) -> Seq Scan on url (cost=0.00..10741.55 rows=174255 width=4) (actual time=0.056..1387.803 rows=209872 loops=1) Total runtime: 2407.095 ms (3 rows) pgsql74# time /usr/local/pgsql/bin/psql -U pgsql -c "SELECT max(rec_id) FROM url" 186_archives max 690582 (1 row) 0.000u 0.004s 0:00.32 0.0% 0+0k 0+0io 0pf+0w Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] about explain analyze
"Marc G. Fournier" <[EMAIL PROTECTED]> writes: > Actually, I'm noticing similar "problems" with v7.4 ... "EXPLAIN ANALYZE" > seems to be showing some *very* high ms for execution time, but if you run > the actual query, it doesn't seem to take even 1/10th the time reported > ... Example? I don't see anything out of line here. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] about explain analyze
On Fri, 28 Nov 2003, Tom Lane wrote: > Teodor Sigaev <[EMAIL PROTECTED]> writes: > > Explain analyze takes 3 times more time for execution. Why? > > Measurement overhead. It would seem your platform has a particularly > slow version of gettimeofday() though ... I've never noticed such a > large discrepancy myself. Actually, I'm noticing similar "problems" with v7.4 ... "EXPLAIN ANALYZE" seems to be showing some *very* high ms for execution time, but if you run the actual query, it doesn't seem to take even 1/10th the time reported ... This is on a FreeBSD 4.x box ... Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] about explain analyze
Teodor Sigaev <[EMAIL PROTECTED]> writes: > Explain analyze takes 3 times more time for execution. Why? Measurement overhead. It would seem your platform has a particularly slow version of gettimeofday() though ... I've never noticed such a large discrepancy myself. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings