Re: [HACKERS] about explain analyze

2003-11-28 Thread Marc G. Fournier

did you happen to take a look at the other set of queries I sent you, that
were showing 39k and 41k explain results?  do they show pretty much the
same 'lag' in gettimeofday()?

On Fri, 28 Nov 2003, Tom Lane wrote:

> > ... if its just me mis-reading the numbers, let me
> > know ... it just "feels" off
>
> Here's what I see:
>
> > time psql -c "explain analyze SELECT count(rec_id) FROM url" 186_archives
>  QUERY PLAN
> 
>  Aggregate  (cost=11177.19..11177.19 rows=1 width=4) (actual time=3539.180..3539.184 
> rows=1 loops=1)
>->  Seq Scan on url  (cost=0.00..10741.55 rows=174255 width=4) (actual 
> time=0.111..2292.234 rows=215552 loops=1)
>  Total runtime: 3539.922 ms
> (3 rows)
>
> 0.006u 0.000s 0:03.57 0.0%  0+0k 0+0io 0pf+0w
>
> So there doesn't seem to be any discrepancy between what EXPLAIN says
> and what time(1) says.  It is true that the overhead of EXPLAIN ANALYZE
> looks a bit high:
>
> > time psql -c "SELECT count(rec_id) FROM url" 186_archives
>  count
> 
>  215169
> (1 row)
>
> 0.000u 0.004s 0:00.77 0.0%  0+0k 0+0io 0pf+0w
>
> If I'm doing the arithmetic correctly this makes the measurement
> overhead about 13 microseconds per row.  Presumably that's almost
> entirely spent on the two gettimeofday() calls, so they are costing
> upwards of 6 microseconds apiece, which seems like a lot for a modern
> processor.  Might be worth griping to the BSD kernel folk...
>
> Still though I think we could be proud that we've gotten the price of
> a seqscan down to the point where a couple of gettimeofday() calls per
> row are dominating the runtime.
>
>   regards, tom lane
>


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] about explain analyze

2003-11-28 Thread Tom Lane
"Marc G. Fournier" <[EMAIL PROTECTED]> writes:
> did you happen to take a look at the other set of queries I sent you, that
> were showing 39k and 41k explain results?

You mean this one?

> time psql -c "explain analyze select max(intag) from ndict3" 186_archives
 QUERY PLAN

 Aggregate  (cost=40168.96..40168.96 rows=1 width=4) (actual time=48953.823..48953.827 
rows=1 loops=1)
   ->  Seq Scan on ndict3  (cost=0.00..34560.57 rows=2243357 width=4) (actual 
time=4.903..-666785.605 rows=3516680 loops=1)
 Total runtime: 48982.514 ms
(3 rows)

0.000u 0.005s 0:49.06 0.0%  0+0k 0+0io 0pf+0w
> time psql -c "select max(intag) from ndict3" 186_archives
max

 2147418368
(1 row)

0.000u 0.005s 0:03.06 0.0%  0+0k 0+0io 0pf+0w

This looks like 46 seconds of overhead for 3516680 rows, or still right
about 13 microseconds per row, so that's consistent.  The negative
"actual time" measurement for the Seq Scan row seems pretty broken
though :-(.

Now that I recall, didn't you complain of something similar with a beta?

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] about explain analyze

2003-11-28 Thread Teodor Sigaev
Tom, I am afraid that I don't understand. My first example:
wow=# explain analyze select max(click.accesses) from click;
   QUERY PLAN
-
 Aggregate  (cost=1103.70..1103.70 rows=1 width=8) (actual 
time=289.391..289.393 rows=1 loops=1)
   ->  Seq Scan on click  (cost=0.00..971.36 rows=52936 width=8) (actual 
time=0.013..133.943 rows=52936 loops=1)
 Total runtime: 289.498 ms
(3 rows)

Time: 290,695 ms

select max() returns only one row... If you mean rows in Seq Scan than
gettimeofday was called (52936+1)*2. Huge value :(


Tom Lane wrote:
Teodor Sigaev <[EMAIL PROTECTED]> writes:

How many times is gettimeofday called?


Twice per plan node visit, if you are doing EXPLAIN ANALYZE.  (The
number of "visits" is one more than the number of rows returned.)
			regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
--
Teodor Sigaev  E-mail: [EMAIL PROTECTED]
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] about explain analyze

2003-11-28 Thread Marc G. Fournier
On Fri, 28 Nov 2003, Tom Lane wrote:

> "Marc G. Fournier" <[EMAIL PROTECTED]> writes:
> > did you happen to take a look at the other set of queries I sent you, that
> > were showing 39k and 41k explain results?
>
> You mean this one?

Yup ...

> > time psql -c "explain analyze select max(intag) from ndict3" 186_archives
>  QUERY PLAN
> 
>  Aggregate  (cost=40168.96..40168.96 rows=1 width=4) (actual 
> time=48953.823..48953.827 rows=1 loops=1)
>->  Seq Scan on ndict3  (cost=0.00..34560.57 rows=2243357 width=4) (actual 
> time=4.903..-666785.605 rows=3516680 loops=1)
>  Total runtime: 48982.514 ms
> (3 rows)
>
> 0.000u 0.005s 0:49.06 0.0%  0+0k 0+0io 0pf+0w
> > time psql -c "select max(intag) from ndict3" 186_archives
> max
> 
>  2147418368
> (1 row)
>
> 0.000u 0.005s 0:03.06 0.0%  0+0k 0+0io 0pf+0w
>
> This looks like 46 seconds of overhead for 3516680 rows, or still right
> about 13 microseconds per row, so that's consistent.  The negative
> "actual time" measurement for the Seq Scan row seems pretty broken
> though :-(.
>
> Now that I recall, didn't you complain of something similar with a beta?

Yup ... and I bet its not reproducible yet again, is it? :)  That would
make for twice though, with v7.4, that I've come up with - results :)


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] about explain analyze

2003-11-28 Thread Tom Lane
Teodor Sigaev <[EMAIL PROTECTED]> writes:
> Tom, I am afraid that I don't understand. My first example:
> wow=# explain analyze select max(click.accesses) from click;
> QUERY PLAN
> -
>   Aggregate  (cost=1103.70..1103.70 rows=1 width=8) (actual 
> time=289.391..289.393 rows=1 loops=1)
> ->  Seq Scan on click  (cost=0.00..971.36 rows=52936 width=8) (actual 
> time=0.013..133.943 rows=52936 loops=1)
>   Total runtime: 289.498 ms
> (3 rows)

> Time: 290,695 ms

> select max() returns only one row... If you mean rows in Seq Scan than
> gettimeofday was called (52936+1)*2. Huge value :(

Right, that many times at the Seq Scan plan node, and (1+1)*2 more times
at the Aggregate node.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] about explain analyze

2003-11-28 Thread Tom Lane
"Marc G. Fournier" <[EMAIL PROTECTED]> writes:
> On Fri, 28 Nov 2003, Tom Lane wrote:
>> Now that I recall, didn't you complain of something similar with a beta?

> Yup ... and I bet its not reproducible yet again, is it? :)  That would
> make for twice though, with v7.4, that I've come up with - results :)

Well, it's not reproducibly negative, but it seems reproducibly wrong:

 Aggregate  (cost=40168.96..40168.96 rows=1 width=4) (actual time=49641.603..49641.611 
rows=1 loops=1)
   ->  Seq Scan on ndict3  (cost=0.00..34560.57 rows=2243357 width=4) (actual 
time=34.854..724754.474 rows=3570252 loops=1)
 Total runtime: 49688.524 ms

 Aggregate  (cost=40168.96..40168.96 rows=1 width=4) (actual time=36625.013..36625.018 
rows=1 loops=1)
   ->  Seq Scan on ndict3  (cost=0.00..34560.57 rows=2243357 width=4) (actual 
time=0.128..-676113.173 rows=3572298 loops=1)
 Total runtime: 36625.779 ms

 Aggregate  (cost=40168.96..40168.96 rows=1 width=4) (actual time=41380.881..41380.885 
rows=1 loops=1)
   ->  Seq Scan on ndict3  (cost=0.00..34560.57 rows=2243357 width=4) (actual 
time=0.091..718200.092 rows=3575264 loops=1)
 Total runtime: 41381.504 ms
(3 rows)

I'm wondering if there's an actual bug in gettimeofday() in this
platform, such that once in a while it returns a value that's off
a minute or so ...

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] about explain analyze

2003-11-28 Thread Marc G. Fournier

None that I've heard of ... Sean?

Tom, any way of writing a quite test C program for this?  Something to
'simulate' the same thing, but without having to build the whole
postmaster?

On Fri, 28 Nov 2003, Tom Lane wrote:

> "Marc G. Fournier" <[EMAIL PROTECTED]> writes:
> > On Fri, 28 Nov 2003, Tom Lane wrote:
> >> Now that I recall, didn't you complain of something similar with a beta?
>
> > Yup ... and I bet its not reproducible yet again, is it? :)  That would
> > make for twice though, with v7.4, that I've come up with - results :)
>
> Well, it's not reproducibly negative, but it seems reproducibly wrong:
>
>  Aggregate  (cost=40168.96..40168.96 rows=1 width=4) (actual 
> time=49641.603..49641.611 rows=1 loops=1)
>->  Seq Scan on ndict3  (cost=0.00..34560.57 rows=2243357 width=4) (actual 
> time=34.854..724754.474 rows=3570252 loops=1)
>  Total runtime: 49688.524 ms
>
>  Aggregate  (cost=40168.96..40168.96 rows=1 width=4) (actual 
> time=36625.013..36625.018 rows=1 loops=1)
>->  Seq Scan on ndict3  (cost=0.00..34560.57 rows=2243357 width=4) (actual 
> time=0.128..-676113.173 rows=3572298 loops=1)
>  Total runtime: 36625.779 ms
>
>  Aggregate  (cost=40168.96..40168.96 rows=1 width=4) (actual 
> time=41380.881..41380.885 rows=1 loops=1)
>->  Seq Scan on ndict3  (cost=0.00..34560.57 rows=2243357 width=4) (actual 
> time=0.091..718200.092 rows=3575264 loops=1)
>  Total runtime: 41381.504 ms
> (3 rows)
>
> I'm wondering if there's an actual bug in gettimeofday() in this
> platform, such that once in a while it returns a value that's off
> a minute or so ...
>
>   regards, tom lane
>


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] about explain analyze

2003-11-28 Thread Tom Lane
> ... if its just me mis-reading the numbers, let me
> know ... it just "feels" off

Here's what I see:

> time psql -c "explain analyze SELECT count(rec_id) FROM url" 186_archives
 QUERY PLAN

 Aggregate  (cost=11177.19..11177.19 rows=1 width=4) (actual time=3539.180..3539.184 
rows=1 loops=1)
   ->  Seq Scan on url  (cost=0.00..10741.55 rows=174255 width=4) (actual 
time=0.111..2292.234 rows=215552 loops=1)
 Total runtime: 3539.922 ms
(3 rows)

0.006u 0.000s 0:03.57 0.0%  0+0k 0+0io 0pf+0w

So there doesn't seem to be any discrepancy between what EXPLAIN says
and what time(1) says.  It is true that the overhead of EXPLAIN ANALYZE
looks a bit high:

> time psql -c "SELECT count(rec_id) FROM url" 186_archives
 count

 215169
(1 row)

0.000u 0.004s 0:00.77 0.0%  0+0k 0+0io 0pf+0w

If I'm doing the arithmetic correctly this makes the measurement
overhead about 13 microseconds per row.  Presumably that's almost
entirely spent on the two gettimeofday() calls, so they are costing
upwards of 6 microseconds apiece, which seems like a lot for a modern
processor.  Might be worth griping to the BSD kernel folk...

Still though I think we could be proud that we've gotten the price of
a seqscan down to the point where a couple of gettimeofday() calls per
row are dominating the runtime.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] about explain analyze

2003-11-28 Thread Tom Lane
Teodor Sigaev <[EMAIL PROTECTED]> writes:
> How many times is gettimeofday called?

Twice per plan node visit, if you are doing EXPLAIN ANALYZE.  (The
number of "visits" is one more than the number of rows returned.)

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] about explain analyze

2003-11-28 Thread Teodor Sigaev
How many times is gettimeofday called?
wow=# explain analyze select 1;
 QUERY PLAN 


 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.008..0.010 rows=1 loops=1)
 Total runtime: 0.047 ms
(2 rows)
Several thousands?

Tom Lane wrote:
Teodor Sigaev <[EMAIL PROTECTED]> writes:

Explain analyze takes 3 times more time for execution. Why?


Measurement overhead.  It would seem your platform has a particularly
slow version of gettimeofday() though ... I've never noticed such a
large discrepancy myself.
			regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
--
Teodor Sigaev  E-mail: [EMAIL PROTECTED]
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] about explain analyze

2003-11-28 Thread Marc G. Fournier
On Fri, 28 Nov 2003, Tom Lane wrote:

> "Marc G. Fournier" <[EMAIL PROTECTED]> writes:
> > Actually, I'm noticing similar "problems" with v7.4 ... "EXPLAIN ANALYZE"
> > seems to be showing some *very* high ms for execution time, but if you run
> > the actual query, it doesn't seem to take even 1/10th the time reported
> > ...
>
> Example?  I don't see anything out of line here.

pgsql74# /usr/local/pgsql/bin/psql -U pgsql -c "SELECT count(rec_id) FROM url" 
186_archives
 count

 209872
(1 row)

pgsql74# /usr/local/pgsql/bin/psql -U pgsql -c "EXPLAIN ANALYZE SELECT max(rec_id) 
FROM url" 186_archives
 QUERY PLAN

 Aggregate  (cost=11177.19..11177.19 rows=1 width=4) (actual time=2400.579..2400.585 
rows=1 loops=1)
   ->  Seq Scan on url  (cost=0.00..10741.55 rows=174255 width=4) (actual 
time=0.056..1387.803 rows=209872 loops=1)
 Total runtime: 2407.095 ms
(3 rows)

pgsql74# time /usr/local/pgsql/bin/psql -U pgsql -c "SELECT max(rec_id) FROM url" 
186_archives
  max

 690582
(1 row)

0.000u 0.004s 0:00.32 0.0%  0+0k 0+0io 0pf+0w


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] about explain analyze

2003-11-28 Thread Tom Lane
"Marc G. Fournier" <[EMAIL PROTECTED]> writes:
> Actually, I'm noticing similar "problems" with v7.4 ... "EXPLAIN ANALYZE"
> seems to be showing some *very* high ms for execution time, but if you run
> the actual query, it doesn't seem to take even 1/10th the time reported
> ...

Example?  I don't see anything out of line here.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] about explain analyze

2003-11-28 Thread Marc G. Fournier
On Fri, 28 Nov 2003, Tom Lane wrote:

> Teodor Sigaev <[EMAIL PROTECTED]> writes:
> > Explain analyze takes 3 times more time for execution. Why?
>
> Measurement overhead.  It would seem your platform has a particularly
> slow version of gettimeofday() though ... I've never noticed such a
> large discrepancy myself.

Actually, I'm noticing similar "problems" with v7.4 ... "EXPLAIN ANALYZE"
seems to be showing some *very* high ms for execution time, but if you run
the actual query, it doesn't seem to take even 1/10th the time reported
...

This is on a FreeBSD 4.x box ...


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] about explain analyze

2003-11-28 Thread Tom Lane
Teodor Sigaev <[EMAIL PROTECTED]> writes:
> Explain analyze takes 3 times more time for execution. Why?

Measurement overhead.  It would seem your platform has a particularly
slow version of gettimeofday() though ... I've never noticed such a
large discrepancy myself.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings