On Wed, Jun 07, 2006 at 03:32:35PM -0400, Tom Lane wrote:
> On reflection it's easy to imagine other cases where some iterations
> take much longer than others in a not-very-predictable way.  For
> instance, a join where only a subset of the outer tuples have a match
> is going to act that way.  I don't think there's any good way that we
> can be sure we have a representative sample of executions, and so I'm
> afraid this approach to sampling EXPLAIN ANALYZE is a failure.

I don't think we ever assumed this could never happen. We just assumed
that random sampling would cancel the effect out and give a decent
average.

Thing is, I never expected a three-order-of-magnitude difference
between samples. That's far too large for any averaging to correct.
Random sampling should counter most effects, and I didn't consider the
case where one tuple in a million takes vastly longer to be a
particularly common one.
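
To make that concrete, here's a quick simulation (made-up numbers, and
a fixed every-1000th-iteration sample instead of a random one, just for
simplicity): a million iterations at 1 usec each, except roughly one in
a hundred thousand takes a full second. The naive scaled-up estimate is
usually far too low, and jumps to far too high whenever a slow
iteration happens to land in the sample:

#!/usr/bin/perl
use strict; use warnings;

# ~10 of the 1M iterations cost 1,000,000 usec; the rest cost 1 usec
my @cost = map { rand() < 1e-5 ? 1_000_000 : 1 } 1 .. 1_000_000;
my $true = 0; $true += $_ for @cost;

# naive extrapolation: time every 1000th iteration and scale up
my ($sampled, $n) = (0, 0);
for (my $i = 0; $i < @cost; $i += 1000) { $sampled += $cost[$i]; $n++ }
my $naive = $sampled / $n * @cost;

printf "true total: %d usec, naive estimate: %d usec\n", $true, $naive;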

Anyway, as a test: if you take the approach that the measurement at
sample X applies only to the iterations since the previous sample, then
for the data you posted you get a result of 0.681148 seconds. How long
did the query that produced that data actually run?

(The bit of perl I used is:

perl -lne '
  BEGIN { $last = 0; $sum = 0 }
  # weight each sample by the iterations since the previous one: sum += gap * usec
  /: (\d+) usec in iter (\d+)/ && do { $sum += ($2 - $last) * $1; $last = $2 };
  END { print $sum }
' data)

> I propose we revert this patch and think about an interrupt-driven
> sampling method instead.

That's another possibility, of course...
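
For what it's worth, here's a rough sketch of the idea in perl (all
names made up, and the real thing would have to live in the executor in
C): a repeating timer fires, and the signal handler charges a tick to
whichever node is currently marked as running. Time spent in a rare
slow tuple then gets charged in proportion to wall-clock time, rather
than depending on which iterations happen to be sampled:

#!/usr/bin/perl
use strict; use warnings;
use Time::HiRes qw(setitimer ITIMER_REAL usleep);

my $current_node = 'idle';     # whatever "node" is executing right now
my %ticks;
$SIG{ALRM} = sub { $ticks{$current_node}++ };
setitimer(ITIMER_REAL, 0.001, 0.001);      # tick every 1 ms

for my $i (1 .. 2_000) {                   # simulated executor loop
    $current_node = 'SeqScan';
    usleep($i % 500 == 0 ? 50_000 : 100);  # one tuple in 500 is very slow
    $current_node = 'idle';
}
setitimer(ITIMER_REAL, 0, 0);              # stop the timer
printf "%-8s %6d ticks\n", $_, $ticks{$_} for sort keys %ticks;
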
-- 
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.
