On Wed, Jun 07, 2006 at 03:32:35PM -0400, Tom Lane wrote:
> On reflection it's easy to imagine other cases where some iterations
> take much longer than others in a not-very-predictable way. For
> instance, a join where only a subset of the outer tuples have a match
> is going to act that way. I don't think there's any good way that we
> can be sure we have a representative sample of executions, and so I'm
> afraid this approach to sampling EXPLAIN ANALYZE is a failure.
I don't think we ever assumed it would never be a problem. We just
assumed that the sampling would cancel the effect out and give a decent
average. Thing is, I never expected a difference of three orders of
magnitude between samples. That's far too much to be corrected in any
way. The random sampling should counter most effects, and I didn't
consider the one tuple in a million that takes much longer to be a
particularly common case.

Anyway, as a test, if you take the approach that the measurement at
item X applies only to the tuples immediately preceding it, the data
you posted gives a result of 0.681148 seconds. How long did the query
that produced that data actually run? (The bit of perl I used is:

  cat data | perl -lne 'BEGIN { $last = 0; $sum = 0 } /: (\d+) usec in iter (\d+)/ && do { $sum += ($2 - $last) * $1; $last = $2 }; END { print "$sum\n" }'

An expanded, commented version of the same one-liner appears below.)

> I propose we revert this patch and think about an interrupt-driven
> sampling method instead.

That's another possibility, of course.
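For readability, here is the one-liner above expanded into a minimal
standalone script. It is a sketch under the same assumption the
one-liner makes, namely that the input consists of lines containing
"<usec> usec in iter <iter>" as in the data posted earlier; the logic
is simply total = sum over samples of (iter - prev_iter) * usec, i.e.
each sampled timing is attributed to every iteration since the
previous sample:

  #!/usr/bin/perl
  use strict;
  use warnings;

  my ($last_iter, $total_usec) = (0, 0);

  while (my $line = <>) {
      # Match lines like "... : 42 usec in iter 1234".
      next unless $line =~ /: (\d+) usec in iter (\d+)/;
      my ($usec, $iter) = ($1, $2);

      # Attribute this sample's per-iteration cost to all
      # iterations since the previous sampled one.
      $total_usec += ($iter - $last_iter) * $usec;
      $last_iter = $iter;
  }

  printf "estimated total: %.6f seconds\n", $total_usec / 1_000_000;

Saved as, say, estimate.pl (a hypothetical name), it would be run as
"perl estimate.pl data" and print the estimated total in seconds.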
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to
> litigate.