Darcy Buskermolen <[EMAIL PROTECTED]> writes:
> After 30+ minutes I'm now starting to see the same problems on my 5.2 testbed
> as well. I'm going to fire this test up on my FreeBSD 3.x and 2.x boxen and
> see if it's there too. I can confirm that this is not an SMP issue, as it's
> happening on both UP and SMP boxen for me.

Good, that's one variable eliminated.

Looking at my own data, I notice that when the error happens, the elapsed time shown between the immediately preceding and following okay-looking timestamps is always significant (at least 100 msec and often a second or more). That is kind of a lot for a tight loop containing one simple kernel call, no?

I am suspicious that the failure occurs when gettimeofday() is called just as the process is losing control of the CPU (due to using up its timeslice or whatever). When control eventually returns, the process gets a reading that is neither pre-loss-of-CPU nor post-regain-of-CPU, but some unholy combination that nets out as a time about 15 min in the past.

Just a theory, but it fits some of the available facts. Can anyone think of a significant interpretation for the number 695 seconds? That's got to be an important clue ...

			regards, tom lane
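
For anyone wanting to reproduce this, here is a minimal sketch of the kind of tight gettimeofday() loop under discussion. It is not the actual test program from this thread; the function names (tv_diff_usec, count_backward_steps) and the output format are made up for illustration. It flags any reading that is earlier than the last good one, and when the clock looks sane again it prints the gap between the surrounding good readings, which is the figure that was at least 100 msec in the data above.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/time.h>

/* Hypothetical helper: returns (b - a) in microseconds. */
static long long tv_diff_usec(const struct timeval *a, const struct timeval *b)
{
    return (long long)(b->tv_sec - a->tv_sec) * 1000000LL
         + (long long)(b->tv_usec - a->tv_usec);
}

/*
 * Hypothetical test loop: call gettimeofday() `iterations` times and
 * count the readings that are earlier than the last good reading.
 * Returns the number of backward steps seen, or -1 if the call fails.
 */
static long count_backward_steps(long iterations)
{
    struct timeval good, cur;
    long jumps = 0;
    int pending = 0;    /* set while we wait for the next sane reading */

    assert(iterations > 0);
    if (gettimeofday(&good, NULL) != 0)
        return -1;

    for (long i = 0; i < iterations; i++) {
        if (gettimeofday(&cur, NULL) != 0)
            return -1;

        long long delta = tv_diff_usec(&good, &cur);
        if (delta < 0) {
            /* Bad reading: report it, but keep `good` as the reference. */
            jumps++;
            pending = 1;
            printf("backward step: %lld.%06ld is %lld usec before last good reading\n",
                   (long long)cur.tv_sec, (long)cur.tv_usec, -delta);
        } else {
            if (pending) {
                /* Elapsed time between the good readings around the error. */
                printf("gap between surrounding good readings: %lld usec\n", delta);
                pending = 0;
            }
            good = cur;
        }
    }
    return jumps;
}
```

Calling count_backward_steps() with a large iteration count from main() and leaving it running for half an hour or so should show whether a given box is affected. On a healthy system it would be expected to return 0, though a clock step from NTP or an administrator would also show up as a backward step, so a nonzero count alone does not prove this particular bug.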