Tom Lane wrote:
> I wrote:
> > Anyway it's only a guess.  It could well be that that machine was simply
> > so heavily loaded that the stats collector couldn't respond fast enough.
> > I'm just wondering whether there's an unrecognized bug lurking here.
>
> Still meditating on this ... and it strikes me that the pgstat.c code
> is really uncommunicative about problems.  In particular,
> pgstat_read_statsfile_timestamp and pgstat_read_statsfile don't complain
> at all about being unable to read a stats file.
Yeah, I had the same thought.

> Lastly, backend_read_statsfile is designed to send an inquiry message
> every time through the loop, ie, every 10 msec.  This is said to be in
> case the stats collector drops one.  But is this enough to flood the
> collector and make things worse?  I wonder if there should be some
> backoff there.

I also think the autovacuum worker minimum timestamp may be playing
games with the retry logic.  Maybe a worker is requesting a new file
continuously because pgstat is not able to provide one before the
deadline has passed, and is thus overloading it.  I still think that
500ms is too much for a worker, but backing off all the way to 10ms
seems too much.  Maybe it should just be, say, 100ms.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
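
To make the backoff idea concrete, here is a rough sketch.  It is not the
actual pgstat.c code: every name and constant in it is made up for
illustration, apart from the 10 msec polling interval mentioned above.
It keeps polling every 10 msec but doubles the gap between inquiry
messages up to an arbitrary cap, and counts how many inquiries get sent
in the worst case where a fresh stats file never shows up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical constants; only the 10 msec poll comes from the thread. */
#define POLL_INTERVAL_MS   10   /* how often we check for a fresh stats file */
#define INITIAL_RESEND_MS  10   /* first gap between inquiry messages */
#define MAX_RESEND_MS      640  /* arbitrary cap on the gap between inquiries */

/* Double the resend gap after each inquiry, clamped to the cap. */
static int
next_resend_gap(int gap_ms)
{
    int doubled = gap_ms * 2;

    return (doubled > MAX_RESEND_MS) ? MAX_RESEND_MS : doubled;
}

/*
 * Simulate the wait loop for total_wait_ms and return how many inquiry
 * messages would be sent, assuming the stats file never arrives.
 * With use_backoff = false this models "one inquiry every time through
 * the loop"; with true, the gap between inquiries grows each time.
 */
static int
count_inquiries(int total_wait_ms, bool use_backoff)
{
    int sent = 0;
    int gap_ms = INITIAL_RESEND_MS;
    int next_send_at = 0;

    assert(total_wait_ms >= 0);

    for (int elapsed = 0; elapsed < total_wait_ms; elapsed += POLL_INTERVAL_MS)
    {
        if (elapsed >= next_send_at)
        {
            sent++;                     /* stands in for sending an inquiry */
            if (use_backoff)
                gap_ms = next_resend_gap(gap_ms);
            next_send_at = elapsed + gap_ms;
        }
    }
    return sent;
}
```

With a hypothetical 5-second wait, count_inquiries(5000, false) comes to
500 messages versus 12 for count_inquiries(5000, true), which gives a
feel for how much load a backoff could take off the collector.  The
doubling factor and the 640 msec cap are arbitrary choices here, not
proposed values.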