Tom Lane wrote:
> I wrote:
> > Anyway it's only a guess.  It could well be that that machine was simply
> > so heavily loaded that the stats collector couldn't respond fast enough.
> > I'm just wondering whether there's an unrecognized bug lurking here.
> 
> Still meditating on this ... and it strikes me that the pgstat.c code
> is really uncommunicative about problems.  In particular, 
> pgstat_read_statsfile_timestamp and pgstat_read_statsfile don't complain
> at all about being unable to read a stats file.

Yeah, I had the same thought.

> Lastly, backend_read_statsfile is designed to send an inquiry message
> every time through the loop, ie, every 10 msec.  This is said to be in
> case the stats collector drops one.  But is this enough to flood the
> collector and make things worse?  I wonder if there should be some
> backoff there.

I also think the autovacuum worker minimum timestamp may be interfering
with the retry logic.  Maybe a worker keeps requesting a new file because
pgstat cannot provide one before the deadline passes, and the repeated
requests overload it.  I still think that 500ms is too long for a worker,
but dropping all the way down to 10ms seems excessive.  Maybe it should
just be, say, 100ms.
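
To make the backoff idea concrete, here is a rough sketch of how the
inquiry resend could be spaced out while the backend keeps polling every
10 msec.  This is not the pgstat.c code: the struct, the function names,
and the 640ms cap are all hypothetical, chosen only for illustration.

```c
#include <stdbool.h>

/* All names and values below are hypothetical, not taken from pgstat.c. */
#define POLL_INTERVAL_MS    10   /* how often the backend re-checks the file */
#define INQUIRY_INITIAL_MS  10   /* gap before the second inquiry */
#define INQUIRY_MAX_MS      640  /* assumed upper bound on the gap */

typedef struct InquiryBackoff
{
    int next_send_ms;   /* elapsed time at which the next inquiry is due */
    int interval_ms;    /* gap to apply after the next inquiry is sent */
} InquiryBackoff;

static void
backoff_init(InquiryBackoff *b)
{
    b->next_send_ms = 0;
    b->interval_ms = INQUIRY_INITIAL_MS;
}

/*
 * Return true if an inquiry should be sent at elapsed_ms.  Each time one
 * is sent, the gap until the next one doubles, up to INQUIRY_MAX_MS.
 */
static bool
backoff_should_send(InquiryBackoff *b, int elapsed_ms)
{
    if (elapsed_ms < b->next_send_ms)
        return false;

    b->next_send_ms = elapsed_ms + b->interval_ms;
    if (b->interval_ms < INQUIRY_MAX_MS)
    {
        b->interval_ms *= 2;
        if (b->interval_ms > INQUIRY_MAX_MS)
            b->interval_ms = INQUIRY_MAX_MS;
    }
    return true;
}

/*
 * Simulate a wait of total_wait_ms, polling every POLL_INTERVAL_MS, and
 * count how many inquiries would be sent.  A real loop would sleep and
 * check the stats file timestamp on each iteration instead.
 */
static int
count_inquiries(int total_wait_ms)
{
    InquiryBackoff b;
    int sent = 0;

    backoff_init(&b);
    for (int t = 0; t < total_wait_ms; t += POLL_INTERVAL_MS)
    {
        if (backoff_should_send(&b, t))
            sent++;
    }
    return sent;
}
```

With these assumed numbers, a 5-second wait sends 13 inquiries rather than
the 500 that one inquiry per 10 msec loop iteration would produce, while
the first few retries still go out quickly in case a message was dropped.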

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
