Hi, I agree with that. PostgreSQL 10 is really great: tuning tools based on sampling of pg_stat_activity became accurate without any modification.
Best regards,
Phil

________________________________
From: Michael Paquier <mich...@paquier.xyz>
Sent: Thursday, July 26, 2018 03:24
To: Phil Florent
Cc: Tomas Vondra; pgsql-hackers@lists.postgresql.org
Subject: Re: [Proposal] Add accumulated statistics for wait event

On Tue, Jul 24, 2018 at 04:23:03PM +0000, Phil Florent wrote:
> It loses non-meaningful details, and that's in fact a good point. In this
> example, sampling will definitely find the cause and won't cost
> resources.

The higher the sampling frequency, the more detail you get, but also the more load you put on the instance. If you could take an infinite number of samples, where registering the same event multiple times for the same backend also matters because its overall weight gets higher and it shows up higher in profiles, then you would converge to the set of results that this patch adds.

The sampling method, and especially its frequency, is something controlled by the client, not the server. Approaches like the one proposed here push the load onto the server side, unconditionally, for *all* backends, and that has a cost. Even if you have spiky workloads, sampling may miss them; but even with counters added for each event, you would need to query the table holding the counters at an insanely high frequency to perhaps get something out of it, because you need to sample the counters as well to extract deltas. As Tomas has mentioned up-thread, sampling is lightweight, as is the current design for wait events. Even if it is not perfect, because it cannot give exact numbers, it finds bottlenecks in most cases, and that's what matters. If not, increasing the sampling frequency makes things easier to detect. What would be the point of taking only one sample per checkpoint, for example?
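The convergence argument above can be sketched with a small simulation. This is not code from the thread or from PostgreSQL; the wait-event names and the "true" time shares are invented for illustration. Each poll plays the role of one pg_stat_activity sample, and with enough samples the tallied shares approach the true profile that accumulated counters would report exactly.

```python
import random
from collections import Counter

# Hypothetical "true" distribution of time one backend spends per wait
# event (fractions of wall-clock time); values invented for illustration.
true_profile = {"Lock": 0.50, "LWLock": 0.30, "IO": 0.20}

def sample_wait_event(rng):
    """Simulate one pg_stat_activity poll: return the event the backend
    happens to be waiting on at the sampling instant."""
    r = rng.random()
    cum = 0.0
    for event, frac in true_profile.items():
        cum += frac
        if r < cum:
            return event
    return event  # guard against floating-point rounding

rng = random.Random(42)
counts = Counter(sample_wait_event(rng) for _ in range(100_000))
total = sum(counts.values())
estimated = {e: counts[e] / total for e in true_profile}

# With enough samples, the sampled shares converge to the true ones,
# which is the sense in which sampling approximates exact counters.
for event in true_profile:
    assert abs(estimated[event] - true_profile[event]) < 0.01
```

Lowering the sample count in the sketch makes the estimates noisier, which mirrors the trade-off described above: sampling frequency trades accuracy against load, and the client chooses where to sit on that curve.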
There may be a benefit in having counters; I don't know the answer to that. The point would be to show that there is a specific set of workloads where they make sense, though my gut feeling is that sampling would be able to detect those anyway. (I am not a computer scientist by training but a physicist, think fluid dynamics and turbulence, and I have had my share of random events and signal analysis as well. All of that is just statistics attempting to approach reality, where sampling is a life-saver compared with exact measurement.)

Adding hooks is not acceptable to me either: they have a cost, and it is not clear what benefit we would get from putting hooks in such a place for cases other than sampling and counters...
--
Michael
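The counter-delta point made in the reply above can be sketched as well. Even if the server accumulated exact wait-event counters, a monitoring client would still have to poll snapshots and subtract them to learn what happened in a given interval; the counter names and values here are invented for illustration, not taken from the proposed patch.

```python
# Hypothetical cumulative wait-event counters as a patched server might
# expose them, captured at two polling instants t0 and t1.
snapshot_t0 = {"Lock": 1200, "LWLock": 800, "IO": 500}
snapshot_t1 = {"Lock": 1450, "LWLock": 810, "IO": 740}

# The client must still diff the snapshots to get per-interval activity,
# so counters do not remove the need for client-side sampling.
delta = {e: snapshot_t1[e] - snapshot_t0[e] for e in snapshot_t0}
```

A spike shorter than the polling interval still shows up only as a blended delta, which is why high-resolution visibility requires high-frequency polling whether the server exposes samples or counters.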