RE: Re: [Proposal] Add accumulated statistics for wait event

2018-07-24 Thread Phil Florent
Hi,

I am skeptical about accumulated statistics.

pg_stat_activity now gives the necessary information about wait events. It can 
easily be used with a polling system that sleeps most of the time to limit the 
overhead. Measuring the duration of individual wait events is not necessary to 
know how the load is distributed.

You can aggregate the results of the polling by application, query, wait events 
or whatever you want.
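
A minimal sketch of that kind of aggregation, assuming the samples have already been collected by polling pg_stat_activity at a fixed interval. The function names and the sample rows are invented for illustration, and the query in the comment is only one possible way to poll.

```python
from collections import Counter

# One possible polling query (wait_event_type and wait_event are available
# in pg_stat_activity from PostgreSQL 9.6):
#   SELECT application_name, wait_event_type, wait_event
#   FROM pg_stat_activity
#   WHERE state = 'active';
# Each poll yields a list of rows. The rows further down are invented data.


def aggregate_samples(polls, key=lambda row: (row[1], row[2])):
    """Count how often each key was observed across all polls.

    An active session that is not waiting has NULL wait columns,
    represented here as None.
    """
    counts = Counter()
    for rows in polls:
        for row in rows:
            counts[key(row)] += 1
    return counts


def as_percentages(counts):
    """Convert raw sample counts into a share of all samples."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {k: 100.0 * v / total for k, v in counts.items()}


# Four polls of two sessions (hypothetical data).
polls = [
    [("app1", "IO", "DataFileRead"), ("app2", None, None)],
    [("app1", "IO", "DataFileRead"), ("app2", "Lock", "transactionid")],
    [("app1", None, None), ("app2", "Lock", "transactionid")],
    [("app1", "IO", "DataFileRead"), ("app2", None, None)],
]

by_event = aggregate_samples(polls)                        # group by wait event
by_app = aggregate_samples(polls, key=lambda row: row[0])  # group by application
shares = as_percentages(by_event)
```

The share of samples per wait event approximates the share of time spent in it, which is why no per-event duration is needed for this kind of report.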

I wrote a script for this that can be used interactively or in batch mode to 
produce reports, but many other solutions exist.

Best regards

Phil

From: MyungKyu LIM
Sent: Tuesday, July 24, 2018 12:10
To: Alexander Korotkov; pgsql-hack...@postgresql.org
Cc: Woosung Sohn; DoHyung HONG
Subject: RE: Re: [Proposal] Add accumulated statistics for wait event

> On Mon, Jul 23, 2018 at 10:53 AM Michael Paquier  wrote:
>> What's the performance penalty?  I am pretty sure that this is
>> measurable as wait events are stored for a backend for each I/O
>> operation as well, and you are calling a C routine within an inlined
>> function which is designed to be light-weight, doing only a four-byte
>> atomic operation.

> Yes, the question is the overhead of measuring durations of individual wait 
> events.  It has been proposed before, and there have been heated debates about 
> that (see threads [1-3]).  There doesn't seem
> to be a conclusion about this feature.  One thing can be said for sure: the
> performance penalty depends heavily on OS/hardware/workload.  In some cases 
> the overhead is negligible, but in other cases it appears to be huge.

Thanks for the good information.
I agree. There is a performance penalty.
But wait statistics are in demand and useful. In some cases, it is worth 
sacrificing some performance to use them.

So, what do you think about developing this as an extension? I have another concept 
proposal.
2. This feature could be implemented as an extension if hooks were provided in 
the following functions:
 - pgstat_report_wait_start
 - pgstat_report_wait_end
The feature could then be turned on or off through online configuration when necessary.

Best regards,
MyungKyu, Lim




RE: Re: [Proposal] Add accumulated statistics for wait event

2018-07-24 Thread MyungKyu LIM
> On Mon, Jul 23, 2018 at 10:53 AM Michael Paquier  wrote:
>> What's the performance penalty?  I am pretty sure that this is 
>> measurable as wait events are stored for a backend for each I/O 
>> operation as well, and you are calling a C routine within an inlined 
>> function which is designed to be light-weight, doing only a four-byte 
>> atomic operation.

> Yes, the question is the overhead of measuring durations of individual wait 
> events.  It has been proposed before, and there have been heated debates about 
> that (see threads [1-3]).  There doesn't seem 
> to be a conclusion about this feature.  One thing can be said for sure: the
> performance penalty depends heavily on OS/hardware/workload.  In some cases 
> the overhead is negligible, but in other cases it appears to be huge.

Thanks for the good information.
I agree. There is a performance penalty.
But wait statistics are in demand and useful. In some cases, it is worth 
sacrificing some performance to use them.

So, what do you think about developing this as an extension? I have another concept 
proposal.
2. This feature could be implemented as an extension if hooks were provided in 
the following functions:
 - pgstat_report_wait_start
 - pgstat_report_wait_end
The feature could then be turned on or off through online configuration when necessary.
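
To make the hook idea concrete, here is a small Python model of it. It is not PostgreSQL code. It assumes two optional hook slots, called wait_start_hook and wait_end_hook here (hypothetical names), which the two functions above would call when set. An extension would install callbacks that accumulate counts and durations; leaving the slots empty keeps the default path unchanged.

```python
import time
from collections import defaultdict

# Hypothetical hook slots; None means "no extension installed".
wait_start_hook = None
wait_end_hook = None

# Stands in for the backend's current wait event value; 0 means "not waiting".
current_wait_event = 0


def pgstat_report_wait_start(wait_event_info):
    """Model of the core function: store the event, then call the hook if set."""
    global current_wait_event
    current_wait_event = wait_event_info
    if wait_start_hook is not None:
        wait_start_hook(wait_event_info)


def pgstat_report_wait_end():
    """Model of the core function: call the hook if set, then clear the event."""
    global current_wait_event
    if wait_end_hook is not None:
        wait_end_hook(current_wait_event)
    current_wait_event = 0


class WaitAccumulator:
    """What an extension might keep: call count and total time per event."""

    def __init__(self, clock=time.perf_counter_ns):
        self.clock = clock
        self.started_at = None
        self.calls = defaultdict(int)
        self.total_ns = defaultdict(int)

    def on_start(self, event):
        self.started_at = self.clock()

    def on_end(self, event):
        if self.started_at is None:
            return  # hook was installed in the middle of a wait
        self.calls[event] += 1
        self.total_ns[event] += self.clock() - self.started_at
        self.started_at = None


def install(acc):
    """Turn accumulation on (models loading or enabling the extension)."""
    global wait_start_hook, wait_end_hook
    wait_start_hook, wait_end_hook = acc.on_start, acc.on_end


def uninstall():
    """Turn accumulation off again."""
    global wait_start_hook, wait_end_hook
    wait_start_hook = wait_end_hook = None


# Usage with a fake clock so the result is deterministic.
ticks = iter([100, 350, 1000, 1200])
acc = WaitAccumulator(clock=lambda: next(ticks))
install(acc)
for _ in range(2):
    pgstat_report_wait_start(7)  # 7 is an arbitrary event id
    pgstat_report_wait_end()
uninstall()
pgstat_report_wait_start(7)      # hooks are unset, nothing is accumulated
pgstat_report_wait_end()
```

In this model the cost when nothing is installed is one comparison per call, while the cost when installed includes two clock reads per wait, which is where the overhead discussed in this thread would come from.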

Best regards,
MyungKyu, Lim
 



RE: Re: [Proposal] Add accumulated statistics for wait event

2018-07-24 Thread MyungKyu LIM
 2018-07-23 16:53 (GMT+9), Michael Paquier wrote:
> On Mon, Jul 23, 2018 at 04:04:42PM +0900, 임명규 wrote:
>> This proposal is about recording additional statistics of wait events.
 
> I have comments about your patch.  First, I don't think that you need to
> count precisely the number of wait events triggered as usually when it
> comes to analyzing a workload's bottleneck what counts is a periodic
> *sampling* of events, patterns which can be fetched already from
> pg_stat_activity and stored say in a different place.

Thanks for your feedback.

This proposal is not about *sampling*.
Accumulated statistics of wait events are useful for troubleshooting
because they provide accurate measurements.

In some cases, sampling cannot find the cause of an issue because it loses 
detail.
For example, a throughput issue may occur (e.g. disk I/O) where each individual wait 
lasts only a few milliseconds.
In that case, sampling is highly likely to miss the cause.
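
The following Python sketch illustrates that point on a synthetic timeline. All numbers are invented: a 2 ms wait every 100 ms, observed for 10 seconds. The first call uses a sampling interval aligned so that it never lands inside a wait, which is the worst case; the second uses a much shorter interval for comparison.

```python
def simulate(duration_ms, wait_period_ms, wait_offset_ms, wait_len_ms,
             sample_every_ms):
    """Compare periodic sampling with exact accumulation.

    Waits occupy [start, start + wait_len_ms) for every start in
    wait_offset_ms, wait_offset_ms + wait_period_ms, ...
    """
    starts = range(wait_offset_ms, duration_ms, wait_period_ms)

    def waiting_at(t):
        # True if instant t falls inside one of the wait intervals.
        if t < wait_offset_ms:
            return False
        return (t - wait_offset_ms) % wait_period_ms < wait_len_ms

    samples = range(0, duration_ms, sample_every_ms)
    sampled_hits = sum(1 for t in samples if waiting_at(t))

    # Accumulation sees every wait and its full duration.
    accumulated_calls = len(starts)
    accumulated_ms = accumulated_calls * wait_len_ms
    return sampled_hits, len(samples), accumulated_calls, accumulated_ms


# One sample per second, aligned away from the waits.
coarse = simulate(10_000, 100, 50, 2, sample_every_ms=1000)

# One sample every 7 ms.
fine = simulate(10_000, 100, 50, 2, sample_every_ms=7)
```

With the coarse interval, sampling records no wait at all although 100 waits totalling 200 ms occurred. With the fine interval the waits do show up, at the price of polling far more often.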

> This is ugly and unmaintainable style.

I'm sorry. You're right.
Please consider it a PoC.
 
> What's the performance penalty?

I have the same concern. I have only run pgbench several times.
Please let me know of a good method for checking performance.

Best regards,
MyungKyu, Lim