Hi Andres, On Mon, 14 Jun 2021 15:01:14 -0700 Andres Freund <and...@anarazel.de> wrote:
> On 2021-06-14 23:20:47 +0200, Jehan-Guillaume de Rorthais wrote: > > > On 2021-06-14 16:10:32 +0200, Jehan-Guillaume de Rorthais wrote: > > > > In the patch in attachment, I tried to fix this by using kind of an > > > > internal hook for pgstat_report_wait_start and pgstat_report_wait_end. > > > > This allows to "instrument" wait events only when required, on the fly, > > > > dynamically. > > > > > > That's *far worse*. You're adding an indirect function call. Which > > > requires loading a global variable and then a far call to a different > > > function. You're changing a path that's ~2 instructions with minimal > > > dependencies (and no branches (i.e. fully out of order executable) to > > > something on the order of ~15 instructions with plenty dependencies and > > > at least two branches (call, ret). > > > > Oh, I didn't realized it would affect all queries, even when > > log_statement_stats was off. Thank you for your explanation. > > Maybe I just am misunderstanding what you were doing? As far as I can > tell your patch changed pgstat_report_wait_start() to be an indirect > function call - right? Exact. I didn't realized this indirection would be so costy on every single calls, after the variable assignation itself. > Then yes, this adds overhead to everything. I understand now, thank you for the explanation. For my own curiosity and study, I'll remove this indirection and bench my patch anyway. > You *could* add a pgstat_report_wait_(start|end)_with_time() or such and > only use that in places that won't have a high frequency. But I just > don't quite see the use-case for that. Well, it could be useful if we decide to only track a subset of wait event. In my scenario, I originally wanted to only track ClientWrite, but then I realized this might be too specific and tried to generalize. There are probably some other way to deal with this issue. Eg.: * do NOT include the time lost waiting for the frontend side in the query execution time * expose the frontend part of the query time in log_min_duration_stmt, auto_explain, pg_stat_statements, in the same fashion we currently do with planning and execution time * having some wait-even sampling mechanism in core or as easy and hot-loadable than auto_explain Thoughts? Thanks again! Regards,