RE: minimizing pg_stat_statements performance overhead

Raymond Martin Wed, 27 Mar 2019 15:16:15 -0700

Hi Fabien, 
Thank you for your time. Apologies for not being more specific about my testing 
methodology.


> > PGSS not loaded: 0.18ms
>
> This means 0.0018 ms latency per transaction, which seems rather fast, on my 
> laptop I have typically 0.0XX ms...

This actually means 0.18 milliseconds. I agree that this is a bit high, so I 
instead created an Ubuntu VM to get results that would align with yours. 

> I could not reproduce these results on my ubuntu laptop. Could you be more 
> precise about the test? Did you use pgbench? Did it run in parallel? What 
> options were used? What is the test script?

I did not use pgbench. It is important to call pg_stat_statements_reset before 
every query. This simulates a user who is performing distinct, non-repeated 
queries on their database. If you use prepared statements or the same set of 
queries each time, you remove the contention on the pgss query text file. 
I re-tested this on an Ubuntu machine with 4 cores and 14 GB of RAM. I did not 
run it in parallel. I used a Python script that implements the following logic: 
        - select pg_stat_statements_reset() -- this is important because it 
          makes pgss treat the 'select 1' as a new query that it has not yet 
          cached in pgss_hash.
        - time 'select 1'
Repeat 100 times for each configuration. 
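
A minimal sketch of the loop described above, not the original script. It 
assumes a DB-API style connection (for example one from psycopg2 with 
autocommit enabled); the function names, the `reset` switch for the 
configuration where pgss is not loaded, and the use of the sample standard 
deviation are all assumptions made for illustration.

```python
import statistics
import time


def time_query_after_reset(conn, query="select 1", iterations=100, reset=True):
    """Return one latency (in ms) per iteration for `query`.

    `conn` is assumed to be a DB-API connection. With reset=True,
    pg_stat_statements_reset() runs before every timed query so that pgss
    sees the query as new each time. Pass reset=False when the extension
    is not loaded and the function does not exist (an assumption about
    how that configuration would be measured).
    """
    latencies_ms = []
    cur = conn.cursor()
    for _ in range(iterations):
        if reset:
            # Untimed: clear pgss so the next query is not yet in pgss_hash.
            cur.execute("select pg_stat_statements_reset()")
            cur.fetchall()
        start = time.perf_counter()
        cur.execute(query)
        cur.fetchall()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    cur.close()
    return latencies_ms


def summarize(latencies_ms):
    """Return (mean, sample standard deviation) of the latencies."""
    return statistics.mean(latencies_ms), statistics.stdev(latencies_ms)
```

Run it once per configuration (pgss unloaded, pgss.track=none, pgss.track=top, 
patched build) and compare the summaries.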

Here are my Ubuntu results (in ms, as above):
  pgss unloaded
  Mean: 0.076
  Standard Deviation: 0.038

  pgss.track=none
  Mean: 0.099
  Standard Deviation: 0.040
 
  pgss.track=top 
  Mean: 0.098
  Standard Deviation: 0.107

  pgss.track=none + patch
  Mean: 0.078
  Standard Deviation: 0.042

The difference is less noticeable, but I still see about a 20% improvement in 
mean latency with the patch (0.078 vs. 0.099 for pgss.track=none).
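
For reference, the percentages implied by the means listed above can be 
checked with a few lines of Python. The helper name is hypothetical; the 
numbers are taken directly from the results.

```python
def pct_change(baseline, candidate):
    """Percentage by which `candidate` is lower than `baseline`."""
    return (baseline - candidate) / baseline * 100.0


# Means from the Ubuntu results listed above.
unloaded = 0.076
track_none = 0.099
track_none_patched = 0.078

# Improvement of the patched build over unpatched pgss.track=none.
print(round(pct_change(track_none, track_none_patched), 1))  # 21.2

# Remaining overhead of the patched build relative to pgss unloaded.
print(round(-pct_change(unloaded, track_none_patched), 1))  # 2.6
```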

> There I have an impact of 10% in these ideal testing conditions wrt latency 
> where the DB does basically nothing, thus which would not warrant to disable 
> pg_stat_statements given the great service this extension brings to 
> performance analysis.

I agree that pg_stat_statements should not be disabled based on these 
performance results. 

> Note that this does not mean that the patch should not be applied, it looks 
> like an oversight, but really I do not have the performance degradation you 
> are suggesting.

I appreciate your input, and I want to come up with a canonical test that makes 
this contention more obvious. 
Unfortunately, this is difficult because the conditions that cause the slowdown 
(large query sizes and distinct, non-repeated queries) are hard to reproduce 
with pgbench. I would be open to any suggestions here. 

So even though the performance gains in this specific scenario are not as 
great, do you still think it would make sense to submit a patch like this? 

--
Raymond Martin
rama...@microsoft.com
Azure Database for PostgreSQL
