Hi. 

> I'll send performance measurement results in the next email. Thanks a lot for
> the reviews!

Below are the performance measurement results.
I was only able to use low-spec machine:
CPU 4v, Memory 8GB, RHEL, xfs filesystem.

[Failover/Recovery Test]
1. (Master) Create table (ex. 10,000 tables). Insert data to tables.
2. (M) DELETE FROM TABLE (ex. all rows of 10,000 tables)
3. (Standby) To test with failover, pause the WAL replay on standby server.
(SELECT pg_wal_replay_pause();)
4. (M) psql -c "\timing on" (measures total execution of SQL queries)
5. (M) VACUUM (whole db)
6. (M) After vacuum finishes, stop primary server: pg_ctl stop -w -mi
7. (S) Resume wal replay and promote standby.
Because it's difficult to measure recovery time I used the attached script 
(resume.sh)
that prints timestamp before and after promotion. It basically does the 
following
- "SELECT pg_wal_replay_resume();" is executed and the WAL application is 
resumed.
- "pg_ctl promote" to promote standby.
- The time difference of "select pg_is_in_recovery();" from "t" to "f" is 
measured.

[Results]
Recovery/Failover performance (in seconds). 3 trial runs.

| shared_buffers | master | patch  | %reg    | 
|----------------|--------|--------|---------| 
| 128MB          | 32.406 | 33.785 | 4.08%   | 
| 1GB            | 36.188 | 32.747 | -10.51% | 
| 2GB            | 41.996 | 32.88  | -27.73% |

There's a bit of small regression with the default shared_buffers (128MB),
but as for the recovery time when we have large NBuffers, it's now at least 
almost constant
so there's boosted performance. IOW, we enter the optimization most of the time
during recovery.

I also did similar benchmark performance as what Tomas did [1],
simple "pgbench -S" tests (warmup and then 15 x 1-minute runs with
1, 8 and 16 clients, but I'm not sure if my machine is reliable enough to
produce reliable results for 8 clients and more.

| #          | master      | patch       | %reg   | 
|------------|-------------|-------------|--------| 
| 1 client   | 1676.937825 | 1707.018029 | -1.79% | 
| 8 clients  | 7706.835401 | 7529.089044 | 2.31%  | 
| 16 clients | 9823.65254  | 9991.184206 | -1.71% |


If there's additional/necessary performance measurement, kindly advise me too.
Thank you in advance.

[1] 
https://www.postgresql.org/message-id/flat/20200806213334.3bzadeirly3mdtzl%40development#473168a61e229de40eaf36326232f86c

Best regards,
Kirk Jamison

Attachment: resume.sh
Description: resume.sh

Attachment: run.sh
Description: run.sh

Reply via email to