Hi,

On 2023-05-11 09:42:39 +0000, Zhijie Hou (Fujitsu) wrote:
> I did some simple tests for this to see the performance impact on
> the streaming replication, just share it here for reference.
>
> 1) sync primary-standby setup, load data on primary and count the time spent on
> replication. the degradation will be more obvious as the value of max_wal_senders
> increases.

FWIW, using syncrep likely under-estimates the overhead substantially, because
that includes a lot of overhead on the WAL generating side. I saw well over 20%
overhead for the default max_wal_senders=10.

I just created a standby, shut it down, then ran a deterministically-sized
workload on the primary, started the standby, and measured how long it took to
catch up. I just used the log messages to measure the time.

> 2) Similar as 1) but count the time that the standby startup process spent on
> replaying WAL(via gprof).

I don't think that's the case here, but IME gprof's overhead is so high that
it can move bottlenecks quite drastically. The problem is that it adds code to
every function entry/exit - for simple functions, that overhead is much higher
than the "original" cost of the function.

gprof style instrumentation is good for things like code coverage, but for
performance evaluation it's normally better to use a sampling profiler like
perf. That also causes slowdowns, but largely only in places that already take
up substantial execution time.

Greetings,

Andres Freund
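
PS: The catch-up timing from log messages mentioned above could be sketched
roughly as below. This is only an illustration under assumptions: it presumes
each log line starts with a millisecond timestamp (as with a log_line_prefix
beginning with %m), that both lines carry the same time zone, and the helper
name, the marker strings and the sample lines are all hypothetical. Which
messages best delimit "started" and "caught up" depends on the setup.

```python
from datetime import datetime

# Assumed timestamp layout at the start of each log line,
# e.g. "2023-05-11 10:00:00.250 UTC LOG:  ..." (first 23 characters).
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
TS_LENGTH = 23


def elapsed_between(log_lines, start_marker, end_marker):
    """Return seconds between the first line containing start_marker and
    the first later line containing end_marker, or None if either is missing."""
    start = None
    for line in log_lines:
        if start is None:
            if start_marker in line:
                start = datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
        elif end_marker in line:
            end = datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
            return (end - start).total_seconds()
    return None


# Hypothetical sample log; the marker texts are placeholders, not a claim
# about which messages were actually used for the measurement.
sample_log = [
    "2023-05-11 10:00:00.250 UTC LOG:  replay start marker",
    "2023-05-11 10:00:10.000 UTC LOG:  some unrelated message",
    "2023-05-11 10:00:42.750 UTC LOG:  replay caught up marker",
]

catchup_seconds = elapsed_between(
    sample_log, "replay start marker", "replay caught up marker"
)
print(catchup_seconds)
```

Running the same workload with different max_wal_senders settings and
comparing the resulting durations gives the relative overhead.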