2017-03-09 20:19 GMT+01:00 Andres Freund <and...@anarazel.de>:
> Hi,
>
> On 2017-03-09 13:47:35 +0100, Naytro Naytro wrote:
> > We are having some performance issues after we upgraded to the newest
> > version of PostgreSQL; before that, everything was fast and smooth.
> >
> > The upgrade was done with pg_upgrade from 9.4 directly to 9.6.1. We have
> > since upgraded to 9.6.2 with no improvement.
> >
> > Some information about our setup: FreeBSD, Solaris (SmartOS), simple
> > master-slave using streaming replication.
>
> Which node is on which of those, and where is the high load?

The high load is only on the slaves: FreeBSD (master + slave) and Solaris (slaves only).
> > Problem:
> > Very high system CPU when the master is streaming replication data; CPU
> > goes up to 77%. Only one process is generating this load: the postgresql
> > startup process. When I attached truss to this process I saw a lot of
> > read calls with almost the same number of errors (EAGAIN).
>
> Hm. Just to clarify: The load is on the *receiving* side, in the startup
> process? Because the load doesn't quite look that way...

Yes.

> > read(6,0x7fffffffa0c7,1) ERR#35 'Resource temporarily unavailable'
> >
> > Descriptor 6 is a pipe
>
> That's presumably a latch's internal pipe. Could you redo that
> truss/strace with timestamps attached? Does truss show signals
> received? The above profile would e.g. make a lot more sense if not. Is
> the wal receiver sending signals?

Truss from Solaris: http://pastebin.com/WajedZ8Y and FreeBSD:
http://pastebin.com/DB5iT8na (FreeBSD truss should show signals by default).
DTrace from Solaris: http://pastebin.com/u03uVKbr

> > The read call tries to read one byte over and over. I looked at the
> > source code, and I think this file is responsible for this behaviour:
> > src/backend/storage/ipc/latch.c. There was no such file in 9.4.
>
> It was "just" moved (and expanded); it used to be at
> src/backend/port/unix_latch.c.
>
> There normally shouldn't be that much "latch traffic" in the startup
> process; we'd expect to block from within WaitForWALToBecomeAvailable().
>
> Hm. Any chance you've configured a recovery_min_apply_delay? Although
> I'd expect more timestamp calls in that case.

No, we don't have this option configured.

> Greetings,
>
> Andres Freund
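
For readers unfamiliar with why a latch produces exactly this syscall pattern, the sketch below shows the generic self-pipe wakeup technique that Unix latch implementations (including the one in PostgreSQL's latch.c) are built on. The names set_latch() and wait_latch() are illustrative only, not PostgreSQL's real API, and the drain loop reads one byte at a time purely to mirror the truss output; the point is that every wakeup necessarily ends with one read() failing with EAGAIN, so the EAGAIN "errors" themselves are expected. What is unusual here is only how often the wakeups happen.

/*
 * Minimal sketch of the "self-pipe" wakeup pattern, assuming made-up
 * function names (set_latch/wait_latch).  A signal handler or another
 * process writes one byte into a non-blocking pipe; the waiter polls the
 * read end and then drains it until read() returns EAGAIN.
 */
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static int pipefd[2];            /* [0] = read end, [1] = write end */

static void
make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
    {
        perror("fcntl");
        exit(1);
    }
}

/* "Set" the latch: wake the waiter by writing one byte. */
static void
set_latch(void)
{
    char c = 0;

    (void) write(pipefd[1], &c, 1);   /* EAGAIN here just means "already set" */
}

/* Wait for the latch, then drain the pipe until read() reports EAGAIN. */
static void
wait_latch(void)
{
    struct pollfd pfd = { .fd = pipefd[0], .events = POLLIN };

    if (poll(&pfd, 1, -1) < 0)
    {
        perror("poll");
        exit(1);
    }

    for (;;)
    {
        char    buf;
        ssize_t n = read(pipefd[0], &buf, 1);  /* the 1-byte reads seen in truss */

        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            break;               /* pipe drained: the expected "error" */
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0)
            break;
    }
}

int
main(void)
{
    if (pipe(pipefd) < 0)
    {
        perror("pipe");
        exit(1);
    }
    make_nonblocking(pipefd[0]);
    make_nonblocking(pipefd[1]);

    set_latch();                 /* pretend the WAL receiver signalled us */
    wait_latch();                /* returns immediately and drains the pipe */
    printf("woke up once; every wakeup ends with one EAGAIN read\n");
    return 0;
}

Each wakeup costs at least two read() calls (one that returns the byte, one that fails with EAGAIN), so a startup process being woken for every small chunk of streamed WAL will show roughly equal counts of successful and EAGAIN reads, which matches the truss output above.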