2015-12-25 21:28 GMT+03:00 Tom Lane <t...@sss.pgh.pa.us>:

> Andres Freund <and...@anarazel.de> writes:
> > On December 25, 2015 7:10:23 PM GMT+01:00, Tom Lane <t...@sss.pgh.pa.us> wrote:
> >> Seems like what you've got here is a kernel bug.
>
> > I wouldn't go as far as calling it a kernel bug. We're still doing 300k
> > tps. And we're triggering the performance degradation by adding another
> > socket (IIRC) to the poll(2) call.
>
> Hmm. And all those FDs point to the same pipe. I wonder if we're looking
> at contention for some pipe-related data structure inside the kernel.
>
> regards, tom lane
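For readers following along, here is a minimal sketch of the wait pattern being discussed: one poll(2) call that watches a client socket together with the read end of a pipe. It is an illustration only, not PostgreSQL's actual latch code. The function name `wait_socket_or_pipe` and its parameters are hypothetical, and a single process stands in for the many backends that, per the quoted text, would all be polling the same pipe.

```python
import os
import select
import socket


def wait_socket_or_pipe(sock, pipe_rfd, timeout_ms):
    """Poll a socket and a pipe read end in one call (hypothetical helper).

    Returns the set of file descriptors that are readable. A timeout of 0
    makes the call return immediately instead of blocking.
    """
    poller = select.poll()
    poller.register(sock.fileno(), select.POLLIN)  # client connection
    poller.register(pipe_rfd, select.POLLIN)       # wake-up pipe
    ready = poller.poll(timeout_ms)
    return {fd for fd, events in ready if events & select.POLLIN}


if __name__ == "__main__":
    pipe_rfd, pipe_wfd = os.pipe()
    client, server = socket.socketpair()

    # Nothing has been written yet, so neither descriptor is readable.
    print(wait_socket_or_pipe(server, pipe_rfd, 0))

    # Writing one byte to the pipe makes its read end readable.
    os.write(pipe_wfd, b"x")
    print(wait_socket_or_pipe(server, pipe_rfd, 0) == {pipe_rfd})

    client.close()
    server.close()
    os.close(pipe_rfd)
    os.close(pipe_wfd)
```

In the scenario from the thread, the pipe descriptor would be inherited by every backend, so all of them would be waiting on the same kernel pipe object while each waits on its own socket.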

I ran bt on the backends and found them in the following state:

#0  0x00007f77b0e5bb60 in __poll_nocancel () from /lib64/libc.so.6
#1  0x00000000006a7cd0 in WaitLatchOrSocket (latch=0x7f779e2e96c4, wakeEvents=wakeEvents@entry=19, sock=9, timeout=timeout@entry=0) at pg_latch.c:333
#2  0x0000000000612c7d in secure_read (port=0x17e6af0, ptr=0xcc94a0 <PqRecvBuffer>, len=8192) at be-secure.c:147
#3  0x000000000061be36 in pq_recvbuf () at pqcomm.c:915
#4  pq_getbyte () at pqcomm.c:958
#5  0x0000000000728ad5 in SocketBackend (inBuf=0x7ffd8b6b1460) at postgres.c:345

Perf shows _raw_spin_lock_irqsave, called from remove_wait_queue and add_wait_queue.

Here are the screenshots:
http://i.imgur.com/pux2bGJ.png
http://i.imgur.com/LJQbm2V.png

---
Dmitry Vasilyev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company