Hi Holger,

On Fri, Jun 10, 2016 at 04:32:55PM +0200, Holger Just wrote:
> Hi Willy et al.,
>
> > Thank you for this report, it helps. How often does it happen, and/or after
> > how long on average after you start it ? What's your workload ? Do you use
> > SSL, compression, TCP and/or HTTP mode, peers synchronization, etc ?
>
> Yesterday, we upgraded from 1.5.14 to 1.5.18 and now observed exactly
> this issue in production. After rolling back to 1.5.14, it didn't occur
> anymore.
>
> We have mostly http traffic, little TCP with about 100-200 req/s, about
> 2000 concurrent connections over all. About all traffic is SSL
> terminated. We use no peer synchronization and no compression.
>
> An strace on the process reveals this (with most of the calls being
> epoll_wait):
>
> [...]
> epoll_wait(0, {}, 200, 0) = 0
> epoll_wait(0, {}, 200, 0) = 0
> epoll_wait(0, {}, 200, 0) = 0
> epoll_wait(0, {}, 200, 0) = 0
> epoll_wait(0, {}, 200, 0) = 0
> epoll_wait(0, {}, 200, 0) = 0
> epoll_wait(0, {}, 200, 0) = 0
> epoll_wait(0, {{EPOLLIN, {u32=796, u64=796}}}, 200, 0) = 1
> read(796, "\357\275Y\231\275'b\5\216#\33\220\337'\370\312\215sG4\316\275\277y-%\v\v\211\331\342"..., 5872) = 1452
> read(796, 0x9fa26ec, 4420) = -1 EAGAIN (Resource temporarily unavailable)
> epoll_wait(0, {}, 200, 0) = 0
> epoll_wait(0, {}, 200, 0) = 0
> epoll_wait(0, {}, 200, 0) = 0
> epoll_wait(0, {}, 200, 0) = 0
> [...]

Thank you for the report. I'll inspect the SSL part just in case I
missed something. Don't take risks in production, of course.

Best regards,
Willy