Forwarding to tech, since people may have missed this. Hope to commit this soon.
----- Forwarded message from Claudio Jeker <cje...@diehard.n-r-g.com> ----- Date: Sun, 24 Dec 2017 19:07:11 +0100 From: Claudio Jeker To: Mischa Peters Subject: Re: relayd stops processing traffic intermittently On Sat, Dec 23, 2017 at 02:04:19PM +0100, Mischa Peters wrote: > > > On 23 Dec 2017, at 13:08, Claudio Jeker wrote: > > > >> On Sat, Dec 23, 2017 at 11:40:57AM +0100, Mischa wrote: > >> Hi All, > >> > >> Since OpenBSD 6.2, just confirmed this in the latest snapshot > >> (GENERIC.MP#305) as well, for some reason relayd stops processing traffic > >> and starts flooding the log file with the following message: > >> > >> Dec 23 11:19:11 lb2 relayd[22515]: rsae_send_imsg: poll timeout > >> Dec 23 11:19:12 lb2 relayd[52110]: rsae_send_imsg: poll timeout > >> Dec 23 11:19:12 lb2 relayd[69641]: rsae_send_imsg: poll timeout > >> Dec 23 11:19:12 lb2 relayd[22515]: rsae_send_imsg: poll timeout > >> [snip] > >> Dec 23 11:19:17 lb2 relayd[69641]: rsae_send_imsg: poll timeout > >> Dec 23 11:19:18 lb2 relayd[22515]: rsae_send_imsg: poll timeout > >> Dec 23 11:19:18 lb2 relayd[52110]: rsae_send_imsg: poll timeout > >> Dec 23 11:19:18 lb2 relayd[69641]: rsae_send_imsg: poll timeout > >> ...etc... > >> > >> Restarting the daemon "fixes" the problem. > >> Not sure how to trouble shoot this but I am able to reproduce this > >> consistently by pointing SSLLabs towards relayd. > >> Would be great to get some pointers. > >> > > > > I have seen this as well on our production systems. This is a problem in > > the privsep part of the TLS code. I could not do more testing yet but my > > assumption is that a new option / feature is freaking this code out. > > Anything I can do or collect to give you more information? So, I think I found the problem. The ca process did not handle errors from RSA_private_encrypt correctly. So once you got a bad signature in the system chocked and stopped. This diff seems to work for me (against SSLlabs). Cheers -- :wq Claudio Index: ca.c =================================================================== RCS file: /cvs/src/usr.sbin/relayd/ca.c,v retrieving revision 1.31 diff -u -p -r1.31 ca.c --- ca.c 28 Nov 2017 00:20:23 -0000 1.31 +++ ca.c 24 Dec 2017 18:01:20 -0000 @@ -266,9 +266,15 @@ ca_dispatch_relay(int fd, struct privsep break; } + if (cko.cko_tlen == -1) { + char buf[256]; + log_warnx("%s: %s", __func__, + ERR_error_string(ERR_get_error(), buf)); + } + iov[c].iov_base = &cko; iov[c++].iov_len = sizeof(cko); - if (cko.cko_tlen) { + if (cko.cko_tlen > 0) { iov[c].iov_base = to; iov[c++].iov_len = cko.cko_tlen; } @@ -381,12 +387,12 @@ rsae_send_imsg(int flen, const u_char *f IMSG_SIZE_CHECK(&imsg, (&cko)); memcpy(&cko, imsg.data, sizeof(cko)); - if (IMSG_DATA_SIZE(&imsg) != - (sizeof(cko) + cko.cko_tlen)) - fatalx("data size"); ret = cko.cko_tlen; - if (ret) { + if (ret > 0) { + if (IMSG_DATA_SIZE(&imsg) != + (sizeof(cko) + ret)) + fatalx("data size"); toptr = (u_char *)imsg.data + sizeof(cko); memcpy(to, toptr, ret); } ----- End forwarded message ----- -- :wq Claudio