Hi! Thanks for enlightening me, I’ll try to fix the real problem.
Rivo

> On 13 Dec 2017, at 08:42, Claudio Jeker <cje...@diehard.n-r-g.com> wrote:
>
>> On Wed, Dec 13, 2017 at 12:25:39AM +0000, Rivo Nurges wrote:
>> Hi!
>>
>> If you http PUT a "big" file through relayd, server<>relay read side
>> will eventually get an EVBUFFER_TIMEOUT. Nothing comes back from the
>> server until the PUT is done. I disabled server read timeouts for PUT
>> requests.
>>
>> While trying to fix the issue I managed to trigger another problem. For
>> HTTP relays we open relay<>server connection only after the first
>> request is completely read from the client. If http PUT is the
>> first request and is big enough we will run out of memory and
>> eventually out of swap. To avoid the issue I will open relay<>server
>> connection earlier and let relayd start sending the stuff to the
>> server.
>>
>> And another one I don't know how to fix. If relayd fills all memory and
>> swap with buffers kernel enters infinite loop. relayd is in flt_noram
>> state and pagedaemon constantly tries to free something without any
>> luck. userland scheduling halts. bgp loses its peers but carp still
>> happily sends its hellos...
>>
>
> I have seen something similar and came to the conclusion that the timeout
> handling of relayd is not correct. As long as traffic is flowing the
> timeout should be reset (at least that is what every other implementation
> does). This is not really happening in relayd. I have seen this on GET
> requests that are huge (timeout hits in the middle of the transmit and
> kills the session).
>
> Because of this I think the diff is a workaround and does not solve the
> real underlying problem.
>
>> Rivo
>>
>> Index: usr.sbin/relayd/relay.c
>> ===================================================================
>> RCS file: /cvs/src/usr.sbin/relayd/relay.c,v
>> retrieving revision 1.236
>> diff -u -p -r1.236 relay.c
>> --- usr.sbin/relayd/relay.c 28 Nov 2017 01:51:47 -0000 1.236
>> +++ usr.sbin/relayd/relay.c 13 Dec 2017 00:05:33 -0000
>> @@ -723,7 +723,8 @@ relay_connected(int fd, short sig, void
>>          relay_tls_connected(out);
>>
>>      bufferevent_settimeout(bev,
>> -        rlay->rl_conf.timeout.tv_sec, rlay->rl_conf.timeout.tv_sec);
>> +        con->se_out.writeonly ? 0 : rlay->rl_conf.timeout.tv_sec,
>> +        rlay->rl_conf.timeout.tv_sec);
>>      bufferevent_setwatermark(bev, EV_WRITE,
>>          RELAY_MIN_PREFETCHED * proto->tcpbufsiz, 0);
>>      bufferevent_enable(bev, EV_READ|EV_WRITE);
>> Index: usr.sbin/relayd/relay_http.c
>> ===================================================================
>> RCS file: /cvs/src/usr.sbin/relayd/relay_http.c,v
>> retrieving revision 1.70
>> diff -u -p -r1.70 relay_http.c
>> --- usr.sbin/relayd/relay_http.c 27 Nov 2017 16:25:50 -0000 1.70
>> +++ usr.sbin/relayd/relay_http.c 13 Dec 2017 00:05:33 -0000
>> @@ -439,6 +439,10 @@ relay_read_http(struct bufferevent *bev,
>>          case HTTP_METHOD_OPTIONS:
>>          case HTTP_METHOD_POST:
>>          case HTTP_METHOD_PUT:
>> +            con->se_out.writeonly = 1;
>> +            if(cre->dst->state == STATE_CONNECTED)
>> +                bufferevent_settimeout(bev,
>> +                    0, rlay->rl_conf.timeout.tv_sec);
>>          case HTTP_METHOD_RESPONSE:
>>          /* WebDAV methods */
>>          case HTTP_METHOD_PROPFIND:
>> @@ -569,6 +573,9 @@ relay_read_httpcontent(struct buffereven
>>                  goto fail;
>>              cre->toread -= size;
>>          }
>> +        if (cre->dst->writeonly && cre->dst->state != STATE_CONNECTED)
>> +            if (relay_connect(con) == -1)
>> +                goto fail;
>>          DPRINTF("%s: done, size %lu, to read %lld", __func__,
>>              size, cre->toread);
>>      }
>> Index: usr.sbin/relayd/relayd.h
>> ===================================================================
>> RCS file: /cvs/src/usr.sbin/relayd/relayd.h,v
>> retrieving revision 1.248
>> diff -u -p -r1.248 relayd.h
>> --- usr.sbin/relayd/relayd.h 28 Nov 2017 18:25:53 -0000 1.248
>> +++ usr.sbin/relayd/relayd.h 13 Dec 2017 00:05:33 -0000
>> @@ -218,6 +218,7 @@ struct ctl_relay_event {
>>      int line;
>>      int done;
>>      int timedout;
>> +    int writeonly;
>>      enum relay_state state;
>>      enum direction dir;
>>
>
> --
> :wq Claudio
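
For reference, the reset-on-traffic behaviour Claudio describes can be sketched independently of libevent. This is only an illustration under assumptions: struct idle_timer and the idle_timer_* helpers are hypothetical names, not relayd or libevent code, and a real fix would have to hook into the existing bufferevent callbacks instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical per-session idle tracker; not a relayd structure. */
struct idle_timer {
	time_t	timeout;	/* allowed idle seconds, 0 disables the check */
	time_t	last_activity;	/* time of the most recent read or write */
};

static void
idle_timer_init(struct idle_timer *t, time_t timeout, time_t now)
{
	assert(timeout >= 0);
	t->timeout = timeout;
	t->last_activity = now;
}

/* Call whenever bytes move in either direction on the session. */
static void
idle_timer_touch(struct idle_timer *t, time_t now)
{
	t->last_activity = now;
}

/* True only if nothing has moved for longer than the timeout. */
static bool
idle_timer_expired(const struct idle_timer *t, time_t now)
{
	if (t->timeout == 0)
		return false;
	return (now - t->last_activity) > t->timeout;
}
```

With this kind of bookkeeping, a PUT that keeps writing to the server would call idle_timer_touch() on each write, so the session would not expire merely because no response bytes have arrived yet. Only a session with no traffic in either direction would time out.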