Re: relayd closing sessions prematurely

Mike Belopuhov Wed, 16 Oct 2019 15:18:43 -0700

Hi,


Pascal Guitierrez writes:

> I seem to be having the same issue as this post:
> https://marc.info/?l=openbsd-misc&m=156080861431000&w=2
>
>
> however even with the suggestion of disabling socket splicing i'm still
> hitting the same problem, which seems to be an hard timeout where the
> relayd session is killed and then subsequently the HTTP session is cut and
> the download from relayd fails, even if there's active traffic travelling
> over the session.
>
>
> To reproduce:
>
> # create a 100MB random file as the site index
>
> dd if=/dev/random of=/var/www/index.html bs=1M count=100
>
>
> httpd.conf:
>
> server "test" {
>
> listen on lo0 port 8080
>
> }
>
>
> relayd.conf:
>
> table <webhosts> {127.0.0.1}
>
> http protocol "web" {
>
> pass
>
> #tcp {no splice}
>
> }
>
> relay "localweb" {
>
> listen on lo0 port 80
>
> protocol "web"
>
> session timeout 5
>
> forward to <webhosts> port 8080
>
> }
>
>
> then to test i use: wget -O /dev/null --limit-rate 1k http://localhost
>
>
> from benno@'s post, socket splicing means relayd cannot see the traffic on
> the socket to know if it's idle
>
> so the behaviour should be that:
>
> 1. with splicing: the relayd session should be implicitly killed after 5
> seconds
>
> 2. no splicing: the relayd session should keep running until the HTTP
> transfer is completed as it's actively got data travelling over it every 5
> seconds or less
>
>
> are my assumptions correct?
>
>
> here are the results:
>
>
> with splice:
>
> relayd session is killed after 10 seconds, idle timeout is reset every 5
> seconds
>
> here's what relayctl show sessions looks like:
>
> age: 01, idle: 01
>
> age: 05, idle: 00
>
> age: 09, idle: 04
>
> ... then when age = 10 session is killed (wget fails the fetch after a
> total of 35 seconds from its execution)
>
>
> no splice:
>
> relayd session is killed after 5 seconds
>
> here's what relayctl show sessions looks like:
>
> age: 01, idle: 01
>
> age: 04, idle: 04
>
> ... then when age = 5 session is killed (wget fails the fetch after a total
> of 39 seconds from its execution)
>
>
> This seems strange, is this expected behaviour?


We're experiencing a similar issue but with a spliced TCP relay.
I've tracked it down to a chunk in relay_statistics where it iterates
the session tree and attempts to close connections that have exceeded
their timeout. However, this are what regularly scheduled session
timeouts are for, aren't they? And besides there's a lot more than a
timercmp that's happening there. Thus the whole chunk feels out of place.

Would be nice to get a clarification on this.

diff --git usr.sbin/relayd/relay.c usr.sbin/relayd/relay.c
index 21efe259e05..7c898158f11 100644
--- usr.sbin/relayd/relay.c
+++ usr.sbin/relayd/relay.c
@@ -369,11 +369,10 @@ relay_statistics(int fd, short events, void *arg)
        struct privsep          *ps = arg;
        struct relay            *rlay;
        struct ctl_stats         crs, *cur;
        struct timeval           tv, tv_now;
        int                      resethour = 0, resetday = 0;
-       struct rsession         *con, *next_con;
 
        /*
         * This is a hack to calculate some average statistics.
         * It doesn't try to be very accurate, but could be improved...
         */
@@ -411,19 +410,10 @@ relay_statistics(int fd, short events, void *arg)
 
                crs.id = rlay->rl_conf.id;
                crs.proc = ps->ps_instance;
                proc_compose(env->sc_ps, PROC_PFE, IMSG_STATISTICS,
                    &crs, sizeof(crs));
-
-               for (con = SPLAY_ROOT(&rlay->rl_sessions);
-                   con != NULL; con = next_con) {
-                       next_con = SPLAY_NEXT(session_tree,
-                           &rlay->rl_sessions, con);
-                       timersub(&tv_now, &con->se_tv_last, &tv);
-                       if (timercmp(&tv, &rlay->rl_conf.timeout, >=))
-                               relay_close(con, "hard timeout", 1);
-               }
        }
 
        /* Schedule statistics timer */
        evtimer_set(&env->sc_statev, relay_statistics, ps);
        bcopy(&env->sc_conf.statinterval, &tv, sizeof(tv));

Re: relayd closing sessions prematurely

Reply via email to