[Openvpn-devel] weird issue with server failover when Not using keepalive

Jan Just Keijser Fri, 04 Dec 2020 04:00:44 -0800

hey guys,

I'm posting this on behalf of the eduVPN team. François Kooman spent a long time debugging an issue and finally managed to find the piece of code that causes the weird behavior.

Let me explain:

For eduVPN, multiple openvpn instances are offered, both on UDP and TCP ports, and the client config that is used lists all of these instances. The client can then do automatic roll-over to a TCP based setup if UDP is not working (blocked) for some reason.

Now François had *not* set the keepalive option in the TCP setup, as a TCP connection has a keepalive of its own, more or less, and this caused some very odd behaviour:

1) the client tries to connect to a UDP based server; server is down/blocked, hence openvpn does a failover to the next remote entry

2) openvpn connects but after exactly 2 minutes the connection is restarted

3) the reconnects keep happening every 2 minutes, suggesting it is a ping-restart/keepalive setting

We've tracked this down to the following piece of code, which has been present in the OpenVPN code base since v2.1 (which was the first version to support connection entries). File is init.c, here from v2.4.9:


static void
update_options_ce_post(struct options *options)
{
#if P2MP
    /*
     * In pull mode, we usually import --ping/--ping-restart parameters from
     * the server. However we should also set an initial default --ping-restart
     * for the period of time before we pull the --ping-restart parameter
     * from the server.
     */
    if (options->pull
        && options->ping_rec_timeout_action == PING_UNDEF
        && proto_is_dgram(options->ce.proto))
    {
        options->ping_rec_timeout = PRE_PULL_INITIAL_PING_RESTART;
        options->ping_rec_timeout_action = PING_RESTART;
    }
#endif
}

When failing over, this function 'update_options_ce_post' is called and, for UDP based connections, the ping_rec_timeout is updated.

*Why?*

Note that ping_rec_timeout is a GLOBAL option and affects all connection entries, both TCP and UDP based. Comment out the call to 'update_options_ce_post' and the restarts are gone.
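
To make the "global option" point concrete, here is a stripped-down model of the situation. It is a sketch, not OpenVPN source: the struct layouts, the enum values and the helper names (model_update_options_ce_post, timeout_after_trying) are made up for illustration, and the value 120 is an assumption chosen only to match the two-minute interval reported above. It shows nothing more than how a timeout set while a UDP entry is selected stays in effect once a later TCP entry is selected; it does not model the pull phase or anything else OpenVPN does around this call.

```c
/* Simplified model of the behaviour described above. NOT OpenVPN source. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real constants. */
enum { MODEL_PING_UNDEF = 0, MODEL_PING_RESTART = 1 };
#define MODEL_INITIAL_PING_RESTART 120 /* assumed: matches the observed 2 minutes */

enum model_proto { MODEL_PROTO_UDP, MODEL_PROTO_TCP };

struct model_conn_entry {
    enum model_proto proto;
};

struct model_options {
    bool pull;
    int ping_rec_timeout;        /* global: shared by every connection entry */
    int ping_rec_timeout_action; /* global as well */
    struct model_conn_entry ce;  /* the currently selected connection entry */
};

/* Same shape as the quoted function: only acts for datagram (UDP) entries. */
static void
model_update_options_ce_post(struct model_options *o)
{
    if (o->pull
        && o->ping_rec_timeout_action == MODEL_PING_UNDEF
        && o->ce.proto == MODEL_PROTO_UDP)
    {
        o->ping_rec_timeout = MODEL_INITIAL_PING_RESTART;
        o->ping_rec_timeout_action = MODEL_PING_RESTART;
    }
}

/*
 * Select each remote in turn, as a failover would, and return the
 * ping-restart timeout in effect once the last entry is selected.
 */
static int
timeout_after_trying(const enum model_proto *remotes, size_t n)
{
    struct model_options o = {
        .pull = true,
        .ping_rec_timeout = 0,
        .ping_rec_timeout_action = MODEL_PING_UNDEF,
    };
    for (size_t i = 0; i < n; i++)
    {
        o.ce.proto = remotes[i];
        model_update_options_ce_post(&o);
    }
    return o.ping_rec_timeout;
}

/* Usage: the UDP default leaks into the TCP entry that follows it. */
static void
check_model(void)
{
    const enum model_proto udp_then_tcp[] = { MODEL_PROTO_UDP, MODEL_PROTO_TCP };
    const enum model_proto tcp_only[] = { MODEL_PROTO_TCP };

    assert(timeout_after_trying(udp_then_tcp, 2) == MODEL_INITIAL_PING_RESTART);
    assert(timeout_after_trying(tcp_only, 1) == 0);
}
```

In this model the TCP entry never sets a timeout itself; it simply inherits whatever the earlier UDP attempt left behind in the shared options.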

Shall we just comment out/remove this particular piece of code altogether?

JJK

PS This leads me to think that perhaps the ping-* options should be made connection-entry specific. That way, you can have different behavior for TCP based setups and UDP based setups. Also note (see below) that this problem also affects failover from one UDP based server to the next, if --keepalive is disabled, so it's not "just" UDP vs TCP.
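
As a rough idea of what "connection-entry specific" could look like, here is a purely hypothetical sketch. None of these types or functions exist in OpenVPN; the names (ce_sketch, ce_ping, apply_pre_pull_default) and the value 120 are assumptions made up for illustration. The only point is that the pre-pull default is stored in the entry it was computed for, so it cannot carry over to another entry.

```c
/* Hypothetical per-connection-entry ping settings. Not an OpenVPN structure. */
#include <assert.h>
#include <stdbool.h>

enum ping_action_sketch { PING_ACTION_UNDEF, PING_ACTION_RESTART };
enum ce_proto_sketch { CE_PROTO_UDP, CE_PROTO_TCP };

#define SKETCH_INITIAL_PING_RESTART 120 /* assumed value, for illustration only */

struct ce_ping {
    int rec_timeout;
    enum ping_action_sketch action;
};

struct ce_sketch {
    enum ce_proto_sketch proto;
    struct ce_ping ping; /* lives in the entry, not in the global options */
};

/* Apply the pre-pull default to this entry only. */
static void
apply_pre_pull_default(struct ce_sketch *ce, bool pull)
{
    if (pull
        && ce->ping.action == PING_ACTION_UNDEF
        && ce->proto == CE_PROTO_UDP)
    {
        ce->ping.rec_timeout = SKETCH_INITIAL_PING_RESTART;
        ce->ping.action = PING_ACTION_RESTART;
    }
}

/* Usage: the UDP entry gets the default, the TCP entry is left alone. */
static void
check_per_entry(void)
{
    struct ce_sketch udp = { CE_PROTO_UDP, { 0, PING_ACTION_UNDEF } };
    struct ce_sketch tcp = { CE_PROTO_TCP, { 0, PING_ACTION_UNDEF } };

    apply_pre_pull_default(&udp, true);
    apply_pre_pull_default(&tcp, true);

    assert(udp.ping.rec_timeout == SKETCH_INITIAL_PING_RESTART);
    assert(tcp.ping.rec_timeout == 0);
}
```

This would only keep TCP entries separate from UDP ones; every UDP entry would still get the default, so the UDP-to-UDP case mentioned above would need more than this.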


-------------------------
For completeness:

I managed to recreate this setup and I can even get the same odd behaviour in a 100% UDP based setup.

Server config:
###############
proto udp
port 1194
dev tun
server 10.200.0.0 255.255.255.0
dh       dh2048.pem
ca       ca.crt
cert     server.crt
key      server.key
persist-key
persist-tun
topology subnet
user  nobody
group nobody  # use "group nogroup" on some distros
cipher aes-256-cbc
auth   sha256
###############

(yes, I know the server blurts out
  WARNING: --keepalive option is missing from server config
on startup)

Client config:
###############
client
remote <server> 1195 udp  ## use a non available port first
remote <server> 1194 udp
### remote <server> 1195 tcp
dev tun
nobind
remote-cert-tls server
ca       ca.crt
cert     client1.crt
key      client1.key
cipher aes-256-cbc
auth   sha256
################

So the client first tries to connect to a (non-existent) server, and then fails over to the second entry, and we get a connection. Then, every 2 minutes we get a connection restart.


If I change the client config to list only a single
  remote <server> 1194 udp
line then this reconnect behavior does NOT occur ?!?!?!?

