Hi All,

Let me first get the formalities out of the way:
stunnel 5.40 on x86_64-unknown-linux-gnu platform
Compiled/running with OpenSSL 1.0.1f 6 Jan 2014
Threading:PTHREAD Sockets:POLL,IPv6 TLS:ENGINE,FIPS,OCSP,PSK,SNI

I compiled the latest version of stunnel today, and yes, I know I am running an 
old/insecure version of OpenSSL.

stunnel.cnf
debug           = debug
pid             = /var/run/stunnel.pid
socket          = l:TCP_NODELAY=1
socket          = r:TCP_NODELAY=1
ciphers         = ALL
options         = NO_SSLv2
fips            = no

[my.service]
accept          = *:501
CAfile          = /etc/stunnel/my.service.ca
cert            = /etc/stunnel/my.service.pem
exec            = /path/to/my/server
TIMEOUTclose    = 0
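
Since there is no output line in the config, the debug output goes to syslog in 
daemon mode, as far as I can tell.  If it would make reviewing easier, I could 
point it at a dedicated file instead, something like this (the path is just an 
example):

output          = /var/log/stunnel/my.service.log
syslog          = no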

Here’s what’s been happening.  I’ve been running stunnel with the above service 
for 4 or 5 years now, BUT always under xinetd.  I never had a single issue, 
stunnel served me well, and I was rather happy.  Lately, however, the number of 
connections (and, more importantly, the rate of incoming connections) has been 
increasing steadily on the server.  After lots of debugging, we determined that 
the resulting load was due to stunnel processes constantly being fired up under 
xinetd, one per connection.  By a lot, I am talking about 20-30+ 
connections/sec.
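
For context, the old per-connection setup looked roughly like this under xinetd 
(reconstructed from memory, so the paths and the name of the inetd-mode config 
are just examples):

service my.service
{
        disable         = no
        type            = UNLISTED
        socket_type     = stream
        protocol        = tcp
        port            = 501
        wait            = no
        user            = root
        server          = /usr/bin/stunnel
        server_args     = /etc/stunnel/my.service-inetd.cnf
}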

So today I took the time and changed our entire cluster of 17 servers.  All 
servers were upgraded to the latest version (we were on 5.24 previously), and 
instead of using xinetd I have amended the configurations so that stunnel now 
runs in daemon mode (as root).  For the most part, it works absolutely fine.  
As you can see, the configuration is in maximum debugging mode, so I am pulling 
as much info as I can out of the logs.  Even so, the logs show absolutely 
nothing as to what is happening and why ☹  I’m more than happy to provide the 
logs to someone to look at, but there are THOUSANDS of connections and debug 
entries – it’s large, very large.

After a seemingly random amount of time (from a few minutes to a few hours), 
and after successfully accepting THOUSANDS of connections, stunnel just dies.  
Nothing abnormal is logged, nothing in dmesg, no crash – the process simply 
disappears.  By default stunnel accepts 500 concurrent connections (which is a 
bit low), but I have also confirmed that it is not stunnel running out of file 
descriptors: when it does run out, stunnel logs the appropriate connection 
refused messages and continues to run (i.e. it does not crash – we’ve 
specifically tested that).

# cat /proc/7095/limits 
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             257585               257585               processes 
Max open files            1024                 4096                 files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       257585               257585               signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us        

# ls /proc/7095/fd/ | wc -l
956

We are still well within the limits (from what I understand at least – and 
since we are running as root, it should be the hard limit that applies?).
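
One thing I do notice in the output above is that the core file size limit is 
0, so even a genuine crash wouldn’t leave a core dump.  If it helps, I could 
raise that and try to catch one next time, roughly along these lines (paths are 
just examples):

# allow core dumps before starting stunnel
ulimit -c unlimited
echo '/var/crash/core.%e.%p' > /proc/sys/kernel/core_pattern
/usr/bin/stunnel /etc/stunnel/stunnel.cnf   # adjust binary/config paths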

Apart from the fact that the servers are very busy in terms of incoming 
connections/sec (although ±20/sec surely can’t be that much?), is there 
anything else I could look at?  After moving from xinetd to daemon mode the 
load on the server dropped by more than 60%, so the saving is significant and I 
don’t want to go back to xinetd if I can avoid it.  It also means the machine 
is no longer under any significant strain, so load itself shouldn’t be a factor 
affecting stunnel.
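
In the meantime, something like the following watcher running next to stunnel 
would at least record the exact moment the process disappears and how many 
descriptors it was holding just before (the pid file path is from my config; 
the log path is only an example):

#!/bin/sh
# log the stunnel fd count every few seconds until the process disappears
PID=$(cat /var/run/stunnel.pid)
while kill -0 "$PID" 2>/dev/null; do
    echo "$(date) fds=$(ls /proc/$PID/fd | wc -l)" >> /var/log/stunnel-watch.log
    sleep 5
done
echo "$(date) stunnel (pid $PID) is gone" >> /var/log/stunnel-watch.log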

Thnx,
Chris.




