Bug#849417: [Pkg-nagios-devel] Bug#849417: nagios-nrpe-server: segfault during SSL negotiation with older NRPE 2.15 plugin

2017-01-01 Thread Sebastiaan Couwenberg
tags 849417 - unreproducible moreinfo
tags 849417 + upstream
forwarded 849417 https://github.com/NagiosEnterprises/nrpe/issues/91
thanks

Hi Adam,

Thanks for the additional debugging. I've now been able to reproduce the
issue on a Debian unstable VM, and have forwarded the issue upstream.

On 01/01/2017 04:07 AM, Adam Di Carlo wrote:
> Sebastiaan Couwenberg  writes:
> 
>> The debug symbols are already available, no need for a rebuild. Just
>> install the nagios-nrpe-server-dbgsym package.
> [...]
> 
> Thanks, that system is new to me!

Debug packages have existed for quite some time; the automatic dbgsym
packages are new in stretch, see: https://wiki.debian.org/DebugPackage

> Let me know if you're still stumped.   I think my next step would be to
> have to try to hack sources and come up with a diff which fixes matters.

That would be excellent, please forward your proposed fix upstream.

> Also, I'm clearly missing some debug symbols, covering
> .../sysdeps/x86_64/strlen.S, but not sure what package I need to install
> to cover that.

You need the libc source for that.

Kind Regards,

Bas

-- 
 GPG Key ID: 4096R/6750F10AE88D4AF1
Fingerprint: 8182 DE41 7056 408D 6146  50D1 6750 F10A E88D 4AF1



Bug#849417: [Pkg-nagios-devel] Bug#849417: nagios-nrpe-server: segfault during SSL negotiation with older NRPE 2.15 plugin

2016-12-31 Thread Adam Di Carlo
Sebastiaan Couwenberg  writes:

> The debug symbols are already available, no need for a rebuild. Just
> install the nagios-nrpe-server-dbgsym package.
[...]

Thanks, that system is new to me!

>>> Due to the signal handler in NRPE you won't easily get a backtrace since
>>> SIGSEGV is caught too and NRPE just continues instead of terminating. If
>>> you can get a backtrace (with debug symbols installed) that would be
>>> helpful.

It didn't really give me too much trouble.  I think gdb replaces all the
signal handlers anyhow.

To recap the current behavior, in case things changed subtly here,
here's the logging I get in daemon.log with ssl_logging set to 0x0f:

Dec 31 21:37:22 salsa nrpe[24931]: Allowing connections from: 127.0.0.1,192.168.1.5
Dec 31 21:37:27 salsa nrpe[24935]: Connection from 192.168.1.5 port 42463
Dec 31 21:37:27 salsa nrpe[24935]: Host address is in allowed_hosts
Dec 31 21:37:27 salsa nrpe[24935]: Error: Could not complete SSL handshake with 192.168.1.5: 1
Dec 31 21:37:27 salsa nrpe[24935]: Connection from 192.168.1.5 closed.


Whereas if I set it to 0xff:
Dec 31 21:36:23 salsa nrpe[24897]: Allowing connections from: 127.0.0.1,192.168.1.5
Dec 31 21:36:30 salsa nrpe[24899]: Connection from 192.168.1.5 port 41951
Dec 31 21:36:30 salsa nrpe[24899]: Host address is in allowed_hosts

and then in kern.log:
Dec 31 21:36:30 salsa kernel: [632644.965865] nrpe[24899]: segfault at b0935335 ip 7f3fafd3d496 sp 7ffee43c9dc8 error 4 in libc-2.24.so[7f3fafcbd000+195000]
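
A kernel segfault line like the one above can be reduced to an offset inside
the library, which makes two crashes comparable even though the library is
loaded at a different address each time. The following is only a sketch: the
helper name fault_offset and the regular expression are illustrative, and it
assumes the bracketed address is the start of the mapping that contains the
faulting instruction. The second sample line is the one from the original
report of Dec 26 in this thread.

```python
import re

# Matches the x86 kernel segfault report in the form seen in this thread.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+)"
    r" error (?P<err>\d+) in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (object name, instruction pointer minus mapping base)."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    return m.group("obj"), int(m.group("ip"), 16) - int(m.group("base"), 16)


# Both lines are copied from the thread, with the mail line wrapping undone.
line_dec31 = ("nrpe[24899]: segfault at b0935335 ip 7f3fafd3d496 sp 7ffee43c9dc8 "
              "error 4 in libc-2.24.so[7f3fafcbd000+195000]")
line_dec26 = ("nrpe[14736]: segfault at 5335 ip 7fd44f408496 sp 7ffd5abfb418 "
              "error 4 in libc-2.24.so[7fd44f388000+195000]")

for line in (line_dec31, line_dec26):
    obj, off = fault_offset(line)
    print(obj, hex(off))
```

Both lines come out at offset 0x80496 in libc-2.24.so, which suggests (but
does not prove) that the two reports hit the same instruction in libc.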


Here's my gdb session and the best backtrace I was able to get out:

# gdb /usr/sbin/nrpe 24967
(gdb) set follow-fork-mode child
(gdb) c
Continuing.
[New process 25047]
[New process 25048]

Thread 3.1 "nrpe" received signal SIGSEGV, Segmentation fault.
[Switching to process 25048]
strlen () at ../sysdeps/x86_64/strlen.S:106
106 ../sysdeps/x86_64/strlen.S: No such file or directory.
#0  strlen () at ../sysdeps/x86_64/strlen.S:106
#1  0x7fc8e3c34da3 in _IO_vfprintf_internal (s=s@entry=0x561cf790d280,
    format=<optimized out>,
    format@entry=0x561cf6be9eb8 "Error: Could not complete SSL handshake with %s: %s",
    ap=0x7fff6996e188) at vfprintf.c:1637
#2  0x7fc8e3ce2f66 in ___vfprintf_chk (fp=fp@entry=0x561cf790d280, flag=flag@entry=1,
    format=format@entry=0x561cf6be9eb8 "Error: Could not complete SSL handshake with %s: %s",
    ap=ap@entry=0x7fff6996e188) at vfprintf_chk.c:33
#3  0x7fc8e3ccfad8 in __GI___vsyslog_chk (pri=<optimized out>, flag=1,
    fmt=0x561cf6be9eb8 "Error: Could not complete SSL handshake with %s: %s",
    ap=ap@entry=0x7fff6996e188) at ../misc/syslog.c:222
#4  0x7fc8e3ccffd2 in __syslog_chk (pri=<optimized out>, flag=<optimized out>,
    fmt=<optimized out>) at ../misc/syslog.c:129
#5  0x561cf6be51ba in syslog (
    __fmt=0x561cf6be9eb8 "Error: Could not complete SSL handshake with %s: %s",
    __pri=3) at /usr/include/x86_64-linux-gnu/bits/syslog.h:31
#6  handle_conn_ssl (sock=<optimized out>, ssl_ptr=0x561cf78f7b70) at ./nrpe.c:1753
#7  0x561cf6be6a53 in handle_connection (sock=6) at ./nrpe.c:1491
#8  0x561cf6be7085 in wait_for_connections () at ./nrpe.c:1198
#9  0x561cf6be71c3 in run_src () at ./nrpe.c:506
#10 0x561cf6be288c in main (argc=<optimized out>, argv=<optimized out>) at ./nrpe.c:198

(gdb) frame 6
#6  handle_conn_ssl (sock=<optimized out>, ssl_ptr=0x561cf78f7b70) at ./nrpe.c:1753
1753 ./nrpe.c: No such file or directory.
nerrs = 0
c = <optimized out>
buffer = 
"\000\000\000\000\000\000\000\000\324\006\000\000\000\000\000\000\250\310\311\344\310\177\000\000\220\375\276\343\310\177\000\000\070п\343\310\177\000\000SI\250\344\310\177\000\000\324\006\000\000\000\000\000\000\070п\343\310\177\000\000\250\310\311\344\310\177\000\000\070\343\226i\377\177\000\000\064\343\226i\377\177\000\000\313B\250\344\310\177\000\000\020\265\370\343\310\177\000\000(\252\370\343\310\177\000\000\070\343\226i\377\177\000\000\066\025\025e\000\000\000\000TT\224\001\000\000\000\000\070п\343\310\177\000\000\020\344\226i\377\177\000\000\220\375\276\343\310\177\000\000\064\343\226i\377\177\000\000\000\344\226i\377\177\000\000PF\306\344\310\177\000\000\b",
 '\000' ...
ssl = 0x561cf78f7b70
peer = <optimized out>
rc = <optimized out>
x = <optimized out>


Let me know if you're still stumped.   I think my next step would be to
have to try to hack sources and come up with a diff which fixes matters.

Also, I'm clearly missing some debug symbols, covering
.../sysdeps/x86_64/strlen.S, but not sure what package I need to install
to cover that.

-- 
...Adam Di Carlo......



Bug#849417: [Pkg-nagios-devel] Bug#849417: nagios-nrpe-server: segfault during SSL negotiation with older NRPE 2.15 plugin

2016-12-28 Thread Sebastiaan Couwenberg
On 12/28/2016 07:07 PM, Adam Di Carlo wrote:
> Sebastiaan Couwenberg  writes:
> 
>> As documented in /usr/share/doc/nagios-nrpe-server/NEWS.Debian.gz which
>> is shown to you on upgrade when you have apt-listchanges installed:
> [...]
>>   Beware that the new NRPE daemon only works with old check_nrpe
>>   plugins when SSL support is disabled on both sides, likewise the
>>   new check_nrpe plugin only works with the old NRPE daemon when SSL
>>   support is disabled.
> 
> Oh!  I totally didn't see that.  Ok.  So what I'm trying to do will
> never work and I need to disable SSL for all NRPE servers as well as on
> my (Jessie) nagios server.

You only need to disable SSL for NRPE >= 3.0. The SSL support for NRPE
2.x still works.

For example, on my jessie server I changed the check_nrpe commands to
match the configuration in NRPE 3.x (see attached check_nrpe.cfg) by
modifying /etc/nagios-plugins/config/check_nrpe.cfg.

In the service configuration I changed all check_nrpe_1arg commands to
check_nrpe_ssl, and for the hosts running testing/unstable I changed it
to check_nrpe. Once the jessie systems get upgraded to stretch their
service configuration will be changed to use check_nrpe instead of
check_nrpe_ssl too.
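
The per-host choice described above can be written down as a small rule. This
is only a sketch of that rule as stated in this mail and in NEWS.Debian: the
function name pick_command and its parameters are made up for illustration,
and it assumes check_nrpe.cfg has been changed as in the attached file, so
that check_nrpe passes -n and check_nrpe_ssl does not.

```python
def pick_command(daemon_major, plugin_major=2):
    """Pick the check_nrpe.cfg command for one monitored host.

    daemon_major: major version of the NRPE daemon on the monitored host.
    plugin_major: major version of check_nrpe on the Nagios server
                  (2 for the jessie server discussed in this thread).
    """
    if daemon_major >= 3 or plugin_major >= 3:
        # As soon as either side runs NRPE 3.x, use the command with -n.
        # For a 3.x plugin talking to a 2.x daemon, the daemon must have
        # SSL disabled as well; that part is outside this function.
        return "check_nrpe"
    # Both sides are NRPE 2.x, where the old SSL support still works.
    return "check_nrpe_ssl"


# Hosts still on jessie keep SSL; testing/unstable hosts get the -n command.
for host, major in (("jessie-host", 2), ("unstable-host", 3)):
    print(host, pick_command(major))
```

Two NRPE 3.x endpoints could use the reworked SSL support once it has been
configured, but the sketch follows the Debian default of leaving it disabled.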

>>   To use SSL between the NRPE client and server, configuring Stunnel
>>   is recommended.
> 
> I suppose that disabling SSL, so long as I also disable the NRPE
> argument processing on the older NRPEs which allow it, won't create too
> many security issues on an internal network.  The most an attacker could
> do, assuming they could spoof the one allowed IP that commands can
> come from, is run the checks configured on the NRPE server.  So, there
> is a denial-of-service risk here but not much more than that.

The SSL support in NRPE 2.x never got you much security on your internal
network (it did not verify the hostname for example), it mostly
obfuscated tcpdumps. Disabling the NRPE arguments brings you much more
security than the (broken) SSL support in NRPE 2.x.

> Pardon me for failing to RTM here.
> 
>> Due to the signal handler in NRPE you won't easily get a backtrace since
>> SIGSEGV is caught too and NRPE just continues instead of terminating. If
>> you can get a backtrace (with debug symbols installed) that would be
>> helpful.
> 
> Ok, I'll give it a whack.  Let's leave the bug in "moreinfo" until I get
> that.  I do believe I need to rebuild the package with '-g' to get
> symbols out, which I've done.  Off to work for now but I'll give this
> another attempt, should have result by no later than end of day tomorrow.

The debug symbols are already available, no need for a rebuild. Just
install the nagios-nrpe-server-dbgsym package. You may need to configure
the sources for that first, e.g. for unstable:

# Debug packages
deb http://debug.mirrors.debian.org/debian-debug/ unstable-debug main contrib non-free

Kind Regards,

Bas

-- 
 GPG Key ID: 4096R/6750F10AE88D4AF1
Fingerprint: 8182 DE41 7056 408D 6146  50D1 6750 F10A E88D 4AF1

# this command runs a program $ARG1$ with no arguments and disables SSL support
define command {
        command_name    check_nrpe
        command_line    /usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -n
}

# this command runs a program $ARG1$ with no arguments and enables SSL support
define command {
        command_name    check_nrpe_ssl
        command_line    /usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}


Bug#849417: [Pkg-nagios-devel] Bug#849417: nagios-nrpe-server: segfault during SSL negotiation with older NRPE 2.15 plugin

2016-12-28 Thread Adam Di Carlo
Sebastiaan Couwenberg  writes:

> As documented in /usr/share/doc/nagios-nrpe-server/NEWS.Debian.gz which
> is shown to you on upgrade when you have apt-listchanges installed:
[...]
>   Beware that the new NRPE daemon only works with old check_nrpe
>   plugins when SSL support is disabled on both sides, likewise the
>   new check_nrpe plugin only works with the old NRPE daemon when SSL
>   support is disabled.

Oh!  I totally didn't see that.  Ok.  So what I'm trying to do will
never work and I need to disable SSL for all NRPE servers as well as on
my (Jessie) nagios server.

>   To use SSL between the NRPE client and server, configuring Stunnel
>   is recommended.

I suppose that disabling SSL, so long as I also disable the NRPE
argument processing on the older NRPEs which allow it, won't create too
many security issues on an internal network.  The most an attacker could
do, assuming they could spoof the one allowed IP that commands can
come from, is run the checks configured on the NRPE server.  So, there
is a denial-of-service risk here but not much more than that.

Pardon me for failing to RTM here.

> Due to the signal handler in NRPE you won't easily get a backtrace since
> SIGSEGV is caught too and NRPE just continues instead of terminating. If
> you can get a backtrace (with debug symbols installed) that would be
> helpful.

Ok, I'll give it a whack.  Let's leave the bug in "moreinfo" until I get
that.  I do believe I need to rebuild the package with '-g' to get
symbols out, which I've done.  Off to work for now but I'll give this
another attempt, should have result by no later than end of day tomorrow.

-- 
...Adam Di Carlo......



Bug#849417: [Pkg-nagios-devel] Bug#849417: nagios-nrpe-server: segfault during SSL negotiation with older NRPE 2.15 plugin

2016-12-27 Thread Sebastiaan Couwenberg
On 12/28/2016 05:41 AM, Adam Di Carlo wrote:
> Sebastiaan Couwenberg  writes:
> 
>>> -- Configuration Files:
>>> /etc/default/nagios-nrpe-server changed:
>>> USE_SSL=1
>>
>> Please note that the /etc/default/nagios-nrpe-server changed in
>> nagios-nrpe (3.0.1-3) because of the systemd service file.
>>
>> The USE_SSL option is no longer used, instead the NRPE_OPTS variable is
>> used to disable SSL in both the init script and systemd service file.
>> The default content is now as attached.
> 
> Gotit.
> 
> I'll work my way through your instructions, attempt to fix my interop
> issue.  It's always *overconfiguration* that gets me.

As documented in /usr/share/doc/nagios-nrpe-server/NEWS.Debian.gz which
is shown to you on upgrade when you have apt-listchanges installed:

"
  SSL support is disabled by default, the reworked SSL/TLS support in
  NRPE requires configuration before it can be used. Read the
  instructions in /usr/share/doc/nagios-nrpe-server/README.SSL.md.gz
  before enabling SSL support in /etc/default/nagios-nrpe-server.

  The default check_nrpe command in check_nrpe.cfg has been updated
  to disable SSL by default too. The check_nrpe_ssl command has been
  added to connect to the NRPE daemon over SSL.

  Beware that the new NRPE daemon only works with old check_nrpe
  plugins when SSL support is disabled on both sides, likewise the
  new check_nrpe plugin only works with the old NRPE daemon when SSL
  support is disabled.

  To use SSL between the NRPE client and server, configuring Stunnel
  is recommended.
"

Once all systems have upgraded to NRPE 3.x, using its SSL support is an
option, but that will take some time (no other distributions have
upgraded to 3.x yet).

> Thank you for taking the time to help!
> 
> 
> However, no matter my legacy misconfig, isn't it still problematic to
> segfault like this?  Let me know if a backtrace would help.

Due to the signal handler in NRPE you won't easily get a backtrace since
SIGSEGV is caught too and NRPE just continues instead of terminating. If
you can get a backtrace (with debug symbols installed) that would be
helpful.

Kind Regards,

Bas

-- 
 GPG Key ID: 4096R/6750F10AE88D4AF1
Fingerprint: 8182 DE41 7056 408D 6146  50D1 6750 F10A E88D 4AF1



Bug#849417: [Pkg-nagios-devel] Bug#849417: nagios-nrpe-server: segfault during SSL negotiation with older NRPE 2.15 plugin

2016-12-27 Thread Adam Di Carlo
Sebastiaan Couwenberg  writes:

>> -- Configuration Files:
>> /etc/default/nagios-nrpe-server changed:
>> USE_SSL=1
>
> Please note that the /etc/default/nagios-nrpe-server changed in
> nagios-nrpe (3.0.1-3) because of the systemd service file.
>
> The USE_SSL option is no longer used, instead the NRPE_OPTS variable is
> used to disable SSL in both the init script and systemd service file.
> The default content is now as attached.

Gotit.

I'll work my way through your instructions, attempt to fix my interop
issue.  It's always *overconfiguration* that gets me.

Thank you for taking the time to help!


However, no matter my legacy misconfig, isn't it still problematic to
segfault like this?  Let me know if a backtrace would help.

-- 
...Adam Di Carlo......



Bug#849417: [Pkg-nagios-devel] Bug#849417: nagios-nrpe-server: segfault during SSL negotiation with older NRPE 2.15 plugin

2016-12-27 Thread Adam Di Carlo
Sebastiaan Couwenberg  writes:

> Thanks for reporting this issue. Unfortunately I cannot reproduce it.

Oh dear.

> To help reproduce this issue, can you clarify how nagios-nrpe-server is
> configured. I assume that you configured SSL before removing the -n
> option of the nrpe daemon? Do you use a CA certificate, or
> self-signed?

Hmm, actually I left all those settings (ssl_cacert_file, ssl_cert_file,
ssl_privatekey_file) commented out.

FYI, I'm trying to interoperate with nagios-nrpe-plugin from jessie
(version 2.15-1), which doesn't seem to have any way to configure a CA
or client cert.  Any advice is welcome.

-- 
.Adam Di carloa...@debian.org.



Bug#849417: [Pkg-nagios-devel] Bug#849417: nagios-nrpe-server: segfault during SSL negotiation with older NRPE 2.15 plugin

2016-12-26 Thread Sebastiaan Couwenberg
Control: tags -1 unreproducible moreinfo

Hi Adam,

Thanks for reporting this issue. Unfortunately I cannot reproduce it.

On 12/26/2016 09:06 PM, Adam Di Carlo wrote:
> Given a situation where a debian/stable (Jessie) server is polling an
> NRPE node running the latest unstable NRPE server, with all debugging
> enabled (ssl_logging=-1), I am getting the following segfault, as reported in
> /var/log/syslog:
> 
> Dec 26 14:49:38 salsa nrpe[14736]: Connection from 192.168.1.5 port 59564
> Dec 26 14:49:38 salsa nrpe[14736]: Host address is in allowed_hosts
> Dec 26 14:49:38 salsa kernel: [176235.037105] nrpe[14736]: segfault at 5335 ip 7fd44f408496 sp 7ffd5abfb418 error 4 in libc-2.24.so[7fd44f388000+195000]
> 
> However, if I ratchet down the SSL debugging, e.g., ssl_logging=0x03,
> the segfault disappears.
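
The ssl_logging values reported in this thread (-1 here and 0xff in the Dec 31
follow-up crash; 0x03 here and 0x0f in the follow-up do not) can be compared
bit by bit to narrow down which logging bits might be involved. The helper
below is a hypothetical sketch, not part of NRPE, and it assumes that a single
logging bit is enough to trigger the crash; what the individual bits mean is
not stated in this thread and is not assumed here.

```python
from functools import reduce
from operator import and_, or_


def suspect_bits(crashing, working):
    """Bits set in every crashing value and in none of the working values."""
    always_set = reduce(and_, crashing)      # bits common to all crashing runs
    seen_working = reduce(or_, working, 0)   # bits known to be harmless
    return always_set & ~seen_working


# Values taken from this thread.
mask = suspect_bits(crashing=[-1, 0xff], working=[0x03, 0x0f])
print(hex(mask))
```

With these four values the result is 0xf0, so under that assumption the
trigger would be one of the four bits 0x10 to 0x80; testing each of those
values on its own would narrow it down further.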

To help reproduce this issue, can you clarify how nagios-nrpe-server is
configured. I assume that you configured SSL before removing the -n
option of the nrpe daemon? Do you use a CA certificate, or self-signed?

> -- System Information:
> -- Configuration Files:
> /etc/default/nagios-nrpe-server changed:
> USE_SSL=1

Please note that the /etc/default/nagios-nrpe-server changed in
nagios-nrpe (3.0.1-3) because of the systemd service file.

The USE_SSL option is no longer used, instead the NRPE_OPTS variable is
used to disable SSL in both the init script and systemd service file.
The default content is now as attached.

> /etc/nagios/nrpe.cfg changed:
> log_facility=daemon
> debug=1
> pid_file=/var/run/nagios/nrpe.pid
> server_port=5666
> nrpe_user=nagios
> nrpe_group=nagios
> allowed_hosts=127.0.0.1,192.168.1.5
> dont_blame_nrpe=1
> allow_bash_command_substitution=0
> command_timeout=60
> connection_timeout=300
> ssl_version=SSLv2+
> ssl_logging=-1

It doesn't look like you configured SSL, but you did enable the feature.

To use SSL in NRPE 3.x you'll need to configure at least a certificate
file (ssl_cert_file) and its key (ssl_privatekey_file), e.g. for the
snakeoil certificate generated by the ssl-cert package:

 ssl_cert_file=/etc/ssl/certs/ssl-cert-snakeoil.pem
 ssl_privatekey_file=/etc/ssl/private/ssl-cert-snakeoil.key

For proper SSL certificates you also need to configure the path to the
CA certificate (including intermediate certificates) in ssl_cacert_file.

Also note that setting dont_blame_nrpe=1 has no effect, since the package
is not configured with --enable-command-args.

Kind Regards,

Bas

-- 
 GPG Key ID: 4096R/6750F10AE88D4AF1
Fingerprint: 8182 DE41 7056 408D 6146  50D1 6750 F10A E88D 4AF1
# defaults file for nagios-nrpe-server
# (this file is a /bin/sh compatible fragment)

# NRPE_OPTS are any extra cmdline parameters you'd like to pass along to the
# nrpe daemon.
#
# The -n option disables SSL support.
# Don't remove this option before configuring SSL in /etc/nagios/nrpe.cfg!
# See /usr/share/doc/nagios-nrpe-server/README.SSL.md.gz for instructions.
NRPE_OPTS="-n"

# NICENESS is if you want to run the server at a different nice() priority.
# (only used by the init script)
#NICENESS=5

# INETD is if you want to run the server via inetd (default=0, run as daemon).
# (only used by the init script)
#INETD=0