I'm running nginx/1.19.5 on a Fedora32 VM, w/

  uname -rm
    5.9.11-100.fc32.x86_64 x86_64

Its run for ages without issues.  At least that I'd noticed ...

Today, I caught a SEGV/core-dump; the server stopped

  systemctl status nginx

        ● nginx.service - The nginx HTTP and reverse proxy server
                Loaded: loaded (/etc/systemd/system/nginx.service; enabled; 
vendor preset: disabled)
                Active: failed (Result: core-dump) since Sat 2020-12-05 
05:58:03 PST; 7h ago
                Process: 993 ExecStartPre=/bin/chown -R wwwrun:www 
/usr/local/etc/nginx (code=exited, status=0/SUCCESS)
                Process: 999 ExecStartPre=/usr/local/nginx/sbin/nginx -t -c 
/usr/local/etc/nginx/nginx.conf -g pid /run/nginx/nginx.pid; (code=exited, 
status=0/SUCCESS)
                Process: 1063 ExecStart=/usr/local/nginx/sbin/nginx -c 
/usr/local/etc/nginx/nginx.conf -g pid /run/nginx/nginx.pid; (code=exited, 
status=0/SUCCESS)
                Process: 1108 ExecStartPost=/bin/chown -R wwwrun:www 
/var/log/nginx (code=exited, status=0/SUCCESS)
                Process: 25986 ExecReload=/bin/kill -s HUP $MAINPID 
(code=exited, status=0/SUCCESS)
        Main PID: 1103 (code=dumped, signal=SEGV)
                        CPU: 14.607s

Checking logs (at current production loglevels) for this one, nothing out of 
the ordinary ...

EXCEPT The last log entry I see,:

        /var/log/nginx/main.access.log:61.219.11.153 _ - [05/Dec/2020:05:08:14 -0800] 
\x01A\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
 "400" 150 "-" "-" "-"

Given the proximity of the timestamp, I'g guess it's related?

I haven't yet figured out where/how to grab the core-dump ; working on that.

Checking history it's happened a few times

        xzegrep SEGV /var/log/messages
                2020-11-29T05:52:32.436235-08:00 vm0026 systemd[1]: 
nginx.service: Main process exited, code=dumped, status=11/SEGV
                2020-12-01T05:39:03.218376-08:00 vm0026 systemd[1]: 
nginx.service: Main process exited, code=dumped, status=11/SEGV
                2020-12-03T05:17:51.653637-08:00 vm0026 systemd[1]: 
nginx.service: Main process exited, code=dumped, status=11/SEGV
                2020-12-05T05:58:03.611240-08:00 vm0026 systemd[1]: 
nginx.service: Main process exited, code=dumped, status=11/SEGV

where each instance in log looks like,

        ....
        2020-12-05T05:55:00.854490-08:00 vm0026 systemd[25768]: Reached target 
Shutdown.
        2020-12-05T05:55:00.854510-08:00 vm0026 systemd[25768]: 
systemd-exit.service: Succeeded.
        2020-12-05T05:55:00.854531-08:00 vm0026 systemd[25768]: Finished Exit 
the Session.
        2020-12-05T05:55:00.854550-08:00 vm0026 systemd[25768]: Reached target 
Exit the Session.
        2020-12-05T05:55:00.858225-08:00 vm0026 systemd[1]: user@0.service: 
Succeeded.
        2020-12-05T05:55:00.858322-08:00 vm0026 systemd[1]: Stopped User 
Manager for UID 0.
        2020-12-05T05:55:00.860232-08:00 vm0026 systemd[1]: Stopping User 
Runtime Directory /run/user/0...
        2020-12-05T05:55:00.868288-08:00 vm0026 systemd[1]: run-user-0.mount: 
Succeeded.
        2020-12-05T05:55:00.870265-08:00 vm0026 systemd[1]: 
user-runtime-dir@0.service: Succeeded.
        2020-12-05T05:55:00.870383-08:00 vm0026 systemd[1]: Stopped User 
Runtime Directory /run/user/0.
        2020-12-05T05:55:00.871216-08:00 vm0026 systemd[1]: Removed slice User 
Slice of UID 0.
        2020-12-05T05:58:03.418222-08:00 vm0026 systemd[1]: Reloading The nginx 
HTTP and reverse proxy server.
        2020-12-05T05:58:03.420214-08:00 vm0026 systemd[1]: Reloaded The nginx 
HTTP and reverse proxy server.
        2020-12-05T05:58:03.432221-08:00 vm0026 systemd[1]: nginx.service: Unit 
cannot be reloaded because it is inactive.
        2020-12-05T05:58:03.432358-08:00 vm0026 systemctl[25987]: nginx.service 
is not active, cannot reload.
        2020-12-05T05:58:03.468235-08:00 vm0026 kernel: nginx[1103]: segfault 
at 10 ip 00007f5c566d6283 sp 00007ffeebdca500 error 4 in 
libperl.so.5.30.3[7f5c56668000+16d000]
        2020-12-05T05:58:03.468374-08:00 vm0026 kernel: Code: f9 ff 48 89 45 10 48 83 
c4 18 5b 5d 41 5c 41 5d 41 5e 41 5f c3 66 90 0f b6 7f 30 48 c1 e8 03 48 29 f8 48 89 
c5 74 89 48 8b 02 <4c> 8b 68 10 4d 85 ed 0f 84 28 01 00 00 0f b6 40 30 49 c1 ed 
03 49
        2020-12-05T05:58:03.468407-08:00 vm0026 kernel: potentially unexpected 
fatal signal 11.
        2020-12-05T05:58:03.468436-08:00 vm0026 kernel: CPU: 1 PID: 1103 Comm: 
nginx Not tainted 5.9.11-100.fc32.x86_64 #1
        2020-12-05T05:58:03.468460-08:00 vm0026 kernel: Hardware name: QEMU 
Standard PC (Q35 + ICH9, 2009), BIOS 
rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
        2020-12-05T05:58:03.468483-08:00 vm0026 kernel: RIP: 0033:0x7f5c566d6283
        2020-12-05T05:58:03.468504-08:00 vm0026 kernel: Code: f9 ff 48 89 45 10 48 83 
c4 18 5b 5d 41 5c 41 5d 41 5e 41 5f c3 66 90 0f b6 7f 30 48 c1 e8 03 48 29 f8 48 89 
c5 74 89 48 8b 02 <4c> 8b 68 10 4d 85 ed 0f 84 28 01 00 00 0f b6 40 30 49 c1 ed 
03 49
        2020-12-05T05:58:03.468549-08:00 vm0026 kernel: RSP: 
002b:00007ffeebdca500 EFLAGS: 00010202
        2020-12-05T05:58:03.468570-08:00 vm0026 kernel: RAX: 0000000000000000 
RBX: 00007ffeebdca6c8 RCX: 0000000000000001
        2020-12-05T05:58:03.468587-08:00 vm0026 kernel: RDX: 0000556ebd9e9750 
RSI: 0000556ebd924b80 RDI: 0000000000000000
        2020-12-05T05:58:03.468607-08:00 vm0026 kernel: RBP: 0000000000000597 
R08: 00007ffeebdca6c8 R09: 0000000000000011
        2020-12-05T05:58:03.468628-08:00 vm0026 kernel: R10: 0000556ebda8c808 
R11: 00007ffeebdca570 R12: 0000556ebd99e7e0
        2020-12-05T05:58:03.468646-08:00 vm0026 kernel: R13: 00007ffeebdca568 
R14: 0000000000000001 R15: 0000000000000001
        2020-12-05T05:58:03.468667-08:00 vm0026 kernel: FS:  00007f5c5837bb80 
GS:  0000000000000000
!!      2020-12-05T05:58:03.611240-08:00 vm0026 systemd[1]: nginx.service: Main 
process exited, code=dumped, status=11/SEGV
        2020-12-05T05:58:03.645225-08:00 vm0026 systemd[1]: nginx.service: 
Failed with result 'core-dump'.
        2020-12-05T05:58:03.646343-08:00 vm0026 systemd[1]: nginx.service: 
Consumed 14.607s CPU time.
        2020-12-05T06:00:43.703240-08:00 vm0026 systemd[1]: Starting system 
activity accounting tool...
        2020-12-05T06:00:43.712400-08:00 vm0026 systemd[1]: 
sysstat-collect.service: Succeeded.
        2020-12-05T06:00:43.712741-08:00 vm0026 systemd[1]: Finished system 
activity accounting tool.
        2020-12-05T06:01:01.537222-08:00 vm0026 systemd[1]: Created slice User 
Slice of UID 982.
        2020-12-05T06:01:01.538234-08:00 vm0026 systemd[1]: Starting User 
Runtime Directory /run/user/982...
        2020-12-05T06:01:01.550219-08:00 vm0026 systemd[1]: Finished User 
Runtime Directory /run/user/982.
        2020-12-05T06:01:01.552232-08:00 vm0026 systemd[1]: Starting User 
Manager for UID 982...
        2020-12-05T06:01:01.705790-08:00 vm0026 systemd[26023]: Startup 
finished in 132ms.
        ....


IIUC, nginx error "400" is 'bad header'.

But I do note,

        segfault at 10 ip 00007f5c566d6283 sp 00007ffeebdca500 error 4 in 
libperl.so.5.30.3[7f5c56668000+16d000]

Not clear what the cause of this infrequent/intermittent issue is.

Apart from blocking that IP/block/country, what's the right way to harden nginx 
against this^ ? Ideally, preventing the SEGV in the first place.

Or, hints as to what to look at more closely?

_______________________________________________
nginx mailing list
nginx@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx

Reply via email to