Re: [Bloat] bufferbloat.net server having troubles?

2018-04-01 Thread Toke Høiland-Jørgensen
Eric Dumazet  writes:

> On 03/31/2018 12:54 PM, Jonathan Morton wrote:
>>> On 31 Mar, 2018, at 10:50 pm, Toke Høiland-Jørgensen  wrote:
>>>
>>> Yeah, the box running the web server is having some issues with NULL
>>> pointer dereferences in tcp_push() in the kernel crashing processes
>>> running TCP. Haven't been able to figure out why :/
>> 
>> Maybe build/install a new kernel and reboot?
>> 
>> Possibility exists of hardware failure, too.  Less likely, perhaps, but if 
>> you don't have ECC...
>> 
>
> Nope, known bug on stable kernels.
>
> Please upgrade or downgrade.

Ah, saw that in the changelog for 4.14.32, and was hoping it was a fix
for the issue I was seeing. But haven't had time to upgrade yet. Thanks
for confirming that it is! I'll go upgrade I guess :)

-Toke
___
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat


Re: [Bloat] bufferbloat.net server having troubles?

2018-03-31 Thread Eric Dumazet


On 03/31/2018 12:54 PM, Jonathan Morton wrote:
>> On 31 Mar, 2018, at 10:50 pm, Toke Høiland-Jørgensen  wrote:
>>
>> Yeah, the box running the web server is having some issues with NULL
>> pointer dereferences in tcp_push() in the kernel crashing processes
>> running TCP. Haven't been able to figure out why :/
> 
> Maybe build/install a new kernel and reboot?
> 
> Possibility exists of hardware failure, too.  Less likely, perhaps, but if 
> you don't have ECC...
> 

Nope, known bug on stable kernels.

Please upgrade or downgrade.

$ git log --oneline v4.14.31..v4.14.32  -- net/ipv4/tcp.c
e44c1733059c tcp: purge write queue upon aborting the connection

commit e44c1733059c69868e81f82eb09fcb6bbc492050
Author: Soheil Hassas Yeganeh 
Date:   Tue Mar 6 17:15:12 2018 -0500

tcp: purge write queue upon aborting the connection


[ Upstream commit e05836ac07c77dd90377f8c8140bce2a44af5fe7 ]

When the connection is aborted, there is no point in
keeping the packets on the write queue until the connection
is closed.

Similar to a27fd7a8ed38 ('tcp: purge write queue upon RST'),
this is essential for a correct MSG_ZEROCOPY implementation,
because userspace cannot call close(fd) before receiving
zerocopy signals even when the connection is aborted.

Fixes: f214f915e7db ("tcp: enable MSG_ZEROCOPY")
Signed-off-by: Soheil Hassas Yeganeh 
Signed-off-by: Neal Cardwell 
Reviewed-by: Eric Dumazet 
Signed-off-by: Yuchung Cheng 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
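
To check a box against the release named in the git log above, here is a minimal sketch in Python. The names FIXED_IN, parse_release and at_least_fixed_release are made up for illustration. It only knows that the fix landed in v4.14.32 on the 4.14 stable series; it does not know which earlier releases are affected, and it returns None for any other series rather than guess. Distro kernels may carry backports, so the release string is only a hint.

```python
import re

# From the git log above: the fix is in v4.14.31..v4.14.32.
FIXED_IN = (4, 14, 32)


def parse_release(release):
    """Extract (major, minor, patch) from a string like '4.14.29-1-lts'."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    # A missing patch level (e.g. '4.14') is treated as 0.
    return tuple(int(g or 0) for g in m.groups())


def at_least_fixed_release(release):
    """True/False for 4.14.y kernels, None when the series is not 4.14."""
    version = parse_release(release)
    if version[:2] != FIXED_IN[:2]:
        return None  # assumption: this sketch says nothing about other series
    return version >= FIXED_IN


if __name__ == "__main__":
    import platform
    # platform.release() reports the running kernel release on Linux.
    print(platform.release(), at_least_fixed_release(platform.release()))
```

Feeding it the release string from the oops report in this thread ("4.14.29-1-lts") yields False, i.e. older than the fixed release.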




Re: [Bloat] bufferbloat.net server having troubles?

2018-03-31 Thread Jonathan Morton
> On 31 Mar, 2018, at 10:50 pm, Toke Høiland-Jørgensen  wrote:
> 
> Yeah, the box running the web server is having some issues with NULL
> pointer dereferences in tcp_push() in the kernel crashing processes
> running TCP. Haven't been able to figure out why :/

Maybe build/install a new kernel and reboot?

Possibility exists of hardware failure, too.  Less likely, perhaps, but if you 
don't have ECC...

 - Jonathan Morton



Re: [Bloat] bufferbloat.net server having troubles?

2018-03-31 Thread Toke Høiland-Jørgensen
Rich Brown  writes:

> I just went to bufferbloat.net and
> https://www.bufferbloat.net/projects/bloat/wiki/RRUL_Chart_Explanation/
> and am receiving intermittent 502 & 522 errors. Is anyone else seeing
> this? Let me know if you need more details. Thanks.

Yeah, the box running the web server is having some issues with NULL
pointer dereferences in tcp_push() in the kernel crashing processes
running TCP. Haven't been able to figure out why :/

[332756.817052] BUG: unable to handle kernel NULL pointer dereference at 
0038
[332756.817072] IP: tcp_push+0x40/0x120 


[332756.817075] PGD 0 P4D 0
[332756.817082] Oops: 0002 [#11] SMP PTI
[332756.817085] Modules linked in: fuse md4 nls_utf8 cifs ccm dns_resolver
fscache wireguard(O) ip6_udp_tunnel udp_tunnel ip6t_REJECT nf_reject_ipv6
nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6table_mangle ip6_tables
ipt_REJECT nf_reject_ipv4 xt_policy xt_set xt_hashlimit xt_conntrack
iptable_filter ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c
crc32c_generic xt_TCPMSS xt_tcpudp iptable_mangle ip_set_hash_ip tun ip_set
nfnetlink sunrpc snd_hda_codec_hdmi nvidia_drm(PO) nvidia_modeset(PO)
nls_iso8859_1 nls_cp437 intel_rapl vfat nvidia(PO) fat uvcvideo
videobuf2_vmalloc x86_pkg_temp_thermal snd_usb_audio intel_powerclamp
videobuf2_memops videobuf2_v4l2 videobuf2_core coretemp joydev dcdbas
videodev kvm_intel mousedev
[332756.817155]  snd_usbmidi_lib iTCO_wdt input_leds snd_hda_codec_realtek
snd_rawmidi evdev mei_wdt iTCO_vendor_support media snd_seq_device
drm_kms_helper mac_hid dell_smm_hwmon snd_hda_codec_generic kvm drm irqbypass
snd_hda_intel intel_cstate intel_rapl_perf snd_hda_codec pcspkr snd_hda_core
agpgart ipmi_devintf snd_hwdep ipmi_msghandler snd_pcm syscopyarea
sysfillrect sysimgblt fb_sys_fops snd_timer i2c_i801 snd mei_me soundcore mei
lpc_ich shpchp wmi button vmmon(O) vmw_vmci vboxnetflt(O) vboxnetadp(O)
pci_stub vboxpci(O) vboxdrv(O) tcp_bbr sit tunnel4 ip_tunnel sg ip_tables
x_tables ext4 crc16 mbcache jbd2 fscrypto algif_skcipher af_alg hid_generic
usbhid hid arc4 rt2800usb rt2x00usb rt2800lib rt2x00lib led_class mac80211
cfg80211 rfkill dm_crypt dm_mod sd_mod crct10dif_pclmul crc32_pclmul
crc32c_intel
[332756.817239]  ghash_clmulni_intel pcbc xhci_pci xhci_hcd ehci_pci ahci
ehci_hcd libahci aesni_intel aes_x86_64 crypto_simd glue_helper libata cryptd
tg3 libphy scsi_mod e1000e usbcore ptp usb_common pps_core sch_fq
[332756.817267] CPU: 0 PID: 22603 Comm: nginx Tainted: P  DO
4.14.29-1-lts #1
[332756.817270] Hardware name: Dell Inc. Precision T3610/09M8Y8, BIOS A15 
01/04/2018 
[332756.817273] task: 97ccd2914880 task.stack: be1e04cc4000
[332756.817279] RIP: 0010:tcp_push+0x40/0x120
[332756.817282] RSP: 0018:be1e04cc7d00 EFLAGS: 00010246
[332756.817286] RAX:  RBX: 97cb98b43b80 RCX: 
0001
[332756.817289] RDX:  RSI: 0040 RDI: 
97cb98b43b80
[332756.817292] RBP: 21f0 R08: 05a8 R09: 
05a8
[332756.817295] R10: 97cb98b43cd8 R11:  R12: 
05a8
[332756.817298] R13: 0040 R14: 97cb98b43cd8 R15: 
ffe0
[332756.817302] FS:  7f909a778b80() GS:97ceafc0() 
knlGS:
[332756.817305] CS:  0010 DS:  ES:  CR0: 80050033
[332756.817308] CR2: 0038 CR3: 00031172a001 CR4: 
001606f0
[332756.817311] Call Trace:
[332756.817320]  tcp_sendmsg_locked+0xb10/0xe50
[332756.817328]  ? sock_poll+0x70/0x90
[332756.817334]  tcp_sendmsg+0x27/0x40 
[332756.817339]  sock_write_iter+0xa3/0x110
[332756.817347]  __vfs_write+0x102/0x180
[332756.817353]  vfs_write+0xad/0x1a0
[332756.817358]  SyS_write+0x52/0xc0
[332756.817366]  do_syscall_64+0x67/0x120
[332756.817379] RIP: 0033:0x7f909a1958b4
[332756.817382] RSP: 002b:75e51c48 EFLAGS: 0246 ORIG_RAX: 
0001
[332756.817386] RAX: ffda RBX: 2f4c RCX: 
7f909a1958b4
[332756.817389] RDX: 2f4c RSI: 55be7827c36c RDI: 
0016
[332756.817392] RBP: 55be78050340 R08:  R09: 

[332756.817395] R10: 55be7804eab0 R11: 0246 R12: 
55be7827c36c
[332756.817398] R13: 2f4c R14: 7f909a778b00 R15: 
55be78272150
[332756.817401] Code: 00 48 8b 87 60 01 00 00 4c 8d 97 58 01 00 00 ba 00 00 00 
00 41 89 f3 49 39 c2 48 0f 44 c2 41 81 e3 00 80 00 00 0f 85 9d 00 00 00 <80> 48 
38 08 8b 97 74 06
00 00 89 97 7c 06 00 00 83 e6 01 74 0c
[332756.817461] RIP: tcp_push+0x40/0x120 RSP: be1e04cc7d00
[332756.817463] CR2: 0038 
[332756.817469] ---[ end trace 797c2d8c9eead6f1 ]---
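
For picking the useful bits out of a dump like the one above, a rough sketch in Python follows. SAMPLE is a handful of lines copied from this trace; summarize_oops and its dictionary keys are made-up names, and the regexes are only matched against the line shapes seen here, so other kernel versions may format things differently. The [#11] field is the kernel's running count of oopses since boot, and call-trace entries prefixed with "?" are skipped because the unwinder marks those as unreliable.

```python
import re

# A few lines copied from the oops above (timestamps kept as logged).
SAMPLE = """\
[332756.817072] IP: tcp_push+0x40/0x120
[332756.817082] Oops: 0002 [#11] SMP PTI
[332756.817267] CPU: 0 PID: 22603 Comm: nginx Tainted: P  DO
[332756.817311] Call Trace:
[332756.817320]  tcp_sendmsg_locked+0xb10/0xe50
[332756.817328]  ? sock_poll+0x70/0x90
[332756.817334]  tcp_sendmsg+0x27/0x40
"""

FRAME = re.compile(r"(\? )?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+$")


def summarize_oops(text):
    """Return faulting symbol, oops count, pid, comm and reliable call trace."""
    summary = {"ip": None, "count": None, "pid": None, "comm": None, "trace": []}
    in_trace = False
    for raw in text.splitlines():
        # Drop the "[seconds.micros]" prefix and surrounding whitespace.
        line = re.sub(r"^\[\s*\d+\.\d+\]", "", raw).strip()
        if in_trace:
            frame = FRAME.match(line)
            if frame:
                if frame.group(1) is None:  # skip "?" (unreliable) entries
                    summary["trace"].append(frame.group(2))
                continue
            in_trace = False  # first non-frame line ends the call trace
        if line == "Call Trace:":
            in_trace = True
            continue
        m = re.match(r"IP: (\w+)\+0x", line)
        if m and summary["ip"] is None:
            summary["ip"] = m.group(1)
        m = re.match(r"Oops: \w+ \[#(\d+)\]", line)
        if m:
            summary["count"] = int(m.group(1))
        m = re.search(r"PID: (\d+) Comm: (\S+)", line)
        if m:
            summary["pid"] = int(m.group(1))
            summary["comm"] = m.group(2)
    return summary


if __name__ == "__main__":
    print(summarize_oops(SAMPLE))
```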


-Toke

[Bloat] bufferbloat.net server having troubles?

2018-03-31 Thread Rich Brown
I just went to bufferbloat.net and 
https://www.bufferbloat.net/projects/bloat/wiki/RRUL_Chart_Explanation/ and am 
receiving intermittent 502 & 522 errors. Is anyone else seeing this? Let me 
know if you need more details. Thanks.

Rich