Re: 5.3-rc3-ish VM crash: RIP: 0010:tcp_trim_head+0x20/0xe0
On 17/08/2019 18:35, Eric Dumazet wrote:
>
>
> On 8/17/19 10:24 AM, Sander Eikelenboom wrote:
>> On 12/08/2019 19:56, Eric Dumazet wrote:
>>>
>>>
>>> On 8/12/19 2:50 PM, Sander Eikelenboom wrote:
>>>> L.S.,
>>>>
>>>> While testing a somewhere-after-5.3-rc3 kernel (which included the latest
>>>> net merge, 33920f1ec5bf47c5c0a1d2113989bdd9dfb3fae9),
>>>> one of my Xen VMs (which gets quite some network load) crashed.
>>>> See below for the stacktrace.
>>>>
>>>> Unfortunately I haven't got a clear trigger, so bisection doesn't seem to
>>>> be an option at the moment.
>>>> I haven't encountered this on 5.2, so it seems to be a regression against
>>>> 5.2.
>>>>
>>>> Any ideas?
>>>>
>>>> --
>>>> Sander
>>>>
>>>>
>>>> [16930.653595] general protection fault: [#1] SMP NOPTI
>>>> [16930.653624] CPU: 0 PID: 3275 Comm: rsync Not tainted 5.3.0-rc3-20190809-doflr+ #1
>>>> [16930.653657] RIP: 0010:tcp_trim_head+0x20/0xe0
>>>> [16930.653677] Code: 2e 0f 1f 84 00 00 00 00 00 90 41 54 41 89 d4 55 48 89 fd 53 48 89 f3 f6 46 7e 01 74 2f 8b 86 bc 00 00 00 48 03 86 c0 00 00 00 <8b> 40 20 66 83 f8 01 74 19 31 d2 31 f6 b9 20 0a 00 00 48 89 df e8
>>>> [16930.653741] RSP: :c9003ad8 EFLAGS: 00010286
>>>> [16930.653762] RAX: fffe888005bf62c0 RBX: 8880115fb800 RCX: 801b
>>>
>>> Crash in "mov 0x20(%rax),%eax" and RAX=fffe888005bf62c0 (not a valid
>>> kernel address).
>>>
>>> Looks like one-bit corruption, maybe.
>>>
>>> Nothing really comes to mind between 5.2 and 5.3 that could explain this.
Re: 5.3-rc3-ish VM crash: RIP: 0010:tcp_trim_head+0x20/0xe0
On 8/17/19 10:24 AM, Sander Eikelenboom wrote:
> On 12/08/2019 19:56, Eric Dumazet wrote:
>>
>>
>> On 8/12/19 2:50 PM, Sander Eikelenboom wrote:
>>> L.S.,
>>>
>>> While testing a somewhere-after-5.3-rc3 kernel (which included the latest
>>> net merge, 33920f1ec5bf47c5c0a1d2113989bdd9dfb3fae9),
>>> one of my Xen VMs (which gets quite some network load) crashed.
>>> See below for the stacktrace.
>>>
>>> Unfortunately I haven't got a clear trigger, so bisection doesn't seem to
>>> be an option at the moment.
>>> I haven't encountered this on 5.2, so it seems to be a regression against
>>> 5.2.
>>>
>>> Any ideas?
>>>
>>> --
>>> Sander
>>>
>>>
>>> [16930.653595] general protection fault: [#1] SMP NOPTI
>>> [16930.653624] CPU: 0 PID: 3275 Comm: rsync Not tainted 5.3.0-rc3-20190809-doflr+ #1
>>> [16930.653657] RIP: 0010:tcp_trim_head+0x20/0xe0
>>> [16930.653677] Code: 2e 0f 1f 84 00 00 00 00 00 90 41 54 41 89 d4 55 48 89 fd 53 48 89 f3 f6 46 7e 01 74 2f 8b 86 bc 00 00 00 48 03 86 c0 00 00 00 <8b> 40 20 66 83 f8 01 74 19 31 d2 31 f6 b9 20 0a 00 00 48 89 df e8
>>> [16930.653741] RSP: :c9003ad8 EFLAGS: 00010286
>>> [16930.653762] RAX: fffe888005bf62c0 RBX: 8880115fb800 RCX: 801b
>>
>> Crash in "mov 0x20(%rax),%eax" and RAX=fffe888005bf62c0 (not a valid
>> kernel address).
>>
>> Looks like one-bit corruption, maybe.
>>
>> Nothing really comes to mind between 5.2 and 5.3 that could explain this.
Re: 5.3-rc3-ish VM crash: RIP: 0010:tcp_trim_head+0x20/0xe0
On 12/08/2019 19:56, Eric Dumazet wrote:
>
>
> On 8/12/19 2:50 PM, Sander Eikelenboom wrote:
>> L.S.,
>>
>> While testing a somewhere-after-5.3-rc3 kernel (which included the latest
>> net merge, 33920f1ec5bf47c5c0a1d2113989bdd9dfb3fae9),
>> one of my Xen VMs (which gets quite some network load) crashed.
>> See below for the stacktrace.
>>
>> Unfortunately I haven't got a clear trigger, so bisection doesn't seem to be
>> an option at the moment.
>> I haven't encountered this on 5.2, so it seems to be a regression against
>> 5.2.
>>
>> Any ideas?
>>
>> --
>> Sander
>>
>>
>> [16930.653595] general protection fault: [#1] SMP NOPTI
>> [16930.653624] CPU: 0 PID: 3275 Comm: rsync Not tainted 5.3.0-rc3-20190809-doflr+ #1
>> [16930.653657] RIP: 0010:tcp_trim_head+0x20/0xe0
>> [16930.653677] Code: 2e 0f 1f 84 00 00 00 00 00 90 41 54 41 89 d4 55 48 89 fd 53 48 89 f3 f6 46 7e 01 74 2f 8b 86 bc 00 00 00 48 03 86 c0 00 00 00 <8b> 40 20 66 83 f8 01 74 19 31 d2 31 f6 b9 20 0a 00 00 48 89 df e8
>> [16930.653741] RSP: :c9003ad8 EFLAGS: 00010286
>> [16930.653762] RAX: fffe888005bf62c0 RBX: 8880115fb800 RCX: 801b
>
> Crash in "mov 0x20(%rax),%eax" and RAX=fffe888005bf62c0 (not a valid
> kernel address).
>
> Looks like one-bit corruption, maybe.
>
> Nothing really comes to mind between 5.2 and 5.3 that could explain this.
Hi Eric,

Got another VM
Re: 5.3-rc3-ish VM crash: RIP: 0010:tcp_trim_head+0x20/0xe0
On 12/08/2019 19:56, Eric Dumazet wrote:
>
>
> On 8/12/19 2:50 PM, Sander Eikelenboom wrote:
>> L.S.,
>>
>> While testing a somewhere-after-5.3-rc3 kernel (which included the latest
>> net merge, 33920f1ec5bf47c5c0a1d2113989bdd9dfb3fae9),
>> one of my Xen VMs (which gets quite some network load) crashed.
>> See below for the stacktrace.
>>
>> Unfortunately I haven't got a clear trigger, so bisection doesn't seem to be
>> an option at the moment.
>> I haven't encountered this on 5.2, so it seems to be a regression against
>> 5.2.
>>
>> Any ideas?
>>
>> --
>> Sander
>>
>>
>> [16930.653595] general protection fault: [#1] SMP NOPTI
>> [16930.653624] CPU: 0 PID: 3275 Comm: rsync Not tainted 5.3.0-rc3-20190809-doflr+ #1
>> [16930.653657] RIP: 0010:tcp_trim_head+0x20/0xe0
>> [16930.653677] Code: 2e 0f 1f 84 00 00 00 00 00 90 41 54 41 89 d4 55 48 89 fd 53 48 89 f3 f6 46 7e 01 74 2f 8b 86 bc 00 00 00 48 03 86 c0 00 00 00 <8b> 40 20 66 83 f8 01 74 19 31 d2 31 f6 b9 20 0a 00 00 48 89 df e8
>> [16930.653741] RSP: :c9003ad8 EFLAGS: 00010286
>> [16930.653762] RAX: fffe888005bf62c0 RBX: 8880115fb800 RCX: 801b
>
> Crash in "mov 0x20(%rax),%eax" and RAX=fffe888005bf62c0 (not a valid
> kernel address).
>
> Looks like one-bit corruption, maybe.
>
> Nothing really comes to mind between 5.2 and 5.3 that could explain this.

Hi Eric,

Hmm, it could be a rare coincidence, so that it just never occurred on pre-5.3 by chance.
Let's wait and see if it reoccurs; I will report back if it does.

Thanks for your explanation.
--
Sander
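As a cross-check on the faulting instruction named in the explanation, the bytes at the <8b> marker of the Code: line can be decoded by hand. The following is only a minimal sketch: decode_mov_load is a hypothetical helper written for this note (not a kernel or library API), and it deliberately handles just the one encoding form seen here (opcode 8b with a register base and an 8-bit displacement, no REX prefix, no SIB byte).

```python
# Bytes copied from the "Code:" line of the oops, starting at the <8b> marker.
FAULTING_BYTES = bytes.fromhex("8b4020")

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mov_load(code: bytes) -> str:
    """Decode `8b /r` (mov r32, r/m32) with a [reg + disp8] operand.

    Hypothetical helper for illustration; anything else raises ValueError.
    """
    opcode, modrm, disp8 = code[0], code[1], code[2]
    if opcode != 0x8B:
        raise ValueError("only opcode 8b is handled")
    # ModRM layout: mod (2 bits) | reg (3 bits) | rm (3 bits)
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only [reg + disp8] without a SIB byte is handled")
    # disp8 is a signed 8-bit displacement
    disp = disp8 - 256 if disp8 >= 128 else disp8
    # AT&T syntax, as used in the thread: mov disp(%base),%dest
    return f"mov {disp:#x}(%{REG64[rm]}),%{REG32[reg]}"


print(decode_mov_load(FAULTING_BYTES))
```

The ModRM byte 0x40 splits into mod=01, reg=000, rm=000, which is a 32-bit load into eax from rax plus an 8-bit displacement of 0x20, matching the instruction quoted in the reply.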
Re: 5.3-rc3-ish VM crash: RIP: 0010:tcp_trim_head+0x20/0xe0
On 8/12/19 2:50 PM, Sander Eikelenboom wrote:
> L.S.,
>
> While testing a somewhere-after-5.3-rc3 kernel (which included the latest net
> merge, 33920f1ec5bf47c5c0a1d2113989bdd9dfb3fae9),
> one of my Xen VMs (which gets quite some network load) crashed.
> See below for the stacktrace.
>
> Unfortunately I haven't got a clear trigger, so bisection doesn't seem to be
> an option at the moment.
> I haven't encountered this on 5.2, so it seems to be a regression against
> 5.2.
>
> Any ideas?
>
> --
> Sander
>
>
> [16930.653595] general protection fault: [#1] SMP NOPTI
> [16930.653624] CPU: 0 PID: 3275 Comm: rsync Not tainted 5.3.0-rc3-20190809-doflr+ #1
> [16930.653657] RIP: 0010:tcp_trim_head+0x20/0xe0
> [16930.653677] Code: 2e 0f 1f 84 00 00 00 00 00 90 41 54 41 89 d4 55 48 89 fd 53 48 89 f3 f6 46 7e 01 74 2f 8b 86 bc 00 00 00 48 03 86 c0 00 00 00 <8b> 40 20 66 83 f8 01 74 19 31 d2 31 f6 b9 20 0a 00 00 48 89 df e8
> [16930.653741] RSP: :c9003ad8 EFLAGS: 00010286
> [16930.653762] RAX: fffe888005bf62c0 RBX: 8880115fb800 RCX: 801b

Crash in "mov 0x20(%rax),%eax" and RAX=fffe888005bf62c0 (not a valid
kernel address).

Looks like one-bit corruption, maybe.

Nothing really comes to mind between 5.2 and 5.3 that could explain this.

> [16930.653791] RDX: 05a0 RSI: 8880115fb800 RDI: 888016b00880
> [16930.653819] RBP: 888016b00880 R08: 0001 R09:
> [16930.653848] R10: 88800ae00800 R11: bfe632e6 R12: 05a0
> [16930.653875] R13: 0001 R14: bfe62d46 R15: 0004
> [16930.653913] FS: 7fe71fe2cb80() GS:88801f20() knlGS:
> [16930.653943] CS: 0010 DS: ES: CR0: 80050033
> [16930.653965] CR2: 55de0f3e7000 CR3: 11f32000 CR4: 06f0
> [16930.653993] Call Trace:
> [16930.654005]
> [16930.654018] tcp_ack+0xbb0/0x1230
> [16930.654033] tcp_rcv_established+0x2e8/0x630
> [16930.654053] tcp_v4_do_rcv+0x129/0x1d0
> [16930.654070] tcp_v4_rcv+0xac9/0xcb0
> [16930.654088] ip_protocol_deliver_rcu+0x27/0x1b0
> [16930.654109] ip_local_deliver_finish+0x3f/0x50
> [16930.654128] ip_local_deliver+0x4d/0xe0
> [16930.654145] ? ip_protocol_deliver_rcu+0x1b0/0x1b0
> [16930.654163] ip_rcv+0x4c/0xd0
> [16930.654179] __netif_receive_skb_one_core+0x79/0x90
> [16930.654200] netif_receive_skb_internal+0x2a/0xa0
> [16930.654219] napi_gro_receive+0xe7/0x140
> [16930.654237] xennet_poll+0x9be/0xae0
> [16930.654254] net_rx_action+0x136/0x340
> [16930.654271] __do_softirq+0xdd/0x2cf
> [16930.654287] irq_exit+0x7a/0xa0
> [16930.654304] xen_evtchn_do_upcall+0x27/0x40
> [16930.654320] xen_hvm_callback_vector+0xf/0x20
> [16930.654339]
> [16930.654349] RIP: 0033:0x55de0d87db99
> [16930.654364] Code: 00 00 48 89 7c 24 f8 45 39 fe 45 0f 42 fe 44 89 7c 24 f4 eb 09 0f 1f 40 00 83 e9 01 74 3e 89 f2 48 63 f8 4c 01 d2 44 38 1c 3a <75> 25 44 38 6c 3a ff 75 1e 41 0f b6 3c 24 40 38 3a 75 14 41 0f b6
> [16930.654432] RSP: 002b:7ffd5531eec8 EFLAGS: 0a87 ORIG_RAX: ff0c
> [16930.655004] RAX: 0002 RBX: 55de0f3e8e50 RCX: 007f
> [16930.655034] RDX: 55de0f3dc2d2 RSI: 3492 RDI: 0002
> [16930.655062] RBP: 7fff R08: 80ea R09: 01f0
> [16930.655089] R10: 55de0f3d8e40 R11: 0094 R12: 55de0f3e0f2a
> [16930.655116] R13: 0010 R14: 7f16 R15: 0080
> [16930.655144] Modules linked in:
> [16930.655200] ---[ end trace 533367c95501b645 ]---
> [16930.655223] RIP: 0010:tcp_trim_head+0x20/0xe0
> [16930.655243] Code: 2e 0f 1f 84 00 00 00 00 00 90 41 54 41 89 d4 55 48 89 fd 53 48 89 f3 f6 46 7e 01 74 2f 8b 86 bc 00 00 00 48 03 86 c0 00 00 00 <8b> 40 20 66 83 f8 01 74 19 31 d2 31 f6 b9 20 0a 00 00 48 89 df e8
> [16930.655312] RSP: :c9003ad8 EFLAGS: 00010286
> [16930.655331] RAX: fffe888005bf62c0 RBX: 8880115fb800 RCX: 801b
> [16930.655360] RDX: 05a0 RSI: 8880115fb800 RDI: 888016b00880
> [16930.655387] RBP: 888016b00880 R08: 0001 R09:
> [16930.655414] R10: 88800ae00800 R11: bfe632e6 R12: 05a0
> [16930.655441] R13: 0001 R14: bfe62d46 R15: 0004
> [16930.655475] FS: 7fe71fe2cb80() GS:88801f20() knlGS:
> [16930.655502] CS: 0010 DS: ES: CR0: 80050033
> [16930.655525] CR2: 55de0f3e7000 CR3: 11f32000 CR4: 06f0
> [16930.63] Kernel panic - not syncing: Fatal exception in interrupt
> [16930.655789] Kernel Offset: disabled
>
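The "one bit corruption" reading of RAX can be checked with a little arithmetic. The sketch below is illustrative and rests on assumptions: ASSUMED_GOOD is a guess at the intended pointer (the same value with ffff in the top 16 bits, which is what a kernel pointer in this range would normally look like), and differing_bits and is_canonical are hypothetical helpers written for this note, not kernel functions. The canonical check assumes 48-bit virtual addresses (4-level paging).

```python
# RAX as reported in the oops.
FAULTING_RAX = 0xFFFE888005BF62C0
# Assumption: the pointer that was meant to be there, with the upper 16 bits
# all set. The thread does not state this value.
ASSUMED_GOOD = 0xFFFF888005BF62C0


def differing_bits(a: int, b: int) -> list:
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [bit for bit in range(64) if (diff >> bit) & 1]


def is_canonical(addr: int) -> bool:
    """With 48-bit virtual addresses, bits 63..47 must all be equal."""
    top = addr >> 47  # the 17 uppermost bits
    return top == 0 or top == 0x1FFFF


print(differing_bits(FAULTING_RAX, ASSUMED_GOOD))
print(is_canonical(FAULTING_RAX), is_canonical(ASSUMED_GOOD))
```

Under these assumptions the two values differ only in bit 48, and the reported RAX is non-canonical. A load through a non-canonical address raises a general protection fault rather than a page fault, which would be consistent with the first line of the oops.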