I took a stab at it this way, though I am not sure whether I am doing this correctly:

[root@myVMslot12 ~]# gdb /boot/vmlinuz-4.4.0 /proc/kcore
GNU gdb (GDB) Fedora (7.3.50.20110722-13.fc16)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
BFD: /boot/vmlinuz-4.4.0: Warning: Ignoring section flag IMAGE_SCN_MEM_NOT_PAGED in section .bss
BFD: /boot/vmlinuz-4.4.0: Warning: Ignoring section flag IMAGE_SCN_MEM_NOT_PAGED in section .bss
Reading symbols from /boot/vmlinuz-4.4.0...(no debugging symbols found)...done.
warning: core file may not match specified executable file.
[New process 1]
Core was generated by `BOOT_IMAGE=/vmlinuz-4.4.0 root=UUID=b419f9ff-80ce-459e-855c-614d86a48105 ro rd.'.
#0  0x0000000000000000 in ?? ()
(gdb) file /lib/modules/4.4.0/kernel/net/tipc/tipc.ko
warning: core file may not match specified executable file.
Reading symbols from /lib/modules/4.4.0/kernel/net/tipc/tipc.ko...done.
(gdb) list *(tipc_sk_rcv+0x238)
0x14898 is in tipc_sk_rcv (net/tipc/msg.h:131).
warning: Source file is more recent than executable.
126             return (struct tipc_msg *)skb->data;
127     }
128
129     static inline u32 msg_word(struct tipc_msg *m, u32 pos)
130     {
131             return ntohl(m->hdr[pos]);
132     }
133
134     static inline void msg_set_word(struct tipc_msg *m, u32 w, u32 val)
135     {

-----Original Message-----
From: Butler, Peter
Sent: February-22-17 12:45 PM
To: Jon Maloy <jon.ma...@ericsson.com>; tipc-discussion@lists.sourceforge.net
Cc: Butler, Peter <pbut...@sonusnet.com>
Subject: RE: TIPC Oops in tipc_sk_recv

Hi Jon

Thanks for the info. One thing I should clarify. Although we are running the 4.4.0 kernel, we had backported a number of post-4.4.0 TIPC patches into our 4.4.0 kernel.
As such, the offset in question (tipc_sk_rcv+0x238) will not match that in the vanilla 4.4.0 source. Should I post the entire socket.c file to this list for your review? Or is there an easy way for me to do a similar listing using our actual tipc.ko file here in the lab?

Peter

-----Original Message-----
From: Jon Maloy [mailto:jon.ma...@ericsson.com]
Sent: February-22-17 12:29 PM
To: Butler, Peter <pbut...@sonusnet.com>; tipc-discussion@lists.sourceforge.net
Subject: RE: TIPC Oops in tipc_sk_recv

Hi Peter,
Very hard to make any suggestions on how to reproduce this.
What I can see is that it is a STREAM message being sent from a node local socket, i.e., it doesn't go via any interface.
The crash seems to happen when the receiving socket is owned by the user, and while we are instead adding the message to the backlog queue:

Reading symbols from net/tipc/tipc.ko...done.
(gdb) list *(tipc_sk_rcv+0x238)
0x13d78 is in tipc_sk_rcv (./arch/x86/include/asm/atomic.h:214).
209     static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u)
210     {
211             int c, old;
212             c = atomic_read(v);
213             for (;;) {
214                     if (unlikely(c == (u)))
215                             break;
216                     old = atomic_cmpxchg((v), c, c + (a));
217                     if (likely(old == c))
218                             break;

This is about what I can get out of it at the moment. Maybe you should try a high-load test between two local sockets (try the benchmark demo from tipcutils) and see what you can achieve.

BR
///jon

> -----Original Message-----
> From: Butler, Peter [mailto:pbut...@sonusnet.com]
> Sent: Wednesday, February 22, 2017 10:40 AM
> To: Jon Maloy <jon.ma...@ericsson.com>; tipc-discuss...@lists.sourceforge.net
> Cc: Butler, Peter <pbut...@sonusnet.com>
> Subject: RE: TIPC Oops in tipc_sk_recv
>
> If you have any suggestions as to procedures/tricks you think might
> trigger this bug I can certainly attempt to do so in the lab.
> Obviously we can't attempt to reproduce it on the customer's (live) system.
>
> -----Original Message-----
> From: Butler, Peter
> Sent: February-21-17 3:39 PM
> To: Jon Maloy <jon.ma...@ericsson.com>; tipc-discuss...@lists.sourceforge.net
> Cc: Butler, Peter <pbut...@sonusnet.com>
> Subject: RE: TIPC Oops in tipc_sk_recv
>
> Unfortunately this occurred on a customer system so it is not readily
> reproducible. We have not seen this occur in our lab.
>
> For what it's worth, it occurred while the process was in
> TASK_UNINTERRUPTIBLE. As such, the kernel could not actually kill off
> the associated process despite the Oops, and the process remained
> forever frozen in the 'D' state and the card had to be rebooted.
>
> -----Original Message-----
> From: Jon Maloy [mailto:jon.ma...@ericsson.com]
> Sent: February-21-17 3:36 PM
> To: Butler, Peter <pbut...@sonusnet.com>; tipc-discuss...@lists.sourceforge.net
> Subject: RE: TIPC Oops in tipc_sk_recv
>
> Hi Peter,
> I don't think this is any known bug. Is it repeatable?
>
> ///jon
>
> > -----Original Message-----
> > From: Butler, Peter [mailto:pbut...@sonusnet.com]
> > Sent: Tuesday, February 21, 2017 12:14 PM
> > To: tipc-discussion@lists.sourceforge.net
> > Cc: Butler, Peter <pbut...@sonusnet.com>
> > Subject: [tipc-discussion] TIPC Oops in tipc_sk_recv
> >
> > This was with kernel 4.4.0, however I don't see any fix specifically
> > related to this in any subsequent 4.4.x kernel...
> >
> > BUG: unable to handle kernel NULL pointer dereference at 00000000000000d8
> > IP: [<ffffffffa0148868>] tipc_sk_rcv+0x238/0x4d0 [tipc]
> > PGD 34f4c0067 PUD 34ed95067 PMD 0
> > Oops: 0000 [#1] SMP
> > Modules linked in: nf_log_ipv4 nf_log_common xt_LOG sctp libcrc32c
> > e1000e tipc udp_tunnel ip6_udp_tunnel iTCO_wdt 8021q garp xt_physdev
> > br_netfilter bridge stp llc nf_conntrack_ipv4 ipmiq_drv(O)
> > nf_defrag_ipv4 sio_mmc(O) ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6
> > nf_defrag_ipv6 xt_state nf_conntrack event_drv(O) ip6table_filter
> > lockd ip6_tables pt_timer_info(O) ddi(O) grace usb_storage ixgbe igb
> > iTCO_vendor_support i2c_algo_bit ptp i2c_i801 pps_core lpc_ich
> > i2c_core intel_ips mfd_core pcspkr ioatdma sunrpc dca tpm_tis mdio
> > tpm [last unloaded: iTCO_wdt]
> > CPU: 2 PID: 12144 Comm: dinamo Tainted: G O 4.4.0 #23
> > Hardware name: PT AMC124/Base Board Product Name, BIOS LGNAJFIP.PTI.0012.P15 01/15/2014
> > task: ffff880036ad8000 ti: ffff880036900000 task.ti: ffff880036900000
> > RIP: 0010:[<ffffffffa0148868>] [<ffffffffa0148868>] tipc_sk_rcv+0x238/0x4d0 [tipc]
> > RSP: 0018:ffff880036903bb8 EFLAGS: 00010292
> > RAX: 0000000000000000 RBX: ffff88034def3970 RCX: 0000000000000001
> > RDX: 0000000000000101 RSI: 0000000000000292 RDI: ffff88034def3984
> > RBP: ffff880036903c28 R08: 0000000000000101 R09: 0000000000000004
> > R10: 0000000000000001 R11: 0000000000000000 R12: ffff880036903d28
> > R13: 00000000bd1fd8b2 R14: ffff88034def3840 R15: ffff880036903d3c
> > FS: 00007f1e86299740(0000) GS:ffff88035fc40000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00000000000000d8 CR3: 0000000036835000 CR4: 00000000000006e0
> > Stack:
> >  000000000000009b ffff880036903d28 0000000000000018 ffff88034def38c8
> >  ffffffff81ce6240 ffff8802b9bdba00 ffff880036903ca8 ffffffffa013bd7e
> >  ffff8802b99d5ee8 ffff880036903c60 0000000000000000 ffff88003693cb00
> > Call Trace:
> >  [<ffffffffa013bd7e>] ? tipc_msg_build+0xde/0x4f0 [tipc]
> >  [<ffffffffa014358f>] tipc_node_xmit+0x11f/0x150 [tipc]
> >  [<ffffffffa01470ba>] __tipc_send_stream+0x16a/0x300 [tipc]
> >  [<ffffffff81625eb5>] ? tcp_sendmsg+0x4d5/0xb00
> >  [<ffffffffa0147292>] tipc_send_stream+0x42/0x70 [tipc]
> >  [<ffffffff815bcf77>] sock_sendmsg+0x47/0x50
> >  [<ffffffff815bd03f>] sock_write_iter+0x7f/0xd0
> >  [<ffffffff811d799a>] __vfs_write+0xaa/0xe0
> >  [<ffffffff811d8b16>] vfs_write+0xb6/0x1a0
> >  [<ffffffff811d8e3f>] SyS_write+0x4f/0xb0
> >  [<ffffffff816de6d7>] entry_SYSCALL_64_fastpath+0x12/0x6a
> > Code: 89 de 4c 89 f7 e8 29 d3 ff ff 48 8b 7d a8 e8 60 59 59 e1 49 8d 9e 30 01 00 00 49 3b 9e 30 01 00 00 74 30 48 89 df e8 b8 b6 47 e1 <48> 8b 90 d8 00 00 00 48 8b 7d b0 44 89 e9 48 89 c6 48 89 45 c0
> > RIP [<ffffffffa0148868>] tipc_sk_rcv+0x238/0x4d0 [tipc]
> >  RSP <ffff880036903bb8>
> > CR2: 00000000000000d8
> > ---[ end trace 1c2d69738941d565 ]---
> >
> > ------------------------------------------------------------------------------
> > Check out the vibrant tech community on one of the world's most
> > engaging tech sites, SlashDot.org! http://sdm.link/slashdot
> > _______________________________________________
> > tipc-discussion mailing list
> > tipc-discussion@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/tipc-discussion
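
As a rough cross-check that does not depend on which tipc.ko build or source tree is at hand, the numbers in the Oops quoted above can be worked through directly. The Python sketch below is only an illustration: the helper names (parse_ip, faulting_bytes, decode_mov_disp32) are made up for this note, the Code bytes are an excerpt of the tail of the "Code:" line, and the decoder handles just the single instruction form that appears at the marked byte, not x86 in general.

```python
import re

# Values copied from the Oops report quoted above.
IP_LINE = "IP: [<ffffffffa0148868>] tipc_sk_rcv+0x238/0x4d0 [tipc]"
# Excerpt of the "Code:" line; <48> marks the byte at the faulting RIP.
CODE_EXCERPT = "48 89 df e8 b8 b6 47 e1 <48> 8b 90 d8 00 00 00 48 8b 7d b0"
RAX = 0x0000000000000000
CR2 = 0x00000000000000d8


def parse_ip(line):
    """Split '[<addr>] sym+0xoff/0xsize [mod]' into its parts (hypothetical helper)."""
    m = re.search(
        r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)\s+\[(\w+)\]", line
    )
    addr, sym, off, size, mod = m.groups()
    return int(addr, 16), sym, int(off, 16), int(size, 16), mod


def faulting_bytes(code_line):
    """Return the bytes from the <xx> marker onward, i.e. starting at the faulting RIP."""
    tokens = code_line.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return bytes(int(t.strip("<>"), 16) for t in tokens[start:])


def decode_mov_disp32(insn):
    """Hand-decode only 'REX.W 8B /r' with mod=10 (register base + disp32).

    Returns (reg, rm, disp) using the standard register numbering
    (0 = rax, 2 = rdx). This is not a general x86 decoder.
    """
    assert insn[0] == 0x48 and insn[1] == 0x8B
    modrm = insn[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    assert mod == 0b10 and rm != 0b100  # disp32 follows, no SIB byte
    disp = int.from_bytes(insn[3:7], "little")
    return reg, rm, disp


addr, sym, off, size, mod = parse_ip(IP_LINE)
func_start = addr - off  # runtime address where tipc_sk_rcv begins
reg, rm, disp = decode_mov_disp32(faulting_bytes(CODE_EXCERPT))

print(f"{sym} [{mod}] starts at {func_start:#x}, fault at +{off:#x} of {size:#x}")
print(f"load from [reg{rm} + {disp:#x}] into reg{reg}; RAX + disp = {RAX + disp:#x}")
```

Read this way, the faulting instruction is a 64-bit load from offset 0xd8 of the pointer in RAX, and with RAX = 0 that gives exactly the CR2 value in the report. Since the bytes immediately before the marker are a call (e8 ...), this looks consistent with a function having just returned NULL and one of the fields of the returned object being read, but that interpretation is an assumption that should be confirmed against a disassembly of the tipc.ko that was actually running.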