Hi Peter,
It is very hard to make any suggestions on how to reproduce this. What I can
see is that it is a STREAM message being sent from a node-local socket, i.e.,
it doesn't go via any interface. The crash seems to happen when the receiving
socket is owned by the user, so that we are instead adding the message to its
backlog queue:

Reading symbols from net/tipc/tipc.ko...done.
(gdb) list *(tipc_sk_rcv+0x238)
0x13d78 is in tipc_sk_rcv (./arch/x86/include/asm/atomic.h:214).
209     static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u)
210     {
211             int c, old;
212             c = atomic_read(v);
213             for (;;) {
214                     if (unlikely(c == (u)))
215                             break;
216                     old = atomic_cmpxchg((v), c, c + (a));
217                     if (likely(old == c))
218                             break;
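
To make that concrete: the pattern I am referring to is the usual
"owner busy -> backlog" receive logic. The sketch below is only a
simplified paraphrase of that pattern, not the exact tipc_sk_rcv() code,
and deliver_to_rcv_queue() is a made-up placeholder:

        /* Simplified paraphrase of the "socket owned by user -> backlog"
         * receive pattern; NOT the exact tipc_sk_rcv() code.
         * deliver_to_rcv_queue() is a hypothetical helper.
         */
        static void rcv_skb_sketch(struct sock *sk, struct sk_buff *skb,
                                   unsigned int limit)
        {
                bh_lock_sock(sk);
                if (!sock_owned_by_user(sk)) {
                        /* Socket lock is free: deliver directly. */
                        deliver_to_rcv_queue(sk, skb);
                } else if (unlikely(sk_add_backlog(sk, skb, limit))) {
                        /* Owner is busy and the backlog limit is exceeded:
                         * drop the buffer. */
                        kfree_skb(skb);
                }
                bh_unlock_sock(sk);
        }

The oops itself says the faulting access is a read at offset 0xd8 of a
NULL pointer (CR2 = 00000000000000d8), so some structure used on that
backlog path appears to be gone by the time its field is dereferenced.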

This is about all I can get out of it at the moment. Maybe you should try a
high-load test between two node-local sockets (e.g., the benchmark demo from
tipcutils) and see whether you can trigger it.
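
If the benchmark demo is awkward to set up, a minimal flooder along the
lines below should exercise the same node-local STREAM path. This is
only a sketch: service type 18888/instance 1 is an arbitrary example,
and it assumes a receiver on the same node has bound a SOCK_STREAM
socket to that service, is accept()ing, and is recv()ing in a loop (so
that the socket is frequently owned by user space and messages land on
the backlog queue):

        /* Node-local TIPC STREAM flooder (sketch). Assumes a local server
         * is bound to service type 18888, instance 1, and is reading in a
         * tight loop. The service address is an arbitrary example value.
         */
        #include <linux/tipc.h>
        #include <sys/socket.h>
        #include <string.h>
        #include <unistd.h>
        #include <stdio.h>

        int main(void)
        {
                struct sockaddr_tipc srv;
                char buf[65536];        /* large writes keep the rcv path busy */
                int sd;

                memset(&srv, 0, sizeof(srv));
                srv.family = AF_TIPC;
                srv.addrtype = TIPC_ADDR_NAME;
                srv.addr.name.name.type = 18888;
                srv.addr.name.name.instance = 1;
                srv.addr.name.domain = 0;

                sd = socket(AF_TIPC, SOCK_STREAM, 0);
                if (sd < 0 || connect(sd, (struct sockaddr *)&srv, sizeof(srv)) < 0) {
                        perror("connect");
                        return 1;
                }
                memset(buf, 'x', sizeof(buf));
                for (;;) {              /* hammer the local receive path */
                        if (send(sd, buf, sizeof(buf), 0) < 0) {
                                perror("send");
                                break;
                        }
                }
                close(sd);
                return 0;
        }

Running several such clients in parallel against one receiver should
give a reasonable approximation of a high-load local STREAM scenario.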

BR
///jon


> -----Original Message-----
> From: Butler, Peter [mailto:pbut...@sonusnet.com]
> Sent: Wednesday, February 22, 2017 10:40 AM
> To: Jon Maloy <jon.ma...@ericsson.com>; tipc-
> discuss...@lists.sourceforge.net
> Cc: Butler, Peter <pbut...@sonusnet.com>
> Subject: RE: TIPC Oops in tipc_sk_recv
> 
> If you have any suggestions as to procedures/tricks you think might trigger
> this bug I can certainly attempt to do so in the lab.  Obviously we can't
> attempt to reproduce it on the customer's (live) system.
> 
> 
> 
> -----Original Message-----
> From: Butler, Peter
> Sent: February-21-17 3:39 PM
> To: Jon Maloy <jon.ma...@ericsson.com>; tipc-
> discuss...@lists.sourceforge.net
> Cc: Butler, Peter <pbut...@sonusnet.com>
> Subject: RE: TIPC Oops in tipc_sk_recv
> 
> Unfortunately this occurred on a customer system so it is not readily
> reproducible.  We have not seen this occur in our lab.
> 
> For what it's worth, it occurred while the process was in
> TASK_UNINTERRUPTIBLE.  As such, the kernel could not actually kill off the
> associated process despite the Oops, and the process remained forever
> frozen in the 'D' state and the card had to be rebooted.
> 
> 
> 
> 
> -----Original Message-----
> From: Jon Maloy [mailto:jon.ma...@ericsson.com]
> Sent: February-21-17 3:36 PM
> To: Butler, Peter <pbut...@sonusnet.com>; tipc-
> discuss...@lists.sourceforge.net
> Subject: RE: TIPC Oops in tipc_sk_recv
> 
> Hi Peter,
> I don't think this is any known bug. Is it repeatable?
> 
> ///jon
> 
> > -----Original Message-----
> > From: Butler, Peter [mailto:pbut...@sonusnet.com]
> > Sent: Tuesday, February 21, 2017 12:14 PM
> > To: tipc-discussion@lists.sourceforge.net
> > Cc: Butler, Peter <pbut...@sonusnet.com>
> > Subject: [tipc-discussion] TIPC Oops in tipc_sk_recv
> >
> > This was with kernel 4.4.0, however I don't see any fix specifically
> > related to this in any subsequent 4.4.x kernel...
> >
> > BUG: unable to handle kernel NULL pointer dereference at 00000000000000d8
> > IP: [<ffffffffa0148868>] tipc_sk_rcv+0x238/0x4d0 [tipc]
> > PGD 34f4c0067 PUD 34ed95067 PMD 0
> > Oops: 0000 [#1] SMP
> > Modules linked in: nf_log_ipv4 nf_log_common xt_LOG sctp libcrc32c
> > e1000e tipc udp_tunnel ip6_udp_tunnel iTCO_wdt 8021q garp xt_physdev
> > br_netfilter bridge stp llc nf_conntrack_ipv4 ipmiq_drv(O)
> > nf_defrag_ipv4
> > sio_mmc(O) ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6
> > xt_state nf_conntrack event_drv(O) ip6table_filter lockd ip6_tables
> > pt_timer_info(O) ddi(O) grace usb_storage ixgbe igb
> > iTCO_vendor_support i2c_algo_bit ptp i2c_i801 pps_core lpc_ich
> > i2c_core intel_ips mfd_core pcspkr ioatdma sunrpc dca tpm_tis mdio tpm
> > [last unloaded: iTCO_wdt]
> > CPU: 2 PID: 12144 Comm: dinamo Tainted: G           O    4.4.0 #23
> > Hardware name: PT AMC124/Base Board Product Name, BIOS
> > LGNAJFIP.PTI.0012.P15 01/15/2014
> > task: ffff880036ad8000 ti: ffff880036900000 task.ti: ffff880036900000
> > RIP: 0010:[<ffffffffa0148868>]  [<ffffffffa0148868>]
> > tipc_sk_rcv+0x238/0x4d0 [tipc]
> > RSP: 0018:ffff880036903bb8  EFLAGS: 00010292
> > RAX: 0000000000000000 RBX: ffff88034def3970 RCX: 0000000000000001
> > RDX: 0000000000000101 RSI: 0000000000000292 RDI: ffff88034def3984
> > RBP: ffff880036903c28 R08: 0000000000000101 R09: 0000000000000004
> > R10: 0000000000000001 R11: 0000000000000000 R12: ffff880036903d28
> > R13: 00000000bd1fd8b2 R14: ffff88034def3840 R15: ffff880036903d3c
> > FS:  00007f1e86299740(0000) GS:ffff88035fc40000(0000)
> > knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00000000000000d8 CR3: 0000000036835000 CR4: 00000000000006e0
> > Stack:
> >  000000000000009b ffff880036903d28 0000000000000018 ffff88034def38c8
> >  ffffffff81ce6240 ffff8802b9bdba00 ffff880036903ca8 ffffffffa013bd7e
> >  ffff8802b99d5ee8 ffff880036903c60 0000000000000000 ffff88003693cb00
> > Call Trace:
> >  [<ffffffffa013bd7e>] ? tipc_msg_build+0xde/0x4f0 [tipc]
> >  [<ffffffffa014358f>] tipc_node_xmit+0x11f/0x150 [tipc]
> >  [<ffffffffa01470ba>] __tipc_send_stream+0x16a/0x300 [tipc]
> >  [<ffffffff81625eb5>] ? tcp_sendmsg+0x4d5/0xb00
> >  [<ffffffffa0147292>] tipc_send_stream+0x42/0x70 [tipc]
> >  [<ffffffff815bcf77>] sock_sendmsg+0x47/0x50
> >  [<ffffffff815bd03f>] sock_write_iter+0x7f/0xd0
> >  [<ffffffff811d799a>] __vfs_write+0xaa/0xe0
> >  [<ffffffff811d8b16>] vfs_write+0xb6/0x1a0
> >  [<ffffffff811d8e3f>] SyS_write+0x4f/0xb0
> >  [<ffffffff816de6d7>] entry_SYSCALL_64_fastpath+0x12/0x6a
> > Code: 89 de 4c 89 f7 e8 29 d3 ff ff 48 8b 7d a8 e8 60 59 59 e1 49 8d 9e 30
> > 01 00 00 49 3b 9e 30 01 00 00 74 30 48 89 df e8 b8 b6 47 e1 <48> 8b 90 d8
> > 00 00 00 48 8b 7d b0 44 89 e9 48 89 c6 48 89 45 c0
> > RIP  [<ffffffffa0148868>] tipc_sk_rcv+0x238/0x4d0 [tipc]
> >  RSP <ffff880036903bb8>
> > CR2: 00000000000000d8
> > ---[ end trace 1c2d69738941d565 ]---
> >
> >

