> -----Original Message-----
> From: Butler, Peter [mailto:pbut...@sonusnet.com]
> Sent: Thursday, February 23, 2017 01:23 PM
> To: Jon Maloy <jon.ma...@ericsson.com>; tipc-
> discuss...@lists.sourceforge.net; Parthasarathy Bhuvaragan
> <parthasarathy.bhuvara...@ericsson.com>
> Cc: Butler, Peter <pbut...@sonusnet.com>
> Subject: RE: TIPC Oops in tipc_sk_recv
> 
> That might be a possibility - I know the customer is close to 32 nodes
> however, so it might not be.
> 
> I'm also looking at porting the required functionality from
> include/net/netlink.h and lib/nlattr.c directly into the TIPC monitor.c file
> (as opposed to changing any code directly in include/net and lib/.....

I think you are moving into dangerous waters here, unless you only want the
code to compile.
A simpler and safer option: change #define TIPC_DEF_MON_THRESHOLD in core.h
from 32 to e.g. 100, and the hierarchical monitoring will be disabled. This is
the way we have been running forever until 4.7, so this is a safe bet.
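
To illustrate why raising the threshold switches the feature off, here is a
minimal user-space sketch. It assumes the monitor only becomes active once the
number of known peers exceeds the threshold; struct mon_sketch and
mon_is_active() are hypothetical stand-ins for illustration, not the kernel's
own definitions.

```c
#include <stdbool.h>

/* Value suggested above; the thread says core.h currently uses 32. */
#define TIPC_DEF_MON_THRESHOLD 100

/* Hypothetical stand-in for the per-bearer monitor state. */
struct mon_sketch {
	int peer_cnt;	/* number of known peer nodes */
};

/* Assumed rule: hierarchical monitoring is active only when the
 * number of peers is strictly greater than the threshold.
 */
static bool mon_is_active(const struct mon_sketch *mon, int threshold)
{
	return mon->peer_cnt > threshold;
}
```

With 40 peers, mon_is_active() is true for a threshold of 32 and false for a
threshold of 100, so a cluster near 32 nodes stays on the old full-mesh
monitoring.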

//jon

> 
> 
> 
> -----Original Message-----
> From: Jon Maloy [mailto:jon.ma...@ericsson.com]
> Sent: February-23-17 1:19 PM
> To: Butler, Peter <pbut...@sonusnet.com>; tipc-
> discuss...@lists.sourceforge.net; Parthasarathy Bhuvaragan
> <parthasarathy.bhuvara...@ericsson.com>
> Subject: RE: TIPC Oops in tipc_sk_recv
> 
> 
> 
> > -----Original Message-----
> > From: Butler, Peter [mailto:pbut...@sonusnet.com]
> > Sent: Thursday, February 23, 2017 01:09 PM
> > To: Jon Maloy <jon.ma...@ericsson.com>; tipc-
> > discuss...@lists.sourceforge.net; Parthasarathy Bhuvaragan
> > <parthasarathy.bhuvara...@ericsson.com>
> > Cc: Butler, Peter <pbut...@sonusnet.com>
> > Subject: RE: TIPC Oops in tipc_sk_recv
> >
> > Partha - an update for you
> >
> > I've ported all the TIPC code from 4.9.11 into our 4.4.0 kernel code
> > base.  By this I mean I have completely removed all the existing TIPC
> > files in their entirety from:
> >
> > include/uapi/linux/tipc*
> > net/tipc/*
> >
> > in our 4.4.0 kernel source tree, and replaced these with all the files
> > from 4.9.11.
> >
> > As Jon indeed forewarned me, there will be a hurdle or two to
> > integrate this with the 4.4.0 kernel's internal API.  As it stands
> > this is where the compilation first fails.  I can certainly look into this
> > myself but am told you are the expert.
> > (I am far from a kernel expert myself.)
> >
> >   LD      net/tipc/built-in.o
> >   CC [M]  net/tipc/addr.o
> >   CC [M]  net/tipc/bcast.o
> >   CC [M]  net/tipc/bearer.o
> >   CC [M]  net/tipc/core.o
> >   CC [M]  net/tipc/link.o
> >   CC [M]  net/tipc/discover.o
> >   CC [M]  net/tipc/msg.o
> >   CC [M]  net/tipc/name_distr.o
> >   CC [M]  net/tipc/subscr.o
> >   CC [M]  net/tipc/monitor.o
> > net/tipc/monitor.c: In function '__tipc_nl_add_monitor_peer':
> 
> Unless you are running a cluster of more than 32 nodes and need the
> hierarchical neighbor monitoring feature, you can just comment out the
> contents of this function and the other monitor-related netlink functions.
> 
> ///jon
> 
> > net/tipc/monitor.c:707:3: error: implicit declaration of function
> > 'nla_put_u64_64bit' [-Werror=implicit-function-declaration]
> > cc1: some warnings being treated as errors
> > make[2]: *** [net/tipc/monitor.o] Error 1
> > make[1]: *** [net/tipc] Error 2
> > make: *** [net] Error 2
> >
> >
> >
> > -----Original Message-----
> > From: Butler, Peter
> > Sent: February-23-17 10:56 AM
> > To: Jon Maloy <jon.ma...@ericsson.com>; tipc-
> > discuss...@lists.sourceforge.net; Parthasarathy Bhuvaragan
> > <parthasarathy.bhuvara...@ericsson.com>
> > Cc: Butler, Peter <pbut...@sonusnet.com>
> > Subject: RE: TIPC Oops in tipc_sk_recv
> >
> > Hi Partha,
> >
> > I'll give you the short version here to save you the time of reading
> > this entire thread.
> >
> > Basically I need to port the latest and greatest TIPC code (i.e. from
> > the latest longterm kernel release, namely 4.9.11) into a 4.4.0 kernel
> > source base.  (I know that sounds ugly but it's for an emergency
> > quick-fix and upgrading the entire kernel is not an option at this
> > time...)
> >
> > Jon has said this is entirely doable but that you are the expert, and
> > that there will be at least one minor hurdle in doing so, namely in
> > iov handling in msg_build().
> >
> > Thanks,
> >
> > Peter
> >
> >
> >
> > -----Original Message-----
> > From: Jon Maloy [mailto:jon.ma...@ericsson.com]
> > Sent: February-23-17 10:45 AM
> > To: Butler, Peter <pbut...@sonusnet.com>; tipc-
> > discuss...@lists.sourceforge.net; Parthasarathy Bhuvaragan
> > <parthasarathy.bhuvara...@ericsson.com>
> > Subject: RE: TIPC Oops in tipc_sk_recv
> >
> >
> >
> > > -----Original Message-----
> > > From: Butler, Peter [mailto:pbut...@sonusnet.com]
> > > Sent: Thursday, February 23, 2017 10:25 AM
> > > To: Jon Maloy <jon.ma...@ericsson.com>; tipc-
> > > discuss...@lists.sourceforge.net
> > > Cc: Butler, Peter <pbut...@sonusnet.com>
> > > Subject: RE: TIPC Oops in tipc_sk_recv
> > >
> > > Hi Jon,
> > >
> > > Thanks for the info.  The solution we are considering (to give the
> > > customer an emergency patch) is to backport the TIPC code from kernel
> > > 4.4.50 into our 4.4.0 kernel source tree.  From what I can see, I
> > > should be able to do so with little effort.  I am assuming (?) that
> > > since 4.4.x is a longterm kernel release that the
> > > 4.4.50 TIPC code is considered stable and devoid of the original bug
> > > associated with this section of code in tipc_sk_rcv() - am I wrong
> > > to assume that?
> >
> > Unfortunately yes. The only safe solution to the deadlock problem is
> > the one you find in later versions.
> > The patch fixing this particular problem hasn't been applied this far
> > back, probably because it didn't apply cleanly.
> >
> > > The section of code in question is entirely different in 4.4.50 than
> > > what we currently have:
> > >
> > >       if (likely(tsk)) {
> > >          sk = &tsk->sk;
> > >          if (likely(spin_trylock_bh(&sk->sk_lock.slock))) {
> > >             tipc_sk_enqueue(inputq, sk, dport);
> > >             spin_unlock_bh(&sk->sk_lock.slock);
> > >          }
> > >          sock_put(sk);
> > >          continue;
> > >       }
> > >
> > > Does this mean that the 4.4.50 version (as shown above) is still
> > > susceptible to the original bug?  (Our original O/S maintainer
> > > patched this section because of the original bug that was causing an
> > > oops there - but obviously the patch he implemented was also buggy,
> > > as previously discussed.)
> > >
> > > Ultimately we would rather upgrade our entire kernel (say, to 4.9.11
> > > - the latest and greatest longterm release) but I see the TIPC
> > > design has changed significantly and I'm not sure if it would
> > > backport into our 4.4.0 kernel without significant effort; i.e.
> > > perhaps this change in design also depends on other API changes
> > > within other layers of the kernel.  If I am wrong in this and you
> > > think that the 4.9.11 TIPC code should be able to be backported to
> > > our 4.4.0 base then I will do so,
> >
> > It is absolutely doable. As a matter of fact, this is what Partha has
> > been doing in one of our own product lines.
> > AFAIK, the only build issue you will encounter is a change to the iov
> > handling in msg_build(), and that is easily fixed by reverting to the old
> > method.
> > (Correct me Partha, if I am wrong here). But, with new functionality
> > (e.g., new flow control) there are new issues which still haven't been
> > ironed out completely. I think Partha is the one to give a better update
> > here.
> >
> > ///jon
> >
> > > as there are far more fixes in 4.9.11 than in 4.4.50.  The reason we
> > > can't upgrade the entire kernel to 4.4.50 or 4.9.11 in the short
> > > term is a bit of a long story (which I will spare you), but suffice
> > > it to say that that is only an option for a long-term fix for our
> > > customers and not for this short term emergency fix which we need
> > > released asap.
> > >
> > > All this to say, the goal here is to move to the latest possible
> > > TIPC code which will (relatively) seamlessly integrate with our
> > > 4.4.0 kernel, and also be free of the aforementioned bug.  Let me
> > > know what you think.
> > >
> > > Thanks,
> > >
> > > Peter
> > >
> > >
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: Jon Maloy [mailto:jon.ma...@ericsson.com]
> > > Sent: February-23-17 8:22 AM
> > > To: Butler, Peter <pbut...@sonusnet.com>; tipc-
> > > discuss...@lists.sourceforge.net
> > > Subject: RE: TIPC Oops in tipc_sk_recv
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Butler, Peter [mailto:pbut...@sonusnet.com]
> > > > Sent: Wednesday, February 22, 2017 04:31 PM
> > > > To: Jon Maloy <jon.ma...@ericsson.com>; tipc-
> > > > discuss...@lists.sourceforge.net
> > > > Cc: Butler, Peter <pbut...@sonusnet.com>
> > > > Subject: RE: TIPC Oops in tipc_sk_recv
> > > >
> > > > Hi Jon,
> > > >
> > > > I think I found the problem, which ultimately may only exist on
> > > > our end (see below for an explanation, and let me know if you agree).
> > > >
> > > > The fellow that was maintaining our O/S previously (no longer with the
> > > > company) had made some patches to the 4.4.0 kernel TIPC code, and
> > > > indeed one of them is in the offending tipc_sk_rcv() function.
> > > >
> > > > Specifically, note this segment of code from our kernel source tree:
> > > >
> > > >                        /* Send pending response/rejected messages, if any */
> > > >                        while (!skb_queue_empty(&sk->sk_write_queue)) {
> > > >                                skb = skb_dequeue(&sk->sk_write_queue);
> > > >                                dnode = msg_destnode(buf_msg(skb));
> > > >                                tipc_node_xmit_skb(net, skb, dnode, dport);
> > > >                        }
> > >
> > > Yes, this is wrong. The socket write queue is only used for outgoing
> > > regular messages (Partha has later changed that), and should only be
> > > emptied by the sending thread. Running this code in interrupt
> > > context will give exactly the symptom you see, because the writing
> > > thread might already have freed or sent the buffer in question.
> > > >
> > > > Whereas the latest and greatest official longterm 4.9.11 kernel has:
> > > >
> > > >          /* Send pending response/rejected messages, if any */
> > > >          while ((skb = __skb_dequeue(&xmitq))) {
> > > >             dnode = msg_destnode(buf_msg(skb));
> > > >             tipc_node_xmit_skb(net, skb, dnode, dport);
> > > >          }
> > > >
> > > > The code path that triggers the oops (in our source code) is from:
> > > >
> > > > dnode = msg_destnode(buf_msg(skb));
> > > >
> > > > where msg_destnode() calls msg_word() which calls:
> > > >
> > > > ntohl(m->hdr[pos]);
> > > >
> > > > which is precisely where the oops occurred.
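
The call chain described above can be sketched in user space as follows. This
is only an illustration: struct tipc_msg_sketch is a simplified stand-in, the
header size of 15 words is an assumption, and placing the destination node in
header word 7 is an assumption not stated in this thread. The point is that
msg_word() loads through the message pointer, so a stale or NULL skb data
pointer faults on exactly that line.

```c
#include <arpa/inet.h>	/* htonl(), ntohl() */
#include <stdint.h>

/* Simplified stand-in for the TIPC message header: 32-bit words kept
 * in network byte order. The word count is an assumption.
 */
struct tipc_msg_sketch {
	uint32_t hdr[15];
};

/* Same shape as msg_word() in the gdb listing: load one header word
 * and convert it to host byte order. The load through 'm' is the
 * access that faults if the buffer has already been released.
 */
static uint32_t msg_word(const struct tipc_msg_sketch *m, uint32_t pos)
{
	return ntohl(m->hdr[pos]);
}

/* Store one header word in network byte order. */
static void msg_set_word(struct tipc_msg_sketch *m, uint32_t w, uint32_t val)
{
	m->hdr[w] = htonl(val);
}

/* Assumption: the destination node address lives in header word 7. */
static uint32_t msg_destnode(const struct tipc_msg_sketch *m)
{
	return msg_word(m, 7);
}
```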
> > > >
> > > > I'm not exactly sure where he got that code change - my guess is
> > > > he posted a question on the tipc-discussion list and got a
> > > > suggestion to try a code snippet, but in the end the actual
> > > > changes (that were officially released at kernel.org) differed, as per
> > > > above.
> > >
> > > I rather suspect he might have looked at the more recent code and
> > > tried to do the same, while misunderstanding the role of the write
> > > queue.
> > >
> > > > Indeed, on Google I can see some threads discussing a 'deadly embrace'
> > > > deadlock (for example
> > > > http://www.spinics.net/lists/netdev/msg382379.html) between
> > > > yourself and him.  Another possibility is that the offending
> > > > source code in question was indeed released sometime after 4.4.0,
> > > > but has since been modified/fixed, thus explaining the discrepancy.
> > >
> > > The loop was introduced in conjunction with that discussion, but it
> > > should not be done in the way it is done above. Indeed, I cannot see
> > > that this can have solved the "deadly embrace" problem at all,
> > > unless he made other changes and added the rejected/returned
> > > messages to the write queue. That might work most of the time, but
> > > will still sooner or later interfere with a sending thread.
> > >
> > > There are two ways you can solve this:
> > > 1: Introduce a stack based queue for reject/return messages, as we
> > > do, and pass it along in the calls.
> > > 2: Put send messages on a stack based queue, as Partha has done in
> > > the later versions. This assuming that the rejected messages are
> > > added to the write queue, as I am speculating above.
> > >
> > > BR
> > > ///jon
> > >
> > > >
> > > > If either of these possibilities is what actually happened, then this
> > > > may not be a bug you need to worry about.  Granted, the same
> > > > msg_destnode() call still exists in the current (4.9.11 and 4.10)
> > > > code, but the semantics of the encapsulating while loop are
> > > > different, and maybe as such that eliminates the issue.
> > > > Thoughts?
> > > >
> > > > Peter
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Jon Maloy [mailto:jon.ma...@ericsson.com]
> > > > Sent: February-22-17 3:01 PM
> > > > To: Butler, Peter <pbut...@sonusnet.com>; tipc-
> > > > discuss...@lists.sourceforge.net
> > > > Subject: RE: TIPC Oops in tipc_sk_recv
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Butler, Peter [mailto:pbut...@sonusnet.com]
> > > > > Sent: Wednesday, February 22, 2017 02:15 PM
> > > > > To: Jon Maloy <jon.ma...@ericsson.com>; tipc-
> > > > > discuss...@lists.sourceforge.net
> > > > > Cc: Butler, Peter <pbut...@sonusnet.com>
> > > > > Subject: RE: TIPC Oops in tipc_sk_recv
> > > > >
> > > > > For the "Source file is more recent than executable" message,
> > > > > could this simply be due to the fact that I copied the kernel
> > > > > source to the lab and then ran the gdb commands as shown?  As
> > > > > such, the newly copied files would have a newer timestamp than
> > > > > the kernel/tipc.ko files.
> > > > > (The kernel is actually built on a separate build machine, not on
> > > > > the test lab machine.)
> > > >
> > > > If you are certain that the build was made from the same source
> > > > this is a false alarm, caused by the timestamp as you suggest.
> > > >
> > > > ///jon
> > > >
> > > > >
> > > > > Or could I get that message for another reason?
> > > > >
> > > > >
> > > > >
> > > > > -----Original Message-----
> > > > > From: Jon Maloy [mailto:jon.ma...@ericsson.com]
> > > > > Sent: February-22-17 2:11 PM
> > > > > To: Butler, Peter <pbut...@sonusnet.com>; tipc-
> > > > > discuss...@lists.sourceforge.net
> > > > > Subject: RE: TIPC Oops in tipc_sk_recv
> > > > >
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Butler, Peter [mailto:pbut...@sonusnet.com]
> > > > > > Sent: Wednesday, February 22, 2017 01:04 PM
> > > > > > To: Jon Maloy <jon.ma...@ericsson.com>; tipc-
> > > > > > discuss...@lists.sourceforge.net
> > > > > > Cc: Butler, Peter <pbut...@sonusnet.com>
> > > > > > Subject: RE: TIPC Oops in tipc_sk_recv
> > > > > >
> > > > > > I took a stab at it this way - not sure if I am doing this
> > > > > > correctly or not.
> > > > > >
> > > > > > [root@myVMslot12 ~]# gdb /boot/vmlinuz-4.4.0 /proc/kcore
> > > > > > GNU gdb (GDB) Fedora (7.3.50.20110722-13.fc16)
> > > > > > Copyright (C) 2011 Free Software Foundation, Inc.
> > > > > > License GPLv3+: GNU GPL version 3 or later
> > > > > > <http://gnu.org/licenses/gpl.html>
> > > > > > This is free software: you are free to change and redistribute it.
> > > > > > There is NO WARRANTY, to the extent permitted by law.  Type
> > > > > > "show copying"
> > > > > > and "show warranty" for details.
> > > > > > This GDB was configured as "x86_64-redhat-linux-gnu".
> > > > > > For bug reporting instructions, please see:
> > > > > > <http://www.gnu.org/software/gdb/bugs/>...
> > > > > > BFD: /boot/vmlinuz-4.4.0: Warning: Ignoring section flag
> > > > > > IMAGE_SCN_MEM_NOT_PAGED in section .bss
> > > > > > BFD: /boot/vmlinuz-4.4.0: Warning: Ignoring section flag
> > > > > > IMAGE_SCN_MEM_NOT_PAGED in section .bss
> > > > > > Reading symbols from
> > > > > > /boot/vmlinuz-4.4.0...(no debugging symbols found)...done.
> > > > > >
> > > > > > warning: core file may not match specified executable file.
> > > > > > [New process 1]
> > > > > > Core was generated by `BOOT_IMAGE=/vmlinuz-4.4.0
> > > > > > root=UUID=b419f9ff-80ce-459e-855c-614d86a48105 ro rd.'.
> > > > > > #0  0x0000000000000000 in ?? ()
> > > > > > (gdb) file /lib/modules/4.4.0/kernel/net/tipc/tipc.ko
> > > > > > warning: core file may not match specified executable file.
> > > > > > Reading symbols from
> > > > > > /lib/modules/4.4.0/kernel/net/tipc/tipc.ko...done.
> > > > > > (gdb) list *(tipc_sk_rcv+0x238)
> > > > > > 0x14898 is in tipc_sk_rcv (net/tipc/msg.h:131).
> > > > > > warning: Source file is more recent than executable.
> > > > >
> > > > > Seems like you didn't rebuild after you updated the source file?
> > > > > Try again just to make sure.
> > > > >
> > > > > > 126             return (struct tipc_msg *)skb->data;
> > > > > > 127     }
> > > > > > 128
> > > > > > 129     static inline u32 msg_word(struct tipc_msg *m, u32 pos)
> > > > > > 130     {
> > > > > > 131             return ntohl(m->hdr[pos]);
> > > > >
> > > > > If this is correct, you are receiving a corrupt buffer where the
> > > > > data pointer is invalid. This is typical if the buffer already
> > > > > has been released.
> > > > >
> > > > > ///jon
> > > > >
> > > > > > 132     }
> > > > > > 133
> > > > > > 134     static inline void msg_set_word(struct tipc_msg *m, u32 w, u32 val)
> > > > > > 135     {
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: Butler, Peter
> > > > > > Sent: February-22-17 12:45 PM
> > > > > > To: Jon Maloy <jon.ma...@ericsson.com>; tipc-
> > > > > > discuss...@lists.sourceforge.net
> > > > > > Cc: Butler, Peter <pbut...@sonusnet.com>
> > > > > > Subject: RE: TIPC Oops in tipc_sk_recv
> > > > > >
> > > > > > Hi Jon
> > > > > >
> > > > > > Thanks for the info.
> > > > > >
> > > > > > One thing I should clarify.  Although we are running the 4.4.0
> > > > > > kernel, we had backported a number of post-4.4.0 TIPC patches
> > > > > > into our 4.4.0 kernel.  As such, the offset in question
> > > > > > (tipc_sk_rcv+0x238) will not match that in the vanilla 4.4.0 source.
> > > > > >
> > > > > > Should I post the entire socket.c file to this list for your review?
> > > > > > Or is there an easy way for me to do a similar listing using
> > > > > > our actual tipc.ko file here in the lab?
> > > > > >
> > > > > > Peter
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: Jon Maloy [mailto:jon.ma...@ericsson.com]
> > > > > > Sent: February-22-17 12:29 PM
> > > > > > To: Butler, Peter <pbut...@sonusnet.com>; tipc-
> > > > > > discuss...@lists.sourceforge.net
> > > > > > Subject: RE: TIPC Oops in tipc_sk_recv
> > > > > >
> > > > > > Hi Peter,
> > > > > > Very hard to make any suggestions on how to reproduce this.
> > > > > > What I can see is that it is a STREAM message being sent from
> > > > > > a node local socket, i.e., it doesn't go via any interface.
> > > > > > The crash seems to happen when the receiving socket is owned
> > > > > > by the user, and while we are instead adding the message to the
> > > > > > backlog queue:
> > > > > >
> > > > > > Reading symbols from net/tipc/tipc.ko...done.
> > > > > > (gdb) list *(tipc_sk_rcv+0x238)
> > > > > > 0x13d78 is in tipc_sk_rcv (./arch/x86/include/asm/atomic.h:214).
> > > > > > 209     static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u)
> > > > > > 210     {
> > > > > > 211             int c, old;
> > > > > > 212             c = atomic_read(v);
> > > > > > 213             for (;;) {
> > > > > > 214                     if (unlikely(c == (u)))
> > > > > > 215                             break;
> > > > > > 216                     old = atomic_cmpxchg((v), c, c + (a));
> > > > > > 217                     if (likely(old == c))
> > > > > > 218                             break;
> > > > > >
> > > > > > This is about what I can get out of it at the moment. Maybe
> > > > > > you should try a high-load test between two local sockets (try
> > > > > > the benchmark demo from
> > > > > > tipcutils) and see what you can achieve.
> > > > > >
> > > > > > BR
> > > > > > ///jon
> > > > > >
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Butler, Peter [mailto:pbut...@sonusnet.com]
> > > > > > > Sent: Wednesday, February 22, 2017 10:40 AM
> > > > > > > To: Jon Maloy <jon.ma...@ericsson.com>; tipc-
> > > > > > > discuss...@lists.sourceforge.net
> > > > > > > Cc: Butler, Peter <pbut...@sonusnet.com>
> > > > > > > Subject: RE: TIPC Oops in tipc_sk_recv
> > > > > > >
> > > > > > > If you have any suggestions as to procedures/tricks you
> > > > > > > think might trigger this bug I can certainly attempt to do so in
> > > > > > > the lab.
> > > > > > > Obviously we can't attempt to reproduce it on the customer's
> > > > > > > (live) system.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Butler, Peter
> > > > > > > Sent: February-21-17 3:39 PM
> > > > > > > To: Jon Maloy <jon.ma...@ericsson.com>; tipc-
> > > > > > > discuss...@lists.sourceforge.net
> > > > > > > Cc: Butler, Peter <pbut...@sonusnet.com>
> > > > > > > Subject: RE: TIPC Oops in tipc_sk_recv
> > > > > > >
> > > > > > > Unfortunately this occurred on a customer system so it is
> > > > > > > not readily reproducible.  We have not seen this occur in our lab.
> > > > > > >
> > > > > > > For what it's worth, it occurred while the process was in
> > > > > > > TASK_UNINTERRUPTIBLE.  As such, the kernel could not
> > > > > > > actually kill off the associated process despite the Oops,
> > > > > > > and the process remained forever frozen in the 'D' state and
> > > > > > > the card had to be rebooted.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Jon Maloy [mailto:jon.ma...@ericsson.com]
> > > > > > > Sent: February-21-17 3:36 PM
> > > > > > > To: Butler, Peter <pbut...@sonusnet.com>; tipc-
> > > > > > > discuss...@lists.sourceforge.net
> > > > > > > Subject: RE: TIPC Oops in tipc_sk_recv
> > > > > > >
> > > > > > > Hi Peter,
> > > > > > > I don't think this is any known bug. Is it repeatable?
> > > > > > >
> > > > > > > ///jon
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Butler, Peter [mailto:pbut...@sonusnet.com]
> > > > > > > > Sent: Tuesday, February 21, 2017 12:14 PM
> > > > > > > > To: tipc-discussion@lists.sourceforge.net
> > > > > > > > Cc: Butler, Peter <pbut...@sonusnet.com>
> > > > > > > > Subject: [tipc-discussion] TIPC Oops in tipc_sk_recv
> > > > > > > >
> > > > > > > > This was with kernel 4.4.0, however I don't see any fix
> > > > > > > > specifically related to this in any subsequent 4.4.x kernel...
> > > > > > > >
> > > > > > > > > BUG: unable to handle kernel NULL pointer dereference at 00000000000000d8
> > > > > > > > > IP: [<ffffffffa0148868>] tipc_sk_rcv+0x238/0x4d0 [tipc]
> > > > > > > > > PGD 34f4c0067 PUD 34ed95067 PMD 0
> > > > > > > > > Oops: 0000 [#1] SMP
> > > > > > > > > Modules linked in: nf_log_ipv4 nf_log_common xt_LOG sctp libcrc32c e1000e
> > > > > > > > > tipc udp_tunnel ip6_udp_tunnel iTCO_wdt 8021q garp xt_physdev br_netfilter
> > > > > > > > > bridge stp llc nf_conntrack_ipv4 ipmiq_drv(O) nf_defrag_ipv4 sio_mmc(O)
> > > > > > > > > ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state
> > > > > > > > > nf_conntrack event_drv(O) ip6table_filter lockd ip6_tables
> > > > > > > > > pt_timer_info(O) ddi(O) grace usb_storage ixgbe igb iTCO_vendor_support
> > > > > > > > > i2c_algo_bit ptp i2c_i801 pps_core lpc_ich i2c_core intel_ips mfd_core
> > > > > > > > > pcspkr ioatdma sunrpc dca tpm_tis mdio tpm [last unloaded: iTCO_wdt]
> > > > > > > > > CPU: 2 PID: 12144 Comm: dinamo Tainted: G           O    4.4.0 #23
> > > > > > > > > Hardware name: PT AMC124/Base Board Product Name, BIOS LGNAJFIP.PTI.0012.P15 01/15/2014
> > > > > > > > > task: ffff880036ad8000 ti: ffff880036900000 task.ti: ffff880036900000
> > > > > > > > > RIP: 0010:[<ffffffffa0148868>]  [<ffffffffa0148868>] tipc_sk_rcv+0x238/0x4d0 [tipc]
> > > > > > > > > RSP: 0018:ffff880036903bb8  EFLAGS: 00010292
> > > > > > > > > RAX: 0000000000000000 RBX: ffff88034def3970 RCX: 0000000000000001
> > > > > > > > > RDX: 0000000000000101 RSI: 0000000000000292 RDI: ffff88034def3984
> > > > > > > > > RBP: ffff880036903c28 R08: 0000000000000101 R09: 0000000000000004
> > > > > > > > > R10: 0000000000000001 R11: 0000000000000000 R12: ffff880036903d28
> > > > > > > > > R13: 00000000bd1fd8b2 R14: ffff88034def3840 R15: ffff880036903d3c
> > > > > > > > > FS:  00007f1e86299740(0000) GS:ffff88035fc40000(0000) knlGS:0000000000000000
> > > > > > > > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > > > > > CR2: 00000000000000d8 CR3: 0000000036835000 CR4: 00000000000006e0
> > > > > > > > > Stack:
> > > > > > > > >  000000000000009b ffff880036903d28 0000000000000018 ffff88034def38c8
> > > > > > > > >  ffffffff81ce6240 ffff8802b9bdba00 ffff880036903ca8 ffffffffa013bd7e
> > > > > > > > >  ffff8802b99d5ee8 ffff880036903c60 0000000000000000 ffff88003693cb00
> > > > > > > > > Call Trace:
> > > > > > > > >  [<ffffffffa013bd7e>] ? tipc_msg_build+0xde/0x4f0 [tipc]
> > > > > > > > >  [<ffffffffa014358f>] tipc_node_xmit+0x11f/0x150 [tipc]
> > > > > > > > >  [<ffffffffa01470ba>] __tipc_send_stream+0x16a/0x300 [tipc]
> > > > > > > > >  [<ffffffff81625eb5>] ? tcp_sendmsg+0x4d5/0xb00
> > > > > > > > >  [<ffffffffa0147292>] tipc_send_stream+0x42/0x70 [tipc]
> > > > > > > > >  [<ffffffff815bcf77>] sock_sendmsg+0x47/0x50
> > > > > > > > >  [<ffffffff815bd03f>] sock_write_iter+0x7f/0xd0
> > > > > > > > >  [<ffffffff811d799a>] __vfs_write+0xaa/0xe0
> > > > > > > > >  [<ffffffff811d8b16>] vfs_write+0xb6/0x1a0
> > > > > > > > >  [<ffffffff811d8e3f>] SyS_write+0x4f/0xb0
> > > > > > > > >  [<ffffffff816de6d7>] entry_SYSCALL_64_fastpath+0x12/0x6a
> > > > > > > > > Code: 89 de 4c 89 f7 e8 29 d3 ff ff 48 8b 7d a8 e8 60 59 59 e1 49 8d 9e 30 01 00
> > > > > > > > > 00 49 3b 9e 30 01 00 00 74 30 48 89 df e8 b8 b6 47 e1 <48> 8b 90 d8 00
> > > > > > > > > 00 00 48 8b 7d b0 44 89 e9 48 89 c6 48 89 45 c0
> > > > > > > > > RIP  [<ffffffffa0148868>] tipc_sk_rcv+0x238/0x4d0 [tipc]
> > > > > > > > >  RSP <ffff880036903bb8>
> > > > > > > > > CR2: 00000000000000d8
> > > > > > > > > ---[ end trace 1c2d69738941d565 ]---
> > > > > > > >
> > > > > > > >
> > > > > > > > ----------------------------------------------------------
> > > > > > > > --
> > > > > > > > --
> > > > > > > > --
> > > > > > > > --
> > > > > > > > --
> > > > > > > > --
> > > > > > > > -------- Check out the vibrant tech community on one of
> > > > > > > > the world's most engaging tech sites, SlashDot.org!
> > > > > > > > http://sdm.link/slashdot
> > > > > > > > _______________________________________________
> > > > > > > > tipc-discussion mailing list
> > > > > > > > tipc-discussion@lists.sourceforge.net
> > > > > > > > https://lists.sourceforge.net/lists/listinfo/tipc-discussi
> > > > > > > > on

