Of course, upgrading to the latest kernel appears to just make things
worse, as per this crash...
-Original Message-
From: Ying Xue [mailto:ying@windriver.com]
Sent: July-25-17 8:48 AM
To: Butler, Peter ; Parthasarathy Bhuvaragan
; tipc-discussion@lists.sourceforge.net
Cc: Jon Maloy ; LU
]
Sent: July-24-17 9:00 AM
To: Butler, Peter ; tipc-discussion@lists.sourceforge.net
Cc: Jon Maloy ; Parthasarathy Bhuvaragan
; LUU Duc Canh
Subject: Re: TIPC connection stalling due to invalid congestion status when
bearer 0 recovers
Hi Peter,
Thank you for describing the issue you encountered so clearly
4] RIP: kfree_skb_list+0x18/0x30 RSP: c90005383b18
[ 2385.388611] ---[ end trace 125f5b3fcb6ee71d ]---
-Original Message-
From: Butler, Peter
Sent: July-24-17 11:21 AM
To: Parthasarathy Bhuvaragan ;
tipc-discussion@lists.sourceforge.net
Cc: Jon Maloy ; Ying Xue ; LUU
Duc Canh
Subject: RE
look into upgrading the entire kernel...
Peter
-Original Message-
From: Butler, Peter
Sent: July-24-17 11:21 AM
To: Parthasarathy Bhuvaragan ;
tipc-discussion@lists.sourceforge.net
Cc: Jon Maloy ; Ying Xue ; LUU
Duc Canh
Subject: RE: TIPC connection stalling due to invalid congestion
From: Parthasarathy Bhuvaragan [mailto:parthasarathy.bhuvara...@ericsson.com]
Sent: July-24-17 8:58 AM
To: Butler, Peter ; tipc-discussion@lists.sourceforge.net
Cc: Jon Maloy ; Ying Xue ; LUU
Duc Canh
Subject: Re: TIPC connection stalling due to invalid congestion status when
bearer 0 recovers
Hi Peter
Hello,
I am using a 19-node TIPC configuration, whereby each card (node) in the mesh
has two Ethernet interfaces connected to two disjoint subnets served by switch0
and switch1, respectively. TIPC is set to use two bearers on each card. 16 of
these cards are using TIPC 4.4.0 (with a few patche
Cc: Andrew Booth (abo...@pt.com) ; Butler, Peter
; Parthasarathy Bhuvaragan
; Ying Xue ;
tipc-discussion@lists.sourceforge.net
Subject: Re: FW: TIPC issue: connection stalls when switch for bearer 0 recovers
Hi Andrew,
Could you help me apply and try with Jon's patch as file attachmen
I see "[PATCH v2 net-next 0/6] solve two deadlock issues" that Ying just
committed a few minutes before my post, but I am not sure whether it is the
same thing...
-Original Message-
From: Butler, Peter
Sent: March-09-17 9:53 AM
To: tipc-discussion@lists.sourceforge.net
Sub
This is on a node running 4.9.11 TIPC. There are 9 nodes in the cluster: 7 run
the same 4.9.11 TIPC (on x86-64) and 2 run an old 1.7 TIPC (on PPC).
It keeps cycling through these same logs every few seconds.
[118768.064830] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s!
[swapper/3:0]
Important data point: when the two TIPC 1.7 nodes are taken out of the cluster,
the error logs (which were being generated on the 4.9.11 TIPC nodes) cease.
-Original Message-
From: Jon Maloy [mailto:jon.ma...@ericsson.com]
Sent: March-08-17 11:57 AM
To: Butler, Peter ; tipc-discussion
AM
To: Butler, Peter ; tipc-discussion@lists.sourceforge.net
Subject: RE: Constant Illegal FSM event / Resetting Link errors
This looks very much like the deadlock that Partha tried to fix in commit
d094c4d5f5c7e1b2 ("tipc: add subscription refcount..") in 4.10. It is quite
likely that this
00 01 00 00
-Original Message-
From: Jon Maloy [mailto:jon.ma...@ericsson.com]
Sent: March-08-17 11:32 AM
To: Butler, Peter ; tipc-discussion@lists.sourceforge.net
Subject: RE: Constant Illegal FSM event / Resetting Link errors
> -Original Message-
> From: Butler, P
There are 7 nodes in the system running 4.9.11 TIPC (on 4.4.0 x86-64 kernels),
and 2 nodes in the system running TIPC 1.7 (on 2.6.20 PPC kernels).
-Original Message-
From: Jon Maloy [mailto:jon.ma...@ericsson.com]
Sent: March-08-17 11:21 AM
To: Butler, Peter ; tipc-discussion
8 nodes in mesh, running TIPC from kernel 4.9.11.
The following log messages are continually being spammed (many times per
second):
Mar 8 00:17:31 [SEQ 409067] myVMslot12 kernel: [ 130.406118] Resetting link
Link state 2000
Mar 8 00:17:31 [SEQ 409068] myVMslot12 kernel: [ 130.406120] XM
d over
several connections", do you mean 1000+ connections? Or 1000+ messages per
second?
Our mesh only has ~30 nodes.
Peter
-Original Message-
From: Parthasarathy Bhuvaragan [mailto:parthasarathy.bhuvara...@ericsson.com]
Sent: February-27-17 7:37 AM
To: Butler, Peter
Cc: Jon Mal
? It is my understanding that
kernel code is meant to be backward-compatible in principle...
Peter
-Original Message-
From: Parthasarathy Bhuvaragan [mailto:parthasarathy.bhuvara...@ericsson.com]
Sent: February-27-17 7:37 AM
To: Butler, Peter
Cc: Jon Maloy ; tipc-discussion
ning - but that of course doesn't mean
that run-time issues won't occur.
/Peter
From: Parthasarathy Bhuvaragan [mailto:parthasarathy.bhuvara...@ericsson.com]
Sent: February-24-17 5:21 AM
To: Butler, Peter
Cc: Jon Maloy ; tipc-discussion@lists.sourceforge.net
Subject: Re: TIPC Oops
compilation to fail there?
Or were you expecting it to succeed, but the resulting TIPC functionality to
simply be erroneous at run-time?
Peter
-Original Message-
From: Butler, Peter
Sent: February-23-17 2:48 PM
To: Jon Maloy ; tipc-discussion@lists.sourceforge.net;
Parthasarathy Bhuvarag
Correct - we only use 'eth' as a bearer.
-Original Message-
From: Jon Maloy [mailto:jon.ma...@ericsson.com]
Sent: February-23-17 3:03 PM
To: Butler, Peter ;
tipc-discussion@lists.sourceforge.net; Parthasarathy Bhuvaragan
Subject: RE: TIPC Oops in tipc_sk_recv
Just comme
too many arguments to function
'udp_tunnel6_xmit_skb'
include/net/udp_tunnel.h:87:5: note: declared here
make[1]: *** [net/tipc/udp_media.o] Error 1
make: *** [net/tipc/] Error 2
-Original Message-
From: Butler, Peter
Sent: February-23-17 2:14 PM
To: Jon Maloy ; tipc-disc
/tipc/] Error 2
-Original Message-
From: Butler, Peter
Sent: February-23-17 1:45 PM
To: Jon Maloy ; tipc-discussion@lists.sourceforge.net;
Parthasarathy Bhuvaragan
Cc: Butler, Peter
Subject: RE: TIPC Oops in tipc_sk_recv
I definitely don't want to be moving into dangerous waters, so I'll take your
suggestion right now and start over
-Original Message-
From: Jon Maloy [mailto:jon.ma...@ericsson.com]
Sent: February-23-17 1:43 PM
To: Butler, Peter ;
tipc-discussion@lists.sourceforge.net; Par
e/net and lib/.
-Original Message-
From: Jon Maloy [mailto:jon.ma...@ericsson.com]
Sent: February-23-17 1:19 PM
To: Butler, Peter ;
tipc-discussion@lists.sourceforge.net; Parthasarathy Bhuvaragan
Subject: RE: TIPC Oops in tipc_sk_recv
> -Original Message-
> F
tipc/monitor.o] Error 1
make[1]: *** [net/tipc] Error 2
make: *** [net] Error 2
-Original Message-
From: Butler, Peter
Sent: February-23-17 10:56 AM
To: Jon Maloy ; tipc-discussion@lists.sourceforge.net;
Parthasarathy Bhuvaragan
Cc: Butler, Peter
Subject: RE: TIPC Oops in tipc_sk_r
-Original Message-
From: Jon Maloy [mailto:jon.ma...@ericsson.com]
Sent: February-23-17 10:45 AM
To: Butler, Peter ;
tipc-discussion@lists.sourceforge.net; Parthasarathy Bhuvaragan
Subject: RE: TIPC Oops in tipc_sk_recv
> -Original Message-
> From: Butler, Peter [mailto:pbut
PC code
which will (relatively) seamlessly integrate with our 4.4.0 kernel, and also be
free of the aforementioned bug. Let me know what you think.
Thanks,
Peter
-Original Message-
From: Jon Maloy [mailto:jon.ma...@ericsson.com]
Sent: February-23-17 8:22 AM
To: Butler, Peter ; tipc-d
ame msg_destnode() call still exists in
the current (4.9.11 and 4.10) code, but the semantics of the enclosing
while loop are different, which may be what eliminates the issue.
Thoughts?
Peter
-Original Message-
From: Jon Maloy [mailto:jon.ma...@ericsson.com]
Sent: Febr
kernel is actually built with
a different compiler than the one on the test lab machine.)
Or could I get that message for another reason?
-Original Message-
From: Jon Maloy [mailto:jon.ma...@ericsson.com]
Sent: February-22-17 2:11 PM
To: Butler, Peter ; tipc-discussion@lists.sourceforge.net
Subject: RE:
	return (struct tipc_msg *)skb->data;
}

static inline u32 msg_word(struct tipc_msg *m, u32 pos)
{
	return ntohl(m->hdr[pos]);
}

static inline void msg_set_word(struct tipc_msg *m, u32 w, u32 val)
{
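For anyone following the listing: msg_word() reads a 32-bit header word and converts it from network byte order with ntohl(). The sketch below shows the same idea in user-space Python, not kernel code. The bytearray standing in for the header and the 6-word size are illustrative assumptions. The body of msg_set_word() is cut off in the listing, so the setter here is only assumed to be the inverse of the getter.

```python
import struct

def msg_word(hdr: bytearray, pos: int) -> int:
    """Return header word `pos`, converted from network (big-endian) order."""
    return struct.unpack_from("!I", hdr, pos * 4)[0]

def msg_set_word(hdr: bytearray, w: int, val: int) -> None:
    """Store `val` into header word `w` in network (big-endian) order.

    Assumed to mirror msg_word(); the kernel body is not shown in the listing.
    """
    struct.pack_into("!I", hdr, w * 4, val & 0xFFFFFFFF)

if __name__ == "__main__":
    hdr = bytearray(24)  # hypothetical 6-word header, for illustration only
    msg_set_word(hdr, 1, 0x12345678)
    print(hex(msg_word(hdr, 1)), hdr[4:8].hex())
```

Because the words are stored big-endian, the bytes on the wire are the same regardless of host endianness, which matters in a mixed x86-64 / PPC cluster.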
-Original Mess
the entire socket.c file to this list for your review? Or is
there an easy way for me to do a similar listing using our actual tipc.ko file
here in the lab?
Peter
-Original Message-
From: Jon Maloy [mailto:jon.ma...@ericsson.com]
Sent: February-22-17 12:29 PM
To: Butler, Peter
If you have any suggestions as to procedures/tricks you think might trigger
this bug I can certainly attempt to do so in the lab. Obviously we can't
attempt to reproduce it on the customer's (live) system.
-Original Message-
From: Butler, Peter
Sent: February-21-17 3:39
Oops, and the process remained forever frozen in the 'D' state and the card
had to be rebooted.
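A quick way to check in the lab whether any task is stuck in the same uninterruptible 'D' state is to scan /proc. This is a minimal sketch, assuming a Linux /proc layout; the function names are illustrative and not part of any TIPC tooling.

```python
import os

def parse_stat(stat_line):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or
    parentheses, so split on the first '(' and the last ')'.
    """
    lpar = stat_line.index("(")
    rpar = stat_line.rindex(")")
    pid = int(stat_line[:lpar])
    comm = stat_line[lpar + 1:rpar]
    state = stat_line[rpar + 1:].split()[0]
    return pid, comm, state

def uninterruptible_tasks(proc_root="/proc"):
    """Return [(pid, comm)] for every process currently in state 'D'."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []  # no procfs here (for example, not a Linux host)
    found = []
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or was unreadable during the scan
        if state == "D":
            found.append((pid, comm))
    return found

if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

A task that stays in this list across repeated runs, rather than appearing briefly during I/O, would be consistent with the frozen process described above.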
-Original Message-
From: Jon Maloy [mailto:jon.ma...@ericsson.com]
Sent: February-21-17 3:36 PM
To: Butler, Peter ; tipc-discussion@lists.sourceforge.net
Subject: RE: TIPC Oops
This was with kernel 4.4.0; however, I don't see any fix specifically related
to this in any subsequent 4.4.x kernel...
BUG: unable to handle kernel NULL pointer dereference at 00d8
IP: [] tipc_sk_rcv+0x238/0x4d0 [tipc]
PGD 34f4c0067 PUD 34ed95067 PMD 0
Oops: [#1] SMP
Modules link
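The IP line of an Oops gives the faulting location as symbol+offset/size, optionally followed by the module name in brackets. The sketch below pulls those fields out of such a line. The regular expression and function name are illustrative assumptions, not taken from any kernel tool.

```python
import re

# symbol+0xOFFSET/0xSIZE, optionally followed by " [module]"
LOCATION_RE = re.compile(
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
    r"(?:\s+\[(?P<mod>[\w-]+)\])?"
)

def parse_location(line):
    """Return symbol, offset, size and module from an Oops IP/RIP line, or None."""
    m = LOCATION_RE.search(line)
    if m is None:
        return None
    return {
        "symbol": m["sym"],
        "offset": int(m["off"], 16),
        "size": int(m["size"], 16),
        "module": m["mod"],  # None when no [module] suffix is present
    }

if __name__ == "__main__":
    print(parse_location("IP: [] tipc_sk_rcv+0x238/0x4d0 [tipc]"))
```

With the symbol and offset in hand, a module built with debug info can be inspected in gdb, for example with `list *(tipc_sk_rcv+0x238)`, to find the corresponding source line.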
PM, Butler, Peter wrote:
> We can certainly do that for our customers' future upgrades. However, we
> may need to just patch in the interim.
>
>
> Is the patch small enough (self-contained enough) that it would be easy
> enough for me to port it into our 4.4.0 kernel? Or d
changed between 4.4 and 4.8?
From: Jon Maloy
Sent: Friday, December 9, 2016 1:57:46 PM
To: Butler, Peter; tipc-discussion@lists.sourceforge.net
Subject: RE: reproducible link failure scenario
Hi Peter,
This is a known bug, fixed in commit d2f394dc4816 ("tipc
I have a reproducible failure scenario that results in the following kernel
messages being printed in succession (along with the associated link failing):
Dec 8 12:10:33 [SEQ 617259] lab236slot6 kernel: [44856.752261] Retransmission
failure on link <1.1.6:p19p1-1.1.8:p19p1>
Dec 8 12:10:33 [SE
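When sifting through many such messages, it can help to split the link name into its two endpoints. This is a minimal sketch, assuming the name has the form <Z.C.N:interface-Z.C.N:interface> as in the message above, and assuming the first endpoint is the local one. The function name is illustrative.

```python
import re

# <addr:iface-addr:iface>, where addr is zone.cluster.node
LINK_RE = re.compile(
    r"<(?P<laddr>\d+\.\d+\.\d+):(?P<lif>[^>]+?)"
    r"-(?P<paddr>\d+\.\d+\.\d+):(?P<pif>[^>]+)>"
)

def parse_link_name(text):
    """Return the (address, interface) pair for each link endpoint, or None."""
    m = LINK_RE.search(text)
    if m is None:
        return None
    return {
        "local": (m["laddr"], m["lif"]),  # assumed: first endpoint is local
        "peer": (m["paddr"], m["pif"]),
    }

if __name__ == "__main__":
    msg = "Retransmission failure on link <1.1.6:p19p1-1.1.8:p19p1>"
    print(parse_link_name(msg))
```

Grouping failures by peer address or by interface makes it easier to see whether the problem follows one node or one bearer.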