> -----Original Message-----
> From: Rune Torgersen [mailto:ru...@innovsys.com]
> Sent: Friday, 09 December, 2016 12:03
> To: tipc-discussion@lists.sourceforge.net
> Subject: [tipc-discussion] Transmission errors
> 
> Can someone help me decode the following errors in my kernel log?
> (Ubuntu 16.04 with 4.4.0-45)
> 
> Dec  8 13:12:10 michelltelctrl1 kernel: [2603354.089708] Retransmission 
> failure on
> link <1.1.1:eth0-1.1.2:eth0>
> Dec  8 13:12:10 michelltelctrl1 kernel: [2603354.089712] Resetting link  Link
> <1.1.1:eth0-1.1.2:eth0> state e
> Dec  8 13:12:10 michelltelctrl1 kernel: [2603354.089714] XMTQ: 33 
> [43899-43931],
> BKLGQ: 0, SNDNX: 43932, RCVNX: 21825
> Dec  8 13:12:10 michelltelctrl1 kernel: [2603354.089715] Failed msg: usr 0, 
> typ 2,
> len 266, err 0
> Dec  8 13:12:10 michelltelctrl1 kernel: [2603354.089716] sqno 43899, prev:
> 1001001, src: 1001001

Hi Rune,
It means that the link was reset because TIPC made 100 failed attempts to
send the same packet through it.
In this case the packet is a data message (user 0) containing a port name
address (type 2), but looking further down I see no pattern in this; it
seems to happen with any type of message.
What is listed is the contents of the link send queue (33 packets, #43899
to #43931) and the failing packet (#43899), which is 266 bytes long.
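
To make the queue line in the dump easier to read, here is a minimal Python
sketch that pulls out its fields and checks that the packet count agrees with
the listed sequence range. The names QUEUE_RE, parse_queue_line and
range_matches_count are made up for this illustration and are not part of
TIPC. It assumes the line has been unwrapped first, and it ignores sequence
number wraparound.

```python
import re

# Matches the queue summary line from the reset dump, for example:
#   XMTQ: 33 [43899-43931], BKLGQ: 0, SNDNX: 43932, RCVNX: 21825
QUEUE_RE = re.compile(
    r"XMTQ:\s*(\d+)\s*\[(\d+)-(\d+)\],\s*"
    r"BKLGQ:\s*(\d+),\s*SNDNX:\s*(\d+),\s*RCVNX:\s*(\d+)"
)


def parse_queue_line(line):
    """Return the queue fields as a dict, or None if the line does not match."""
    m = QUEUE_RE.search(line)
    if m is None:
        return None
    count, first, last, backlog, snd_nxt, rcv_nxt = map(int, m.groups())
    return {
        "count": count,      # packets in the send queue
        "first": first,      # sequence number of the first queued packet
        "last": last,        # sequence number of the last queued packet
        "backlog": backlog,  # BKLGQ value
        "snd_nxt": snd_nxt,  # SNDNX value
        "rcv_nxt": rcv_nxt,  # RCVNX value
    }


def range_matches_count(q):
    """True if the listed range holds exactly 'count' sequence numbers."""
    return q["last"] - q["first"] + 1 == q["count"]
```

For the first dump this gives a count of 33 for the range 43899 to 43931,
which is consistent, and the first number in the range is the sqno of the
failing packet.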

I can see nothing wrong with the packets or the link, but I observe another
interesting pattern: all resets seem to happen at the same minute of the
hour (13:12, 14:12 etc.). Is it possible that you have some recurring hourly
job that overloads the switch, the interfaces or the CPUs?
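
One quick way to check the timing pattern over a longer log is to count the
resets per minute of the hour. The following is only a sketch: the name
reset_minutes is invented here, and it assumes syslog lines in the format
quoted in this thread, where the timestamp and the word "Retransmission" are
on the same line.

```python
import re
from collections import Counter

# Timestamp, host name and "kernel:" tag, followed by the reset message.
# A leading "> " quote prefix does not matter because search() is used.
RESET_RE = re.compile(
    r"\b(\d{2}):(\d{2}):(\d{2})\s+\S+\s+kernel:.*Retransmission"
)


def reset_minutes(lines):
    """Count link reset events per minute of the hour (0-59)."""
    minutes = Counter()
    for line in lines:
        m = RESET_RE.search(line)
        if m:
            minutes[int(m.group(2))] += 1
    return minutes
```

Feeding it the log lines, for example reset_minutes(open(path)) where path is
wherever your kernel log lives, should show whether the resets really cluster
on a single minute. If they do, the crontab entries on both nodes are an
obvious thing to compare against.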

I think that might be a good place to start.

BR
///jon


> Dec  8 14:12:12 michelltelctrl1 kernel: [2606955.869185] Retransmission 
> failure on
> link <1.1.1:eth0-1.1.2:eth0>
> Dec  8 14:12:12 michelltelctrl1 kernel: [2606955.869197] Resetting link  Link
> <1.1.1:eth0-1.1.2:eth0> state e
> Dec  8 14:12:12 michelltelctrl1 kernel: [2606955.869201] XMTQ: 50 
> [44194-44243],
> BKLGQ: 36, SNDNX: 44244, RCVNX: 21597
> Dec  8 14:12:12 michelltelctrl1 kernel: [2606955.869203] Failed msg: usr 0, 
> typ 2,
> len 1339, err 0
> Dec  8 14:12:12 michelltelctrl1 kernel: [2606955.869205] sqno 44194, prev:
> 1001001, src: 1001001
> Dec  8 16:12:17 michelltelctrl1 kernel: [2614160.280016] Retransmission 
> failure on
> link <1.1.1:eth0-1.1.2:eth0>
> Dec  8 16:12:17 michelltelctrl1 kernel: [2614160.280028] Resetting link  Link
> <1.1.1:eth0-1.1.2:eth0> state e
> Dec  8 16:12:17 michelltelctrl1 kernel: [2614160.280034] XMTQ: 2 [6-7], 
> BKLGQ: 0,
> SNDNX: 8, RCVNX: 5
> Dec  8 16:12:17 michelltelctrl1 kernel: [2614160.280036] Failed msg: usr 0, 
> typ 2,
> len 266, err 0
> Dec  8 16:12:17 michelltelctrl1 kernel: [2614160.280038] sqno 6, prev: 
> 1001001, src:
> 1001001
> Dec  8 21:12:12 michelltelctrl1 kernel: [2632155.842657] Retransmission 
> failure on
> link <1.1.1:eth0-1.1.2:eth0>
> Dec  8 21:12:12 michelltelctrl1 kernel: [2632155.842668] Resetting link  Link
> <1.1.1:eth0-1.1.2:eth0> state e
> Dec  8 21:12:12 michelltelctrl1 kernel: [2632155.842673] XMTQ: 50 
> [22133-22182],
> BKLGQ: 32, SNDNX: 22183, RCVNX: 41898
> Dec  8 21:12:12 michelltelctrl1 kernel: [2632155.842674] Failed msg: usr 12, 
> typ 1,
> len 1460, err 0
> Dec  8 21:12:12 michelltelctrl1 kernel: [2632155.842676] sqno 22133, prev:
> 1001001, src: 1001001
> Dec  8 23:12:10 michelltelctrl1 kernel: [2639354.222566] Retransmission 
> failure on
> link <1.1.1:eth0-1.1.2:eth0>
> Dec  8 23:12:10 michelltelctrl1 kernel: [2639354.222578] Resetting link  Link
> <1.1.1:eth0-1.1.2:eth0> state e
> Dec  8 23:12:10 michelltelctrl1 kernel: [2639354.222583] XMTQ: 23 
> [22895-22917],
> BKLGQ: 0, SNDNX: 22918, RCVNX: 42582
> Dec  8 23:12:10 michelltelctrl1 kernel: [2639354.222585] Failed msg: usr 12, 
> typ 1,
> len 1460, err 0
> Dec  8 23:12:10 michelltelctrl1 kernel: [2639354.222587] sqno 22895, prev:
> 1001001, src: 1001001
> Dec  9 07:12:06 michelltelctrl1 kernel: [2668150.028976] Retransmission 
> failure on
> link <1.1.1:eth0-1.1.2:eth0>
> Dec  9 07:12:06 michelltelctrl1 kernel: [2668150.028988] Resetting link  Link
> <1.1.1:eth0-1.1.2:eth0> state e
> Dec  9 07:12:06 michelltelctrl1 kernel: [2668150.028995] XMTQ: 6 
> [22095-22100],
> BKLGQ: 0, SNDNX: 22101, RCVNX: 41522
> Dec  9 07:12:06 michelltelctrl1 kernel: [2668150.028997] Failed msg: usr 0, 
> typ 2,
> len 266, err 0
> Dec  9 07:12:06 michelltelctrl1 kernel: [2668150.029000] sqno 22095, prev:
> 1001001, src: 1001001
> Dec  9 09:12:16 michelltelctrl1 kernel: [2675360.164471] Retransmission 
> failure on
> link <1.1.1:eth0-1.1.2:eth0>
> Dec  9 09:12:16 michelltelctrl1 kernel: [2675360.164483] Resetting link  Link
> <1.1.1:eth0-1.1.2:eth0> state e
> Dec  9 09:12:16 michelltelctrl1 kernel: [2675360.164487] XMTQ: 2 
> [21277-21278],
> BKLGQ: 0, SNDNX: 21279, RCVNX: 13712
> Dec  9 09:12:16 michelltelctrl1 kernel: [2675360.164489] Failed msg: usr 0, 
> typ 2,
> len 1347, err 0
> Dec  9 09:12:16 michelltelctrl1 kernel: [2675360.164490] sqno 21277, prev:
> 1001001, src: 1001001
> 
> ------------------------------------------------------------------------------
> Developer Access Program for Intel Xeon Phi Processors
> Access to Intel Xeon Phi processor-based developer platforms.
> With one year of Intel Parallel Studio XE.
> Training and support from Colfax.
> Order your platform today.http://sdm.link/xeonphi
> _______________________________________________
> tipc-discussion mailing list
> tipc-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/tipc-discussion
