Please don't drop the list from CC.

On 2011-03-01 15:52, Vinzenz Bargsten wrote:
> On 28.02.2011 20:39, Jan Kiszka wrote:
>> On 2011-02-28 17:15, Vinzenz Bargsten wrote:
>>   
>>> On 27.02.2011 19:49, Vinzenz Bargsten wrote:
>>>     
>>>> Hi,
>>>>
>>>> I have set up RTnet with the rt_8139too module and use it to communicate
>>>> with a non-TDMA station.
>>>> This works quite well so far, but every now and then no more packets are
>>>> received.
>>>> I checked that the other station is still sending packets (it is), but
>>>> nothing shows up anymore in wireshark / the rt_cap module.
>>>> When I then execute rtping, everything works fine again. Even some old
>>>> packets then show up in wireshark, even when the remote station is no
>>>> longer sending or the cable has been disconnected.
>>>>
>>>> These abrupt disconnections are quite problematic for my application
>>>> (robot movement).
>>>>
>>>> syslog shows messages like "<interface> enters/leaves promiscuous mode",
>>>> and if I execute rtping, an "Abnormal interrupt 00000020" message appears.
>>>>
>>>>
>>>>       
>>> Here is a syslog covering the loading of the rt modules and several
>>> occasions on which no packets were received any longer;
>>> rtping somehow fixes this each time.
>>> --------------------------------------------------------------------------------------------------------
>>> <loading rt modules,
>>> reload normal 8139too module for second nic>
>>>
>>> Feb 28 14:37:12 robot02 kernel: [ 2700.032927]
>>> Feb 28 14:37:12 robot02 kernel: [ 2700.032927] *** RTnet 0.9.12 - built on Nov 16 2010 17:58:50 ***
>>> Feb 28 14:37:12 robot02 kernel: [ 2700.032928]
>>> Feb 28 14:37:12 robot02 kernel: [ 2700.032930] RTnet: initialising real-time networking
>>> Feb 28 14:37:12 robot02 kernel: [ 2700.083904] initializing loopback...
>>> Feb 28 14:37:12 robot02 kernel: [ 2700.083910] RTnet: registered rtlo
>>> Feb 28 14:37:12 robot02 kernel: [ 2700.099315] rt_8139too Fast Ethernet driver 0.9.24-rt0.7
>>> Feb 28 14:37:12 robot02 kernel: [ 2700.099343] rt_8139too 0000:03:00.0: PCI->APIC IRQ transform: INT A -> IRQ 16
>>> Feb 28 14:37:12 robot02 kernel: [ 2700.099719] RTnet: registered rteth0
>>> Feb 28 14:37:12 robot02 kernel: [ 2700.119140] RTcap: real-time capturing interface
>>> Feb 28 14:37:12 robot02 kernel: [ 2700.157079] 8139too: 8139too Fast Ethernet driver 0.9.28
>>> Feb 28 14:37:12 robot02 kernel: [ 2700.157103] 8139too 0000:03:01.0: PCI->APIC IRQ transform: INT A -> IRQ 17
>>> Feb 28 14:37:12 robot02 kernel: [ 2700.157987] 8139too 0000:03:01.0: eth0: RealTek RTL8139 at 0xe400, 00:50:fc:4b:ab:fb, IRQ 17
>>> Feb 28 14:37:12 robot02 kernel: [ 2700.159954] eth0: link up, 100Mbps, full-duplex, lpa 0xC1E1
>>>
>>> <starting wireshark>
>>> Feb 28 14:38:36 robot02 kernel: [ 2783.323474] device rteth0 entered promiscuous mode
>>>
>>> <switching on remote station, tg3 interface is also connected to it>
>>> Feb 28 14:41:41 robot02 kernel: [ 2968.349004] rteth0: Abnormal interrupt, status 00002020.
>>> Feb 28 14:41:42 robot02 kernel: [ 2969.335530] rteth0: Abnormal interrupt, status 00000020.
>>> Feb 28 14:41:44 robot02 kernel: [ 2971.016290] rteth0: Abnormal interrupt, status 00002020.
>>> Feb 28 14:41:47 robot02 kernel: [ 2974.027938] tg3 0000:02:00.0: eth1: Link is up at 1000 Mbps, full duplex
>>> Feb 28 14:41:47 robot02 kernel: [ 2974.027941] tg3 0000:02:00.0: eth1: Flow control is on for TX and on for RX
>>> Feb 28 14:41:47 robot02 kernel: [ 2974.028230] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
>>> Feb 28 14:42:03 robot02 kernel: [ 2989.989823] tg3 0000:02:00.0: eth1: Link is down
>>> Feb 28 14:42:06 robot02 kernel: [ 2992.982582] tg3 0000:02:00.0: eth1: Link is up at 1000 Mbps, full duplex
>>> Feb 28 14:42:06 robot02 kernel: [ 2992.982585] tg3 0000:02:00.0: eth1: Flow control is on for TX and on for RX
>>>     
>> Smells like an IRQ conflict on IRQ line 17. What devices are using it? Check
>> /proc/interrupts or lspci.
>>   
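
The /proc/interrupts check suggested above can be scripted. What follows is
a minimal Python sketch, not an RTnet tool. It assumes the classic layout of
that file: one count column per CPU, then a single controller-name token,
then a comma-separated handler list. The function name shared_irqs and the
sample text are invented for illustration, and kernels that print extra
columns would need the parsing adjusted.

```python
def shared_irqs(text):
    """Return {irq_number: [handler, ...]} for IRQ lines with more than one handler.

    Assumes the layout 'IRQ: <one count per CPU> <controller> <handlers>'.
    """
    lines = text.splitlines()
    if not lines:
        return {}
    # The header row lists one column name per CPU (CPU0, CPU1, ...).
    ncpu = len(lines[0].split())
    shared = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        label = label.strip()
        # Skip non-numeric rows such as NMI, LOC or ERR.
        if not sep or not label.isdigit():
            continue
        # ncpu counts, one controller token, then the handler list as remainder.
        fields = rest.split(None, ncpu + 1)
        if len(fields) < ncpu + 2:
            continue
        handlers = [h.strip() for h in fields[ncpu + 1].split(",") if h.strip()]
        if len(handlers) > 1:
            shared[int(label)] = handlers
    return shared


# Invented sample data, only to show the expected input shape.
SAMPLE = (
    "           CPU0       CPU1\n"
    "  0:        126          0   IO-APIC-edge      timer\n"
    " 16:       1200        340   IO-APIC-fasteoi   uhci_hcd:usb3, eth0\n"
    " 17:         55          0   IO-APIC-fasteoi   eth1\n"
)
print(shared_irqs(SAMPLE))
```

On a live system the input would be read with
open("/proc/interrupts").read() instead of the sample string.
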
> The real-time NIC was using IRQ 16, see
> 
> Feb 28 14:37:12 robot02 kernel: [ 2700.099343] rt_8139too 0000:03:00.0: PCI->APIC IRQ transform: INT A -> IRQ 16
> 
> You are right, IRQ 16 was shared with a USB controller.
> 
> As the 2nd (non-rt) 8139 NIC had its own IRQ 17,
> I tried to use that one instead and just swapped the cables. The problem
> occurred there, too.
> 
> Then I swapped the cards in the PCI slots and the situation got worse.
> See the attached list of interrupts (interrupts.txt).

Both 8139 cards are of the same type. What were your steps to ensure that
the right driver handles the right card?
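
One way to answer that question is to read the "driver" symlink that Linux
exposes for each PCI device under /sys/bus/pci/devices. The Python sketch
below is only an illustration: the helper name bound_driver is made up, and
the two PCI addresses are taken from the earlier log, so they may no longer
match after the cards were swapped between slots.

```python
import os


def bound_driver(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the driver bound to a PCI device, or None if unbound.

    sysfs_root is a parameter only so the function can be exercised against
    a test directory; on a real system the default should be used.
    """
    link = os.path.join(sysfs_root, pci_addr, "driver")
    # The 'driver' entry is a symlink into .../drivers/<name> and is absent
    # when no driver has claimed the device.
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    # Addresses as they appeared in the earlier log (assumption: unchanged).
    for addr in ("0000:03:00.0", "0000:03:01.0"):
        print(addr, "->", bound_driver(addr))
```

If both cards report the same driver, or the real-time card reports 8139too
instead of rt_8139too, the two drivers have not claimed the cards as intended.
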

> 
> Now only one packet is received and one transmitted, 
> until dmesg shows PCI bus error messages:
> 
> -----------------------------------------------------------------------
> [ 1041.881542] device rteth0 entered promiscuous mode
> [ 1186.669561] rteth0: Abnormal interrupt, status 00008000.
> [ 1186.669565] rteth0: PCI Bus error 1280.
> [ 1186.669569] rteth0: Abnormal interrupt, status 00008001.
> [ 1186.669573] rteth0: PCI Bus error 0280.
> [ 1188.483690] Clocksource tsc unstable (delta = 907083906 ns)
> [ 1190.777794] rteth0: Abnormal interrupt, status 00008000.
> [ 1191.935881] rteth0: PCI Bus error 1280.
> [ 1191.935905] rteth0: Abnormal interrupt, status 00008000.
> [ 1191.935918] rteth0: PCI Bus error 1280.
> [ 1191.935919] rteth0: Abnormal interrupt, status 00008000.
> [ 1191.935921] rteth0: PCI Bus error 1280.
> [ 1191.935922] rteth0: Abnormal interrupt, status 00008000.
> [ 1191.935923] rteth0: PCI Bus error 0280.
> [ 1191.935971] Switching to clocksource jiffies
> -----------------------------------------------------------------------
> 
> Btw, removing and reconnecting the cable causes:
> ---------------------------------------------------------------------
> ## messages on cable disconnect and connect
> [  637.303635] rteth0: Abnormal interrupt, status 00002020.
> [  637.493578] rteth0: Abnormal interrupt, status 00002020.
> ---------------------------------------------------------------------- 

That's "normal" as link status changes are considered abnormal for an RT
network.
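
To see which conditions hide behind status words such as 00002020 or
00008001, the value can be split into its individual bits. The Python sketch
below uses bit names as they are commonly given for the RTL8139 interrupt
status register in the mainline 8139too driver and the chip's datasheet.
That mapping is an assumption here and should be checked against the
rt_8139too source before relying on it. The function name decode_intr_status
is invented for illustration.

```python
# Assumed RTL8139 interrupt status bits (verify against the driver source).
INTR_BITS = {
    0x8000: "PCIErr",
    0x4000: "PCSTimeout",
    0x2000: "LenChg",       # cable length change
    0x0040: "RxFIFOOver",
    0x0020: "RxUnderrun",   # on this chip also raised on link change
    0x0010: "RxOverflow",
    0x0008: "TxErr",
    0x0004: "TxOK",
    0x0002: "RxErr",
    0x0001: "RxOK",
}


def decode_intr_status(status):
    """Return the names of all set bits, highest bit first.

    Bits without an entry in INTR_BITS are reported as one hex string.
    """
    names = [name for bit, name in sorted(INTR_BITS.items(), reverse=True)
             if status & bit]
    unknown = status & ~sum(INTR_BITS)
    if unknown:
        names.append("unknown:0x%04x" % unknown)
    return names


if __name__ == "__main__":
    # Status words quoted in this thread.
    for word in ("00002020", "00000020", "00008000", "00008001"):
        print(word, decode_intr_status(int(word, 16)))
```

Under these assumptions, the 00002020 seen on cable removal decodes to the
link-change and cable-length bits, while the 00008000 values carry the PCI
error bit that matches the "PCI Bus error" lines.
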

> 
> I also get a message on boot, don't know if it's related:
> ## message on boot:
> [    0.102240] ..MP-BIOS bug: 8254 timer not connected to IO-APIC

And that's a false positive I'm going to silence one day in I-pipe...

Jan

_______________________________________________
RTnet-users mailing list
RTnet-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rtnet-users