Re: [etherlab-users] Redundancy support
Yesterday, I quoth:
> Possibly some of this is related to the weird linked-but-not-linked behaviour when the backup link is connected but unneeded (described in my prior email).

Well, I've fixed that now (it was an incorrect assumption in the EtherCAT driver mods), but it hasn't helped the delay measurements.

I guess the main problem is that the measurement code seems to assume that the receive timestamps are all updated from the same packet. But in a redundant topology, especially with no network breaks, some of the ports don't actually receive packets, or the timestamps within a single slave are updated from different packets traversing it. So a port might be open but its timestamp may not have updated, or may be from a packet sent at a different time from what you're expecting.

And the topology calculations assume that the packet is always entering from port 0, which isn't always true with a redundant setup.

(I can understand why -- I've had a think about it and it's quite a tricky problem, especially given some of the limitations on data available from the slaves.)

___
etherlab-users mailing list
etherlab-users@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-users
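To illustrate why the "always enters at port 0" assumption matters, here is a minimal sketch of the order in which an EtherCAT slave controller hands a frame on. It assumes the usual internal forwarding loop (port 0, processing unit, port 3, port 1, port 2, back to port 0) and that closed ports are simply skipped. The names `LOOP`, `PROC` and `frame_path` are made up for this illustration and are not part of the master code.

```python
# Assumed internal forwarding loop of a slave controller:
# port 0 -> processing unit -> port 3 -> port 1 -> port 2 -> port 0.
PROC = "PROC"              # hypothetical marker for the processing unit
LOOP = [0, PROC, 3, 1, 2]


def frame_path(entry_port, open_ports):
    """List what a frame reaches, in order, after arriving at entry_port.

    Closed ports are skipped (the frame is forwarded internally).
    The result ends with entry_port, where the frame leaves the slave
    again if nothing sent it elsewhere first.
    """
    start = LOOP.index(entry_port)
    rotated = LOOP[start + 1:] + LOOP[:start + 1]
    return [step for step in rotated if step == PROC or step in open_ports]


# Frame arriving on port 0 (main link side): processed first, then out port 1.
print(frame_path(0, {0, 1}))   # ['PROC', 1, 0]

# Frame arriving on port 1 (backup link side): leaves via port 0 first and is
# only processed once it comes back in through port 0.
print(frame_path(1, {0, 1}))   # [0, 'PROC', 1]
```

Under this model the same pair of open ports gives a different traversal depending on the entry port, so any delay or topology calculation that hard-codes port 0 as the entry would attribute timestamps to the wrong hop.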
Re: [etherlab-users] Redundancy support
Yesterday, I quoth:
> 2. There appear to be a few things that only seem to work on the main link, not the backup link (unless I'm missing something). Register requests (maybe only some types?) seem to be one of them, and I'm dubious about the DC sync behaviour as well -- I don't think the RMW broadcast sync to the refclock is really going to work on a link that doesn't contain the refclock. The transmission delay measurements seem incorrect too.

As an example of this, here are some excerpts of ethercat slaves -v in various states with two slaves.

This is a normal chain network -- backup link not connected:

=== Master 0, Slave 0 ===
Device: Main
Distributed clocks: yes, 64 bit
DC system time transmission delay: 0 ns
Port  Type  Link  Loop    Signal  NextSlave  RxTime [ns]   Diff [ns]  NextDc [ns]
   0  MII   up    open    yes             -    114873362           0            0
   1  MII   up    open    yes             1    114874642        1280          640

=== Master 0, Slave 1 ===
Device: Main
Distributed clocks: yes, 64 bit
DC system time transmission delay: 640 ns
Port  Type  Link  Loop    Signal  NextSlave  RxTime [ns]   Diff [ns]  NextDc [ns]
   0  MII   up    open    yes             0    644914350           0          640
   1  MII   down  closed  no              -            -           -            -

This is exactly the same network with the backup link connected but not needed (no break in the main network):

=== Master 0, Slave 0 ===
Device: Main
Distributed clocks: yes, 64 bit
DC system time transmission delay: 0 ns
Port  Type  Link  Loop    Signal  NextSlave  RxTime [ns]   Diff [ns]  NextDc [ns]
   0  MII   up    open    yes             -   3165080514           0            0
   1  MII   up    open    yes             1    114874642  1244761424    622380712

=== Master 0, Slave 1 ===
Device: Main
Distributed clocks: yes, 64 bit
DC system time transmission delay: 622380712 ns
Port  Type  Link  Loop    Signal  NextSlave  RxTime [ns]   Diff [ns]  NextDc [ns]
   0  MII   up    open    yes             0   3692590109           0    622380712
   1  MII   up    open    yes             -   1829843904  2432221091            0

And here it is with a break between the two slaves:

=== Master 0, Slave 0 ===
Device: Main
Distributed clocks: yes, 64 bit
DC system time transmission delay: 0 ns
Port  Type  Link  Loop    Signal  NextSlave  RxTime [ns]   Diff [ns]  NextDc [ns]
   0  MII   up    open    yes             -   4023995730           0            0
   1  MII   down  closed  no              -            -           -            -

=== Master 0, Slave 1 ===
Device: Backup
Distributed clocks: yes, 64 bit
DC system time transmission delay: 0 ns
Port  Type  Link  Loop    Signal  NextSlave  RxTime [ns]   Diff [ns]  NextDc [ns]
   0  MII   down  closed  no              -            -           -            -
   1  MII   up    open    yes             -    519954610  1122331797            0

Possibly some of this is related to the weird linked-but-not-linked behaviour when the backup link is connected but unneeded (described in my prior email).
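The Diff and NextDc figures in the listings above can be reproduced with plain arithmetic, which makes it easier to see where the implausible 622380712 ns comes from. The sketch below assumes the port receive times are 32-bit nanosecond counters that wrap around, that Diff is the wrapped difference to the port 0 receive time, and that NextDc is half of that difference when the downstream slave contributes no round trip of its own. The function names are invented for this example. This is a reading of the numbers, not a copy of the master's code.

```python
WRAP = 2 ** 32  # assumption: port receive times are 32-bit ns counters


def port_diff(rx_entry_ns, rx_other_ns):
    """Wrapped difference between two port receive times of one slave."""
    return (rx_other_ns - rx_entry_ns) % WRAP


def delay_to_next(round_trip_ns, downstream_round_trip_ns=0):
    """One-way delay estimate: half of the round trip not spent downstream."""
    return (round_trip_ns - downstream_round_trip_ns) // 2


# Slave 0, plain chain (backup link not connected).
chain = port_diff(114873362, 114874642)
print(chain, delay_to_next(chain))          # 1280 640

# Slave 0, backup link connected but not needed.
redundant = port_diff(3165080514, 114874642)
print(redundant, delay_to_next(redundant))  # 1244761424 622380712
```

Note that the port 1 receive time of slave 0 is 114874642 in both listings, while the port 0 time has moved on. That would fit the explanation that port 1 no longer sees a returning frame in the unbroken redundant case, so its timestamp is left over from an earlier measurement and the subtraction mixes two unrelated packets.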
Re: [etherlab-users] Redundancy support
On 27 February 2015 22:06, quoth Richard Hacker:
>> I have a question regarding support for cable redundancy in the stable-1.5 branch. I know that it has options for enabling a backup network port on the PC and connecting the end of a single chain to this port. Presumably this is mostly transparent to the application code (although it can query for status)?
>>
>> Does it also support redundant tree links similarly?
>
> In principle it should work, although I have not tested it. The trick with redundancy is that the number of visible slaves and the order of packet traversal must not change when a single link is destroyed.
>
> You are quite correct in the assumption that redundancy is transparent to the application. The status is only required to report a redundant state or not, otherwise redundancy would be useless to the user. The state is not required by the application to select another source/destination of data.

Yes, that's all I was thinking of, to display some sort of warning to the user that their network might have issues.

On a related note though, I've been testing basic redundancy (a single loop without internal subloops) recently and I've noticed some things that seem odd to me:

1. On a two slave network with the break between the two (so one slave on each master link), the log messages identify both slaves as 0-0, making it hard to see what's going on. I've already written a patch to improve this, which I'll include in the patch bundle that I've been threatening to send to the dev list for a few months now. ;)

2. There appear to be a few things that only seem to work on the main link, not the backup link (unless I'm missing something). Register requests (maybe only some types?) seem to be one of them, and I'm dubious about the DC sync behaviour as well -- I don't think the RMW broadcast sync to the refclock is really going to work on a link that doesn't contain the refclock. The transmission delay measurements seem incorrect too.

3. Whenever the etherlab master service is started (with the network initially in good state), the first time that the network breaks and redundancy is activated takes about 2 seconds to resolve (which seems to be a standard network link-up delay). If the break is then fixed, future breaks in the same spot resolve almost instantly. (I haven't yet tested with a large enough network to check breaks in different places.)

The below is an example of the syslog output when the slave0 - slave1 link is broken and the slave1 - backup link needs to pick up the slack.

[ 1368.157824] e1000e: ecb0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
[ 1368.157829] ec_e1000e :01:00.1: (unregistered net_device): 10/100 speed: disabling TSO
[ 1368.157831] EtherCAT 0: Link state of ecb0 changed to UP.
[ 1368.157960] EtherCAT WARNING 0: Domain 0: Redundant link in use!

On both master and slave the LINK/ACT lights are lit on the redundant ports both before and after this event (it's a two-port adapter, in case that makes a difference), so I'm not sure why the driver is announcing a link-up at this time instead of earlier.

In case it helps, this is the initial output when the master is loaded:

[ 3620.561200] EtherCAT: 1 master waiting for devices.
[ 3635.431476] ec_e1000e: EtherCAT-capable Intel(R) PRO/1000 Network Driver - 1.5.1-k-EtherCAT
[ 3635.431479] ec_e1000e: Copyright(c) 1999 - 2011 Intel Corporation.
[ 3635.431501] ec_e1000e :01:00.0: Disabling ASPM L1
[ 3635.431520] ec_e1000e :01:00.0: setting latency timer to 64
[ 3635.431606] ec_e1000e :01:00.0: irq 41 for MSI/MSI-X
[ 3635.604415] EtherCAT: Accepting 68:05:CA:0A:99:18 as main device for master 0.
[ 3635.748669] ec_e1000e :01:00.0: irq 41 for MSI/MSI-X
[ 3635.804370] ec_e1000e :01:00.0: (unregistered net_device): MSI interrupt test failed, using legacy interrupt.
[ 3635.804398] ec_e1000e :01:00.0: (unregistered net_device): (PCI Express:2.5GT/s:Width x4) 68:05:ca:0a:99:18
[ 3635.804401] ec_e1000e :01:00.0: (unregistered net_device): Intel(R) PRO/1000 Network Connection
[ 3635.804476] ec_e1000e :01:00.0: (unregistered net_device): MAC: 0, PHY: 4, PBA No: D50868-008
[ 3635.804487] ec_e1000e :01:00.1: Disabling ASPM L1
[ 3635.804500] ec_e1000e :01:00.1: setting latency timer to 64
[ 3635.804581] ec_e1000e :01:00.1: irq 41 for MSI/MSI-X
[ 3635.980331] EtherCAT: Accepting 68:05:CA:0A:99:19 as backup device for master 0.
[ 3636.124622] ec_e1000e :01:00.1: irq 41 for MSI/MSI-X
[ 3636.180287] ec_e1000e :01:00.1: (unregistered net_device): MSI interrupt test failed, using legacy interrupt.
[ 3636.180315] EtherCAT DEBUG 0: ORPHANED - IDLE.
[ 3636.180316] EtherCAT 0: Starting EtherCAT-IDLE thread.
[ 3636.180363] ec_e1000e :01:00.1: (unregistered net_device): (PCI Express:2.5GT/s:Width x4) 68:05:ca:0a:99:19
[ 3636.180366] EtherCAT DEBUG 0: Idle thread running with send interval = 4000 us, max data size=45000
[
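For the "warn the user" use case mentioned in this message, one low-tech option is to watch the kernel log for the redundancy warning shown in the excerpt. The sketch below is only an illustration. It assumes the message text stays exactly as in the excerpt ("Domain N: Redundant link in use!"), and the names `REDUNDANCY_RE` and `redundancy_warnings` are invented for this example.

```python
import re

# Pattern built from the warning line in the syslog excerpt; the exact
# wording is an assumption and may differ between master versions.
REDUNDANCY_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+EtherCAT WARNING (?P<master>\d+): "
    r"Domain (?P<domain>\d+): Redundant link in use!"
)


def redundancy_warnings(lines):
    """Return (timestamp, master index, domain index) for each warning line."""
    events = []
    for line in lines:
        match = REDUNDANCY_RE.search(line)
        if match:
            events.append((
                float(match.group("time")),
                int(match.group("master")),
                int(match.group("domain")),
            ))
    return events


sample = [
    "[ 1368.157831] EtherCAT 0: Link state of ecb0 changed to UP.",
    "[ 1368.157960] EtherCAT WARNING 0: Domain 0: Redundant link in use!",
]
print(redundancy_warnings(sample))
```

Log scraping is fragile compared with asking the master directly, so this is better suited to diagnostics (for example, timing the roughly 2 second first switchover) than to application logic.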
Re: [etherlab-users] Redundancy support
In principle it should work, although I have not tested it. The trick with redundancy is that the number of visible slaves and the order of packet traversal must not change when a single link is destroyed.

You are quite correct in the assumption that redundancy is transparent to the application. The status is only required to report a redundant state or not, otherwise redundancy would be useless to the user. The state is not required by the application to select another source/destination of data.

- Richard

PS. I just love your ascii art ;) Monospace font, yeah!

On 27.02.2015 05:28, Gavin Lambert wrote:
> Hi,
>
> I have a question regarding support for cable redundancy in the stable-1.5 branch. I know that it has options for enabling a backup network port on the PC and connecting the end of a single chain to this port. Presumably this is mostly transparent to the application code (although it can query for status)?
>
> Does it also support redundant tree links similarly? Eg.
>
> PC --- SL1 --- SL2 --- SW --- SL3 --- SL4 -+
> |                     |  |                 |
> |            +- SL5 --+  +-- SL8 -+        |
> |            |                    |        |
> |            +- SL6 --- SL7 ------+        |
> +------------------------------------------+
>
> I know I've seen a diagram like this with an internal ring on a tree branch somewhere, although I'm having trouble locating the reference now. Maybe this is a feature of the switch slave rather than of the master?
>
> Regards,
> Gavin Lambert

Kind regards
Richard Hacker

--
Richard Hacker M.Sc.
richard.hac...@igh-essen.com
Tel.: +49 201 / 36014-16
Ingenieurgemeinschaft IgH
Gesellschaft für Ingenieurleistungen mbH
Heinz-Bäcker-Str. 34
D-45356 Essen
Amtsgericht Essen HRB 11500
VAT ID No.: DE 174 626 722
Management: - Dr.-Ing. T. Finke, - Dr.-Ing. W. Hagemeister
Tel.: +49 201 / 360-14-0
http://www.igh-essen.com