Re: [etherlab-users] Redundancy support

2015-03-05 Thread Gavin Lambert
Yesterday, I quoth:
 Possibly some of this is related to the weird linked-but-not-linked
 behaviour when the backup link is connected but unneeded (described in my
 prior email).

Well, I've fixed that now (it was an incorrect assumption in the EtherCAT
driver mods), but it hasn't helped the delay measurements.


I guess the main problem is that the measurement code seems to assume that
the receive timestamps are all updated from the same packet.  But in a
redundant topology, especially with no network breaks, some of the ports
don't actually receive packets, or the timestamps within a single slave are
updated by different packets passing through.  So a port might be open, but
its timestamp may not have been updated, or may come from a packet sent at a
different time from the one you're expecting.

And the topology calculations assume that the packet is always entering from
port 0, which isn't always true with a redundant setup.  (I can understand
why -- I've had a think about it and it's quite a tricky problem, especially
given some of the limitations on data available from the slaves.)
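
To illustrate why the entry port matters: an EtherCAT slave controller
forwards frames between its ports in the fixed order 0 -> 3 -> 1 -> 2 -> 0,
skipping closed ports.  The following is only a minimal Python sketch of that
rule (the helper name is made up for illustration; this is not code from the
master):

```python
# Fixed forwarding order inside an EtherCAT slave controller.
FORWARD_ORDER = {0: 3, 3: 1, 1: 2, 2: 0}


def exit_port(entry_port, open_ports):
    """Return the port through which a frame received on entry_port leaves.

    open_ports is the set of ports whose loop is open (link up).
    Closed ports are skipped, following FORWARD_ORDER.
    """
    if entry_port not in open_ports:
        raise ValueError("a frame cannot be received on a closed port")
    port = FORWARD_ORDER[entry_port]
    while port not in open_ports:
        port = FORWARD_ORDER[port]
    return port


# Slave in the middle of a chain: ports 0 and 1 are open.
print(exit_port(0, {0, 1}))  # frame from the main side leaves on port 1
print(exit_port(1, {0, 1}))  # frame from the backup side leaves on port 0
print(exit_port(0, {0}))     # last slave of a plain chain: frame turns around
```

If that is right, then in an unbroken ring a frame from the main device is
received on port 0 while a frame from the backup device is received on
port 1, so the two ports of one slave would hold receive times from
different frames.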


___
etherlab-users mailing list
etherlab-users@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-users


Re: [etherlab-users] Redundancy support

2015-03-04 Thread Gavin Lambert
Yesterday, I quoth:
 2. There appear to be a few things that only seem to work on the main link,
 not the backup link (unless I'm missing something).  Register requests
 (maybe only some types?) seem to be one of them, and I'm dubious about the
 DC sync behaviour as well -- I don't think the RMW broadcast sync to the
 refclock is really going to work on a link that doesn't contain the
 refclock.  The transmission delay measurements seem incorrect too.

As an example of this, here are some excerpts of ethercat slaves -v in
various states with two slaves.

This is a normal chain network -- backup link not connected:

=== Master 0, Slave 0 ===
Device: Main
  Distributed clocks: yes, 64 bit
  DC system time transmission delay: 0 ns
Port  Type  Link  Loop    Signal  NextSlave  RxTime [ns]   Diff [ns]  NextDc [ns]
   0  MII   up    open    yes             -    114873362           0            0
   1  MII   up    open    yes             1    114874642        1280          640
=== Master 0, Slave 1 ===
Device: Main
  Distributed clocks: yes, 64 bit
  DC system time transmission delay: 640 ns
Port  Type  Link  Loop    Signal  NextSlave  RxTime [ns]   Diff [ns]  NextDc [ns]
   0  MII   up    open    yes             0    644914350           0          640
   1  MII   down  closed  no              -            -           -            -

This is exactly the same network with the backup link connected but not
needed (no break in the main network):

=== Master 0, Slave 0 ===
Device: Main
  Distributed clocks: yes, 64 bit
  DC system time transmission delay: 0 ns
Port  Type  Link  Loop    Signal  NextSlave  RxTime [ns]   Diff [ns]  NextDc [ns]
   0  MII   up    open    yes             -   3165080514           0            0
   1  MII   up    open    yes             1    114874642  1244761424    622380712
=== Master 0, Slave 1 ===
Device: Main
  Distributed clocks: yes, 64 bit
  DC system time transmission delay: 622380712 ns
Port  Type  Link  Loop    Signal  NextSlave  RxTime [ns]   Diff [ns]  NextDc [ns]
   0  MII   up    open    yes             0   3692590109           0    622380712
   1  MII   up    open    yes             -   1829843904  2432221091            0

And here it is with a break between the two slaves:

=== Master 0, Slave 0 ===
Device: Main
  Distributed clocks: yes, 64 bit
  DC system time transmission delay: 0 ns
Port  Type  Link  Loop    Signal  NextSlave  RxTime [ns]   Diff [ns]  NextDc [ns]
   0  MII   up    open    yes             -   4023995730           0            0
   1  MII   down  closed  no              -            -           -            -
=== Master 0, Slave 1 ===
Device: Backup
  Distributed clocks: yes, 64 bit
  DC system time transmission delay: 0 ns
Port  Type  Link  Loop    Signal  NextSlave  RxTime [ns]   Diff [ns]  NextDc [ns]
   0  MII   down  closed  no              -            -           -            -
   1  MII   up    open    yes             -    519954610  1122331797            0

Possibly some of this is related to the weird linked-but-not-linked
behaviour when the backup link is connected but unneeded (described in my
prior email).
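
For what it's worth, the Diff and NextDc figures in the listings above can be
reproduced with simple arithmetic.  The following is only a sketch in Python,
assuming that Diff is a port's RxTime minus the port 0 RxTime of the same
slave, taken modulo 2^32, and that in these two-slave listings NextDc is half
of Diff.  The function names are made up for illustration and this is not
the master's code.

```python
MASK32 = 0xFFFFFFFF  # receive times are treated as 32-bit counters (assumption)


def rx_diff(rx_port, rx_port0):
    """Difference between two receive times, wrapping modulo 2**32."""
    return (rx_port - rx_port0) & MASK32


def one_way_delay(diff):
    """Half of the round-trip difference, as in the NextDc column."""
    return diff // 2


# Plain chain, slave 0: port 1 RxTime vs port 0 RxTime.
chain = rx_diff(114874642, 114873362)
print(chain, one_way_delay(chain))

# Backup link connected but unneeded, slave 0: same port 1 RxTime as before,
# but a completely different port 0 RxTime.
ring = rx_diff(114874642, 3165080514)
print(ring, one_way_delay(ring))

# Same state, slave 1: port 1 RxTime vs port 0 RxTime.
print(rx_diff(1829843904, 3692590109))
```

Under those assumptions the sketch gives 1280 and 640 for the chain, and
1244761424 and 622380712 for the ring, matching the listings.  Note that
slave 0 port 1 shows the same RxTime (114874642) in the first two listings,
which would fit a timestamp that simply was not updated.  Likewise, the Diff
of 1122331797 in the last listing equals 519954610 minus 3692590109 modulo
2^32, i.e. it looks like it was computed against the port 0 RxTime from the
previous listing (assuming the listings were captured in that order).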




Re: [etherlab-users] Redundancy support

2015-03-03 Thread Gavin Lambert
On 27 February 2015 22:06, quoth Richard Hacker:
  I have a question regarding support for cable redundancy in the
  stable-1.5 branch.
 
  I know that it has options for enabling a backup network port on the
  PC and connecting the end of a single chain to this port.  Presumably
  this is mostly transparent to the application code (although it can
  query for status)?
 
  Does it also support redundant tree links similarly?
 
 In principle it should work, although I have not tested it. The trick with
 redundancy is that the number of visible slaves and the order of packet
 traversal must not change when a single link is destroyed.
 
 You are quite correct in the assumption that redundancy is transparent to
 the application. The status is only needed to report whether or not the
 redundant link is in use; otherwise redundancy would be useless to the
 user. The state is not required by the application to select another
 source/destination of data.

Yes, that's all I was thinking of, to display some sort of warning to the
user that their network might have issues.


On a related note though, I've been testing basic redundancy (a single loop
without internal subloops) recently and I've noticed some things that seem
odd to me:

1. On a two slave network with the break between the two (so one slave on
each master link), the log messages identify both slaves as 0-0, making it
hard to see what's going on.  I've already written a patch to improve this,
which I'll include in the patch bundle that I've been threatening to send to
the dev list for a few months now. ;)

2. There appear to be a few things that only seem to work on the main link,
not the backup link (unless I'm missing something).  Register requests
(maybe only some types?) seem to be one of them, and I'm dubious about the
DC sync behaviour as well -- I don't think the RMW broadcast sync to the
refclock is really going to work on a link that doesn't contain the
refclock.  The transmission delay measurements seem incorrect too.

3. Whenever the etherlab master service is started (with the network
initially in a good state), the first time the network breaks and redundancy
is activated, it takes about 2 seconds to resolve (which seems to be a
standard network link-up delay).  If the break is then fixed, future breaks
in the same spot resolve almost instantly.  (I haven't yet tested with a
large enough network to check breaks in different places.)

Below is an example of the syslog output when the slave0 - slave1 link is
broken and the slave1 - backup link needs to pick up the slack.

 [ 1368.157824] e1000e: ecb0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
 [ 1368.157829] ec_e1000e :01:00.1: (unregistered net_device): 10/100 speed: disabling TSO
 [ 1368.157831] EtherCAT 0: Link state of ecb0 changed to UP.
 [ 1368.157960] EtherCAT WARNING 0: Domain 0: Redundant link in use!

On both master and slave the LINK/ACT lights are lit on the redundant ports
both before and after this event (it's a two-port adapter, in case that
makes a difference), so I'm not sure why the driver is announcing a link-up
at this time instead of earlier.  In case it helps, this is the initial
output when the master is loaded:

 [ 3620.561200] EtherCAT: 1 master waiting for devices.
 [ 3635.431476] ec_e1000e: EtherCAT-capable Intel(R) PRO/1000 Network Driver - 1.5.1-k-EtherCAT
 [ 3635.431479] ec_e1000e: Copyright(c) 1999 - 2011 Intel Corporation.
 [ 3635.431501] ec_e1000e :01:00.0: Disabling ASPM  L1
 [ 3635.431520] ec_e1000e :01:00.0: setting latency timer to 64
 [ 3635.431606] ec_e1000e :01:00.0: irq 41 for MSI/MSI-X
 [ 3635.604415] EtherCAT: Accepting 68:05:CA:0A:99:18 as main device for master 0.
 [ 3635.748669] ec_e1000e :01:00.0: irq 41 for MSI/MSI-X
 [ 3635.804370] ec_e1000e :01:00.0: (unregistered net_device): MSI interrupt test failed, using legacy interrupt.
 [ 3635.804398] ec_e1000e :01:00.0: (unregistered net_device): (PCI Express:2.5GT/s:Width x4) 68:05:ca:0a:99:18
 [ 3635.804401] ec_e1000e :01:00.0: (unregistered net_device): Intel(R) PRO/1000 Network Connection
 [ 3635.804476] ec_e1000e :01:00.0: (unregistered net_device): MAC: 0, PHY: 4, PBA No: D50868-008
 [ 3635.804487] ec_e1000e :01:00.1: Disabling ASPM  L1
 [ 3635.804500] ec_e1000e :01:00.1: setting latency timer to 64
 [ 3635.804581] ec_e1000e :01:00.1: irq 41 for MSI/MSI-X
 [ 3635.980331] EtherCAT: Accepting 68:05:CA:0A:99:19 as backup device for master 0.
 [ 3636.124622] ec_e1000e :01:00.1: irq 41 for MSI/MSI-X
 [ 3636.180287] ec_e1000e :01:00.1: (unregistered net_device): MSI interrupt test failed, using legacy interrupt.
 [ 3636.180315] EtherCAT DEBUG 0: ORPHANED - IDLE.
 [ 3636.180316] EtherCAT 0: Starting EtherCAT-IDLE thread.
 [ 3636.180363] ec_e1000e :01:00.1: (unregistered net_device): (PCI Express:2.5GT/s:Width x4) 68:05:ca:0a:99:19
 [ 3636.180366] EtherCAT DEBUG 0: Idle thread running with send interval = 4000 us, max data size=45000
 [ 

Re: [etherlab-users] Redundancy support

2015-02-27 Thread Richard Hacker
In principle it should work, although I have not tested it. The trick
with redundancy is that the number of visible slaves and the order of
packet traversal must not change when a single link is destroyed.


You are quite correct in the assumption that redundancy is transparent
to the application. The status is only needed to report whether or not
the redundant link is in use; otherwise redundancy would be useless to
the user. The state is not required by the application to select another
source/destination of data.


- Richard

PS. I just love your ascii art ;) Monospace font, yeah!

On 27.02.2015 05:28, Gavin Lambert wrote:

Hi,

I have a question regarding support for cable redundancy in the stable-1.5
branch.

I know that it has options for enabling a backup network port on the PC
and connecting the end of a single chain to this port.  Presumably this is
mostly transparent to the application code (although it can query for
status)?

Does it also support redundant tree links similarly?

Eg. PC  SL1 --- SL2 --- SW --- SL3 --- SL4 -+
  |  | | |
  | +- SL5 --+ +-- SL8 -+|
  | |   ||
  | +- SL6 --- SL7 -+|
  +--+

I know I've seen a diagram like this with an internal ring on a tree branch
somewhere, although I'm having trouble locating the reference now.  Maybe
this is a feature of the switch slave rather than of the master?

Regards,
Gavin Lambert




Kind regards

Richard Hacker

--


Richard Hacker M.Sc.
richard.hac...@igh-essen.com
Tel.: +49 201 / 36014-16

Ingenieurgemeinschaft IgH
Gesellschaft für Ingenieurleistungen mbH
Heinz-Bäcker-Str. 34
D-45356 Essen

Essen District Court (Amtsgericht Essen), HRB 11500
VAT ID No.: DE 174 626 722
Managing directors:
- Dr.-Ing. T. Finke,
- Dr.-Ing. W. Hagemeister
Tel.: +49 201 / 360-14-0
http://www.igh-essen.com

