Re: [etherlab-users] Error reassigning removed PDO

2014-06-04 Thread Gavin Lambert
Hi Jun,

While that patch looks like an improvement, it will still have the same trouble
if the master service is restarted between runs, or if an application wants to
include a PDO that is not assigned by default.

I think having ecrt_slave_config_pdos (or actually
ec_slave_config_load_default_mapping) upload the mapping from the slave is
actually the better solution and not ugly at all, in theory.  Bear in mind that
this should only happen if the slave is online and if the mapping was not
already found in the application-supplied mappings or the previously-read
cache.  (Though note that the current code structure would do it regardless of
whether the application supplied mappings or not, as an unfortunate consequence
of the API structure.  But it will meet the other two conditions.)

And it would only be one SDO upload if the slave supports Complete Access,
which the master should already know at that point.  (Although that’s an
optimisation missing from the current PDO configuration code as well.)
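
The lookup order described above could be sketched roughly as follows. This is
only an illustration in plain C and not code from the master: the type and
function names (pdo_state_t, choose_mapping_source, uploads_needed) are
hypothetical, and the "1 + n" upload count without Complete Access assumes one
upload for subindex 0 plus one per mapping entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Where the mapping of one PDO could come from, in order of preference. */
typedef enum {
    MAP_FROM_APPLICATION,  /* supplied by the application's configuration */
    MAP_FROM_CACHE,        /* read earlier, e.g. during the bus scan */
    MAP_FROM_SLAVE_UPLOAD, /* fetched now from the slave via CoE */
    MAP_NOT_AVAILABLE      /* nothing known and the slave cannot be asked */
} map_source_t;

/* Hypothetical summary of what is known about one PDO at config time. */
typedef struct {
    bool app_supplied;
    bool cached;
    bool slave_online;
    bool slave_has_coe;
} pdo_state_t;

/* Only fall back to an upload if neither the application nor the cache
 * already provides the mapping, and only if the slave is reachable. */
map_source_t choose_mapping_source(const pdo_state_t *s)
{
    assert(s != NULL);
    if (s->app_supplied)
        return MAP_FROM_APPLICATION;
    if (s->cached)
        return MAP_FROM_CACHE;
    if (s->slave_online && s->slave_has_coe)
        return MAP_FROM_SLAVE_UPLOAD;
    return MAP_NOT_AVAILABLE;
}

/* SDO uploads needed to read one mapping object with n_entries entries:
 * a single transfer with Complete Access, otherwise (assumed) one for
 * subindex 0 plus one per entry. */
unsigned int uploads_needed(bool complete_access, unsigned int n_entries)
{
    return complete_access ? 1u : 1u + n_entries;
}
```

With this ordering an offline slave simply yields MAP_NOT_AVAILABLE, so an
application that supplies full mappings can still start without any slaves
attached.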

 

Regards,
Gavin Lambert

Re: [etherlab-users] Error reassigning removed PDO

2014-05-30 Thread Jun Yuan
Hi Gavin,

I have a gift for you. The attached patch should make your scenario, with
different apps interested in different PDOs, work. The problem was that
the master always takes the last PDO assignment in the SyncManager as the
default PDO assignment, and it doesn't remember any older PDO assignment. I
made a patch that gives the master a memory for the PDO mappings: it always
merges the new PDO mapping list into the old list instead of throwing the
old list away. It remembers things.
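
To illustrate the idea of merging instead of replacing, here is a minimal
sketch in plain C. It is not the attached patch: the cache type and function
names are hypothetical, and for brevity it remembers only PDO indices rather
than full mappings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_MAX 16

/* Hypothetical per-slave memory of every PDO index seen so far. */
typedef struct {
    uint16_t index[CACHE_MAX];
    size_t count;
} pdo_cache_t;

/* Returns 1 if the PDO index is already remembered, 0 otherwise. */
int cache_contains(const pdo_cache_t *c, uint16_t index)
{
    assert(c != NULL);
    for (size_t i = 0; i < c->count; i++) {
        if (c->index[i] == index)
            return 1;
    }
    return 0;
}

/* Merge a newly seen PDO list into the cache without dropping old entries.
 * Returns the number of entries added, or -1 if the cache is full. */
int cache_merge(pdo_cache_t *c, const uint16_t *list, size_t n)
{
    int added = 0;
    assert(c != NULL && (list != NULL || n == 0));
    for (size_t i = 0; i < n; i++) {
        if (cache_contains(c, list[i]))
            continue;            /* already known, keep the old entry */
        if (c->count == CACHE_MAX)
            return -1;           /* no room left */
        c->index[c->count++] = list[i];
        added++;
    }
    return added;
}
```

If the first app assigns only 0x1601, merging that list leaves 0x1600 in the
cache, so a second app that asks for 0x1600 still finds it.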

It is still not as smart as you suggested: it does not automatically fetch
the PDOs by their index via CoE. I don't know if it's a good idea for the
master to do that blindly. The question is when the master should fetch them.
If the master fetches all the PDO mappings during the bus scan, isn't that
a waste of time, because most of the time we don't need all of them? If it
fetches them when needed, the master needs to make several
ecrt_master_sdo_upload() calls in the function ecrt_slave_config_pdos() to
fetch the mapping, which makes the code quite ugly. And actually the app can
do it itself, and then provide the correct default PDO mapping to the master.

Hope you enjoy it!

Regards,
Jun



On 22 April 2014 09:33, Gavin Lambert gav...@compacsort.com wrote:

 Hi all,

 TLDR: when reassigning PDOs, why doesn't the master read mappings from the
 slave via CoE?

 I have a (custom) slave that provides a number of different PDOs.  I have a
 couple of different master applications which are interested in different
 subsets of these PDOs.  As an example, let's say that the slave has an RxPDO
 at 0x1600 that points to 0x7000:0x00:0x20, and one app wants to use this
 value and the other doesn't.

 If the master apps just use ecrt_domain_reg_pdo_entry_list to register the
 PDOs of interest, then they both work (assuming that the slave has all the
 required PDOs assigned by default), but it wastes space in the packet as the
 whole SM is transferred even if some of the data is not of interest to that
 particular master app.  (And in the case of outputs, it forces the master
 app to write something even when it doesn't want to, lest the slave get
 uninitialized data and think it needs to do something with it.)

 If the master apps use ecrt_slave_config_pdos to select the PDOs of
 interest, then things get troublesome.  If the master apps specify the full
 mappings explicitly, then again things work, but as the slave does not
 support remapping (just reassignment) this generates warnings, and it just
 seems ugly to me to have to specify all this data that the slave already
 knows.  (And it makes things more brittle, as if the mapping is changed in a
 future version of the slave it will generate an error instead of just
 working, as it would if it had loaded the slave's current mappings.)

 If the master apps don't specify the full mappings, however (just the sync
 manager - PDO assignments, which seems like it's a supported scenario given
 the docs and examples), then results are mixed.  If the slave is rebooted
 prior to running either master app, it works.  If not, then the master app
 that wants the extra PDO will fail to run.

 The problem case seems to be:
   - slave boots, has all PDOs in SII and CoE PDO assign.
   - first app runs, specifies PDO Assign to not include 0x1600.
     - runs successfully.
     - PDO Assign is updated in the actual slave.
   - second app runs, specifies PDO Assign to include 0x1600.
     - fails at ecrt_domain_reg_pdo_entry_list as it cannot find a mapping
       for 0x7000:0x00.
     - problem is "Loading default mapping for PDO 0x1600." - "No default
       mapping found."
     - PDO Assign of the actual slave is never actually updated in this case
       as it fails before it activates the slave configs.
   - ethercat rescan / ethercat pdos at this point does not show 0x1600.
   - it requires rebooting the slave, or manually updating PDO Assign (and
     rescanning) before the master will admit that it exists again.

 Shouldn't this scenario work?  The PDO is always specified in the SII, even
 if not presently in PDO Assign, so the master ought to know that it exists.
 And failing that, it could just try to read the mappings directly from the
 slave (if CoE is available) when unable to load default mapping from its
 cache.  (I think part of the problem is that the CoE data appears to be
 replacing the SII data in the master's PDO cache.)
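
 As a rough illustration of what reading a mapping from the slave yields:
 each entry of a CoE mapping object such as 0x1600 is a 32-bit value holding
 the object index in the upper 16 bits, then the subindex, then the bit
 length in the lowest 8 bits. The sketch below only decodes such values once
 they have been uploaded and converted to host byte order; the type and
 function names are hypothetical and nothing here talks to a real slave.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One decoded PDO mapping entry. */
typedef struct {
    uint16_t index;      /* object index, e.g. 0x7000 */
    uint8_t subindex;    /* object subindex */
    uint8_t bit_length;  /* size of the mapped value in bits */
} pdo_entry_t;

/* Split one raw 32-bit mapping value into its three fields. */
pdo_entry_t decode_mapping_entry(uint32_t raw)
{
    pdo_entry_t e;
    e.index = (uint16_t)(raw >> 16);
    e.subindex = (uint8_t)((raw >> 8) & 0xFFu);
    e.bit_length = (uint8_t)(raw & 0xFFu);
    return e;
}

/* Inverse of decode_mapping_entry(). */
uint32_t encode_mapping_entry(pdo_entry_t e)
{
    return ((uint32_t)e.index << 16) | ((uint32_t)e.subindex << 8)
        | (uint32_t)e.bit_length;
}

/* Decode n raw values (subindices 1..n of the mapping object) into out. */
void decode_mapping(const uint32_t *raw, size_t n, pdo_entry_t *out)
{
    assert((raw != NULL && out != NULL) || n == 0);
    for (size_t i = 0; i < n; i++)
        out[i] = decode_mapping_entry(raw[i]);
}
```

 For the example PDO, the raw value 0x70000020 decodes to index 0x7000,
 subindex 0x00 and a length of 0x20 (32) bits.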

 I'm also a little puzzled as to why (if it wants to have a cache of PDO
 mappings) it seems to limit itself to reading only the currently assigned
 PDOs during the initial scan, instead of fetching all of them.  They
 shouldn't be hard to find -- they can be identified purely by their index.

 It shouldn't be all that uncommon to have a slave that provides PDOs that
 aren't in the default PDO Assign, or to provide more information than needed
 for particular master apps.  Is it just expected that master apps always
 hard-code the full mappings, instead of fetching the mappings from the
 slave?  Or is this something missing from the 

Re: [etherlab-users] Error reassigning removed PDO

2014-05-29 Thread Jun Yuan
Hello Gavin,

for that specific part of the CoE transfer problem you mentioned, I may
have observed the same problem, and I did some analysis on it. This is
actually a big problem, makes the master quite unreliable for me. I have a
temporary fix for it. But I don't know who should be responsible for this
CoE mailbox bug. Is it the master? Is it the slave? or is it a design error
in the EtherCAT standard for the mailbox? I'll write another email to
elaborate the problem with the flaky CoE mailbox.

Regards,
Jun


On 29 May 2014 09:37, Gavin Lambert gav...@compacsort.com wrote:

 Last month, I wrote:
  TLDR: when reassigning PDOs, why doesn't the master read mappings from
  the slave via CoE?
 [...]
  Shouldn't this scenario work?  The PDO is always specified in the SII,
  even if not presently in PDO Assign, so the master ought to know that it
  exists.
  And failing that, it could just try to read the mappings directly from
  the slave (if CoE is available) when unable to load default mapping from
  its cache.  (I think part of the problem is that the CoE data appears to
  be replacing the SII data in the master's PDO cache.)
 
  I'm also a little puzzled as to why (if it wants to have a cache of PDO
  mappings) it seems to limit itself to reading only the currently
  assigned PDOs during the initial scan, instead of fetching all of them.
  They shouldn't be hard to find -- they can be identified purely by their
  index.

 There's a further problem with this that I've since discovered: if, during
 the master's scan of the PDO assignment registers, something goes wrong with
 the CoE transfer of 0x1C1x:0, then the master will log an error but proceed
 anyway under the assumption that the slave has 0 PDOs assigned in that SM.
 If this is not contradicted by the application using ecrt_slave_config_pdos
 (including both assigns and mappings, because it read no default mappings),
 then the master will *write 0 back* to the PDO assignment register (if
 writable) on activate.
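
 One conceivable guard against this, sketched in plain C, is to keep "the
 read failed" distinct from "the slave really has 0 PDOs assigned" and to
 refuse the write-back in the former case. This is an assumption about how
 it could be handled, not a description of the master's code, and all names
 are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Whether the scan actually obtained the PDO assignment of a sync manager. */
typedef enum { ASSIGN_UNKNOWN, ASSIGN_KNOWN } assign_state_t;

typedef struct {
    assign_state_t state;
    unsigned int n_pdos;  /* only meaningful when state == ASSIGN_KNOWN */
} sm_assign_t;

/* Record the outcome of reading 0x1C1x:0 during the scan. */
void record_scan_result(sm_assign_t *sm, bool upload_ok, unsigned int count)
{
    assert(sm != NULL);
    if (upload_ok) {
        sm->state = ASSIGN_KNOWN;
        sm->n_pdos = count;
    } else {
        sm->state = ASSIGN_UNKNOWN;  /* do not pretend the answer was 0 */
        sm->n_pdos = 0;
    }
}

/* On activation, only write the assignment if it comes from the application
 * or from a scan that succeeded; never write back a guessed value. */
bool may_write_assignment(const sm_assign_t *sm, bool app_configured)
{
    assert(sm != NULL);
    return app_configured || sm->state == ASSIGN_KNOWN;
}
```

 With this split a failed transfer leaves the slave's assignment untouched
 unless the application configured it explicitly.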

 This guarantees that the next scan will not find any PDOs, unless the slave
 reloads the default assignments during INIT (and with my slave author hat
 on, all advice I can find says that slaves should not do that, although I
 couldn't find official word).

 So basically it all seems to point to applications being unreliable (at
 least for flexible-assignment slaves) unless they use ecrt_slave_config_pdos
 to configure *everything* (including mappings, even for fixed-mapping
 slaves).  Which makes me wonder why it bothers scanning for PDO assignments
 at all.  Doesn't that just waste time if apps have to use
 ecrt_slave_config_pdos anyway?

 Given how flaky mailbox handling is in general (as previously mentioned),
 I'm surprised this hasn't come up more often.


 ___
 etherlab-users mailing list
 etherlab-users@etherlab.org
 http://lists.etherlab.org/mailman/listinfo/etherlab-users



Re: [etherlab-users] Error reassigning removed PDO

2014-05-29 Thread Gavin Lambert
It’s mostly a master problem I think, although some of the worst misbehaviour
requires particular functionality in the slave (which may be rarer).

The main problem that I’ve personally run into recently (and coded my own
workaround for, just a few minutes ago) was from this scenario:

1.   Master starts up, starts doing slave scanning.
2.   Application starts up, calls ecrt_request_master, which waits for
     slave scanning to complete before returning.
3.   Application sets up basic configuration and calls ecrt_master_activate.
4.   Slaves wind their way up to OP.
5.   Meanwhile in the background the master starts reading the CoE
     dictionary and getting entry descriptions to fill in the names.  (This
     takes quite a long time.)
6.   Application decides something is screwy while this is still happening
     and calls ecrt_release_master and unloads the master module.
7.   Since the master stops dead when this happens, occasionally it has
     just sent a CoE Info request to a slave but abandoned waiting for the
     response.  The response is still sitting there in the slave’s mailbox.
     The slaves have dropped back to SAFEOP+ERROR because they’re no longer
     receiving data.
8.   The master service and application are reloaded.
9.   The initial scan sees the slaves at >= PREOP so merely acknowledges
     the error and leaves them at SAFEOP, then starts to read SM+PDOs.
10.  When it gets to the slave that had a stale SDO Info response in its
     mailbox (which is still there, because the slave was never sent back to
     INIT), it gets confused because it wasn’t the SDO 0x1C12 data response
     it was expecting (because it had just sent the request); it aborts the
     request and assumes 0 PDOs in that SM.  Hilarity ensues, as I’ve
     already outlined in my earlier message.

(This can also occur if the network is disconnected but not unpowered at any
time during the CoE dictionary scan, then reconnected later.)

Note that it’s reasonable for the scan to not reset to INIT, because rescans
can occur during operation (although having said that, I haven’t looked too
closely at whether this disrupts anything).  But I think it’s definitely a
master-side bug that it can’t cope with stale responses – that’s just something
you always have to expect with mailboxes, especially when there are timeouts
involved as well.

My workaround was to change the CoE FSM to check for and discard any stale data
in the mailbox prior to beginning any CoE operation.  It seemed to resolve the
above issue in a very basic test, but I’ll hopefully know more after a more
thorough one tomorrow.
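
The effect of draining the mailbox first can be shown with a toy model in
plain C. This is not the actual FSM change: the mailbox is simulated as a
small FIFO of reply tags, the tag is simply the SDO index the reply belongs
to, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MBOX_DEPTH 4

/* Toy model of a slave's outgoing mailbox: a FIFO of reply tags. */
typedef struct {
    uint16_t queue[MBOX_DEPTH];
    size_t count;
} sim_mailbox_t;

/* Queue one reply; returns false if the simulated mailbox is full. */
bool mbox_push(sim_mailbox_t *m, uint16_t tag)
{
    assert(m != NULL);
    if (m->count == MBOX_DEPTH)
        return false;
    m->queue[m->count++] = tag;
    return true;
}

/* Fetch the oldest reply; returns false if nothing is waiting. */
bool mbox_pop(sim_mailbox_t *m, uint16_t *tag)
{
    assert(m != NULL && tag != NULL);
    if (m->count == 0)
        return false;
    *tag = m->queue[0];
    for (size_t i = 1; i < m->count; i++)
        m->queue[i - 1] = m->queue[i];
    m->count--;
    return true;
}

/* Simulated SDO read: optionally discard stale replies, let the "slave"
 * answer the new request, then check that the reply read is the expected
 * one.  Returns true only if the reply matches the request. */
bool sdo_read(sim_mailbox_t *m, uint16_t index, bool drain_first)
{
    uint16_t reply;
    assert(m != NULL);
    if (drain_first)
        m->count = 0;              /* throw away anything left over */
    if (!mbox_push(m, index))      /* the slave answers our request */
        return false;
    if (!mbox_pop(m, &reply))
        return false;
    return reply == index;         /* a stale reply would not match */
}
```

With a leftover reply in the mailbox, a read of 0x1C12 without draining gets
the stale reply and fails, while the same read with draining succeeds.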

 

It’s not an ideal solution, of course; the underlying problem (which I hinted
at earlier, and posted in more detail about several months ago) is that the
Etherlab code assumes that only one thing is going on in the mailboxes at a
time, and so only checks them when it’s expecting a response and throws its
virtual hands up when it finds something other than what it wanted.  This is
particularly noticeable if a slave sends asynchronous notifications, or can
process multiple mailbox protocols in parallel (both of which are allowed in
the standards).  The most common types of these are CoE emergencies and EoE.
And woe betide you if the master happens to be handling a FoE request when an
emergency arrives, or a CoE request when an EoE packet arrives, etc.

Ideally the master should have some sort of central dispatcher which is
constantly watching mailboxes and handing off incoming data to the protocol
state machines as they arrive.  Often this can even be done for “free” – many
slaves provide a dedicated “MBoxState” FMMU that can be used to watch for new
mailbox messages as part of the regular process datagram, avoiding the need to
individually poll the slaves.
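
A very small sketch of such a dispatcher, in plain C, might look like this.
It is only meant to show the routing idea and is not a proposal for the
actual master code: the structure and function names are hypothetical, and
the protocol type values (EoE 0x02, CoE 0x03, FoE 0x04) are the mailbox
header type codes as I understand them, so check them against the spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mailbox header protocol types (assumed values, see note above). */
enum { MBOX_TYPE_EOE = 0x02, MBOX_TYPE_COE = 0x03, MBOX_TYPE_FOE = 0x04 };

#define MBOX_TYPES 16  /* the type field is treated as 4 bits wide here */

/* Callback invoked for every incoming message of a registered type. */
typedef void (*mbox_handler_t)(void *ctx, const uint8_t *data, size_t len);

typedef struct {
    mbox_handler_t handler[MBOX_TYPES];
    void *ctx[MBOX_TYPES];
    unsigned int dropped;  /* messages nobody was registered for */
} dispatcher_t;

/* Attach a protocol state machine (represented by a callback) to a type. */
void dispatcher_register(dispatcher_t *d, uint8_t type,
                         mbox_handler_t h, void *ctx)
{
    assert(d != NULL && type < MBOX_TYPES);
    d->handler[type] = h;
    d->ctx[type] = ctx;
}

/* Route one incoming message by its type, whatever the master happened to
 * be waiting for.  Returns false if no handler took it. */
bool dispatcher_deliver(dispatcher_t *d, uint8_t type,
                        const uint8_t *data, size_t len)
{
    assert(d != NULL);
    if (type >= MBOX_TYPES || d->handler[type] == NULL) {
        d->dropped++;
        return false;
    }
    d->handler[type](d->ctx[type], data, len);
    return true;
}

/* Example handler: just count the messages it receives. */
void count_message(void *ctx, const uint8_t *data, size_t len)
{
    (void)data;
    (void)len;
    ++*(unsigned int *)ctx;
}
```

The point is that an EoE packet arriving in the middle of a CoE request is
handed to the EoE handler instead of being mistaken for the CoE response.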

 


Re: [etherlab-users] Error reassigning removed PDO

2014-05-29 Thread Jun Yuan
Thank you so much. After reading your mail, I finally understand why some
slaves go to the SAFEOP+ERROR state under these circumstances. Yes, I had
exactly the same problem.



Re: [etherlab-users] Error reassigning removed PDO

2014-04-22 Thread Richard Hacker


On 04/22/2014 09:33 AM, Gavin Lambert wrote:

Hi all,

TLDR: when reassigning PDOs, why doesn't the master read mappings from the
slave via CoE?

For the very simple reason that the application can start without any 
slaves being attached to the network!


In order to be able to do that, the master must be informed of the 
network topology _and_ the SyncManager configuration of every slave.


The slaves themselves have different levels of intelligence. The 
simplest of them all don't support any reconfiguration, some support 
reconfiguring SyncManagers with different predefined and fixed PDOs, and 
yet others even support reconfiguring the PDOs themselves.


Some slaves don't even know their configuration until they have booted 
and been configured by the master, as in the case of completely dynamic 
slaves like bus converters, e.g. EtherCAT - ProfiBus converters. These 
slaves are completely dependent on the master telling the slave what its 
configuration looks like, in terms of SyncManagers, PDOs and even PDO 
Entries!


Whether SyncManagers and PDOs are fixed or even mandatory should be 
documented in the ESI XML file.


Just by the way, SII does not necessarily contain valid information, but 
it might. SII is a sort of online data storage where the slave 
manufacturer can store some information. Here is a typical example of a 
single point of truth flaw: the information in SII should reflect the 
slave, but if it doesn't, the slave still works and you have been 
misled! Even though there is a certification test to check that 
the SII information is correct, slaves exist where this is not the case.


TLDR: RTFM of the slave and tell the master how a slave is to be 
configured ;) It is not a waste of space.


- Richard



I have a (custom) slave that provides a number of different PDOs.  I have a
couple of different master applications which are interested in different
subsets of these PDOs.  As an example, let's say that the slave has an RxPDO
at 0x1600 that points to 0x7000:0x00:0x20, and one app wants to use this
value and the other doesn't.

If the master apps just use ecrt_domain_reg_pdo_entry_list to register the
PDOs of interest, then they both work (assuming that the slave has all the
required PDOs assigned by default), but it wastes space in the packet as the
whole SM is transferred even if some of the data is not of interest to that
particular master app.  (And in the case of outputs, it forces the master
app to write something even when it doesn't want to, lest the slave get
uninitialized data and think it needs to do something with it.)

If the master apps use ecrt_slave_config_pdos to select the PDOs of
interest, then things get troublesome.  If the master apps specify the full
mappings explicitly, then again things work, but as the slave does not
support remapping (just reassignment) this generates warnings, and it just
seems ugly to me to have to specify all this data that the slave already
knows.  (And it makes things more brittle, as if the mapping is changed in a
future version of the slave it will generate an error instead of just
working, as it would if it had loaded the slave's current mappings.)

If the master apps don't specify the full mappings, however (just the sync
manager - PDO assignments, which seems like it's a supported scenario given
the docs and examples), then results are mixed.  If the slave is rebooted
prior to running either master app, it works.  If not, then the master app
that wants the extra PDO will fail to run.

The problem case seems to be:
  - slave boots, has all PDOs in SII and CoE PDO assign.
  - first app runs, specifies PDO Assign to not include 0x1600.
    - runs successfully.
    - PDO Assign is updated in the actual slave.
  - second app runs, specifies PDO Assign to include 0x1600.
    - fails at ecrt_domain_reg_pdo_entry_list as it cannot find a mapping
      for 0x7000:0x00.
    - problem is "Loading default mapping for PDO 0x1600." - "No default
      mapping found."
    - PDO Assign of the actual slave is never actually updated in this case
      as it fails before it activates the slave configs.
  - ethercat rescan / ethercat pdos at this point does not show 0x1600.
  - it requires rebooting the slave, or manually updating PDO Assign (and
    rescanning) before the master will admit that it exists again.

Shouldn't this scenario work?  The PDO is always specified in the SII, even
if not presently in PDO Assign, so the master ought to know that it exists.
And failing that, it could just try to read the mappings directly from the
slave (if CoE is available) when unable to load default mapping from its
cache.  (I think part of the problem is that the CoE data appears to be
replacing the SII data in the master's PDO cache.)

I'm also a little puzzled as to why (if it wants to have a cache of PDO
mappings) it seems to limit itself to reading only the currently assigned
PDOs during the initial scan, instead of fetching all of them.  They
shouldn't be hard