On 30 June 2014, quoth Jun Yuan:
> 1) The master can send max. only one mailbox request in one cycle.

Per slave, yes.

> 2) Before a mailbox request is going to be sent, the master should 
> check the SM0 register 0x800 whether the send mailbox is full or 
> empty, and send the request only if it is empty.

This is not necessary.  If the mailbox is still full then an attempt to write 
to it will simply fail with WC=0.  As the check and send would have to be in 
different frames, it's more efficient just to try to send blindly (because the 
most common case will be that the mailbox is empty), and retry later in case of 
failure.
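
To illustrate (a minimal sketch only, in C, with hypothetical names rather than 
the actual master FSM code): queue the mailbox-write datagram without checking 
SM0 first, and if the working counter comes back 0, just queue the identical 
datagram again on the next cycle, up to some retry limit.

enum send_state { SEND_IDLE, SEND_PENDING, SEND_OK, SEND_FAILED };

struct mbx_send {
    enum send_state state;
    unsigned int attempts_left;  /* bound retries so a dead slave can't stall us */
};

static void mbx_send_start(struct mbx_send *s, unsigned int max_attempts)
{
    s->state = SEND_PENDING;
    s->attempts_left = max_attempts;
    /* ...queue the mailbox-write datagram for the next frame... */
}

/* Called once per cycle with the working counter of the returned datagram. */
static void mbx_send_check(struct mbx_send *s, unsigned int working_counter)
{
    if (s->state != SEND_PENDING)
        return;

    if (working_counter == 1) {
        s->state = SEND_OK;        /* slave accepted the write */
    } else if (--s->attempts_left > 0) {
        /* WC=0: mailbox probably still full (or the datagram missed the
         * slave) -- queue the identical datagram again next cycle. */
    } else {
        s->state = SEND_FAILED;    /* leave it to higher-level retries */
    }
}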

Some care is required with that, though: WC=0 does not always mean that the 
mailbox is full.  It can also occur if the datagram somehow missed the slave 
entirely, or the returning datagram could get lost and the send could time out 
(in which case you won't know whether the slave received it or not).

While testing Frank's patches (which include an auto-retry-send mechanism, 
since as he noted the higher-level FSMs were inconsistent about whether they 
retried sends or not) on an unreliable network, I found a case where a send 
timed out, so the master sent the message again, and then the slave replied 
twice (which the master wasn't expecting).  This suggests that the slave did 
receive the first
send but the incoming WC=1 response got lost.  (Actually in this specific case 
it was merely delayed rather than really lost, but it had the same effect.)  
Ultimately everything recovered from this properly due to higher-level retries, 
but it did result in some warnings and errors getting logged.  (See also 
discussion at the bottom.)

(Note that it's similarly possible to blindly issue fetch requests instead of 
doing the two-part check+fetch that Etherlab currently does -- but there the 
tradeoff is more dubious: the slave is expected to take some time to generate 
the response, so a blind fetch will fail several times before it succeeds, 
which makes the "cheaper" check datagram the better choice.  It's also possible 
to explicitly map an FMMU into a domain to cyclically poll the mailbox state of 
all slaves, provided a realtime loop is running and the slave provides an extra 
FMMU for this purpose [which most CoE slaves do]; see the sketch below.)
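
For reference, a small sketch of where that mailbox state actually lives (the 
constants below are my reading of the usual ESC register map, not tested code): 
each SyncManager has an 8-byte register block starting at 0x0800, the status 
byte is at offset +5, and for mailbox-mode SMs bit 3 of that byte means 
"mailbox full" -- so SM0 (write mailbox) status is at 0x0805 and SM1 (read 
mailbox) status at 0x080D.  How you fetch the byte (check datagram, register 
request, or an FMMU-mapped domain entry) is up to the application.

#include <stdint.h>
#include <stdbool.h>

#define ESC_SM_BASE             0x0800u
#define ESC_SM_BLOCK_SIZE       8u
#define ESC_SM_STATUS_OFFSET    5u
#define ESC_SM_STATUS_MBX_FULL  (1u << 3)  /* "mailbox full" for mailbox-mode SMs */

static uint16_t esc_sm_status_addr(unsigned int sm_index)
{
    return (uint16_t)(ESC_SM_BASE + sm_index * ESC_SM_BLOCK_SIZE
                      + ESC_SM_STATUS_OFFSET);
}

static bool esc_mailbox_full(uint8_t sm_status_byte)
{
    return (sm_status_byte & ESC_SM_STATUS_MBX_FULL) != 0;
}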

> 3) Mailbox service with different protocol can be executed in parallel. 
> That is, the master can send a new mailbox request, while the last 
> conversation in another protocol is not finished yet, given that the 
> send mailbox is already cleared.

I believe so, yes.  I can't rule out the possibility that some slaves might not 
be able to cope with this, but in general I would expect that any slave 
prepared to process multiple protocols should be able to keep track of those 
conversations in parallel, as they're always described as independent FSMs.

> 4) We're not sure if all the slave can support multiple CoE conversations 
> in parallel. So the master should start a new CoE conversation only when 
> the last one is finished. The same applies to SoE/FoE/VoE.

Yes.  I believe that most of the time it would actually work -- if a slave is 
processing requests entirely synchronously, then it will start processing the 
first received request and leave the second either in the mailbox or in its 
internal queue until it's done with the first, and everything should just work 
out.  But slaves are allowed to process requests asynchronously (and may choose 
to do so if a request e.g. requires configuring onboard hardware), and that's 
where trouble could start.  Additionally, the way the Etherlab master code is 
written at the moment means (I think) that the order in which the requests are 
actually sent is not known at the higher level, and there isn't a way to ensure 
that each reply is matched up with the correct request when it arrives.
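
A minimal sketch of what I mean by serialising per protocol (the types and 
names here are hypothetical, not the existing master structures): each slave 
keeps at most one conversation open per mailbox protocol, and queued requests 
for a busy protocol simply wait until the previous conversation finishes.

#include <stdbool.h>
#include <stddef.h>

enum mbx_protocol { MBX_COE, MBX_SOE, MBX_FOE, MBX_VOE, MBX_PROTO_COUNT };

struct mbx_request {
    enum mbx_protocol protocol;
    struct mbx_request *next;      /* simple FIFO of pending requests */
    /* ...request payload... */
};

struct slave_mailbox {
    bool busy[MBX_PROTO_COUNT];    /* one open conversation per protocol */
    struct mbx_request *queue;     /* requests not yet started */
};

/* Returns the next request that may be started now, or NULL if every
 * queued request belongs to a protocol that is still mid-conversation. */
static struct mbx_request *next_startable(struct slave_mailbox *mbx)
{
    struct mbx_request **p = &mbx->queue;

    while (*p) {
        if (!mbx->busy[(*p)->protocol]) {
            struct mbx_request *req = *p;
            *p = req->next;                   /* unlink from the queue */
            mbx->busy[req->protocol] = true;  /* mark the conversation open */
            return req;
        }
        p = &(*p)->next;
    }
    return NULL;
}

/* Call when the final response of a conversation has been received. */
static void conversation_finished(struct slave_mailbox *mbx,
                                  enum mbx_protocol protocol)
{
    mbx->busy[protocol] = false;
}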

Another consideration is that it's possible for a single request or response to 
consist of multiple mailbox exchanges (e.g. if a response is fragmented because 
it doesn't fit in the mailbox).  I'm not sure whether this is generic, but I do 
know that SDO Information responses can get fragmented this way, and it's 
possible for other responses to get injected in the middle of a fragmented 
response, which would presumably be a major pain to disambiguate if they didn't 
have different protocol types in the mailbox header.
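
For reference, this is the standard 6-byte mailbox header as I understand it 
from the spec (a sketch of a decoder; fields are little-endian on the wire).  
The Type field is what lets interleaved fragments from different protocols be 
told apart, and the last byte also carries the counter discussed further down.

#include <stdint.h>

struct mbx_header {
    uint16_t length;    /* payload length following the 6-byte header */
    uint16_t address;   /* station address of originator/destination */
    uint8_t  channel;   /* bits 0-5 of byte 4 (currently reserved) */
    uint8_t  priority;  /* bits 6-7 of byte 4 (currently reserved) */
    uint8_t  type;      /* bits 0-3 of byte 5: 2=EoE, 3=CoE, 4=FoE, 5=SoE, 15=VoE */
    uint8_t  counter;   /* bits 4-6 of byte 5: 0 = no duplicate detection, else 1-7 */
};

static void mbx_header_decode(const uint8_t raw[6], struct mbx_header *h)
{
    h->length   = (uint16_t)(raw[0] | (raw[1] << 8));
    h->address  = (uint16_t)(raw[2] | (raw[3] << 8));
    h->channel  = raw[4] & 0x3F;
    h->priority = (uint8_t)((raw[4] >> 6) & 0x03);
    h->type     = raw[5] & 0x0F;
    h->counter  = (uint8_t)((raw[5] >> 4) & 0x07);
}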

And again, in the standards the various protocols are described as state 
machines that react to mailbox data in specific ways that imply you shouldn't 
be trying to concurrently access the same state machine with different requests.

So I don't think it's a good idea to attempt running multiple conversations of 
the same protocol in parallel.

(The one fly in the ointment is that CoE Emergencies technically belong to the 
CoE protocol, but they're described as a separate state machine, implying that 
emergencies can arrive at any time, including mid-fragmented-CoE-message.  But 
I think the Etherlab CoE state machine is already prepared for that.)

> While reading the documentation and another open source ethercat project 
> SOEM, I found there is a mailbox service "counter" besides the service 
> type in the mailbox header. It says, "Counter of the mailbox services 
> (0 is the start value; the next value after 7 is 1)". I wonder what this 
> counter is used for, how it is implemented in your slave example code, and 
> whether it could be useful for us in the multiple conversation situation.

As I understand it, mostly it's intended as a way to avoid the situation above 
with repeated requests causing duplicated responses.  The idea is that when the 
master is sending a request it picks a value from 1-7 to go in there (this 
should increment with each unique request according to the spec, but slaves 
shouldn't be overly picky about it).  If the send gets WC=0, then it can repeat 
the request *with the same counter*.  If the slave receives two *consecutive* 
requests with the same non-zero counter value, it ignores the second; in the 
scenario above that would have resulted in only one reply, and everyone would 
have been happy.  Note that the counter is global to the mailbox, not 
per-protocol.  
Also note that this does mean that even sends for different protocols need to 
be aware of each other at the lower level in order to set the correct counter 
and retry if necessary before sending the subsequent request.
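
As a rough sketch of that counter handling (hypothetical names again, and 
simplified to one global counter per slave as described above): bump the 
counter 1..7 with wrap-around for each new request, re-use it unchanged when 
repeating a send after WC=0, and on the slave side treat a request whose 
non-zero counter matches the immediately preceding one as a repeat.

#include <stdint.h>
#include <stdbool.h>

struct slave_mbx_counter {
    uint8_t last;   /* last counter value sent, 1..7 (0 = none sent yet) */
};

/* Counter for a brand-new request: 1, 2, ..., 7, 1, 2, ... */
static uint8_t mbx_counter_next(struct slave_mbx_counter *c)
{
    c->last = (uint8_t)(c->last % 7u + 1u);   /* 7 wraps to 1, 0 starts at 1 */
    return c->last;
}

/* Counter for repeating the same request after WC=0: unchanged. */
static uint8_t mbx_counter_retry(const struct slave_mbx_counter *c)
{
    return c->last;
}

/* Slave side: a request whose non-zero counter equals the previous
 * request's counter is a repeat and should be ignored. */
static bool mbx_is_repeat(uint8_t prev_counter, uint8_t new_counter)
{
    return new_counter != 0 && new_counter == prev_counter;
}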

When the master uses a counter value of 0 (which Etherlab currently always 
does), this duplicate detection is bypassed and all requests are processed.  
Similarly, when the slave generates responses into the receive mailbox it may 
either always use 0 for the counter or increment it 1-7, but this is 
independent of the send-mailbox counter.

So it's not really intended to deal with multiple conversation threads.  There 
are some other fields in the mailbox header that do look like they're intended 
for that sort of thing (channel and priority) but currently they're reserved in 
the spec and not actually implemented AFAIK.

There's also a mechanism for getting a slave to repeat a response without 
re-sending the request (which might have side effects), which could be useful 
if a check indicated something in the read mailbox but then the subsequent 
fetch timed out (after actually succeeding at clearing the read mailbox).  This 
involves a register write to 0x080E, but I haven't looked too closely at the 
specifics, or how likely slaves are to implement it.
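
From a quick look at the ESC register map (so treat this as my reading of the 
datasheet, not something I've tested), the sequence appears to be: toggle the 
Repeat bit (bit 1) of the SM1 Activate register at 0x080E, then wait for the 
slave's PDI to mirror that toggle in the Repeat Ack bit (bit 1) of the PDI 
Control register at 0x080F, after which the read mailbox should contain the 
previous response again.

#include <stdint.h>
#include <stdbool.h>

#define ESC_SM1_ACTIVATE   0x080Eu   /* bit 1 = Repeat Request (toggled by master) */
#define ESC_SM1_PDI_CTRL   0x080Fu   /* bit 1 = Repeat Ack (toggled by slave) */
#define ESC_SM_REPEAT_BIT  (1u << 1)

/* Step 1: flip the repeat bit in the byte read from 0x080E and write it back. */
static uint8_t esc_toggle_repeat(uint8_t activate_byte)
{
    return (uint8_t)(activate_byte ^ ESC_SM_REPEAT_BIT);
}

/* Step 2: poll 0x080F until its repeat-ack bit matches the repeat bit we wrote. */
static bool esc_repeat_acked(uint8_t activate_byte, uint8_t pdi_ctrl_byte)
{
    return (activate_byte & ESC_SM_REPEAT_BIT) ==
           (pdi_ctrl_byte & ESC_SM_REPEAT_BIT);
}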

Regards,
Gavin Lambert

