> On Friday, 6 June 2014 02:23, quoth Frank Heckenbach:
> > I attach my complete set of patches, including the patches I've sent
> > in previous mails (02-* to 08-*, slightly adjusted;
> > 01-ethercat-1.5-header.patch was applied in your code already).
>
> I've been experimenting with merging most of these into my local code
> (ignoring some of the EoE-specific patches as I don't have any way to
> test that at the moment), with the intent to re-release as
> 1.5.2-compatible patches once it seems to be behaving itself.
Perhaps you can at least make sure they compile with the new version.
Ultimately, Florian will have to integrate the patches into the
development sources, or not.

> While merging I came across a few oddities that I was hoping you might
> be able to clarify:
>
> > (09-ethercat-1.5-mailbox-tag.patch and
> > 10-ethercat-1.5-mailbox-allocate-buffer.patch contain the boring
> > parts (preparations, new data structures) that shouldn't change
> > the behaviour. The main change is in
> > 11-ethercat-1.5-mailbox-buffer.patch.)
>
> In patch 11, you have a change to master.c that checks for
> jiffies_poll > jiffies_sent in addition to the original
> (jiffies_poll - jiffies_sent) > timeout_jiffies.
>
> What was the motivation for this? It doesn't seem like it will be
> wraparound-safe, and the safe version of it would be redundant with the
> original code.

Yes, I think I was overly cautious here (same with cycles, just before
that). If we can assume that cycles_t is always an unsigned type, I think
we can rely on C wraparound rules and don't need those additional checks.

> > After I backported code from your repository to add locking in
> > ec_master_clear_slaves()
> > (18-ethercat-1.5-locking-fix-backport.patch)
>
> I'm curious where this was backported from. The current 1.5.2 code
> doesn't have this.

Indeed, it doesn't, but the development code as obtained by the
"hg clone" command listed on http://etherlab.org/de/ethercat/index.php
does, though it now uses master->io_mutex instead of master->io_sem.
From what I can tell (seeing that 1.5.0 doesn't have io_mutex at all),
an intermediate version introduced the code with io_sem (which I then
took), and it was later changed to io_mutex along with other changes.
(This all might not be very relevant except to explain where I got the
code from; otherwise you can treat it like one of my patches, I guess.)
However, looking at the ChangeLog entries from 2012-12-06 and 2013-01-10,
there have been changes regarding io_sem in 1.5.2.
This might be a conflict area with my patches, so you might have to check
these patches (#18, #19, #20) carefully as a whole when applying. My
goals, as I tried to make clear, were:

1. To have a single semaphore responsible for locking datagram_queue in
   any situation. (Whether this lock is called io_sem or io_mutex is
   irrelevant, of course.)

2. To have a reliable way of locking between RTAI and normal code; as I
   wrote, semaphores won't do, which is why io_sem (or now perhaps
   io_mutex) shouldn't be used anywhere at all except in the default
   callback, which my RTAI-using application overrides in favour of a
   locking mechanism that cooperates with RTAI (see #21, #22 -- these
   patches apply to the example programs, but of course, my actual
   application does basically the same).

I don't know if 1. is already the case in 1.5.2, but 2. certainly isn't
(send_callback still uses an RTAI semaphore, which isn't viable as I
described, because a single responsible semaphore as required by 1. can
also be used via cdev from any process which is not RT-hardened).

> > I see that in newer versions (e.g. commit 53b5128e1313), you
> > apparently reverted the callback mechanism from send/receive
> > callbacks back to lock/unlock callbacks as it was in 1.4. I also
> > prefer the latter since they can be used more generally.
>
> The specified commit does not seem to be related to callbacks, was made
> in 2011, and the latest 1.5.2 code still has send/receive callbacks. So
> this confuses me.

I admit I'm not very proficient with hg, so I probably mixed up the
commits. I'd have to read up on it and dig deeper; maybe you're faster at
it. In any case, the cloned code (see above) does contain lock_cb and
unlock_cb in place of send_cb and receive_cb as in 1.5.0 and 1.5.2. So I
figure Florian made this change but hasn't pushed it into a release yet.

> > To avoid this, I fetch the mailbox once before using it for the
> > first time, ignoring any result, whether empty or not.
> > (26-ethercat-1.5-clear-mailbox.patch)
>
> In this patch, you log a message saying that data was cleared if the
> fetch datagram working counter != 1. Shouldn't that be != 0 or == 1
> instead? AFAIK the working counter will be 0 if the mailbox is already
> empty and 1 if it fetched and discarded the mailbox contents.

Yes, it should be != 0. (I didn't notice it during testing since I could
never know the actual state of the mailbox, and it was only a log message
anyway.)

> Also, this blindly clears the mailbox whenever the slave is rescanned.
> It's possible for the slave to be rescanned during operation (e.g. if
> the number of responding slaves on the network changes). I'm not sure
> if this will have any negative consequences for pending CoE/FoE/EoE
> requests (and presumably unsolicited received EoE packets) or if these
> are just abandoned/reset on rescan anyway (which might also be a
> problem, but at least not a new one). I haven't looked closely enough
> at the code in question to be sure.

AFAICS the scanning ultimately originates in
ec_fsm_master_state_broadcast, starting with
ec_fsm_master_enter_clear_addresses. Several lines before that, there's a
call to ec_master_clear_slaves, which does what it says. Otherwise, I'd
also have to clear my buffers (#10) on a rescan; this way, I only need to
free them in ec_slave_clear, as is done in the ec_master_clear case
anyway. So I don't think continuing communication over a rescan is
possible anyway. (Consider that slaves may even change their position if
other slaves were inserted before them.)

I don't know about your application, but in my case, I only want to run
with the correct (configured) number of slaves anyway. Therefore, in my
cyclic code I always call ecrt_master_state and check that in the result
link_up is set and slaves_responding is equal to the number I expect;
otherwise I abort immediately because something's fishy.
> > ec_fsm_master_exec()) just assumes that the FSM has a datagram to
> > send out in every state, so it always returns 1 unless it's
> > waiting for a reply. With my previous change, this isn't the case
> > anymore, and it cannot be -- unless I'd block the FSM completely
> [...]
> > (27-ethercat-1.5-coe-lock.patch)
>
> This one was also a bit tricky. Since the patch was made, it looks like
> ec_fsm_coe_exec had already been changed to include the concept of not
> sending a datagram and returning 0 -- except that most of the places
> where it gets called just ignore the return value.

ec_fsm_coe_exec could return 0 in 1.5.0 already. The return value is only
used by those callers that go on to do other things after calling it (to
return early on 0); those that call it at their end don't (need to) check
the result, and I think that's OK.

One of my changes was that ec_fsm_master_exec doesn't always return 1
after executing a state (it still does in 1.5.2). It's a consequence of
my previous changes: if CoE is reserved by the slave FSM, the master FSM
must wait. In order to do that, I set datagram->state to
EC_DATAGRAM_INVALID and prevent it from being queued. Returning 0 from
ec_fsm_master_exec is an easy way to achieve this (for both callers),
since the 0 return (the first branch in ec_fsm_master_exec) was already
there for the case that the FSM is still waiting for a reply and doesn't
execute a state at all (and therefore obviously doesn't send a new
datagram either).

Does this clarify things for you? If not, please ask again.

Regards,
Frank

-- 
Dipl.-Math. Frank Heckenbach <f.heckenb...@fh-soft.de>
Stubenlohstr. 6, 91052 Erlangen, Germany, +49-9131-21359
Systems Programming, Software Development, IT Consulting

_______________________________________________
etherlab-dev mailing list
etherlab-dev@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-dev