Re: [etherlab-dev] Multiple mailbox protocols and other issues

2015-03-02 Thread Gavin Lambert
On 3 March 2015 01:49, quoth Knud Baastrup:
> Thanks, attached updated patches. See inline comments.

Looks good, as far as I can see. :)

Although speaking of syslog spam, I'm getting quite a lot of "Busy -
processing internal SDO request!" now.

>> Although speaking of the EC_REGALIAS code, if that's enabled and if the
>> register 0x0012 alias is different from the SII alias, then this patch
might
>> malfunction (it should probably skip reading the SII alias and go
straight
>> for the register).  Having said that, normally the two should be the
same,
>> unless someone is in the process of changing the alias (in which case
>> rebooting the slave afterwards should "fix" everything).  There might be
some
>> odd slaves out there though, which could be why EC_REGALIAS was added in
the
>> first place..?
>
> The patch should not be malfunctioning, but yes if alias (or a serial
> number) is updated after a re-scan the stored sii_image cannot be matched
in
> the coming re-scan and a new sii_image will be created for that particular
> module.

The issue would be if some slave always had some wrong value in its SII but
loaded some other value to register 0x0012 on startup (eg. from onboard
dipswitches).  This is not as unlikely as it sounds as it can be quite
awkward for the slave to access its own SII, especially with the default
Etherlab configuration (EC_SII_ASSIGN is not defined by default, and I'm
fairly sure it's not implemented correctly anyway).  This could either work
by coincidence (if the "wrong" value was still unique), or cause either a
cache miss or in the worst case a hit on the wrong data (if that alias value
is shared with another slave).

Fortunately the standards require that in this case the "wrong SII value"
must be zero, which would just make your patch ignore the alias instead of
getting an invalid cache hit, but it's always possible there's some slave
that violates this.  (Also the standard says that in case they're both
non-zero it doesn't need to signal an error until the INIT->PREOP
transition, and the scan may occur before this.)

So I was thinking that in the EC_REGALIAS case your patch should just read
register 0x0012 sooner instead of reading the SII alias at first and then
reading 0x0012 later (but not using the latter for the SII lookup).  It'd
save several network cycles too.

Having said that, I don't know how common use of EC_REGALIAS is (I don't use
it myself).  Maybe it doesn't really matter.


___
etherlab-dev mailing list
etherlab-dev@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-dev


Re: [etherlab-dev] Multiple mailbox protocols and other issues

2015-03-01 Thread Gavin Lambert
On 27 February 2015 21:44, quoth Knud Baastrup:
> 
> Added one additional patch
(0015-Internal-SDO-requests-now-synchronized-with-
> external.patch) that solves an issue with input/output errors when
executing
> the ethercat sdos command (that now fetch directory) while configured SDO
> requests are executed from user application. Can also be observed with
> ethercat upload/download from EtherCAT Tool together with the execution of
> configured SDO requests. See the documentation in the patch it selves for
> more information.

I just noticed that patch "17_remove_prints_to_avoid_syslog_spam.patch" from
the 02022015 patch series appears to have vanished from later series.  I
assume this was intentional, as I don't recall seeing the spam it referred
to, but I thought I'd mention it just in case.

Also, some compiler warnings are still present from patch 0013:

master/fsm_slave_scan.c: In function 'ec_fsm_slave_scan_enter_attach_sii':
master/fsm_slave_scan.c:494:17: warning: format '%zu' expects argument of
type 'size_t', but argument 5 has type 'int' [-Wformat=]
 EC_SLAVE_DBG(slave, 1, "Slave can re-use SII image data
stored."
 ^
master/fsm_slave_scan.c:502:17: warning: format '%zu' expects argument of
type 'size_t', but argument 5 has type 'uint32_t' [-Wformat=]
 EC_SLAVE_DBG(slave, 1, "Slave can re-use SII image data
stored."
 ^
master/fsm_slave_scan.c:502:17: warning: format '%zu' expects argument of
type 'size_t', but argument 6 has type 'uint32_t' [-Wformat=]
master/fsm_slave_scan.c:502:17: warning: format '%zu' expects argument of
type 'size_t', but argument 7 has type 'uint32_t' [-Wformat=]
master/fsm_slave_scan.c: In function 'ec_fsm_slave_scan_state_sii_alias':
master/fsm_slave_scan.c:721:5: warning: format '%zu' expects argument of
type 'size_t', but argument 5 has type 'int' [-Wformat=]
 EC_SLAVE_DBG(slave, 1, "Alias: %zu\n", slave->effective_alias);
 ^
master/fsm_slave_scan.c: In function 'ec_fsm_slave_scan_state_sii_serial':
master/fsm_slave_scan.c:759:5: warning: format '%zu' expects argument of
type 'size_t', but argument 5 has type 'uint32_t' [-Wformat=]
 EC_SLAVE_DBG(slave, 1, "Serial Number: %zu\n",
slave->effective_serial_number);
 ^
master/fsm_slave_scan.c: In function 'ec_fsm_slave_scan_state_sii_vendor':
master/fsm_slave_scan.c:792:5: warning: format '%zu' expects argument of
type 'size_t', but argument 5 has type 'uint32_t' [-Wformat=]
 EC_SLAVE_DBG(slave, 1, "Vendor ID: %zu\n", slave->effective_vendor_id);
 ^
master/fsm_slave_scan.c: In function 'ec_fsm_slave_scan_state_sii_product':
master/fsm_slave_scan.c:825:5: warning: format '%zu' expects argument of
type 'size_t', but argument 5 has type 'uint32_t' [-Wformat=]
 EC_SLAVE_DBG(slave, 1, "Product code: %zu\n",
slave->effective_product_code);
 ^

The ones complaining about "int" probably need casts to "unsigned" (or
uint32_t if you prefer) due to default parameter extension, and the %zu
should just be %u for all of them.

Also vendor ids and product codes are usually printed in hex.  Not sure
about serial numbers, but "ethercat slaves -v" displays those in hex too, so
that seems reasonable.

On a somewhat related note, I'm not sure "effective_serial" etc are good
variable names.  "Effective alias" is phrased that way because there are
several different kinds of alias, but this contains the one that's currently
in use (eg. see the EC_REGALIAS code, which allows the effective alias to
come from a register rather than the SII); but that isn't really true for
the vendor/product/serial values.  This is just a minor quibble of course.
:)

Although speaking of the EC_REGALIAS code, if that's enabled and if the
register 0x0012 alias is different from the SII alias, then this patch might
malfunction (it should probably skip reading the SII alias and go straight
for the register).  Having said that, normally the two should be the same,
unless someone is in the process of changing the alias (in which case
rebooting the slave afterwards should "fix" everything).  There might be
some odd slaves out there though, which could be why EC_REGALIAS was added
in the first place..?

Finally, this is one of those "probably not strictly necessary but it makes
things tidier just in case" changes, but I recommend adding the following
hunk to patch 0005:

--- a/master/datagram.c
+++ b/master/datagram.c
@@ -586,6 +586,9 @@
 case EC_DATAGRAM_ERROR:
 printk("error");
 break;
+case EC_DATAGRAM_INVALID:
+printk("invalid");
+break;
 default:
 printk("???");
 }


___
etherlab-dev mailing list
etherlab-dev@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-dev


Re: [etherlab-dev] Multiple mailbox protocols and other issues

2015-02-15 Thread Knud Baastrup
Thanks, it is just fine you mention these kind of issues as well as I any way 
would not know as you mention your selves. I can see the lines having only one 
indentation after a line break and not two and it is actual an indentation 
error that is also present in the stable-1.5 branch and it has not been 
corrected with the patch. 

Thanks,

Knud


-Original Message-
From: Gavin Lambert [mailto:gav...@compacsort.com] 
Sent: 15. februar 2015 23:36
To: Knud Baastrup
Cc: etherlab-dev@etherlab.org
Subject: RE: [etherlab-dev] Multiple mailbox protocols and other issues

On 13 February 2015 21:39, quoth Knud Baastrup:
>> Nice!  Although there still seem to be some funny things going on 
>> with
the
>> whitespace, eg. see patch 0013's master/fsm_slave_config.c's second 
>> hunk (ec_fsm_slave_config_enter_mbox_sync).
>
> I guess I need more help to figure this out. I cannot (with my current 
> knowledge of patch management) see anything wrong in this specific 
> hunk
(line
> 374 to 476). Do you get some kind of warning when applying the patch 
> or
how
> do you observe the issue?

The second hunk covers lines 467 to 524 in the patched file.

There's no patching errors or anything like that, it's just that the inserted 
lines have only four spaces instead of eight, so the indentation appears wrong 
when compared to the surrounding code.

I didn't examine the patches with a fine-toothed comb (though I did spend a bit 
of time looking through them, of course), so I don't know if there are other 
instances of this or if this was the only one, but I happened to notice this 
case so I thought I'd mention it.  Obviously it doesn't affect the actual 
operation of the patch, it's just a code style issue.


___
etherlab-dev mailing list
etherlab-dev@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-dev


Re: [etherlab-dev] Multiple mailbox protocols and other issues

2015-02-15 Thread Gavin Lambert
On 13 February 2015 21:39, quoth Knud Baastrup:
>> Nice!  Although there still seem to be some funny things going on with
the
>> whitespace, eg. see patch 0013's master/fsm_slave_config.c's second hunk
>> (ec_fsm_slave_config_enter_mbox_sync).
>
> I guess I need more help to figure this out. I cannot (with my current
> knowledge of patch management) see anything wrong in this specific hunk
(line
> 374 to 476). Do you get some kind of warning when applying the patch or
how
> do you observe the issue?

The second hunk covers lines 467 to 524 in the patched file.

There's no patching errors or anything like that, it's just that the
inserted lines have only four spaces instead of eight, so the indentation
appears wrong when compared to the surrounding code.

I didn't examine the patches with a fine-toothed comb (though I did spend a
bit of time looking through them, of course), so I don't know if there are
other instances of this or if this was the only one, but I happened to
notice this case so I thought I'd mention it.  Obviously it doesn't affect
the actual operation of the patch, it's just a code style issue.


___
etherlab-dev mailing list
etherlab-dev@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-dev


Re: [etherlab-dev] Multiple mailbox protocols and other issues

2015-02-13 Thread Knud Baastrup
See my in-lined comments!

BR, Knud

-Original Message-
From: Gavin Lambert [mailto:gav...@compacsort.com] 
Sent: 12. februar 2015 23:50
To: Knud Baastrup
Cc: etherlab-dev@etherlab.org
Subject: RE: [etherlab-dev] Multiple mailbox protocols and other issues

On 13 February 2015 03:30, quoth Knud Baastrup:
> I have attached a new set of patches (now by using git format-patch, 
> which also imply that the patch names are given by the commit text).
> 
> I believe that I have addressed the issues you highlighted in prior 
> mails including the alias support that might be relevant for us as 
> well sometime
in
> the future.

Nice!  Although there still seem to be some funny things going on with the 
whitespace, eg. see patch 0013's master/fsm_slave_config.c's second hunk 
(ec_fsm_slave_config_enter_mbox_sync).
[Knud] I guess I need more help to figure this out. I cannot (with my current 
knowledge of patch management) see anything wrong in this specific hunk (line 
374 to 476). Do you get some kind of warning when applying the patch or how do 
you observe the issue?
 
> The locks that conflicts with RTAI could be removed with a define 
> guard,
e.g.
> re-use the EC_RTDM define already available?

They're not conflicts, and I wasn't suggesting any specific changes (as I said, 
I don't use RTDM myself so I don't really know specifics).  It was just a 
comment to indicate why the master originally didn't do any locking there, and 
that your original problem *could* theoretically have been solved by doing the 
locking in the application code instead, as it's only an issue with concurrent 
realtime tasks, which are likely to need some application-level locking anyway. 
 It's a possible reason that Florian might not want to accept the patch, but 
that doesn't mean that you should modify or withdraw it -- that's something he 
can decide.
[Knud] Agree, we will keep the locks and let Florian decide to what extend he 
will include the patch.

I'll integrate your new patchset into my build and do a bit more testing; I'm 
hoping to post my full patch queue in a few days.  This will include your 
patches as well -- I hope you don't mind?  (They'll be clearly attributed, of 
course.)
[Knud] That is fully ok. I believe it will just further improve the chances for 
getting the patches merged into trunk by Florian. Feel free to do any 
improvements on the patches as well.

Just some further thoughts on patch 0010 (deferring the sdo dictionary
fetch): one of the interesting things about fsm_slave over fsm_master is that 
the former can run in parallel while the latter only in series.  In principle, 
this means that if someone issues "ethercat sdos" with no filters on a large 
network, the fetch time could be reduced considerably.  (It won't reduce the 
time to that of a single slave, as it has caps on the number of slave FSMs it 
can run in parallel to prevent blowing out the number of frames and causing 
latency.)  Currently your patch forces this to still run sequentially anyway, 
because the ioctl is blocking and it only does one slave at a time.

I was thinking about having a go at trying to make that change myself, but 
having said that, given that running "ethercat sdos" on multiple slaves is not 
particularly useful (since networks usually contain many duplicates) and that 
this is generally only used during development or commissioning, I'm not sure 
whether it's really worth it.
[Knud] I agree, it is tempting now to take next step and do this in parallel. 
It currently requires a patient user to wait several minutes for an ethercat 
sdos command (with no progress information) to complete. Today we as well only 
fetch the dictionary for debug and test purposes so we do not plan to further 
improve this for the moment.

I was also wondering if it should do a more limited dictionary fetch by default 
(just of the PDOs), which could improve some of the logging, but even then 
those messages only appear when the debug level is increased, so most of the 
time it wouldn't be all that useful.  And someone can just look at the slave 
docs (or a local dictionary scan) if they want to interpret logs to find out 
what a particular SDO entry is called from its index:subindex.

___
etherlab-dev mailing list
etherlab-dev@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-dev


Re: [etherlab-dev] Multiple mailbox protocols and other issues

2015-02-12 Thread Gavin Lambert
On 13 February 2015 03:30, quoth Knud Baastrup:
> I have attached a new set of patches (now by using git format-patch, which
> also imply that the patch names are given by the commit text).
> 
> I believe that I have addressed the issues you highlighted in prior mails
> including the alias support that might be relevant for us as well sometime
in
> the future.

Nice!  Although there still seem to be some funny things going on with the
whitespace, eg. see patch 0013's master/fsm_slave_config.c's second hunk
(ec_fsm_slave_config_enter_mbox_sync).

> The locks that conflicts with RTAI could be removed with a define guard,
e.g.
> re-use the EC_RTDM define already available?

They're not conflicts, and I wasn't suggesting any specific changes (as I
said, I don't use RTDM myself so I don't really know specifics).  It was
just a comment to indicate why the master originally didn't do any locking
there, and that your original problem *could* theoretically have been solved
by doing the locking in the application code instead, as it's only an issue
with concurrent realtime tasks, which are likely to need some
application-level locking anyway.  It's a possible reason that Florian might
not want to accept the patch, but that doesn't mean that you should modify
or withdraw it -- that's something he can decide.


I'll integrate your new patchset into my build and do a bit more testing;
I'm hoping to post my full patch queue in a few days.  This will include
your patches as well -- I hope you don't mind?  (They'll be clearly
attributed, of course.)


Just some further thoughts on patch 0010 (deferring the sdo dictionary
fetch): one of the interesting things about fsm_slave over fsm_master is
that the former can run in parallel while the latter only in series.  In
principle, this means that if someone issues "ethercat sdos" with no filters
on a large network, the fetch time could be reduced considerably.  (It won't
reduce the time to that of a single slave, as it has caps on the number of
slave FSMs it can run in parallel to prevent blowing out the number of
frames and causing latency.)  Currently your patch forces this to still run
sequentially anyway, because the ioctl is blocking and it only does one
slave at a time.

I was thinking about having a go at trying to make that change myself, but
having said that, given that running "ethercat sdos" on multiple slaves is
not particularly useful (since networks usually contain many duplicates) and
that this is generally only used during development or commissioning, I'm
not sure whether it's really worth it.

I was also wondering if it should do a more limited dictionary fetch by
default (just of the PDOs), which could improve some of the logging, but
even then those messages only appear when the debug level is increased, so
most of the time it wouldn't be all that useful.  And someone can just look
at the slave docs (or a local dictionary scan) if they want to interpret
logs to find out what a particular SDO entry is called from its
index:subindex.


___
etherlab-dev mailing list
etherlab-dev@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-dev


Re: [etherlab-dev] Multiple mailbox protocols and other issues

2015-02-10 Thread Gavin Lambert
On 10 February 2015 20:47:
>> 4. Regarding 13_domain_lock.patch, I believe the original rationale of
the
>> master is that locking between concurrent application tasks is the
>> responsibility of the application, not the master -- that's why in
>> kernelspace it has send/receive callbacks (formerly lock/unlock
callbacks) so
>> that eg. RTAI locks can be substituted, or locking can be avoided if the
>> application has some other way to schedule things to avoid actual
concurrency
>> (or if it's only running a single task).  See the "Concurrent Master
Access"
>> section in the docs.  I don't have any personal objections to this patch
>> though.
>
> Yes, we just faced some cases where the application developers did not
> include the necessary locking, which have quite severe impact for the
> complete system. We have not used RTAI, but I am not sure I understand why
> the extra locks become a problem for RTAI?

Not a problem as such, but not necessarily sufficient to protect anything.
RTAI/Xenomai is a separate kernel, so concurrent tasks would be using RTAI
locks instead of regular kernel locks, so they would make the kernel locking
redundant.  It does trivially hurt performance to take a lock that is never
contended, but it's usually not worth worrying about that unless it's in a
tight loop.

Having said that, I don't use RTAI *or* concurrent tasks, so it doesn't
really affect me either way. :)

>> 6. Regarding 16_improved_ethercat_rescan_performance.patch, it looks like
a
>> stray temporary file was included in the patch.  Also, I'm not sure it's
safe
>> to retrieve the data only by serial number.  Serial numbers are not
>> guaranteed unique between vendors, or even between product lines -- I
think
>> at minimum you should include the vendor id and product code in the
index.
>> Also, possibly this should have a #define config guard to disable this
>> functionality in case the master will be used at a site with pathological
>> slaves (eg. multiple slaves with identical non-zero vendor/product/serial
>> triplets, since *technically* they're not guaranteed unique at all --
>> although any slave vendor who does this deserves a kick).
>
> Yes sorry, my mistake with the temporary file. I can only agree with
> you that vendor and productcode should be included in the index in order
for
> this patch to be used in large scale, I will add this. I can also agree
with
> the #define guard.

To reduce network cycles a bit, I suggest trying the alias (if nonzero)
first, as this is required to be network-unique if defined (meaning that you
wouldn't need to check vendor/product/serial); falling back to reading
serial, check if nonzero, and only then read the vendor and product codes.
(I'm not sure if the alias is already known at that point or if that
requires a network cycle to read as well, but even if the latter it means
one read instead of at least three.)

Of course, I might be a little biased since as I mentioned before I usually
configure aliases on all slaves. :)

>> I haven't had a chance to test things locally yet, but at least
everything is
>> compiling ok with these patches. :)

I've given it some minimal testing now, and I'm happy to report that in a
network with about 10 slaves (all with serial numbers) this reduces the
total rescan time from about 45 seconds to about 2, at least for subsequent
scans.  (Numbers are vague because the test conditions weren't quite
identical in each case.)

Although the SDO dictionary patch means that I'm not really testing your
mailbox patches anymore, because dictionary vs. other SDO requests were the
main cause of mailbox conflicts that I see with an unpatched master.  (I'm
not using EoE, which is the other main source of conflicts.)  That's also a
good patch, as the dictionary scan of those 10 slaves takes about two
minutes, and normally the information isn't required for standard running,
only when commissioning.  (It's a little disconcerting to see "ethercat
sdos" just sit apparently dead for a few minutes though.  Maybe it needs
some kind of progress reporting.  Although it's not as bad when limited to a
single slave, which is probably the more common use case.)


___
etherlab-dev mailing list
etherlab-dev@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-dev


Re: [etherlab-dev] Multiple mailbox protocols and other issues

2015-02-09 Thread Knud Baastrup
Thanks Gavin, see my inline comments below!

BR, Knud

-Original Message-
From: Gavin Lambert [mailto:gav...@compacsort.com] 
Sent: 9. februar 2015 08:44
To: Knud Baastrup
Cc: etherlab-dev@etherlab.org
Subject: RE: [etherlab-dev] Multiple mailbox protocols and other issues

On 2 February 2015 20:18, you quoth:
> 
> I will just update you on some additional pathes I have prepared for
EtherCAT
> Master. I have attached the complete set of patches that we currently 
> use, but only the below patches have been updated added.

I'm just having a more detailed look through these patches now, and there's a 
few niggles. ;)

1. There are several files that appear to have tabs in them; it's usually a 
good idea when sharing patches with others to use spaces only, as different 
people/editors use different tab sizes.
[Knud] I fully agree, I will update and try to avoid in future patches!

2. There's several diffs in various files (eg. 12_sdo_directory.patch's
master/ioctl.h) that contain only whitespace changes on various lines for no 
readily apparent reason.  These sorts of things can cause unnecessary conflicts 
and hide the true intent of the patch.  This may have been the result of a 
space -> tab -> space conversion gone wrong.
[Knud] I fully agree, I will update and try to avoid in future patches!

3. This is optional, but I think it's good style to include a short text 
description at the top of each patch file, which can act as a commit message, 
and helps people reading the patch later without having to hunt down the 
original email.  (If you're using HG MQ it does this automatically; I think git 
format-patch will also do this for you if you have a branch structured with one 
commit per patch, though it also adds quite a bit of extra email-header junk.)
[Knud] I have the patches structured with one commit per patch so using 
git-format-patch will work fine. I will apply.

4. Regarding 13_domain_lock.patch, I believe the original rationale of the 
master is that locking between concurrent application tasks is the 
responsibility of the application, not the master -- that's why in kernelspace 
it has send/receive callbacks (formerly lock/unlock callbacks) so that eg. RTAI 
locks can be substituted, or locking can be avoided if the application has some 
other way to schedule things to avoid actual concurrency (or if it's only 
running a single task).  See the "Concurrent Master Access" section in the 
docs.  I don't have any personal objections to this patch though.
[Knud] Yes, we just faced some cases where the application developers did not 
include the necessary locking, which have quite severe impact for the complete 
system. We have not used RTAI, but I am not sure I understand why the extra 
locks become a problem for RTAI?

5. Regarding 14_fix_string_handling.patch, I don't think this is the "right"
fix.  I've attached Frank's 04 patch which fixes this a different way.
[Knud] I will talk with my colleague Per and check if we can revert this patch 
and instead use the one provided by Frank.

6. Regarding 16_improved_ethercat_rescan_performance.patch, it looks like a 
stray temporary file was included in the patch.  Also, I'm not sure it's safe 
to retrieve the data only by serial number.  Serial numbers are not guaranteed 
unique between vendors, or even between product lines -- I think at minimum you 
should include the vendor id and product code in the index.
Also, possibly this should have a #define config guard to disable this 
functionality in case the master will be used at a site with pathological 
slaves (eg. multiple slaves with identical non-zero vendor/product/serial 
triplets, since *technically* they're not guaranteed unique at all -- although 
any slave vendor who does this deserves a kick).
[Knud] Yes sorry, my mistake with the temporary file. I can only agree with you 
that vendor and productcode should be included in the index in order for this 
patch to be used in large scale, I will add this. I can also agree with the 
#define guard.

I haven't had a chance to test things locally yet, but at least everything is 
compiling ok with these patches. :)

___
etherlab-dev mailing list
etherlab-dev@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-dev


Re: [etherlab-dev] Multiple mailbox protocols and other issues

2015-02-08 Thread Gavin Lambert
On 2 February 2015 20:18, you quoth:
> 
> I will just update you on some additional pathes I have prepared for
EtherCAT
> Master. I have attached the complete set of patches that we currently use,
> but only the below patches have been updated added.

I'm just having a more detailed look through these patches now, and there's
a few niggles. ;)

1. There are several files that appear to have tabs in them; it's usually a
good idea when sharing patches with others to use spaces only, as different
people/editors use different tab sizes.

2. There's several diffs in various files (eg. 12_sdo_directory.patch's
master/ioctl.h) that contain only whitespace changes on various lines for no
readily apparent reason.  These sorts of things can cause unnecessary
conflicts and hide the true intent of the patch.  This may have been the
result of a space -> tab -> space conversion gone wrong.

3. This is optional, but I think it's good style to include a short text
description at the top of each patch file, which can act as a commit
message, and helps people reading the patch later without having to hunt
down the original email.  (If you're using HG MQ it does this automatically;
I think git format-patch will also do this for you if you have a branch
structured with one commit per patch, though it also adds quite a bit of
extra email-header junk.)

4. Regarding 13_domain_lock.patch, I believe the original rationale of the
master is that locking between concurrent application tasks is the
responsibility of the application, not the master -- that's why in
kernelspace it has send/receive callbacks (formerly lock/unlock callbacks)
so that eg. RTAI locks can be substituted, or locking can be avoided if the
application has some other way to schedule things to avoid actual
concurrency (or if it's only running a single task).  See the "Concurrent
Master Access" section in the docs.  I don't have any personal objections to
this patch though.

5. Regarding 14_fix_string_handling.patch, I don't think this is the "right"
fix.  I've attached Frank's 04 patch which fixes this a different way.

6. Regarding 16_improved_ethercat_rescan_performance.patch, it looks like a
stray temporary file was included in the patch.  Also, I'm not sure it's
safe to retrieve the data only by serial number.  Serial numbers are not
guaranteed unique between vendors, or even between product lines -- I think
at minimum you should include the vendor id and product code in the index.
Also, possibly this should have a #define config guard to disable this
functionality in case the master will be used at a site with pathological
slaves (eg. multiple slaves with identical non-zero vendor/product/serial
triplets, since *technically* they're not guaranteed unique at all --
although any slave vendor who does this deserves a kick).

I haven't had a chance to test things locally yet, but at least everything
is compiling ok with these patches. :)



frank_04-string-download.patch
Description: Binary data
___
etherlab-dev mailing list
etherlab-dev@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-dev


Re: [etherlab-dev] Multiple mailbox protocols and other issues

2015-02-03 Thread Knud Baastrup
Thanks Gavin, I was just curious to know if this "master configurator" were a 
tool that I did not know about.

/Knud


-Original Message-
From: Gavin Lambert [mailto:gav...@compacsort.com] 
Sent: 3. februar 2015 08:26
To: Knud Baastrup
Cc: etherlab-dev@etherlab.org
Subject: RE: [etherlab-dev] Multiple mailbox protocols and other issues

On 3 February 2015 20:08, quoth Knud Baastrup:
> Yes, I decided not to use the alias as matching criteria for the slave
data
> due to the "tree points" intended use of aliases. Currently we do not 
> use aliases on our modules, but we are planning to introduce this in 
> the near future to prevent that some modules get the wrong 
> configuration if a rack
of
> modules (with same vendor and product code as the following rack) are 
> disconnected due to wire/power break. What do you mean by the "master 
> configurator" ?

Depending on context, either the software that sets up the network layout 
(assigning aliases, saving persistent parameters, etc), or the person in charge 
of doing that work for a particular installation.  Depending on how your 
application and network operate, sometimes that's an explicit step, and 
sometimes it's part of the application.

Since I'm a lazy person, when our modules are assigned a serial number during 
production they get that assigned as their alias as well (although they can 
also be explicitly assigned a different alias if the network configurator 
wishes), which simplifies network configuration quite a bit.
But then, these are discrete units rather than "racks" so there's a higher 
chance they'll get wired in an unexpected order, so I think this provides the 
best compromise in network flexibility.  YMMV.


___
etherlab-dev mailing list
etherlab-dev@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-dev


Re: [etherlab-dev] Multiple mailbox protocols and other issues

2015-02-02 Thread Gavin Lambert
On 3 February 2015 20:08, quoth Knud Baastrup:
> Yes, I decided not to use the alias as matching criteria for the slave
data
> due to the "tree points" intended use of aliases. Currently we do not use
> aliases on our modules, but we are planning to introduce this in the near
> future to prevent that some modules get the wrong configuration if a rack
of
> modules (with same vendor and product code as the following rack) are
> disconnected due to wire/power break. What do you mean by the "master
> configurator" ?

Depending on context, either the software that sets up the network layout
(assigning aliases, saving persistent parameters, etc), or the person in
charge of doing that work for a particular installation.  Depending on how
your application and network operate, sometimes that's an explicit step, and
sometimes it's part of the application.

Since I'm a lazy person, when our modules are assigned a serial number
during production they get that assigned as their alias as well (although
they can also be explicitly assigned a different alias if the network
configurator wishes), which simplifies network configuration quite a bit.
But then, these are discrete units rather than "racks" so there's a higher
chance they'll get wired in an unexpected order, so I think this provides
the best compromise in network flexibility.  YMMV.


___
etherlab-dev mailing list
etherlab-dev@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-dev


Re: [etherlab-dev] Multiple mailbox protocols and other issues

2015-02-02 Thread Knud Baastrup
Yes, I decided not to use the alias as matching criteria for the slave data due 
to the "tree points" intended use of aliases. Currently we do not use aliases 
on our modules, but we are planning to introduce this in the near future to 
prevent that some modules get the wrong configuration if a rack of modules 
(with same vendor and product code as the following rack) are disconnected due 
to wire/power break. What do you mean by the "master configurator" ?

BR, Knud

-Original Message-
From: etherlab-dev [mailto:etherlab-dev-boun...@etherlab.org] On Behalf Of 
Gavin Lambert
Sent: 3. februar 2015 02:00
To: 'Graeme Foot'
Cc: etherlab-dev@etherlab.org
Subject: Re: [etherlab-dev] Multiple mailbox protocols and other issues

On 3 February 2015 13:38, quoth Graeme Foot:
> Sent: Tuesday, 3 February 2015 13:38
> To: Gavin Lambert; 'Knud Baastrup'
> Cc: etherlab-dev@etherlab.org
> Subject: RE: [etherlab-dev] Multiple mailbox protocols and other 
> issues
> 
> From: etherlab-dev [mailto:etherlab-dev-boun...@etherlab.org] On 
> Behalf Of Gavin Lambert
> > On 2 February 2015 20:18, quoth Knud Baastrup:
> > > 16_improved_ethercat_rescan_performance.patch:
> > > The SII data and PDOs will now be stored when the EtherCAT master 
> > > is in
> > its
> > > operation phase. The stored SII data and PDOs will be detached 
> > > from the slaves prior to a scanning and re-attached during the 
> > > scanning without the need to fetch the SII data and PDOs once 
> > > again. The SII data and PDOs will however only be stored if the 
> > > slave have a serial number defined as this serial number will be 
> > > used when re-attaching
the
> SII data and PDOs.
> >
> > Ooh, thanks for that one.  That's one of the performance holes that 
> > I
was
> planning on investigating myself soonish.
> >
> 
> None of my modules seem to have serial numbers (Beckhoff IO and 
> yaskawa amps), or is that a bug that's got a patch?  I'm running the 
> original
1.5.2
> (+ misc patches).

It's not a bug, or really related to the master software at all.

It's up to the device manufacturer whether a given device has a serial number 
or not.  In my case they do, but I'm mostly using in-house hardware.
Beckhoff modules as a general rule don't seem to have serial numbers.

This is separate from the "alias" which is used in network addressing and is 
typically (but optionally) set by the network designer via the master 
configurator.  The EtherLab master lacks specific commands to set aliases but 
it can recognise aliases set by other masters, and some slaves support a CoE 
download to reconfigure their alias (which can be accessed via the command 
line), and some others have dipswitches.  Generally aliases are only needed at 
"tree points" in the network graph, so if you're only using simple chains 
they're less useful.

Technically the serial number can be altered by the master / network designer 
as well via an EEPROM download, but this is less "encouraged".

___
etherlab-dev mailing list
etherlab-dev@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-dev
___
etherlab-dev mailing list
etherlab-dev@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-dev


Re: [etherlab-dev] Multiple mailbox protocols and other issues

2015-02-02 Thread Gavin Lambert
On 3 February 2015 13:38, quoth Graeme Foot:
> Sent: Tuesday, 3 February 2015 13:38
> To: Gavin Lambert; 'Knud Baastrup'
> Cc: etherlab-dev@etherlab.org
> Subject: RE: [etherlab-dev] Multiple mailbox protocols and other issues
> 
> From: etherlab-dev [mailto:etherlab-dev-boun...@etherlab.org] On Behalf Of
> Gavin Lambert
> > On 2 February 2015 20:18, quoth Knud Baastrup:
> > > 16_improved_ethercat_rescan_performance.patch:
> > > The SII data and PDOs will now be stored when the EtherCAT master is
> > > in
> > its
> > > operation phase. The stored SII data and PDOs will be detached from
> > > the slaves prior to a scanning and re-attached during the scanning
> > > without the need to fetch the SII data and PDOs once again. The SII
> > > data and PDOs will however only be stored if the slave have a serial
> > > number defined as this serial number will be used when re-attaching
the
> SII data and PDOs.
> >
> > Ooh, thanks for that one.  That's one of the performance holes that I
was
> planning on investigating myself soonish.
> >
> 
> None of my modules seem to have serial numbers (Beckhoff IO and yaskawa
> amps), or is that a bug that's got a patch?  I'm running the original
1.5.2
> (+ misc patches).

It's not a bug, or really related to the master software at all.

It's up to the device manufacturer whether a given device has a serial
number or not.  In my case they do, but I'm mostly using in-house hardware.
Beckhoff modules as a general rule don't seem to have serial numbers.

This is separate from the "alias" which is used in network addressing and is
typically (but optionally) set by the network designer via the master
configurator.  The EtherLab master lacks specific commands to set aliases
but it can recognise aliases set by other masters, and some slaves support a
CoE download to reconfigure their alias (which can be accessed via the
command line), and some others have dipswitches.  Generally aliases are only
needed at "tree points" in the network graph, so if you're only using simple
chains they're less useful.

Technically the serial number can be altered by the master / network
designer as well via an EEPROM download, but this is less "encouraged".


___
etherlab-dev mailing list
etherlab-dev@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-dev


Re: [etherlab-dev] Multiple mailbox protocols and other issues

2015-02-02 Thread Graeme Foot
From: etherlab-dev [mailto:etherlab-dev-boun...@etherlab.org] On Behalf Of 
Gavin Lambert
> On 2 February 2015 20:18, quoth Knud Baastrup:
> > 16_improved_ethercat_rescan_performance.patch:
> > The SII data and PDOs will now be stored when the EtherCAT master is 
> > in
> its
> > operation phase. The stored SII data and PDOs will be detached from 
> > the slaves prior to a scanning and re-attached during the scanning 
> > without the need to fetch the SII data and PDOs once again. The SII 
> > data and PDOs will however only be stored if the slave have a serial 
> > number defined as this serial number will be used when re-attaching the SII 
> > data and PDOs.
>
> Ooh, thanks for that one.  That's one of the performance holes that I was 
> planning on investigating myself soonish.
>


Hi,

None of my modules seem to have serial numbers (Beckhoff IO and yaskawa amps), 
or is that a bug that's got a patch?  I'm running the original 1.5.2 (+ misc 
patches).

G.
___
etherlab-dev mailing list
etherlab-dev@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-dev


Re: [etherlab-dev] Multiple mailbox protocols and other issues

2015-02-02 Thread Gavin Lambert
On 2 February 2015 20:18, quoth Knud Baastrup:
> 16_improved_ethercat_rescan_performance.patch:
> The SII data and PDOs will now be stored when the EtherCAT master is in
its
> operation phase. The stored SII data and PDOs will be detached from the
> slaves prior to a scanning and re-attached during the scanning without the
> need to fetch the SII data and PDOs once again. The SII data and PDOs will
> however only be stored if the slave have a serial number defined as this
> serial number will be used when re-attaching the SII data and PDOs.

Ooh, thanks for that one.  That's one of the performance holes that I was
planning on investigating myself soonish.

(A somewhat related one is that when a slave drops to SAFEOP+ERROR as a
result of a comms watchdog error, it *should* be safe for the master to
bring it straight up to OP, especially when the slave has a serial and can
be unambiguously identified, but currently it's doing a full back-to-PREOP
reconfigure.)

I've been meaning to post the full patchset that I'm using at the moment (in
the hopes that a few pieces at least can get integrated), but I'm still
investigating a few issues (and working on unrelated things), so it's not
quite ready yet.  Although (at least partly due to inertia) I'm currently
using Frank's mailbox patches rather than yours. :)


___
etherlab-dev mailing list
etherlab-dev@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-dev


Re: [etherlab-dev] Multiple mailbox protocols and other issues

2014-08-08 Thread Knud Baastrup
Thanks for your feed-back Gavin!

I have in-lined some comments!

Patch 1 to 5 were prepared by a colleague some time back but were at that time 
not forwarded to the etherlab-dev.

Mvh. Knud

-Original Message-
From: Gavin Lambert [mailto:gav...@compacsort.com] 
Sent: 6. august 2014 09:41
To: Knud Baastrup
Cc: 'Florian Pose'; etherlab-dev@etherlab.org
Subject: RE: [etherlab-dev] Multiple mailbox protocols and other issues

On 11 July 2014, quoth Knud Baastrup:
> Patch 2 (ethercat_152_patch_2_foe.patch):
> Wrong datagram used for timeout calculation.

You'll be happy to learn that this patch is not required if you use the latest 
code; it was fixed in 8bb574da5da2.
[Knud] Great, we use the ec403cf308eb version released 2013-02-12 as baseline 
and yes I can see that the fix were provided 2 days later in 8bb574da5da2 by 
2013-02-14.

> Patch 4 (ethercat_152_patch_4_eoe_mac.patch)
> Change the MAC address for eoe0sY devices to real local administrated 
> MAC addresses based on the NIC part of eth0 and the EoE slaves ring 
> position.

In the line "if (ETH_ALEN >= 5)", shouldn't that be > 5 or >= 6?  Also, if this 
test fails (which seems unlikely, but then why test if it can't happen?), this 
will leave the address uninitialized, which seems undesirable.  Maybe it should 
fall back to the prior code in that case?
[Knud] I agree that it should be either > 5 or >= 6. I will correct and test.

> Patch 5 (ethercat_152_patch_5_priority_inheritance.patch)
> Replaced semaphores with mutexes to utilize priority inheritance and 
> limit impact from lower priority tasks (EtherCAT-EOE) running as 
> sched_other task.

While I like the idea of using RT-mutexes, they do have a minimum kernel 
version (I'm not sure exactly what that is except that it's somewhere in the 
2.6.x series) and currently Etherlab provides drivers for some kernel versions 
that are before that cutoff, I suspect.  (And do rt_mutexes and RTAI play 
nicely together or not?)  So this might break compatibility, and so possibly 
should be a configure option.  Or maybe nobody cares about those older kernel 
versions any more?  (I don't personally, I'm just wondering if it might be a 
concern.)
[Knud] I agree that it might need to be a configurable option. I do not know 
how to prepare that in a proper way and will leave it Florian or others to 
decide.

Also I found a few cases where "down" and "up" hadn't been changed to 
"rt_mutex_lock" etc.  Not sure if this was the result of a patch application 
failure or if this was code added since the patches' base version.
[Knud] Probably because I have used ec403cf308eb as base version for the 
patches.

> Patch 6 (ethercat_152_patch_6_mailbox.patch)
> Alternative solution to Patch 9-10-11 provided by Frank Heckenbach for
> 1.5.0 (that I did not succeed to get up running on 1.5.2).

Did you look at the updated-to-1.5.2 patches that I posted?
[Knud] The ones named ethercat-1.5.0-patches-v2.tar.bz2 ? No I was at that 
point quite happy with my current alternative implementation.

> In this solution I accept that a mailbox read request (e.g. FP-RD) for 
> a given mailbox protocol can return data from any other mailbox 
> protocol running at the same time.  [...]

In master/master.c near line 1240, on receipt of a datagram you're searching 
through the slave list to check its mailbox settings.  This code appears unsafe 
in the case when this search fails (which presumably could occur eg.
if a datagram with a corrupted address arrives or if slave scanning has just 
started and cleared master->slaves).
[Knud] I agree, I must ensure that the datagram is ignored if no matching 
station address. In the current patch such a datagram will be managed by the 
last slave and that were not the intention. I will correct and test.

Also this seems to generate quite a lot of "Await configured mailbox address" 
spam at debug level 1 even when no mailbox activity is taking place..?
[Knud] I will remove the debug message, it is not that relevant as you will 
know from other messages if mailbox offset address configuration fails for a 
slave. I will correct and test.

When compiling with a recent kernel version (3.13) I needed to #include 
 in master/slave.h in order to compile several files (the 
first is master/fmmu_config.c).  This doesn't appear to be needed in 3.2 
however (or by some of the other .c files); I guess the indirect includes have 
changed?
[Knud] I have been testing with 3.4.37-rt51

> Patch 10 (ethercat_152_patch_10_scan_skip_stats.patch)
> No reason to write output statistics in syslog when issuing a slave
scanning 
> where UNMATCHED datagrams are expected behavior.

I'm not sure I follow this one.  Why are unmatched datagrams expected?
Also, this patch isn't going to stop them printing, just delay it un

Re: [etherlab-dev] Multiple mailbox protocols and other issues

2014-08-06 Thread Gavin Lambert
On 11 July 2014, quoth Knud Baastrup:
> Patch 2 (ethercat_152_patch_2_foe.patch):
> Wrong datagram used for timeout calculation.

You'll be happy to learn that this patch is not required if you use the
latest code; it was fixed in 8bb574da5da2.

> Patch 4 (ethercat_152_patch_4_eoe_mac.patch)
> Change the MAC address for eoe0sY devices to real local 
> administrated MAC addresses based on the NIC part of eth0 and 
> the EoE slaves ring position.

In the line "if (ETH_ALEN >= 5)", shouldn't that be > 5 or >= 6?  Also, if
this test fails (which seems unlikely, but then why test if it can't
happen?), this will leave the address uninitialized, which seems
undesirable.  Maybe it should fall back to the prior code in that case?

> Patch 5 (ethercat_152_patch_5_priority_inheritance.patch)
> Replaced semaphores with mutexes to utilize priority inheritance 
> and limit impact from lower priority tasks (EtherCAT-EOE) running 
> as sched_other task. 

While I like the idea of using RT-mutexes, they do have a minimum kernel
version (I'm not sure exactly what that is except that it's somewhere in the
2.6.x series) and currently Etherlab provides drivers for some kernel
versions that are before that cutoff, I suspect.  (And do rt_mutexes and
RTAI play nicely together or not?)  So this might break compatibility, and
so possibly should be a configure option.  Or maybe nobody cares about those
older kernel versions any more?  (I don't personally, I'm just wondering if
it might be a concern.)

Also I found a few cases where "down" and "up" hadn't been changed to
"rt_mutex_lock" etc.  Not sure if this was the result of a patch application
failure or if this was code added since the patches' base version.

> Patch 6 (ethercat_152_patch_6_mailbox.patch)
> Alternative solution to Patch 9-10-11 provided by Frank Heckenbach for 
> 1.5.0 (that I did not succeed to get up running on 1.5.2).

Did you look at the updated-to-1.5.2 patches that I posted?

> In this solution I accept that a mailbox read request (e.g. FP-RD) for 
> a given mailbox protocol can return data from any other mailbox protocol 
> running at the same time.  [...]

In master/master.c near line 1240, on receipt of a datagram you're searching
through the slave list to check its mailbox settings.  This code appears
unsafe in the case when this search fails (which presumably could occur eg.
if a datagram with a corrupted address arrives or if slave scanning has just
started and cleared master->slaves).

Also this seems to generate quite a lot of "Await configured mailbox
address" spam at debug level 1 even when no mailbox activity is taking
place..?

When compiling with a recent kernel version (3.13) I needed to #include
 in master/slave.h in order to compile several files (the
first is master/fmmu_config.c).  This doesn't appear to be needed in 3.2
however (or by some of the other .c files); I guess the indirect includes
have changed?

> Patch 10 (ethercat_152_patch_10_scan_skip_stats.patch)
> No reason to write output statistics in syslog when issuing a slave
scanning 
> where UNMATCHED datagrams are expected behavior.

I'm not sure I follow this one.  Why are unmatched datagrams expected?
Also, this patch isn't going to stop them printing, just delay it until the
scan completes.

Otherwise, this patchset seems to work ok.  I've only given it fairly basic
testing thus far though.

However there was a bit of fuzz and other conflicts when trying to apply
these to the latest HG source, so in the interests of making it easier for
Florian and anyone else using the latest source to examine and test these
patches, I've attached updated versions.  These have only had the following
changes:
 - de-fuzzed based on 8dd49f6f6d32 (close to stable-1.5 tip).
 - various tabs converted to whitespace and related minor whitespace
changes.
   - I noticed some inconsistent newline-brace styles but I left those
alone.
 - omitted patch 2 as specified above (it was empty after defuzzing).
 - converted a few extra down/ups in patch 5 as specified above.
 - added #include to master/slave.h in patch 6 as specified above.

In particular I've basically just included changes required to compile; I
haven't tried to fix any of the other possible issues mentioned above.

I've also included a series file, so in theory it should immediately be
MQ-compatible if extracted into the .hg dir.  (You might need to "hg qq -c
knud" first.)

Regards,
Gavin Lambert



patches-knud.tar.gz
Description: Binary data
___
etherlab-dev mailing list
etherlab-dev@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-dev


Re: [etherlab-dev] Multiple mailbox protocols and other issues

2014-07-10 Thread Florian Pose
Hello Kund,

On Thu, Jul 10, 2014 at 02:04:20PM +, Knud Baastrup wrote:
> I have attached the complete list of patches that we use for our IgH EtherCAT
> master:

thanks you very much. I will inspect them and come back to you.

-- 
Best regards,
Florian Pose

http://etherlab.org
___
etherlab-dev mailing list
etherlab-dev@etherlab.org
http://lists.etherlab.org/mailman/listinfo/etherlab-dev