Re: RFC: cleaning up CAN_ERR_ACK issue

Wolfgang Grandegger Wed, 16 Feb 2011 12:02:23 -0800

On 02/16/2011 08:00 PM, Oliver Hartkopp wrote:
> On 16.02.2011 09:10, Wolfgang Grandegger wrote:
>> On 02/15/2011 09:56 PM, Oliver Hartkopp wrote:
> 
> 
>>> So when CAN_ERR_BUSERROR is set, i may check CAN_ERR_PROT and then into
>>> data [2..3], right?
>>>
>>> So CAN_ERR_ACK would be obsolete ...
>>
>> Yes.
>>
>>> I think, the reason why CAN_ERR_ACK has made it's way into the error class
>>> bits in the CAN-ID was because of missing ACKs usually happen, when there's
>>> only one CAN node on the bus. The idea was to detect missing CAN nodes or
>>> wrong termination directly through CAN_ERR_ACK.
>>
>> Yes, you seem to remember correctly. See:
>>
>> https://lists.berlios.de/pipermail/socketcan-core/2006-May/000089.html
>>
> 
> Oh, the good old days :-)
> 
>> But wrong termination may result in other errors as well and there I
>> think it could not be used reliably for the purpose you describe above.
> 
> Ok. But i think we could leave it as-is. I don't think it's worth to break the
> definition here.


I prefer the clean solution. I don't think it's worth to keep it in the
CAN id. It will just confuse users.

>>> Logically this is redundant to the bitvalue in CAN_ERR_PROT_ACK
> 
> 
>>> Yes. From the current point of view they are. I thought a CAN_ERR_BUSERROR
>>> could be triggered by anything else then protocol violations ...
>>> Then it might have become interesting when using a special filter mask for
>>> error frames.
>>
>> You mean a bit in the CAN id for the five bus error types?
> 
> That's probably too much. 8-)
> I think when CAN_ERR_PROT is set, looking into data[2] is pretty ok.

Yes.

> But in general the idea was to have error and status information in the CAN-ID
> for easy access & easy filtering - and give additional problem information in
> the data[] fields.

We have that. The user can look just for bus errors. In case of problems
you can get various error types and therefore I think it's not useful to
filter them individually.

> For that reason i would prefer to have the CAN_ERR_ACTIVE bit *only* in the
> CAN-ID status bits and not in the detailed data[1] (CAN_ERR_CTRL_ACTIVE) as
> you and Kurt suggested.

OK, but then I would drop CAN_ERR_CTRL_ACTIVE.

>> We then could or even should add a version information into the error
>> message. I think data[5] could be used for that purpose.
> 
> It's like having the CAN core version information where Dave was complaining
> when it gone Mainline. Who should check version numbers and maintain all this?

This would allow users already using the error flags to easily migrate
to the new and changed error types. My idea here is:

        cf->data[5] = CAN_ERR_VERSION;

Also candump could use it to avoid displaying rubbish with old kernels.

> Better we do a final cleanup and then it's done.
> 
> IMO this patch would be enough:
> 
> diff --git a/include/linux/can/error.h b/include/linux/can/error.h
> index d4127fd..6eddd01 100644
> --- a/include/linux/can/error.h
> +++ b/include/linux/can/error.h
> @@ -26,6 +26,7 @@
>  #define CAN_ERR_BUSOFF       0x00000040U /* bus off */
>  #define CAN_ERR_BUSERROR     0x00000080U /* bus error (may flood!) */
>  #define CAN_ERR_RESTARTED    0x00000100U /* controller restarted */
> +#define CAN_ERR_ACTIVE       0x00000200U /* controller (self) recovered */
> 
>  /* arbitration lost in bit ... / data[0] */
>  #define CAN_ERR_LOSTARB_UNSPEC   0x00 /* unspecified */
> @@ -50,7 +51,7 @@
>  #define CAN_ERR_PROT_BIT0        0x08 /* unable to send dominant bit */
>  #define CAN_ERR_PROT_BIT1        0x10 /* unable to send recessive bit */
>  #define CAN_ERR_PROT_OVERLOAD    0x20 /* bus overload */
> -#define CAN_ERR_PROT_ACTIVE      0x40 /* active error announcement */
> +#define CAN_ERR_PROT_ACK         0x40 /* received no ACK on transmission */
>  #define CAN_ERR_PROT_TX          0x80 /* error occured on transmission */
> 
>  /* error in CAN protocol (location) / data[3] */

Furthermore I would like to use:

  s/CAN_ERR_PROT_OVERLOAD/CAN_ERR_PROT_CRC/

for the CRC errors.

> Ok, the bits
> 
>     CAN_ERR_ACK(can_id) and CAN_ERR_PROT_ACK(data[2])
> 
> are set together. But who cares?

Me, see above.

> CAN_ERR_BUSERROR(can_id) could be defined as the or-ed value from
> 
>     (CAN_ERR_LOSTARB | CAN_ERR_PROT | CAN_ERR_CRTL)

What do you mean here. Currently CAN_ERR_BUSERROR behave like
CAN_ERR_PROT but bus errors are *not* state changes or lostarb errors.
In principle we could define:

  #define CAN_ERR_BUSERROR CAN_ERR_PROT

But maybe we leave it as is to allowing for other protocol violations
which are not classical bus errors.

> That looks consistent to me after all the discussions.

I started writing the patch and realized that we need some more
clarification and cleanup. Next point is how we increment the
stats->[r|t]x_error counters? It's not done consistently and it's not
even possible to assign the direction of bus errors properly. Have a
look to the drivers to realizes the mess. Should we do that at all? We
have our own stats of CAN errors:

  struct can_device_stats {
        __u32 bus_error;        /* Bus errors */
        __u32 error_warning;    /* Changes to error warning state */
        __u32 error_passive;    /* Changes to error passive state */
        __u32 bus_off;          /* Changes to bus off state */
        __u32 arbitration_lost; /* Arbitration lost errors */
        __u32 restarts;         /* CAN controller re-starts */
  };

I tend to drop touching the network stats completely. What do you and
others think.

Wolfgang.
_______________________________________________
Socketcan-core mailing list
[email protected]
https://lists.berlios.de/mailman/listinfo/socketcan-core

Re: RFC: cleaning up CAN_ERR_ACK issue

Reply via email to