Wolfgang Grandegger wrote: > On 07/25/2010 04:48 PM, Marc Kleine-Budde wrote: >> Wolfgang Grandegger wrote: >>> On 07/25/2010 03:35 PM, Marc Kleine-Budde wrote: >>>> Wolfgang Grandegger wrote: >>>>> I found some time to work on this issue. >>>>> >>>>> So far we only report CAN error state changes to the worse: >>>>> "error-active -> error-warning -> error-passive -> bus-off", but not >>>>> "bus-off -> error-active" or "error-passive -> error-warning -> >>>>> error-active". There have been request from other users to correct >>>>> this behavior. >>>>> >>>>> This patch reports any state change reported by the SJA1000 and >>>>> MSCAN-MPC5200 controller. It introduces "CAN_ERR_CRTL_ACTIVE" to >>>>> signal state changes to error active. Furthermore, the SJA1000 >>>>> triggers now the bus-off recovery procedure when restarted. So far, >>>>> the controller was *stopped* and then restarted. I did two tests to >>>>> analyze the state changes behavior: >>>>> >>>>> 1. First I forced the controller into the error-passive by sending >>>>> a message (with cansend) without cable connected (*). Then I >>>>> reconnected the cable (**) and continued to send messages (with >>>>> cangen) (***) until the error active state is reached: >>>>> >>>>> On SJA1000: >>>>> >>>>> # candump any,0:0,#FFFFFFFF >>>>> (*) >>>>> can0 20000004 [8] 00 08 00 00 00 00 60 00 ERROR-WARNING >>>>> can0 20000004 [8] 00 20 00 00 00 00 80 00 ERROR-PASSIVE >>>>> (**) >>>>> can0 123 [4] DE AD BE EF >>>>> (***) >>>>> can0 42A [1] 00 >>>>> ... >>>>> can0 42A [1] 07 >>>>> can0 20000004 [8] 00 08 00 00 00 00 7F 00 ERROR-WARNING >>>> Is it possible to tell apart whether the driver goes from passive to >>>> warning or from active to warning? Without the need to have a state >>>> machine in userspace, too? >>> Yes, I realized that side effect as well. The user must monitor state >>> changes to understand if the state gets worse or better. We could use an >>> error bit to indicate state improvements. >> I like that idea! From naive point of view, in the userspace I want >> information that I can interpret stateless (i.e. without the need for a >> statemachine == history). > > I tend to introduce the error class: > > #define CAN_ERR_STATE_CHANGE 0x00000200U /* state changed / data[1] > > apart from > > #define CAN_ERR_CRTL 0x00000004U /* controller problems / data[1] > > For any state change the CAN_ERR_STATE_CHANGE bit is set. If the state > got worse, the CAN_ERR_CRTL bit will be set as well. This would also > ensure backward compatibility.
What about CAN_CTRLMODE_BERR_REPORTING? How strict should it be
interpreted? In the flexcan driver, if CAN_CTRLMODE_BERR_REPORTING is
not active I don't send any bus error related error frames. Even if the
napi function is entered and the bus error bits are set!
>>> No need for another state machine. But a common function is still useful
>>> to fill an error frame for the state change, of course. I will try to
>>> come up with a proposal soon.
>> I think there's need for (another) state machine, let me try to explain.
>>
>> In the interrupt or napi handler each driver determines the current can
>> state.
>> This can vary from a simple: read register and translate into
>> CAN_STATE_*, to a more complex function like you've written above. This
>> functions can even have fixups for strange hardware, like the MSCAN: The
>> mscan may think it's still in error warning, but we know tx/rx error
>> level below 96 means error active.
>>
>> If the can state has changed the driver calls the
>> "fill-out-error-frame-function". This function will take care of strange
>> hardware again. E.g. if the can state changes from error passive to
>> error active the function can generate the missing to-error-warning
>> meesage. (If this is wanted). The same goes for the
>> state-is-getting-worse messages.
>
> OK, the prototype of a common function could be (similar to your
> do_state function in the flexcan driver):
>
> int can_handle_state_change(struct net_device *dev,
> struct can_frame *cf,
> enum can_state new_state,
> int tx_err, int rx_err)
I'd drop the "struct can_frame *cf", the function can take care of
allocating the frame and returning in case of an error. But this would
break you proposed detect "getting worse".
Apart from this, {tx,rx}_err can be derived via the
priv->can.do_get_berr_counter function.
> If "tx_err" or "rx_err == -1", they are ignored. If "new_state ==
If priv->can.do_get_berr_counter if NULL, don't use it.
> CAN_STATE_UNKNOWN", the state could be derived from tx_err or rx_err.
> What do you think?
The CAN_STATE_UNKNOWN fallback is a good idea.
Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Industrial Linux Solutions | Phone: +49-231-2826-924 |
Vertretung West/Dortmund | Fax: +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de |
signature.asc
Description: OpenPGP digital signature
_______________________________________________ Socketcan-core mailing list [email protected] https://lists.berlios.de/mailman/listinfo/socketcan-core
