Hello,
On 07/24/2010 11:11 AM, Alexander Holler wrote:
> Hello,
>
> I'm currently testing an MCP2515 connected to an AT91 board.
>
> Besides some small changes to use a GPIO IRQ, I've come across a
> problem in the driver which leads to an endless IRQ. For the test I've
> used two nodes, a PEAK USB controller and said MCP2515, both configured
> to use a bitrate of 500k. When I now send a simple message from the
> PEAK, the mcp251x driver ends up in an endless IRQ and the MCP2515
> endlessly sends error messages.
>
> Here is some debug output which should describe the problem:
>
> ---------------------------------------------
> peak:
> cansend can0 "5A1#11.2233.44556677.01 / 123#DEADBEEF / 5AA# /"
>
> at91:
> [ 93.730000] mcp251x_can_ist entered
> [ 93.770000] mcp251x_can_ist intf: 0xa0
> [ 93.890000] mcp251x_can_ist eflag: 0xb
> forever:
> [ 93.930000] mcp251x_can_ist intf: 0x80
> [ 94.060000] mcp251x_can_ist eflag: 0xb
> [ 94.100000] mcp251x_can_ist intf: 0x80
> [ 94.220000] mcp251x_can_ist eflag: 0xb
> ...
>
> peak:
> candump any,0:0,#ffffffff
> forever:
> can0 20000004 [8] 00 04 00 00 00 00 00 00 ERRORFRAME
> ---------------------------------------------
This looks like the infamous bus error flooding, which can hang low-end
systems. There seems to be an electrical problem on the bus that triggers
bus errors on the MCP251x. Just run "candump any,0:0,#ffffffff" on the
AT91 system as well.
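For reference, the register values in the log above can be decoded by hand.
The following is only a minimal sketch: the bit positions are taken from the
MCP2515 datasheet (CANINTF and EFLG registers), while the helper names
decode_bits(), decode_canintf() and decode_eflg() are hypothetical and not
part of the driver.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One register bit and its datasheet name. */
struct bit_name {
	uint8_t mask;
	const char *name;
};

/* CANINTF bits of the MCP2515, most significant bit first. */
static const struct bit_name canintf_bits[] = {
	{ 0x80, "MERRF" }, { 0x40, "WAKIF" }, { 0x20, "ERRIF" },
	{ 0x10, "TX2IF" }, { 0x08, "TX1IF" }, { 0x04, "TX0IF" },
	{ 0x02, "RX1IF" }, { 0x01, "RX0IF" },
};

/* EFLG bits of the MCP2515, most significant bit first. */
static const struct bit_name eflg_bits[] = {
	{ 0x80, "RX1OVR" }, { 0x40, "RX0OVR" }, { 0x20, "TXBO" },
	{ 0x10, "TXEP" },   { 0x08, "RXEP" },   { 0x04, "TXWAR" },
	{ 0x02, "RXWAR" },  { 0x01, "EWARN" },
};

/* Hypothetical helper: write the names of all set bits into buf,
 * separated by '|'. buf must hold at least one byte. */
static void decode_bits(uint8_t value, const struct bit_name *table,
			size_t n, char *buf, size_t len)
{
	size_t used = 0;

	buf[0] = '\0';
	for (size_t i = 0; i < n; i++) {
		int w;

		if (!(value & table[i].mask))
			continue;
		w = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", table[i].name);
		if (w < 0 || (size_t)w >= len - used)
			break;	/* output truncated, stop here */
		used += (size_t)w;
	}
}

static void decode_canintf(uint8_t intf, char *buf, size_t len)
{
	decode_bits(intf, canintf_bits,
		    sizeof(canintf_bits) / sizeof(canintf_bits[0]), buf, len);
}

static void decode_eflg(uint8_t eflag, char *buf, size_t len)
{
	decode_bits(eflag, eflg_bits,
		    sizeof(eflg_bits) / sizeof(eflg_bits[0]), buf, len);
}
```

With these tables, intf 0xa0 decodes to MERRF|ERRIF, the repeating intf 0x80
to MERRF alone, and eflag 0xb to RXEP|RXWAR|EWARN. In other words, the
receiver appears to be in the error-passive state while MERRF keeps being
set again.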
> Using the at91_can-driver (and can-hw) instead of the mcp251x on the
> AT91 box I can't reproduce this, so it seems not to be a problem of the
> cabling or the sender (peak) side.
Here too, running "candump any,0:0,#ffffffff" on the AT91 will provide
more information.
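As a rough illustration of what such a candump line carries, the error frame
seen on the PEAK side (ID 20000004, data[1] = 0x04) can be decoded as
sketched below. The constant values mirror those in linux/can/error.h but
are redefined locally so the snippet builds anywhere. The function names
are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in linux/can.h and linux/can/error.h. */
#define CAN_ERR_FLAG            0x20000000U /* frame is an error frame */
#define CAN_ERR_CRTL            0x00000004U /* controller problem, see data[1] */
#define CAN_ERR_CRTL_RX_WARNING 0x04        /* RX error warning level reached */
#define CAN_ERR_CRTL_RX_PASSIVE 0x10        /* RX error-passive state reached */

/* Hypothetical helper: true if the CAN ID marks an error frame. */
static bool is_error_frame(uint32_t can_id)
{
	return (can_id & CAN_ERR_FLAG) != 0;
}

/* Hypothetical helper: true if the error frame reports a controller
 * state problem, in which case data[1] holds the details. */
static bool is_controller_problem(uint32_t can_id)
{
	return is_error_frame(can_id) && (can_id & CAN_ERR_CRTL) != 0;
}

/* Hypothetical helper: describe the RX related bits of data[1]. */
static const char *rx_state(uint8_t data1)
{
	if (data1 & CAN_ERR_CRTL_RX_PASSIVE)
		return "rx error-passive";
	if (data1 & CAN_ERR_CRTL_RX_WARNING)
		return "rx error-warning";
	return "no rx state reported";
}
```

For the frame in the log, is_controller_problem(0x20000004) is true and
rx_state(0x04) returns "rx error-warning". Note that this frame was captured
on the PEAK node, so the same dump from the AT91 side is still needed to see
what the MCP2515 itself reports.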
> Anyway, currently I don't care about the reason for the fault on the
> bus, but I think at least that endless loop is something which should
> be addressed, because it renders the AT91 system almost useless (and
> could be triggered remotely).
>
> My first simple solution for this is currently the following patch:
>
> --------------------------------
> diff --git a/drivers/net/can/mcp251x.c b/drivers/net/can/mcp251x.c
> index b11a0cb..e1a3745 100644
> --- a/drivers/net/can/mcp251x.c
> +++ b/drivers/net/can/mcp251x.c
> @@ -835,7 +835,7 @@ static irqreturn_t mcp251x_can_ist(int irq, void *dev_id)
> }
> }
>
> - if (intf == 0)
> + if ((intf & ~CANINTF_MERRF) == 0)
> break;
>
> if (intf & (CANINTF_TX2IF | CANINTF_TX1IF | CANINTF_TX0IF)) {
> --------------------------------
>
> With that patch the IRQ will be released even if the MERRF is set in the
> CANINTF-register. This will not fix the problem, but at least the system
> is still usable in such a case.
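To make the effect of the one-line change concrete, here is a small
stand-alone sketch of the two exit conditions. It does not touch real
hardware: run_loop() is a hypothetical stand-in for the handler loop, and it
assumes the chip keeps reporting the same CANINTF value on every pass, as in
the log above.

```c
#include <stdbool.h>
#include <stdint.h>

#define CANINTF_MERRF 0x80 /* message error flag, bit 7 of CANINTF */

/* Exit test before the patch: leave only when no flag is set. */
static bool exit_original(uint8_t intf)
{
	return intf == 0;
}

/* Exit test after the patch: MERRF alone no longer keeps the loop alive. */
static bool exit_patched(uint8_t intf)
{
	return (intf & ~CANINTF_MERRF) == 0;
}

/* Hypothetical model of the handler loop. stuck_intf is the value read
 * from CANINTF on every pass; max_iter caps the simulation. Returns the
 * number of passes made. */
static unsigned int run_loop(bool (*should_exit)(uint8_t),
			     uint8_t stuck_intf, unsigned int max_iter)
{
	unsigned int n = 0;

	while (n < max_iter) {
		n++;
		if (should_exit(stuck_intf))
			break;
	}
	return n;
}
```

With CANINTF stuck at 0x80, run_loop(exit_original, 0x80, 1000) uses all
1000 passes, while run_loop(exit_patched, 0x80, 1000) leaves after the first
one. Other flags such as ERRIF (0x20) still keep the patched loop running,
so pending work is not skipped.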
You do not seem to be using the MCP251x driver from the mainline kernel.
There, the MERR interrupt is no longer enabled. Here is the commit message:
commit bf66f3736a945dd4e92d86427276c6eeab0a6c1d
Author: Christian Pellegrin <[email protected]>
Date:   Wed Feb 3 07:39:54 2010 +0000

    can: mcp251x: Move to threaded interrupts instead of workqueues.

    This patch addresses concerns about efficiency of handling incoming
    packets. Handling of interrupts is done in a threaded interrupt handler
    which has a smaller latency than workqueues. This change needed a rework
    of the locking scheme that was much simplified. Some other (more or less
    longstanding) bugs are fixed: utilization of just half of the RX
    buffers, useless wait for interrupt on open, more reliable reset
    sequence. The MERR interrupt is not used anymore: it overloads the CPU
    in error-passive state without any additional information. One shot mode
    is disabled because it's not clear if it can be handled efficiently on
    this CAN controller.

    Signed-off-by: Christian Pellegrin <[email protected]>
    Acked-by: Wolfgang Grandegger <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
>
> I'm no CAN expert, but I think a correct solution would be to restart
> the MCP2515 after a MERRF was received and didn't go away (restart-ms).
>
> The datasheet for the MCP2515 doesn't talk much about MERRF while
> receiving (just a sentence in section 7.4 Message Error Interrupt), and
> the chart for message reception shows why the Error Frames are generated
> endlessly.
>
> Any pointers/suggestions?
See above. Just disable the MERR interrupt.
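As a sketch of what that means at register level: the bit positions below
are those of the CANINTE register in the MCP2515 datasheet, but the choice
of which other interrupts stay enabled is an assumption for illustration,
and both function names are hypothetical.

```c
#include <stdint.h>

/* CANINTE bits of the MCP2515. */
#define CANINTE_MERRE 0x80 /* message error interrupt enable */
#define CANINTE_WAKIE 0x40
#define CANINTE_ERRIE 0x20
#define CANINTE_TX2IE 0x10
#define CANINTE_TX1IE 0x08
#define CANINTE_TX0IE 0x04
#define CANINTE_RX1IE 0x02
#define CANINTE_RX0IE 0x01

/* Hypothetical helper: an enable mask with error, TX and RX interrupts
 * on and MERRE left out (assumed selection). */
static uint8_t caninte_without_merr(void)
{
	return CANINTE_ERRIE | CANINTE_TX2IE | CANINTE_TX1IE |
	       CANINTE_TX0IE | CANINTE_RX1IE | CANINTE_RX0IE;
}

/* Hypothetical helper: drop MERRE from an existing enable mask. */
static uint8_t caninte_clear_merr(uint8_t inte)
{
	return inte & (uint8_t)~CANINTE_MERRE;
}
```

caninte_without_merr() evaluates to 0x3f. Writing such a value to CANINTE
where the driver sets up the chip should stop MERRF from raising the
interrupt line, while ERRIF still reports error state changes.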
Wolfgang.