Hi Christian,

I am posting this on the socketcan-core list.

On Friday 24 September 2010 10:41:15 am christian pellegrin wrote:
> > 1.- The MCP2515 driver (from current mainline linux) makes quite a lot of
> > SPI transfers from inside the ISR. I counted 10 SPI-transfers for each
> > CAN message sent or received. This can be reduced a lot. For instance,
> > the INTF and EFLG registers can be read and cleared both in just two
> > transactions. Also, if the second receive buffer isn't filled, there is
> > no need to first clear the interrupt flag of buffer 0 and then clear them
> > all in two separate transactions.
> 
> nice idea, but have you checked that this works on mcp2510 too? (I
> think it has a reduced instruction set on SPI).

Yes, that part should be no problem for the MCP2510, since it also supports
automatic address increment.
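
To illustrate the idea (this is a host-side sketch, not the code from the
patch): with the READ instruction (0x03), every extra byte clocked after the
address returns the next register, so CANINTF (0x2C) and EFLG (0x2D) come back
in a single 4-byte transaction. The regs[] array and fake_spi_transfer() are
hypothetical stand-ins for the chip and the SPI bus, so the example can run
without hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INSTRUCTION_READ 0x03  /* SPI READ instruction */
#define CANINTF          0x2C  /* interrupt flag register */
#define EFLG             0x2D  /* error flag register, right after CANINTF */

/* Hypothetical stand-in for the chip's register file (simulation only). */
static uint8_t regs[128];

/*
 * Hypothetical stand-in for one full-duplex SPI transaction. It emulates
 * only the READ instruction: after the instruction and address bytes,
 * each further byte returns the next register (address auto-increment).
 */
static void fake_spi_transfer(const uint8_t *tx, uint8_t *rx, size_t len)
{
	uint8_t addr;
	size_t i;

	assert(len >= 2 && tx[0] == INSTRUCTION_READ);
	addr = tx[1];
	rx[0] = rx[1] = 0;
	for (i = 2; i < len; i++)
		rx[i] = regs[(addr++) & 0x7F];
}

/* Read CANINTF and EFLG together in a single 4-byte transaction. */
static void read_intf_eflg(uint8_t *intf, uint8_t *eflg)
{
	const uint8_t tx[4] = { INSTRUCTION_READ, CANINTF, 0, 0 };
	uint8_t rx[4];

	fake_spi_transfer(tx, rx, sizeof(tx));
	*intf = rx[2];
	*eflg = rx[3];
}
```

In the real driver the transfer would go through the kernel SPI layer instead
of fake_spi_transfer(); the point is only that one transaction replaces two
separate register reads.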

> > 2.- Only one of the three available TX-boxes is used. This makes the
> > driver simple, but performance is less than it could be for
> > transmit-bursts. Essentially a message is put into the TX-box, and the
> > netif-queue is stopped, then the message is sent, an interrupt is
> > generated, SPI transactions occur to service the interrupt, and only then
> > is the netif queue re-enabled. This introduces a delay between two
> > consecutive CAN messages. In my patch below, I implemented a (very dirty)
> > hack to enable the use of two TX-boxes, and now the driver is able to use
> > the full bandwidth of a 250kbaud link.
> 
> have you checked that there aren't out of order transmission problems?

No, and yes, there most probably are out-of-order transmission problems. I
just want to make clear that patch 1 is meant to be applied as is, since
I am pretty confident that it works, but patch 2 is meant only as an RFC. It is
known to be broken in several ways.

> > 1.- When just started the interface, a few of the first CAN messages to
> > arrive are being discarded, because of some strange race condition (I
> > have NOHZ enabled). I get the following kernel messages (10 times):
> >
> >  NOHZ: local_softirq_pending 08
> 
> I see the same message on my laptop to which I connected an external
> (USB) ISDN adapter. A quick question to google showed me this:
> http://patchwork.ozlabs.org/patch/33630/ which can be related.

Interesting. I haven't checked, but either this patch isn't in mainline yet 
(one year later), or this isn't the same problem...

> > 2.- The second problem was introduced by my hacks, and I am currently
> > investigating why this occurs: If the system is heavily loaded, I
> > sometimes get the following error message (no CAN-messages are lost):
> >
> >  mcp251x spi0.2: hard_xmit called while tx busy
> >
> > I have carefully checked the conditions when the netif-queue is started
> > and stopped, and have yet to discover why this message gets printed.
> 
> I had just a quick glance at your patch but this seems possible for the
> sequence:
> 
> mcp251x_hard_start_xmit -> tx_skb full, tx_busy_mask 0
> mcp251x_tx_work_handler -> tx_skb free, tx_busy_mask 1
> mcp251x_hard_start_xmit -> tx_skb full, tx_busy_mask 1
> mcp251x_tx_work_handler -> tx_skb free, tx_busy_mask 3
> mcp251x_can_ist -> tx_skb free, tx_busy_mask 2
> mcp251x_hard_start_xmit -> tx_skb full, tx_busy_mask 2
> mcp251x_can_ist -> tx_skb full, tx_busy_mask 0
> mcp251x_hard_start_xmit -> has tx_skb at the beginning full so the
> warning you reported

Hmmm. So I guess we do need an extra skb (I wanted to avoid that), or change 
the end of the ISR from:

                netif_wake_queue(net);

to:

                if (!priv->tx_skb)
                        netif_wake_queue(net);

But in that case, I need to check if there isn't a scenario where the netif-
queue stays stopped forever... I'll try this later.
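
To reason about the guarded wake-up, here is a toy host-side model (not driver
code). All toy_* names are made up for illustration; only the tx_skb slot and
the stopped/running state of the netif-queue mirror the driver. It assumes the
ISR for an earlier frame can finish before the work handler has emptied
tx_skb, as in the sequence above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not driver code: every name here is hypothetical. */
struct toy_priv {
	bool tx_skb_full;    /* models priv->tx_skb != NULL */
	bool queue_stopped;  /* models the netif-queue state */
	int busy_warnings;   /* counts "hard_xmit called while tx busy" */
};

/* Models the stack handing a frame to the driver. */
static void toy_hard_start_xmit(struct toy_priv *p)
{
	assert(!p->queue_stopped);  /* only called while the queue runs */
	if (p->tx_skb_full) {
		p->busy_warnings++;     /* slot still occupied: the warning */
		return;
	}
	p->tx_skb_full = true;
	p->queue_stopped = true;
}

/* Models the work handler moving the frame from tx_skb into a TX-box. */
static void toy_tx_work(struct toy_priv *p, bool wake)
{
	p->tx_skb_full = false;
	if (wake)
		p->queue_stopped = false;
}

/* Models the end of the ISR, with or without the proposed guard. */
static void toy_isr_end(struct toy_priv *p, bool guarded)
{
	if (!guarded || !p->tx_skb_full)
		p->queue_stopped = false;
}
```

Calling toy_hard_start_xmit(), then toy_isr_end() unguarded, then
toy_hard_start_xmit() again triggers the warning in this model. With the guard
the queue stays stopped instead, so it only restarts if the work handler wakes
it (toy_tx_work() with wake set) or a later ISR finds tx_skb empty. That is
the scenario that needs checking.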

> please check this sequence carefully since I could be mistaken.

I guess you are right.

Best regards,

-- 
David Jander
Protonic Holland.