Hi Christian,

I am posting this on the socketcan-core list.

On Friday 24 September 2010 10:41:15 am christian pellegrin wrote:
> > 1.- The MCP2515 driver (from current mainline linux) makes quite a lot of
> > SPI transfers from inside the ISR. I counted 10 SPI-transfers for each
> > CAN message sent or received. This can be reduced a lot. For instance,
> > the INTF and EFLG registers can be read and cleared both in just two
> > transactions. Also, if the second receive buffer isn't filled, there is
> > no need to first clear the interrupt flag of buffer 0 and then clear them
> > all in two separate transactions.
>
> nice idea, but have you checked that this works on mcp2510 too? (I
> think it has a reduced instruction set on SPI).

Yes, that part should be no problem for the MCP2510, since it also supports
automatic address increase.

> > 2.- Only one of the three available TX-boxes is used. This makes the
> > driver simple, but performance is less than it could be for
> > transmit-bursts. Essentially a message is put into the TX-box, and the
> > netif-queue is stopped, then the message is sent, an interrupt is
> > generated, SPI transactions occur to service the interrupt, and only then
> > is the netif queue re-enabled. This introduces a delay between two
> > consecutive CAN messages. In my patch below, I implemented a (very dirty)
> > hack to enable the use of two TX-boxes, and now the driver is able to use
> > the full bandwidth of a 250kbaud link.
>
> have you checked that there aren't out of order transmission problems?

No, and yes, there most probably are out of order transmission problems.
I just want to make clear that patch 1. was meant to be implemented as is,
since I am pretty confident that it works, but patch 2. is meant only as RFC.
It is known to be broken in several ways.

> > 1.- When just started the interface, a few of the first CAN messages to
> > arrive are being discarded, because of some strange race condition (I
> > have NOHZ enabled).
> > I get the following kernel messages (10 times):
> >
> > NOHZ: local_softirq_pending 08
>
> I see the same message on my laptop to which I connected an external
> (USB) ISDN adapter. A quick question to google showed me this:
> http://patchwork.ozlabs.org/patch/33630/ which can be related.

Interesting. I haven't checked, but either this patch isn't in mainline yet
(one year later), or this isn't the same problem...

> > 2.- The second problem was introduced by my hacks, and I am currently
> > investigating why this occurs: If the system is heavily loaded, I
> > sometimes get the following error message (no CAN-messages are lost):
> >
> > mcp251x spi0.2: hard_xmit called while tx busy
> >
> > I have carefully checked the conditions when the netif-queue is started
> > and stopped, and have yet to discover why this message gets printed.
>
> I had just a quick glance at your patch but this seem possible for the
> sequence:
>
> mcp251x_hard_start_xmit -> tx_skb full, tx_busy_mask 0
> mcp251x_tx_work_handler -> tx_skb free, tx_busy_mask 1
> mcp251x_hard_start_xmit -> tx_skb full, tx_busy_mask 1
> mcp251x_tx_work_handler -> tx_skb free, tx_busy_mask 3
> mcp251x_can_ist -> tx_skb free, tx_busy_mask 2
> mcp251x_hard_start_xmit -> tx_skb full, tx_busy_mask 2
> mcp251x_can_ist -> tx_skb full, tx_busy_mask 0
> mcp251x_hard_start_xmit -> has tx_skb at the beginning full so the
> warning you reported

Hmmm. So I guess we do need an extra skb (I wanted to avoid that), or change
the end of the ISR from:

    netif_wake_queue(net);

to:

    if (!priv->tx_skb)
        netif_wake_queue(net);

But in that case, I need to check if there isn't a scenario where the
netif-queue stays stopped forever... I'll try this later.

> please check this sequence carefully since I could be mistaken.

I guess you are right.

Best regards,

-- 
David Jander
Protonic Holland.
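
For reference, the sequence quoted above can be replayed in a small user-space model. This is only a sketch, not the driver code: the struct, the helper names and the `guard_wake` switch are hypothetical, and the model assumes that the work handler wakes the queue whenever a TX-box is still free (inferred from the second hard_start_xmit arriving before any interrupt). It compares the unconditional `netif_wake_queue(net)` at the end of the ISR with the `if (!priv->tx_skb)` variant.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the two-TX-box hack; behaviour is assumed, not taken
 * from the actual patch. */
struct model {
    bool tx_skb;            /* a frame is waiting for the work handler */
    unsigned tx_busy_mask;  /* bit n set: TX-box n is in flight (2 boxes) */
    bool queue_awake;       /* stands in for the netif-queue state */
    int warnings;           /* "hard_xmit called while tx busy" count */
};

static void hard_start_xmit(struct model *m)
{
    if (!m->queue_awake)
        return;                 /* the stack does not call a stopped queue */
    if (m->tx_skb) {
        m->warnings++;          /* the warning reported in the thread */
        return;
    }
    m->tx_skb = true;
    m->queue_awake = false;     /* models netif_stop_queue() */
}

static void tx_work_handler(struct model *m)
{
    if (!m->tx_skb || m->tx_busy_mask == 3u)
        return;                 /* nothing pending, or no free box */
    unsigned box = (m->tx_busy_mask & 1u) ? 1u : 0u;
    m->tx_busy_mask |= 1u << box;
    m->tx_skb = false;
    if (m->tx_busy_mask != 3u)
        m->queue_awake = true;  /* ASSUMPTION: wake while a box is free */
}

static void can_ist(struct model *m, unsigned box, bool guard_wake)
{
    assert(box < 2u);
    m->tx_busy_mask &= ~(1u << box);    /* TX-box finished sending */
    if (!guard_wake || !m->tx_skb)
        m->queue_awake = true;          /* models netif_wake_queue() */
}

/* Replays the quoted sequence, plus one final work-handler run. */
static struct model replay(bool guard_wake)
{
    struct model m = { .queue_awake = true };

    hard_start_xmit(&m);            /* tx_skb full, tx_busy_mask 0 */
    tx_work_handler(&m);            /* tx_skb free, tx_busy_mask 1 */
    hard_start_xmit(&m);            /* tx_skb full, tx_busy_mask 1 */
    tx_work_handler(&m);            /* tx_skb free, tx_busy_mask 3 */
    can_ist(&m, 0, guard_wake);     /* tx_skb free, tx_busy_mask 2 */
    hard_start_xmit(&m);            /* tx_skb full, tx_busy_mask 2 */
    can_ist(&m, 1, guard_wake);     /* tx_skb full, tx_busy_mask 0 */
    hard_start_xmit(&m);            /* warns only if the queue was woken */
    tx_work_handler(&m);            /* pending frame moves to a free box */
    return m;
}
```

Calling `replay(false)` counts one warning, while `replay(true)` counts none and still ends with the queue awake. Note that the guarded variant only recovers here because the model's work handler wakes the queue; whether the real driver always has such a wake-up path is exactly the "stays stopped forever" question that still needs checking.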
