Hi we are having similar problem with our MCP2515 connected to a Gumstix Overo.
Below is a dump from candump of packages when sending a burst of data representing "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789" with one character per packet. (1279702812.562408) can0 1A300280 [1] 61 (1279702812.568237) can0 4 [8] 00 01 00 00 00 00 00 00 (1279702812.569458) can0 1A300280 [1] 67 (1279702812.571502) can0 1A300280 [1] 68 (1279702812.571594) can0 1A300280 [1] 69 (1279702812.571899) can0 1A300280 [1] 6A (1279702812.577453) can0 4 [8] 00 01 00 00 00 00 00 00 (1279702812.577575) can0 1A300280 [1] 6F (1279702812.581634) can0 4 [8] 00 01 00 00 00 00 00 00 (1279702812.581756) can0 1A300280 [1] 73 (1279702812.582397) can0 1A300280 [1] 74 (1279702812.583129) can0 1A300280 [1] 75 (1279702812.584869) can0 1A300280 [1] 76 (1279702812.585449) can0 1A300280 [1] 77 (1279702812.586273) can0 1A300280 [1] 78 (1279702812.587310) can0 1A300280 [1] 79 (1279702812.588348) can0 1A300280 [1] 7A (1279702812.589599) can0 1A300280 [1] 41 (1279702812.590606) can0 1A300280 [1] 42 (1279702812.591613) can0 1A300280 [1] 43 (1279702812.592864) can0 1A300280 [1] 44 (1279702812.593780) can0 1A300280 [1] 45 (1279702812.594604) can0 1A300280 [1] 46 (1279702812.595825) can0 1A300280 [1] 47 (1279702812.596893) can0 1A300280 [1] 48 (1279702812.597930) can0 1A300280 [1] 49 (1279702812.599426) can0 1A300280 [1] 4A (1279702812.600646) can0 1A300280 [1] 4B (1279702812.600952) can0 1A300280 [1] 4C (1279702812.602111) can0 1A300280 [1] 4D (1279702812.603118) can0 1A300280 [1] 4E (1279702812.604156) can0 1A300280 [1] 4F (1279702812.605224) can0 1A300280 [1] 50 (1279702812.606658) can0 1A300280 [1] 51 (1279702812.607269) can0 1A300280 [1] 52 (1279702812.608459) can0 1A300280 [1] 53 (1279702812.609344) can0 1A300280 [1] 54 (1279702812.610992) can0 1A300280 [1] 55 (1279702812.611389) can0 1A300280 [1] 56 (1279702812.612487) can0 1A300280 [1] 57 (1279702812.613494) can0 1A300280 [1] 58 (1279702812.615020) can0 1A300280 [1] 59 (1279702812.616241) can0 1A300280 [1] 5A (1279702812.616638) can0 1A300280 [1] 30 (1279702812.624145) can0 4 [8] 00 01 00 00 00 00 00 00 (1279702812.624237) can0 1A300280 [1] 37 (1279702812.625061) can0 1A300280 [1] 38 (1279702812.625823) can0 1A300280 [1] 39 As you can see we are getting a number of dropped frames with id 4, unsure where these frames comes from. I have applied the patch from Benny mentioned earlier in this thread to the 2.6.35 kernel, and the result is better then before are now seeing these id 4 frames. Is there anything else that I should try? Kindest regards, //Erik On Thu, Sep 2, 2010 at 6:48 AM, Michael Stocks <[email protected]> wrote: > Dear All, > I'm experiencing lost frames and id 0, 8 byte all zero packets now that we > have equipment sending back to back packets on a 250Kb network. > I have plenty of hardware attached to Gumstix Overos so I'll test your patch > once the kernel is rebuilt. > > Cheers Mike. > > -----Original Message----- > From: [email protected] > [mailto:[email protected]] On Behalf Of christian > pellegrin > Sent: Thursday, 2 September 2010 02:36 > To: Benny B. Simonsen > Cc: [email protected] > Subject: Re: [Socketcan-users] MCP 2515 RX problems > > Hi, first of all sorry for the late response. Just now I managed to put > together the needed hardware (you know, in Italy it's near impossible to do > anything not related to holidays in august ;-) ) > > I had a quick look at your patch and it's similar to the one I prepared and > I'm attaching below. I think both of them solve the problem with out of order > frame. What is worrying me is that packet with all zeros. I never saw > something similar. I'll put my system on test during the next night to see if > I can reproduce it. I guess you put the code to reset RX flags in case of > error to try to solve this glitch, didn't you? > > Anyway I will try your patch tomorrow. > > diff --git a/drivers/net/can/mcp251x.c b/drivers/net/can/mcp251x.c index > 177f2e1..e8c5e37 100644 > --- a/drivers/net/can/mcp251x.c > +++ b/drivers/net/can/mcp251x.c > @@ -773,8 +773,9 @@ static irqreturn_t mcp251x_can_ist(int irq, void *dev_id) > struct mcp251x_priv *priv = dev_id; > struct spi_device *spi = priv->spi; > struct net_device *net = priv->net; > + u8 intf; > #ifdef STATS > - static int irqs, runs, ov0, ov1, rx0, rx1; > + static int irqs, runs; > static int ee[8], ii[8]; > static unsigned long last; > unsigned long now; > @@ -788,7 +789,7 @@ static irqreturn_t mcp251x_can_ist(int irq, void *dev_id) > now = jiffies; > if (now >= last + HZ) { > last = now; > - printk("%lu: i %d r %d r0 %d r1 %d o0 %d o1 %d err", > now, irqs, > runs, rx0, rx1, ov0, ov1); > + printk("%lu: i %d r %d err", now, irqs, runs); > for(i = 0; i < 8; i++) { > printk(",%d", ee[i]); > ee[i] = 0; > @@ -799,24 +800,20 @@ static irqreturn_t mcp251x_can_ist(int irq, void > *dev_id) > ii[i] = 0; > } > printk("\n"); > - irqs = runs = rx0 = rx1 = ov0 = ov1 = 0; > + irqs = runs = 0; > } > } > irqs += 1; > #endif > > mutex_lock(&priv->mcp_lock); > - while (!priv->force_quit) { > + intf = mcp251x_read_reg(spi, CANINTF); > + do { > enum can_state new_state; > - u8 intf = mcp251x_read_reg(spi, CANINTF); > - u8 eflag; > + u8 eflag = 0; > int can_id = 0, data1 = 0; > #ifdef STATS > runs += 1; > - if (intf & CANINTF_RX0IF) > - rx0 += 1; > - if (intf & CANINTF_RX1IF) > - rx1 += 1; > { > int i; > for(i = 0; i < 8; i++) { > @@ -828,25 +825,31 @@ static irqreturn_t mcp251x_can_ist(int irq, void > *dev_id) > > if (intf & CANINTF_RX0IF) { > mcp251x_hw_rx(spi, 0); > - /* Free one buffer ASAP */ > - mcp251x_write_bits(spi, CANINTF, intf & CANINTF_RX0IF, > - 0x00); > + > + /* check that a packet didn't come while emptying RX0 > */ > + if ((intf & CANINTF_RX1IF) == 0) > + intf |= mcp251x_read_reg(spi, CANINTF); > } > > if (intf & CANINTF_RX1IF) > mcp251x_hw_rx(spi, 1); > > + /* This is needed to close a race condition that causes > + * packet reordening: we must assure that we never empty > + * RX0 but not RX1. If this happens the next packet will > + * land in RX0. As a conseuqence on the next do { } while > + * spin we will have packets out of order. > + */ > + intf |= CANINTF_RX1IF; > + > mcp251x_write_bits(spi, CANINTF, intf, 0x00); > > - eflag = mcp251x_read_reg(spi, EFLG); > - mcp251x_write_reg(spi, EFLG, 0x00); > + if (intf && CANINTF_ERRIF) { > + eflag = mcp251x_read_reg(spi, EFLG); > + mcp251x_write_reg(spi, EFLG, 0x00); > + } > > #ifdef STATS > - if (eflag & EFLG_RX0OVR) > - ov0 += 1; > - > - if (eflag & EFLG_RX1OVR) > - ov1 += 1; > { > int i; > for(i = 0; i < 8; i++) { > @@ -924,9 +927,6 @@ static irqreturn_t mcp251x_can_ist(int irq, void *dev_id) > } > } > > - if (intf == 0) > - break; > - > if (intf & (CANINTF_TX2IF | CANINTF_TX1IF | CANINTF_TX0IF)) { > net->stats.tx_packets++; > net->stats.tx_bytes += priv->tx_len - 1; @@ -936,7 > +936,8 @@ static irqreturn_t mcp251x_can_ist(int irq, void *dev_id) > } > netif_wake_queue(net); > } > - } > + intf = mcp251x_read_reg(spi, CANINTF); > + } while (!priv->force_quit && intf); > mutex_unlock(&priv->mcp_lock); > return IRQ_HANDLED; > } > > > > On Sat, Aug 28, 2010 at 2:20 PM, Benny B. Simonsen <[email protected]> wrote: >> >> >> 2010/8/6 christian pellegrin <[email protected]> >>> >>> On Fri, Aug 6, 2010 at 3:56 PM, Wolfgang Grandegger >>> <[email protected]> >>> wrote: >>> > To be clear, out-of-order messages are also not OK. >>> >>> Yes I was looking at this with some interleave analysis. I think the >>> problem lies in the following race: >>> >>> // suppose we come here with RXB0 filled with packet X >>> u8 intf = mcp251x_read_reg(spi, CANINTF); >>> u8 eflag; >>> int can_id = 0, data1 = 0; >>> >>> // packet X+1 arrives and lands in RXB1 >>> >>> if (intf & CANINTF_RX0IF) { >>> mcp251x_hw_rx(spi, 0); >>> /* Free one buffer ASAP */ >>> mcp251x_write_bits(spi, CANINTF, intf & >>> CANINTF_RX0IF, >>> 0x00); >>> } >>> >>> // packet X+2 arrives and lands in RXB0 // unfortunately variable >>> intf is old and doesn't show that we have something in RXB1 // (*1) >>> if (intf & CANINTF_RX1IF) >>> mcp251x_hw_rx(spi, 1); >>> >>> now on the next turn we will have packet X+2 in RXB0 and X+1 in RXB1 >>> so we read them out-of-order. The easiest way to solve the problem is >>> to take out the line: >>> >>> /* Free one buffer ASAP */ >>> mcp251x_write_bits(spi, CANINTF, intf & >>> CANINTF_RX0IF, >>> 0x00); >>> >>> but we risk to lose more packets. Perhaps also rereading intf in (*1) >>> could work but we must be careful when we reset CANINTF register to >>> not lose an interrupt for RXB0. I will do some tests but after next >>> week because I don't have the access to the hardware until then. >>> >>> -- >>> Christian Pellegrin, see http://www.evolware.org/chri/ "Real >>> Programmers don't play tennis, or any other sport which requires you >>> to change clothes. Mountain climbing is OK, and Real Programmers wear >>> their climbing boots to work in case a mountain should suddenly >>> spring up in the middle of the computer room." >> >> >> Hi Christian, >> Did you have a chance to look on the problem and / or the patch I send >> to the list? >> BR >> Benny > > > > -- > Christian Pellegrin, see http://www.evolware.org/chri/ "Real Programmers > don't play tennis, or any other sport which requires you to change clothes. > Mountain climbing is OK, and Real Programmers wear their climbing boots to > work in case a mountain should suddenly spring up in the middle of the > computer room." > _______________________________________________ > Socketcan-users mailing list > [email protected] > https://lists.berlios.de/mailman/listinfo/socketcan-users > _______________________________________________ > Socketcan-users mailing list > [email protected] > https://lists.berlios.de/mailman/listinfo/socketcan-users > _______________________________________________ Socketcan-users mailing list [email protected] https://lists.berlios.de/mailman/listinfo/socketcan-users
