Thanks Benny, Did you manage to remove all the problem in your system? Did you do anything else than applying the patch you attached earlier?
Kindest regards, //Erik On Mon, Sep 13, 2010 at 8:52 AM, Benny B. Simonsen <[email protected]> wrote: > Den 12-09-2010 21:24, Erik Calissendorff skrev: >> >> Hi we are having similar problem with our MCP2515 connected to a Gumstix >> Overo. >> >> Below is a dump from candump of packages when sending a burst of data >> representing >> "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789" >> with one character per packet. >> >> (1279702812.562408) can0 1A300280 [1] 61 >> (1279702812.568237) can0 4 [8] 00 01 00 00 00 00 00 00 >> (1279702812.569458) can0 1A300280 [1] 67 >> (1279702812.571502) can0 1A300280 [1] 68 >> (1279702812.571594) can0 1A300280 [1] 69 >> (1279702812.571899) can0 1A300280 [1] 6A >> (1279702812.577453) can0 4 [8] 00 01 00 00 00 00 00 00 >> (1279702812.577575) can0 1A300280 [1] 6F >> (1279702812.581634) can0 4 [8] 00 01 00 00 00 00 00 00 >> (1279702812.581756) can0 1A300280 [1] 73 >> (1279702812.582397) can0 1A300280 [1] 74 >> (1279702812.583129) can0 1A300280 [1] 75 >> (1279702812.584869) can0 1A300280 [1] 76 >> (1279702812.585449) can0 1A300280 [1] 77 >> (1279702812.586273) can0 1A300280 [1] 78 >> (1279702812.587310) can0 1A300280 [1] 79 >> (1279702812.588348) can0 1A300280 [1] 7A >> (1279702812.589599) can0 1A300280 [1] 41 >> (1279702812.590606) can0 1A300280 [1] 42 >> (1279702812.591613) can0 1A300280 [1] 43 >> (1279702812.592864) can0 1A300280 [1] 44 >> (1279702812.593780) can0 1A300280 [1] 45 >> (1279702812.594604) can0 1A300280 [1] 46 >> (1279702812.595825) can0 1A300280 [1] 47 >> (1279702812.596893) can0 1A300280 [1] 48 >> (1279702812.597930) can0 1A300280 [1] 49 >> (1279702812.599426) can0 1A300280 [1] 4A >> (1279702812.600646) can0 1A300280 [1] 4B >> (1279702812.600952) can0 1A300280 [1] 4C >> (1279702812.602111) can0 1A300280 [1] 4D >> (1279702812.603118) can0 1A300280 [1] 4E >> (1279702812.604156) can0 1A300280 [1] 4F >> (1279702812.605224) can0 1A300280 [1] 50 >> (1279702812.606658) can0 1A300280 [1] 51 >> (1279702812.607269) can0 1A300280 [1] 52 >> (1279702812.608459) can0 1A300280 [1] 53 >> (1279702812.609344) can0 1A300280 [1] 54 >> (1279702812.610992) can0 1A300280 [1] 55 >> (1279702812.611389) can0 1A300280 [1] 56 >> (1279702812.612487) can0 1A300280 [1] 57 >> (1279702812.613494) can0 1A300280 [1] 58 >> (1279702812.615020) can0 1A300280 [1] 59 >> (1279702812.616241) can0 1A300280 [1] 5A >> (1279702812.616638) can0 1A300280 [1] 30 >> (1279702812.624145) can0 4 [8] 00 01 00 00 00 00 00 00 >> (1279702812.624237) can0 1A300280 [1] 37 >> (1279702812.625061) can0 1A300280 [1] 38 >> (1279702812.625823) can0 1A300280 [1] 39 >> >> As you can see we are getting a number of dropped frames with id 4, >> unsure where these frames comes from. >> >> I have applied the patch from Benny mentioned earlier in this thread >> to the 2.6.35 kernel, and the result is better then before are now >> seeing these id 4 frames. >> >> Is there anything else that I should try? >> >> >> >> Kindest regards, >> >> //Erik >> >> On Thu, Sep 2, 2010 at 6:48 AM, Michael Stocks >> <[email protected]> wrote: >>> >>> Dear All, >>> I'm experiencing lost frames and id 0, 8 byte all zero packets now that >>> we have equipment sending back to back packets on a 250Kb network. >>> I have plenty of hardware attached to Gumstix Overos so I'll test your >>> patch once the kernel is rebuilt. >>> >>> Cheers Mike. >>> >>> -----Original Message----- >>> From: [email protected] >>> [mailto:[email protected]] On Behalf Of christian >>> pellegrin >>> Sent: Thursday, 2 September 2010 02:36 >>> To: Benny B. Simonsen >>> Cc: [email protected] >>> Subject: Re: [Socketcan-users] MCP 2515 RX problems >>> >>> Hi, first of all sorry for the late response. Just now I managed to put >>> together the needed hardware (you know, in Italy it's near impossible to do >>> anything not related to holidays in august ;-) ) >>> >>> I had a quick look at your patch and it's similar to the one I prepared >>> and I'm attaching below. I think both of them solve the problem with out of >>> order frame. What is worrying me is that packet with all zeros. I never saw >>> something similar. I'll put my system on test during the next night to see >>> if I can reproduce it. I guess you put the code to reset RX flags in case of >>> error to try to solve this glitch, didn't you? >>> >>> Anyway I will try your patch tomorrow. >>> >>> diff --git a/drivers/net/can/mcp251x.c b/drivers/net/can/mcp251x.c index >>> 177f2e1..e8c5e37 100644 >>> --- a/drivers/net/can/mcp251x.c >>> +++ b/drivers/net/can/mcp251x.c >>> @@ -773,8 +773,9 @@ static irqreturn_t mcp251x_can_ist(int irq, void >>> *dev_id) >>> struct mcp251x_priv *priv = dev_id; >>> struct spi_device *spi = priv->spi; >>> struct net_device *net = priv->net; >>> + u8 intf; >>> #ifdef STATS >>> - static int irqs, runs, ov0, ov1, rx0, rx1; >>> + static int irqs, runs; >>> static int ee[8], ii[8]; >>> static unsigned long last; >>> unsigned long now; >>> @@ -788,7 +789,7 @@ static irqreturn_t mcp251x_can_ist(int irq, void >>> *dev_id) >>> now = jiffies; >>> if (now>= last + HZ) { >>> last = now; >>> - printk("%lu: i %d r %d r0 %d r1 %d o0 %d o1 %d >>> err", now, irqs, >>> runs, rx0, rx1, ov0, ov1); >>> + printk("%lu: i %d r %d err", now, irqs, runs); >>> for(i = 0; i< 8; i++) { >>> printk(",%d", ee[i]); >>> ee[i] = 0; >>> @@ -799,24 +800,20 @@ static irqreturn_t mcp251x_can_ist(int irq, void >>> *dev_id) >>> ii[i] = 0; >>> } >>> printk("\n"); >>> - irqs = runs = rx0 = rx1 = ov0 = ov1 = 0; >>> + irqs = runs = 0; >>> } >>> } >>> irqs += 1; >>> #endif >>> >>> mutex_lock(&priv->mcp_lock); >>> - while (!priv->force_quit) { >>> + intf = mcp251x_read_reg(spi, CANINTF); >>> + do { >>> enum can_state new_state; >>> - u8 intf = mcp251x_read_reg(spi, CANINTF); >>> - u8 eflag; >>> + u8 eflag = 0; >>> int can_id = 0, data1 = 0; >>> #ifdef STATS >>> runs += 1; >>> - if (intf& CANINTF_RX0IF) >>> - rx0 += 1; >>> - if (intf& CANINTF_RX1IF) >>> - rx1 += 1; >>> { >>> int i; >>> for(i = 0; i< 8; i++) { >>> @@ -828,25 +825,31 @@ static irqreturn_t mcp251x_can_ist(int irq, void >>> *dev_id) >>> >>> if (intf& CANINTF_RX0IF) { >>> mcp251x_hw_rx(spi, 0); >>> - /* Free one buffer ASAP */ >>> - mcp251x_write_bits(spi, CANINTF, intf& >>> CANINTF_RX0IF, >>> - 0x00); >>> + >>> + /* check that a packet didn't come while emptying >>> RX0 */ >>> + if ((intf& CANINTF_RX1IF) == 0) >>> + intf |= mcp251x_read_reg(spi, CANINTF); >>> } >>> >>> if (intf& CANINTF_RX1IF) >>> mcp251x_hw_rx(spi, 1); >>> >>> + /* This is needed to close a race condition that causes >>> + * packet reordening: we must assure that we never empty >>> + * RX0 but not RX1. If this happens the next packet will >>> + * land in RX0. As a conseuqence on the next do { } while >>> + * spin we will have packets out of order. >>> + */ >>> + intf |= CANINTF_RX1IF; >>> + >>> mcp251x_write_bits(spi, CANINTF, intf, 0x00); >>> >>> - eflag = mcp251x_read_reg(spi, EFLG); >>> - mcp251x_write_reg(spi, EFLG, 0x00); >>> + if (intf&& CANINTF_ERRIF) { >>> + eflag = mcp251x_read_reg(spi, EFLG); >>> + mcp251x_write_reg(spi, EFLG, 0x00); >>> + } >>> >>> #ifdef STATS >>> - if (eflag& EFLG_RX0OVR) >>> - ov0 += 1; >>> - >>> - if (eflag& EFLG_RX1OVR) >>> - ov1 += 1; >>> { >>> int i; >>> for(i = 0; i< 8; i++) { >>> @@ -924,9 +927,6 @@ static irqreturn_t mcp251x_can_ist(int irq, void >>> *dev_id) >>> } >>> } >>> >>> - if (intf == 0) >>> - break; >>> - >>> if (intf& (CANINTF_TX2IF | CANINTF_TX1IF | >>> CANINTF_TX0IF)) { >>> net->stats.tx_packets++; >>> net->stats.tx_bytes += priv->tx_len - 1; @@ -936,7 >>> +936,8 @@ static irqreturn_t mcp251x_can_ist(int irq, void *dev_id) >>> } >>> netif_wake_queue(net); >>> } >>> - } >>> + intf = mcp251x_read_reg(spi, CANINTF); >>> + } while (!priv->force_quit&& intf); >>> mutex_unlock(&priv->mcp_lock); >>> return IRQ_HANDLED; >>> } >>> >>> >>> >>> On Sat, Aug 28, 2010 at 2:20 PM, Benny B. Simonsen<[email protected]> >>> wrote: >>>> >>>> 2010/8/6 christian pellegrin<[email protected]> >>>>> >>>>> On Fri, Aug 6, 2010 at 3:56 PM, Wolfgang Grandegger >>>>> <[email protected]> >>>>> wrote: >>>>>> >>>>>> To be clear, out-of-order messages are also not OK. >>>>> >>>>> Yes I was looking at this with some interleave analysis. I think the >>>>> problem lies in the following race: >>>>> >>>>> // suppose we come here with RXB0 filled with packet X >>>>> u8 intf = mcp251x_read_reg(spi, CANINTF); >>>>> u8 eflag; >>>>> int can_id = 0, data1 = 0; >>>>> >>>>> // packet X+1 arrives and lands in RXB1 >>>>> >>>>> if (intf& CANINTF_RX0IF) { >>>>> mcp251x_hw_rx(spi, 0); >>>>> /* Free one buffer ASAP */ >>>>> mcp251x_write_bits(spi, CANINTF, intf& >>>>> CANINTF_RX0IF, >>>>> 0x00); >>>>> } >>>>> >>>>> // packet X+2 arrives and lands in RXB0 // unfortunately variable >>>>> intf is old and doesn't show that we have something in RXB1 // (*1) >>>>> if (intf& CANINTF_RX1IF) >>>>> mcp251x_hw_rx(spi, 1); >>>>> >>>>> now on the next turn we will have packet X+2 in RXB0 and X+1 in RXB1 >>>>> so we read them out-of-order. The easiest way to solve the problem is >>>>> to take out the line: >>>>> >>>>> /* Free one buffer ASAP */ >>>>> mcp251x_write_bits(spi, CANINTF, intf& >>>>> CANINTF_RX0IF, >>>>> 0x00); >>>>> >>>>> but we risk to lose more packets. Perhaps also rereading intf in (*1) >>>>> could work but we must be careful when we reset CANINTF register to >>>>> not lose an interrupt for RXB0. I will do some tests but after next >>>>> week because I don't have the access to the hardware until then. >>>>> >>>>> -- >>>>> Christian Pellegrin, see http://www.evolware.org/chri/ "Real >>>>> Programmers don't play tennis, or any other sport which requires you >>>>> to change clothes. Mountain climbing is OK, and Real Programmers wear >>>>> their climbing boots to work in case a mountain should suddenly >>>>> spring up in the middle of the computer room." >>>> >>>> Hi Christian, >>>> Did you have a chance to look on the problem and / or the patch I send >>>> to the list? >>>> BR >>>> Benny >>> >>> >>> -- >>> Christian Pellegrin, see http://www.evolware.org/chri/ "Real Programmers >>> don't play tennis, or any other sport which requires you to change clothes. >>> Mountain climbing is OK, and Real Programmers wear their climbing boots to >>> work in case a mountain should suddenly spring up in the middle of the >>> computer room." >>> _______________________________________________ >>> Socketcan-users mailing list >>> [email protected] >>> https://lists.berlios.de/mailman/listinfo/socketcan-users >>> _______________________________________________ >>> Socketcan-users mailing list >>> [email protected] >>> https://lists.berlios.de/mailman/listinfo/socketcan-users >>> >> _______________________________________________ >> Socketcan-users mailing list >> [email protected] >> https://lists.berlios.de/mailman/listinfo/socketcan-users > > Hi Erik, > > The 0x04 ID message is the way SocketCAN indicates overrun. Don't know where > the different error codes is described (saw it in the driver) > > BR > Benny > _______________________________________________ Socketcan-users mailing list [email protected] https://lists.berlios.de/mailman/listinfo/socketcan-users
