Re: Improved opensource firmware
On Sunday 26 July 2009 17:37:10 Larry Finger wrote:
> Michael Buesch wrote:
> > Just to explain my idea: I think there are two ways for this warning to
> > trigger. The first being mac80211 being broken and not stopping the queue
> > on request. That's probably not very likely.
> > The second could possibly be the firmware reporting status for one frame
> > multiple times. I did not check the whole code, but this could possibly
> > lead to an integer under/overflow in the free_slots() calculation. A signed
> > integer is used, so I think it can go negative, which would trigger the
> > warning.
> >
> > I don't see another way to trigger the message. And as it only seems to
> > happen with open firmware, it seems likely to be caused by TX status
> > reporting in the firmware.
>
> The message also triggers with proprietary firmware. My debugging hunk is
>
> @@ -1340,7 +1350,8 @@ int b43_dma_tx(struct b43_wldev *dev, st
>  	B43_WARN_ON(ring->stopped);
>
>  	if (unlikely(free_slots(ring) < TX_SLOTS_PER_FRAME)) {
> -		b43warn(dev->wl, "DMA queue overflow\n");
> +		b43warn(dev->wl, "DMA queue overflow with free_slots = %d\n",
> +			free_slots(ring));
>  		err = -ENOSPC;
>  		goto out_unlock;
>  	}
>
> The revised printk shows
>
> b43-phy0 warning: DMA queue overflow with free_slots = 0

Ok, it's a mac80211 bug then.

-- 
Greetings, Michael.
___
Bcm43xx-dev mailing list
Bcm43xx-dev@lists.berlios.de
https://lists.berlios.de/mailman/listinfo/bcm43xx-dev
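The free_slots() reasoning in this message can be illustrated with a small user-space sketch. It is not the b43 driver code: struct ring_sketch, tx_frame(), tx_status() and classify() are hypothetical names invented for illustration, and the value of TX_SLOTS_PER_FRAME is assumed. The sketch only shows why printing free_slots() separates the two hypotheses: a value inside 0..nr_slots means the ring was genuinely full when another frame arrived (the queue was not stopped in time), while a value outside that range would point at broken slot accounting, for example from a status report being processed more than once. In this simplified model a duplicated status pushes the count above nr_slots rather than below zero.

```c
/* Assumed value: one header descriptor plus one body descriptor per frame. */
#define TX_SLOTS_PER_FRAME 2

/* Hypothetical, simplified stand-in for the driver's ring bookkeeping. */
struct ring_sketch {
	int nr_slots;   /* total descriptors in the ring */
	int used_slots; /* descriptors currently handed to the hardware */
};

enum ring_state { RING_OK, RING_FULL, RING_CORRUPT };

static int free_slots(const struct ring_sketch *ring)
{
	return ring->nr_slots - ring->used_slots;
}

/* Queue one frame. Returns 0 on success, -1 if there is no room. */
static int tx_frame(struct ring_sketch *ring)
{
	if (free_slots(ring) < TX_SLOTS_PER_FRAME)
		return -1;
	ring->used_slots += TX_SLOTS_PER_FRAME;
	return 0;
}

/* Process one TX status report: release the slots of one frame. */
static void tx_status(struct ring_sketch *ring)
{
	ring->used_slots -= TX_SLOTS_PER_FRAME;
}

/* Decide what a given free_slots() value says about the ring. */
static enum ring_state classify(const struct ring_sketch *ring)
{
	int n = free_slots(ring);

	if (n < 0 || n > ring->nr_slots)
		return RING_CORRUPT; /* accounting is off, e.g. duplicate status */
	if (n < TX_SLOTS_PER_FRAME)
		return RING_FULL;    /* plain overflow: caller kept sending */
	return RING_OK;
}
```

With this model, the reported "free_slots = 0" falls into the RING_FULL case, which is consistent with the conclusion drawn in the message that the caller did not stop queueing in time.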
Re: Improved opensource firmware
Michael Buesch wrote:
> On Friday 24 July 2009 16:48:49 Larry Finger wrote:
> > Francesco,
> >
> > Sorry, but I missed a warning about DMA queue overflow that was being
> > logged as follows:
>
> Can you printk the value of free_slots(), when this happens?

I have a printk in place, but that warning has not happened again.

I made the following changes:

(1) I put guard words around meta in struct b43_dmaring, and around the skb
in struct b43_dmadesc_meta. These were checked in b43_dma_handle_txstatus()
and were OK. Whatever is setting meta->skb to NULL is writing specifically to
that location.

(2) I replaced the BUG_ON when meta->skb is NULL with printk statements that
dump elements of the ring and meta structs. At least I get a chance to
interrogate the internal data. Furthermore, I can get the wireless back by
unloading and reloading b43 without rebooting.

The dumped values are as follows:

b43: meta data: skb (null) dmaaddr 986d50bc is_last_fragment 1
b43: ring data: nr_slots 256 used_slots 45 current_slot 65 index 1 tx 1 max_used_slots 256

Unfortunately, all of these look reasonable - no smoking guns. Any
suggestions on other values to dump?

Larry
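The guard-word technique described in this message can be sketched in plain C as follows. This is an illustration only, not the actual debugging patch: struct meta_sketch, struct skb_sketch, GUARD_MAGIC and the helper functions are hypothetical, and the real struct b43_dmadesc_meta layout is not reproduced. The idea is that a known pattern is placed on both sides of the field under suspicion. If a stray linear overrun were clobbering the field, at least one guard would normally be damaged as well; intact guards next to a NULL pointer suggest a write aimed at exactly that field.

```c
#include <stddef.h>
#include <stdint.h>

/* Arbitrary pattern chosen for this sketch. */
#define GUARD_MAGIC 0xDEADBEEFu

/* Hypothetical stand-in for struct sk_buff. */
struct skb_sketch {
	unsigned int len;
};

/* Hypothetical stand-in for the per-descriptor meta data. */
struct meta_sketch {
	uint32_t guard_before;  /* guard word in front of the pointer */
	struct skb_sketch *skb; /* field that was found to be NULL */
	uint32_t guard_after;   /* guard word behind the pointer */
	int is_last_fragment;
};

enum meta_state { META_OK, META_SKB_NULL, META_GUARD_DAMAGED };

static void meta_init(struct meta_sketch *m, struct skb_sketch *skb)
{
	m->guard_before = GUARD_MAGIC;
	m->skb = skb;
	m->guard_after = GUARD_MAGIC;
	m->is_last_fragment = 1;
}

/*
 * Check the meta data before it is used, instead of crashing on a
 * NULL dereference. The caller can log the state and skip the slot.
 */
static enum meta_state meta_check(const struct meta_sketch *m)
{
	if (m->guard_before != GUARD_MAGIC || m->guard_after != GUARD_MAGIC)
		return META_GUARD_DAMAGED; /* neighbouring memory was overwritten */
	if (m->skb == NULL)
		return META_SKB_NULL;      /* only the pointer itself was cleared */
	return META_OK;
}
```

In terms of this sketch, the situation reported above corresponds to META_SKB_NULL: the guards were intact, yet the pointer was NULL.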
Re: Improved opensource firmware
On Friday 24 July 2009 11:57:02 Francesco Gringoli wrote:
> > Ok. I'd like to see some oopses with proprietary firmware.
>
> I will have to send you the laptop and the cards (it happens with both
> cards I have of that kind) because nothing is written on the display; it
> simply freezes. I also tried to enable debug features such as the magic
> system request keys to print something, but they do not work. However, the
> board is different from the one on which Larry reported the problem, and in
> my case it could be due to some hardware incompatibility.

Did you try things like the hangcheck timer?
Just enable all kernel hacking options.

-- 
Greetings, Michael.
Re: Improved opensource firmware
On Thursday 23 July 2009 17:18:24 Francesco Gringoli wrote:
> On Jul 23, 2009, at 11:18 AM, Michael Buesch wrote:
> > On Thursday 23 July 2009 04:05:17 Larry Finger wrote:
> > > Francesco Gringoli wrote:
> > > > Larry, I think this could be one of the causes of the malfunctioning
> > > > you reported before. If you have some time (and indeed if you feel
> > > > like doing it :-) ) please test this firmware, it will be great.
> > >
> > > Francesco,
> > >
> > > The system ran about 30 minutes, then crashed. I missed the first oops,
> > > but caught a kernel panic with formal traceback on my i386 system:
> > >
> > > b43_dma_handle_txstatus + 0x1ee/0x2fa
> > > b43_handle_txstatus + 0x45/0x52
> > >
> > > The call in b43_dma_handle_txstatus is at line 1405:
> > >
> > > unmap_descbuffer(ring, meta->dmaaddr, meta->skb->len, 1);
> > >
> > > The oops was in drivers/net/wireless/b43/xmit.h:171 in the call to
> > > b43_is_old_txhdr_format(). It appears that dev->fw.rev causes the oops.
> >
> > How is that possible? Is the firmware clobbering random memory?
>
> I don't think that the value was modified by the firmware. It cannot poke
> values into host memory ;-)

Oh yes it can. It has a DMA engine.

> I suppose that the issue on pccard32 hardware is not yet solved.

Which issue?
This crash does _not_ happen with proprietary firmware. The only way
dev->fw.rev could crash is by dev being NULL.

> I will get a closer look at how the overflow condition should be handled
> correctly when reporting txstatus to host.

-- 
Greetings, Michael.
Re: Improved opensource firmware
Francesco Gringoli wrote:
> Larry, I think this could be one of the causes of the malfunctioning you
> reported before. If you have some time (and indeed if you feel like doing
> it :-) ) please test this firmware, it will be great.

Francesco,

The system ran about 30 minutes, then crashed. I missed the first oops, but
caught a kernel panic with formal traceback on my i386 system:

b43_dma_handle_txstatus + 0x1ee/0x2fa
b43_handle_txstatus + 0x45/0x52

The call in b43_dma_handle_txstatus is at line 1405:

unmap_descbuffer(ring, meta->dmaaddr, meta->skb->len, 1);

The oops was in drivers/net/wireless/b43/xmit.h:171 in the call to
b43_is_old_txhdr_format(). It appears that dev->fw.rev causes the oops.

As usual, I was running an infinite loop of tcpperf in one console and a
flood ping in a second.

Larry
Improved opensource firmware
Hello there,

after a long time we have just made available a new version of the
opensource firmware at http://www.ing.unibs.it/openfwwf (release 5.2).

We discovered a bug that was causing the firmware to go into suspension
after a phy error due to external conditions: syslog reported a phy error
and the mac suspended, and the interface was no longer usable. This bug
usually appears when the signal received from the peer is very low, and it
was reported by several users. Now we can still have phy errors in the
system.log, but the interface remains fully functional (as happens with the
official firmware).

There are also more comments in the source code, with some registers
explained.

Larry, I think this could be one of the causes of the malfunctioning you
reported before. If you have some time (and indeed if you feel like doing it
:-) ) please test this firmware, it will be great.

Cheers,
-Francesco