Thanks Terry!
>
> Here is the sum total of my clue-fu on this problem; it is
> mostly supposition, because of incomplete information.  Bill
> Paul's kung-fu in ethernet drivers is much greater than
> anyone else's... it beats the heck out of my cowering piglet
> style.  ;-).  The best advice *anyone* could give you is to
> "ask Bill Paul".
>
Good advice.  Will do...
>
> In other words, your driver problem is in your driver.  8-).
>
Technically yes, although I was able to implement it without failures
on NetBSD.  The only difference is the attach code.  But as you say
later, it could be a bug waiting to happen, since FreeBSD seems to be
much faster on the PCI accesses.
>
>> 2. When the hardware is installed with the minimal kernel driver
>>    the system locks.  The minimal kernel driver only attaches some
>>    resources.
>
> This appears to be a network driver.  There are several
> possible complications.  The first is the lack of an
> interrupt handler that just discards the events; this
> might be because you showed us an incomplete driver.  The
> second is that NetBSD could be doing something in its
> bus management code that FreeBSD isn't, and without
> that, there's a problem ("switch_intr" doesn't look
> like the right thing to call, to me).
>
Yes and no.  It *IS* an ethernet device--that's clear, but it has its
own stack (more or less) optimized for doing policy based routing, so
it doesn't connect to the TCP/IP stack.  The application is traffic
management.  We are expecting very high bandwidth traffic management.
The NetBSD driver switched 20K new connections a second before running
into our PCI bottleneck (hence the need to move to FreeBSD).  The
forwarding is handled in hardware, so we only get hit during the
establishment of new connections.

> What happens if you just probe the thing, and don't try
> to attach or detach at all (always fail the probe, but
> printf when it would have been successful)?  You implied
> that you did this, but it wasn't entirely clear that you
> had not attached it.
>
No problems.  Works fine...

> Really, merely attaching a device should do nothing.  If
> it's a PCCARD device, and it just looks like a PCI device
> because you haven't included all the information, it
> could easily be the pccard code.
>
> You also didn't say on which version of FreeBSD you are
> doing this (posting to -hackers really doesn't identify
> the version very well 8-)).
>
Sorry, I tried 4.3, 4.4 and 4.5.  I will keep this point in mind in
future posts.  I have largely been lurking.
>
>> 3. When doing the full initialization of the device (which works
>>    in NetBSD) there are also the SAME failures as doing no
>>    initialization at all of the hardware (as seen in the samples
>>    posted).
>
> I recommend looking at the if_tx.c driver, and using that
> as a guide, since it does some of the strange stuff you
> seem to need to do, and it (apparently) works.
>
I will do this.  I had been using if_fxp.c as an example.

> It might just be that you are setting PCI_COMMAND_MASTER_ENABLE,
> rather than setting:
>
>     command |= PCI_COMMAND_IO_ENABLE | PCI_COMMAND_MEM_ENABLE |
>                PCI_COMMAND_MASTER_ENABLE;

I can't use the IO mode because the registers of the crossbar are not
available there.  The only register that is there is also available in
MEM access mode.
>
>> 4. The device driver does not use MBUFS at all.
>
> Not relevant, then... though if it's a network driver, it's
> obligated to use mbufs at some point.
>
See above...  Naturally I would like to say more about the internal
structure of the driver, but I am not able to because of
confidentiality, or as they say here, "Datenschutz".
>
> As a final "look for zebras", I'll note that perhaps the
> problem is not in your driver at all, and that the problem
> is actually in another driver.
>
> The way this could work is that if the device shared PCI
> interrupts with a rather bogus driver, then you could be
> locking in the bogus driver as a result of having given
> it an interrupt from the galnet at a time that it was not
> able to properly field the interrupt without failure (e.g.
> perhaps the interrupt notification is non-atomic).
> To avoid this during driver development, you should always make
> sure that your dmesg shows your experimental driver on its own
> interrupt.  If it doesn't, you should probably juggle your
> cards until the interrupt is not shared, or even consider
> simply disabling the driver that is sharing the interrupt,
> if it is not an important device to the development
> process, and you don't want to go card-juggling.
>
I found this to be the case at least once, and this might be a point to
investigate.  But having the Realtek driver installed causes the PCI
memory to be overlapped between the devices.  We handle shared
interrupts OK (we just check our cause registers and do nothing if the
interrupt isn't for us).  It has worked this way in the NetBSD world.
But for efficiency reasons I always locate a slot where I can plug the
device so that no sharing occurs.  Otherwise I would get an interrupt
for every network packet.

> This implies to me that:
>
>     o   It's a network driver
>     o   It works for a while and breaks
>     o   The FreeBSD operation is unexpectedly fast
>
> Together, this indicates that if you have the driver running
> and it locks up the system later, that you might be sharing
> an interrupt with your card, and that it might be your own
> interrupt routine which is treating someone else's interrupt
> on a shared interrupt as if it's your own, and breaking on
> that count.
>
> The "much faster" sort of implies that the other interrupt
> is causing your driver to poll your device, so the "FreeBSD
> is faster" effect you are seeing is just an illusion caused
> by the bad code.

I can say in response:

  o No sharing of interrupts (that I can see in 'dmesg').
  o There is no blocking to read the cause register of the crossbar.
    If nothing is set, we just leave immediately.
  o As to the "much faster": it seems a different case.  During
    initialization we read and write memory on the switch devices.
    This requires a PCI data transfer to send the command to the
    switch (to read the memory) and a response.  We read every 64
    words (16 MB RAM) and it takes 3-4 minutes on NetBSD.  The same
    code in the initialization routine on FreeBSD is so quick it can
    hardly be measured.  We also see this impact when establishing
    new connections, since establishing a new connection requires 16
    words to be transferred to the switch memory.  At least on NetBSD,
    interrupts are not enabled during initialization (during attach,
    in nitro_init), so it would not be subject to the shared-interrupt
    situation.
>
> Alternately, you could ask Bill Paul, since he's a better
> choice than me on this sort of thing, anyway.  8-) 8-).
>
> Hope this is useful, even if it doesn't come right out
> and say "here's a patch".
>
Fine enough!  I have been *VERY* pleased at the response I have been
getting in the FreeBSD world.  I had not had that much success in the
NetBSD world at all (with the exception of a few notable people).  I
was even snubbed by Mr. Wasabi himself.  So going forward I won't be
doing any new work on NetBSD.

Thanks again!

Andy Sporner

> -- Terry
>
> To Unsubscribe: send mail to [EMAIL PROTECTED]
> with "unsubscribe freebsd-hackers" in the body of the message
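
For readers following the thread, two of the points above can be
sketched in C: enabling memory decoding and bus mastering without I/O
decoding, and an interrupt routine that checks its cause register and
leaves immediately when nothing is set.  This is only an illustrative
sketch, not the driver discussed here.  The bit values are the standard
PCI command register bits; every sk_ name, the struct layout, and the
"write zero to acknowledge" step are assumptions made up for the
example, and the device is simulated in ordinary memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard PCI command register bits; the SK_ names are local to this sketch. */
#define SK_PCI_CMD_IO_ENABLE     0x0001u  /* decode I/O space accesses */
#define SK_PCI_CMD_MEM_ENABLE    0x0002u  /* decode memory space accesses */
#define SK_PCI_CMD_MASTER_ENABLE 0x0004u  /* allow the device to bus-master */

/* Hypothetical stand-in for the device: a single interrupt cause register. */
struct sk_dev {
    volatile uint32_t cause;  /* nonzero bits mean events pending for us */
    unsigned handled;         /* number of interrupts we actually serviced */
};

/*
 * Return the command word with memory decoding and bus mastering
 * enabled.  I/O decoding is left exactly as it was, matching the case
 * where the registers are only reachable in MEM access mode.
 */
static uint16_t sk_enable_mem_master(uint16_t command)
{
    return (uint16_t)(command | SK_PCI_CMD_MEM_ENABLE |
                      SK_PCI_CMD_MASTER_ENABLE);
}

/*
 * Interrupt routine that tolerates a shared interrupt line.
 * Returns 1 if the interrupt was ours, 0 if it belonged to someone else.
 */
static int sk_intr(struct sk_dev *dev)
{
    uint32_t cause;

    assert(dev != NULL);
    cause = dev->cause;       /* one non-blocking register read */
    if (cause == 0)
        return 0;             /* nothing set: not ours, leave immediately */

    dev->cause = 0;           /* acknowledge (assumed semantics, simulated) */
    dev->handled++;
    return 1;
}
```

Calling sk_intr() on a device whose cause register is zero does no work
and reports the interrupt as not ours, which is the behaviour a driver
needs if it ever ends up sharing a line with another device.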