Re: CARP as a module; followup thoughts
Hi,

Will Andrews wrote:
> Hello, I've written a patch (against 8.0-CURRENT as of r191369) which makes it possible to build, load, run, & unload CARP as a module, using the GENERIC kernel. It can be obtained from: http://firepipe.net/patches/carp-as-module-20090421.diff

There's no need to implement the in*_proto_register() stuff in that patch; you should just be able to re-use the encap_attach_func() functions. Look at how PIM is implemented in ip_mroute.c for an example.

Other than that it looks like a good start... but I would hold off on committing as-is; the more general case of registering a MAC address on an interface should be considered.

cheers, BMS
___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
Re: kern/132722: [ath] Wifi ath0 associates fine with AP, but DHCP or IP does not work
The following reply was made to PR kern/132722; it has been noted by GNATS.

From: Bruce M Simpson
To: John Hay
Cc: Matthias Apitz, freebsd-net@freebsd.org, Sam Leffler, "Sean C. Farley", bug-follo...@freebsd.org
Subject: Re: kern/132722: [ath] Wifi ath0 associates fine with AP, but DHCP or IP does not work
Date: Tue, 24 Mar 2009 01:08:33 +

John Hay wrote:
> I found doing a -bgscan before it happens, make it not happen. I now
> have -bgscan in my rc.conf.

That's exactly the workaround I needed. Thanks John.

As Sam points out, the root fix is probably already in HEAD; it would be nice to find time to backport, but this works for us for now as a workaround (we are just using ath0 as a STA for testing in the lab at the moment; it is likely we will use hostap later).

cheers, BMS
Re: kern/124282: [libc] socket(2): INP_PORTHIGH and INP_ONESBCAST share same value
bru...@freebsd.org wrote:
> Synopsis: [libc] socket(2): INP_PORTHIGH and INP_ONESBCAST share same value
> Responsible-Changed-From-To: freebsd-bugs->freebsd-net
> Responsible-Changed-By: brucec
> Responsible-Changed-When: Mon Mar 23 21:45:54 UTC 2009
> Responsible-Changed-Why: Over to maintainer(s).

rwatson@ saw this crop up in -CURRENT and I believe he has a fix. Not sure about MFC but it clearly needs to get fixed...

cheers, BMS
ath0 apparent silent disassociation
[Repost without attachment]

OK. We've managed to reproduce this set of symptoms now in our work area. [If anyone needs to see a pcap, please Cc: me offlist.]

Timebase: the beginning of the pcap is in sync with a bringup from single-user mode; the tcpdump runs in the background from init whilst the system is brought up.

OK, so I timed the apparent loss of connectivity as 6m 30s, from the point I hit the stopwatch to when I hit it again once the AP's Web GUI no longer showed the affected STA as being associated. Obviously such a timing is subject to human/visual jitter, and to how often Netgear's firmware pulls the STA association list from the AP into the web GUI.

What stands out in the pcap is that 302.291s in (almost 5m exactly), the STA (ath0) sends an IEEE 802.11 NULL frame to the AP with the PWR MGT bit set (I'm going to sleep!). This more or less coincides with a normal beacon from the Netgear AP. The AP does not advertise Auto Power Save Delivery (apsd); that bit is 0.

This is puzzling as we don't enable power management by default. As I understand it, this may be an AP feature in some environments... I can try reproducing this with an explicit 'ifconfig ath0 -powersave' and see if it reoccurs.

You'll see that after this NULL frame is sent, there is another Probe Request, and the Netgear AP does Probe Respond, but this makes no difference (I ended the capture around 150s after the NULL frame was sent).

At this point we can't send traffic from the ath0, or rather, the AP is acting as though it never even heard the STA. The STA learns the AP's IP address/MAC mapping through passive ARP -- we still see broadcasts on the SSID -- but the AP has started to totally ignore the STA, and seems to have ignored its ARP requests also.

We are using MAC address ACL control with this AP, and the ath0 affected is definitely listed in its ACL table, configured up, rebooted etc.

It is as though the STA is entering power saving mode when not explicitly told to, and the AP is not waking up the STA as it should.

If any more information is needed, or you can suggest where to look, please let me know what's involved (I MFCed the change after all, so I'll help where I can until I'm on holiday this week...)

My lab colleague is just working around this with 'ping ' for now; that keeps things up, as does OpenVPN...

cheers BMS
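For anyone wanting to pick such frames out of a capture programmatically, the check described above (a Null data frame with the PWR MGT bit set) can be sketched as follows. This is a minimal illustration, assuming the input starts at the 802.11 Frame Control field (i.e. any radiotap or other capture header has already been stripped); the function name is made up for this example and is not part of tcpdump or any library.

```python
# Sketch: classify an 802.11 frame from its 2-byte Frame Control field.
# Assumption: `frame` begins at the 802.11 MAC header, not at a radiotap header.
# The function name is hypothetical.

TYPE_DATA = 2          # Frame Control "type" value for data frames
SUBTYPE_NULL = 4       # Data subtype "Null (no data)"
FLAG_PWR_MGT = 0x10    # Power Management bit in the second Frame Control octet


def is_null_with_pwr_mgt(frame: bytes) -> bool:
    """Return True for a Null data frame whose Power Management bit is set."""
    if len(frame) < 2:
        return False
    fc0, fc1 = frame[0], frame[1]
    ftype = (fc0 >> 2) & 0x3     # bits 2-3 of the first octet
    subtype = (fc0 >> 4) & 0xF   # bits 4-7 of the first octet
    return (
        ftype == TYPE_DATA
        and subtype == SUBTYPE_NULL
        and bool(fc1 & FLAG_PWR_MGT)
    )
```

Feeding each captured frame through this predicate and printing the timestamps of the matches would show whether the "going to sleep" frame lines up with the loss of connectivity.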
Re: kern/132722: [ath] Wifi ath0 associates fine with AP, but DHCP or IP does not work
The following reply was made to PR kern/132722; it has been noted by GNATS.

From: Bruce M Simpson
To: Matthias Apitz
Cc: bug-follo...@freebsd.org, Sam Leffler, freebsd-net@freebsd.org, "Sean C. Farley"
Subject: Re: kern/132722: [ath] Wifi ath0 associates fine with AP, but DHCP or IP does not work
Date: Mon, 23 Mar 2009 18:44:42 +

Matthias Apitz wrote:
> I went today evening with my EeePC and CURRENT on USB key
> to that Greek restaurant; DHCP does not get IP in CURRENT either;
> this is somehow good news, isn't it :-)

This may be orthogonal, but:

A lab colleague and I have been seeing a sporadic problem where the ath0 exhibits the symptoms of being disassociated from its AP. We are running RELENG_7 on the EeePC 701 since the open source HAL merge.

In the behaviour we're seeing, we don't see any problem with the initial dhclient run; the ath0 just seems to get disassociated within 5-10 minutes of associating. If we leave 'ping ' running in the background, we don't see this problem.

We have yet to produce a tcpdump to catch it 'in the act' and observe the DLT_IEEE80211 traffic when it actually happens; I have only seen the symptoms. The AP does not show the EeePC units as being associated any more at this point, but ath0 still shows 'status: associated'.

The AP involved is a Netgear WG602 V2, and is running the vendor's firmware.

I'll try to get set up with 'tcpdump -y ieee802_11' from initial boot (including dhcp and anything we bump into).

cheers BMS
Re: IGMP+WiFi panic on recent kernel - in igmp_fasttimo()
Sam,

Sam Leffler wrote:
> This patch avoids the crash. Not sure how ifma_protospec is supposed to be handled so I'm not committing it.

Thanks for this. I have a test machine ready to be prepped but it's missing a CF card (I have none) so I need to pick one up from a friend.

I have a pci-cardbus adapter + a ral(4) CardBus card, but no CardBus ath(4) -- I imagine this ain't specific to ath(4) so that should be fine.

I'll try to look at this Sun/Mon; I have a -CURRENT image built for the 1U box now that just needs bootstrapping (it has a CF slot).

thanks, BMS
Re: howto determine network device unit number? device.hints?
Eygene Ryabinkin wrote:
> ... I wanted to stress only one point: simple 'kldunload ' and 'kldload ' makes devices to flip for Yony's case. This means that unless some PCI hotplug stuff is here (which I don't believe to be present, because no physical cards are touched and there is actually a small amount of PCI hotplug support in FreeBSD), no physical PCI devices get added or removed from the PCI child tree. It looks like that something goes wrong during the PCI tree reprobe on the driver module loading.

BTW: Thanks for looking further at the software layer first. VIM is a wee bit easier to use than a bus analyzer.

Most motherboards don't support PCI geographical addressing, so... I wager it's the network driver code which may be the source of the problem, based on your analysis!

If this code is just doing a blind bump of an instance count and using that as a "unit number"... well, that's OK and expected for software virtual devices, but is counter-intuitive for something like hardware. But I don't have any mtnic source, so this is pure speculation on my part.

> Correct me if I am wrong, but pci_driver_added from /sys/pci/pci.c will invoke device_get_children() to get the list of the attached devices, and for PCI case the list should be static.

Yup, that's right.

> I guess that when Yony will enable verbose boot and will show us kernel messages from two successive kldunload/kldload sequences, we will get some additional information about what's going on.

Hopefully he will chime in...

[bms does some google searching *before* he thinks about throwing his toys out of the pram at the Original.Poster.]

ding :-) [a light bulb above bms' head]

So... Yony, you're writing a driver. Maybe there's a bug in it? That's cool, dude. Hope it's a nice card and you plan on sharing the sweets with the rest of the class. ;-)

But seriously, please mention that you are writing a driver in general questions you might ask about the whole system; otherwise, FreeBSD volunteers will run around going "Is core code broken?" and that's not so good for community stress levels as a whole.

with lemonade, BMS
Re: howto determine network device unit number? device.hints?
Yony,

Bruce M. Simpson wrote:
>> And how come the unit number is given an arbitrary value? Is there a good reason for that?
> ... In your case I'm not sure why your two cards would flip order. Could it be how your BIOS and hardware set up the PCI IDSEL lines at boot?

If this is the case on your system, then you really need to provide more data about your hardware, i.e. motherboard, BIOS, vendor information etc. as others point out.

Based on the data you've provided about the issue to date, my best guess is that something in the above is different on your system (which is why I mentioned IDSEL lines -- the mechanism PCI uses to actually assign bus numbers electrically).

Normally the behaviour of FreeBSD's bus probes is well known -- nexus is walked for child buses, then these buses are plumbed into NEWBUS, e.g. cpu0...cpuN on nexus itself, PCI buses, and PCI subordinate buses in that order.

* You mention you don't encounter the issue with Linux, but you may already be aware that udev can tie driver instance number(s) to specific MAC addresses, although this process isn't fully automatic and any given distro may or may not create the persistent udev rules on a first run -- so this is comparing apples with oranges.

* [PCI-Express is a special case though, and I've had to sit down and do some work with commercial clients to make sure their appliance was able to detect devices being in particular slot numbers. Again, though, it's just as subject to the PCI enumeration order further up on the bus hierarchy as non-PCI-Express drivers.]

So your issue may not be a simple matter of "this seems wrong, this doesn't work", though I am sorry to hear it isn't working for you right now.

There are a lot of dynamic factors in the overall picture of the system, and what seems to work as expected for many users may not be working for you. When folk see things like this happening, we really need basic hardware information for any volunteer(s) out there to get a true picture of what's actually going on in your specific case, let alone come up with the right solution.

thanks BMS
Re: howto determine network device unit number? device.hints?
Yony Yossef wrote:
> Thanks for the explanation. So there's no way to determine this in advance.. I must build a script that contains my own mapping between MAC addresses and the wanted interface names and run it after each driver load, rename the interfaces if necessary. It seems quite wrong, don't you agree? And how come the unit number is given an arbitrary value? Is there a good reason for that?

Normally the PCI probe runs in the opposite direction from that of Linux. It's largely to do with how the NEWBUS code walks the PCI bus.

From a systems management point of view, yeah, it's irritating; however, it would probably take more effort (i.e. kernel code) to try to patch it to work differently, and not everyone has free time to sit down and patch the kernel.

That and (unlike Solaris) there is no *direct* mapping between the card's driver number on the bus and its network driver number.

In your case I'm not sure why your two cards would flip order. Could it be how your BIOS and hardware set up the PCI IDSEL lines at boot?
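The rename script Yony describes could look something like the sketch below. It only computes the `ifconfig <old> name <new>` commands rather than running them; the function name, the `tmpif` prefix and the example MAC addresses are all hypothetical, and gathering the current name-to-MAC table (e.g. by parsing ifconfig output) is left out. Renaming goes through temporary names so that two interfaces which have swapped places do not collide.

```python
# Sketch: plan interface renames from a MAC -> wanted-name table.
# Assumptions: `current` maps interface name -> MAC as seen after driver load;
# `wanted` maps MAC -> desired interface name. Names and MACs are illustrative.

def rename_plan(current, wanted):
    """Return the list of ifconfig commands needed to reach the wanted naming."""
    wanted_norm = {mac.lower(): name for mac, name in wanted.items()}
    moves = {}
    for ifname, mac in current.items():
        target = wanted_norm.get(mac.lower())
        if target and target != ifname:
            moves[ifname] = target

    commands = []
    ordered = sorted(moves)
    # Phase 1: park every interface that must move under a temporary name.
    for i, old in enumerate(ordered):
        commands.append(f"ifconfig {old} name tmpif{i}")
    # Phase 2: give each parked interface its final name.
    for i, old in enumerate(ordered):
        commands.append(f"ifconfig tmpif{i} name {moves[old]}")
    return commands
```

Note that this does not handle the case where a wanted name is currently held by an interface that is absent from the mapping; a real script would need to check for that before renaming.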
Re: Having problems with limited broadcast
Bruce M. Simpson wrote:
>> Peter Steele wrote:
>> ... I personally like this idea, but I'm not sure I can sell it to the others. Are there any restrictions to these 169.254.x.y addresses?
> 169.254.0.0/16 must never appear outside a link -- it is strictly scoped to that link.

P.S. I checked in a change to ip_forward() a while back which enforces this, as forwarding such traffic between interfaces without NATting it or otherwise proxying it is a really bad idea (and also breaks the IPv4 LL RFC).
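The rule being enforced can be sketched outside the kernel as a simple predicate. This is only an illustration of the scoping rule, not the actual ip_forward() code; the function name is invented for the example.

```python
# Sketch: the "never forward IPv4 link-local traffic" rule as a predicate.
# This illustrates the rule only; it is not how ip_forward() is written.
import ipaddress

LINK_LOCAL_V4 = ipaddress.ip_network("169.254.0.0/16")


def may_forward(src: str, dst: str) -> bool:
    """Return False if either address is in the IPv4 link-local block."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    return src_ip not in LINK_LOCAL_V4 and dst_ip not in LINK_LOCAL_V4
```

A datagram with a 169.254.x.y source or destination is therefore only ever deliverable on the link where it originated.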
Re: Having problems with limited broadcast
Peter Steele wrote:
> ... I personally like this idea, but I'm not sure I can sell it to the others. Are there any restrictions to these 169.254.x.y addresses?

169.254.0.0/16 must never appear outside a link -- it is strictly scoped to that link.

Currently the IPv4 BSD stack has no concept of link-scoped addresses, but IPv6 does. Link is a realized concept there because of KAME's support for the % syntax. Internally, interface indexes get used.

In practice this shouldn't be an issue as long as you can guarantee different addresses are used for the 169.254.0.0/16 block on each interface; however, it would mean any app using sockets would need to explicitly bind to the local address to ensure the correct interface is used.

Furthermore, we effectively need to be able to support multiple next-hops for the 169.254.0.0/16 prefix, otherwise we can support only one such interface w/o significant kernel code rewrites.

So, really, LL may not buy you anything at all, and it's likely you need to go straight to pcap for your app.

These restrictions have existed for years, and the fact that they haven't been addressed has largely been because there has been no community strategy to deal with it. I speculate some BSD-using organisations might have already solved these problems; however, without evidence (and code sharing), that's pure speculation.

cheers BMS
Re: Having problems with limited broadcast
Peter Steele wrote:
>> The folk who point out that link-local addresses could be used, have an interesting suggestion which might work for you.
> It's definitely interesting, but it is very likely that some of our customers will want to be able to set their own IP ranges and not be limited to 169.254/16. So we need a more generic solution.

Sounds like it's bpf/pcap city for you guys.

A similar bump-in-the-stack to SO_BINDTODEVICE, e.g. let's call it IP_SENDIF, has been on the drawing board, but it needs appropriate security screening -- the ability to bypass the forwarding tables, whilst specifying an interface e.g. by index or name, would be desirable only for certain privileged processes.

BTW: If you guys are already looking at scapy, you may also wish to give pcs.sourceforge.net a look as an alternative. It is a Python project which I did some hacking on with George Neville-Neil who started it.

It has BPF/PCAP support out of the box and has a number of powerful features, including a packet-level expect() facility, which works in a very similar manner to pexpect (Python expect for text streams). I added a scapy-like concatenation syntax ('/' operator) to it as that makes plugging packet chains together that much easier.

I have the beginnings of an IGMPv3 test suite in my home repo written using PCS; it uses pcap capture. I imagine a DHCP-like protocol could easily be implemented using PCS too.

cheers BMS
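For readers who have not seen the '/' concatenation idiom, here is a minimal sketch of how such an operator can be built in Python. The `Layer` and `Chain` classes are hypothetical stand-ins written for this illustration; they are not the PCS or scapy classes and make no claim about how either library implements it.

```python
# Sketch: a scapy-style '/' operator for stacking packet layers.
# `Layer` and `Chain` are hypothetical; they are NOT the PCS or scapy API.

class Layer:
    """One protocol layer carrying its already-encoded bytes."""

    def __init__(self, name, payload=b""):
        self.name = name
        self.payload = payload

    def __truediv__(self, other):
        # layer / something -> start a chain with this layer at the front
        return Chain([self]) / other


class Chain:
    """An ordered stack of layers, outermost first."""

    def __init__(self, layers):
        self.layers = list(layers)

    def __truediv__(self, other):
        if isinstance(other, Chain):
            return Chain(self.layers + other.layers)
        return Chain(self.layers + [other])

    def names(self):
        return [layer.name for layer in self.layers]

    def to_bytes(self):
        return b"".join(layer.payload for layer in self.layers)
```

With this in place, `Layer("ethernet") / Layer("ipv4") / Layer("udp")` reads left to right in the same order the headers appear on the wire.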
Re: Having problems with limited broadcast
Peter Steele wrote:
> ... It's really a matter of time. We didn't anticipate limited broadcast being broken in FreeBSD and we're scrambling to come up with a solution. To be quite frank I haven't done anything with IPv6 before so it would be more research to get up to speed on this option. It seems our best option is scapy, which unfortunately I also haven't used before...

It's not broken -- it has always been this way in all BSD derived networking stacks.

Limited broadcast addresses just don't contain any information about where the datagram should go, and this is the case in all other implementations. They are similar to multicast addresses in that regard.

Linux has a knob SO_BINDTODEVICE which is partly there to work around this problem; however, it isn't the ideal semantic fit.

The folk who point out that link-local addresses could be used, have an interesting suggestion which might work for you.

thanks BMS
Re: Having problems with limited broadcast
Peter Steele wrote:
> .. Based on the discussion in the link above, it doesn't seem like the problem was entirely resolved by the patches mentioned in this thread. Has anything been done since this discussion took place? Surely there must be a way to get limited broadcast to work under FreeBSD.

You will need to go to the pcap layer to send limited broadcasts w/o any IPv4 addresses configured in a BSD stack for now.

If you have an IP on the interface, you can just use IP_ONESBCAST.

thanks BMS
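As a rough sketch of the IP_ONESBCAST route: the sender addresses the datagram to the subnet broadcast address of the chosen interface, and the stack rewrites the destination to 255.255.255.255 on the way out. The code below assumes FreeBSD; Python's socket module may not export the constant, so the numeric value 23 (taken from FreeBSD's netinet/in.h) is used as a fallback and should be treated as an assumption to verify on your system. The helper names are made up for this example.

```python
# Sketch: sending a limited broadcast via IP_ONESBCAST (FreeBSD only).
# Assumptions: the interface already has an IPv4 address configured; the
# fallback value 23 is FreeBSD's IP_ONESBCAST and must be checked locally.
import ipaddress
import socket

IP_ONESBCAST = getattr(socket, "IP_ONESBCAST", 23)


def subnet_broadcast(addr: str, prefixlen: int) -> str:
    """Return the directed broadcast address for addr/prefixlen."""
    iface = ipaddress.ip_interface(f"{addr}/{prefixlen}")
    return str(iface.network.broadcast_address)


def send_limited_broadcast(local_addr, prefixlen, port, payload):
    """Send `payload` out of the interface owning `local_addr` (FreeBSD only)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.setsockopt(socket.IPPROTO_IP, IP_ONESBCAST, 1)
        # The subnet broadcast address selects the interface; the kernel
        # is expected to rewrite the destination to 255.255.255.255.
        s.sendto(payload, (subnet_broadcast(local_addr, prefixlen), port))
```

For example, `send_limited_broadcast("192.168.2.10", 24, 1234, b"hello")` would target 192.168.2.255 at the socket layer. This does not help when no IPv4 address is configured at all, which is the case that still needs pcap.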
Re: last call for L2/L3 rewrite code review
Hi,

Just skimming this I notice it uses the if_afdata[AF_INET] pointer purely for lltbl purposes; this clashes with the IGMPv3 code drop.

Please look in the bms_netdev branch, where I introduce a 'struct ip_ifinfo' to make more general use of that slot. IGMPv3 needs to store per-interface state for AF_INET, so this slot really needs to be shared with other AF_INET stuff.

Looks like it needs to be updated for VIMAGE also; hopefully others more familiar with this can help -- I am busy enough with non-programming activity as it is to get up to speed on this, although I have at least managed to print Julian's write-up...

Other than that, it looks like a much needed improvement and we are all very grateful for your work on this.

thanks BMS
Re: Heads up --- Thinking about UDP and tunneling
Hi,

I am missing context of what Max's suggestion was; do you have a reference to an old email thread?

Style bugs:
* needs style(9) and whitespace cleanup.
* C typedefs should be suffixed with _t for consistency with other kernel typedefs.
* Function typedefs are usually named like foo_func_t (see other subsystems).

Have you looked at m_apply()? It already exists for stuff like this, i.e. functions which act on an mbuf chain, although it doesn't necessarily expect chain heads.

cheers BMS
Re: how to program a driver?
[Resend to list for everyone]

Espartano wrote:
> Actually i know how to program with C language in a basic level but i don't know nothing about hardware or computer organization, what topics i should study for gain knowledges about net-drivers ? or if someone can recommend me books about this topic i will be very thankful.

The seminal work is TCP/IP Illustrated Volume 2 (Gary Wright and W. Richard Stevens, Addison-Wesley). Whilst dated, it will give you an overview of how all the parts in the BSD networking stack fit together. It really needs to be updated; however, enough things are in flux right now that summarising all the changes would be difficult until, say, after the FreeBSD 8.0 dust has settled.

For computer architecture, probably best to learn PC architecture these days -- x86 is here to stay, kids, and Netbooks are something of a reactionary response triggered by the One-Laptop-Per-Child (OLPC) project. In my day, I learned 68000 assembly and C on the Amiga.

Hans-Peter Messmer's "The Indispensable PC Hardware Book" is a huge book which cost me about 50 GBP new when I first bought it -- I was working in a reasonably well paid job at the time, but it can no doubt be found second hand around the world.

Cover to cover it will tell you what you need to know about how the PC architecture fits together, but if you need more detail, e.g. on stuff like FreeBSD network drivers, again, it's best to refer back to the source code itself.

Hope this helps.

cheers BMS
Re: how to program a driver?
Espartano wrote:
> Actually i know how to program with C language in a basic level but i don't know nothing about hardware or computer organization, what topics i should study for gain knowledges about net-drivers ? or if someone can recommend me books about this topic i will be very thankful.

Try "The Indispensable PC Hardware Book" by Hans-Peter Messmer for a general overview of PC architecture.

cheers BMS
Re: Vimage howto
Julian,

Thank you (and Marko) very much for preparing this document. The VIMAGE import has had me at something of an impasse re: the IGMPv3 branch, and clearly written documentation is a big help indeed.

Julian Elischer wrote:
> Well not completely, but I've had a number of questions over the last few months about what it is, so, as Marko and I have written the following "how to virtualize your module" document, I've been directing people to it. After another couple of questions I think this could do with wider distribution..

Thank you also for providing it here on the list, as opposed to relying on Perforce alone.

Whilst I understand committers rate p4 for experimental work in the FreeBSD sphere, sadly it is simply not accessible to the not-so-silent majority who are not committers, which makes its continued use questionable at best.

regards, BMS
Re: How to support an Ethernet PHY without ID registers?
Sepherosa Ziehau wrote:
> Are you sure you could read from BMSR? Return invalid value from BMSR is the usual cause of miibus attaching/probing failure. For ID1/ID2 reading, you could just fake some values in npe(4)'s miibus_readreg implementation.

Thanks for the tip (from you and Pyun).

I had to spoof the BMSR read to get npe(4) to attach just to begin with. For whatever reason the chip doesn't seem to respond on any of the PHY IDs which the Linux folk are using (5 and 4 for npe0 (-B) and npe1 (-C) respectively).

I noticed the uClinux folk needed a similar patch to force driver attach under Linux w/the IXP: http://mailman.uclinux.org/pipermail/uclinux-dev/2005-March/031419.html

The switch pretty much disappears after npe(4) attaches; I don't see any activity lights or link lights at that point. This seems to happen after any mii register access.

If I frob things to allow rlswitch to attach, by using hints and hacking if_npe.c, I can get dumps of the PHY register space, but it's all ones, suggesting that it failed at xScale register level -- that would suggest the PHY IDs are *wrong*, or something else isn't right.

Pyun also suggested trying to manually take the PHYs out of power-down mode. I tried that with a code snippet I sent him, but still no dice. I can't even be sure that the PHYs are being addressed right. At this point I kind of have to go, whoah, wish I had a logic analyzer and grabbers!

I believe the firmware configures the switch chip in a certain VLAN configuration which isn't meant to be disrupted, although Freecom's own SnapGear-based distro apparently does the right thing. I've looked through all of their GPL materials and cannot find the driver for the switch.

I suppose one thing I could try is re-flashing the box with the official Freecom firmware, and using mii-diag to dump out what Linux thinks the registers are.

thanks BMS
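The "all ones" symptom above is the usual sign that nothing is answering at a given PHY address. A bus scan along those lines can be sketched as below. This is a user-space illustration of the common heuristic (treat a BMSR of 0x0000 or 0xFFFF as "no PHY here"), not the actual miibus probe code; the `read_reg` callback and `fake_bus` helper are hypothetical stand-ins for a real MDIO read routine.

```python
# Sketch: scan the 32 MII PHY addresses and report which ones answer.
# Assumption: read_reg(phy_addr, reg) returns the 16-bit register value.
# This mirrors a common heuristic only; it is not the miibus(4) implementation.

MII_BMSR = 0x01      # Basic Mode Status Register
MII_PHYIDR1 = 0x02   # PHY identifier, high word
MII_PHYIDR2 = 0x03   # PHY identifier, low word


def phy_responds(read_reg, phy_addr):
    """A floating or absent PHY typically reads back as all ones (or zero)."""
    bmsr = read_reg(phy_addr, MII_BMSR) & 0xFFFF
    return bmsr not in (0x0000, 0xFFFF)


def scan_bus(read_reg):
    """Return the list of PHY addresses (0..31) that appear to be present."""
    return [addr for addr in range(32) if phy_responds(read_reg, addr)]


def fake_bus(present):
    """Build a stand-in read function for experimenting without hardware."""
    def read_reg(phy_addr, reg):
        if phy_addr in present and reg == MII_BMSR:
            return 0x7809  # an arbitrary plausible status value
        return 0xFFFF      # no response: the bus reads back all ones
    return read_reg
```

Note that a chip without ID registers would pass this BMSR check yet still return nothing useful for `MII_PHYIDR1`/`MII_PHYIDR2`, which is why faking those values in the driver's read routine was suggested.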
How to support an Ethernet PHY without ID registers?
Hi,

I have been trying to get FreeBSD onto the Freecom FSG3 Storage Gateway. It is an xScale based ARM system. Whilst the npe(4) driver appears to attach, the PHY does not. It is a Realtek RTL8305SB switch chip in dual miibus mode.

Unfortunately the RTL8305SB does not have ID registers. The RTL8305SC does, but it's a totally different chip. We do have a driver in the tree for the RTL8305SC; however, these chips are different enough for this to cause problems.

Is there any way I could, for example, force ukphy(4) to attach?

Note: Because there are no ID registers, mii_phy_probe_gen() WILL NOT work. It looks like I'd have to override this by hacking if_npe.c itself.

Can anyone clarify?

cheers BMS
Re: Freeing an mbuf cluster
Yony Yossef wrote:
> Hi All, I'm trying to manually build an mbuf chain with clusters in various sizes. I'm doing it using the MGETHDR and MEXTADD macros, it works fine. Now I'm looking for the simplest way to free an mbuf cluster, since I want to free the clusters separately. This function will be given as a parameter to MEXTADD. Is there a simple command like 'free(buf)' to free an mbuf cluster?

You don't specify if you are trying to add the external storage from a pool you manage, in which case, you're on your own.

m_free() for a cluster or mbuf should just "do the right thing". Since the UMA cleanup there are destructor functions which should free the mbuf or cluster using the right pool. m_freem() works on chains, of course.

cheers BMS
Re: Initialisation of a networking protocol
Hi Ryan,

Did you initialize the .pr_init member of struct protosw for MPLS?

AFAIK, MPLS does not use an outer IP header, so adding a struct ipprotosw won't work; they are similar structs however.

cheers BMS
Re: lost routes
Giulio Ferro wrote:
> There are no messages in the logs, and no interface has been touched. Anyway, since there are a lot of routes and only one gets deleted I don't think it depends on interface changing (it would delete them all, wouldn't it?)

Normally static routes only get touched if the state of the underlying ifp/ifa changes. There are paths in netinet which will cause routes to be deleted in this situation.

Occasionally the idea of a floating static re-surfaces... look in the PR database with this term for possibly related reports.

cheers BMS
Re: Proposed patch, convert IFQ_MAXLEN to kernel tunable...
[EMAIL PROTECTED] wrote:
> ... I found no occurrences of the above in our code base. I used cscope to search all of src/sys. Are you aware of any occurrences of this?

I have been using IFQ_MAXLEN to size buffer queues internal to some IGMPv3 stuff.

I don't feel comfortable with a change which sizes the queues for both IPv4 and IPv6 stacks, from a variable which is obscured by a macro.

thanks BMS
Re: Proposed patch, convert IFQ_MAXLEN to kernel tunable...
Hi, I agree with the intent of the change that IPv4 and IPv6 input queues should have a tunable queue length. However, the change provided is going to make the definition of IFQ_MAXLEN global and dependent upon a variable. [EMAIL PROTECTED] wrote: Hi, It turns out that the last time anyone looked at this constant was before 1994 and it's very likely time to turn it into a kernel tunable. On hosts that have a high rate of packet transmission packets can be dropped at the interface queue because this value is too small. Rather than make a sweeping code change I propose the following change to the macro and updating a couple of places in the IP and IPv6 stacks that were using this macro to set their own global variables. This isn't appropriate for many uses of ifq's which might be internal to a given driver or subsystem, and which may use IFQ_MAXLEN for convenience, as Ruslan has pointed out. I have code elsewhere which does this. Can you please do this on a per-protocol stack basis? i.e. give IPv4 and IPv6 their own TUNABLE queue length. thanks BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: ACE on FreeBSD?
Hi, I looked at ACE years and years ago (~1997) when Doug Schmidt was first promoting the ideas behind it. The whole Reactor/Proactor split pretty much hangs on the event dispatch which your particular OS supports. The key observation is whether your target OS implements events in an edge-triggered or level-triggered way; I am borrowing definitions from electronic engineering here. You could do a straight port with Proactor, but performance will probably suck, because both FreeBSD (and Linux, I believe) need to emulate POSIX asynchronous I/O operations. Reactor will generally "fare better" on UNIX derived systems such as FreeBSD and Linux, because its event handling primitives are geared towards the level-triggered facilities provided by select(). In Windows, Winsock events use asynchronous notifications which may be tied to Win32 EVENT objects, and the usual Kernel32.DLL thread primitives are used around this. This makes Proactor more appropriate in that environment. XORP does some similar stuff to ACE under the hood to support the native socket facilities of both Windows and FreeBSD/Linux. It's hybridized but it behaves more like Reactor because we run in a single thread, and you have to force Winsock's helper thread to run, by preempting you, using some file handle and socket tricks. I don't currently know about stability of ACE on FreeBSD. cheers BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
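To make the level-triggered point concrete, here is a minimal sketch of a zero-timeout select() poll. The helper name fd_is_readable() is made up for this example and is not part of ACE or XORP. With level-triggered semantics the descriptor keeps reporting readable for as long as unread data remains, which is the behaviour a Reactor-style loop relies on.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: poll fd for readability with a zero timeout.
 * Returns 1 if readable, 0 if not, -1 on error.
 */
int
fd_is_readable(int fd)
{
	fd_set rfds;
	struct timeval tv = { 0, 0 };	/* do not block */
	int n;

	assert(fd >= 0 && fd < FD_SETSIZE);
	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);
	n = select(fd + 1, &rfds, NULL, NULL, &tv);
	if (n < 0)
		return (-1);
	return (FD_ISSET(fd, &rfds) ? 1 : 0);
}
```

Calling it twice in a row on a pipe with one unread byte should report readable both times; only draining the pipe clears the condition. An edge-triggered facility would report the transition once.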
Re: kern/127528: [icmp]: icmp socket receives icmp replies not owned by the process.
Chris Buechler wrote: This PR is bogus because: ICMP has no concept of datagrams being "owned" by a process. There is no field in the ICMP protocol which differentiates ICMP "sessions" on a per-process basis, and this is because ICMP has no concept of "sessions" -- ICMP messages are directed at IP endpoints. ICMP echo and echo replies do have "sessions" of sorts, at least unique identifying fields - identifier and sequence number. These fields do exist in ICMP, and as you point out, they are sometimes used to implement session-like behaviour. Many NAT implementations use them in this way. However there is no way of specifying them in a bind() call -- ICMP can only be received on a raw socket, and raw sockets will not filter these things on behalf of a user process, nor have they ever done to the best of my knowledge. They are not part of the address structures for a raw socket (SOCK_RAW, PF_INET, * or IPPROTO_ICMP). This was opened by a pfSense maintainer because it's a change in behavior from 6.x releases where this was never an issue, and is something we feel is a regression. Robert has replied outlining a few situations where the behaviour might have changed. Raw sockets do support binding laddr/faddr, there is the possibility this could have changed, however there is no notion of processes "owning" streams of ICMP messages, this has never been part of the ICMP protocol and to think in these terms is misleading. It sounds to me as though the application is relying on a form of filtering which isn't happening, and the way to track this down is to carefully note what, if anything, changed in the expected behaviour between releases. For example, does the application bind() to any given host addresses? This is the only form of filtering, apart from multicast SSM, that raw sockets would support, and SSM ain't in the tree [yet]. 
thanks BMS
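Since the raw socket will not filter on the identifier, an application that cares has to do it itself after each receive. A rough sketch follows, assuming the buffer is an IPv4 datagram as handed up by a raw ICMP socket (IP header included) and that the sender wrote the identifier in network byte order. The function name echo_reply_is_ours() is invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ICMP_TYPE_ECHOREPLY	0	/* RFC 792 */

/*
 * Hypothetical helper: return 1 if pkt is an ICMP echo reply carrying
 * the identifier want_id (host byte order), 0 otherwise.
 */
int
echo_reply_is_ours(const uint8_t *pkt, size_t len, uint16_t want_id)
{
	size_t hlen;
	uint16_t id;

	assert(pkt != NULL);
	if (len < 20 || (pkt[0] >> 4) != 4)
		return (0);			/* not IPv4 */
	hlen = (size_t)(pkt[0] & 0x0f) * 4;	/* IP header length */
	if (hlen < 20 || len < hlen + 8)
		return (0);			/* truncated */
	if (pkt[9] != 1)
		return (0);			/* IP protocol is not ICMP */
	if (pkt[hlen] != ICMP_TYPE_ECHOREPLY)
		return (0);
	/* Identifier lives at bytes 4-5 of the ICMP header. */
	id = (uint16_t)((pkt[hlen + 4] << 8) | pkt[hlen + 5]);
	return (id == want_id);
}
```

A receive loop would call this on every datagram and silently skip the ones that return 0, rather than expecting the kernel to have done so.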
Re: kern/127528: [icmp]: icmp socket receives icmp replies not owned by the process.
The following reply was made to PR kern/127528; it has been noted by GNATS. From: "Bruce M. Simpson" <[EMAIL PROTECTED]> To: [EMAIL PROTECTED] Cc: freebsd-net@FreeBSD.org, [EMAIL PROTECTED] Subject: Re: kern/127528: [icmp]: icmp socket receives icmp replies not owned by the process. Date: Sun, 21 Sep 2008 23:12:30 +0100 [EMAIL PROTECTED] wrote: > Old Synopsis: icmp socket receives icmp replies not owned by the process. > New Synopsis: [icmp]: icmp socket receives icmp replies not owned by the > process. > This PR is bogus because: ICMP has no concept of datagrams being "owned" by a process. There is no field in the ICMP protocol which differentiates ICMP "sessions" on a per-process basis, and this is because ICMP has no concept of "sessions" -- ICMP messages are directed at IP endpoints. The networking stack will only selectively dispatch ICMP traffic based on two conditions: 1. ip_proto number (raw sockets may selectively bind to a protocol) and 2. multicast group membership (not applicable in this instance). > It also shows that both echo requests have different identifiers in the id field which should keep the icmp streams separated. There is absolutely no requirement for the kernel code to look at the ID field, beyond reporting it to consumers of the SOCK_RAW interface. This PR can be closed, the submitter should consult the pfSense maintainers. thanks BMS
Re: reading routing table
Debarshi Ray wrote: ... By the way, would you want someone to implement 'show' support for FreeBSD's route implementation? I can give it a go now. :-) For sure, we'd be very happy to see a patch like that. Many thanks BMS
Re: Problem with IFDATA_DRIVERNAME sysctl
Bruce M Simpson wrote: It looks like the switch..case in that path could be fubar'd by the compiler as there are not break statements for each distinct case label, could this be due to gcc friendly fire? Possibly false alarm or PEBKAC, I wasn't checking return values right in some of my code, although we should probably have "break" there anyway. Patch against RELENG_7_0.

--- if_mib.c.orig 2008-09-10 00:31:25.0 +0100
+++ if_mib.c 2008-09-10 00:32:15.0 +0100
@@ -90,6 +90,7 @@
 	switch(name[1]) {
 	default:
 		return ENOENT;
+		break;
 
 	case IFDATA_GENERAL:
 		bzero(&ifmd, sizeof(ifmd));
@@ -136,6 +137,7 @@
 		error = SYSCTL_IN(req, ifp->if_linkmib, ifp->if_linkmiblen);
 		if (error)
 			return error;
+		break;
 
 	case IFDATA_DRIVERNAME:
 		/* 20 is enough for 64bit ints */
@@ -152,6 +154,7 @@
 			error = EPERM;
 		free(dbuf, M_TEMP);
 		return (error);
+		break;
 	}
 	return 0;
 }
Problem with IFDATA_DRIVERNAME sysctl
Whenever I call this sysctl, I get an errno of EPROGNOTAVAIL from sysctl():

	name[0] = CTL_NET;
	name[1] = PF_LINK;
	name[2] = NETLINK_GENERIC;
	name[3] = IFMIB_IFDATA;
	name[4] = ifindex;
	name[5] = IFDATA_DRIVERNAME;
	len = IFNAMSIZ;
	if (sysctl(name, 6, dname, &len, NULL, 0) == -1) {
		warnc(EX_OSERR, "cannot obtain driver name for ifname %s",
		ifname);
		return (-1);
	}

The ifindex is valid. "dname" is a pointer to an IFNAMSIZ sized buffer. This problem is happening on a 7.0-RELEASE system. It looks like the switch..case in that path could be fubar'd by the compiler as there are not break statements for each distinct case label, could this be due to gcc friendly fire? cheers BMS
Re: how to read dynamic data structures from the kernel (was Re: reading routing table)
Luigi Rizzo wrote: do you know if any of the *BSD kernels implements some good mechanism to access a dynamic kernel data structure (e.g. the routing tree/trie, or even a list or hash table) without the flaws of the two approaches i indicate above ? Hahaha. I ran into an isomorphic problem with Net-SNMP at work last week. There's a need to export the BGP routing table via SNMP. Of course doing this in our framework at work requires some IPC calls which always require a select() (or WaitForMultipleObjects()) based continuation. Net-SNMP doesn't support continuations at the table iterator level, so somehow, we need to implement an iterator which can accommodate our blocking IPC mechanism. [No, we don't use threads, and that would actually create more problems than it solves -- running single-threaded with continuations lets us run lock free, and we rely on the OS's IPC primitives to serialize our code. Works just fine for us so far...] So we would end up caching the whole primary key range in the SNMP sub-agent on a table OID access, a technique which would allow us to defer the IPC calls providing we walk the entire range of the iterator and cache the keys -- but even THAT is far too much data for the BGP table, which is a trie with ~250,000 entries. I hate SNMP GETNEXT. Back to the FreeBSD kernel, though. If you look at in_mcast.c, particularly in p4 bms_netdev, this is what happens for the per-socket multicast source filters -- there is the linearization of an RB-tree for setsourcefilter(). This is fine for something with a limit of ~256 entries per socket (why RB for something so small? this is for space vs time -- and also it has to merge into a larger filter list in the IGMPv3 paths.) And the lock granularity is per-socket. However it doesn't do for something as big as a BGP routing table. C++ lends itself well to expressing these kinds of smart-pointer idioms, though.
I'm thinking perhaps we need the notion of a sysctl iterator, which allocates a token for walking a shared data structure, and is able to guarantee that the token maps to a valid pointer for the same entry, until its 'advance pointer' operation is called. Question is, who's going to pull the trigger? cheers BMS P.S. I'm REALLY getting fed up with the lack of openness and transparency largely incumbent in doing work in p4. Come one come all -- we shouldn't need accounts for folk to see and contribute what's going on, and the stagnation is getting silly. FreeBSD development should not be a committer or chum-of-committer in-crowd. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
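One cheap way to approximate such a token, sketched below, is to remember the last key handed out instead of a pointer, much as SNMP GETNEXT does. This is a different and simpler technique than a token that pins a live pointer, and it costs a lookup on every step. Everything here is hypothetical: struct walk_token, walk_next(), and the sorted array standing in for the shared structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical walk token: remembers the last key handed out. */
struct walk_token {
	uint32_t last_key;
	int started;
};

/*
 * Return a pointer to the smallest key in the sorted array keys[0..n)
 * that is greater than the token's last key, or NULL at the end.
 * Entries may be inserted or deleted between calls; the token never
 * holds a pointer into the array, so it cannot go stale.
 */
const uint32_t *
walk_next(struct walk_token *tok, const uint32_t *keys, size_t n)
{
	size_t lo = 0, hi = n;

	assert(tok != NULL && (keys != NULL || n == 0));
	if (tok->started) {
		/* Binary search for the first key > last_key. */
		while (lo < hi) {
			size_t mid = lo + (hi - lo) / 2;

			if (keys[mid] <= tok->last_key)
				lo = mid + 1;
			else
				hi = mid;
		}
	}
	if (lo == n)
		return (NULL);
	tok->last_key = keys[lo];
	tok->started = 1;
	return (&keys[lo]);
}
```

The trade-off is that the walker sees a moving view rather than a snapshot: an entry deleted mid-walk is skipped, and one inserted ahead of the cursor is picked up.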
Re: reading routing table
Debarshi Ray wrote: Why don't you just use XORP's FEA code? It already does all this under a BSD-type license. I was not aware of it. What does it do? Is it portable across other OSes or is it *BSD specific? XORP's FEA process is responsible for talking to the underlying forwarding plane. It supports *BSD, Linux, MacOS X, and Microsoft Windows. Over the last year there was a refactoring where the forwarding table management got split into plugin-like modules. It is written in C++ although it's likely this split might make integration into other projects easier. Normally that support all goes into a single process, rather than being linked into many. cheers BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: reading routing table
Debarshi Ray wrote: ... I was going through the FreeBSD and NetBSD documentation and the FreeBSD sources of netstat and route. I was surprised to see that while NetBSD's route implementation has a 'show' command, FreeBSD does not offer any such thing. Moreover it seems that one can not read the entire routing table using the PF_ROUTE sockets and RTM_GET returns information pertaining to only one destination. This surprised me because one can do such a thing with the Linux kernel's RTNETLINK. Is there a reason why this is so? Or is reading from /dev/kmem the only way to get a dump of the routing tables? You want 'netstat -rn' to dump them, this is a very common command which should be present in a number of online resources on using and administering FreeBSD so I am somewhat surprised that you didn't find it. P.S. Look in the sysctl tree if you need to snapshot the kernel IP forwarding tables. You can use kmem, but it is generally frowned upon unless you're working from core dumps -- kernels can be built without kmem support, or kmem locked down, etc. cheers BMS
Re: reading routing table
Debarshi Ray wrote: I am implementing a library/utility which basically encompasses the features of the traditional route utilities and those of newer tools (like ip from iproute2), which are mostly specific to a particular kernel. The overpowering objective is to make the library/utility work uniformly across all different kernels, so that programs like NetworkManager have a portable library/utility to use instead of the Linux-kernel specific ip which is now being used. Why don't you just use XORP's FEA code? It already does all this under a BSD-type license. cheers BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: [CFT/R] IPv4 source address selection
Bjoern A. Zeeb wrote: Hi, I have a patch, that was inspired by work from Y!, to do proper IPv4 source address selection for unbound sockets (with multi-IP jails). Hi, This kinda overlaps with some other ideas I'd like to see go in. It looks good and if it's already been tested, it should probably go in anyway as it disentangles the logic and puts it in a separate function. I'm thinking we may wish to use criteria other than interface or jailed socket to select source address. I should point out though that we picked some stuff up from KAME to do source address selection but it's not in the IPv4 stack. cheers BMS
Re: Code review request
M. Warner Losh wrote: I've been shepherding this patch in my p4 tree for a long time. It removes the obsolete support for other systems in if_spppsubr.c. Is there a reason I shouldn't commit this? Looks fine to me.
Re: Small patch to multicast code...
[EMAIL PROTECTED] wrote: Somehow the data that the device needs to do the proper checksum offload is getting trashed here. Now, since it's clear we need a writable packet structure so that we don't trash the original, I'm wondering if the m_pullup() will be sufficient. If it's serious enough to break UDP checksumming on the wire, perhaps we should just swallow the mbuf allocator heap churn and do the m_dup() for now, but slap in a big comment about why it's there. BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Small patch to multicast code...
[EMAIL PROTECTED] wrote: I gather you mean that a fast link on which also we're looping back the packet will be an issue? Since this packet is only going into the simloop() routine. We end up calling if_simloop() from a few "interesting" places, in particular the kernel PIM packet handler. In this particular case we're going to take a full mbuf chain copy every time we send a packet which needs to be looped back to userland. I was actually hoping, as the person who last hacked this code, that you might have a suggestion as to a "right" fix. It's been a while since I've done any in-depth FreeBSD work other than hacking on the IGMPv3 snap, and my time is largely tied up with other work these days, sadly. It doesn't seem right to my mind that we need to make a full copy of an mbuf chain with m_dup() to workaround this kind of problem. Whilst it may suffice for a band-aid workaround, we may see mbuf pool fragmentation as packet rates go up. However we are now in a "new world order" where mbuf chains may be very tied to the device where they've originated or to where they're going. It isn't clear to me where this kind of intrusion is happening. In the case of ip_mloopback(), somehow we are stomping on a read-only copy of an mbuf chain. The use of m_copy() with m_pullup() there is fine according to the documented uses of mbuf(9), although as Luigi pointed out, most likely we need to look at the upper-layer protocol too, e.g. where UDP checksums are also being offloaded. Some of the code in the IGMPv3 branch actually reworks how loopback happens i.e. the preference is not to loop back wherever possible because of the locking implications. Check the bms_netdev branch history for more info. cheers BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Small patch to multicast code...
[EMAIL PROTECTED] wrote: The only thing i can think of is that it's the UDP checksum, residing beyond hlen, which is overwritten somewhere in the call to if_simloop -- in which case perhaps a better fix is to m_pullup() the udp header as well ? It is the checksum that gets trashed, yes. ... The m_*() routines actually have reasonable comments, it just seems the wrong one was used here. Actually, m_copy() has been legacy for some time now -- see comments. I'd be concerned that the change to m_dup() (which makes a full mbuf chain copy) rather than m_copym() (which bumps refcounts) is going to eat into the mbuf clusters on fast links, though it's an easy band-aid for the problem. I agree with Luigi that some of the API contract for mbuf(9) doesn't hold any more now that we have TSO and other offload. cheers BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
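The difference between the two kinds of copy can be modelled in userland. The sketch below is an analogy only and does not use the mbuf(9) API; struct storage, struct view, and the view_*() helpers are invented for this example. view_share() plays the role of a reference-bumping copy and view_dup() the role of a full copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define STORAGE_MAX	64

/* Hypothetical backing store shared by one or more views. */
struct storage {
	int refs;
	size_t len;
	unsigned char bytes[STORAGE_MAX];
};

/* Hypothetical handle onto a backing store. */
struct view {
	struct storage *st;
};

/* Allocate fresh storage holding a private copy of src. */
struct view
view_new(const void *src, size_t len)
{
	struct view v = { NULL };

	assert(src != NULL);
	if (len > STORAGE_MAX)
		return (v);
	v.st = malloc(sizeof(*v.st));
	if (v.st == NULL)
		return (v);
	v.st->refs = 1;
	v.st->len = len;
	memcpy(v.st->bytes, src, len);
	return (v);
}

/* Cheap copy: bump the reference count, share the bytes. */
struct view
view_share(struct view v)
{
	assert(v.st != NULL);
	v.st->refs++;
	return (v);
}

/* Full copy: new storage, costs an allocation per call. */
struct view
view_dup(struct view v)
{
	assert(v.st != NULL);
	return (view_new(v.st->bytes, v.st->len));
}

/* Drop one reference; free the storage when the last one goes. */
void
view_free(struct view v)
{
	if (v.st != NULL && --v.st->refs == 0)
		free(v.st);
}
```

Writing through a shared view changes what the original holder sees, which is the shape of the trashed-checksum problem; writing through a duplicated view does not, at the price of the extra allocation discussed above.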
Re: BPF problems on FreeBSD 7.0
Robin Sommer wrote: Hi all, we're seeing some strange effects with our libpcap-based application (the Bro network intrusion detection system) on a FreeBSD 7-RELEASE system. As the application has always been running fine on 6.x, we're wondering whether this might be triggered by any of the changes that went into 7. ... I'm wondering whether anybody here has seen something similar or might have an idea where to start looking for the cause. Any ideas? One place to start might be: netstat -B output in 7.x (I *think* this got MFCed), this will let us see what the drop count is for the Bro process, and what the flags are for the open BPF descriptors in the system. I'm not hot on current BPF internals, but I hazard a guess this is related to BPF descriptor buffering -- an area where there have been changes, some of which I've eyeballed. cheers BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: HEAD UP: non-MPSAFE network drivers to be disabled (was: 8.0 network stack MPsafety goals (fwd))
Robert Watson wrote: An FYI on the state of things here: in the last month, John has updated a number of device drivers to be MPSAFE, and the USB work remains in-flight. I'm holding fire a bit on disabling IFF_NEEDSGIANT while things settle and I catch up on driver state, and will likely send out an update next week regarding which device drivers remain on the kill list, and generally what the status of this project is. Goliath needs to get stoned, it's been a major hurdle in doing IGMPv3/SSM because of the locking fandango. I look forward to it. [For those who ask, what the hell? IGMPv3 potentially makes your wireless multicast better with or without little things like SSM, because of protocol robustness, compact state-changes, and the use of a single link-local IPv4 group for state-change reports, making it easier for your switches to actually do their job.] ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Route messages
Paul wrote: Get these with GRE tunnel on FreeBSD 7.0-STABLE FreeBSD 7.0-STABLE #5: Sun May 11 19:00:57 EDT 2008 :/usr/obj/usr/src/sys/ROUTER amd64 But do not get them with 7.0-RELEASE Any ideas what changed? :) Wish there was some sort of changelog.. # of messages per second seems consistent with packets per second on GRE interface.. No impact in routing, but definitely impact in cpu usage for all processes monitoring the route messages. RTM_MISS is actually fairly common when you don't have a default route. Messages which get enqueued don't necessarily get delivered -- and very few processes actually listen to the routing socket actively like this, so I wouldn't worry about it. If it's a real concern for you then you could try hacking in a sysctl to tell the radix trie code not to issue RTM_MISS messages on the routing socket. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: [Removal of mrouted in FreeBSD-7.0]
Archimedes S. Gaviola wrote: ...if ever there's a way to implement IP multicasting with PIM-SM and or PIM-DM in the FreeBSD base system, how big is the work would be? What are the things that needs to be considered if we are going to implement PIM-SM and or PIM-DM to the current FreeBSD network subsystem? The goal is to be able FreeBSD to provide native IP multicast using PIM just like the way DVMRP protocol is implemented before as part of the base system. I really think the remit of multicast routing is too wide to be addressed in the base system, which is why projects like XORP and pimdd exist -- it doesn't strike me as a good fit for the FreeBSD base system. Separate projects already exist for this. If someone is willing to commit to all the man-hours involved in the reimplementation and ongoing support of such a thing, blimey... they must have a lot of free time! cheers BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Probable Bug in tcp.h
Marc Lörner wrote: off0 is 0x14 => no problem with that but address of ip is 0xe00021c8706e => not correct aligned to 32-bits Can anyone tell me, where ip is allocated, so I can do a little bit more research? It really depends on the context! That's a very wide ranging question. It depends upon whether mbuf chains are flowing up or down the stack, whether or not the network driver supports checksum or header/segment offload, and whether or not it is using zero-copy. Zero copy transmit normally only has mmu cost if the mbuf (from userland) can be mapped to a location where headers are easily prepended. Zero copy receive is more expensive and complex as it requires that the DMA engine on the network interface card supports header splitting. The FreeBSD stack is known to have some issues with mbuf alignment and architectures other than those in its Tier 1. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: [Removal of mrouted in FreeBSD-7.0]
Archimedes S. Gaviola wrote: Hi! I have just read from the FreeBSD-7.0 release notes http://www.freebsd.org/releases/7.0R/relnotes.html that the mrouted multicast routing protocol (DVMRP implementation) has been removed from the base system. I want to know what multicast routing protocol will serve as replacement to this? The KAME snap kit have PIM-SM and PIM-DM implementations but are specific only to IPv6. DVMRP is something of a legacy protocol now, most deployments use PIM-SM. mrouted is still available in ports, as other folk have pointed out. If you want a freely available router with full multicast capability, please give XORP a try.
Re: Probable Bug in tcp.h
Marc Lörner wrote: th_x2 and th_off are created as a bitfield. But C-Standard says that bitfields are accessed as integers => 4-bytes On itanium integers are read with ld4-command but the address of th_x2/th_off may not be aligned to 4-bytes => we get an unaligned reference fault. If we'd change to 1 byte-accesses => I won't get any misaligned faults anymore. It's worth noting that Linux implements its version of tcphdr using a 32-bit-wide bitfield and the TCP header flags live there as bits instead of as integer quantities. I think it should be OK to change the u_int to a uint8_t as NetBSD has. The problem with bitfields in "signed char" is that they can become unintentionally sign extended on a read, and for many years compilers only supported "char", not "unsigned char". Does anyone see a reason why we should not make this change? ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
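For comparison, the data offset can also be read and written without bitfields at all, using byte accesses with shifts and masks. This sidesteps the question of what access width the compiler picks for a bitfield. It is a sketch of an alternative, not the change proposed above, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Byte 12 of a TCP header holds the data offset in the high nibble
 * and reserved bits in the low nibble (RFC 793).
 */
#define TCP_OFF_BYTE	12

/* Hypothetical accessor: header length in bytes, 0 if truncated. */
unsigned
tcp_data_offset_bytes(const uint8_t *tcp, size_t len)
{
	assert(tcp != NULL);
	if (len <= TCP_OFF_BYTE)
		return (0);
	return ((unsigned)(tcp[TCP_OFF_BYTE] >> 4) * 4);
}

/* Hypothetical setter: store the offset in 32-bit words, keep the low nibble. */
void
tcp_set_data_offset(uint8_t *tcp, size_t len, unsigned words)
{
	assert(tcp != NULL && len > TCP_OFF_BYTE && words <= 15);
	tcp[TCP_OFF_BYTE] =
	    (uint8_t)((words << 4) | (tcp[TCP_OFF_BYTE] & 0x0f));
}
```

Only single bytes are touched, so the header pointer needs no particular alignment.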
Re: Probable Bug in tcp.h
Marc Lörner wrote: .. First of all I have the problem of misalignment of th_off. Because in this way always 4 bytes are read and the bits of th_off are replaced. Then the 4 bytes are written back. But should (th_x2 and th_off) not only be 1 byte in whole -> only read and write 1 byte? Which machine architecture are you attempting to compile this code on? On FreeBSD Tier 1 platforms, the access is probably going to come out of L2 cache anyway, so the fields in question will be read by a burst cycle. It is worth noting that NetBSD changed the base type of tcphdr's bitfields to uint8_t, however this shuffles the compiler dependency into the treatment of the "char" type. Most modern C compilers support "unsigned char".
Re: Understanding the interplay of ipfw, vlan, and carp
Peter Jeremy wrote: Note that one downside of your carpdev patches is that (AFAIK) it is no longer possible to identify which host sent the packet: The source and destination MAC addresses, as well as the destination IP address are all defined by CARP. Once you change the source IP address to be the shared address there's nothing to identify which host sent it. If you really, really wanted to, you could write code to prepend the original IP or MAC as an experimental IP option. Options with a type value below 0x80 have the copied flag clear, so they are carried only in the first fragment when a datagram is fragmented. I can understand why you'd want to do this (debugging springs to mind), though it does go against the gist of what carp is and does. Also, there is compatibility to keep in mind, and it's entirely possible that the presence of a new and unknown IP option is going to break implementations which don't parse IP option headers correctly, or trigger other unwanted behaviour ("I don't know what this IP option is therefore I will drop it").
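The 0x80 threshold comes from the layout of the IPv4 option type octet in RFC 791: bit 7 is the copied flag, bits 6-5 the class, bits 4-0 the number. A short sketch of picking those fields apart; the helper names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if the option is copied into every fragment. */
int
ipopt_copied(uint8_t type)
{
	return ((type & 0x80) != 0);
}

/* Option class: 0 is control, 2 is debugging and measurement. */
unsigned
ipopt_class(uint8_t type)
{
	return ((type >> 5) & 0x03);
}

/* Option number within its class. */
unsigned
ipopt_number(uint8_t type)
{
	return (type & 0x1f);
}

/* Hypothetical constructor for a type octet from its three fields. */
uint8_t
ipopt_make(int copied, unsigned cls, unsigned number)
{
	assert(cls <= 3 && number <= 31);
	return ((uint8_t)((copied ? 0x80 : 0) | (cls << 5) | number));
}
```

Record Route (type 7) has the copied flag clear, while Loose Source Route (type 131, 0x83) has it set. An option meant to identify the sending host on every fragment would need the flag set.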
Re: Anyone interested in HDLC support for pppd ?
[EMAIL PROTECTED] wrote: Hello; I started playing a bit with net/pppd23 and I noticed there are some patches for FreeBSD-3.0 that were never committed (NetBSD certainly has them). Our pppd(8) is derived from the "samba" pppd port and should have them if we want to continue updating it. Ed Schouten is currently rewriting the tty code. It sounds like line disciplines are about to go away, so pppd23 will most likely stop working at that point. There's a Netgraph node ng_cisco which claims to support HDLC. Perhaps tweaking MPD to work with it is a better use of effort. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: [Regarding FreeBSD and RFC Compliance]
Dalibor Gudzic wrote: Any pointers for someone that wishes to do it? http://wiki.freebsd.org/NetworkRFCCompliance ...is one place to start...
Re: [GSoC - tcptest] - Regression Tests, Conformance Tests...
Victor Hugo Bilouro wrote:
>> I've made a lot of changes to it; diffs are with him but I can send folk a copy of my Mercurial repo.
>
> I would appreciate that.

Sent (off-list).

As an example of the new PCS syntax and expect() stuff, I'll forward you the IGMPv2 test off-list. (Also sent.)

> Hmm, tracking state is needed to make TCP tests.

It is something you'll have to build yourself around the expect() functionality.

The experimental IP reassembly code (in pcs/packets/ipv4sar.py) might be a good place to start. It isn't finished, but it should demonstrate the general principles -- i.e. you read packets in a loop and you pass them to an object which knows what to do, in this case, ipv4sar.

One big problem I had was that the concept of fragmentation requires deep copies of PCS objects. I imagine that's less of an issue for TCP segmentation, as the situation is made somewhat easier by the fact you're dealing with streams.

BTW: My snapshot of PCS fixes the IP and TCP option parsers. If you look at the IGMP and DHCP decoders, there is an example of a dictionary-driven option parser. This could also be applied to TCP, where it's likely to be useful.

I believe most of the bugs have been shaken out of expect(). The main problem is buffering and the fact that expect() depends on non-blocking I/O. pcap can return more than one packet from the kernel every time you call into the non-blocking dispatch function, so I did some internal refactoring to allow expect() to deal with that.

So your code has to be able to deal with multiple matches from the Connector, even if you only asked to match at *least* one packet. "Count" is mostly about stopping expect() from hanging the flow of control anyway.

The syntax and semantics are intentionally similar to PExpect for Python. In fact the IGMPv2 test uses PExpect to drive a QEMU virtual machine encapsulated as a Python object, for regression testing the IGMP code. So my suggestion is to check out PExpect too.

> I didn't find his site, can you send me?

http://www.fsmware.com/freebsd/syntest2.py

I've added some Scapy-like syntax to PCS which can make the code look a bit smaller.

cheers
BMS
___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: [GSoC - tcptest] - Regression Tests, Conformance Tests...
Victor Hugo Bilouro wrote:
> Hi, I'm in the architectural phase of tcptest* development, so I need to understand every possible test it will need to cover, because it would change the tcptest architecture.

Hey, have you seen gnn's PCS toolkit? http://pcs.sourceforge.net/

I've made a lot of changes to it; diffs are with him but I can send folk a copy of my Mercurial repo.

I wrote a set of IGMPv2 and IGMPv3 baseline regression tests using it, now that I've added things like expect(), etc. It might save you a lot of work, although the TCP stuff needs attention. With expect() you can track state between segments. I started on IP reassembly, but haven't finished.

I think Kip Macy's been using it for testing too; I saw a chunk of PCS-using TCP code on his site the other day.

cheers
BMS
Re: [Regarding FreeBSD and RFC Compliance]
Archimedes S. Gaviola wrote:
> To Whom It May Concern:
>
> Good day! Is there any document or web site that lists all the standard Requests for Comments (RFCs) for all the networking protocols currently implemented on FreeBSD? This would help users identify which specific sections of a standard a given network protocol implements, especially for interoperability with other platforms.

No, want to compile one and contribute it to the project? We'd be very grateful for the help.

cheers
BMS
Re: if_var.h micro-optimization
rihad wrote:
> Bruce M. Simpson wrote:
>> It could save dirtying an L2 data cache line at the expense of taking a conditional branch,
>
> Whoa, why don't you take it easy on me :) I'm not that much into kernel (or hardware) programming. It's just that reading Ch. 3 of TCP/IP Illustrated Vol.2 by Rich Stevens got me digging around FreeBSD source code dealing with struct ifnet, where this piece of code caught my attention.

It could be red, it could be yellow. It could be 620nm. Who am I to say what is and what isn't? ;-)

There are bound to be situations where the change is a win, and even some where it isn't. Context is everything...

>> but to evaluate your suggested change requires a lot more data. Do you plan to do this?
>
> Perhaps there is already a framework for trying out changes in -CURRENT and seeing their relative impact, so perhaps someone more experienced than I am can see to this?

All educators are busy right now, please hold and the next available dogma merchant will be with you as soon as possible. ;-)

(Hint: No, there isn't a framework I know of, unless you wanna make one? Scientific process applies, reproducible results, etc. You could script stuff, figure out a way to run the kernel or parts of the network stack under Valgrind so it can be L2 profiled w/o running it on a real machine... or hack hwpmc so it can be done live.. anything is possible.)

>> Given how _IF_DEQUEUE() is normally used the impact is likely negligible.
>
> Oh, I see. A nice first attempt of mine anyway ;) Thanks.

Don't take my word for it, down that road lies darkness.

Seriously though -- it's easy to introduce bugs doing things like this; if nothing else it's an exercise in really thinking things through.

cheers
BMS
Re: if_var.h micro-optimization
rihad wrote:
> Not sure if this is a worthwhile optimization? FreeBSD 7.0
>
> --- /usr/src/sys/net/if_var.h 2007-12-07 09:46:08.0 +0400
> +++ if_var.h 2008-05-30 18:10:25.0 +0500
> @@ -282,7 +282,8 @@
>       if (m) { \
>           if (((ifq)->ifq_head = (m)->m_nextpkt) == NULL) \
>               (ifq)->ifq_tail = NULL; \
> -         (m)->m_nextpkt = NULL; \
> +         else \
> +             (m)->m_nextpkt = NULL; \
>           (ifq)->ifq_len--; \
>       } \
>   } while (0)

It could save dirtying an L2 data cache line at the expense of taking a conditional branch, but to evaluate your suggested change requires a lot more data. Do you plan to do this?

Given how _IF_DEQUEUE() is normally used the impact is likely negligible.
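For readers who want to see what the patch in the message above changes, here is a minimal user-space sketch of the two dequeue variants. It assumes simplified stand-ins for the kernel structures: `struct pkt`, `struct pktq`, the function names and the `nextpkt_stores` counter are all hypothetical and exist only for illustration; the real code is the `_IF_DEQUEUE()` macro in if_var.h. The point it tries to show is that when the queue drains, `m_nextpkt` is already NULL, so the patched variant can skip the store without changing the result.

```c
#include <stddef.h>

/*
 * Hypothetical stand-ins for struct mbuf and struct ifqueue; only the
 * fields touched by _IF_DEQUEUE() are modelled.
 */
struct pkt {
	struct pkt *m_nextpkt;
};

struct pktq {
	struct pkt *ifq_head;
	struct pkt *ifq_tail;
	int ifq_len;
	int nextpkt_stores;	/* instrumentation only: writes to m_nextpkt */
};

/* Append a packet to the tail of the queue. */
void
pktq_enqueue(struct pktq *q, struct pkt *m)
{
	m->m_nextpkt = NULL;
	if (q->ifq_tail == NULL)
		q->ifq_head = m;
	else
		q->ifq_tail->m_nextpkt = m;
	q->ifq_tail = m;
	q->ifq_len++;
}

/* Stock behaviour: always clear m_nextpkt on the dequeued packet. */
struct pkt *
pktq_dequeue_stock(struct pktq *q)
{
	struct pkt *m = q->ifq_head;

	if (m) {
		if ((q->ifq_head = m->m_nextpkt) == NULL)
			q->ifq_tail = NULL;
		m->m_nextpkt = NULL;
		q->nextpkt_stores++;
		q->ifq_len--;
	}
	return (m);
}

/* Patched behaviour: skip the store when m_nextpkt is already NULL. */
struct pkt *
pktq_dequeue_patched(struct pktq *q)
{
	struct pkt *m = q->ifq_head;

	if (m) {
		if ((q->ifq_head = m->m_nextpkt) == NULL) {
			q->ifq_tail = NULL;
		} else {
			m->m_nextpkt = NULL;
			q->nextpkt_stores++;
		}
		q->ifq_len--;
	}
	return (m);
}
```

Both variants leave the dequeued packet with a NULL `m_nextpkt`; the patched one only saves the write for the last packet in the queue, which is consistent with the "likely negligible" assessment above. Whether the extra branch costs more than the saved store is exactly the measurement question raised in the reply, and this sketch does not answer it.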
Re: HEAD UP: non-MPSAFE network drivers to be disabled
Julian Elischer wrote:
> While this is a good idea on its own, the difference between what that achieves and what a line discipline achieves is that a line discipline is hardware independent and can even be used on a virtual device.

I was under the impression that the back-end for UART was lightweight enough that it could be used as a virtual device.

For example: Many years ago I tried to get the WinModem working in my IBM ThinkPad T23. UART lends itself well to being a wrapper for the DSP microcode without having any of the historical tty baggage.

In the case of UART the "translation shim" moves from on top of the device node to underneath, in much the same way as has happened for GEOM.
Re: lagg0.2 style vlans on lagg(4) interface
Hi, It looks like this patch will cause gratuitous ARP to be queued even when the interface is not IFF_UP, is this intentional? Niki Denev wrote: I think arp_gratuit() needs a better name. arp_announce() ? Is if_ethersubr.c:ether_ifattach() good place to register the EVENT hook? ARP is also used by FDDI and IEEE 802.5, as well as anything which emulates this. Taking the call to arp_ifinit() out of if_setlladdr() is likely to break this code. And if yes, what would be the best way to handle failure to register the hook, as the function is void? Should I worry about that, or just print a warning message and continue? I see the C++-style comments - perhaps someone who knows event handlers better than I can comment, I believe it's using one of the shared kernel malloc pools with M_WAIT. It looks like this won't run afoul of locking, but it is a change to a fairly central path which needs to be considered carefully as it affects consumers other than Ethernet drivers. cheers BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: carp oddness... BACKUP is ARPing!
Rudy wrote:
> The CARP in BACKUP is arping... why?

Without looking at the carp code, I can tell you that its addressing hook is implemented as a pass-through in ether_input().

carps are not IFT_ETHER, therefore they shouldn't emit gratuitous ARP or otherwise when an address is configured on one. So I'll leave this up to someone who knows the carp code, as this is most likely where the ARP originated from.

cheers
BMS
Re: Proposed patch to the kernel and to netstat...
[EMAIL PROTECTED] wrote:
> ...
> Please email me comments. I'd like to commit this to HEAD soon. It can't be put into 7 without removing the cluster and mbuf counting, but I might do that as well if there is interest.

People writing servers are going to find the watermark stuff useful.

I'm thinking being able to watch the buffer stats (possibly also in a way which we can graph) for a single socket, given its inpcb or so address, would also be a neat trick...

cheers
BMS
Re: how to identify a PHY?
Volker wrote:
> ...
> In short my original question better reads as "how do I know the kind of phy if no driver has been attached". Can one retrieve that information out of a verbose boot dmesg (from probing messages)?

You can't determine which PHY is in use unless a driver is attached, because it's necessary to attach a driver in order to access the card's MII registers. Same with any other OS.

If no PHY driver attached, but a NIC driver attached, you should see this message:

    device_printf(dev, "MII without any PHY!\n");

It sounds like someone needs to instrument the code path mii_phy_probe() to print useful information in the situation you describe.

cheers
BMS
Re: how to identify a PHY?
Marius Strobl wrote:
> If the system is running, the simplest way to identify the PHYs is to check the oui= and model= output of `devinfo -v`. Otherwise boot verbose and check the OUI and model output of ukphy(4).

There's a project for someone in there I'm sure. Linux has mii-tool and mii-diag. Whilst we generally don't need all of the knobs, sometimes it can be useful to dump and poke PHY registers on the MII.

src/sys/dev/mii/miibus_if.m contains the newbus interface definition for miibus, which would be a place to start.

cheers
BMS
Re: Problems with netgraph
Oleksandr Samoylyk wrote:
>> looks like UDP in PPP in GRE
>
> I think so. Should we hope for some progress in this direction in future?

Probably not, unless someone is willing to come up to the table and commit to writing and maintaining a Netgraph node to demux GRE, although this is only shuffling the fanout elsewhere.

If MPD is relying on raw sockets to demultiplex GRE, then this is what it's up against in terms of performance -- repeated acquisitions of the INP sleep lock, and context switches when the socket buffer low water mark is passed. It might have improved slightly in HEAD since the move to rwlocks.

Like udp_input(), rip_input() suffers from the fact that the stack has to deal with delivering datagrams to potentially more than one socket, and there is no intermediate data structure to handle the fan-out -- it walks the entire inp list every time. If you look at the comments in udp_input() it's pretty clear this is a historical weakness in the BSD implementation.

Windows, by the way, forces socket clients to explicitly request reception of broadcast datagrams as of Windows Server 2003, and multicasts are strictly delivered to group members only, which eliminates that problematic loop -- you can always maintain a tree of receivers that way.

I'm happy to review patches if someone else commits to fixing it.

cheers
BMS
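The fan-out cost described in the message above can be illustrated with a toy model. This is only a sketch under stated assumptions: `struct rawpcb` and `raw_fanout()` are hypothetical names, the match is on protocol number alone, and none of the locking or address matching done by the real rip_input() is modelled. It shows why every datagram costs a walk over the whole list when any entry might be a receiver.

```c
#include <stddef.h>

/* Hypothetical, much-reduced stand-in for an inpcb on the raw socket list. */
struct rawpcb {
	struct rawpcb *next;
	int proto;		/* IP protocol the socket was opened with */
	int delivered;		/* datagrams handed to this socket so far */
};

/*
 * Deliver one datagram of protocol 'proto' to every matching socket.
 * Returns the number of list entries examined, which is always the
 * full list length, because there is no index from protocol to
 * receivers and more than one socket may match.
 */
int
raw_fanout(struct rawpcb *head, int proto)
{
	int visited = 0;

	for (struct rawpcb *p = head; p != NULL; p = p->next) {
		visited++;
		if (p->proto == proto)
			p->delivered++;
	}
	return (visited);
}
```

With a few GRE sockets (protocol 47) among many unrelated raw sockets, every GRE datagram still visits every entry. An intermediate structure keyed on the match fields, such as the tree of receivers mentioned above, would bound the work by the number of actual receivers instead.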
Re: IPPROTO_DIVERT and PF_INET6
Julian Elischer wrote:
> actually the divert sockets should really not be in PF_INET
> they could deliver both inet and inet6 packets.
> the sockaddr that they return (and which needs to be read for divert to make sense) could be used to distinguish between them.

Good point. I'd forgotten that they were abusing the fields in sin_zero.

This is not OK for IPv6, although the kludge can still be perpetuated by looking at sa_len and stashing what divert wants at the end of sockaddr_in6.

So there IS a case for making them a separate protocol family if someone's going to do a clean implementation of divert sockets for IPv6.

cheers
BMS
Re: IPPROTO_DIVERT and PF_INET6
Julian Elischer wrote:
> you could implement a whole new protocol family of which there was a single protocol.. divert.

That's sheer overkill for what Edwin needs to be able to do.

We already have a bunch of apps which use divert sockets in the IPv4 space, why should the existing semantics change? Divert sockets are still tied to the transport you instantiate them with, and they have always been a special case anyway depending on where one wishes to draw the lines.

There is no reason per se, that I can see, why the IPPROTO_DIVERT identifier can't just be re-used along with pf_proto_register() for PF_INET6, and I've said this to Edwin off-list. A PROTO_SPACER entry just needs to be added to in6protosw.

I was surprised to learn no-one had gone ahead and actually implemented it already, as there are a few cases in IPv6 which might warrant it (6to4, Teredo etc.)

If I'm missing anything obvious please let me know.

cheers
BMS
Re: Network Patches from -RELEASE to -STABLE 7.0
Paul wrote:
> Is there a list of patches that have been applied to -STABLE since the -RELEASE? I can't seem to find a simple organized list of applied patches (something similar to the linux kernel changelog). I want to know if anything has been fixed or updated in the network area to see if it warrants changing the kernel to -STABLE on a production machine.

This information is typically present in commit messages, or in FreeBSD's release notes.

It's not something which is compiled on an ad-hoc basis; it is specifically compiled on a per-release basis, although you may occasionally see the release engineers updating the release notes for -CURRENT.

Cheers
BMS
Re: multiple routing tables review patch ready for simple testing.
John Hay wrote:
>> You don't need to go to the kernel for this sort of thing unless you specifically need to implement route policy based on which interface(s) a packet came in on.
>
> Yes I know that. But in the world of adhoc wireless mesh networking there are very few non-linux people, so they basically call the shots and use the linux kernel features to the full.

Not true. There's an awful lot going on behind closed doors in the MANET world, and from the sounds of the emanations, they might not be using Linux at all.

> In a sense I can understand them because their stuff also runs on small embedded devices like the linksys wireless boxes, and it needs to scale. The biggest adhoc olsr network is probably the Freifunk one, which has more than 600 wireless nodes, mostly consisting of linksys boxes.

The complexity of any system like that is still there, regardless of whether or not people choose to make it harder to debug code by prematurely pushing it into the kernel.

> On some boxes that are also connected to different kinds of networks, they run a different routing daemon into another fib and by setting the priorities on the fibs, they can decide which daemon's routes have the highest priority. And both routing daemons are happy because the other is not stomping on its feet.

Yes, but this is largely to do with the fact that the Linux netlink socket allows daemons to coexist due to its use of a tag-length-value encoding which captures that information, a different kettle of fish.

The feature you describe is totally possible without adding complexity to Julian's current effort.

cheers
BMS
Re: multiple routing tables review patch ready for simple testing.
Julian Elischer wrote:
> OLSR is an overlay network

Nope -- the express intention was that it could be used for basic IP connectivity, for mobile devices. In OLSR, every node is a potential IP forwarder unless it explicitly advertises itself as being unwilling to forward.

> and any machine that participated must have a split personality. First it must be able to think in terms of the basic local network, and it must be able to think in terms of the world from the perspective of the overlay.

Applying routing policy gets more important at the border.

The OLSR implementation in XORP is intended to give people a means of connectivity between MANET and non-MANET routing domains, by redistributing routes into the OLSR cloud.

I daresay these capabilities will get more important, and relevant, to the MANET picture as time goes on, but it's best to leave them out of the operational picture for now, in my opinion.

cheers
BMS
Re: multiple routing tables review patch ready for simple testing.
John Hay wrote:
> The linux guys seem to have multiple fibs (or whatever they call them) which they can chain together by giving them different priorities. The effect seems to be that a packet will be matched through the highest priority fib to the lowest until a route match is found, and then that route is used. Will something like that be possible?
>
> I came across that kind of use with the olsr guys. They let olsrd twiddle one of the higher priority fibs and then put fallback routes in a lower priority fib. That way olsrd can override a route (even the default route) and when olsrd exits and deletes all its routes, the original ones are still in the lower priority fib and will be used.

XORP already does this without relying on any kernel support. Each routing protocol supplies an origin table of its own. The RIB makes the decision on which route to plumb to the kernel based on administrative distance. When xorp_olsr exits, its origin table is removed, and the winning routes are recalculated.

You don't need to go to the kernel for this sort of thing unless you specifically need to implement route policy based on which interface(s) a packet came in on.
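The selection by administrative distance described in the message above can be sketched in a few lines. This is an illustration only and is not XORP's code or API: the structure, the function names, the `enum origin` values and the distance numbers used in the usage note are all assumptions made up for the example. It shows a user-space RIB choosing one winner per prefix, and the fallback route winning again once a protocol's origin table goes away.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical identifiers for the protocol that supplied a route. */
enum origin { ORIGIN_STATIC, ORIGIN_OLSR };

/* One candidate route for a prefix, as offered by one origin table. */
struct candidate {
	uint32_t prefix;	/* host byte order, for simplicity */
	uint8_t prefix_len;
	uint32_t nexthop;
	enum origin origin;
	int admin_distance;	/* lower value wins */
	int active;		/* cleared when the origin table is removed */
};

/*
 * Pick the candidate that would be plumbed to the kernel for one
 * prefix: the active entry with the lowest administrative distance.
 * Returns NULL if no origin table currently offers the prefix.
 */
const struct candidate *
rib_select(const struct candidate *c, size_t n, uint32_t prefix, uint8_t len)
{
	const struct candidate *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!c[i].active || c[i].prefix != prefix ||
		    c[i].prefix_len != len)
			continue;
		if (best == NULL || c[i].admin_distance < best->admin_distance)
			best = &c[i];
	}
	return (best);
}

/* Remove a protocol's origin table, e.g. when its daemon exits. */
void
rib_remove_origin(struct candidate *c, size_t n, enum origin origin)
{
	for (size_t i = 0; i < n; i++)
		if (c[i].origin == origin)
			c[i].active = 0;
}
```

For instance, with a static default route at an arbitrary distance of 250 and an OLSR default route at an arbitrary distance of 100, `rib_select()` returns the OLSR entry; after `rib_remove_origin(..., ORIGIN_OLSR)` it returns the static one again. The kernel only ever sees the current winner, which is why no chained kernel tables are needed for this case.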
Re: multiple routing tables review patch ready for simple testing.
Julian Elischer wrote:
> what's SSM?

Source-specific multicast, where multicast flows (channels) are identified by both their original source address, and group address. Multicast addresses have no meaning on their own beyond the scope of a single link.

> I haven't changed any of that.. Basically I've kept clear of M/Cast. The way I see it, if you don't define ROUTETABLES=2 (or more) or don't define it at all in your config then you get what you had before and I shouldn't have broken anything.

Cool!

Doing multicast "right" is Hard. Doing it "right" in ad-hoc topologies is Harder.

It makes sense to steer clear of it for now. It can no doubt benefit from the hierarchy offered by multiple FIBs, but again, the policy routing mechanisms don't really exist just now, and things like PIM need changes to encompass it.

They will need to come into existence for the model to work on a macro scale, for the same reason SSM was put on the table.

> I take it from this that you don't have any major complaints as far as what I've done.

No problems here... I haven't tried testing.

I would say though, if we are going to be renaming rtalloc() and friends, that names should really change to be descriptive of what it does. It doesn't "allocate a route", it tries to look up a forwarding table entry, and returns a reference to it.

cheers
BMS
Re: multiple routing tables review patch ready for simple testing.
Bruce M. Simpson wrote:
>> Wouldn't it make sense to treat each alias as on a separate logical interface? Then each logical interface belongs to exactly one FIB. On input you decide which logical interface a packet arrived on by looking at its destination MAC address. That reduces confusion quite a bit, at least in my mind! What does doing more than this buy you?
>
> It doesn't buy anything because there is still no 1:1 mapping -- the link-layer destination address maps to an ifp, and multiple aliases exist on the ifp.

Let me qualify that further:

You are talking about splitting network layer addresses onto their own logical interfaces, with the goal of having a 1:1 mapping for FIB resolution.

This doesn't buy anything, because in IP, the previous hop never encodes the next-hop address it sends to -- it merely performs a lookup and forwards to you; your MAC address is the same for every IP address you have on the link, therefore it is not a unique identifier.

UNLESS you use a separate MAC address for every IP alias which you add, in which case, you are merely pushing the mapping elsewhere in the stack; it actually adds more complexity in this case.
Re: multiple routing tables review patch ready for simple testing.
Bakul Shah wrote:
> 1) A packet arrives on an interface. If this interface is associated with more than one FIB, which FIB does it get given to?

If you only have a single FIB, there is no issue here. If you have multiple FIBs, the decision gets made by the classifier.

> 2) If that decision is taken by a packet 'classifier', isn't it in effect doing the job of a FIB (deciding the next hop, which happens to be a local FIB)? Recall that basically a packet passes from a FIB to another FIB until it gets to its eventual destination.

Up until now, the BSD forwarding code always forwarded packets on the basis of the destination address. In an IP environment this is totally reasonable.

Most implementations work on this basis -- ultimately, there is a fan-out to a collection of tries which hold the prefix information, and there has to be a decision about which trie(s) to use for resolving the next-hop. Linux iproute2 works on this basis more or less.

So the classifier is NOT doing the job of the FIB.

> 3) When a local packet needs to be sent, which FIB gets it? Does setfib decide that? Is there a default FIB?

If you look at Julian's patch, he's added an option to the socket layer to control this. There is a default FIB which is used when no FIB tag exists.

> I believe having to use pf/ipfw will slow things down a bit so the question is what does associating an interface with multiple FIBs buy you?

You only need to pass through pf/ipfw if you wish to source-route packets, or need to apply a forwarding policy decision more complex than the destination field, which is all rtalloc() has historically supported.

If there is any additional latency or slowdown, it's down to how good your matching algorithms are as you enter the classifier.

> Wouldn't it make sense to treat each alias as on a separate logical interface? Then each logical interface belongs to exactly one FIB. On input you decide which logical interface a packet arrived on by looking at its destination MAC address. That reduces confusion quite a bit, at least in my mind! What does doing more than this buy you?

It doesn't buy anything because there is still no 1:1 mapping -- the link-layer destination address maps to an ifp, and multiple aliases exist on the ifp.

You still need a classifier to look at other fields in the message and decide, based on policy, which FIB is used for next-hop resolution.

Tag switching systems avoid the issue by prepending a tag, but of course, what does a packet go through upon entry to an MPLS domain? You guessed it: A classifier.

cheers
BMS
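The split between classifier and FIB argued for in the message above can be made concrete with a small sketch. Everything here is hypothetical and is not the interface of Julian's patch, pf or ipfw: the structures, the function names, and the policy "packets arriving on interface 2 use FIB 1" are assumptions chosen for the example. The classifier may look at any field and only chooses a table; the lookup inside the chosen table still matches on the destination address alone.

```c
#include <stddef.h>
#include <stdint.h>

#define NFIBS 2

struct route {
	uint32_t prefix;	/* host byte order, for simplicity */
	uint32_t mask;		/* contiguous netmask */
	int ifindex;		/* output interface */
};

struct fib {
	const struct route *routes;
	size_t nroutes;
};

struct packet {
	uint32_t src;
	uint32_t dst;
	int rcvif;		/* index of the interface it arrived on */
};

/*
 * Policy decision, made outside the FIBs. Assumed policy: traffic
 * arriving on interface 2 uses FIB 1, everything else the default FIB 0.
 */
int
classify(const struct packet *p)
{
	return (p->rcvif == 2 ? 1 : 0);
}

/*
 * Longest-prefix match on the destination only, within one FIB.
 * Returns the output interface index, or -1 if there is no route.
 */
int
fib_lookup(const struct fib *f, uint32_t dst)
{
	const struct route *best = NULL;

	for (size_t i = 0; i < f->nroutes; i++) {
		const struct route *r = &f->routes[i];

		if ((dst & r->mask) != r->prefix)
			continue;
		/* For contiguous masks, a larger value is a longer prefix. */
		if (best == NULL || r->mask > best->mask)
			best = r;
	}
	return (best != NULL ? best->ifindex : -1);
}

/* Tag the packet with a FIB, then resolve the next hop in that FIB. */
int
forward(const struct fib fibs[NFIBS], const struct packet *p)
{
	return (fib_lookup(&fibs[classify(p)], p->dst));
}
```

The same destination can therefore leave by different interfaces depending on where the packet came in, without the lookup code knowing anything about policy. A linear scan stands in for the tries mentioned above purely to keep the example short.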
Re: multiple routing tables review patch ready for simple testing.
Julian Elischer wrote:
> An interface may however be present in entries from multiple FIBs, in which case the INCOMING packets on that interface need to be disambiguated with respect to which FIB they belong to.

Yes, there is no way the forwarding code alone can do this.

It should not be expected to, and it's important to maintain a clean functional separation there, otherwise one ends up in the same quagmire which has been plaguing a lot of QoS research projects over the years (Where do I put this bit of the system?)

> This is a job for an outside entity (from the fibs). In this case a packet classifier such as pf or ipfw is ideal for the job, providing an outside mechanism for implementing whatever policy the admin wants to set up.

Absolutely. This has been the intent from the beginning.

There is no "one size fits all" approach here. We could put a packet classifier into the kernel which works just fine for DOCSIS consumer distribution networks, but has absolutely no relevance to an ATM backbone (these are the two main flavours of access for folk in the UK).

> I find it is convenient to envision each routing FIB as a routing plane, in a stack of such planes. Each plane may know about the same interfaces or different interfaces. When a packet enters a routing plane it is routed according to the internal rules of that plane, irrespective of how other planes may act. Each plane can only route a packet to interfaces that are known about on that plane. Incoming packets on an interface don't know what plane to go to and must be told which to use by the external mechanism. It IS possible that an interface in the future might have a default plane, but I haven't implemented this.

This limitation seems fine for now.

Users can't be expected to configure the defaults "by default" if they aren't supported, so, if overall the VRF-like feature defaults to off, and there are big flashing bold letters saying "You must fully configure the forwarding plane mappings if you wish to use multiple FIBs", then that's fine by me.

> if you have several alias addresses on an interface it is possible that some FIBs know about some of them and others know about other addresses. New addresses when added are added to each FIB and whatever is adding them should remove them from the ones that don't need it. This may change but it fits in with how the current code works and keeps the diff to a manageable size.

In any event, for plain old IP forwarding, a node's endpoint addresses are used only as convenient ways of referring to physical links.

To back up and give this some detailed background:

For example, 192.0.2.1/24 might be configured on fxp0, and we receive a packet on another interface for 192.0.2.2. When resolving a route, the forwarding code needs to do a lookup to see from where 192.0.2.2 is reachable before the next-hop is resolved in the table. That happens on a per-FIB basis, when the patches are applied -- however the job of tagging input for which FIB is the job of the classifier.

The problems with the above approach begin when an input interface resides in multiple virtual FIBs (no 1:1 mapping), or when you can't refer to it by an address (it has no address -- unnumbered point-to-point link, or addresses do not apply), or when you attempt to implement encapsulation (e.g. GRE, IPIP) in the forwarding layer.

Then, you're reliant on each individual FIB having resolved next-hops correctly. The existing forwarding code already does some of this by forcing the ifp to be set for any route added to the table. This is done implicitly for routes which transit point-to-point interfaces.

BSD has had some weaknesses in this area. It makes implementing things like VRRP particularly difficult, which is why the ifnet approach to CARP was used (the forwarding table gets to see a single ifp); it eliminates a level of possible recursion from that layer of the routing stack.

With multicast, for example, next-hops can't be identified by IPv4 addresses alone. Every forwarding decision has potentially more than one result, and links are referred to by physical link (this could be an ifp, an interface index, a name, whatever), and where messages are forwarded is determined using a link-scope protocol such as IGMP.

There, it's reasonable to expect that the user partitioned off the multicast forwarding planes into separate virtual FIBs, and that the appropriate rules in the classifier are configured.

For SSM, the key (S,G) match has to happen in the input classifier, if one is going to route flows OK using the multiple FIB feature -- the multicast routing daemons have to be aware of it, 'cuz you can't run a separate instance of PIM for every set of flows -- PIM is greedy per-link, a !1:1 mapping problem exists, PIM has no way of telling separate instances apart (no hierarchy in the form of e.g. OSPF areas, and even OSPF won't let you put a link in more than
Re: multiple routing tables review patch ready for simple testing.
Julian Elischer wrote:
>> A general purpose OS is a different beast as it has no physical equivalent of the FIB. It may have multiple routing tables, though, to
>>
>> I think setrib would be a term less likely to cause confusion than setfib even though, in the case of your FreeBSD patches, it's really both.
>
> If we need to change the terminology now is the time.. I asked for comments on terminology before and this is what we came up with.. but once it gets committed it gets set in stone.

The kernel forwarding table is not a RIB. In the past some apps have tried to use it as one. They really shouldn't do that.

There are implementation constraints on the inter-process communication involved (PR_ATOMIC, etc) which make it inherently unsuitable as a place for routing daemons to exchange routes, particularly when the system is under load, or running near load limits, as would be the case with a tightly engineered embedded system.

I understand folk went down that road in the past, as a means to get something up and running quickly as a working demo, or as a hangover from the days when they were the only tools around, but it isn't the way to build a comms infrastructure.

These days general purpose OSes are getting closer to specialised comms equipment in terms of what they can do, but more importantly, so are people's expectations of them -- and thus people's concern about whether or not it works tends to follow.

cheers
BMS
Re: Multiple routing tables in action...
Julian Elischer wrote:
> The interaction with routing daemons is something I don't know enough about. I need someone who knows routing daemons to tell how to correctly tweak code that sends routing events.

As long as it doesn't break anything...

> I think it is possible that events from a particular FIB should only be reported to routing sockets that are associated with that FIB, but I'm not sure about this.

Please look at the Linux rtnetlink socket, they use a tag-length-value protocol for just this reason.

It seems reasonable that PF_ROUTE messages have some kind of filter applied to them until a more complete story can be realised for this. Most PF_ROUTE clients are savvy enough to ignore message types on the socket that they don't understand.

If there is a need to announce route adds and deletes on the socket on a per-fib basis, it seems reasonable to stash it in one of the unused fields (if we've got any of those..urp) and change the rtm_type field for now.

However it does take us further down a route (no pun intended) of incremental growth which has real risk (lack of or insufficiently rich test cases, requirements drift etc) and seems to be incumbent with open source in general.

> This would mean running a separate instance of the routing daemon for each FIB (VRF?). Does this sound right to people?

Sounds crap! You really, really don't want to be doing that if you can avoid it.

Of course a lot of what's out there is not geared up to deal with it (and why would it be?) so it's fine for the time being, but it really, really can't be considered a complete, production-quality solution until the missing parts exist.

cheers
BMS

P.S. I am impressed by the scope and ambition of your work even if I haven't had a chance to digest it fully yet, and I hope that my concern about production quality open source here is not misinterpreted as nay-saying or disapproval by anyone.
Re: kern/122839: [multicast] FreeBSD 7 multicast routing problem
The following reply was made to PR kern/122839; it has been noted by GNATS.

From: "Bruce M. Simpson" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: kern/122839: [multicast] FreeBSD 7 multicast routing problem
Date: Tue, 22 Apr 2008 16:39:01 +0100

[EMAIL PROTECTED] wrote:
> ...
> So, as it seems to me, it is not an em driver problem. I think it is
> impossible that such different drivers as xl, em and msk were broken
> simultaneously and identically.

Without seeing reproducible results, I couldn't comment either way.

> As my colleague says, when we both took a brief look at the source code
> of the em driver, it seems some cards have a hardware filter, and some do
> not. So, if the card's filter is programmed correctly (by the driver),
> multicast works just fine, and if not, we have problems.

Yes. It's regrettable that not all devices support the IFF_ALLMULTI feature, and that it is not supported correctly and consistently where it is supported. For example, wi(4) has never supported IFF_ALLMULTI correctly.

The network stack has no notion that a card with IFF_MULTICAST capability can't support IFF_ALLMULTI. The way to fix it is to add support for emulating it using IFF_PROMISC. This was part of the motivation behind the M_PROMISC change to ether_input() last year, which allows the input path to tell if it received frames which it otherwise wouldn't have. It was largely added to avoid introducing layer 2 loops with vlan(4) and if_bridge(4).

This use of IFF_PROMISC has to be reference counted, however. What would also help in tidying that piece of code up would be to get rid of the special case of carp(4)'s emulated addresses by tying this into a common API.

Unfortunately I don't have free time to actually do this work at the moment, but I am happy to review patches.

> On the latest Intel driver, 6.8.7, we commented out a few lines of the
> code, after which multicast routing started to work correctly. But I
> think it's the wrong way, so I am asking for help again, if someone can
> help me to investigate the source of the problem and fix it the right way.

Tony Ackerman ([EMAIL PROTECTED]) is still listed as em(4) maintainer according to MAINTAINERS, but last I heard Jack Vogel ([EMAIL PROTECTED]) was actually involved. The MAINTAINERS file should probably be updated. It is probably best if you contact Jack about em(4) directly.

Thanks.
BMS
Re: Looking for a bgp stressing tool
Ingo Flaschberger wrote: So we are looking for a tool that injects and verifies packets with faked IPs. We want to generate fake traffic between A-B, A-C and B-C in both directions. The aim is to evaluate the routing capacity of openbgpd/freebsd. We currently didn't find any tool that fits our needs. Do you have any suggestion?

sbgp: you can script this BGP listener/sender. It is hard to find, as it was in the mrtd router package, which is "dead" now. http://www.filewatcher.com/m/mrtd-2.2.2a.tgz.871976.0.0.html

The regression test framework in XORP is driven by a set of Python scripts; I believe it is fully scriptable. It might also be worthwhile adding BGP message support to PCS: http://pcs.sourceforge.net/ I have a lot of patches to go into PCS, but gnn@ is pretty busy right now.

cheers
BMS
Re: Question about ip accounting
Christopher Arnold wrote: Anyone looking at supporting the netfpga card on FreeBSD?

I would love to do that project myself, but my time is scarce right now. I believe there was some interaction between other XORP members and the NetFPGA people, although I don't know if this resulted in any outcome.

cheers
BMS
Strange forwarding issue with tap(4) and if_bridge(4)
Hi,

I noticed a strange issue with tap(4) and if_bridge(4) where the bridge seems not to be forwarding frames. This is on 6.3-RELEASE, btw.

I have this setup where I use the two to bootstrap QEMU virtual machines. Up until now I've been using dhcpd for this. This has only ever worked right for me if I run dhcpd on the bridge interface. However I tried doing it on a second tap, and it worked OK for me.

  qemu /dev/tap0 tap0 -- bridge0 tap1 - [bpf] - dhcpd
  > DHCP discovery broadcasts
  <- DHCP unicast replies    OK

If I run dhcpd on another tap interface, this works OK, but obviously only if I open the matching character device. dhcpd of course uses bpf for injection, not the character device.

HOWEVER: If I try to run my own BOOTP server in userland, on the character device, what happens is this: If I tcpdump, I see the broadcast DHCP discover messages on the tap OK. bpf also sees the unicast replies my code generates. But if_bridge does not forward my traffic, even though the unicast addresses appear to be correct.

  qemu /dev/tap0 tap0 -- bridge0 tap1 - /dev/tap1 - my_bootpd
  > DHCP discovery broadcasts
  X <- BOOTP unicast replies    NOT OK

The BOOTP replies (written to /dev/tap1) do not appear on bridge0 or tap0. They do however appear on tap1. In the first setup, the DHCP replies appear on all interfaces in the bridge, including the bridge.

What, if anything, could I be doing wrong? tcpdump and wireshark report that the BOOTP replies I am generating are well formed. The write semantics I use are identical to those of the QEMU client at the other end. I've ruled out pfil/firewall filters.

Now, as tap1 has been added to a bridge, it is in promiscuous mode -- and because bpf shows the userland-generated frames being sent, I believe the check I added for the destination address in if_tap.c can be ruled out.

The problem occurs even if I add static entries to the bridge's address cache and disable all learning. Both RSTP and STP are disabled.

Thanks for any help you can provide.

cheers
BMS

[P.S. I have noticed that in order to get frames from /dev/tapX, non-blocking reads are necessary. My code is single threaded; I use select() to block it.]
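Editor's sketch of the pattern described in the P.S. above: a descriptor is put into non-blocking mode and a single-threaded program blocks in select() until it becomes readable. This is not the poster's code; the helper names set_nonblocking() and wait_readable() are hypothetical, and the sketch assumes a plain POSIX descriptor rather than anything specific to /dev/tapX.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode; returns 0 on success, -1 on error. */
int
set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags == -1)
		return (-1);
	return (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0);
}

/*
 * Block in select() until fd is readable or timeout_ms has elapsed.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
int
wait_readable(int fd, int timeout_ms)
{
	fd_set rfds;
	struct timeval tv;
	int n;

	/* select() can only watch descriptors below FD_SETSIZE. */
	assert(fd >= 0 && fd < FD_SETSIZE);

	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);
	tv.tv_sec = timeout_ms / 1000;
	tv.tv_usec = (timeout_ms % 1000) * 1000;

	n = select(fd + 1, &rfds, NULL, NULL, &tv);
	if (n < 0)
		return (-1);
	return ((n > 0 && FD_ISSET(fd, &rfds)) ? 1 : 0);
}
```

A caller would typically loop on wait_readable() and then read() one frame per wakeup, treating EAGAIN as "nothing there yet".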
Re: problem in if_tap.c
Maksim Yevmenkin wrote: please try the following patch. if there are no objections, i will commit it

beetle# diff -u if_tap.c.orig if_tap.c
--- if_tap.c.orig 2007-04-05 10:58:39.0 -0700
+++ if_tap.c 2008-04-14 09:42:42.0 -0700
@@ -404,6 +404,7 @@
 	struct ifnet *ifp = NULL;
 	struct tap_softc *tp = NULL;
 	unsigned short macaddr_hi;
+	uint32_t macaddr_mid;
 	int unit, s;
 	char *name = NULL;
 	u_char eaddr[6];
@@ -432,8 +433,9 @@
 	/* generate fake MAC address: 00 bd xx xx xx unit_no */
 	macaddr_hi = htons(0x00bd);
+	macaddr_mid = (uint32_t) ticks;
 	bcopy(&macaddr_hi, eaddr, sizeof(short));
-	bcopy(&ticks, &eaddr[2], sizeof(long));
+	bcopy(&macaddr_mid, &eaddr[2], sizeof(uint32_t));
 	eaddr[5] = (u_char)unit;
 	/* fill the rest and attach interface */

This patch looks good, please commit.

[Unless of course we want the autogenerated MAC to be deterministic for some reason, but given that it comes from a timer, there's not much point in fixing the endianness...]

cheers
BMS
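Editor's sketch of what the patch above achieves, written as a standalone userland function rather than kernel code. The old line copied sizeof(long) bytes to offset 2 of a 6-byte array, which is 8 bytes where long is 64 bits wide; the patched version copies exactly 4. The function name fake_tap_mac() and its parameters are hypothetical, and memcpy() stands in for the kernel's bcopy().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/*
 * Build a fake MAC address of the form 00:bd:xx:xx:xx:unit.
 * ticks32 plays the role of the (uint32_t) ticks value in the patch.
 */
void
fake_tap_mac(uint8_t eaddr[6], uint32_t ticks32, unsigned int unit)
{
	uint16_t hi = htons(0x00bd);

	assert(eaddr != NULL);

	/* Bytes 0..1: fixed prefix 00:bd, in network byte order. */
	memcpy(&eaddr[0], &hi, sizeof(hi));
	/* Bytes 2..5: exactly four bytes, so the copy stays inside eaddr[6]. */
	memcpy(&eaddr[2], &ticks32, sizeof(ticks32));
	/* Byte 5 is then overwritten with the unit number. */
	eaddr[5] = (uint8_t)unit;
}
```

As the reply notes, the middle bytes are copied in host byte order, so they differ between little-endian and big-endian machines; since they come from a timer, that does not matter here.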
Re: IGMPv3 support
Martin Garon wrote: I am looking for a FreeBSD release with IGMPv3 and was surprised to find none. I know the KAME project added support for IGMPv3. Anyone knows why this was not imported back into the current sources? I was wondering if it had anything to do with reliability or rather with business mumbo-jumbo. I am actively working on this right now. Please see the bms_netdev branch in p4 for progress. The code there must be considered pre-alpha, it's a development branch. At the moment I am constructing baseline regression tests to make sure that everything works according to spec. It's harder than it looks as there are a few places where the delta-based vs the SSM API can lead to inconsistency and the specs are not completely unambiguous. cheers BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Howto send a limited broadcast?
tmm wrote: So, can anyone suggest how I can send a limited broadcast (on an interface that has been initalized with an IP and a subnet)? Use the IP_ONESBCAST option and send to the network broadcast address for that subnet. The stack will change it into 255.255.255.255 on output. See man page ip(4) for details. It's a hack, but it's largely due to how the stack has worked historically. BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
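Editor's sketch of the two steps described above: computing the network broadcast address for the interface's subnet, and enabling IP_ONESBCAST on the socket. The helper names are hypothetical. IP_ONESBCAST is documented in FreeBSD's ip(4); on systems that do not define it, the second helper simply reports failure, and the sketch makes no claim about other socket options an application may also need.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Network broadcast address for addr/mask.
 * Both arguments and the result are in network byte order.
 */
in_addr_t
subnet_broadcast(in_addr_t addr, in_addr_t mask)
{
	return (addr | ~mask);
}

/*
 * Ask the stack to rewrite the destination to 255.255.255.255 on output.
 * Returns 0 on success, -1 on error or where IP_ONESBCAST is not defined.
 */
int
enable_onesbcast(int s)
{
#ifdef IP_ONESBCAST
	int on = 1;

	return (setsockopt(s, IPPROTO_IP, IP_ONESBCAST, &on, sizeof(on)));
#else
	(void)s;
	return (-1);
#endif
}
```

With the option enabled, the application would sendto() the address returned by subnet_broadcast() for the chosen interface, for example 192.168.2.255 for 192.168.2.10 with netmask 255.255.255.0.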
Re: Initialising networking protocol
Julian Elischer wrote: Seen ayame? http://www.ayame.org/ looks like a stalled effort.. things stop in 2002

From what I've read of the code, it seems close to KAME and BSD style, and could actually get merged. With a little bit more work, the userland could slot into XORP's BGP implementation. Of course, all this takes time and effort; however, I believe Ayame was a working example of MPLS in NetBSD, so it's as good a place to start as any.

cheers
BMS
Re: Initialising networking protocol
[EMAIL PROTECTED] wrote: Hi All, I am working on implementing MPLS in FreeBSD at the moment. I was wondering if anyone had some links to any references I could use, or recommend any books I can use to help me in that. Failing that, I am struggling with trying to work out how to initialise my MPLS protocol in the netisr stack, so the mpls_input function I am writing is called when an MPLS packet is received. Seen ayame? http://www.ayame.org/ ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
getifaddrs() scalability
Just off the top of my head... ...has anyone run into problems with the scalability of this call? One of the XORP users needs to create >1000 interfaces in Linux, and I'm wondering if any FreeBSD users need to create that many network interfaces. The getifaddrs() call is likely to get slow in that scenario, as it uses a linked list.

cheers
BMS
fxp(4) multicast transmission bug.
Hi,

I am doing some protocol testing, and I just saw something very odd on 6.3-RELEASE. If I try to inject multicast traffic via bpf with fxp, bpf will report that it went out OK; however, it never makes it out onto the wire. I have ruled out firewalls, switches and other layer 2 behaviour.

sysctls look like this:
  dev.fxp.0.int_delay: 1000
  dev.fxp.0.bundle_max: 6
  dev.fxp.0.rnr: 0
  dev.fxp.0.noflow: 1

driver flags look like this when injection is OK:
  fxp0: flags=8943 mtu 1500

driver flags look like this when injection is NOT OK:
  fxp0: flags=8843 mtu 1500

... however, if for any reason the group I'm sending to has been joined by another process or kernel entity, sending is OK.

My understanding of multicast hash filters was that they worked in only one direction -- receive, not send. However, I see from reading the driver that the fxp chip has certain restrictions on how the hash filter is programmed -- the command to do so must come before any other descriptors are queued.

That's all well and good, but sending should "just work". Further reading of the driver suggests that it does nothing special about multicast transmission, so that would seem to point the finger at the driver firmware or the ASIC itself.

If fxp is behaving differently to 99% of hardware out there, surely this needs to be marked in capabilities -- I shouldn't strictly need to program the hash filter to send the traffic, only to receive it. Whilst it's something an application is *likely* to do, it doesn't have to do so by spec, therefore this behaviour is a bug.

Or is there something I'm missing completely here? Comments? Feedback? Suggestions?

cheers
BMS
Re: panic: tcp_addoptions: TCP options too long w/ with TCP_SIGNATURE support
Dontcha just hate broken vendor NAT? Yes, it seems reasonable that SACK is the sacrificial victim. Considering folk normally configure TCP-MD5 between routers which are usually directly connected on the same switch, doing away with SACK should be fine. Funny, I was staring at that define moments ago whilst debugging a totally unrelated piece of code in a different language. Good stuff. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Unbreaking igmp with pf.
Hi all,

Just to follow up on my message last week. If I don't hear further feedback, I am likely to commit code which allows IP Router Alert options through the pf firewall by default. For further background, read on.

cheers
BMS

The lack of support for allowing the IP Router Alert option (henceforth: RA) by default in pf is problematic for the widespread deployment of IGMPv3. It has also bitten some people who have been trying to set up multicast capable routers, even without IGMPv3, as FreeBSD sends RA by default in IGMP and has done since the 3.x era.

Currently, PF has no capability to parse IP options, and defaults to dropping traffic which contains them. In day to day deployment, the most used option is in fact RA.

The meaning of RA is quite simple: all routers on the path must examine the datagram. It is described in RFC 2113. Currently FreeBSD's forwarding plane performs no special processing of RA.

Whilst RA came into existence well after IGMP itself, RFC 3376 extends the notion of IGMP to make the use of RA mandatory. It's reasonable to do this, given that vendor kit is intended to do it. It also helps IGMP snooping switches spot the group joins. It is also used with MPLS and RSVP.

"So what?", I hear you cry. Yes, but if outgoing IGMP is being squelched at the host, it breaks IP multicasting for everything but the most trivial cases (i.e. service discovery at 1 hop, pfsync, etc).

Furthermore... if you don't send IGMP for link-scope groups (224.0.0.0/24), it will break them anyway if the switch is configured to prune link-layer multicast traffic.

Options:
1. Change default in FreeBSD pf import to ip options enabled.
2. Add code to pf to simply allow the RA option by default. [I'm happiest with this one.]
3. Add code to the options path in pf to decode options, if and only if options are allowed, and add a mask specifying the allowed values.

For reference, the IANA list of IP option numbers is here: http://www.iana.org/assignments/ip-parameters ...most of those are never used in practice. RA is.

There are 30 possibilities specified for an 8-bit-wide space; the minimal mask fits in 32 bits; the maximal mask is therefore 256 bits.

There is some overlap between 2 and 3; FreeBSD's kernel only tacks on 4 bytes to the IP header in outgoing router alert traffic, but userland apps may do different things.

So, if I don't hear more feedback from folk, I am likely to commit code which implements option 2.
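Editor's sketch of the "maximal mask" arithmetic in option 3 above: one bit per possible 8-bit option type gives 256 bits, here stored as eight 32-bit words. This is not pf code; the structure and function names are hypothetical. The Router Alert option type value 148 is taken from RFC 2113.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Router Alert option type per RFC 2113 (hypothetical local name). */
#define SKETCH_IPOPT_RA	148

/* One bit per 8-bit option type: 8 words * 32 bits = 256 bits. */
struct ipopt_mask {
	uint32_t bits[8];
};

/* Mark an option type as allowed. */
void
ipopt_mask_allow(struct ipopt_mask *m, uint8_t type)
{
	m->bits[type >> 5] |= (uint32_t)1 << (type & 31);
}

/* Return 1 if the option type is allowed by the mask, 0 otherwise. */
int
ipopt_mask_allowed(const struct ipopt_mask *m, uint8_t type)
{
	return ((int)((m->bits[type >> 5] >> (type & 31)) & 1));
}
```

A mask that is zeroed and then has only SKETCH_IPOPT_RA allowed corresponds roughly to option 2, expressed through the mechanism of option 3.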
Re: 7.0 - ifconfig create is not working as expected?
Eugene Grosbein wrote: On Sat, Mar 29, 2008 at 03:43:44PM -0500, Brooks Davis wrote:

I was using the following command in FreeBSD 6.2:
  # ifconfig lo1 create inet 172.16.16.2 netmask 255.255.255.0
In FreeBSD 7.0 I got an error:
  # ifconfig lo1 create inet 172.16.16.2 netmask 255.255.255.0
  ifconfig: inet: bad value
But it works when split into two commands:
  # ifconfig lo1 create
  # ifconfig lo1 inet 172.16.16.2 netmask 255.255.255.0
Is this expected behavior or should I file a PR?

This is expected. There's some argument it's wrong, but filing a PR is unlikely to cause it to change any time soon.

Why? The same goes for creating a gif tunnel: now I need to invoke ifconfig twice, once for 'create' and once for the other tunnel parameters, whereas for RELENG_6 this works: 'ifconfig gif0 create tunnel 1.1.1.1 2.2.2.2' This breaks existing setups/scripts. This is a POLA issue. Why was it broken?

I don't know why or how this has happened; however, given the complexity of the command line grammar which ifconfig is expected to parse, our choices are limited, unless someone(tm) is willing to come along and implement a full parser in ifconfig.

I investigated this some years ago and frankly didn't get anywhere. One of the constraints was that Sam wanted to modularize the ifconfig code, with a view to future dynamic loading -- as such, this places restrictions on the kind of parser which can be used.

There is a valid argument that we should not do this, as ifconfig is a tool which sits in the base system, and should be kept simple and therefore small. On the other hand, there's also the argument that, as ifconfig's syntax has grown considerably over the years, we should go ahead and add a parser anyway.

In the absence of a full-blown parser, I'm comfortable with "ifconfig create" being a separate operation, which preferably throws an error if other commands are included with it, and I understand why these limitations apply.

cheers
BMS
CALL FOR FEEDBACK: IGMP and PF interoperability
It has come to my attention that the default configuration of PF in FreeBSD will block legitimate outgoing IGMP messages. PF is currently not the default firewall in FreeBSD. Anyone using multicast in any way, even for link-scope multicasts (224.0.0.0/24), will be affected by this issue if they use PF as their firewall.

This issue was described in this thread: http://lists.freebsd.org/pipermail/freebsd-pf/2006-June/002259.html

The documentation does state that allow-opts needs to be specified explicitly -- there is no fine grained control for the IPv4 options actually filtered, however, and currently the IP Router Alert option is handled in the main path in all BSD derived systems.

Please let me know if you have encountered this issue, so that we can get started on a workaround.

cheers
BMS
Re: Frequent pauses with Linux-based router
Sean C. Farley wrote: I have noticed frequent pauses (example[1]) between external systems and 7-STABLE (March 14th) with a Linux-based Netgear DG834G (DSL modem). At first, I thought it was ipfilter or ipnat, but I took those out of the picture by activating telnet on the router and connecting directly to it. Even running "ls /usr/sbin" on the router would pause occasionally. I did not have (or do not recall having) these problems with 6-STABLE (post 6.2). I switched out the NIC (FA-311 (sis) to a FA-310 (dc)) and the cable, and tried different ports on the modem by which to connect. I also tried disabling all RFC sysctls and SACK. Nothing helped. Finally, I brought out an old DSL modem (SpeedStream 5660). This fixed the issue. I think this may be a specific issue between Linux (2.4.17_mvl21-malta-mips_fp_le) and FreeBSD 7. Is there anything else I may test to see what is happening?

OT: Hang on, are you saying you're running a MIPS MALTA targeted Linux kernel on a Netgear DG834G? That would be interesting as a test platform for FreeBSD/mips, considering the platform support for Malta is already there. I had a go at doing the Broadcom Sentry5 SoC last year but hadn't finished anything.

Long shot, but are 802.3 pause frames appearing anywhere, i.e. can you test with a crossover cable? Have you done a BER test with UDP or something like that to try to rule out non-TCP protocols?

cheers
BMS
Re: FYI: inpcb/pcbinfo mutex -> rwlock at some point in the mid-distant future
Robert Watson wrote: One of those issues is that we need to demonstrate to ourselves that exclusive access contention is managed as well with rwlocks as with sleep mutexes, as these locks would continue to be fairly highly contended in TCP. The other issue is that rwlocks don't support full priority propagation for reader access, although Jeff Roberson has recently improved fairness to writers with many readers. Don't forget that p4 bms_netdev contains a number of optimizations for the multicast paths -- there are lock acquisitions which are quite often unnecessary, or whose granularity is too high for the data structure(s) which need to be shared. BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: FBSD 1GBit router?
Willem Jan Withagen wrote: Łukasz Bromirski wrote: Wouldn't it be a case for use of multicast vs unicast? Hardware is always better anyway, so why not invest in some switch that can do unicast/multicast in hardware?

Useful suggestion, only this is going to be in an overlay cloud where we do not have control over all the endpoint networks, let alone that we can get them to use multicast. And even those that use multicast in their last-mile equipment don't always have correct setups. My experience is that multicast is nice in theory and experiment, but when push comes to shove it does not completely deliver.

I have to agree wholeheartedly; for more detail than you can shake a stick at, look here: http://www.cs.ucr.edu/~michalis/COURSES/204-02b/papers/ramalho.html

If you're running over MPLS all bets are off. MPLS is like ATM in the sense that it ain't got no multicast grok, as far as I can fathom, anyway. Label switching is label switching. I never saw any support for the notion of 1:M in the LSPs.

Multicast is more likely to succeed at the moment when you have complete knowledge of the network topology, and IP layer visibility. There are ongoing efforts to address these limitations.

later
BMS
Re: Ephemeral port range (patch)
+1 on increasing the threshold, 1024 is way too low. Also consider the folk who depend on the existing behaviour: a predictable ephemeral port range is useful, if for some reason you need to apply a NAT policy to that traffic, with no other knowledge about how the applications you must NAT actually behave. later BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Looking for a guide to extend|adapting the socket framework for NFCIP-1
Hi, I had to use a search engine to figure out what the acronym NFC was, and I assume you mean this: http://en.wikipedia.org/wiki/Near_Field_Communication It helps if you give more background information when asking a more general audience for feedback. zDen wrote: 1) As the NFC device is attached to the USB or UART port, how and where in the source code can I change the output of the byte-stream packet to the proper physical port? i.e where is the part of the source code that is physical device dependent when doing the I/O calls? You really need to roll your own driver framework for this. Whilst the Bluetooth support sounds like it's the right place to start to look for ideas, you're going to have to write your own layering. I know off the top of my head that the Bluetooth support is able to add its own TTY disciplines to serial devices but I couldn't tell you specifics, as it's not something I meddle with unless I need to. 2) As the protocol family (PF_xx) and address family (AF_xx) of NFC is not define in the socket library, how can I define them and let the default socket() call return a socket with the customized structure? I can see that I may need to use SOCK_RAW as the basic socket framework or any others recommendation? To learn about adding a new socket family to the system, you really need to pick up a copy of TCP/IP Illustrated Volume 2 and read Chapter 15 onwards. It sounds like you have a fairly involved and challenging software project on your hands. I hope you're being funded by someone to do it, it doesn't sound like something a hobbyist would pick up just for the hell of it if it's going to be done properly, i.e. beyond a quick hack for demonstration purposes. cheers BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Routing confusion
Eric Anderson wrote: I guess my biggest question is, why do the IPs .128, .129, .130, .131 appear in the routing tables where they're NOT defined? I don't get it? You are not seeing forwarding table entries. You are seeing ARP entries - the LLINFO flag is set (L). This is a legacy behaviour we haven't done away with just yet. BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"