Call for testers: olsrd and IP_ONESBCAST
Hi, For a while now I have had a patch available to teach olsrd to use IP_ONESBCAST instead of using libnet/bpf just to send broadcast datagrams in FreeBSD, which has had IP_ONESBCAST for a few years now. If anyone is using olsrd on FreeBSD I would greatly appreciate testing and feedback for this patch: http://people.freebsd.org/~bms/dump/olsrd-onesbcast.diff Thanks! BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Interface index hack in IP_ADD_MEMBERSHIP
Yar Tikhiy wrote: Quagga still uses it, too, if its configure script detects FreeBSD or NetBSD. I'm afraid it was me who submitted the patch to the Quagga folks when I'd found that Quagga's ospfd couldn't handle unnumbered P2P interfaces in FreeBSD because their local IPs weren't unique. Unfortunately, Quagga doesn't seem to use the protocol independent part of the RFC 3678 API yet. A preliminary patch for the Rhyolite.com routed is available at: http://people.freebsd.org/~bms/dump/routed.rfc3678.diff The upcoming rewrite of IPv4 multicast host-mode logic (currently in bms_netdev) adds support for the Linux-derived 'struct ip_mreqn' for specifying interface indexes to IP_MULTICAST_IF. The RFC 3678 API is implemented; IGMPv3 and MLDv2 may be hooked in later on subject to available resources. The RFC 1724 hack has been completely removed from the kernel in this spin. The new code passes the existing regression tests for any-source multicast. I hope to have source-specific multicast regression tests in the main tree ASAP; I am very close to a code drop. Whilst the radical approach of rewriting this stuff may break legacy applications, they should probably be updated to support the new APIs anyway, given that Linux 2.6 and Microsoft Windows Longhorn both support RFC 3678. Regards, BMS
Re: Spillover routing?
Rajkumar S wrote: Hi, I have a low cost 128kbps and a high cost 512 kbps link to internet. Is it possible to do a spillover routing so that the high cost link is used only when the low cost link is, say, used more than 80%. This feature is almost certainly not going to be present in the base system. What you would need to do to implement this is to configure a part of the kernel to perform bandwidth measurements and make an upcall to bring up the other link in a dial-on-demand style configuration. Add NAT into the mix and it gets even more interesting. I believe pf+altq may have the potential to do this; however, I could not tell you where to begin configuring it to do so, so I wish you the best of luck in your research. Regards, BMS
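As a rough illustration of the measurement step described above, here is a minimal C sketch of the spillover decision only. It assumes some measurement component supplies a recent throughput figure for the cheap link; the function name, its parameters and the 80% threshold are hypothetical and do not correspond to any existing FreeBSD interface.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether traffic should spill over to the
 * expensive link. measured_bps is the recent throughput seen on the cheap
 * link, capacity_bps its nominal capacity, and threshold_pct the
 * utilisation (0-100) above which the second link should be brought up.
 */
static int
should_spill(uint64_t measured_bps, uint64_t capacity_bps,
    unsigned threshold_pct)
{
	assert(capacity_bps > 0 && threshold_pct <= 100);
	/* Integer form of: measured / capacity > threshold / 100. */
	return (measured_bps * 100 > capacity_bps * threshold_pct);
}
```

With a 128 kbps link and an 80% threshold, the check trips once measured throughput exceeds 102400 bps. A real implementation would probably want some hysteresis so that the second link does not flap around the threshold.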
Source-specific multicast
I am very close to merging support for RFC 3678 to -CURRENT. I will make a patch available before I commit. The only userland consumer in the tree which is likely to be affected by the removal of ip_multicast_if() from the kernel is routed, which I will update to use the new setsourcefilter() API. The SSM code does change some of the coupling between sockets and IGMP, and changes some logic in udp_input; strict multicast membership becomes the default. Systems which deal with many multicast sockets and a lot of multicast traffic may benefit from an additional hash table. I haven't finished touching the raw IP input path. Given current looming commitments I'm open to someone volunteering to finish the work of merging IGMPv3 and MLDv2, or possibly to fund the work. I wish to get at least the socket part of ASM/SSM merged before I come back to Yar's PR with vlan and pfsync, which I have not had reason to investigate thoroughly; I have had no further reports of problems with carp(4) in -CURRENT. Regards, BMS
Re: A radical restructuring of IPsec...
I'm all for this in principle. I believe that the case for FAST_IPSEC over KAME IPSEC is fairly clear for those of us who have read the USENIX paper. Qualitatively speaking I can say FAST_IPSEC has been more pleasant to work with when introducing the TCP-MD5 support. I will try to look at the patch in more detail as time permits. Regards, BMS
Re: IPv6 Router Alert breaks forwarding
I can only speak about IPv4 router alert in detail; we do nothing with IPv4 RA, nor would it appear that doing so would make any real difference in performance given how the code is laid out. RSVP packets should be passed verbatim to userland from ip_input() via rip_input() there. I think your IPv6 fix is good for now but will wait to hear further from [EMAIL PROTECTED]. I am heading out the door, so if someone could add an item for this to http://wiki.FreeBSD.org/Networking I should be most grateful. Regards, BMS
Re: intel 802.11 2200BG routing
Da Rock wrote: So I could use some guidance as to what I can do to rectify this problem. I have 2 goals: 1. set up iwi to start on boot, and attach to my ap whenever it's in range. 2. make sure iwi stays connected without manually monitoring it. 3. prioritise my routes via the rl0 and iwi if's so that cable is used over wifi, but both can be used to access the network. Umm, that's 3 goals. :^) The short answer is, you can't do what you're trying to do, yet. You can cut over without rebooting; you just need to remember to kill off all dhclient processes and manually remove the default route, as in FreeBSD all forwarding entries ('routes') reference an interface pointer, and the PRC_IFDOWN handler will not touch routes marked RTF_STATIC. No one as far as I know has rolled a 'cutover' script. What would be really useful is a port which can do this cutover in a more general way until the stack is changed. This isn't that different from, say, Microsoft Windows, where a manual cutover is needed, although the OS having a multipath FIB ('routing table') helps. The long answer is, it's possible, and it requires some things in the network stack to be carefully reworked. I have looked at these issues in some depth; there are at least 3 items on the Network Stack Wiki which are directly relevant to making the kind of clean cut-over between wireless/wired interfaces possible. Notably: looking at the PRC_IFDOWN handler in netinet, making forwarding entry lookup skip interfaces marked down, and introducing route preference into the routing trie. There are historical reasons why the code is the way it is. It will take a while to get these issues addressed going forward. Regards, BMS P.S. routed isn't going to help you at all in this situation; it's just an implementation of the RIPv2 routing protocol. It may have helped, as the routes it introduces to the kernel are !RTF_STATIC. One thing I haven't tried is IPv4 Router Discovery (rdisc); that may help update the default route quickly.
The problem with this of course is the additional network configuration in the infrastructure itself.
Re: IPFW update frequency
For what it's worth, the code I wrote for XORP is only for IPFW2, and uses its tables feature to atomically transcribe XORP rulesets to IPFW ones before swapping them in. Regards, BMS
Re: Merging rc.d/network_ipv6 into rc.d/netif
Mike Makonnen wrote: I would especially like feedback from folks more familiar with IPv6. One gotcha I've noticed is that if you boot with ipv6_enable turned off, then try to start IPv6 on an interface later on, it doesn't work because none of the interfaces (except lo0) has a link-local address (see rc.d/auto_linklocal). How can we fix this? Also, I would appreciate feedback on how stopping IPv6 on an interface should be handled. In rc.d/network_ipv6 it wasn't handled at all. Currently, it goes through and deletes all IPv6 addresses on the interface. I agree. We should be able to add/remove IPv6 link-local addresses somehow at runtime, after boot, without necessarily bringing up IPv6 on an interface during boot. Based on my experience of making the changes necessary for refcounting of various network stack structures, I am thinking that at some point it may be for the best if some of the code to do with address families is restructured so that the administrator is able to explicitly attach or detach protocol domains, e.g. AF_INET, AF_INET6, to network interfaces on the command line. I'd like to get this fixed going forward, though, as ever, other work takes priority... Regards, BMS
Re: The broadcast of python in FreeBSD
Zhu Yan wrote: When I send the broadcast in FreeBSD with address 255.255.255.255, the packet can not be received by other OS. FreeBSD applications need to use the IP_ONESBCAST option to send all-ones broadcasts. See the ip(4) man page. Regards, BMS
Re: Vrrp/CARP/ucarp Problems
Andrea Venturoli wrote: Jordan Gordeev wrote: The only load balancing that CARP supports, to my knowledge, is ARP level load balancing. From carp(4): The ARP load balancing has some limitations. First, ARP balancing only works on the local network segment. It cannot balance traffic that crosses a router, because the router itself will always be balanced to the same virtual host. Forgive me for stepping in, but I had read the above statement over and over trying to figure what it meant; perhaps it's not so clear... If I understood it correctly it's not saying you should not use CARP on routers. Instead it's meaning that load-balancing won't cross a third router which is on cascade of the two CARP routers. ... Andrea, you are correct. Jordan is pointing out the main limitation of CARP, which is that it operates only within a broadcast domain. I should point out such a feature is out of scope for VRRP, CARP, IPMP or any other Layer 2 IP sharing protocol. However, this behaviour is just fine for load balancing a router, in which case one relies on next-hop reachability anyway. The thing to remember with CARP is that it relies on the ability of the interface to go into promiscuous mode to pick up traffic for its virtual MAC addresses. More modern cards may support more than one station address in hardware, which avoids the need for promiscuous mode processing; however, we don't currently support this hardware feature. If one wishes to load balance across Layer 3 hops (rather than within the same broadcast domain), what one is asking for is a feature like BGP4 Anycast, IPv6 Anycast, or OSPF-based Anycast, which relies on cooperating routers to inject a route into the Layer 3 routing domain for a given 'virtual' IP address. There is a daemon out there which uses the OSPF API in Quagga to flood OSPF domains with virtual host routes for anycasting services using Opaque LSAs, but I forget its name. XORP has the potential to do the same but requires some development effort to do so.
If one wishes to load balance specific requests for an application layer service, one enters the wonderful world of 'middleware' and competing commercial solutions to the problem. And this is where money comes into play... Regards, BMS
Re: kern/19875: A new protocol family, PF_IPOPTION, to handle IP options at socket interface
Synopsis: A new protocol family, PF_IPOPTION, to handle IP options at socket interface
State-Changed-From-To: suspended->closed
State-Changed-By: bms
State-Changed-When: Mon Mar 26 14:36:38 UTC 2007
State-Changed-Why: It is unlikely this code will ever be committed. Reasons:
1) This information can be obtained via cmsg so as to lie out-of-band of protocol data
2) This code is IPv4 specific
3) Most consumers of IP options and router alerts either live in the kernel, or have this information delivered via raw sockets.
http://www.freebsd.org/cgi/query-pr.cgi?pr=19875
Re: GRE with key
Cristian KLEIN wrote: Hello everybody, I am new to FreeBSD kernel hacking, so please excuse my perhaps stupid questions. I would like to add key support to gre(4). I have already been able to use gre(4) with a hardcoded key. The single thing remaining to do is to transfer the key from ifconfig(8). The key is an uint32_t and I haven't found a way to transfer it without modifying ifconfig(8). Excellent. Thanks for volunteering to do this! My question is, which is the BSD-style to achieve the above? Solutions I came up with are as follows: 1) Use SIOCSDRVSPEC / SIOCGDRVSPEC 2) Add SIOCSGREKEY / SIOCGGREKEY 3) [Probably too ugly to be mentioned, but requires fairly few modifications.] Add a sysctl MIB which is read when calling ifconfig ... create. If I were doing this, I would add the code to ifconfig.c where the other tunnel stuff lives, and go for option number 2. Feel free to modify ifconfig to accommodate the new options. Another thing I wanted to ask is, which function of ifconfig(8) should I modify to display the GRE key? Look at how af_status_tunnel() works and consider adding it there. Regards, BMS
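For readers unfamiliar with what the key looks like on the wire, the following is a minimal C sketch of a GRE header carrying only the optional Key field, following the layout in RFC 2890 (K bit 0x2000, 32-bit key after the protocol type). It is not code from gre(4) or ifconfig(8); the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GRE_FLAG_KEY	0x2000	/* K bit: Key field present (RFC 2890) */

/*
 * Hypothetical helper: write a GRE header with only the Key field present
 * into buf, in network byte order. Returns the header length (8 bytes),
 * or 0 if buf is too small.
 */
static size_t
gre_write_keyed_header(uint8_t *buf, size_t buflen, uint16_t proto,
    uint32_t key)
{
	assert(buf != NULL);
	if (buflen < 8)
		return (0);
	buf[0] = (uint8_t)(GRE_FLAG_KEY >> 8);	/* flags, high byte */
	buf[1] = (uint8_t)(GRE_FLAG_KEY & 0xff);	/* flags low byte, version 0 */
	buf[2] = (uint8_t)(proto >> 8);		/* protocol type */
	buf[3] = (uint8_t)(proto & 0xff);
	buf[4] = (uint8_t)(key >> 24);		/* 32-bit key */
	buf[5] = (uint8_t)(key >> 16);
	buf[6] = (uint8_t)(key >> 8);
	buf[7] = (uint8_t)key;
	return (8);
}
```

Calling it with protocol type 0x0800 (IPv4) and key 0x01020304 yields the eight bytes 20 00 08 00 01 02 03 04.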
Re: MPLS implementation
Sam Wun wrote: Hi, Is there any MPLS implementation for FreeBSD? I found a port ayame mpls for netbsd, but the last implementation was dated back to 2003, seems very old. There is NISTswitch, but it is most likely very bit-rotted by now. I would suggest helping Aniruddha Bohra out on the Click port, as it would be a great starting point for prototyping MPLS due to how Click will most likely attach to the kernel forwarding paths. The key to success with MPLS is to learn from the layer 2 forwarding stuff in if_bridge; to integrate cleanly with the Ethernet code; to use ALTQ for the token bucket filter and traffic classification policies; and to not break the regular forwarding path. Regards, BMS
Re: ICMP-floods
I have a patch attached to http://wiki.freebsd.org/Networking to rate-limit ICMP which is generated by the forwarding path. It would be useful to find out if this offers symptomatic relief in this situation, although as Chuck points out, it is most likely being caused by a routing loop. Regards, BMS
Proposal: Merge RFC3678 multicast APIs
Hi, I propose that we merge the RFC3678 advanced multicast APIs. Doing so gets us closer to IGMPv3 and SSM. I would greatly appreciate suggestions about how to deal with the include header issue below. I have already started merging the basic definitions into p4 branch bms_netdev.

Background:
 * RFC3678 specifies user and kernel APIs for any-source and specific-source multicast for IPv4, IPv6, and protocol-independent use.
 * this includes struct ip_mreq_source and friends
 * SIOCSIPMSFILTER and SIOCGMSFILTER are historical and may be ignored.

Impact:
 * It requires that struct sockaddr_storage is visible to netinet/in.h.
 * This change breaks the following files in the kernel: in4_cksum.c inet_ntoa.c ip_ecn.c in6_cksum.c in_cksum.c slcompress.c ...which do not include sys/socket.h where this structure is defined.

Benefit:
 * We get the SSM API. We don't support IGMPv3 or SSM yet, but this is part of the work.
 * Better to do this now and incrementally; the IGMPv3 implementations out there for FreeBSD have been published as patch sets which are now bitrotting.
 * This lets us eliminate the ugly RFC1724 hack from the IPv4 stack, which is used to specify an outgoing IPv4 multicast interface by passing a 24-bit interface index in the host portion of a 0.0.0.0/8 address.
 * This behaviour is not portable; Microsoft Windows Vista uses the full 32-bit wide interface index space in both its IPv4 and IPv6 stack. No snickering from the gallery please -- Dave Thaler has done excellent work bringing the MS stack closer to IETF standards.
 * routed uses this; it can be patched to not do so; the RFC3678 API for this is to use the generic MCAST_JOIN_GROUP socket option which accepts an interface index as an argument in struct group_req.
 * Linux defines a struct ip_mreqn as a workaround for applications using the pre RFC3678 API. Inside the kernel it maps IFA to IFP when handling IP_ADD_MEMBERSHIP, thus avoiding the 0.0.0.0/8 hack.
See ip(4) in HEAD for the polite rendering of my rant about doing IGMP correctly and its implications for addressing in the IPv4 stack (short: you need an IP address for it to work properly, and source address selection, or IPv6, is looking like a really good idea in a wireless/manet/mobile/ad-hoc world). Regards, BMS
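To make the RFC1724 hack mentioned in the proposal concrete, here is a minimal C sketch of how a 24-bit interface index can be packed into, and recovered from, the host portion of a 0.0.0.0/8 address. This only illustrates the encoding as described above; it is not the kernel's ip_multicast_if() code, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical: encode an interface index (1 .. 2^24-1) as 0.x.y.z. */
static struct in_addr
ifindex_to_addr(uint32_t ifindex)
{
	struct in_addr a;

	assert(ifindex > 0 && ifindex <= 0x00ffffffu);
	a.s_addr = htonl(ifindex);	/* top octet stays 0 */
	return (a);
}

/*
 * Hypothetical: return the embedded interface index, or 0 if the
 * address is not inside 0.0.0.0/8 and so is an ordinary address.
 */
static uint32_t
addr_to_ifindex(struct in_addr a)
{
	uint32_t h = ntohl(a.s_addr);

	if ((h >> 24) != 0)
		return (0);
	return (h & 0x00ffffffu);
}
```

Interface index 2 is thus written as 0.0.0.2. The 24-bit limit is the portability problem noted in the list above, and it is one reason the protocol-independent MCAST_JOIN_GROUP option with struct group_req takes a full-width interface index instead.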
Interface index hack in IP_ADD_MEMBERSHIP
Hi, I plan to get rid of the ugly little ip_multicast_if() hack in the IP stack. Before I do, is anyone actually using this? RFC 3678 specifies a protocol independent API for socket group memberships which allows joins on interfaces referenced by index. This is intended to support IGMPv3 and MLDv2. Regards, BMS
Re: Interface index hack in IP_ADD_MEMBERSHIP
Eugene Grosbein wrote: I recall that routed and ripd used to utilize something similar long time ago. I'm not sure if they have switched to another API. You're right -- this would break routed on point-to-point interfaces. They didn't, unless it was updated at the upstream, i.e. rhyolite.com. This means that the RFC1724 hack can't be safely deprecated without breaking this use case, until routed is updated to use the RFC 3678 protocol-independent ASM API. Linux uses a slightly different technique to work around this; ip_mreq is expanded to ip_mreqn internally, and the interface index is explicitly passed around in the kernel. The blocker in the FreeBSD case which prevents us simply adopting this is the source interface selection logic in ip_output(). Regards, BMS
[PATCH] Multicast refcounting in network stack
Hi, A patch against -CURRENT is now available: http://people.freebsd.org/~bms/dump/multi_refcounting.diff This is a fairly sweeping architectural change which should resolve memory leaks and potential panics with the network stack as a whole, to better support interface detach at runtime. I'd like to check it in as soon as possible as it fixes the root cause of the problems we have had with carp and pfsync in our stack. NetBSD has implemented refcounting like this for some time now, so it does not suffer from the same problems. Regards, BMS
Re: networking code and splx()
Ignacio Rey wrote: ... The question is: Have calls to these functions been wrapped? or are they simply not used in this context? splx() and friends have been no-ops since FreeBSD 5.x was branched. Synchronization is now done using other mechanisms such as mutexes and spin locks. See the new man page locking(9) in -CURRENT. Regards, BMS
Re: PMTU Discovery support
Kevin Lahey wrote: The boxes were running FreeBSD-6.1, but I can't really vouch for the particular kernel configuration. It could well be that the problem is with the loose nut behind the wheel, rather than with FreeBSD. :-) I believe PMTU measurements may only be relied upon for active TCP connections, but it's been a while since I read this code. It would be useful if non-TCP drivers such as gre(4) could be extended to perform PMTU discovery and auto-tune their MTU based on this, as manually setting the MTU is a bit random and can result in horrible fragmentation when going across the big-I Internet. I imagine doing this would require changes to the icmp input path and a bit of abstraction. Regards, BMS
Re: [PATCH] Multicast refcounting in network stack
Andre Oppermann wrote: http://people.freebsd.org/~bms/dump/multi_refcounting.diff Patch looks good. :-) Committed, with some changes. Regards, BMS
IPv4, IPv6 and link-layer multicast refcounting in bms_netdev
I have just committed reference counting for multicast structures in p4. Change list number is 116036. This should fix the problems with pfsync and carp since the scalability fixes for IPv4 multicast last September. A further cumulative fix for pfsync is present in this branch. Basic testing with the stock IPv4 and Ethernet code has been performed. Further testing would be much appreciated before the code is merged to HEAD. The refcounting has been implemented in a way so as not to break the 6.x ABI so that it may be merged to STABLE. It would be great to have feedback on how these patches may affect vlan(4), which is the only other consumer of the in_delmulti() KPI. My experience working on this suggests IFF_NEEDSGIANT is a real headache for dealing with ifnets which may potentially go away during the lifetime of the system. Regards, BMS
MFCing rev 1.96 of netinet/in.c for Zeroconf
The change itself is very simple: http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/in.c.diff?r1=1.95&r2=1.96 This change is necessary before IPv4 address scope and source selection policy may be implemented. Does anyone see any potential problems with this? It is possible that there are people out there forwarding between LANs with 169.254.0.0/16 subnetted on different interfaces, though this is not RFC compliant behaviour, so I'd like to hear about that before I merge it. Regards, BMS
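The message does not quote the diff itself, so the following is only a sketch of the kind of test such a change would need: a C predicate that recognises addresses in 169.254.0.0/16, which a forwarding path could use to refuse to forward link-local traffic. The function name is hypothetical and this is not the code from rev 1.96.

```c
#include <stdint.h>
#include <arpa/inet.h>

/*
 * Hypothetical helper: return 1 if the IPv4 address (in network byte
 * order) lies inside the link-local block 169.254.0.0/16, else 0.
 */
static int
is_ipv4_linklocal(uint32_t addr_net)
{
	/* 169.254.0.0 is 0xa9fe0000 in host byte order. */
	return ((ntohl(addr_net) & 0xffff0000u) == 0xa9fe0000u);
}
```

A forwarding check built on this would drop, rather than route, any datagram whose source or destination matches, which is exactly what would affect the subnetted 169.254.0.0/16 setups mentioned above.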
Re: kern/106722: [net] [patch] ifconfig may not connect an interface to known network
Anton Yuzhaninov wrote: Thursday, March 15, 2007, 7:30:54 PM, Andre Oppermann wrote: AO> IMO when configuring an interface with an IP address and network it should kick out previous host and/or network routes matching it. Unless those are from locally configured interfaces, then it should reject the new attempt. A new route should replace an existing one only if it has an administrative distance (in Cisco terms) smaller than the AD of the existing route. Preference for a network from a locally configured interface is only a particular case of this general principle. We are obstructed by the current radix trie code only matching on destination and prefix. Adding 'administrative distance' to the FTE match is something which should seriously be considered. It is a stepping stone to equal cost multipath and would help in this situation. It does however considerably change the semantics of the existing routing socket, and its consumers would need to be updated to reflect that fact. As I hinted at in my original response: it seems acceptable that ifconfig'ing an interface into the system should be able to clobber the overlapping routes in the meantime, but only until the architecture is fixed. Regards, BMS
Re: Generic ioctl and ether_ioctl don't agree
Yar Tikhiy wrote: Hi folks, Quite a while ago I noticed that our ioctl handlers get the ioctl command via u_long, but ether_ioctl()'s command argument is int. This disarray dates back to 1998, when ioctl functions started to take u_long as the command, but ether_ioctl() was never fixed. Fortunately, our ioctl command coding still fits in 32 bits, or else we would've got problems on 64-bit arch'es already. I'd like to fix this long-standing bug some day after RELENG_7 is branched. Of course, this will break ABI to network modules on all 64-bit arch'es. BTW, the same applies to other L2 layers, such as firewire, which seems to have been cloned from if_ethersubr.c. This is one of those annoying things which breaks compatibility with external modules. I'm not sure about this, though. I was getting sign extension warnings on amd64 last week when I was testing the IGMPv3 aware mtest(8). Perhaps if we're fixing these ABIs, we should commit to an explicit C99 type with known bit width, i.e. uint32_t. I would be much happier if we began using C99 types in the code. Just my 2c. BMS
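The sign extension concern can be shown with a small C sketch that models a command value passing through a signed 32-bit parameter and then being widened again, as would happen between a u_long caller and an int handler on a 64-bit platform. The function names are hypothetical, and the narrowing conversion is implementation-defined in C; the comments assume the usual two's-complement wrap.

```c
#include <stdint.h>

/*
 * Hypothetical model: a 64-bit command is squeezed through a signed
 * 32-bit parameter and widened back to 64 bits.
 */
static uint64_t
through_int32(uint64_t cmd)
{
	/* Keep the low 32 bits; values >= 0x80000000 become negative. */
	int32_t narrowed = (int32_t)(uint32_t)cmd;

	/* Widening a negative signed value sign-extends the top bits. */
	return ((uint64_t)(int64_t)narrowed);
}

/* Same path using an explicit unsigned 32-bit type: zero-extends. */
static uint64_t
through_uint32(uint64_t cmd)
{
	return ((uint64_t)(uint32_t)cmd);
}
```

Any command with the top bit of the low 32 bits set (0x80000000 and above) comes back from through_int32() with its upper 32 bits filled with ones, so it no longer compares equal to the original, whereas through_uint32() preserves it. That is the argument for a fixed-width unsigned type such as uint32_t.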
Re: tap(4) should go UP if opened
Hi, Frank Behrens wrote: If we have no possibility to mark the interface as UP for the non-root process the net.link.tap.user_open=1 is useless, because we can not transmit any packets. With the patch the interface goes UP only, when the administrator allowed non-root user access. The conditional in the second patch is a no-op as the open will be forbidden if the user did not have privilege to open the tap. Bringing the interface up by default potentially violates POLA, so this should not happen by default. Please try the attached patch, which puts this behaviour under a sysctl. Thanks, BMS

//depot/user/bms/netdev/sys/net/if_tap.c#1 - /home/bms/p4/netdev/sys/net/if_tap.c
--- /tmp/tmp.58336.0	Wed Mar 14 13:06:09 2007
+++ /home/bms/p4/netdev/sys/net/if_tap.c	Wed Mar 14 13:05:54 2007
@@ -150,7 +150,8 @@
  */
 static struct mtx		 tapmtx;
 static int			 tapdebug = 0;	/* debug flag */
-static int			 tapuopen = 0;	/* allow user open() */
+static int			 tapuopen = 0;	/* allow user open() */
+static int			 tapuponopen = 0;	/* IFF_UP on open() */
 static int			 tapdclone = 1;	/* enable devfs cloning */
 static SLIST_HEAD(, tap_softc)	 taphead;	/* first device */
 static struct clonedevs	*tapclones;
@@ -164,6 +165,8 @@
     "Ethernet tunnel software network interface");
 SYSCTL_INT(_net_link_tap, OID_AUTO, user_open, CTLFLAG_RW, &tapuopen, 0,
 	"Allow user to open /dev/tap (based on node permissions)");
+SYSCTL_INT(_net_link_tap, OID_AUTO, up_on_open, CTLFLAG_RW, &tapuponopen, 0,
+	"Bring interface up when /dev/tap is opened");
 SYSCTL_INT(_net_link_tap, OID_AUTO, devfs_cloning, CTLFLAG_RW, &tapdclone, 0,
 	"Enably legacy devfs interface creation");
 SYSCTL_INT(_net_link_tap, OID_AUTO, debug, CTLFLAG_RW, &tapdebug, 0, "");
@@ -502,6 +505,8 @@
 	s = splimp();
 	ifp->if_drv_flags |= IFF_DRV_RUNNING;
 	ifp->if_drv_flags &= ~IFF_DRV_OACTIVE;
+	if (tapuponopen)
+		ifp->if_flags |= IFF_UP;
 	splx(s);
 
 	TAPDEBUG("%s is open. minor = %#x\n", ifp->if_xname, minor(dev));
Re: [PATCH] Removal of redundant entries from ifnet manpage
Aniruddha Bohra wrote: Hi, The ifnet manpage contains entries for the following routines which do not exist in the ifnet struct. committed, thanks!
Re: kern/106722: [net] [patch] ifconfig may not connect an interface to known network
Gleb Smirnoff wrote: AFAIK, the problem needs a more generic approach. I see two approaches. 1) Introduce RTM_CHANGEADD, a command that will forcibly add route, deleting all conflicting ones. Use this command in in_addprefix(). 2) In rt_flags field we still have several extra bits. We can use them to specify route source - RTS_CONNECTED, RTS_STATIC, RTS_XXX, where XXX is a routing protocol. When issuing RTM_ADD a route with a preferred source (e.g. CONNECTED vs STATIC) will override the old one. The proposed changes also constitute a hack. I understand that they are being proposed to address problems we currently have in the stack, i.e. that we do not support multipathing, though it is more than likely they will be blown away in future when the architecture changes (and it has to change). Approach 1 is largely irrelevant if multiple paths are introduced to the network stack; there is then no concept of a conflicting forwarding entry, only preference derived from the interface, entry flags, or the entry ('route') itself. Approach 2 has some merit to it, although the forwarding plane should not care where the forwarding entry came from unless it needs to (e.g. next-hop resolution). It seems reasonable that the forwarding plane should tag entries as being 'CONNECTED' i.e. derived from the address configuration of an interface. I believe many implementations out there do this, and multi-path does not change this. We already have the RTF_PROTO1 flag to determine if the forwarding entry ('route') came from a routing protocol in userland, so there should be no need to change the existing flags. The RTF_STATIC flag only has special meaning in that it means 'the user added this forwarding entry manually via the route(8) command'. We should preserve these semantics, though I believe we should start implementing forwarding preference in the radix trie. 
I think it seems acceptable and reasonable that we use a limited form of Approach 2 to clobber 'routes' being added in the case described in the PR, until such time as the network stack is re-engineered to support multiple paths and forwarding preference. I also believe it is useful if we start to use more modern technical jargon to discuss 'routes' in the network stack, because we are actually discussing the behaviour of entries in a forwarding table. Regards, BMS
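The "limited form of Approach 2" can be pictured with a small C sketch: each forwarding table entry carries a source tag, and a new entry for the same prefix replaces the old one only if its source is strictly preferred. The RTS_CONNECTED and RTS_STATIC names come from the proposal quoted above; RTS_PROTO, the struct layout, the ordering and the function name are assumptions made for illustration, not an existing kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical route sources; a lower value is more preferred. */
enum rt_source { RTS_CONNECTED = 0, RTS_STATIC = 1, RTS_PROTO = 2 };

/* Hypothetical forwarding table entry, reduced to what the check needs. */
struct fte {
	uint32_t	prefix;		/* destination, host byte order */
	uint8_t		plen;		/* prefix length in bits */
	enum rt_source	source;		/* where the entry came from */
};

/*
 * Return 1 if cand should replace cur. Entries for different prefixes
 * never conflict; for the same prefix the candidate wins only when its
 * source is strictly preferred, so an equal-preference add is rejected.
 */
static int
fte_should_replace(const struct fte *cur, const struct fte *cand)
{
	assert(cur != NULL && cand != NULL);
	if (cur->prefix != cand->prefix || cur->plen != cand->plen)
		return (0);
	return (cand->source < cur->source);
}
```

Under this rule, ifconfig'ing an interface onto 10.0.0.0/8 clobbers an existing static route for that prefix, while a second connected entry for the same prefix is refused, which matches the behaviour argued for in this thread.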
Re: kern/106722: [net] [patch] ifconfig may not connect an interface to known network
Gleb Smirnoff wrote: I was afraid that this would raise an argument on multipath routing. Let's temporarily not speak about multipath but just decide what is the correct way to remove conflicting routes when we are assigning an IP prefix to a local interface? My suggestion is to take the second approach you outlined but modify it slightly. That way, the conflict between the 'connected' FTE introduced by ifconfig'ing the interface and the pre-existing FTE for that network prefix may be resolved in a manner which doesn't break current consumers of the routing code, and leaves the way open to do multipath later w/o problems. Regards, BMS
Re: kern/109815: wrong interface identifier at pfil_hooks for vlans + if_bridge
Eygene Ryabinkin wrote: I tried to understand this, because Bruce already gave me a patch, but I am a bit stupid: I do not see how M_PROMISC that is cleared unconditionally before BRIDGE_INPUT will help us to identify the right interface. As I see now, the BRIDGE_INPUT is called once from if_ethersubr.c, once from if_gif.c and once from ng_ether.c: http://fxr.watson.org/fxr/ident?i=BRIDGE_INPUT So there is no distinct code paths that can allow BRIDGE_INPUT to modify its behaviour based on the M_PROMISC flag. But I feel that I am wrong in some place and missing some discuission on the M_PROMISC. Can anyone point me to the right place? In short: M_PROMISC exists to easily identify frames which were received promiscuously, to prevent infinite recursion, and to simplify code which needs to re-enter ether_input(). M_PROMISC is a flag introduced by NetBSD into their ethernet input path to deal with the case where an entity in the network stack needs to receive frames promiscuously, without necessarily passing those frames to the upper layers e.g. IPv4. It is not documented; the code is the documentation in this instance. It is cleared when an mbuf chain is passed to another entity which may consume the frame in that mbuf chain, in case the entity re-enters ether_input() with the same mbuf chain for local delivery (e.g. bridge, netgraph, vlan). I do not think M_PROMISC alone is sufficient to solve our architectural problems at Layer 2. So all the tangled if()s inside LIST_FOREACH() will be gone completely from bridge_input(). But we still need to see if we want to consume the packet by the bridge or it members or to do forwarding. Am I missing something? Correct. Just because a frame was received promiscuously, does not imply that the bridge will be the only consumer of that frame. I'm afraid there is a serious flaw in the very notion of such a logical interface. 
If it's true, we should start by admitting that the support for logical interfaces should be a side hack for compatibility, and not something that can live forever on the main code path. I agree with you. That is why I patched if_bridge once again to enable the pfil hooks for the physical incoming interface. And there are two ways to solve the problem: - to give each VLAN interface the distinct MAC, as Bruce suggested, I didn't suggest this. :-) I pointed out that the code matches on destination MAC only at the moment. vlan(4) is an abstraction of something which exists as part of the Ethernet framing, and is not a physical interface in its own right, as was correctly identified above. - to refuse the logical interfaces completely and to support only physical ones. It is what my very first (and very short) patch did. But this can break some existing firewall rulesets. And that should be discuissed -- we do not need the total breakage due to out changes. And you're right: the best way for this alternative is to leave the current behaviour as the compatibility sysctl that is turned off by default and move to the filtering on the physical interfaces by default. No problem, but skilled network people that are using FreeBSD as the bridge for VLANs should say if they are happy with it. I think it is acceptable for if_bridge(4) to know about the existence of VLAN interfaces and to deal with them accordingly as a special case, because Spanning Tree is specified differently in the case where VLANs are present. Therefore it is not unreasonable for if_bridge(4) to be looking at VLAN headers in the mbuf chain. As such I think the behaviour Andrew Thompson and I were discussing off list should be made the default: that is, the first 802.1q VLAN header is stripped off and turned into an M_VLANTAG before being passed to other consumers in the stack. 
The presence of M_VLANTAG makes it very easy to see that a frame was received with a VLAN header without involving vlan(4) and reduces the amount of 802.1q specific code across Layer 2 subsystems. Regards, BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to [EMAIL PROTECTED]
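For illustration, stripping the first 802.1q header and recording it as a tag might look roughly like the sketch below. It works on a flat byte buffer rather than an mbuf chain, and struct frame_meta and strip_first_vlan_header() are hypothetical names standing in for the mbuf packet header and M_VLANTAG handling; only the 802.1Q layout (TPID 0x8100 followed by a 16-bit TCI after the two MAC addresses) is taken as given.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN	6
#define ETHERTYPE_VLAN	0x8100	/* 802.1Q tag protocol identifier */
#define VLAN_ENCAP_LEN	4	/* TPID + TCI */

/* Hypothetical stand-in for an mbuf packet header carrying M_VLANTAG. */
struct frame_meta {
	int		has_vlantag;	/* models the M_VLANTAG flag */
	uint16_t	vlan_tci;	/* priority, CFI and VLAN ID */
};

/*
 * If the frame carries an 802.1Q header, strip the first one in place
 * and record its TCI in 'meta'. Returns the new frame length.
 */
static size_t
strip_first_vlan_header(uint8_t *frame, size_t len, struct frame_meta *meta)
{
	const size_t off = 2 * ETHER_ADDR_LEN;	/* offset of type/TPID field */
	uint16_t type;

	meta->has_vlantag = 0;
	meta->vlan_tci = 0;
	if (len < off + VLAN_ENCAP_LEN + 2)
		return (len);
	type = (uint16_t)((frame[off] << 8) | frame[off + 1]);
	if (type != ETHERTYPE_VLAN)
		return (len);
	meta->vlan_tci = (uint16_t)((frame[off + 2] << 8) | frame[off + 3]);
	meta->has_vlantag = 1;
	/* Close the gap so the encapsulated EtherType follows the addresses. */
	memmove(frame + off, frame + off + VLAN_ENCAP_LEN,
	    len - off - VLAN_ENCAP_LEN);
	return (len - VLAN_ENCAP_LEN);
}
```

Later consumers then only need to test has_vlantag instead of parsing the 802.1q header again.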
Re: UltraVNC on freebsd
Rashid N. Achilov wrote: TightVNC or TridiaVNC. But encryption and file transmission will not be available with these VNCs and UltraVNC at the other end. JFYI: I have heard corporate IT people who mostly work with Windows discuss UltraVNC. I don't see a port for it. It is on SourceForge so perhaps someone will step up to contribute a port. Regards, BMS
Re: kern/109815: wrong interface identifier at pfil_hooks for vlans + if_bridge
Hi, Eygene Ryabinkin wrote: Speaking about vlan problems: the original problem is to do something with VLAN interfaces only because they are sharing the MAC of their physical parent. The problem itself is not VLAN-specific -- if there will be two physical interfaces with the same MACs and they will be bridged, the problem will still be here. I see this also. What would be good is if there was a way to record additional MAC addresses for each ifnet, in addition to the if_lladdr member. This would cut down the cruft in ether_input(), if_bridge(4) and possibly also carp(4). For network cards with more than one perfect hash filter entry in the hardware, programming these into the card would *perhaps* be more efficient when trying to achieve line rate with gigabit and beyond. This would most likely require an ABI change. The VLAN handling problem doesn't go away; we will still need to check if a bridge member is a VLAN interface because we can't uniquely key off the MAC as you point out. Regards, BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to [EMAIL PROTECTED]
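A minimal sketch of the idea of recording additional MAC addresses per interface follows. struct if_macs and if_macs_match() are hypothetical and are not part of struct ifnet; the fixed-size array is just an assumption to keep the example short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN		6
#define MAX_EXTRA_MACS	4

/* Hypothetical per-interface address record; not the kernel's struct ifnet. */
struct if_macs {
	uint8_t	lladdr[MAC_LEN];		/* primary address (if_lladdr) */
	uint8_t	extra[MAX_EXTRA_MACS][MAC_LEN];	/* additional addresses */
	size_t	nextra;				/* number of valid extra[] slots */
};

/* Return true if 'dst' matches the primary or any additional address. */
static bool
if_macs_match(const struct if_macs *im, const uint8_t dst[MAC_LEN])
{
	size_t i;

	if (memcmp(im->lladdr, dst, MAC_LEN) == 0)
		return (true);
	for (i = 0; i < im->nextra; i++)
		if (memcmp(im->extra[i], dst, MAC_LEN) == 0)
			return (true);
	return (false);
}
```

An input path could call something like this once instead of special-casing each consumer. As noted above, it still would not identify a VLAN bridge member uniquely, since VLANs share the parent's MAC.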
Re: kern/109815: wrong interface identifier at pfil_hooks for vlans + if_bridge
Eygene Ryabinkin wrote: This is a different point. The bridge wants to know about bridge members' MACs just because it should catch the packets that are destined to the bridge members. It is the only way for an L2 thing that is operating in promiscuous mode. Correct. For our case (when MACs are the same): I think that rik@ has explained it rather well, so you should read his message once again. Perhaps we can talk about this off-list and in Russian, if you prefer. The problem isn't going to go away. It will get bigger when 802.3ad trunking is introduced. Andrew Thompson is currently working on this code. It may also affect the 802.11 code in future, which as you know is layered around Ethernet. It would be good to have a well thought out architectural solution for this problem. Regards, BMS
Re: kern/109815: wrong interface identifier at pfil_hooks for vlans + if_bridge
Yar Tikhiy wrote: Guys, excuse me, but I still fail to see how the case of VLANs' sharing a single MAC differs from the case of several physical interfaces with the same MAC from the POV of a bridge. A bridge can have no own MAC addresses at all, it plays with foreign MAC addresses only. Therefore I can't see why our bridge code needs to know local MAC addresses, let alone why it fails when they're the same. Could you give me a hint? Thanks! A few points: 1. A bridge *does* have a MAC address; it is automatically assigned one to participate in IEEE 802.1d Spanning Tree. 2. In the case where 802.3ad trunking is implemented, the same Ethernet address may be used by multiple physical interfaces. 3. As Eygene explained well: there are a number of consumers of Ethernet frames in the stack. As if_bridge may potentially be passed mbuf chains containing packets for these consumers first, it must examine the destination address to determine if it should claim the packet or not. Finally, because of the above points, the Ethernet destination address cannot be regarded as a unique key in the bridge code, or indeed the general Ethernet path, for where packets should be relayed in the stack as a whole. Regards, BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: tap(4) should go UP if opened
Frank Behrens wrote: How does tun(4) handle this? tun(4) is also set to down when closed. It is not set to up when it is opened, but when an address is assigned by the user process. This is fine, because it always needs an IP address. tap(4), as a layer 2 tunnel device, does not need an IP address, so setting it up on open is IMHO the best solution. This isn't consistent with the other software cloneable interfaces which emulate certain layer 2 semantics, e.g. bridge, trunk, vlan; see below. Does this sound reasonable, or how should I handle the tap(4) open by a user process when this process does not run as root? I recently committed Landon Fuller's code which makes tap and tun cloneable interfaces which may then be created via 'ifconfig tap0 create'. Automatically setting the interface to IFF_UP is not consistent with the semantics for other network interfaces; it requires specific privileges (usually super-user or PRIV_NET_SETIFFLAGS in -CURRENT) to do. However, we also support the creation of tap/tun instances by non-super-users, so there is motivation for the change. Configuring a tap interface to up by a non-superuser should only be permitted if the interface itself was created by a non-superuser, and if net.link.tap.user_open is set to 1. A more involved patch is needed to do this right for all cases -- we should not do this by default. Regards, BMS
Re: SO_ACCEPTCONN equivalent
Alexandru Arion wrote: Thanks for both suggestions. Since I'll support version 5.4 and up, this leaves me to using the workaround implied by calling accept and checking the returned value, for now. Erm. It looks like it's implemented in 5.4 as well, although you might have mentioned in your original mail you were working with a legacy version of FreeBSD. :^) http://fxr.watson.org/fxr/ident?v=RELENG54&i=SO_ACCEPTCONN BMS
Re: SO_ACCEPTCONN equivalent
Vlad GALU wrote: Erm. It looks like it's implemented in 5.4 as well, although you might have mentioned in your original mail you were working with a legacy version of FreeBSD. :^) http://fxr.watson.org/fxr/ident?v=RELENG54&i=SO_ACCEPTCONN Manpage diff attached. Mailman ate your homework. :/ BMS
Re: SO_ACCEPTCONN equivalent
Bruce M. Simpson wrote: Manpage diff attached. Mailman ate your homework. :/ My bad. Committed. BMS
Re: SO_ACCEPTCONN equivalent
Alexandru Arion wrote: Tried it on a fresh install of 5.4: compiled the source locally, ran it, and got the error 'Protocol not available'. The same code works on Linux. By replacing SO_ACCEPTCONN with SO_REUSEADDR, or any other option that appears in the manual page for 5.4, the program works correctly. Bruce, is there something I'm missing? There was a thread about this on a mailing list in the past from Robert Watson, who was concerned that introducing the option might introduce race conditions; please see the archives for this. If SO_ACCEPTCONN does not work for you, please consider submitting a regression test for it, e.g. src/tools/regression/sockets/acceptconn, so that someone can pick up on this. Thanks! BMS
Re: Inconsistencies with IP_ONESBCAST and/or IP_SENDSRCADDR
Bruce M. Simpson wrote: I have just committed a change in bms_netdev which enforces strict and better defined semantics for the IP_SENDSRCADDR option in udp_output(). I have just committed this change in -CURRENT. After testing it with 'ipbroadcast', it looks good apart from sockets which are already laddr bound. This is forbidden by in_pcbbind_setup(). The same caveats apply -- it might collide with an already bound inpcb. It is OK for code to choose any source address configured on the box as this will be needed to override source selection come ECMP. If someone else steps up to make it work when socket is laddr bound, well and cool. I now consider it 'fit for purpose'. I'm satisfied with this for now. Regards, BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Inconsistencies with IP_ONESBCAST and/or IP_SENDSRCADDR
Bruce M. Simpson wrote: Dealing with dhclient is a separate issue -- here, something like IP_SENDIF needs to be introduced, as we are truly in an 'ip unnumbered' situation -- i.e. the ifnet MAY not yet have been assigned an IPv4 address at all, and IP_SENDSRCADDR implies that you are source routing in the local stack by passing the address of a numbered interface.

I have just committed a change in bms_netdev which enforces strict and better defined semantics for the IP_SENDSRCADDR option in udp_output(). This fits one of the main intended use cases of this option, e.g. a routing daemon, bound to 0.0.0.0 and a non-ephemeral port, which needs to explicitly override the hard-coded source selection policy in ip_output() to send an undirected broadcast on a numbered interface. It also fits a use case whereby a bound socket may wish to temporarily ask for default source selection policy by specifying INADDR_ANY, although this needs to be reviewed and tested further; I believe in_pcbbind_setup() will detect a collision in this case. We always obtain the inp_info write lock if IP_SENDSRCADDR was specified, in case we need to temporarily re-bind laddr.

Pseudo-conditions as follows.

* IP_SENDSRCADDR with lport NOT BOUND is NOT OK. We should never try to persistently bind a socket which is not bound unless we are bind(2).
* IP_SENDSRCADDR with !INADDR_ANY when laddr is NOT BOUND is OK. It means override the source selection logic and use src.sin_addr instead.
* IP_SENDSRCADDR with INADDR_ANY when laddr is BOUND is OK. It means override the bound address and use source selection logic instead.
* IP_SENDSRCADDR with INADDR_ANY when laddr is NOT BOUND is NOT OK. It means no valid source is specified.

Regards, BMS

--- //depot/vendor/freebsd/src/sys/netinet/udp_usrreq.c 2007/02/20 10:22:30
+++ //depot/user/bms/netdev/sys/netinet/udp_usrreq.c 2007/03/07 12:28:16
@@ -747,7 +747,8 @@
 		return (EMSGSIZE);
 	}
 
-	src.sin_addr.s_addr = INADDR_ANY;
+	bzero(&src, sizeof(src));
+
 	if (control != NULL) {
 		/*
 		 * XXX: Currently, we assume all the optional information is
@@ -777,12 +778,10 @@
 					error = EINVAL;
 					break;
 				}
-				bzero(&src, sizeof(src));
 				src.sin_family = AF_INET;
 				src.sin_len = sizeof(src);
-				src.sin_port = inp->inp_lport;
 				src.sin_addr = *(struct in_addr *)CMSG_DATA(cm);
 				break;
 			default:
 				error = ENOPROTOOPT;
 				break;
@@ -797,7 +796,7 @@
 		return (error);
 	}
 
-	if (src.sin_addr.s_addr != INADDR_ANY || addr != NULL) {
+	if (src.sin_family == AF_INET || addr != NULL) {
 		INP_INFO_WLOCK(&udbinfo);
 		unlock_udbinfo = 1;
 	} else
@@ -810,11 +809,20 @@
 
 	laddr = inp->inp_laddr;
 	lport = inp->inp_lport;
-	if (src.sin_addr.s_addr != INADDR_ANY) {
-		if (lport == 0) {
+
+	/*
+	 * If the IP_SENDSRCADDR control message was specified, override the
+	 * source address for this datagram. Its use is invalidated if the
+	 * address thus specified is incomplete or clobbers other inpcbs.
+	 */
+	if (src.sin_family == AF_INET) {
+		if ((lport == 0) ||
+		    (laddr.s_addr == INADDR_ANY &&
+		     src.sin_addr.s_addr == INADDR_ANY)) {
 			error = EINVAL;
 			goto release;
 		}
+		src.sin_port = lport;
 		error = in_pcbbind_setup(inp, (struct sockaddr *)&src,
 		    &laddr.s_addr, &lport, td->td_ucred);
 		if (error)
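The validity test in the last hunk can be read on its own as a small predicate. The sketch below is a userland model of just that test, using plain integers instead of the inpcb; sendsrcaddr_check() and ADDR_ANY are names made up for the example. It does not model the in_pcbbind_setup() call that follows in the real code, which may still reject the address if it collides with another inpcb.

```c
#include <errno.h>
#include <stdint.h>

#define ADDR_ANY	0u	/* stands in for INADDR_ANY */

/*
 * Model of the check applied when an IP_SENDSRCADDR control message is
 * present. 'lport' and 'laddr' are the socket's bound port and address;
 * 'src_addr' is the address passed in the control message.
 * Returns 0 if the override may proceed, EINVAL otherwise.
 */
static int
sendsrcaddr_check(uint16_t lport, uint32_t laddr, uint32_t src_addr)
{
	/* The socket must already be bound to a local port. */
	if (lport == 0)
		return (EINVAL);
	/* No bound address and no override: no valid source at all. */
	if (laddr == ADDR_ANY && src_addr == ADDR_ANY)
		return (EINVAL);
	return (0);
}
```

The cases in the asserts one would write against this line up with the pseudo-conditions above: an unbound port fails, an explicit source on a wildcard-bound socket passes, INADDR_ANY on an address-bound socket passes, and INADDR_ANY on a wildcard-bound socket fails.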
Re: SO_ACCEPTCONN equivalent
Alexandru Arion wrote: Is there an equivalent in FreeBSD to the SO_ACCEPTCONN option for getsockopt(), available in Linux? It doesn't actually have to be an option for getsockopt(), just a way to determine if a socket has been marked to accept connections with listen(). SO_ACCEPTCONN appears to be in FreeBSD 6.2 and CURRENT already. Regards, BMS
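For reference, querying the option from userland would look something like the sketch below. The helper name socket_is_listening() is made up for this example; getsockopt(2) with SOL_SOCKET and SO_ACCEPTCONN is the standard interface. Whether a given release actually answers the query is exactly what is in question in this thread, so the -1 return is left for the caller to inspect.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Returns 1 if 'fd' is a listening socket, 0 if it is not, and -1 if the
 * query itself failed (check errno, e.g. ENOPROTOOPT where unsupported).
 */
static int
socket_is_listening(int fd)
{
	int val = 0;
	socklen_t len = sizeof(val);

	if (getsockopt(fd, SOL_SOCKET, SO_ACCEPTCONN, &val, &len) == -1)
		return (-1);
	return (val != 0);
}
```

On a system without the option, the fallback mentioned elsewhere in the thread (calling accept() and checking the result) remains the workaround.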
Re: [PATCH] Ethernet cleanup; 802.1p input and M_PROMISC
Yar Tikhiy wrote: My proposed check for IFF_DRV_RUNNING is by no means a priority task. I can add it by myself after you finish your great current project regarding ether_input() and friends.

Just committed in p4:

 //depot/user/bms/netdev/sys/net/if_ethersubr.c#6 - /home/bms/p4/netdev/sys/net/if_ethersubr.c
--- /tmp/tmp.11470.0	Tue Mar 6 15:45:08 2007
+++ /home/bms/p4/netdev/sys/net/if_ethersubr.c	Tue Mar 6 15:45:01 2007
@@ -511,6 +511,13 @@
 		m_freem(m);
 		return;
 	}
+#ifdef DIAGNOSTIC
+	if ((ifp->if_flags & IFF_DRV_RUNNING) == 0) {
+		if_printf(ifp, "discard frame at !IFF_DRV_RUNNING\n");
+		m_freem(m);
+		return;
+	}
+#endif

Thanks! BMS
Re: kern/109815: wrong interface identifier at pfil_hooks for vlans + if_bridge
Eygene Ryabinkin wrote: I am awfully sorry, but you're seem to be mistaken: Thanks for clarifying this. That'll be because I didn't read if_bridge that far. ;^) In my original message I was just looking at if_ethersubr.c. I need to make sure any changes which are made to if_bridge to deal with vlan problems are incorporated into bms_netdev so that after I commit M_PROMISC, it does the right thing. if_bridge calls the ipfw directly only for the L2 filtering (when the net.link.bridge.ipfw is set to 1). This is processed by the block in if_bridge just above to the 'ipfwpass' label. In bms_netdev, the behaviour of ether_demux() is unchanged. ip_dn_claim_rule() is called to determine if there is an IPFW (usually dummynet) rule for the input frame at ethernet level, if-and-only-if net.link.ether.ipfw is non-zero. I just committed some comments to clarify this and styled it the same as the check in ether_output_frame(). However -- the IPFW check in ether_demux() is *skipped* in bms_netdev if M_PROMISC is set. This is because we might drop packets which are destined for vlan_input() which flow in because the interface is IFF_PROMISC. Strictly speaking this bends the rules of dummynet, because if you have frames coming in due to promiscuous mode, which the rest of the stack doesn't expect, they won't be filtered by Dummynet pipes. But the L3 filtering is done fully by the pfil hooks, as I understand the code. Moreover, I am using 'pf' in my case, not the ipfw. Yes, this is always the case for the upper layers. Regards, BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: [PATCH] Ethernet cleanup; 802.1p input and M_PROMISC
Julian Elischer wrote: When we added netgraph we split both the input and output parts so that they would provide 'natural' entrypoints for a bridge. Consider where a bridge wants to put packets. In bms_netdev, bridge_input() is entered directly from ether_input(). It may potentially re-enter, so M_PROMISC is cleared on frames thus handed off to if_bridge(4). Same for ng_ether(4). Since the split however other code has made use of those entrypoints at different times. I'm not sure at the moment whether other code does so now. According to KScope on -CURRENT, the only other places which call the split ether_demux() are dummynet_send() and ng_ether_rcv_upper(). Regards, BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: kern/109815: wrong interface identifier at pfil_hooks for vlans + if_bridge
Eygene Ryabinkin wrote: Will try to understand if it will cure my problem, thanks! Attaching my patch, just in case if freebsd gnats will be down ;)) Thanks for this. It looks like Andrew may be in a better position to say if this fix should go in or not. It is possible that if bridge changes the ifp and that the frame should be forwarded locally, i.e. to the upper protocol layers, that ifp should also be updated in ether_input() (as NetBSD does) to make sure that the later checks are against the updated ifp. I have just changed this behaviour in p4 bms_netdev. Please try to test with this code. If you can't access p4, then I can extract an updated patch though this will take longer. This should help to eliminate the need for DEV_CARP compile-time conditionals in if_bridge(4). Regards, BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: [PATCH] Ethernet cleanup; 802.1p input and M_PROMISC
Hi, Thanks for your reply. Yar Tikhiy wrote: My concern is that, with possible callers of ether_input() being not really *from* but *on behalf* of the interface, e.g., in Netgraph, IFF_DRV_RUNNING can be a way for the interface driver to tell us: I'm not ready yet, so don't believe anyone who pretends he has a packet from me. E.g., a vlan(4) interface gets IFF_DRV_RUNNING set only if it is properly attached to an Ethernet interface (known as the vlan's parent). AFAIK this is a totally legitimate use of IFF_DRV_RUNNING. Now assume that a vlan interface is UP but not RUNNING because it's detached from the parent. If a buggy Netgraph node or another source of synthetic traffic decides to inject a packet as though it comes in from the said vlan interface, handling the packet as usual will be bogus. IMHO the IFF_UP check in ether_input() is mostly for a similar purpose: If all callers of ether_input() were in real and conformant interface drivers, we shouldn't bother re-checking IFF_UP in ether_input() either because the driver of a down interface wouldn't call ether_input() for it in the first place. I agree with the point you make here about non-conforming drivers; however there are cogent performance arguments for checking IFF_UP immediately. If an interface is configured administratively down, it shouldn't be pumping traffic into the network stack. I do however realize there are situations where this can happen. Suppose, for example, the thread which calls ether_input() is scheduled on another CPU. Dropping such frames immediately on entry into ether_input() saves tying up a thread for any longer than is absolutely necessary. Perhaps Kip, who is working on 10GbE performance just now, can advise further. Of course, we can omit the check for IFF_DRV_RUNNING if we think that synthetic traffic from an unready interface is OK. But I'm afraid we shouldn't. 
In addition, I wonder if we can move the conformance checks to a wrapper function so that conformant drivers don't have to pay the performance penalty of the just in case checks per each inbound Ethernet packet. Thanks for explaining this further. Perhaps I should put the check for IFF_DRV_RUNNING under INVARIANTS or make it a KASSERT? The code in bms_netdev as it stands bends the rules a little. The IFF_UP check was in ether_demux() before. The original reason for the ether_input()/ether_demux() split was to accomodate Netgraph. I must admit that I hadn't fully mapped out the possible re-entry scenarios with Netgraph because they may be arbitrarily complicated by its very nature. Whilst Netgraph is a cool feature, and one I am very grateful that FreeBSD has, I wonder if it is OK that we should have checks which potentially pessimize performance for the main use cases to protect the stack against Netgraph frames which are bogons, or bugs in Netgraph nodes. I'm open to hearing more about this, but my own resources (time, money) are a limiting factor as to what I can do. Regards, BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: kern/109815: wrong interface identifier at pfil_hooks for vlans + if_bridge
Hi, Eygene Ryabinkin wrote: Sure, I can test it, but then I need to know what problems are cured by your patch, or I just should watch if it will not break something. My concern is that I want to make sure that all these changes to the ether_input path work OK together. The M_PROMISC flag is set further down when it's determined that a frame flowing into ether_input() was received promiscuously, and therefore Layer 3 protocols (e.g. IP) may not want to see it. In NetBSD, after if_bridge is given a chance to claim an input frame, the ifp may be changed if the bridge needs to forward locally. In my case if_bridge drops off the packet because firewall fails to recognize the packet as good: the interface that is passed to a pfil_hooks is bad (I mean not the one expected). The ifp which your patch changes is that of the mbuf chain when bridge_input determines it is not for the bridge, but should be forwarded locally. The patch forces a locally forwarded frame to have the same ifp as it had when it came into bridge_input. I can foresee problems if the same Ethernet destination address exists on multiple bridge member interfaces. The latest version of p4 bms_netdev now updates the cached ifp in ether_input() if bridge_input() changed it in this way. NetBSD consistently uses pfil_hooks for the if_bridge *and* ether_input paths, FreeBSD currently calls ipfw directly for ether_input, which may make a difference to the behaviour which you are seeing with VLANs. Not understanding if_bridge fully, or the coupling of ipfw with if_ethersubr.c, I would hope that Andrew and others have more to say on this. Will try to see if your patch makes any difference for the 7-CURRENT, but I have no system at hand to test it, sorry. The patch is extracted from p4 therefore it should apply against CURRENT. I haven't updated the patch yet, the latest code is in p4. We won't be able to eliminate the DEV_CARP checks in this spin. 
I did exchange an idea with Andrew late last night whereby a list of addresses other than ether_dhost is maintained for each ifnet. Input paths then check this in addition to or instead of ether_dhost. I've added this to the Wiki. I've been working particularly hard lately so I'm not 100% clear. Thanks, BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: bin/94920: [rpc] rpc.statd(8) conflict with cups over tcp and udp ports 631
Synopsis: [rpc] rpc.statd(8) conflict with cups over tcp and udp ports 631
Responsible-Changed-From-To: bms->freebsd-net
Responsible-Changed-By: bms
Responsible-Changed-When: Sun Mar 4 15:03:40 UTC 2007
Responsible-Changed-Why: Someone else with Copious Free Time can do this -- not a priority for me.
http://www.freebsd.org/cgi/query-pr.cgi?pr=94920
Re: bin/100969: [rpc.lockd] rpc.lockd conflict with cups over udp ports 631
Synopsis: [rpc.lockd] rpc.lockd conflict with cups over udp ports 631
Responsible-Changed-From-To: bms->freebsd-net
Responsible-Changed-By: bms
Responsible-Changed-When: Sun Mar 4 15:04:14 UTC 2007
Responsible-Changed-Why: Someone else with Copious Free Time can do this -- not a priority for me.
http://www.freebsd.org/cgi/query-pr.cgi?pr=100969
[PATCH] IP_SENDIF option; rework ip_output() source selection logic
Hello, Thanks to andre making a start on this, I have managed to get the IP_SENDIF option implemented today in p4 bms_netdev. Here's a patch against -CURRENT: http://people.freebsd.org/~bms/dump/sendif-20070304.diff For those who are new to this work: IP_SENDIF is broadly an analogue of the Linux socket option SO_BINDTODEVICE. It is used to bypass the traditional BSD source interface selection logic. It is a sledgehammer hack used to output datagrams on a specific interface which may not yet have an address, e.g. for DHCP. Judicious use of this option, together with IP_ONESBCAST, will make it possible for dhclient to run without BPF support in the base system. There are a few remaining issues around this code which need to be dealt with. These are: * Fix IP_SENDIF and IP_SENDSRCADDR for unbound sockets. This goes without saying. For these options to be useful the socket should not have to be bound anywhere. The fact that IP_SENDSRCADDR is currently broken contradicts both our documentation and UNIX Network Programming Vol 1 3rd Edition. * Allow IP_SENDIF to be used from the raw IP output path. Some people might want to do this. * Add a specific privilege level for IP_SENDIF. Currently it requires the 'open raw socket' privilege, as it is Not Normal Behaviour. * Disable hardware checksums on output, if we have to do that. My testing with msk(4) suggests this might not be needed. When/if we adopt NetBSD's source selection policy concept (e.g. for fully supporting link-local IPv4) this code will most likely have to be updated, and/or when/if we adopt equal-cost multipath. The hack IP_ONESBCAST itself may eventually be eliminated by doing things slightly differently in the forwarding trie i.e. using interface preference and/or IP_SENDIF and populating the trie with 255.255.255.255 routes. Regards, BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: CARP behaviour
Yar Tikhiy wrote: We shouldn't cache route pointers anywhere anymore. It has been completely removed from the PCBs and things like gif and others. Sounds like a good way to go, too! :-) Thanks! gre(4) does very funky things with the route it caches to the tunnel endpoint. Someone(tm) should have a look at that. BMS
Re: kern/109815: wrong interface identifier at pfil_hooks for vlans + if_bridge
Hi, I haven't seen your patch, can you point me at it off-list? Thanks. Eygene Ryabinkin wrote: I traced the current if_bridge.c behaviour to the NetBSD's if_bridge.c 1.9. This was the first version in that the firewall hooks were introduced. And the assumtion that the MAC identifies the physical interfaces was used in this first version. And a question: can anyone say if my patch will break some known good behaviour and if the current behaviour of if_bridge is based on some logic I am currently failing to understand. I would greatly appreciate it if you could look at the combined M_PROMISC and 802.1p patch, which rewrites ether_input() significantly. It sounds like the issues you are having with vlans and bridges may potentially be fixed by this patch, or that the fix may be incorporated more easily with this patch. In NetBSD, after if_bridge is given a chance to claim an input frame, the ifp may be changed if the bridge needs to forward locally. M_PROMISC is used to indicate that a frame was received promiscuously, in case ether_input() re-enters itself with the same mbuf chain. Certain consumers of ether_input() need to punch holes in the logic used to detect if a frame was for us or not because they do funky things with Ethernet destination addresses, e.g. carp. Regards, BMS ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: [PATCH] Ethernet cleanup; 802.1p input and M_PROMISC
Yar Tikhiy wrote: Now I see your point, thanks! Well, at least in theory, the driver shouldn't call ether_input() if the interface isn't running. OTOH, the interface shouldn't be getting traffic if it's !UP. However, I suspect that not all drivers handle IFF_UP fully or even can do it at all due to hardware limitations. As I understand it, in an ideal world a !UP interface should be deaf and dumb and not interfering in any way with the network still connected to it physically. Therefore discarding inbound traffic from a !UP interface may be a necessary workaround, but it may not be enough. All that boils down to this: The IFF_UP check in ether_input() is more to a sanity check than to the way for IFF_UP to work. Therefore we can add the IFF_DRV_RUNNING sanity check there, too, for completeness. Thanks for your explanation. I'm still not sure I understand why IFF_DRV_RUNNING should be checked for in ether_input(). There is a pretty clear reason for checking for IFF_UP in ether_input(); an interface which is configured administratively down should not be bringing traffic into the stack, regardless of whether it is a hardware device or a pseudo-device. IFF_UP has been in since 4.2BSD; it is more or less integral to how the BSD network stack operates. There are situations in which a pseudo-device or hardware device could incorrectly call ether_input() with such traffic. Reading net/if.h, IFF_DRV_RUNNING is documented as meaning 'resources are allocated for this device'. Surely such a check is redundant and not relevant to the operation of ether_input()? As far as I can tell it is similar to the old meaning of IFF_RUNNING, and there are legitimate situations in which the hardware or its queues may have stopped processing temporarily whilst the interface may be administratively up (and thus accepting traffic). Please correct me if I'm wrong or point out situations where it's important IFF_DRV_RUNNING state is checked outside of a driver. 
Sorry if I seem obtuse, but I'm sure I'm missing some detail here. Regards, BMS
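The argument above, that only the administrative flag needs checking on input, can be sketched in isolation. This is a minimal sketch and not the ether_input() code: the flag values are mock constants and frame_admissible() is a hypothetical helper introduced only for illustration.

```c
#include <stdbool.h>

/* Mock flag bits for illustration only; the real ones live in net/if.h. */
#define MOCK_IFF_UP          0x01  /* administratively up, set by the admin */
#define MOCK_IFF_DRV_RUNNING 0x40  /* driver has its resources allocated */

/*
 * Hypothetical helper: decide whether an input frame may enter the stack.
 * Only the administrative flag is consulted. The driver's running state is
 * deliberately ignored, so frames still draining from the descriptor rings
 * of a temporarily stopped device are not dropped.
 */
static bool
frame_admissible(int if_flags)
{
	return (if_flags & MOCK_IFF_UP) != 0;
}
```

With this shape an interface that is UP but not DRV_RUNNING still delivers queued frames, while a DRV_RUNNING interface that has been configured down does not.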
Re: [PATCH] Ethernet cleanup; 802.1p input and M_PROMISC
Yar Tikhiy wrote: In fact, there are two independent flags indicating an interface's readiness: IFF_UP and IFF_DRV_RUNNING. The former is controlled by the admin and the latter, by the driver. E.g., an interface can be UP but not really ready to operate due to h/w reasons, or vice versa. Perhaps we should check both flags to see if the interface is, so to say, up and running. if_vlan.c has an obvious macro for that, and it can go to if_var.h to avoid code duplication if we decide it's the right way to take. Thanks for looking at this. The purpose of the IFF_UP check is to immediately drop frames destined for an interface which is administratively configured down. Surely if ether_input() is called from the driver, there should be no need to check IFF_DRV_RUNNING? Indeed if the hardware flips to a state where it is not running but its internal queues or descriptor rings are draining, this might cause frames to be lost? Regards, BMS
[PATCH] Ethernet cleanup; 802.1p input and M_PROMISC
Hello all, I would like to announce an updated version of the 802.1p input patch, available at: http://people.freebsd.org/~bms/dump/latest-8021p.diff I have cut down the original scope of the patch. I previously ran into problems when I tried to move VLAN tag input and output processing into if_ethersubr.c. FreeBSD should now accept VLAN 0 traffic on input with this patch. In addition to this, the M_PROMISC flag is now used, which considerably simplifies the Ethernet input path in general. I have performed some light testing on a 1Gbps COTS switch with 802.1q encapsulation and without, with carp and vlan, with and without hardware VLAN tagging, and all looks OK. I would greatly appreciate further testing, particularly with if_bridge and ng_ether which I have not tried. If all goes to plan, I would hope to commit this code to -CURRENT within the next 10 days. Regards, BMS
CARP behaviour
During testing of M_PROMISC I noticed a couple of issues with our CARP. 1. carp doesn't seem to maintain input/output statistics on its ifnet. 2. carp doesn't seem to detect that the underlying route to the subnet its address is exposed on changed to another interface. Are these conditions normal / expected? Regards, BMS
Re: Proposal: Add M_HASCL().
Bruce M Simpson wrote: Much network code needs to know if the mbuf it is looking at is using a cluster. I propose putting M_HASCL() in sys/mbuf.h. I realise this is a style change, however, it seems to be a very common idiom. I sent this, then I looked at NetBSD, having caught a glimpse of their MBUFTRACE code when skimming lots of diffs. That is also a good idea, and might help us catch problems before they go prime-time; I've added it to the wiki. Point there is, M_HASCL() seems to be a hangover from the 4.4BSD era. NetBSD seems to treat clusters and external storage as separate entities. So I'm reconsidering this in the light of this new evidence. As far as I understand it, the presence of M_EXT in an mbuf chain's header in FreeBSD always indicates that we are using external storage (not necessarily, but possibly, a cluster). Can someone confirm this? Regards, BMS
Re: Inconsistencies with IP_ONESBCAST and/or IP_SENDSRCADDR
Bruce M Simpson wrote: Hello, In preparation for tightening up our handling of INADDR_BROADCAST sends, I ran some brief tests today on the network stack with the attached test code. I found some inconsistencies when run against 6.2-RELEASE; 1. IP_ONESBCAST breaks if SO_DONTROUTE is specified. One thing appears to be consistent about the failure mode: bad UDP checksums. dc(4) is being used on the destination end of the test network, so checksum offloading should not be an issue. I am also seeing the wrong destination address being used in most cases. This is intermittent regardless of whether the socket is bound or unbound. This is consistent with ip_output() treating its internal flag IP_SENDONES as separate from IP_ROUTETOIF. I was skimming an old patch of mine which attempts to implement part of SO_BINDTODEVICE which contains a fix related to this condition. The fix isn't the right fix so I will revisit this now and hopefully commit a fix shortly. 2. IP_SENDSRCADDR has some other inconsistencies. a. The option is always rejected if the socket is not bound. I find this behaviour suspect; the whole point of the option is to specify, for SOCK_DGRAM and SOCK_RAW, the source address of a packet. b. 0.0.0.0 is always accepted. A regular interface lookup is used based on destination if this is specified. This appears suspect to me because such an option is redundant. This is of course a separate issue. Because it's more involved (it concerns the general concept of 'ip unnumbered' in the stack) it needs further consideration before any fix is attempted. udp_output() will only call in_pcbbind_setup() if a non-INADDR_ANY source address was specified; this is usually obtained from the socket being bound previously. This explains why the IP_SENDSRCADDR option is rejected in udp_output() for an unbound socket. It *will* be accepted if the option contains INADDR_ANY. In this case, normal source address selection takes place. 
This is a good use case demonstrating the need for source address selection logic such as is now found in NetBSD. There is no sanity checking on the IP_SENDSRCADDR option data containing INADDR_ANY; such an option is redundant and is nonsensical for an unbound socket. We should reject the option if it contains INADDR_ANY if and only if the socket is not bound. Implementing such a check is fairly easy and makes sense for this use case. Returning EINVAL in this case seems acceptable according to ip(4). The option *should* be accepted if the application has bound the socket to a device somehow (oh dear, SO_BINDTODEVICE rears its head again) as DHCP for example needs to override any IPv4 address which may be assigned on an ifnet with 0.0.0.0. Regards, BMS
Re: Inconsistencies with IP_ONESBCAST and/or IP_SENDSRCADDR
Andre Oppermann wrote: I have some WIP here too. I'll send it to you later this afternoon. Thanks, I look forward to seeing it, re Issue #2 IP_SENDSRCADDR. Dealing with dhclient is a separate issue -- here, something like IP_SENDIF needs to be introduced, as we are truly in an 'ip unnumbered' situation -- ie the ifnet MAY not yet have been assigned an IPv4 address at all, and IP_SENDSRCADDR implies that you are source routing in the local stack by passing the address of a numbered interface. I have however dealt with Issue #1 by committing a fix to ip_output() for the IP_ONESBCAST SO_DONTROUTE case. This together with the fix you committed for ethernet next-hop resolution (thanks!) should mean that projects like OLSRD can stop using libnet and other hacks for sending 255.255.255.255 on FreeBSD. The original broadtest tool has now been cleaned up and put into the tree under src/tools/regression/netinet/ipbroadcast. Regards, BMS
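For readers who have not used the option, here is a minimal sketch of how an application might prepare a socket for IP_ONESBCAST sends, following the usage described in ip(4): the option is enabled on the socket and the datagram is then addressed to the subnet broadcast address of the outgoing interface, which the stack rewrites to 255.255.255.255. The helper names subnet_broadcast() and enable_onesbcast() are hypothetical, and the #else branch exists only so the sketch builds on systems without the option.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <errno.h>

/* Subnet (directed) broadcast address for addr/mask, network byte order. */
static in_addr_t
subnet_broadcast(in_addr_t addr, in_addr_t mask)
{
	return addr | ~mask;
}

/*
 * Enable IP_ONESBCAST on socket s. Returns 0 on success, -1 on failure.
 * Where the option is not defined, fail with ENOPROTOOPT (an assumption
 * made for portability of this sketch, not FreeBSD behaviour).
 */
static int
enable_onesbcast(int s)
{
#ifdef IP_ONESBCAST
	int on = 1;

	return setsockopt(s, IPPROTO_IP, IP_ONESBCAST, &on, sizeof(on));
#else
	(void)s;
	errno = ENOPROTOOPT;
	return -1;
#endif
}
```

A sender would call enable_onesbcast() once, then sendto() the value of subnet_broadcast() for the interface it wants the undirected broadcast to leave on.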
Re: is setsockopt SO_NOSIGPIPE work?
Anton Yuzhaninov wrote: RE It works, but only if you use send() instead of write(). RE Alternatively, you can control the behavior on a per RE message basis, by passing the MSG_NOSIGNAL in the flags RE argument to the send() call (without having to set a RE socket option). Thanks, with send() it works fine. I think it should be documented in setsockopt(2). AFAIK this is not a POSIX sockopt. I can only trace it back to MacOS X as the origin. Most applications I know of set the handler for SIGPIPE to SIG_IGN in such situations. Call graph: write() -> dofilewrite() -> soo_write() -> pru_send() Looking at the code for the generic write() path it looks like we would never squelch this kind of SIGPIPE intentionally. In soo_write() we check the SO_NOSIGPIPE option to tell if we should call psignal(). However, as soon as we return from soo_write(), the EPIPE is mapped to psignal() by the generic code in dofilewrite() which generates the SIGPIPE you are seeing. I think this may be a bug but in the absence of precise written requirements I can't be sure. :-) BMS
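The SIG_IGN approach mentioned above is portable and easy to demonstrate. The following is a minimal sketch, assuming a local AF_UNIX socket pair stands in for a real connection; errno_after_peer_close() is a hypothetical helper written for this illustration.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: send to a socket whose peer has gone away and
 * return the errno observed. SIGPIPE is ignored process-wide first, so a
 * broken pipe is reported as an EPIPE error instead of killing the process.
 */
static int
errno_after_peer_close(void)
{
	int sv[2];
	int saved = 0;
	const char msg[] = "x";

	if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
		return errno;
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return errno;
	close(sv[1]);			/* the peer goes away */
	if (send(sv[0], msg, sizeof(msg), 0) == -1)
		saved = errno;		/* a broken pipe shows up here */
	close(sv[0]);
	return saved;
}
```

Ignoring the signal affects every descriptor in the process, which is why a per-socket option or a per-call flag is attractive when only some sockets should be exempt.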
[PATCH] Re: is setsockopt SO_NOSIGPIPE work?
Anton Yuzhaninov wrote: Thanks, with send() it works fine. I think it should be documented in setsockopt(2). Try this patch. The comment doesn't reflect what the code does. SIGPIPE may actually be getting queued twice in your case. It is most likely that the process's main thread wasn't preempted before return from the syscall. Perhaps someone more familiar with the signal code than I can chime in.

--- sys_generic.c	14 Oct 2006 19:01:55 -0000	1.151
+++ sys_generic.c	1 Mar 2007 17:30:39 -0000
@@ -489,7 +489,7 @@ dofilewrite(td, fd, fp, auio, offset, fl
 		    error == EINTR || error == EWOULDBLOCK))
 			error = 0;
 		/* Socket layer is responsible for issuing SIGPIPE. */
-		if (error == EPIPE) {
+		if (fp->f_type != DTYPE_SOCKET && error == EPIPE) {
 			PROC_LOCK(td->td_proc);
 			psignal(td->td_proc, SIGPIPE);
 			PROC_UNLOCK(td->td_proc);
Re: [PATCH] Re: is setsockopt SO_NOSIGPIPE work?
Anton Yuzhaninov wrote: Works for me. Committed, thanks for finding this bug. BMS
Re: [PATCH] Re: is setsockopt SO_NOSIGPIPE work?
N.J. Mann wrote: Could this be why mail from cron doesn't work for me in 6.2? I got as far as finding that cron receives a SIGPIPE while sending the mail message to sendmail, but never worked out why. I ended up hacking cron to ignore SIGPIPE and then ENOTIME to investigate further. Unlikely, unless cron were directly hooked up to a TCP socket. BMS
Re: [PATCH] Feature request: exit netstat(1) after user specified outputs
LI Xin wrote: Hi, If no one objects to this change, I will commit it? No objection here. BMS
Proposal: Add M_HASCL().
Much network code needs to know if the mbuf it is looking at is using a cluster. I propose putting M_HASCL() in sys/mbuf.h. I realise this is a style change, however, it seems to be a very common idiom. Places this macro is currently defined and used directly: netinet/ip_mroute.c netinet6/ip6_mroute.c nfsclient/nfsm_subs.h nfsserver/nfsm_subs.h Places which use this idiom by another name: if_ppp.c ppp_tty.c Places which use this idiom indirectly by its expansion: sys/mbuf.h sys/socketvar.h netinet/ip6.h dev/pdq Many device drivers and third party code. Head on over to http://fxr.watson.org/fxr/ident?i=M_HASCL and have a look. Feel free to not bikeshed about this. It became apparent that this is a common idiom (needing to know if an mbuf is using external storage for whatever reason). Thoughts? BMS
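For anyone unfamiliar with the idiom, it amounts to a test of the external-storage flag in the mbuf header. A minimal sketch follows, using a mock structure and a mock flag value rather than the real sys/mbuf.h definitions, and written with an explicit comparison against zero so the result is a plain truth value.

```c
/* Mock of the one mbuf field the idiom needs; not the real struct mbuf. */
#define MOCK_M_EXT	0x0001	/* buffer has associated external storage */

struct mock_mbuf {
	int m_flags;
};

/* The idiom under discussion: does this mbuf use external storage? */
#define M_HASCL(m)	(((m)->m_flags & MOCK_M_EXT) != 0)
```

Note that the test only says external storage is attached; as discussed elsewhere in this thread, that storage is possibly but not necessarily a cluster.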
Re: making a dumb switch into a smart one
Luigi Rizzo wrote: partly off topic, but maybe someone might find this interesting given that the device can do vlan tag insertion/removal, so it can be used to provide additional fan-in/fan-out to freebsd-based routers in not too high-speed networks. Were you trying to perform the same evil experiment on the Asound 4-port Ethernet 'switch' PCI card which I found in my/your old office at ICSI? ;-) I think it was Juli Mallett who said she had code to deal with the Asound. This is cool. BMS
Re: Networking FreeBSD Wiki
[EMAIL PROTECTED] wrote: George, maybe there should be a separate category in GNATS also, for network issues? Instead of being in kern you mean? I have thought that before but I don't control GNATS and we'd have to review a lot of bugs. I have noticed there has been a gradual effort over time by the Bugmeisters to classify bugs by putting [netinet] or other strings in the one-line bug synopsis. Whilst this is a great help, it still doesn't address many of the issues we have with GNATS, upon which consensus has not yet been reached as to how to go forward. Personally, I'd like to blow GNATS up and replace it with Bugzilla. Regards, BMS
Re: kern/86848: [pf][multicast] destroying active syncdev leads to panic
Hi, Please try the attached patch which should hopefully fix this issue (untested). Regards, BMS

? .swp
Index: if_pfsync.c
===================================================================
RCS file: /home/ncvs/src/sys/contrib/pf/net/if_pfsync.c,v
retrieving revision 1.32
diff -u -p -r1.32 if_pfsync.c
--- if_pfsync.c	29 Dec 2006 13:59:47 -0000	1.32
+++ if_pfsync.c	25 Feb 2007 16:11:03 -0000
@@ -170,6 +170,9 @@ void	pfsync_timeout(void *);
 void	pfsync_send_bus(struct pfsync_softc *, u_int8_t);
 void	pfsync_bulk_update(void *);
 void	pfsync_bulkfail(void *);
+#ifdef __FreeBSD__
+static void	pfsync_ifdetach(void *, struct ifnet *);
+#endif
 
 int	pfsync_sync_ok;
 #ifndef __FreeBSD__
@@ -191,6 +194,9 @@ pfsync_clone_destroy(struct ifnet *ifp)
 	struct pfsync_softc *sc;
 
 	sc = ifp->if_softc;
+#ifdef __FreeBSD__
+	EVENTHANDLER_DEREGISTER(ifnet_departure_event, sc->sc_detachtag);
+#endif
 	callout_stop(&sc->sc_tmo);
 	callout_stop(&sc->sc_bulk_tmo);
 	callout_stop(&sc->sc_bulkfail_tmo);
@@ -225,6 +231,16 @@ pfsync_clone_create(struct if_clone *ifc
 		return (ENOSPC);
 	}
 
+#ifdef __FreeBSD__
+	sc->sc_detachtag = EVENTHANDLER_REGISTER(ifnet_departure_event,
+	    pfsync_ifdetach, sc, EVENTHANDLER_PRI_ANY);
+	if (sc->sc_detachtag == NULL) {
+		if_free(ifp);
+		free(sc, M_PFSYNC);
+		return (ENOSPC);
+	}
+#endif
+
 	pfsync_sync_ok = 1;
 	sc->sc_mbuf = NULL;
 	sc->sc_mbuf_net = NULL;
@@ -1870,6 +1886,35 @@ pfsync_sendout(sc)
 
 #ifdef __FreeBSD__
 static void
+pfsync_ifdetach(void *arg, struct ifnet *ifp)
+{
+	struct pfsync_softc *sc = (struct pfsync_softc *)arg;
+	struct ip_moptions *imo;
+
+	if (sc == NULL || sc->sc_sync_ifp != ifp)
+		return;		/* not for us; unlocked read */
+
+	PF_LOCK();
+
+	/* Deal with detaching an interface which went away. */
+	sc->sc_sync_ifp = NULL;
+	if (sc->sc_mbuf_net != NULL) {
+		s = splnet();
+		m_freem(sc->sc_mbuf_net);
+		sc->sc_mbuf_net = NULL;
+		sc->sc_statep_net.s = NULL;
+		splx(s);
+	}
+	imo = &sc->sc_imo;
+	if (imo->imo_num_memberships > 0) {
+		in_delmulti(imo->imo_membership[--imo->imo_num_memberships]);
+		imo->imo_multicast_ifp = NULL;
+	}
+
+	PF_UNLOCK();
+}
+
+static void
 pfsync_senddef(void *arg)
 {
 	struct pfsync_softc *sc = (struct pfsync_softc *)arg;
@@ -1879,6 +1924,14 @@ pfsync_senddef(void *arg)
 		IF_DEQUEUE(&sc->sc_ifq, m);
 		if (m == NULL)
 			break;
+#if 1
+		/* XXX: paranoia */
+		if (sc->sc_sync_ifp == NULL) {
+			pfsyncstats.pfsyncs_oerrors++;
+			m_freem(m);
+			continue;
+		}
+#endif
 		if (ip_output(m, NULL, NULL, IP_RAWOUTPUT, &sc->sc_imo, NULL))
 			pfsyncstats.pfsyncs_oerrors++;
 	}
Index: if_pfsync.h
===================================================================
RCS file: /home/ncvs/src/sys/contrib/pf/net/if_pfsync.h,v
retrieving revision 1.7
diff -u -p -r1.7 if_pfsync.h
--- if_pfsync.h	10 Jun 2005 17:23:49 -0000	1.7
+++ if_pfsync.h	25 Feb 2007 16:11:03 -0000
@@ -181,6 +181,7 @@ struct pfsync_softc {
 	int			 sc_maxupdates;	/* number of updates/state */
 #ifdef __FreeBSD__
 	LIST_ENTRY(pfsync_softc) sc_next;
+	eventhandler_tag	 sc_detachtag;
 #endif
 };
 #endif
Re: kern/100519: [netisr] suggestion to fix suboptimal network polling
Synopsis: [netisr] suggestion to fix suboptimal network polling
State-Changed-From-To: feedback->open
State-Changed-By: bms
State-Changed-When: Sun Feb 25 16:18:13 UTC 2007
State-Changed-Why: Back to the net pool
Responsible-Changed-From-To: bms->net
Responsible-Changed-By: bms
Responsible-Changed-When: Sun Feb 25 16:18:13 UTC 2007
Responsible-Changed-Why: Back to the net pool
http://www.freebsd.org/cgi/query-pr.cgi?pr=100519
Re: kern/86848: [pf][multicast] destroying active syncdev leads to panic
Whups. That needs 'int s' or the spl calls removed. I am under the weather today (dry flu type virus)...
Re: [PATCH] Re: ioctl: SIOCADDMULTI (howto?)
I have now added a regression test for this bug in HEAD, under src/tools/regression/ethernet/ethermulti. BMS
NetworkRfcCompliance is born
http://wiki.freebsd.org/NetworkRfcCompliance Please begin wiki-whacking! BMS
Re: NetworkRfcCompliance is born
Luigi Rizzo wrote: On Wed, Feb 21, 2007 at 02:50:27PM +0000, Bruce M Simpson wrote: http://wiki.freebsd.org/NetworkRfcCompliance before it is too late to change, maybe it is the case to spell RFC as all capital letters ? It would surely be better named NetworkStandardsCompliance as IEEE stuff appears inevitably also. I am pressed for time at the moment, so, other volunteers very welcome to do so... BMS
Re: Unable to connect broadband
satimis wrote: Hi folks, FreeBSD-6.2-amd64 ... The onboard NIC seems not detected. In the absence of required information, I speculate your machine has msk(4) or another recent chipset which may be supported in FreeBSD-CURRENT but not FreeBSD-STABLE. Please post the full output of 'pciconf -lv' from booting a recent FreeSBIE version to the list and hopefully someone can offer more help. Regards, BMS
If you run IS-IS please contact me
Anyone out there running IS-IS on a FreeBSD machine, please contact me. It's my understanding that IS-IS requires link-layer multicast support. Therefore I would like to hear from anyone who is running an implementation of it on FreeBSD successfully. I want to make sure it continues to operate in the 6.2-STABLE and 7.0-CURRENT code bases, given that we plan a lot of changes to Ethernet and how it works in those code bases. If you could let me know which implementation of IS-IS you're using, how long you've been running it for, how large the network you route with IS-IS is, and which FreeBSD releases you have been using, that would be most useful. Thank you in advance! Regards, BMS
Re: ioctl: SIOCADDMULTI (howto?)
Here is a better patch for the netstat output. I haven't had time to look at the kernel yet. If this patch is good for you I'll commit it on -CURRENT. It cleans up the group membership output significantly and displays the Link-layer information separately. If anyone 'out there' has been relying on this output in scripts, please tell me. BMS

--- mcast.c.orig	Sat Feb 17 18:12:28 2007
+++ mcast.c	Tue Feb 20 23:26:41 2007
@@ -71,21 +71,39 @@
 #define	MYIFNAME_SIZE 128
 
 void
-ifmalist_dump(void)
+ifmalist_dump_af(struct ifmaddrs *ifmap, int af)
 {
-	struct ifmaddrs *ifmap, *ifma;
+	struct ifmaddrs *ifma;
 	sockunion_t *psa;
 	char myifname[MYIFNAME_SIZE];
 	char addrbuf[INET6_ADDRSTRLEN];
 	char *pcolon;
 	void *addr;
-	char *pifname, *plladdr, *pgroup;
+	char *pafname, *pifname, *plladdr, *pgroup;
 
-	if (getifmaddrs(&ifmap))
-		err(EX_OSERR, "getifmaddrs");
+	if (!((af == AF_INET) || (af == AF_LINK)
+#ifdef INET6
+	    || (af == AF_INET6)
+#endif
+	    ))
+		return;
+
+	switch (af) {
+	case AF_INET:
+		pafname = "IPv4";
+		break;
+	case AF_INET6:
+		pafname = "IPv6";
+		break;
+	case AF_LINK:
+		pafname = "Link-layer";
+		break;
+	}
 
-	fputs("IPv4/IPv6 Multicast Group Memberships\n", stdout);
-	fprintf(stdout, "%-20s\t%-16s\t%s\n", "Group", "Gateway", "Netif");
+	fprintf(stdout, "%s Multicast Group Memberships\n", pafname);
+	fprintf(stdout, "%-20s\t%-16s\t%s\n", "Group",
+	    "Next Hop/L2 Address",
+	    "Netif");
 
 	for (ifma = ifmap; ifma; ifma = ifma->ifma_next) {
 
@@ -94,16 +112,32 @@
 
 		/* Group address */
 		psa = (sockunion_t *)ifma->ifma_addr;
+		if (psa->sa.sa_family != af)
+			continue;
 		switch (psa->sa.sa_family) {
 		case AF_INET:
 			pgroup = inet_ntoa(psa->sin.sin_addr);
 			break;
+#ifdef INET6
 		case AF_INET6:
 			addr = &psa->sin6.sin6_addr;
 			inet_ntop(psa->sa.sa_family, addr, addrbuf,
 			    sizeof(addrbuf));
 			pgroup = addrbuf;
 			break;
+#endif
+		case AF_LINK:
+			if ((psa->sdl.sdl_alen == ETHER_ADDR_LEN) ||
+			    (psa->sdl.sdl_type == IFT_ETHER)) {
+				pgroup =
+ether_ntoa((struct ether_addr *)psa->sdl.sdl_data);
+			} else {
+				pgroup = addr2ascii(AF_LINK,
+				    &psa->sdl,
+				    sizeof(struct sockaddr_dl),
+				    addrbuf);
+			}
+			break;
 		default:
 			continue;	/* XXX */
 		}
@@ -116,14 +150,20 @@
 				plladdr = inet_ntoa(psa->sin.sin_addr);
 				break;
 			case AF_LINK:
-				if (psa->sdl.sdl_type == IFT_ETHER)
-					plladdr = ether_ntoa((struct ether_addr *)psa->sdl.sdl_data);
-				else
-					plladdr = link_ntoa(&psa->sdl);
+				if (psa->sdl.sdl_type == IFT_ETHER) {
+					plladdr =
+ether_ntoa((struct ether_addr *)psa->sdl.sdl_data);
+				} else {
+					plladdr = addr2ascii(AF_LINK,
+					    &psa->sdl,
+					    sizeof(struct sockaddr_dl),
+					    addrbuf);
+				}
 				break;
 			}
-		} else
+		} else {
 			plladdr = "<none>";
+		}
 
 		/* Interface upon which the membership exists */
 		psa = (sockunion_t *)ifma->ifma_name;
@@ -143,6 +183,23 @@
 		fprintf(stdout, "%-20s\t%-16s\t%s\n", pgroup, plladdr,
 		    pifname);
 	}
+}
+
+void
+ifmalist_dump(void)
+{
+	struct ifmaddrs *ifmap;
+
+	if (getifmaddrs(&ifmap))
+		err(EX_OSERR, "getifmaddrs");
+
+	ifmalist_dump_af(ifmap, AF_LINK);
+	fputs("\n", stdout);
+	ifmalist_dump_af(ifmap, AF_INET);
+	fputs("\n", stdout);
+#ifdef INET6
+	ifmalist_dump_af(ifmap, AF_INET6);
+#endif
 
 	freeifmaddrs(ifmap);
 }
[PATCH] Re: ioctl: SIOCADDMULTI (howto?)
Jouke Witteveen wrote: So my apologies for suggesting it doesn't work at all; it seems that the application I'm trying to get to work (wpa_supplicant for wired interfaces) just doesn't _send_ its packets the right way. That's a big relief! I added an item to the Wiki for someone to write a regression test. Things aren't perfect though. In if.c the if_findmulti function is broken (always returns NULL). I presume just comparing the *LLADDR((sockaddr *)sa) data on both sockets is a better check, though my knowledge on these things is limited. I think I see a possible problem, though the code looks as though it is behaving as expected. I am looking at RELENG_6 if.c. I think sa_equal() may be to blame. sa_equal() performs a binary comparison on all of sa_data up to sa_len. Looking at struct sockaddr_dl, this might not be the right thing at all in that situation... though I need another pair of eyes to look. Can anyone shed light on this? An AF_INET and AF_INET6 address can be completely specified and compared with sa_equal(). An AF_LINK address looks as though sa_equal() may return irrational results. As for netstat, I do not really know what is keeping it from showing the Multicast addresses. Again: my knowledge on this matter is limited. All I can think of is that getifmaddrs is forgetting something (perhaps the lack of a group membership). Maybe you can take a look at it (I believe you wrote it). I wrote the libc getifmaddrs() function and integrated it into netstat -g; Harti Brandt wrote the NET_RT_IFMALIST support. getifmaddrs() *should* return sockaddr_dl as well as sockaddr_in and all the others. netstat skips over AF_LINK addresses. Try this patch to reveal them. It doesn't seem to show the IPv4 link layer memberships underneath, which is interesting... As I am still learning how best to contribute to a project as big as FreeBSD and I do not think I am skilled enough yet in C I refrain from writing a patch. 
I am eager to see one though, be it only out of curiosity to know what would be considered a proper fix. Give it a try anyway! I like to think we have strong healthy egos round here. Regards, BMS

--- mcast.c.orig	Sat Feb 17 18:12:28 2007
+++ mcast.c	Sat Feb 17 18:14:15 2007
@@ -84,7 +84,7 @@
 	if (getifmaddrs(&ifmap))
 		err(EX_OSERR, "getifmaddrs");
 
-	fputs("IPv4/IPv6 Multicast Group Memberships\n", stdout);
+	fputs("IPv4/IPv6/Layer 2 Multicast Group Memberships\n", stdout);
 	fprintf(stdout, "%-20s\t%-16s\t%s\n", "Group", "Gateway", "Netif");
 
 	for (ifma = ifmap; ifma; ifma = ifma->ifma_next) {
@@ -103,6 +103,15 @@
 			inet_ntop(psa->sa.sa_family, addr, addrbuf,
 			    sizeof(addrbuf));
 			pgroup = addrbuf;
+			break;
+		case AF_LINK:
+			if (psa->sdl.sdl_type == IFT_ETHER) {
+				plladdr = ether_ntoa((struct ether_addr *)
+				    psa->sdl.sdl_data);
+			} else {
+				plladdr = link_ntoa(&psa->sdl);
+			}
+			strlcpy(addrbuf, plladdr, sizeof(addrbuf));
 			break;
 		default:
 			continue;	/* XXX */
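The suspicion about sa_equal() and AF_LINK addresses raised in this thread can be illustrated with a toy model. This is a sketch under stated assumptions, not a diagnosis of if_findmulti(): struct mock_sockaddr_dl is a simplified stand-in for struct sockaddr_dl, and whole_equal() and lladdr_equal() are hypothetical helpers. It shows how two addresses carrying the same link-layer group can differ in a byte-for-byte comparison when only a header field such as the interface index differs.

```c
#include <string.h>

/* Simplified stand-in for struct sockaddr_dl; the layout is illustrative. */
struct mock_sockaddr_dl {
	unsigned char	sdl_len;	/* total length of this structure */
	unsigned char	sdl_family;
	unsigned short	sdl_index;	/* interface index, if assigned */
	unsigned char	sdl_nlen;	/* interface name length */
	unsigned char	sdl_alen;	/* link-level address length */
	char		sdl_data[12];	/* name bytes, then address bytes */
};

/* Byte-for-byte comparison over the stated length, like sa_equal(). */
static int
whole_equal(const struct mock_sockaddr_dl *a, const struct mock_sockaddr_dl *b)
{
	return memcmp(a, b, a->sdl_len) == 0;
}

/* Compare only the link-level address bytes that follow the name. */
static int
lladdr_equal(const struct mock_sockaddr_dl *a, const struct mock_sockaddr_dl *b)
{
	if (a->sdl_alen != b->sdl_alen)
		return 0;
	return memcmp(a->sdl_data + a->sdl_nlen, b->sdl_data + b->sdl_nlen,
	    a->sdl_alen) == 0;
}
```

If a caller fills in the index or name and the stored entry does not (or vice versa), the first comparison fails even though both describe the same Ethernet group address, while the second still matches.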
Recommendations for OSPF v3 book?
Does anyone have any good suggestions for a book which discusses OSPF v3 architecture? I have read the original John Moy book 'OSPF: Anatomy of an Internet routing protocol' but would very much like to know if there is a good text out there which discusses OSPF in the wider context of IPv6 and the improvements made in version 3 of the protocol. I should be most grateful for your suggestions. Kind regards, BMS
Re: [PATCH] Updated 802.1p/q patch
Yar Tikhiy wrote: Do you have any architectural reservations about nested VLANs in the main network stack? Presently, a one-line patch can allow a vlan(4) to attach to another vlan(4), but I haven't heard about the behaviour of the resulting setup yet. After looking around it seems there is definite scope and demand for such a feature in scenarios such as ISP Metro Ethernet setups. However, we can't rely on M_VLANTAG alone to implement it. To do it we need to be sure of the following:
1. Output path in vlan(4) changes not to call ether_output_frame() directly if nested.
2. Output path in vlan(4) detects when it's going to re-enter the parent vlan(4), and makes sure the inner 802.1q header is expanded and inserted from M_VLANTAG before passing it down the stack.
3. That the drivers and cards out there can deal with Q-in-Q.
4. That the input path only extracts and applies M_VLANTAG for the outer 802.1q header.
5. That the input path is able to reenter vlan(4) correctly on the way back up the stack.
The code which produces/consumes M_VLANTAG from the 802.1q header might need to be made common. The priority field then becomes problematic. As a compromise I'd suggest the priority field in the VLAN tag is derived from the innermost 802.1q header, which will be the first M_VLANTAG which the Ethernet part of the stack sees. This gives ALTQ/RSVP/PF a chance to do its thing without complicated workarounds. BMS
Re: [PATCH] Part 2 of low level 802.1p priority support
Pyun YongHyeon wrote: Further testing with drivers is needed (I can't be 100% sure it fails with msk(4) because something strange is happening when vlan tagging is turned off). Perhaps Pyun knows? I guess I've not merged local changes before committing to HEAD. How about attached one? I can confirm that the merged VLAN tag code works OK with msk and VLAN_HWTAGGING disabled when using this patch. Regards, BMS
[PATCH] Updated 802.1p/q patch
Hi, I have tested my 802.1p input patch with vlans configured. So far so good. It is now available from: http://people.FreeBSD.org/~bms/dump/latest-8021p.diff This updated patch moves the 802.1q encapsulation into if_ethersubr.c, allowing M_VLANTAG to be passed up and down the stack for 802.1p priority. I would greatly appreciate wider testing before it is committed. I've noticed that vlan(4) will not put a parent interface into PROMISC if the vlanhwtag capability exists but is disabled. If the main non-vlan input path receives datagrams destined for a layer 3 address configured on a vlan interface, the netinet stack will quite reasonably try to reply on the vlan interface unless net.inet.ip.check_interface is set to 1; something to be aware of. If vlan(4) gets an mbuf which has already been tagged with M_VLANTAG from higher up in the stack, it *should* ignore the vlan id by overwriting it, and using the priority field already assigned to it, so that ALTQ or PF can do its magic. This new patch should do this. The Ethernet code will not use 802.1p by default unless it came from higher up (by way of M_VLANTAG passed to a driver); we should insert the 802.1p tag in the situation where we got an M_VLANTAG from further up without a vlan(4) instance being involved. The new patch should do this. We should also make sure the CFI bit is always cleared in bridging situations as it has special meaning for token ring and FDDI. What has not been tested or considered is the situation where we have nested VLANs. At least one individual has asked about this feature. At the moment, I'd suggest that only Netgraph potentially deals with this rather than the main network stack. Regards, BMS
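The tag handling described above (keep the priority already assigned, overwrite the VLAN ID, clear the CFI bit) can be sketched on the 16-bit 802.1Q Tag Control Information field, which packs a 3-bit priority, the 1-bit CFI and a 12-bit VLAN ID. This is only an illustration and not the code in the patch; the macro and function names are hypothetical.

```c
#include <stdint.h>

/* 802.1Q Tag Control Information layout: PCP(3) | CFI(1) | VID(12). */
#define TCI_VID_MASK	0x0FFFu
#define TCI_CFI_MASK	0x1000u
#define TCI_PCP_MASK	0xE000u
#define TCI_PCP_SHIFT	13

/* Extract the 12-bit VLAN ID. */
static uint16_t
tci_vid(uint16_t tci)
{
	return (uint16_t)(tci & TCI_VID_MASK);
}

/* Extract the 3-bit 802.1p priority. */
static uint16_t
tci_prio(uint16_t tci)
{
	return (uint16_t)(tci >> TCI_PCP_SHIFT);
}

/*
 * Hypothetical helper: keep the priority already present in the tag,
 * overwrite the VLAN ID with vid, and leave the CFI bit cleared.
 */
static uint16_t
tci_retag(uint16_t tci, uint16_t vid)
{
	return (uint16_t)((tci & TCI_PCP_MASK) | (vid & TCI_VID_MASK));
}
```

A priority-tagged frame on VLAN 0 is simply the case where tci_vid() returns 0 while tci_prio() is non-zero.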
Re: Gateway slowed down to barely usable
Andrea Venturoli wrote: Today it suddenly dropped to a bare few b/s. I checked the ISP line by attaching another machine in place of this and it could do full 1Mb/s, so this box was the problem. After a simple reboot it started working as good as always. Now the question is: in case this happens again, how do I find out what's wrong? CPU usage was under 2% and so was swap usage... what else could I check? What tools should I use? Points for further investigation: How long was the machine up for? Exactly which network components in FreeBSD are you using? Do you have any figures on what kind of network load the machine was dealing with? Can you rule out problems with an intermediate switch? Based on what you've said I can only speculate that the possible causes are either mbuf memory fragmentation or a driver problem; both are a total stab in the dark. Regards, BMS bye Thanks av.
Re: Configuring rendezvous point
[EMAIL PROTECTED] wrote: Hi all situation: got freebsd box working as NAT for my local network. In kernel config there is an option PIM.

FYI, PIM is now the default in -CURRENT; the option has been removed. You should be able to load multicast routing with PIM as a loadable kernel module in -CURRENT.

I want my hosts behind NAT to receive multicast streams. I've seen in Debian in pimdd.conf undocumented option rp_address, which stands for rendezvous point IP address (http://ftp.debian.org/debian/pool/main/p/pimd/pimd_2.1.0-alpha29.17-6.diff.gz).

PIM-DM (Dense mode) does not use the Rendezvous Point.

Is there any way to specify rendezvous point in freebsd via pimd.conf or mrouted ?

Try XORP, in ports/net/xorp; it supports PIM-SM (Sparse mode), which is probably what you want for this kind of network configuration. Normally the RP for a given group or set of groups is discovered using the Auto-RP feature of PIM-SM; however, RPs may also be statically configured. See the 'static-rps {}' configuration block in XORP's PIM-SM.

Regards, BMS
Re: [PATCH] netstat(1) should print CIDR prefixes
Gary Corcoran wrote: Since those 'classes' haven't meant anything for many years, and interpreting them as 'special' is just plain wrong in almost all cases these days, I think the change is the right thing to do. I've had +3. Any objections? If I hear none I will make this change in -CURRENT; with a note in UPDATING. Regards, BMS
[PATCH] Part 1 of low level 802.1p priority support
Hi,

Here is the first patch to bring in 802.1p Packet Priority to FreeBSD; this is to support Differentiated Services and Quality-of-Service. This builds on the M_VLANTAG support introduced by Andre last September.

This first stage enables FreeBSD to pass packets for 802.1q with VLAN 0 to the main input path in the stack, which is the IEEE standards-compliant behaviour. With the attached patch and test packet, you can test this for yourself.

Currently this is limited to interfaces which support VLAN_HWTAGGING. To make the change universal, an architectural change is needed; some of the inline 802.1q processing needs to be moved from if_vlan.c to if_ethersubr.c.

To use this:
1. Apply attached patch on a separate machine to be used as a test peer.
2. Process attached hex dump with xxd from Vim distribution (editors/vim) to convert back to a binary pcap file.
3. Configure test address on test peer, preferably using a separate physical LAN.
4. Use ports/net-mgmt/tcpreplay to inject the traffic, with the appropriate IP and MAC addresses.
5. Observe that you get an ICMP echo reply back WITHOUT 802.1q encapsulation.

Currently, the code deals only with receiving VLAN tags at a low level and does nothing about sending them. This is just the low level stuff -- QoS is not magically happening right now.

Comments... testing... suggestions...

Regards, BMS

?
.swp Index: if_ethersubr.c === RCS file: /home/ncvs/src/sys/net/if_ethersubr.c,v retrieving revision 1.222 diff -u -p -r1.222 if_ethersubr.c --- if_ethersubr.c 24 Dec 2006 08:52:13 - 1.222 +++ if_ethersubr.c 10 Feb 2007 16:46:42 - @@ -618,6 +618,7 @@ ether_demux(struct ifnet *ifp, struct mb struct ether_header *eh; int isr; u_short ether_type; + uint16_t vlanid; #if defined(NETATALK) struct llc *l; #endif @@ -627,6 +628,7 @@ ether_demux(struct ifnet *ifp, struct mb KASSERT(ifp != NULL, (ether_demux: NULL interface pointer)); + vlanid = 0; eh = mtod(m, struct ether_header *); ether_type = ntohs(eh-ether_type); @@ -708,36 +710,44 @@ post_stats: */ if (m-m_flags M_VLANTAG) { /* - * If no VLANs are configured, drop. + * Deal with numbered 802.1q VLANs, by passing frames for + * specifically numbered VLANs to the VLAN input handler. */ - if (ifp-if_vlantrunk == NULL) { - ifp-if_noproto++; - m_freem(m); + vlanid = EVL_VLANOFTAG(m-m_pkthdr.ether_vtag); + if (ifp-if_vlantrunk != NULL vlanid != 0) { + KASSERT(vlan_input_p != NULL, + (ether_input: VLAN not loaded!)); + (*vlan_input_p)(ifp, m); return; } /* - * vlan_input() will either recursively call ether_input() - * or drop the packet. + * Drop frames with VLAN encapsulation if VLANs are not + * configured on this interface, if and only if they did + * not contain 802.1p priority information. + * Such frames are preserved, because code further up the + * stack may use the 802.1p information. */ - KASSERT(vlan_input_p != NULL,(ether_input: VLAN not loaded!)); - (*vlan_input_p)(ifp, m); - return; + if (ifp-if_vlantrunk == NULL vlanid != 0) { + ifp-if_noproto++; + m_freem(m); + return; + } } /* * Handle protocols that expect to have the Ethernet header * (and possibly FCS) intact. 
*/ - switch (ether_type) { - case ETHERTYPE_VLAN: + if (ether_type == ETHERTYPE_VLAN vlanid != 0) { if (ifp-if_vlantrunk != NULL) { - KASSERT(vlan_input_p,(ether_input: VLAN not loaded!)); + KASSERT(vlan_input_p, + (ether_input: VLAN not loaded!)); (*vlan_input_p)(ifp, m); } else { ifp-if_noproto++; m_freem(m); + return; } - return; } /* Strip off Ethernet header. */ 000: d4c3 b2a1 0200 0400 010: 0100 51ed cd45 e444 0d00 Q..E.D.. 020: 6600 6600 0010 f...f... 030: c6bb 16f4 8100 0800 4500 0054 258f ..E..T%. 040: 4001 4110 0a00 0005 0a00 0006 0800 [EMAIL PROTECTED] 050: ca41 4154 45cd ed51 000c ce3b 0809 .AAT..E..Q...;.. 060: 0a0b 0c0d 0e0f 1011 1213 1415 1617 1819 070: 1a1b 1c1d 1e1f 2021 2223 2425 2627 2829 .. !#$%'() 080: 2a2b 2c2d 2e2f 3031 3233 3435 3637 *+,-./01234567 ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to [EMAIL PROTECTED]
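The receive-side decision that the Part 1 patch describes can be summarised as a small pure function. This is a sketch of the logic as stated in the message and in the patch comments, not the patch itself; the enum and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical verdicts for a frame that arrived with an 802.1Q tag. */
enum vlan_verdict {
    VLAN_TO_MAIN_PATH,   /* VLAN 0: priority-tagged only, treat as untagged */
    VLAN_TO_VLAN_INPUT,  /* numbered VLAN and VLANs are configured */
    VLAN_DROP            /* numbered VLAN but no VLANs on this interface */
};

/* tci is the tag control field in host order; has_vlan_trunk is non-zero
 * when at least one vlan(4) instance is attached to the parent interface. */
static enum vlan_verdict classify_tagged_frame(uint16_t tci, int has_vlan_trunk)
{
    uint16_t vid = tci & 0x0fff;

    if (vid == 0)
        return VLAN_TO_MAIN_PATH;
    return has_vlan_trunk ? VLAN_TO_VLAN_INPUT : VLAN_DROP;
}
```

A frame tagged 0xA000 (priority 5, VLAN 0) therefore goes up the main input path whether or not any VLANs are configured, which is the behaviour step 5 of the test procedure checks for.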
[PATCH] Part 2 of low level 802.1p priority support
This updated patch moves VLAN tag decapsulation into if_ethersubr.c and always uses M_VLANTAG, which is also passed to the upper layer.

Tests with ping:
fxp (no VLAN_HWTAGGING support): OK
msk (VLAN_HWTAGGING enabled): OK
msk (VLAN_HWTAGGING disabled): FAIL

I am concerned that this may need review and testing to support situations where we do nested VLANs or with bridge(4) before it can be committed. Further testing with drivers is needed (I can't be 100% sure it fails with msk(4) because something strange is happening when vlan tagging is turned off). Perhaps Pyun knows?

Regards, BMS

Index: if_ethersubr.c === RCS file: /home/ncvs/src/sys/net/if_ethersubr.c,v retrieving revision 1.222 diff -u -p -r1.222 if_ethersubr.c --- if_ethersubr.c 24 Dec 2006 08:52:13 - 1.222 +++ if_ethersubr.c 10 Feb 2007 18:16:54 - @@ -701,43 +701,50 @@ post_stats: } } #endif - /* - * Check to see if the device performed the VLAN decapsulation and - * provided us with the tag. + * If the device did not perform decapsulation of the 802.1q VLAN + * header itself, do this now, and tag the mbuf with M_VLANTAG. + * Remove the 802.1q header by copying the Ethernet addresses over + * it and adjusting the beginning of the data in the mbuf. + * Re-inspect the ether_type field so we do the right thing + * for VLAN 0. */ - if (m-m_flags M_VLANTAG) { - /* - * If no VLANs are configured, drop. - */ - if (ifp-if_vlantrunk == NULL) { - ifp-if_noproto++; - m_freem(m); + if ((ether_type == ETHERTYPE_VLAN) !(m-m_flags M_VLANTAG)) { + struct ether_vlan_header *evl; + + if (m-m_len sizeof(*evl) + (m = m_pullup(m, sizeof(*evl))) == NULL) { + if_printf(ifp, cannot pullup VLAN header\n); return; } - /* - * vlan_input() will either recursively call ether_input() - * or drop the packet. 
- */ - KASSERT(vlan_input_p != NULL,(ether_input: VLAN not loaded!)); - (*vlan_input_p)(ifp, m); - return; + + evl = mtod(m, struct ether_vlan_header *); + m-m_pkthdr.ether_vtag = ntohs(evl-evl_tag); + m-m_flags |= M_VLANTAG; + bcopy((char *)evl, (char *)evl + ETHER_VLAN_ENCAP_LEN, + ETHER_HDR_LEN - ETHER_TYPE_LEN); + m_adj(m, ETHER_VLAN_ENCAP_LEN); + /* We need to see the inner type field in case of reentry. */ + eh = mtod(m, struct ether_header *); + ether_type = ntohs(eh-ether_type); } /* - * Handle protocols that expect to have the Ethernet header - * (and possibly FCS) intact. + * Deal with numbered 802.1q VLANs, by passing these frames to + * the VLAN input handler. Frames destined for VLAN 0 are for + * the main input path. Otherwise, drop frames with VLAN tags. */ - switch (ether_type) { - case ETHERTYPE_VLAN: + if ((m-m_flags M_VLANTAG) + EVL_VLANOFTAG(m-m_pkthdr.ether_vtag) != EVL_VLAN_ZERO) { if (ifp-if_vlantrunk != NULL) { - KASSERT(vlan_input_p,(ether_input: VLAN not loaded!)); + KASSERT(vlan_input_p, + (ether_input: VLAN not loaded!)); (*vlan_input_p)(ifp, m); } else { ifp-if_noproto++; m_freem(m); + return; } - return; } /* Strip off Ethernet header. */ Index: if_vlan.c === RCS file: /home/ncvs/src/sys/net/if_vlan.c,v retrieving revision 1.117 diff -u -p -r1.117 if_vlan.c --- if_vlan.c 30 Dec 2006 21:10:25 - 1.117 +++ if_vlan.c 10 Feb 2007 18:16:54 - @@ -911,51 +911,9 @@ vlan_input(struct ifnet *ifp, struct mbu uint16_t tag; KASSERT(trunk != NULL, (%s: no trunk, __func__)); + KASSERT((m-m_flags M_VLANTAG),(%s: M_VLANTAG not set, __func__)); - if (m-m_flags M_VLANTAG) { - /* - * Packet is tagged, but m contains a normal - * Ethernet frame; the tag is stored out-of-band. - */ - tag = EVL_VLANOFTAG(m-m_pkthdr.ether_vtag); - m-m_flags = ~M_VLANTAG; - } else { - struct ether_vlan_header *evl; - - /* - * Packet is tagged in-band as specified by 802.1q. 
- */ - switch (ifp-if_type) { - case IFT_ETHER: - if (m-m_len sizeof(*evl) - (m = m_pullup(m, sizeof(*evl))) == NULL) { -if_printf(ifp, cannot pullup VLAN header\n); -return; - } - evl = mtod(m, struct ether_vlan_header *); - tag = EVL_VLANOFTAG(ntohs(evl-evl_tag)); - - /* - * Remove the 802.1q header by copying the Ethernet - * addresses over it and adjusting the beginning of - * the data in the mbuf. The encapsulated Ethernet - * type field is already in place. - */ - bcopy((char *)evl, (char *)evl + ETHER_VLAN_ENCAP_LEN, - ETHER_HDR_LEN - ETHER_TYPE_LEN); - m_adj(m, ETHER_VLAN_ENCAP_LEN); - break; - - default: -#ifdef INVARIANTS - panic(%s: %s has unsupported if_type %u, - __func__, ifp-if_xname, ifp-if_type); -#endif - m_freem(m); - ifp-if_noproto++; - return; - } - } + tag = EVL_VLANOFTAG(m-m_pkthdr.ether_vtag); TRUNK_RLOCK(trunk); #ifdef VLAN_ARRAY Index: if_vlan_var.h
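The software decapsulation step in the Part 2 patch ("copying the Ethernet addresses over it and adjusting the beginning of the data") can be illustrated on a plain byte buffer instead of an mbuf. This is a sketch under that simplification; vlan_decap and the two length constants are hypothetical names, and the real code uses bcopy() and m_adj() on the mbuf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDRS_LEN  12  /* destination MAC + source MAC */
#define VLAN_ENCAP_LEN  4  /* TPID (0x8100) + tag control field */

/* Strip an in-band 802.1Q header from a raw Ethernet frame.
 * On success, stores the tag control field (host order) in *tci and returns
 * a pointer to the new start of the frame, which is VLAN_ENCAP_LEN bytes
 * further into the same buffer. Returns NULL if the frame is too short or
 * does not carry an 802.1Q header. */
static uint8_t *vlan_decap(uint8_t *frame, size_t len, uint16_t *tci)
{
    assert(frame != NULL && tci != NULL);

    /* Need both MACs, the 4-byte tag and the inner 2-byte type field. */
    if (len < ETH_ADDRS_LEN + VLAN_ENCAP_LEN + 2)
        return NULL;
    if (frame[12] != 0x81 || frame[13] != 0x00)
        return NULL;

    *tci = (uint16_t)((frame[14] << 8) | frame[15]);

    /* Slide the two MAC addresses forward over the tag; the inner type
     * field is already in the right place directly after them. */
    memmove(frame + VLAN_ENCAP_LEN, frame, ETH_ADDRS_LEN);
    return frame + VLAN_ENCAP_LEN;
}
```

After the call, the returned pointer addresses an ordinary untagged Ethernet frame, so the type field has to be read again from the new header, as the patch comment about re-inspecting ether_type notes.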
[PATCH] Introduce M_PROMISC to lower part of Ethernet code
Hi, Thunderbird keeps crashing whenever I draft these messages, which is frustrating. Can we discuss this change? I would like to get it in as we get the following wins: 1. Potentially cleaner code in ether_demux()/ether_input() 2. Ways of detecting and preventing L2/L3 forwarding loops 3. Being able to do more with promiscuous mode in general e.g. using it to emulate broken IFF_ALLMULTI with network cards which can't support multicast routing properly. Feedback eagerly looked forward to; this is not a complete change; this is strictly development quality at the moment. Regards, BMS Index: net/if_ethersubr.c === RCS file: /home/ncvs/src/sys/net/if_ethersubr.c,v retrieving revision 1.222 diff -u -p -r1.222 if_ethersubr.c --- net/if_ethersubr.c 24 Dec 2006 08:52:13 - 1.222 +++ net/if_ethersubr.c 10 Feb 2007 20:59:39 - @@ -582,6 +582,7 @@ ether_input(struct ifnet *ifp, struct mb if (IFP2AC(ifp)-ac_netgraph != NULL) { KASSERT(ng_ether_input_p != NULL, (ng_ether_input_p is NULL)); + m-m_flags = ~M_PROMISC; (*ng_ether_input_p)(ifp, m); if (m == NULL) return; @@ -598,6 +599,7 @@ ether_input(struct ifnet *ifp, struct mb * at the src/sys/netgraph/ng_ether.c:ng_ether_rcv_upper() */ if (ifp-if_bridge) { + m-m_flags = ~M_PROMISC; BRIDGE_INPUT(ifp, m); if (m == NULL) return; @@ -634,6 +636,14 @@ ether_demux(struct ifnet *ifp, struct mb if (rule) /* packet was already bridged */ goto post_stats; #endif + /* + * If the frame was received promiscuously, mark it as such. + */ + if ((ifp-if_flags IFF_PROMISC) + !ETHER_IS_MULTICAST(eh-ether_dhost) + bcmp(eh-ether_dhost, IF_LLADDR(ifp), ETHER_ADDR_LEN) != 0) { + m-m_flags |= M_PROMISC; + } if (!(ifp-if_bridge) !((ether_type == ETHERTYPE_VLAN || m-m_flags M_VLANTAG) @@ -648,8 +658,10 @@ ether_demux(struct ifnet *ifp, struct mb * evaluation, to see if the carp ether_dhost values break any * of these checks! 
*/ - if (ifp-if_carp carp_forus(ifp-if_carp, eh-ether_dhost)) + if (ifp-if_carp carp_forus(ifp-if_carp, eh-ether_dhost)) { + m-m_flags = ~M_PROMISC; goto pre_stats; + } #endif /* * Discard packet if upper layers shouldn't see it because it @@ -662,14 +674,16 @@ ether_demux(struct ifnet *ifp, struct mb * give them a chance to consider it as well (e. g. in case * bridging is only active on a VLAN). They will drop it if * it's undesired. + * + * XXX: There is no way this check can be invoked if + * there are no VLANs attached to this parent interface, + * which is likely to cause recursion if we're acting + * as an IP forwarder... */ - if ((ifp-if_flags IFF_PROMISC) != 0 - !ETHER_IS_MULTICAST(eh-ether_dhost) - bcmp(eh-ether_dhost, - IF_LLADDR(ifp), ETHER_ADDR_LEN) != 0 - (ifp-if_flags IFF_PPROMISC) == 0) { - m_freem(m); - return; + if ((m-m_flags M_PROMISC) + (ifp-if_flags IFF_PPROMISC) == 0) { + m_freem(m); + return; } } @@ -720,6 +734,7 @@ post_stats: * or drop the packet. */ KASSERT(vlan_input_p != NULL,(ether_input: VLAN not loaded!)); + m-m_flags = ~M_PROMISC; (*vlan_input_p)(ifp, m); return; } @@ -732,6 +747,7 @@ post_stats: case ETHERTYPE_VLAN: if (ifp-if_vlantrunk != NULL) { KASSERT(vlan_input_p,(ether_input: VLAN not loaded!)); + m-m_flags = ~M_PROMISC; (*vlan_input_p)(ifp, m); } else { ifp-if_noproto++; Index: sys/mbuf.h === RCS file: /home/ncvs/src/sys/sys/mbuf.h,v retrieving revision 1.202 diff -u -p -r1.202 mbuf.h --- sys/mbuf.h 25 Jan 2007 01:05:23 - 1.202 +++ sys/mbuf.h 10 Feb 2007 20:59:40 - @@ -182,6 +182,7 @@ struct mbuf { #define M_FIRSTFRAG 0x1000 /* packet is first fragment */ #define M_LASTFRAG 0x2000 /* packet is last fragment */ #define M_VLANTAG 0x1 /* ether_vtag is valid */ +#define M_PROMISC 0x2 /* packet was not for us */ /* * External buffer types: identify ext_buf type. 
@@ -203,7 +204,7 @@ struct mbuf { #define M_COPYFLAGS (M_PKTHDR|M_EOR|M_RDONLY|M_PROTO1|M_PROTO1|M_PROTO2|\ M_PROTO3|M_PROTO4|M_PROTO5|M_SKIP_FIREWALL|\ M_BCAST|M_MCAST|M_FRAG|M_FIRSTFRAG|M_LASTFRAG|\ - M_VLANTAG) + M_VLANTAG|M_PROMISC) /* * Flags to purge when crossing layers. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to [EMAIL PROTECTED]
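The condition under which the patch above sets M_PROMISC can be written as a standalone predicate. This is a sketch that mirrors the three tests in the diff (interface in promiscuous mode, destination not multicast, destination not our own link-layer address); the function names are hypothetical and plain byte arrays stand in for the ifnet and mbuf structures.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Group (multicast/broadcast) addresses have the low bit of octet 0 set. */
static int mac_is_multicast(const uint8_t *mac)
{
    return mac[0] & 0x01;
}

/* Returns 1 if a frame with destination dst, seen on an interface whose own
 * address is our_mac, was only received because the interface is in
 * promiscuous mode; returns 0 otherwise. */
static int received_promiscuously(int if_promisc, const uint8_t *dst,
    const uint8_t *our_mac)
{
    assert(dst != NULL && our_mac != NULL);
    return if_promisc &&
        !mac_is_multicast(dst) &&
        memcmp(dst, our_mac, MAC_LEN) != 0;
}
```

Computing this once and carrying the result as a flag is what lets the later check in ether_demux() shrink to a test of M_PROMISC plus IFF_PPROMISC.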
[PATCH] Make INET6 MROUTING dynamically loadable in GENERIC
Hi, This should do what it says on the tin... Regards, BMS Make IPv6 multicast forwarding dynamically loadable into a GENERIC kernel. Index: conf/files === RCS file: /home/ncvs/src/sys/conf/files,v retrieving revision 1.1175 diff -u -p -r1.1175 files --- conf/files 7 Feb 2007 18:55:29 - 1.1175 +++ conf/files 10 Feb 2007 22:07:19 - @@ -1759,6 +1759,7 @@ netinet/ip_icmp.c optional inet netinet/ip_input.c optional inet netinet/ip_ipsec.c optional ipsec netinet/ip_ipsec.c optional fast_ipsec +netinet/ip_mroute.c optional inet | inet6 netinet/ip_mroute.c optional mrouting netinet/ip_options.c optional inet netinet/ip_output.c optional inet @@ -1814,7 +1815,7 @@ netinet6/in6_src.c optional inet6 netinet6/ip6_forward.c optional inet6 netinet6/ip6_id.c optional inet6 netinet6/ip6_input.c optional inet6 -netinet6/ip6_mroute.c optional inet6 +netinet6/ip6_mroute.c optional mrouting inet6 netinet6/ip6_output.c optional inet6 netinet6/ipcomp_core.c optional ipsec netinet6/ipcomp_input.c optional ipsec Index: modules/ip_mroute_mod/Makefile === RCS file: /home/ncvs/src/sys/modules/ip_mroute_mod/Makefile,v retrieving revision 1.14 diff -u -p -r1.14 Makefile --- modules/ip_mroute_mod/Makefile 9 Feb 2007 01:42:43 - 1.14 +++ modules/ip_mroute_mod/Makefile 10 Feb 2007 22:07:19 - @@ -1,13 +1,21 @@ # $FreeBSD: src/sys/modules/ip_mroute_mod/Makefile,v 1.14 2007/02/09 01:42:43 bms Exp $ -.PATH: ${.CURDIR}/../../netinet +.PATH: ${.CURDIR}/../../netinet ${.CURDIR}/../../netinet6 KMOD= ip_mroute -SRCS= ip_mroute.c opt_mac.h opt_mrouting.h +SRCS= ip_mroute.c +SRCS+= ip6_mroute.c +SRCS+= opt_inet.h opt_inet6.h opt_mac.h opt_mrouting.h .if !defined(KERNBUILDDIR) +opt_inet.h: + echo #define INET 1 ${.TARGET} + +opt_inet6.h: + echo #define INET6 1 ${.TARGET} + opt_mrouting.h: - echo #define MROUTING 1 ${.TARGET} + echo #define MROUTING 1 ${.TARGET} .endif .include bsd.kmod.mk Index: netinet/ip_mroute.c === RCS file: /home/ncvs/src/sys/netinet/ip_mroute.c,v retrieving revision 1.128 diff -u -p 
-r1.128 ip_mroute.c --- netinet/ip_mroute.c 10 Feb 2007 14:48:42 - 1.128 +++ netinet/ip_mroute.c 10 Feb 2007 22:07:20 - @@ -55,6 +55,8 @@ * $FreeBSD: src/sys/netinet/ip_mroute.c,v 1.128 2007/02/10 14:48:42 bms Exp $ */ +#include opt_inet.h +#include opt_inet6.h #include opt_mac.h #include opt_mrouting.h @@ -217,6 +219,12 @@ struct protosw in_pim_protosw = { .pr_usrreqs = rip_usrreqs }; static const struct encaptab *pim_encap_cookie; + +#ifdef INET6 +extern struct protosw in6_pim_protosw; /* ip6_mroute.c: struct in6_protosw */ +static const struct encaptab *pim6_encap_cookie; +#endif + static int pim_encapcheck(const struct mbuf *, int, int, void *); /* @@ -2737,7 +2745,7 @@ pim_register_send_rp(struct ip *ip, stru } /* - * pim_encapcheck() is called by the encap4_input() path at runtime to + * pim_encapcheck() is called by the encap[46]_input() path at runtime to * determine if a packet is for PIM; allowing PIM to be dynamically loaded * into the kernel. */ @@ -2995,6 +3003,10 @@ pim_input_to_daemon: return; } +/* + * XXX: This is common code for dealing with initialization for both + * the IPv4 and IPv6 multicast forwarding paths. It could do with cleanup. 
+ */ static int ip_mroute_modevent(module_t mod, int type, void *unused) { @@ -3006,6 +3018,7 @@ ip_mroute_modevent(module_t mod, int typ ip_mrouter_reset(); TUNABLE_ULONG_FETCH(net.inet.pim.squelch_wholepkt, pim_squelch_wholepkt); + pim_encap_cookie = encap_attach_func(AF_INET, IPPROTO_PIM, pim_encapcheck, in_pim_protosw, NULL); if (pim_encap_cookie == NULL) { @@ -3015,6 +3028,23 @@ ip_mroute_modevent(module_t mod, int typ mtx_destroy(mrouter_mtx); return (EINVAL); } + +#ifdef INET6 + pim6_encap_cookie = encap_attach_func(AF_INET6, IPPROTO_PIM, + pim_encapcheck, in6_pim_protosw, NULL); + if (pim6_encap_cookie == NULL) { + printf(ip_mroute: unable to attach pim6 encap\n); + if (pim_encap_cookie) { + encap_detach(pim_encap_cookie); + pim_encap_cookie = NULL; + } + VIF_LOCK_DESTROY(); + MFC_LOCK_DESTROY(); + mtx_destroy(mrouter_mtx); + return (EINVAL); + } +#endif + ip_mcast_src = X_ip_mcast_src; ip_mforward = X_ip_mforward; ip_mrouter_done = X_ip_mrouter_done; @@ -3039,6 +3069,12 @@ ip_mroute_modevent(module_t mod, int typ if (ip_mrouter) return EINVAL; +#ifdef INET6 + if (pim6_encap_cookie) { + encap_detach(pim6_encap_cookie); + pim6_encap_cookie = NULL; + } +#endif if (pim_encap_cookie) { encap_detach(pim_encap_cookie); pim_encap_cookie = NULL; Index: netinet6/in6_proto.c === RCS file: /home/ncvs/src/sys/netinet6/in6_proto.c,v retrieving
[PATCH] netstat(1) should print CIDR prefixes
Hi,

This is a POLA-violating 'let's move with the times' patch that gets rid of the special treatment of classful IPv4 network prefixes in 'netstat -rn' output. Comments please!

Regards, BMS

Index: route.c
===================================================================
RCS file: /home/ncvs/src/usr.bin/netstat/route.c,v
retrieving revision 1.76
diff -u -p -r1.76 route.c
--- route.c 13 May 2005 16:31:10 -0000 1.76
+++ route.c 10 Feb 2007 22:55:50 -0000
@@ -865,32 +865,7 @@ netname(u_long in, u_long mask)
 		strncpy(line, cp, sizeof(line) - 1);
 		line[sizeof(line) - 1] = '\0';
 	} else {
-		switch (dmask) {
-		case IN_CLASSA_NET:
-			if ((i & IN_CLASSA_HOST) == 0) {
-				sprintf(line, "%lu", C(i >> 24));
-				break;
-			}
-			/* FALLTHROUGH */
-		case IN_CLASSB_NET:
-			if ((i & IN_CLASSB_HOST) == 0) {
-				sprintf(line, "%lu.%lu",
-					C(i >> 24), C(i >> 16));
-				break;
-			}
-			/* FALLTHROUGH */
-		case IN_CLASSC_NET:
-			if ((i & IN_CLASSC_HOST) == 0) {
-				sprintf(line, "%lu.%lu.%lu",
-					C(i >> 24), C(i >> 16), C(i >> 8));
-				break;
-			}
-			/* FALLTHROUGH */
-		default:
-			sprintf(line, "%lu.%lu.%lu.%lu",
-				C(i >> 24), C(i >> 16), C(i >> 8), C(i));
-			break;
-		}
+		inet_ntop(AF_INET, (char *)&in, line, sizeof(line) - 1);
 	}
 	domask(line + strlen(line), i, mask);
 	return (line);
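To illustrate the classless output the netstat patch is aiming for, here is a small standalone sketch that formats a network and mask as a full dotted quad plus prefix length. It is not the netstat code: mask_to_prefixlen and format_cidr are hypothetical helpers, both arguments are assumed to be in host byte order, and only inet_ntop() is shared with the patch.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>

/* Count the leading one bits of a host-order netmask.
 * Returns -1 if the mask is not contiguous. */
static int mask_to_prefixlen(uint32_t mask)
{
    int len = 0;

    while (mask & 0x80000000u) {
        len++;
        mask <<= 1;
    }
    return mask == 0 ? len : -1;
}

/* Format a host-order network and mask as "a.b.c.d/len".
 * Returns 0 on success, -1 on a bad mask or a too-small buffer. */
static int format_cidr(uint32_t net, uint32_t mask, char *buf, size_t buflen)
{
    char addr[INET_ADDRSTRLEN];
    struct in_addr in;
    int len, n;

    assert(buf != NULL);
    len = mask_to_prefixlen(mask);
    if (len < 0)
        return -1;
    in.s_addr = htonl(net);
    if (inet_ntop(AF_INET, &in, addr, sizeof(addr)) == NULL)
        return -1;
    n = snprintf(buf, buflen, "%s/%d", addr, len);
    return (n >= 0 && (size_t)n < buflen) ? 0 : -1;
}
```

With these helpers, network 10.0.0.0 with mask 255.0.0.0 is rendered as 10.0.0.0/8 rather than abbreviated to its classful form.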
Re: Networking FreeBSD Wiki
Joel Dahl wrote: How about moving stuff from the (outdated) dingo[*] project page to this wiki page instead? [*] http://www.freebsd.org/projects/dingo/ That's what he did. I feel a twinge of responsibility for this, and the stupid name. I have just totally steamrollered in and edited (merged some of my own tasks). I have a bunch of other stuff on my list which I'll add...! BMS
Re: kern/106999: [netgraph] [patch] ng_ksocket fails to clear multicast flag on mbuf before passing to stack
Synopsis: [netgraph] [patch] ng_ksocket fails to clear multicast flag on mbuf before passing to stack Responsible-Changed-From-To: freebsd-net-bms Responsible-Changed-By: bms Responsible-Changed-When: Fri Feb 9 02:39:10 UTC 2007 Responsible-Changed-Why: I'll take this http://www.freebsd.org/cgi/query-pr.cgi?pr=106999
Re: [PATCH] tun(4) does not clean up after itself
This change has now been committed on -CURRENT (reviewed by bz@) so it is now settling in.
Re: Proposal: remove encap from MROUTING
I count no objections and +1 in favour from Andre. To maintain POLA, I will decapitate (Argh, pun) it from HEAD with no MFC to begin with. Arguments in favour: * mrouted was removed from the base system. * PIM does not use MROUTING's IPIP tunnels, and PIM is regarded as the standard these days for multicast routing. * MROUTING's internal tunnels do not have any management capabilities; gif(4) has. * It achieves some diff reduction with OpenBSD. * Reduces locking/netisr fandango. * The MROUTING paths could do with some overall cleanup anyway, and it doesn't seem appropriate to merge back such work, except as a backport. I plan to commit this patch some time this week. Regards, BMS Index: ip_mroute.c === RCS file: /home/ncvs/src/sys/netinet/ip_mroute.c,v retrieving revision 1.122 diff -u -p -r1.122 ip_mroute.c --- ip_mroute.c 6 Nov 2006 13:42:04 - 1.122 +++ ip_mroute.c 7 Feb 2007 02:57:07 - @@ -181,33 +181,7 @@ static struct callout expire_upcalls_ch; static struct tbf tbftable[MAXVIFS]; #define TBF_REPROCESS (hz / 100) /* 100x / second */ -/* - * 'Interfaces' associated with decapsulator (so we can tell - * packets that went through it from ones that get reflected - * by a broken gateway). These interfaces are never linked into - * the system ifnet list no routes point to them. I.e., packets - * can't be sent this way. They only exist as a placeholder for - * multicast source verification. 
- */ -static struct ifnet multicast_decap_if[MAXVIFS]; - #define ENCAP_TTL 64 -#define ENCAP_PROTO IPPROTO_IPIP /* 4 */ - -/* prototype IP hdr for encapsulated packets */ -static struct ip multicast_encap_iphdr = { -#if BYTE_ORDER == LITTLE_ENDIAN - sizeof(struct ip) 2, IPVERSION, -#else - IPVERSION, sizeof(struct ip) 2, -#endif - 0,/* tos */ - sizeof(struct ip), /* total length */ - 0,/* id */ - 0,/* frag offset */ - ENCAP_TTL, ENCAP_PROTO, - 0,/* checksum */ -}; /* * Bandwidth meter variables and constants @@ -287,14 +261,6 @@ static vifi_t reg_vif_num = VIFI_INVALID * Private variables. */ static vifi_t numvifs; -static const struct encaptab *encap_cookie; - -/* - * one-back cache used by mroute_encapcheck to locate a tunnel's vif - * given a datagram's src ip address. - */ -static u_long last_encap_src; -static struct vif *last_encap_vif; /* * Callout for queue processing. @@ -325,7 +291,6 @@ static int set_assert(int); static void expire_upcalls(void *); static int ip_mdq(struct mbuf *, struct ifnet *, struct mfc *, vifi_t); static void phyint_send(struct ip *, struct vif *, struct mbuf *); -static void encap_send(struct ip *, struct vif *, struct mbuf *); static void tbf_control(struct vif *, struct mbuf *, struct ip *, u_long); static void tbf_queue(struct vif *, struct mbuf *); static void tbf_process_q(struct vif *); @@ -792,14 +757,6 @@ X_ip_mrouter_done(void) ip_mrouter = NULL; mrt_api_config = 0; -VIF_LOCK(); -if (encap_cookie) { - const struct encaptab *c = encap_cookie; - encap_cookie = NULL; - encap_detach(c); -} -VIF_UNLOCK(); - callout_stop(tbf_reprocess_ch); VIF_LOCK(); @@ -859,8 +816,6 @@ X_ip_mrouter_done(void) /* * Reset de-encapsulation cache */ -last_encap_src = INADDR_ANY; -last_encap_vif = NULL; #ifdef PIM reg_vif_num = VIFI_INVALID; #endif @@ -924,90 +879,6 @@ set_api_config(uint32_t *apival) } /* - * Decide if a packet is from a tunnelled peer. - * Return 0 if not, 64 if so. XXX yuck.. 64 ??? 
- */ -static int -mroute_encapcheck(const struct mbuf *m, int off, int proto, void *arg) -{ -struct ip *ip = mtod(m, struct ip *); -int hlen = ip-ip_hl 2; - -/* - * don't claim the packet if it's not to a multicast destination or if - * we don't have an encapsulating tunnel with the source. - * Note: This code assumes that the remote site IP address - * uniquely identifies the tunnel (i.e., that this site has - * at most one tunnel with the remote site). - */ -if (!IN_MULTICAST(ntohl(((struct ip *)((char *)ip+hlen))-ip_dst.s_addr))) - return 0; -if (ip-ip_src.s_addr != last_encap_src) { - struct vif *vifp = viftable; - struct vif *vife = vifp + numvifs; - - last_encap_src = ip-ip_src.s_addr; - last_encap_vif = NULL; - for ( ; vifp vife; ++vifp) - if (vifp-v_rmt_addr.s_addr == ip-ip_src.s_addr) { - if ((vifp-v_flags (VIFF_TUNNEL|VIFF_SRCRT)) == VIFF_TUNNEL) - last_encap_vif = vifp; - break; - } -} -if (last_encap_vif == NULL) { - last_encap_src = INADDR_ANY; - return 0; -} -return 64; -} - -/* - * De-encapsulate a packet and feed it back through ip input (this - * routine is called whenever IP gets a packet that mroute_encap_func() - * claimed). - */ -static void -mroute_encap_input(struct mbuf *m, int off) -{ -struct ip *ip = mtod(m, struct ip *); -int hlen = ip-ip_hl 2; - -if (hlen sizeof(struct ip)) -
Re: [PATCH] ip_fastfwd forwards directed broadcasts
This has now been applied to -CURRENT after testing by a 3rd party.
Seeking to hear from people with broken IFF_ALLMULTI cards
Hi, If any of you out there have network interfaces which have broken ALLMULTI handling (i.e. they can't handle multicast routing), I would love to hear from you. Regards, BMS
Re: ioctl: SIOCADDMULTI (howto?)
Jouke Witteveen wrote:

Hello all, I'm in need of some information on how to utilize SIOCADDMULTI. It is supposed to be demonstrated by the mtest [1] program, but that doesn't do anything (on an SIOCDELMULTI rn it appears nothing was added: ENOENT), at least not for the values I tested, 1.80.c2.0.0.1 in particular. I presume it doesn't work because the program has not been revised in 3 years and revision 1.4 notes that it might not work. If this ioctl is deprecated then please tell me what is the best way to receive multicast messages from the 01.80.c2.00.00.0x (802.1) range? It is of course possible to go into ALLMULTI mode and check on all datagrams, but the NICs I use are suited with a very nice hardware filter (21143 chip) that should be able to do this more effectively. Anyway, I believe Linux still programs the hardware filter through SIOCADDMULTI so is a bit easier on this. I tracked down the source from the ioctl call to the network driver for some time now and could find no obvious fault, except for quite a lot of casting and inconsistent use of types (checks happen on all sorts of casts: socket, sokcet_dl, multiaddr, ...).

It's quite possible that path is broken, as hardly anyone else out there needs to directly join a link-layer multicast group, and there is no regression test for it. The IP paths are known to work A-OK. If you didn't have code hooked up to ether_demux() to see this traffic, you'd never see it in userland anyway. As such, it's not a priority for me to fix, but I will try to help anyway.

Are there specific performance constraints for your app? If not, you should just be able to use pcap (or bpf) to get the traffic. Admittedly this is a performance hit, but with the optimization work on bpf and ever more powerful CPUs, this shouldn't be a big issue. You can write a regression test for this though with getifmaddrs().

anglepoise:~/head/src/sys/net % s mtest
Password:
multicast membership test program; enter ? for list of commands
a fxp0 01.80.c2.00.00.02
ether address added

should yield route -nv monitor output:

got message of size 128 on Mon Feb 5 21:23:57 2007
RTM_NEWMADDR: new multicast group membership on iface: len 128, sockaddrs: IFP,IFA
 fxp0:0.90.27.59.40.2c 1.80.c2.0.0.2

Of course, netstat -g won't show you this, because it's concerned with IP/IPv6 only. netstat -ian should however tell you which link-layer multicast addresses are configured. When I add an Ethernet multicast address manually with mtest, I see vmstat -m | grep ether_multi increment as I'd expect.

It looks like there may be a missing piece somewhere. The code which I see is OK but the results aren't as I'd expect. I am quite tired at the moment so I may be way off.

Regards, BMS
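As a starting point for the kind of regression test mentioned above, the dot-separated link-layer address notation used in the mtest session can be parsed and checked for the group bit before it is handed to any ioctl. This is a userland sketch only; parse_lladdr and is_link_layer_multicast are hypothetical helpers, not part of mtest, and nothing here calls SIOCADDMULTI.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Parse a link-layer address written as six dot-separated hex octets,
 * e.g. "01.80.c2.00.00.02" or "1.80.c2.0.0.2".
 * Returns 0 on success, -1 on malformed input. */
static int parse_lladdr(const char *s, uint8_t mac[6])
{
    unsigned int b[6];
    char extra;
    int i;

    assert(s != NULL && mac != NULL);
    /* The trailing %c only converts if junk follows the sixth octet. */
    if (sscanf(s, "%x.%x.%x.%x.%x.%x%c",
        &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &extra) != 6)
        return -1;
    for (i = 0; i < 6; i++) {
        if (b[i] > 0xff)
            return -1;
        mac[i] = (uint8_t)b[i];
    }
    return 0;
}

/* A link-layer address is a group address if the low bit of octet 0 is set. */
static int is_link_layer_multicast(const uint8_t mac[6])
{
    return mac[0] & 0x01;
}
```

Both 01.80.c2.00.00.02 and the shorter 1.80.c2.0.0.1 form parse to group addresses, while the interface address fxp0 reported above (0.90.27.59.40.2c) does not have the group bit set.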
Proposal: remove encap from MROUTING
How would you all feel about removing the old encapsulation methods from IPv4 multicast routing as OpenBSD has done? http://www.openbsd.org/cgi-bin/cvsweb/src/sys/netinet/ip_mroute.c.diff?r1=1.42&r2=1.43 The last time I deployed any such infrastructure, I had to use gif(4); in a NATted world, the encap stuff has never worked cleanly for me or been worth the additional effort in deployment. Regards, BMS
[PATCH] tun(4) does not clean up after itself
Hi,

I just saw this PR: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/100080

This patch appears to fix the problem. Any obvious glaring errors? Testers please?

Regards, BMS

Index: if_tun.c
===================================================================
RCS file: /home/ncvs/src/sys/net/if_tun.c,v
retrieving revision 1.161
diff -u -p -r1.161 if_tun.c
--- if_tun.c 6 Nov 2006 13:42:02 -0000 1.161
+++ if_tun.c 2 Feb 2007 23:30:04 -0000
@@ -388,16 +388,21 @@ tunclose(struct cdev *dev, int foo, int
 		splx(s);
 	}
 
+	/* Delete all addresses and routes which reference this interface. */
 	if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
 		struct ifaddr *ifa;
 
 		s = splimp();
-		/* find internet addresses and delete routes */
-		TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link)
-			if (ifa->ifa_addr->sa_family == AF_INET)
-				/* Unlocked read. */
+		TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) {
+			/* deal w/IPv4 PtP destination; unlocked read */
+			if (ifa->ifa_addr->sa_family == AF_INET) {
 				rtinit(ifa, (int)RTM_DELETE,
 				    tp->tun_flags & TUN_DSTADDR ? RTF_HOST : 0);
+			} else {
+				rtinit(ifa, (int)RTM_DELETE, 0);
+			}
+		}
+		if_purgeaddrs(ifp);
 		ifp->if_drv_flags &= ~IFF_DRV_RUNNING;
 		splx(s);
 	}