Call for testers: olsrd and IP_ONESBCAST

2007-04-09 Thread Bruce M Simpson

Hi,

For a while now I have had a patch available to teach olsrd to use 
IP_ONESBCAST instead of using libnet/bpf just to send broadcast 
datagrams in FreeBSD, which has had IP_ONESBCAST for a few years now.


If anyone is using olsrd on FreeBSD I would greatly appreciate testing 
and feedback for this patch: 
http://people.freebsd.org/~bms/dump/olsrd-onesbcast.diff


Thanks!
BMS
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Interface index hack in IP_ADD_MEMBERSHIP

2007-04-09 Thread Bruce M Simpson

Yar Tikhiy wrote:

Quagga still uses it, too, if its configure script detects FreeBSD
or NetBSD.  I'm afraid it was me who submitted the patch to the
Quagga folks when I'd found that Quagga's ospfd couldn't handle
unnumbered P2P interfaces in FreeBSD because their local IPs weren't
unique.  Unfortunately, Quagga doesn't seem to use the protocol
independent part of the RFC 3678 API yet.
  


A preliminary patch for the Rhyolite.com routed is available at:
   http://people.freebsd.org/~bms/dump/routed.rfc3678.diff

The upcoming rewrite of IPv4 multicast host-mode logic (currently in 
bms_netdev) adds support for the Linux-derived 'struct ip_mreqn' for 
specifying interface indexes to IP_MULTICAST_IF. The RFC 3678 API is 
implemented; IGMPv3 and MLDv2 may be hooked in later on subject to 
available resources.
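
A minimal sketch of what selecting the outgoing multicast interface by index could look like with struct ip_mreqn. It assumes the structure has the Linux layout (imr_multiaddr, imr_address, imr_ifindex); the helper names are illustrative and are not taken from the bms_netdev branch.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1	/* glibc: keep struct ip_mreqn visible under strict -std= */
#endif
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <string.h>

/* Build an ip_mreqn that names the outgoing interface purely by index. */
struct ip_mreqn
mreqn_for_index(unsigned int ifindex)
{
	struct ip_mreqn mreqn;

	assert(ifindex != 0);	/* this sketch always names a real interface */
	memset(&mreqn, 0, sizeof(mreqn));
	mreqn.imr_address.s_addr = htonl(INADDR_ANY);	/* no address needed */
	mreqn.imr_ifindex = (int)ifindex;
	return (mreqn);
}

/* Select the outgoing multicast interface for fd by interface index. */
int
set_multicast_if_index(int fd, unsigned int ifindex)
{
	struct ip_mreqn mreqn = mreqn_for_index(ifindex);

	return (setsockopt(fd, IPPROTO_IP, IP_MULTICAST_IF, &mreqn,
	    sizeof(mreqn)));
}
```

A caller would typically obtain the index with if_nametoindex(3), so no unique local IPv4 address is needed to identify the interface.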


The RFC 1724 hack has been completely removed from the kernel in this 
spin. The new code passes the existing regression tests for any-source 
multicast. I hope to have source-specific multicast regression tests in 
the main tree ASAP; I am very close to a code drop.


Whilst the radical approach of rewriting this stuff may break legacy 
applications, they should probably be updated to support the new APIs 
anyway, given that Linux 2.6 and Microsoft Windows Longhorn both 
support RFC 3678.


Regards,
BMS


Re: Spillover routing?

2007-04-07 Thread Bruce M. Simpson

Rajkumar S wrote:

Hi,

I have a low cost 128kbps and a high cost 512 kbps link to internet.
Is it possible to do a spillover routing so that the high cost link
is used only when the low cost link is, say, used more than 80%?

This feature is almost certainly not going to be present in the base 
system. To implement it, you would need to configure a part of the 
kernel to perform bandwidth measurements and make an upcall to bring up 
the other link in a dial-on-demand style configuration. Add NAT into the 
mix and it gets even more interesting. I believe pf+altq may have the 
potential to do this; however, I could not tell you where to begin 
configuring it to do so, so I wish you the best of luck in your research.
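
To make the 80% criterion concrete, here is a minimal sketch of the threshold test a measurement loop might apply. It is not an existing kernel or pf/altq facility; the function name and the reading of "128kbps" as 128000 bit/s are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when the traffic counted during one sampling interval
 * exceeds pct percent of the link capacity.
 * Rate used is bytes * 8 * 1000 / interval_ms bit/s; both sides are
 * cross-multiplied so that no division (and no rounding) is needed.
 */
bool
link_over_threshold(uint64_t bytes_sent, uint64_t interval_ms,
    uint64_t link_bps, unsigned int pct)
{
	assert(interval_ms > 0 && pct <= 100);
	return (bytes_sent * 8 * 1000 * 100 > link_bps * interval_ms * pct);
}
```

For a 128000 bit/s link sampled once per second, 80% corresponds to 102400 bits, i.e. 12800 bytes per interval; anything above that would be the cue to bring up the second link.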


Regards,
BMS


Source-specific multicast

2007-04-06 Thread Bruce M Simpson
I am very close to merging support for RFC 3678 to -CURRENT.  I will 
make a patch available before I commit.


The only userland consumer in the tree which is likely to be affected by 
the removal of ip_multicast_if() from the kernel is routed, which I will 
update to use the new setsourcefilter() API.
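
The change to routed itself is not shown in this message. As background on the RFC 3678 source-filter interface: alongside the full-state setsourcefilter() call, the RFC defines delta-style socket options. Below is a minimal sketch of a source-specific join using MCAST_JOIN_SOURCE_GROUP; the helper names and the addresses in the usage note are illustrative only.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1	/* glibc: keep RFC 3678 types visible under strict -std= */
#endif
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <string.h>

/* Store a dotted-quad IPv4 address in a sockaddr_storage. */
static int
fill_v4(struct sockaddr_storage *ss, const char *text)
{
	struct sockaddr_in sin;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
#ifdef __FreeBSD__
	sin.sin_len = sizeof(sin);	/* BSD sockaddrs carry their length */
#endif
	if (inet_pton(AF_INET, text, &sin.sin_addr) != 1)
		return (-1);
	memset(ss, 0, sizeof(*ss));
	memcpy(ss, &sin, sizeof(sin));
	return (0);
}

/* Describe "group, but only from source, on interface ifindex". */
int
make_ssm_request(struct group_source_req *gsr, unsigned int ifindex,
    const char *group, const char *source)
{
	assert(gsr != NULL);
	memset(gsr, 0, sizeof(*gsr));
	gsr->gsr_interface = ifindex;
	if (fill_v4(&gsr->gsr_group, group) == -1 ||
	    fill_v4(&gsr->gsr_source, source) == -1)
		return (-1);
	return (0);
}

/* Join (source, group) on fd; returns -1 on a bad address or setsockopt error. */
int
join_ssm(int fd, unsigned int ifindex, const char *group, const char *source)
{
	struct group_source_req gsr;

	if (make_ssm_request(&gsr, ifindex, group, source) == -1)
		return (-1);
	return (setsockopt(fd, IPPROTO_IP, MCAST_JOIN_SOURCE_GROUP, &gsr,
	    sizeof(gsr)));
}
```

A call such as join_ssm(fd, ifindex, "232.1.1.1", "192.0.2.10") would ask for group 232.1.1.1 from that one sender only; both addresses are placeholders.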


The SSM code does change some of the coupling between sockets and IGMP, 
and changes some logic in udp_input; strict multicast membership becomes 
the default. Systems which deal with many multicast sockets and a lot of 
multicast traffic may benefit from an additional hash table. I haven't 
finished touching the raw IP input path.


Given current looming commitments I'm open to someone volunteering to 
finish the work of merging IGMPv3 and MLDv2, or possibly to fund the work.


I wish to get at least the socket part of ASM/SSM merged before I come 
back to Yar's PR with vlan and pfsync, which I have not had reason to 
investigate thoroughly; I have had no further reports of problems with 
carp(4) in -CURRENT.


regards,
BMS


Re: A radical restructuring of IPsec...

2007-04-06 Thread Bruce M. Simpson
I'm all for this in principle. I believe that the case for FAST_IPSEC 
over KAME IPSEC is fairly clear for those of us who have read the USENIX 
paper. Qualitatively speaking I can say FAST_IPSEC has been more 
pleasant to work with when introducing the TCP-MD5 support.


I will try to look at the patch in more detail as time permits.

Regards,
BMS


Re: IPv6 Router Alert breaks forwarding

2007-04-05 Thread Bruce M. Simpson
I can only speak about IPv4 router alert in detail; we do nothing with 
IPv4 RA, nor does it appear that doing so would make any real difference 
to performance given how the code is laid out. RSVP packets should be 
passed verbatim to userland from ip_input() via rip_input().


I think your IPv6 fix is good for now but will wait to hear further from 
[EMAIL PROTECTED]


I am heading out the door so if someone could add an item for this to 
http://wiki.FreeBSD.org/Networking I should be most grateful.


Regards,
BMS


Re: intel 802.11 2200BG routing

2007-04-01 Thread Bruce M. Simpson

Da Rock wrote:


So I could use some guidance as to what I can do to rectify this 
problem. I have 2 goals:

1. setup iwi to start on boot, and attach to my ap whenever it's in range.
2. make sure iwi stays connected without manually monitoring it.
3. prioritise my routes via the rl0 and iwi if's so that cable is used 
over wifi, but both can be used to access the network. 


Umm, that's 3 goals. :^) The short answer is, you can't do what you're 
trying to do, yet.


You can cut over without rebooting, you just need to remember to kill 
off all dhclient processes and manually remove the default route, as in 
FreeBSD all forwarding entries ('routes') reference an interface 
pointer, and the PRC_IFDOWN handler will not touch routes marked RTF_STATIC.


No one as far as I know has rolled a 'cutover' script. What would be 
really useful is a port which can do this cutover in a more general way 
until the stack is changed. This isn't that different from say Microsoft 
Windows where a manual cutover is needed, although the OS having a 
multipath FIB ('routing table') helps.


The long answer is, it's possible, and it requires some things in the 
network stack to be carefully reworked. I have looked at these issues in 
some depth; there are at least 3 items on the Network Stack Wiki which 
are directly relevant to making the kind of clean cut-over between 
wireless/wired interfaces possible.


These are, notably: revisiting the PRC_IFDOWN handler in netinet, making 
forwarding entry lookup skip interfaces marked down, and introducing 
route preference into the routing trie. There are historical reasons why the 
code is the way it is. It will take a while to get these issues 
addressed going forward.
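
As a rough illustration of the last two items, here is a minimal userland sketch of a lookup that skips entries whose interface is down and then applies a preference. The structure and function are hypothetical simplifications, not the kernel's rtentry or radix code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified forwarding entry for one destination. */
struct fwd_entry {
	const char	*ifname;	/* outgoing interface */
	bool		 if_up;		/* is that interface up? */
	unsigned int	 preference;	/* lower value wins */
};

/*
 * Pick the best usable entry among the candidates for one destination:
 * skip entries whose interface is down, then take the lowest preference.
 * Returns NULL when no candidate is usable.
 */
const struct fwd_entry *
select_entry(const struct fwd_entry *cand, size_t n)
{
	const struct fwd_entry *best = NULL;
	size_t i;

	assert(cand != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!cand[i].if_up)
			continue;
		if (best == NULL || cand[i].preference < best->preference)
			best = &cand[i];
	}
	return (best);
}
```

With a default route via rl0 at preference 10 and one via iwi0 at preference 20, the wired path is chosen while rl0 is up, and traffic falls over to wireless as soon as rl0 is marked down, without anyone deleting routes by hand.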


Regards,
BMS

P.S. routed isn't going to help you at all in this situation, it's just 
an implementation of the RIPv2 routing protocol; it may have helped as 
the routes it introduces to the kernel are !RTF_STATIC.


One thing I haven't tried is IPv4 Router Discovery (rdisc), that may 
help update the default route quickly. The problem with this of course 
is the additional network configuration in the infrastructure itself.



Re: IPFW update frequency

2007-03-30 Thread Bruce M. Simpson
For what it's worth, the code I wrote for XORP is only for IPFW2, and 
uses its tables feature to atomically transcribe XORP rulesets to IPFW 
ones before swapping them in.


Regards,
BMS


Re: Merging rc.d/network_ipv6 into rc.d/netif

2007-03-29 Thread Bruce M. Simpson

Mike Makonnen wrote:

I would
especially like feedback from folks more familiar with IPv6. One
gotcha I've noticed is that if you boot with ipv6_enable turned
off, then try to start IPv6 on an interface later on, it doesn't
work because none of the interfaces (except lo0) has a link-local
address (see rc.d/auto_linklocal). How can we fix this? Also, I
would appreciate feedback on how stopping IPv6 on an interface
should be handled. In rc.d/network_ipv6 it wasn't handled at all.
Currently, it goes through and deletes all
IPv6 addresses on the interface.
  


I agree. We should be able to add/remove IPv6 link-local addresses 
somehow at runtime, after boot, without necessarily bringing up IPv6 on 
an interface during boot.


I am thinking at some point it may be for the best if some of the code 
to do with address families is restructured so that the administrator is 
able to explicitly attach or detach protocol domains e.g. AF_INET, 
AF_INET6 to network interfaces on the command line, based on my 
experience of making the changes necessary for refcounting of various 
network stack structures.


I'd like to get this fixed going forward, though, as ever, other work 
takes priority...


Regards,
BMS


Re: The broadcast of python in FreeBSD

2007-03-28 Thread Bruce M. Simpson

Zhu Yan wrote:


When I send the broadcast in FreeBSD with address 255.255.255.255, the
packet can not be received by other OS. 


FreeBSD applications need to use the IP_ONESBCAST option to send 
all-ones broadcasts. See the ip(4) man page.
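
A minimal C sketch of the option in use, assuming the semantics described in ip(4). The function name is made up for illustration, and on systems whose headers lack IP_ONESBCAST the sketch simply reports ENOPROTOOPT.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <assert.h>
#include <errno.h>

/*
 * Ask the stack to rewrite the destination of outgoing broadcast
 * datagrams on fd to 255.255.255.255.
 * Returns 0 on success, -1 with errno set otherwise.
 */
int
enable_onesbcast(int fd)
{
#ifdef IP_ONESBCAST
	int on = 1;

	/* SO_BROADCAST is still required to send any broadcast datagram. */
	if (setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &on, sizeof(on)) == -1)
		return (-1);
	return (setsockopt(fd, IPPROTO_IP, IP_ONESBCAST, &on, sizeof(on)));
#else
	(void)fd;
	errno = ENOPROTOOPT;	/* option not available on this platform */
	return (-1);
#endif
}
```

With the option set, the application still calls sendto() with the network broadcast address of the interface it wants to use (192.0.2.255 on a 192.0.2.0/24 network, say); that address selects the interface, and the stack substitutes 255.255.255.255 before transmission.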


Regards,
BMS


Re: Vrrp/CARP/ucarp Problems

2007-03-27 Thread Bruce M. Simpson

Andrea Venturoli wrote:

Jordan Gordeev wrote:
The only load balancing that CARP supports, to my knowledge, is ARP 
level load balancing. From carp(4):

The ARP load balancing has some limitations.  First, ARP balancing only
works on the local network segment.  It cannot balance traffic that
crosses a router, because the router itself will always be balanced to
the same virtual host.


Forgive me for stepping in, but I had read the above statement over 
and over trying to figure what it meant; perhaps it's not so clear...


If I understood it correctly it's not saying you should not use CARP 
on routers. Instead it's meaning that load-balancing won't cross a 
third router which is on cascade of the two CARP routers.

...

Andrea, you are correct. Jordan is pointing out the main limitation of 
CARP, which is that it operates only within a broadcast domain.  I 
should point out that balancing across routers is out of scope for VRRP, 
CARP, IPMP or any other Layer 2 IP sharing protocol. However, this 
behaviour is just fine 
for load balancing a router, in which case one relies on next-hop 
reachability anyway.


The thing to remember with CARP is that it relies on the ability of the 
interface to go into promiscuous mode to pick up traffic for its virtual 
MAC addresses. More modern cards may support more than one station 
address in hardware, which avoids the need for promiscuous mode 
processing, however we don't currently support this hardware feature.


If one wishes to load balance across Layer 3 hops (rather than within 
the same broadcast domain), what one is asking for is a feature like 
BGP4 Anycast, IPv6 Anycast, or OSPF-based Anycast which relies on 
cooperating routers to inject a route into the Layer 3 routing domain 
for a given 'virtual' IP address.


There is a daemon out there which uses the OSPF API in Quagga to flood 
OSPF domains with virtual host routes for anycasting services using 
Opaque LSAs but I forget its name. XORP has the potential to do the same 
but requires some development effort to do so.


If one wishes to load balance specific requests for an application layer 
service, one enters the wonderful world of 'middleware' and competing 
commercial solutions to the problem.


And this is where money comes into play...

Regards,
BMS


Re: kern/19875: A new protocol family, PF_IPOPTION, to handle IP options at socket interface

2007-03-26 Thread Bruce M Simpson
Synopsis: A new protocol family, PF_IPOPTION, to handle IP options at socket 
interface

State-Changed-From-To: suspended-closed
State-Changed-By: bms
State-Changed-When: Mon Mar 26 14:36:38 UTC 2007
State-Changed-Why: 
It is unlikely this code will ever be committed.

Reasons:
1) This information can be obtained via cmsg so as to lie out-of-band
   of protocol data
2) This code is IPv4 specific
3) Most consumers of IP options and router alerts either live in the kernel,
   or have this information delivered via raw sockets.

http://www.freebsd.org/cgi/query-pr.cgi?pr=19875


Re: GRE with key

2007-03-26 Thread Bruce M. Simpson

Cristian KLEIN wrote:

Hello everybody,

I am new to FreeBSD kernel hacking, so please excuse my perhaps stupid
questions.

I would like to add key support to gre(4). I have already been able to
use gre(4) with a hardcoded key. The single thing remaining to do is to
transfer the key from ifconfig(8). The key is an uint32_t and I haven't
found a way to transfer it without modifying ifconfig(8).
  

Excellent. Thanks for volunteering to do this!

My question is, which is the BSD-style to achieve the above? Solutions
I came up with are as follows:
1) Use SIOCSDRVSPEC / SIOCGDRVSPEC
2) Add SIOCSGREKEY / SIOCGGREKEY
3) [Probably too ugly to be mentioned, but requires fairly few
modifications.] Add a sysctl MIB which is read when calling ifconfig
... create.
  
If I were doing this, I would add the code to ifconfig.c where the other 
tunnel stuff lives, and go for option number 2. Feel free to modify 
ifconfig to accommodate the new options.

Another thing I wanted to ask is, which function of ifconfig(8) should I
modify to display the GRE key?
  

Look at how af_status_tunnel() works and consider adding it there.
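
For context on what that uint32_t ends up as on the wire: RFC 2890 places the key in an optional 32-bit field that is present when the K bit is set in the GRE flags. The sketch below builds such a header in userland; it is not the gre(4) driver code, and the function and macro names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GRE_FLAG_KEY	0x2000	/* K bit: Key field present (RFC 2890) */

/*
 * Write a GRE header carrying a key into buf and return its length
 * (8 bytes: flags/version, protocol type, key), or 0 if buf is too small.
 * All fields are stored in network byte order.
 */
size_t
gre_write_keyed_header(uint8_t *buf, size_t buflen, uint16_t proto,
    uint32_t key)
{
	assert(buf != NULL);
	if (buflen < 8)
		return (0);
	buf[0] = (uint8_t)(GRE_FLAG_KEY >> 8);	/* flags, high byte */
	buf[1] = (uint8_t)(GRE_FLAG_KEY & 0xff);	/* flags low byte, version 0 */
	buf[2] = (uint8_t)(proto >> 8);		/* payload EtherType */
	buf[3] = (uint8_t)(proto & 0xff);
	buf[4] = (uint8_t)(key >> 24);		/* key, most significant byte first */
	buf[5] = (uint8_t)(key >> 16);
	buf[6] = (uint8_t)(key >> 8);
	buf[7] = (uint8_t)key;
	return (8);
}
```

Because the key is a plain 32-bit quantity, whichever ioctl pair is chosen only needs to carry one uint32_t in each direction, and ifconfig can print it as a single number.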

Regards,
BMS


Re: MPLS implementation

2007-03-21 Thread Bruce M. Simpson

Sam Wun wrote:

Hi,

Is there any MPLS implementation for FreeBSD?
I found a port ayame mpls for netbsd, but the last implementation was
dated back to 2003, seems very old.

There is NISTswitch, but it is most likely very bit-rotted by now.

I would suggest helping Aniruddha Bohra out on the Click port as it 
would be a great starting point for prototyping MPLS due to how Click 
will most likely attach to the kernel forwarding paths.


The key to success with MPLS is to learn from the layer 2 forwarding 
stuff in if_bridge; to integrate cleanly with the Ethernet code; to use 
ALTQ for the token bucket filter and traffic classification policies; 
and to not break the regular forwarding path.


Regards,
BMS


Re: ICMP-floods

2007-03-20 Thread Bruce M. Simpson
I have a patch attached to http://wiki.freebsd.org/Networking to 
rate-limit ICMP which is generated by the forwarding path.


It would be useful to find out if this offers symptomatic relief in this 
situation, although as Chuck points out, it is most likely being caused 
by a routing loop.


Regards,
BMS


Proposal: Merge RFC3678 multicast APIs

2007-03-20 Thread Bruce M Simpson

Hi,

I propose that we merge the RFC3678 advanced multicast APIs. Doing so 
gets us closer to IGMPv3 and SSM. I would greatly appreciate suggestions 
about how to deal with the include header issue below.


I have already started merging the basic definitions into p4 branch 
bms_netdev.


Background:
* RFC3678 specifies user and kernel APIs for any-source and
  specific-source multicast for IPv4, IPv6, and protocol-independent use.
  * this includes struct ip_mreq_source and friends
  * SIOCSIPMSFILTER and SIOCGMSFILTER are historical and may be ignored.

Impact:
* It requires that struct sockaddr_storage is visible to netinet/in.h.
  * This change breaks the following files in the kernel:
      in4_cksum.c inet_ntoa.c ip_ecn.c in6_cksum.c in_cksum.c slcompress.c
    ...which do not include sys/socket.h where this structure is defined.

Benefit:
* We get the SSM API. We don't support IGMPv3 or SSM yet, but this is
  part of the work.
* Better to do this now and incrementally; the IGMPv3 implementations
  out there for FreeBSD have been published as patch sets which are now
  bitrotting.
* This lets us eliminate the ugly RFC1724 hack from the IPv4 stack,
  which is used to specify an outgoing IPv4 multicast interface by
  passing a 24-bit interface index in the host portion of a 0.0.0.0/8
  address.
  * This behaviour is not portable; Microsoft Windows Vista uses the
    full 32-bit wide interface index space in both its IPv4 and IPv6
    stack. No snickering from the gallery please -- Dave Thaler has done
    excellent work bringing the MS stack closer to IETF standards.
  * routed uses this; it can be patched to not do so; the RFC3678 API
    for this is to use the generic MCAST_JOIN_GROUP socket option which
    accepts an interface index as an argument in struct group_req.
  * Linux defines a struct ip_mreqn as a workaround for applications
    using the pre RFC3678 API. Inside the kernel it maps IFA to IFP when
    handling IP_ADD_MEMBERSHIP, thus avoiding the 0.0.0.0/8 hack.
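
For readers who have not met the hack, a minimal sketch of the encoding described above: the interface index travels in the low 24 bits of an address whose first octet is zero. The function names are illustrative and this is not the kernel's ip_multicast_if() code.

```c
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Encode an interface index as a 0.0.0.0/8 address (index in the low 24 bits). */
struct in_addr
ifindex_to_addr(uint32_t ifindex)
{
	struct in_addr a;

	assert(ifindex <= 0x00ffffffu);	/* only 24 bits are available */
	a.s_addr = htonl(ifindex);
	return (a);
}

/*
 * Decode: return the index if addr lies in 0.0.0.0/8, or 0 if addr is an
 * ordinary interface address (or 0.0.0.0 itself).
 */
uint32_t
addr_to_ifindex(struct in_addr addr)
{
	uint32_t host = ntohl(addr.s_addr);

	if ((host >> 24) != 0)
		return (0);
	return (host & 0x00ffffffu);
}
```

Interface 2 is thus written as 0.0.0.2, which is why the scheme cannot represent an index wider than 24 bits.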

See ip(4) in HEAD for the polite rendering of my rant about doing IGMP 
correctly and its  implications for addressing in the IPv4 stack (short: 
you need an IP address for it to work properly, and source address 
selection, or IPv6, is looking like a really good idea in a 
wireless/manet/mobile/ad-hoc world).


Regards,
BMS


Interface index hack in IP_ADD_MEMBERSHIP

2007-03-19 Thread Bruce M Simpson

Hi,

I plan to get rid of the ugly little ip_multicast_if() hack in the IP 
stack.

Before I do, is anyone actually using this?

RFC 3678 specifies a protocol independent API for socket group 
memberships which allows joins on interfaces referenced by index. This is 
intended to support IGMPv3 and MLDv2.
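
A minimal sketch of such a join, using the RFC 3678 MCAST_JOIN_GROUP option and struct group_req. The helper names are illustrative; the group used in the usage note is the RIPv2 group, picked only because routed is the consumer under discussion.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1	/* glibc: keep RFC 3678 types visible under strict -std= */
#endif
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <string.h>

/* Fill a group_req for an IPv4 group on the interface with index ifindex. */
void
make_group_req(struct group_req *gr, unsigned int ifindex,
    struct in_addr group)
{
	struct sockaddr_in sin;

	assert(gr != NULL);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
#ifdef __FreeBSD__
	sin.sin_len = sizeof(sin);	/* BSD sockaddrs carry their length */
#endif
	sin.sin_addr = group;

	memset(gr, 0, sizeof(*gr));
	gr->gr_interface = ifindex;	/* the interface is named by index only */
	memcpy(&gr->gr_group, &sin, sizeof(sin));
}

/* Any-source join of group on fd, on the interface with index ifindex. */
int
join_group_by_index(int fd, unsigned int ifindex, struct in_addr group)
{
	struct group_req gr;

	make_group_req(&gr, ifindex, group);
	return (setsockopt(fd, IPPROTO_IP, MCAST_JOIN_GROUP, &gr, sizeof(gr)));
}
```

Joining 224.0.0.9 on a point-to-point interface then needs only the value returned by if_nametoindex(3), with no 0.0.0.0/8 pseudo-address involved.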


Regards,
BMS




Re: Interface index hack in IP_ADD_MEMBERSHIP

2007-03-19 Thread Bruce M Simpson

Eugene Grosbein wrote:


I recall that routed and ripd used to utilize something similar
long time ago. I'm not sure if they have switched to another API.
  

You're right -- this would break routed on point-to-point interfaces.

They haven't switched, unless routed was updated upstream, i.e. at rhyolite.com.

This means that the RFC1724 hack can't be safely deprecated without 
breaking this use case, until routed is updated to use the RFC 3678 
protocol-independent ASM API.


Linux uses a slightly different technique to work around this; ip_mreq 
is expanded to ip_mreqn internally, and the interface index is 
explicitly passed around in the kernel.


The blocker in the FreeBSD case which prevents us simply adopting this 
is the source interface selection logic in ip_output().


Regards,
BMS


[PATCH] Multicast refcounting in network stack

2007-03-19 Thread Bruce M Simpson

Hi,

A patch against -CURRENT is now available:
   http://people.freebsd.org/~bms/dump/multi_refcounting.diff

This is a fairly sweeping architectural change which should resolve 
memory leaks and potential panics with the network stack as a whole, to 
better support interface detach at runtime.


I'd like to check it in as soon as possible as it fixes the root cause 
of the problems we have had with carp and pfsync in our stack. NetBSD 
has implemented refcounting like this for some time now, so it does not 
suffer from the same problems.


Regards,
BMS


Re: networking code and splx()

2007-03-19 Thread Bruce M. Simpson

Ignacio Rey wrote:

...
The question is: Have calls to these functions been wrapped? or are they
simply not used in this context?
  
splx() and friends have been no-ops since FreeBSD 5.x was branched. 
Synchronization is now done using other mechanisms such as mutexes and 
spin locks. See the new man page locking(9) in -CURRENT.


Regards,
BMS


Re: PMTU Discovery support

2007-03-19 Thread Bruce M. Simpson

Kevin Lahey wrote:


The boxes were running FreeBSD-6.1, but I can't really vouch for the
particular kernel configuration.  It could well be that the problem is
with the loose nut behind the wheel, rather than with FreeBSD. :-)
  


I believe PMTU measurements may only be relied upon for active TCP 
connections, but it's been a while since I read this code.


It would be useful if non-TCP drivers such as gre(4) could be extended 
to perform PMTU discovery and auto-tune their MTU based on this, as 
manually setting the MTU is a bit random and can result in horrible 
fragmentation when going across the big-I Internet.


I imagine doing this would require changes to the icmp input path and a 
bit of abstraction.


Regards,
BMS


Re: [PATCH] Multicast refcounting in network stack

2007-03-19 Thread Bruce M Simpson

Andre Oppermann wrote:

  http://people.freebsd.org/~bms/dump/multi_refcounting.diff

Patch looks good.  :-)

Committed, with some changes.

Regards,
BMS


IPv4, IPv6 and link-layer multicast refcounting in bms_netdev

2007-03-17 Thread Bruce M Simpson
I have just committed reference counting for multicast structures in p4. 
Change list number is 116036.


This should fix the problems seen with pfsync and carp since the scalability 
fixes for IPv4 multicast last September. A further cumulative fix for 
pfsync is present in this branch.


Basic testing with the stock IPv4 and Ethernet code has been performed. 
Further testing would be much appreciated before the code is merged to 
HEAD. The refcounting has been implemented in a way so as not to break 
the 6.x ABI so that it may be merged to STABLE.


It would be great to have feedback on how these patches may affect 
vlan(4) which is the only other consumer of the in_delmulti() KPI.


My experience working on this suggests IFF_NEEDSGIANT is a real headache 
for dealing with ifnets which may potentially go away during the 
lifetime of the system.


Regards,
BMS


MFCing rev 1.96 of netinet/in.c for Zeroconf

2007-03-17 Thread Bruce M Simpson

The change itself is very simple;
   
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/in.c.diff?r1=1.95&r2=1.96


This change is necessary before IPv4 address scope and source selection 
policy may be implemented.


Does anyone see any potential problems with this? It is possible that 
there are people out there forwarding between LANs with 169.254.0.0/16 
subnetted on different interfaces, though this is not RFC compliant 
behaviour, so I'd like to hear about that before I merge it.
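
The diff itself is not reproduced in this message. Assuming the change makes the forwarding path refuse 169.254.0.0/16, which the concern about people forwarding between such LANs suggests, the address test involved amounts to something like the sketch below. The function name is illustrative and is not the identifier used in in.c.

```c
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>

/* True if addr (network byte order) lies in the link-local block 169.254.0.0/16. */
bool
is_ipv4_linklocal(struct in_addr addr)
{
	return ((ntohl(addr.s_addr) & 0xffff0000u) == 0xa9fe0000u);
}
```

A forwarding decision built on this would drop any packet whose destination satisfies the predicate, while local delivery on the attached link is unaffected.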


Regards,
BMS


Re: kern/106722: [net] [patch] ifconfig may not connect an interface to known network

2007-03-16 Thread Bruce M. Simpson

Anton Yuzhaninov wrote:

Thursday, March 15, 2007, 7:30:54 PM, Andre Oppermann wrote:

AO> IMO when configuring an interface with an IP address and network it should
AO> kick out previous host and/or network routes matching it.  Unless those
AO> are from locally configured interfaces, then it should reject the new
AO> attempt.

A new route should replace an existing one only if it has an administrative
distance (in cisco terms) smaller than the AD of the existing route.

Preference of network from locally configured interface is only
particular case of this general principle.
  
We are obstructed by the current radix trie code only matching on 
destination and prefix. Adding 'administrative distance' to the FTE 
match is something which should seriously be considered. It is a 
stepping stone to equal cost multipath and would help in this situation.


It does however considerably change the semantics of the existing 
routing socket and its consumers would need to be updated to reflect 
that fact.


As I hinted at in my original response: it seems acceptable that 
ifconfig'ing an interface into the system should be able to clobber the 
overlapping routes in the meantime, but only until the architecture is 
fixed.


Regards,
BMS



Re: Generic ioctl and ether_ioctl don't agree

2007-03-14 Thread Bruce M. Simpson

Yar Tikhiy wrote:

Hi folks,

Quite a while ago I noticed that our ioctl handlers get the ioctl
command via u_long, but ether_ioctl()'s command argument is int.
This disarray dates back to 1998, when ioctl functions started to
take u_long as the command, but ether_ioctl() was never fixed.
Fortunately, our ioctl command coding still fits in 32 bits, or
else we would've got problems on 64-bit arch'es already.  I'd like
to fix this long-standing bug some day after RELENG_7 is branched.
Of course, this will break ABI to network modules on all 64-bit
arch'es.  BTW, the same applies to other L2 layers, such as firewire,
which seems to have been cloned from if_ethersubr.c.
  
This is one of those annoying things which breaks compatibility with 
external modules.


I'm not sure about this, though. I was getting sign extension warnings 
on amd64 last week when I was testing the IGMPv3 aware mtest(8). Perhaps 
if we're fixing these ABIs, we should commit to an explicit C99 type 
with known bit width, i.e. uint32_t.
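
The sign-extension problem is easy to reproduce in userland. A minimal sketch follows; the function names are made up, and the narrowing conversion to int is implementation-defined in C, although it wraps on the usual compilers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * A command that passes through an int parameter before being compared
 * or switched on as unsigned long, as happens between a u_long ioctl
 * handler and an int-taking helper.
 */
unsigned long
cmd_via_int(uint32_t cmd)
{
	int narrowed = (int)cmd;	/* goes negative when the top bit is set */

	return ((unsigned long)narrowed);	/* sign-extends where long is 64-bit */
}

/* The same command carried in a fixed-width unsigned type. */
unsigned long
cmd_via_uint32(uint32_t cmd)
{
	return ((unsigned long)cmd);	/* zero-extends on every platform */
}
```

For a command with the top bit set, such as 0x80206910, the first function yields 0xffffffff80206910 on an LP64 system while the second yields 0x80206910, so an equality test against the original constant only works in the second case.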


I would be much happier if we began using C99 types in the code.

Just my 2c.
BMS


Re: tap(4) should go UP if opened

2007-03-14 Thread Bruce M. Simpson

Hi,

Frank Behrens wrote:
If we have no possibility to mark the interface as UP for the non-root process the 
net.link.tap.user_open=1 is useless, because we can not transmit any packets. With the 
patch the interface goes UP only, when the administrator allowed non-root user access.


  


The conditional in the second patch is a no-op as the open will be 
forbidden if the user did not have privilege to open the tap. Bringing 
the interface up automatically potentially violates POLA, so this should 
not happen by default.


Please try the attached patch, which puts this behaviour under a sysctl.

Thanks,
BMS
 //depot/user/bms/netdev/sys/net/if_tap.c#1 - /home/bms/p4/netdev/sys/net/if_tap.c 
--- /tmp/tmp.58336.0	Wed Mar 14 13:06:09 2007
+++ /home/bms/p4/netdev/sys/net/if_tap.c	Wed Mar 14 13:05:54 2007
@@ -150,7 +150,8 @@
  */
 static struct mtx		tapmtx;
 static int			tapdebug = 0;/* debug flag   */
-static int			tapuopen = 0;/* allow user open() */	 
+static int			tapuopen = 0;/* allow user open() */
+static int			tapuponopen = 0;/* IFF_UP on open() */
 static int			tapdclone = 1;	/* enable devfs cloning */
 static SLIST_HEAD(, tap_softc)	taphead; /* first device */
 static struct clonedevs 	*tapclones;
@@ -164,6 +165,8 @@
 "Ethernet tunnel software network interface");
 SYSCTL_INT(_net_link_tap, OID_AUTO, user_open, CTLFLAG_RW, &tapuopen, 0,
 	"Allow user to open /dev/tap (based on node permissions)");
+SYSCTL_INT(_net_link_tap, OID_AUTO, up_on_open, CTLFLAG_RW, &tapuponopen, 0,
+	"Bring interface up when /dev/tap is opened");
 SYSCTL_INT(_net_link_tap, OID_AUTO, devfs_cloning, CTLFLAG_RW, &tapdclone, 0,
 	"Enably legacy devfs interface creation");
 SYSCTL_INT(_net_link_tap, OID_AUTO, debug, CTLFLAG_RW, &tapdebug, 0, "");
@@ -502,6 +505,8 @@
 	s = splimp();
 	ifp->if_drv_flags |= IFF_DRV_RUNNING;
 	ifp->if_drv_flags &= ~IFF_DRV_OACTIVE;
+	if (tapuponopen)
+		ifp->if_flags |= IFF_UP;
 	splx(s);
 
 	TAPDEBUG("%s is open. minor = %#x\n", ifp->if_xname, minor(dev));

Re: [PATCH] Removal of redundant entries from ifnet manpage

2007-03-14 Thread Bruce M. Simpson

Aniruddha Bohra wrote:

Hi,
The ifnet manpage contains entries for the following routines which do 
not exist in the ifnet struct. 

committed, thanks!


Re: kern/106722: [net] [patch] ifconfig may not connect an interface to known network

2007-03-14 Thread Bruce M. Simpson

Gleb Smirnoff wrote:

AFAIK, the problem needs a more generic approach. I see two approaches.

1) Introduce RTM_CHANGEADD, a command that will forcibly add route,
deleting all conflicting ones. Use this command in in_addprefix().

2) In rt_flags field we still have several extra bits. We can use
them to specify route source - RTS_CONNECTED, RTS_STATIC, RTS_XXX,
where XXX is a routing protocol. When issuing RTM_ADD a route with
a preferred source (e.g. CONNECTED vs STATIC) will override the old
one.

  


The proposed changes also constitute a hack.

I understand that they are being proposed to address problems we 
currently have in the stack, i.e. that we do not support multipathing, 
though it is more than likely they will be blown away in future when the 
architecture changes (and it has to change).


Approach 1 is largely irrelevant if multiple paths are introduced to the 
network stack; there is then no concept of a conflicting forwarding 
entry, only preference derived from the interface, entry flags, or the 
entry ('route') itself.


Approach 2 has some merit to it, although the forwarding plane should 
not care where the forwarding entry came from unless it needs to (e.g. 
next-hop resolution).


It seems reasonable that the forwarding plane should tag entries as 
being 'CONNECTED' i.e. derived from the address configuration of an 
interface. I believe many implementations out there do this, and 
multi-path does not change this.


We already have the RTF_PROTO1 flag to determine if the forwarding entry 
('route') came from a routing protocol in userland, so there should be 
no need to change the existing flags.


The RTF_STATIC flag only has special meaning in that it means 'the user 
added this forwarding entry manually via the route(8) command'. We 
should preserve these semantics, though I believe we should start 
implementing forwarding preference in the radix trie.


I think it seems acceptable and reasonable that we use a limited form of 
Approach 2 to clobber 'routes' being added in the case described in the 
PR, until such time as the network stack is re-engineered to support 
multiple paths and forwarding preference.
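
To make the limited form of Approach 2 concrete, here is a rough sketch of a toy forwarding table in which an entry with a preferred source clobbers a weaker one for the same prefix. It is a userland model only, not kernel code: the table layout, fib_add() and the numeric ordering of the RTS_* tags are assumptions made for illustration (only the names RTS_CONNECTED and RTS_STATIC come from the proposal quoted above).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical route-source tags, ordered so that a larger value wins. */
enum rt_source { RTS_PROTO = 0, RTS_STATIC = 1, RTS_CONNECTED = 2 };

struct fte {			/* forwarding table entry (toy model) */
	uint32_t prefix;	/* network prefix, host byte order */
	uint32_t mask;		/* netmask, host byte order */
	enum rt_source src;	/* where the entry came from */
	int in_use;
};

#define FIB_SIZE 8

/*
 * Add an entry. If an entry for the same prefix/mask already exists, it is
 * overwritten only when the new source is strictly preferred.
 * Returns 0 on success, -1 if a preferred (or equal) entry is already
 * present or the table is full.
 */
int
fib_add(struct fte *fib, uint32_t prefix, uint32_t mask, enum rt_source src)
{
	struct fte *free_slot = NULL;
	size_t i;

	for (i = 0; i < FIB_SIZE; i++) {
		struct fte *e = &fib[i];

		if (!e->in_use) {
			if (free_slot == NULL)
				free_slot = e;
			continue;
		}
		if (e->prefix == prefix && e->mask == mask) {
			if (src <= e->src)
				return (-1);	/* existing entry is preferred */
			e->src = src;		/* clobber the weaker entry */
			return (0);
		}
	}
	if (free_slot == NULL)
		return (-1);			/* table full */
	free_slot->prefix = prefix;
	free_slot->mask = mask;
	free_slot->src = src;
	free_slot->in_use = 1;
	return (0);
}
```

With this ordering, configuring an interface (RTS_CONNECTED) replaces a pre-existing static entry for the same prefix, while a later static add for that prefix is refused.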


I also believe it is useful if we start to use more modern technical 
jargon to discuss 'routes' in the network stack, because we are actually 
discussing the behaviour of entries in a forwarding table.


Regards,
BMS


Re: kern/106722: [net] [patch] ifconfig may not connect an interface to known network

2007-03-14 Thread Bruce M. Simpson

Gleb Smirnoff wrote:

I was afraid that this would raise an argument on multipath routing. Let's
not speak about multipath for the moment, but just decide: what is the correct
way to remove conflicting routes when we are assigning an IP prefix to a
local interface?
  
My suggestion is to take the second approach you outlined but modify it 
slightly.


That way, the conflict between the 'connected' FTE introduced by 
ifconfig'ing the interface and the pre-existing FTE for that network 
prefix, may be resolved in a manner which doesn't break current 
consumers of the routing code, and leaves the way open to do multipath 
later w/o problems.


Regards,
BMS



Re: kern/109815: wrong interface identifier at pfil_hooks for vlans + if_bridge

2007-03-13 Thread Bruce M. Simpson

Eygene Ryabinkin wrote:

I tried to understand this, because Bruce already gave me a patch,
but I am a bit stupid: I do not see how M_PROMISC that is cleared
unconditionally before BRIDGE_INPUT will help us to identify the
right interface. As I see now, the BRIDGE_INPUT is called once from
if_ethersubr.c, once from if_gif.c and once from ng_ether.c:
http://fxr.watson.org/fxr/ident?i=BRIDGE_INPUT
So there is no distinct code paths that can allow BRIDGE_INPUT to
modify its behaviour based on the M_PROMISC flag.

But I feel that I am wrong in some place and missing some discussion
on the M_PROMISC. Can anyone point me to the right place?
  
In short: M_PROMISC exists to easily identify frames which were received 
promiscuously, to prevent infinite recursion, and to simplify code which 
needs to re-enter ether_input().


M_PROMISC is a flag introduced by NetBSD into their ethernet input path 
to deal with the case where an entity in the network stack needs to 
receive frames promiscuously, without necessarily passing those frames 
to the upper layers e.g. IPv4. It is not documented; the code is the 
documentation in this instance.


It is cleared when an mbuf chain is passed to another entity which may 
consume the frame in that mbuf chain, in case the entity re-enters 
ether_input() with the same mbuf chain for local delivery (e.g. bridge, 
netgraph, vlan).
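
As an illustration of the life cycle described above, here is a small userland model. It is a sketch under assumptions, not the code in bms_netdev: struct frame, the function names, the flag value, and the rule used to decide that a frame was received promiscuously (unicast destination that is not our own address) are all invented for this example.

```c
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN	6
#define M_PROMISC	0x1u	/* illustrative value only, not the kernel's */

struct frame {			/* toy stand-in for an mbuf chain */
	uint8_t dhost[ETHER_ADDR_LEN];
	unsigned flags;
};

/*
 * Assumed rule: a unicast frame whose destination is not our address can
 * only have arrived because the interface is promiscuous, so tag it.
 */
void
classify_frame(struct frame *f, const uint8_t our_mac[ETHER_ADDR_LEN])
{
	int group = f->dhost[0] & 0x01;	/* multicast/broadcast bit */

	if (!group && memcmp(f->dhost, our_mac, ETHER_ADDR_LEN) != 0)
		f->flags |= M_PROMISC;
}

/*
 * Before handing the frame to a consumer that may re-enter the input path
 * (bridge, netgraph, vlan), clear the tag so it is evaluated afresh.
 */
void
handoff_to_consumer(struct frame *f)
{
	f->flags &= ~M_PROMISC;
}

/* Upper layers (e.g. IPv4) only see frames that are not tagged. */
int
deliver_to_upper_layers(const struct frame *f)
{
	return ((f->flags & M_PROMISC) == 0);
}
```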



I do not think M_PROMISC alone is sufficient to solve our architectural 
problems at Layer 2.
  

So all the tangled if()s inside LIST_FOREACH() will be gone completely
from bridge_input().



But we still need to see if we want to consume the packet by the
bridge or it members or to do forwarding. Am I missing something?
  
Correct. Just because a frame was received promiscuously, does not imply 
that the bridge will be the only consumer of that frame.
  

I'm afraid there is a serious flaw in the very notion of such a
logical interface.  If it's true, we should start by admitting
that the support for logical interfaces should be a side hack for
compatibility, and not something that can live forever on the main
code path.



I agree with you. That is why I patched if_bridge once again to enable
the pfil hooks for the physical incoming interface. And there are
two ways to solve the problem:
- to give each VLAN interface the distinct MAC, as Bruce suggested,
  
I didn't suggest this. :-) I pointed out that the code matches on 
destination MAC only at the moment.


vlan(4) is an abstraction of something which exists as part of the 
Ethernet framing, and is not a physical interface in its own right, as 
was correctly identified above.

- to refuse the logical interfaces completely and to support only
physical ones. It is what my very first (and very short) patch
did. But this can break some existing firewall rulesets. And that
should be discussed -- we do not need the total breakage due to
our changes. And you're right: the best way for this alternative is to
leave the current behaviour as the compatibility sysctl that is turned
off by default and move to the filtering on the physical interfaces
by default. No problem, but skilled network people that are using
FreeBSD as the bridge for VLANs should say if they are happy with it.
  
I think it is acceptable for if_bridge(4) to know about the existence of 
VLAN interfaces and to deal with them accordingly as a special case, 
because Spanning Tree is specified differently in the case where VLANs 
are present. Therefore it is not unreasonable for if_bridge(4) to be 
looking at VLAN headers in the mbuf chain.


As such I think the behaviour Andrew Thompson and I were discussing off 
list should be made the default: that is, the first 802.1q VLAN header 
is stripped off and turned into an M_VLANTAG before being passed to 
other consumers in the stack.


The presence of M_VLANTAG makes it very easy to see that a frame was 
received with a VLAN header without involving vlan(4) and reduces the 
amount of 802.1q specific code across Layer 2 subsystems.
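
For readers who have not looked at this path, the stripping step can be sketched roughly as follows. This is a userland model over a flat byte buffer, not mbuf code: struct pkt, vlan_strip() and the M_VLANTAG value are assumptions for illustration. Only the 802.1Q layout (a 0x8100 TPID followed by a 2-byte TCI, sitting right after the two MAC addresses) is taken as given.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN	6
#define ETHERTYPE_VLAN	0x8100
#define VLAN_HDR_LEN	4	/* TPID (2 bytes) + TCI (2 bytes) */
#define M_VLANTAG	0x1u	/* illustrative value only */

struct pkt {			/* toy stand-in for an mbuf */
	uint8_t data[64];
	size_t len;
	unsigned flags;
	uint16_t vtag;		/* TCI of the stripped header */
};

/*
 * Strip the first 802.1Q header, if present, and record its TCI
 * out-of-band. Returns 1 if a header was stripped, 0 otherwise.
 */
int
vlan_strip(struct pkt *p)
{
	size_t off = 2 * ETHER_ADDR_LEN;	/* offset of ethertype/TPID */
	uint16_t tpid;

	if (p->len > sizeof(p->data) || p->len < off + VLAN_HDR_LEN + 2)
		return (0);
	tpid = (uint16_t)((p->data[off] << 8) | p->data[off + 1]);
	if (tpid != ETHERTYPE_VLAN)
		return (0);
	p->vtag = (uint16_t)((p->data[off + 2] << 8) | p->data[off + 3]);
	p->flags |= M_VLANTAG;
	/* Close the 4-byte gap so the inner ethertype follows the addresses. */
	memmove(p->data + off, p->data + off + VLAN_HDR_LEN,
	    p->len - off - VLAN_HDR_LEN);
	p->len -= VLAN_HDR_LEN;
	return (1);
}
```

After the call, other Layer 2 consumers only need to test the flag and read the tag; they do not have to parse the encapsulation themselves.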


Regards,
BMS




Re: UltraVNC on freebsd

2007-03-12 Thread Bruce M. Simpson

Rashid N. Achilov wrote:


TightVNC or TridiaVNC. But encryption and file transmission will not be available 
with these VNCs and UltraVNC at the other end
  

JFYI:

I have heard corporate IT people who mostly work with Windows discuss 
UltraVNC. I don't see a port for it. It is on SourceForge so perhaps 
someone will step up to contribute a port.


Regards,
BMS


Re: kern/109815: wrong interface identifier at pfil_hooks for vlans + if_bridge

2007-03-12 Thread Bruce M. Simpson

Hi,

Eygene Ryabinkin wrote:


Speaking about vlan problems: the original problem is to do something
with VLAN interfaces only because they are sharing the MAC of their
physical parent. The problem itself is not VLAN-specific -- if there
will be two physical interfaces with the same MACs and they will be
bridged, the problem will still be here.
  

I see this also.

What would be good is if there was a way to record additional MAC 
addresses for each ifnet, in addition to the if_lladdr member. This 
would cut down the cruft in ether_input(), if_bridge(4) and possibly 
also carp(4).


For network cards with more than one perfect hash filter entry in the 
hardware, programming these into the card would *perhaps* be more 
efficient when trying to achieve line rate with gigabit and beyond.


This would most likely require an ABI change. The VLAN handling problem 
doesn't go away; we will still need to check if a bridge member is a 
VLAN interface because we can't uniquely key off the MAC as you point out.


Regards,
BMS


Re: kern/109815: wrong interface identifier at pfil_hooks for vlans + if_bridge

2007-03-12 Thread Bruce M. Simpson

Eygene Ryabinkin wrote:

This is a different point. The bridge wants to know about bridge
members MACs just because it should catch the packets that are
destined to the bridge members. It is the only way for an L2 thing
that is operating in promiscuous mode.
  

Correct.

For our case (when MACs are the same): I think that rik@ has explained
it rather good, so you should read his message once again. Perhaps,
we can talk about this off-list and in Russian, if you prefer.
  

The problem isn't going to go away.

It will get bigger when 802.3ad trunking is introduced. Andrew Thompson 
is currently working on this code. It may also affect the 802.11 code in 
future, which as you know is layered around Ethernet.


It would be good to have a well thought out architectural solution for 
this problem.


Regards,
BMS



Re: kern/109815: wrong interface identifier at pfil_hooks for vlans + if_bridge

2007-03-12 Thread Bruce M. Simpson

Yar Tikhiy wrote:

Guys, excuse me, but I still fail to see how the case of VLANs'
sharing a single MAC differs from the case of several physical
interfaces with the same MAC from the POV of a bridge.  A bridge
can have no own MAC addresses at all, it plays with foreign MAC
addresses only.  Therefore I can't see why our bridge code needs
to know local MAC addresses, let alone why it fails when they're
the same.  Could you give me a hint?  Thanks!
  


A few points:

1. A bridge *does* have a MAC address; it is automatically assigned one 
to participate in IEEE 802.1d Spanning Tree.


2. In the case where 802.3ad trunking is implemented, the same Ethernet 
address may be used by multiple physical interfaces.


3.  As Eygene explained well: there are a number of consumers of 
Ethernet frames in the stack. As if_bridge may potentially be passed 
mbuf chains containing packets for these consumers first, it must 
examine the destination address to determine if it should claim the 
packet or not.


Finally, because of the above points, the Ethernet destination address 
cannot be regarded as a unique key in the bridge code, or indeed the 
general Ethernet path, for where packets should be relayed in the stack 
as a whole.


Regards,
BMS


Re: tap(4) should go UP if opened

2007-03-09 Thread Bruce M. Simpson

Frank Behrens wrote:
How does tun(4) handle this? tun(4) is also set to down when closed. It is not set to up when 
it is opened, but when an address is assigned by the user process. This is fine, because it 
always needs an IP address. tap(4), as a layer 2 tunnel device, does not need an IP address, so 
setting it up on open is IMHO the best solution.


  
This isn't consistent with the other software cloneable interfaces which 
emulate certain layer 2 semantics, e.g. bridge, trunk, vlan; see below.

Does this sound reasonable, or how should I handle the tap(4) open by a user process when this 
process does not run as root?
  
I recently committed Landon Fuller's code which makes tap and tun 
cloneable interfaces which may then be created via 'ifconfig tap0 create'.


Automatically setting the interface to IFF_UP is not consistent with the 
semantics for other network interfaces; it requires specific privileges 
(usually super-user or PRIV_NET_SETIFFLAGS in -CURRENT) to do.


However, we also support the creation of tap/tun instances by 
non-super-users, so there is motivation for the change. Configuring a 
tap interface to up by a non-superuser should only be permitted if the 
interface itself was created by a non-superuser, and if 
net.link.tap.user_open is set to 1.
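
The policy in the previous paragraph boils down to a small predicate. The sketch below is only a model of that rule: struct tap_ctx and tap_may_set_up() are invented names, and the real check would have to live in the interface flag handling with proper privilege checks.

```c
#include <stdbool.h>

struct tap_ctx {			/* hypothetical summary of the request */
	bool caller_is_superuser;
	bool created_by_superuser;	/* who created this tap instance */
	int user_open;			/* models net.link.tap.user_open */
};

/*
 * A superuser may always mark the interface up. Anyone else may do so only
 * for an instance that a non-superuser created, and only while the
 * user_open knob is set to 1.
 */
bool
tap_may_set_up(const struct tap_ctx *c)
{
	if (c->caller_is_superuser)
		return (true);
	return (!c->created_by_superuser && c->user_open == 1);
}
```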


A more involved patch is needed to do this right for all cases -- we 
should not do this by default.


Regards,
BMS



Re: SO_ACCEPTCONN equivalent

2007-03-08 Thread Bruce M. Simpson

Alexandru Arion wrote:


Thanks for both suggestions. Since I'll support version 5.4 and up, this
leaves me to using the workaround implied by calling accept and checking
the returned value, for now.
  
Erm. It looks like it's implemented in 5.4 as well, although you might 
have mentioned in your original mail you were working with a legacy 
version of FreeBSD. :^)


http://fxr.watson.org/fxr/ident?v=RELENG54&i=SO_ACCEPTCONN

BMS


Re: SO_ACCEPTCONN equivalent

2007-03-08 Thread Bruce M. Simpson

Vlad GALU wrote:




Erm. It looks like it's implemented in 5.4 as well, although you might
have mentioned in your original mail you were working with a legacy
version of FreeBSD. :^)

http://fxr.watson.org/fxr/ident?v=RELENG54&i=SO_ACCEPTCONN


  Manpage diff attached.

Mailman ate your homework. :/

BMS


Re: SO_ACCEPTCONN equivalent

2007-03-08 Thread Bruce M. Simpson

Bruce M. Simpson wrote:




  Manpage diff attached.

Mailman ate your homework. :/

My bad. Committed.

BMS


Re: SO_ACCEPTCONN equivalent

2007-03-08 Thread Bruce M. Simpson

Alexandru Arion wrote:

Tried it on fresh install of 5.4: compiled the source locally, run, got
error Protocol not available. Same code works on Linux.

By replacing SO_ACCEPTCONN with SO_REUSEADDR, or any other option that
appears in the manual page for 5.4, the program works correctly.

Bruce, is there something I'm missing?
  
There was a thread about this on a mailing list in the past from Robert 
Watson who was concerned introducing the option might introduce race 
conditions; please see the archives for this.


If SO_ACCEPTCONN does not work for you, please consider submitting a 
regression test for it e.g. src/tools/regression/sockets/acceptconn so 
that someone can pick up on this.


Thanks!
BMS


Re: Inconsistencies with IP_ONESBCAST and/or IP_SENDSRCADDR

2007-03-08 Thread Bruce M. Simpson

Bruce M. Simpson wrote:

I have just committed a change in bms_netdev which enforces strict
and better defined semantics for the IP_SENDSRCADDR option in 
udp_output().




I have just committed this change in -CURRENT.

After testing it with 'ipbroadcast', it looks good apart from sockets 
which are already laddr bound. This is forbidden by in_pcbbind_setup(). 
The same caveats apply -- it might collide with an already bound inpcb.


It is OK for code to choose any source address configured on the box as 
this will be needed to override source selection come ECMP.


If someone else steps up to make it work when the socket is laddr bound, 
well and cool. I now consider it 'fit for purpose'. I'm satisfied with 
this for now.


Regards,
BMS


Re: Inconsistencies with IP_ONESBCAST and/or IP_SENDSRCADDR

2007-03-07 Thread Bruce M. Simpson

Bruce M. Simpson wrote:


Dealing with dhclient is a separate issue -- here, something like 
IP_SENDIF needs to be introduced, as we are truly in an 'ip 
unnumbered' situation -- ie the ifnet MAY not yet have been assigned 
an IPv4 address at all, and IP_SENDSRCADDR implies that you are source 
routing in the local stack by passing the address of a numbered interface 

I have just committed a change in bms_netdev which enforces strict
and better defined semantics for the IP_SENDSRCADDR option in udp_output().

This fits one of the main intended use cases of this option, e.g. a routing
daemon, bound to 0.0.0.0 and a non-ephemeral port, which needs to
explicitly override the hard-coded source selection policy in ip_output()
to send an undirected broadcast on a numbered interface.

It also fits a use case whereby a bound socket may wish to temporarily
ask for default source selection policy by specifying INADDR_ANY, although
this needs to be reviewed and tested further; I believe in_pcbbind_setup()
will detect a collision in this case.

We always obtain the inp_info write lock if IP_SENDSRCADDR was specified,
in case we need to temporarily re-bind laddr.

Pseudo-conditions as follows.

IP_SENDSRCADDR with lport NOT BOUND is NOT OK.
We should never try to persistently bind a socket which is not bound unless
we are bind(2).

IP_SENDSRCADDR with !INADDR_ANY when laddr is NOT BOUND is OK.
It means override the source selection logic and use src.sin_addr instead.

IP_SENDSRCADDR with INADDR_ANY when laddr is BOUND is OK.
It means override the bound address and use source selection logic instead.

IP_SENDSRCADDR with INADDR_ANY when laddr is NOT BOUND is NOT OK.
It means no valid source is specified.
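
The pseudo-conditions above reduce to two tests. The following is a standalone sketch of that decision only; sendsrcaddr_check() and ADDR_ANY are names made up for this example, and addresses are plain integers rather than struct in_addr. Note that a specific source address combined with a bound laddr also passes this check; whether it then succeeds depends on the later in_pcbbind_setup() call, which is not modelled here.

```c
#include <errno.h>
#include <stdint.h>

#define ADDR_ANY 0u	/* stands in for INADDR_ANY in this model */

/*
 * lport: local port the socket is bound to (0 means not bound)
 * laddr: local address the socket is bound to (ADDR_ANY means not bound)
 * src:   address supplied in the IP_SENDSRCADDR control message
 *
 * Returns 0 if the combination is acceptable, EINVAL otherwise.
 */
int
sendsrcaddr_check(uint16_t lport, uint32_t laddr, uint32_t src)
{
	if (lport == 0)
		return (EINVAL);	/* never bind an unbound socket here */
	if (laddr == ADDR_ANY && src == ADDR_ANY)
		return (EINVAL);	/* no valid source specified */
	return (0);
}
```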


Regards,
BMS
--- //depot/vendor/freebsd/src/sys/netinet/udp_usrreq.c	2007/02/20 10:22:30
+++ //depot/user/bms/netdev/sys/netinet/udp_usrreq.c	2007/03/07 12:28:16
@@ -747,7 +747,8 @@
 		return (EMSGSIZE);
 	}
 
-	src.sin_addr.s_addr = INADDR_ANY;
+	bzero(&src, sizeof(src));
+
 	if (control != NULL) {
 		/*
 		 * XXX: Currently, we assume all the optional information is
@@ -777,12 +778,10 @@
 					error = EINVAL;
 					break;
 				}
-				bzero(&src, sizeof(src));
 				src.sin_family = AF_INET;
 				src.sin_len = sizeof(src);
-				src.sin_port = inp->inp_lport;
 				src.sin_addr = *(struct in_addr *)CMSG_DATA(cm);
 				break;
 			default:
 				error = ENOPROTOOPT;
 				break;
@@ -797,7 +796,7 @@
 		return (error);
 	}
 
-	if (src.sin_addr.s_addr != INADDR_ANY || addr != NULL) {
+	if (src.sin_family == AF_INET || addr != NULL) {
 		INP_INFO_WLOCK(&udbinfo);
 		unlock_udbinfo = 1;
 	} else
@@ -810,11 +809,20 @@
 
 	laddr = inp->inp_laddr;
 	lport = inp->inp_lport;
-	if (src.sin_addr.s_addr != INADDR_ANY) {
-		if (lport == 0) {
+
+	/*
+	 * If the IP_SENDSRCADDR control message was specified, override the
+	 * source address for this datagram. Its use is invalidated if the
+	 * address thus specified is incomplete or clobbers other inpcbs.
+	 */
+	if (src.sin_family == AF_INET) {
+		if ((lport == 0) ||
+		    (laddr.s_addr == INADDR_ANY &&
+		     src.sin_addr.s_addr == INADDR_ANY)) {
 			error = EINVAL;
 			goto release;
 		}
+		src.sin_port = lport;
 		error = in_pcbbind_setup(inp, (struct sockaddr *)&src,
 		    &laddr.s_addr, &lport, td->td_ucred);
 		if (error)

Re: SO_ACCEPTCONN equivalent

2007-03-07 Thread Bruce M. Simpson

Alexandru Arion wrote:

Is there an equivalent in FreeBSD to the SO_ACCEPTCONN option for
getsockopt(), available in Linux? It doesn't actually have to be an
option for getsockopt(), just a way to determine if a socket has been
marked to accept connections with listen().
  

SO_ACCEPTCONN appears to be in FreeBSD 6.2 and CURRENT already.
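
For reference, querying the option looks roughly like this. It is a minimal sketch: socket_is_listening() is just a helper name chosen here, and on a system that does not implement the option getsockopt() can be expected to fail (typically with ENOPROTOOPT), which the helper reports as -1.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Returns 1 if listen() has been called on fd, 0 if not, and -1 if the
 * query itself failed (errno is left as set by getsockopt()).
 */
int
socket_is_listening(int fd)
{
	int val = 0;
	socklen_t len = sizeof(val);

	if (getsockopt(fd, SOL_SOCKET, SO_ACCEPTCONN, &val, &len) == -1)
		return (-1);
	return (val != 0);
}
```

A caller that must also run where the option is unavailable can treat -1 as "unknown" and fall back to another method.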

Regards,
BMS


Re: [PATCH] Ethernet cleanup; 802.1p input and M_PROMISC

2007-03-06 Thread Bruce M Simpson

Yar Tikhiy wrote:

My proposed check for IFF_DRV_RUNNING is by no means a priority
task.  I can add it by myself after you finish your great current
project regarding ether_input() and friends.
  

Just committed in p4:

 //depot/user/bms/netdev/sys/net/if_ethersubr.c#6 - 
/home/bms/p4/netdev/sys/net/if_ethersubr.c 

--- /tmp/tmp.11470.0	Tue Mar  6 15:45:08 2007
+++ /home/bms/p4/netdev/sys/net/if_ethersubr.c	Tue Mar  6 15:45:01 2007
@@ -511,6 +511,13 @@
 		m_freem(m);
 		return;
 	}
+#ifdef DIAGNOSTIC
+	if ((ifp->if_flags & IFF_DRV_RUNNING) == 0) {
+		if_printf(ifp, "discard frame at !IFF_DRV_RUNNING\n");
+		m_freem(m);
+		return;
+	}
+#endif

Thanks!
BMS


Re: kern/109815: wrong interface identifier at pfil_hooks for vlans + if_bridge

2007-03-06 Thread Bruce M. Simpson

Eygene Ryabinkin wrote:

I am awfully sorry, but you seem to be mistaken:

Thanks for clarifying this. That'll be because I didn't read if_bridge 
that far. ;^) In my original message I was just looking at if_ethersubr.c.


I need to make sure any changes which are made to if_bridge to deal with 
vlan problems are incorporated into bms_netdev so that after I commit 
M_PROMISC, it does the right thing.

 if_bridge calls
the ipfw directly only for the L2 filtering (when the net.link.bridge.ipfw
is set to 1).  This is processed by the block in if_bridge just
above to the 'ipfwpass' label.

  


In bms_netdev, the behaviour of ether_demux() is unchanged.

ip_dn_claim_rule() is called to determine if there is an IPFW (usually 
dummynet) rule for the input frame at ethernet level, if-and-only-if 
net.link.ether.ipfw is non-zero. I just committed some comments to 
clarify this and styled it the same as the check in ether_output_frame().


However -- the IPFW check in ether_demux() is *skipped* in bms_netdev if 
M_PROMISC is set. This is because we might drop packets which are 
destined for vlan_input() which flow in because the interface is 
IFF_PROMISC.


Strictly speaking this bends the rules of dummynet, because if you have 
frames coming in due to promiscuous mode, which the rest of the stack 
doesn't expect, they won't be filtered by Dummynet pipes.



But the L3 filtering is done fully by the pfil hooks, as I understand
the code. Moreover, I am using 'pf' in my case, not the ipfw.
  

Yes, this is always the case for the upper layers.

Regards,
BMS


Re: [PATCH] Ethernet cleanup; 802.1p input and M_PROMISC

2007-03-06 Thread Bruce M Simpson

Julian Elischer wrote:


When we added netgraph we split both the input and output parts
so that they would provide 'natural' entrypoints for a bridge.
Consider where a bridge wants to put packets.

In bms_netdev, bridge_input() is entered directly from ether_input(). It 
may potentially re-enter, so M_PROMISC is cleared on frames thus handed 
off to if_bridge(4). Same for ng_ether(4).


Since the split however other code has made use of those entrypoints at different
times. I'm not sure at the moment whether other code does so now.

According to KScope on -CURRENT, the only other places which call the 
split ether_demux() are dummynet_send() and ng_ether_rcv_upper().


Regards,
BMS


Re: kern/109815: wrong interface identifier at pfil_hooks for vlans + if_bridge

2007-03-05 Thread Bruce M. Simpson

Eygene Ryabinkin wrote:

Will try to understand if it will cure my problem, thanks!
Attaching my patch, just in case if freebsd gnats will be down ;))
  


Thanks for this. It looks like Andrew may be in a better position to say 
if this fix should go in or not.


It is possible that, if the bridge changes the ifp and the frame should 
be forwarded locally (i.e. to the upper protocol layers), the ifp should 
also be updated in ether_input() (as NetBSD does) to make sure that the 
later checks are made against the updated ifp.


I have just changed this behaviour in p4 bms_netdev. Please try to test 
with this code. If you can't access p4, then I can extract an updated 
patch though this will take longer.


This should help to eliminate the need for DEV_CARP compile-time 
conditionals in if_bridge(4).


Regards,
BMS



Re: [PATCH] Ethernet cleanup; 802.1p input and M_PROMISC

2007-03-05 Thread Bruce M Simpson

Hi,

Thanks for your reply.

Yar Tikhiy wrote:

My concern is that, with possible callers of ether_input() being
not really *from* but *on behalf* of the interface, e.g., in Netgraph,
IFF_DRV_RUNNING can be a way for the interface driver to tell us:
I'm not ready yet, so don't believe anyone who pretends he has a
packet from me.

E.g., a vlan(4) interface gets IFF_DRV_RUNNING set only if it is
properly attached to an Ethernet interface (known as the vlan's
parent).  AFAIK this is a totally legitimate use of IFF_DRV_RUNNING.
Now assume that a vlan interface is UP but not RUNNING because it's
detached from the parent.  If a buggy Netgraph node or another
source of synthetic traffic decides to inject a packet as though
it comes in from the said vlan interface, handling the packet as
usual will be bogus.

IMHO the IFF_UP check in ether_input() is mostly for a similar
purpose: If all callers of ether_input() were in real and conformant
interface drivers, we shouldn't bother re-checking IFF_UP in
ether_input() either because the driver of a down interface wouldn't
call ether_input() for it in the first place.
  
I agree with the point you make here about non-conforming drivers; 
however there are cogent performance arguments for checking IFF_UP 
immediately. If an interface is configured administratively down, it 
shouldn't be pumping traffic into the network stack.  I do however 
realize there are situations where this can happen.


Suppose, for example, the thread which calls ether_input() is scheduled 
on another CPU. Dropping such frames immediately on entry into 
ether_input() saves tying up a thread for any longer than is absolutely 
necessary.


Perhaps Kip, who is working on 10GbE performance just now, can advise 
further.


Of course, we can omit the check for IFF_DRV_RUNNING if we think
that synthetic traffic from an unready interface is OK.  But I'm
afraid we shouldn't.

In addition, I wonder if we can move the conformance checks to a
wrapper function so that conformant drivers don't have to pay the
performance penalty of the just in case checks per each inbound
Ethernet packet.
  
Thanks for explaining this further. Perhaps I should put the check for 
IFF_DRV_RUNNING under INVARIANTS or make it a KASSERT?
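
To show what I mean by a compile-time conditional check, here is a toy model. It is a sketch only: the flag values, the single flags word, the ETHER_DIAGNOSTIC knob and ether_input_precheck() are all invented for this example and do not correspond to the kernel's definitions.

```c
#define TOY_IFF_UP		0x01u	/* illustrative values only */
#define TOY_IFF_DRV_RUNNING	0x40u

#ifndef ETHER_DIAGNOSTIC
#define ETHER_DIAGNOSTIC 1	/* hypothetical knob, like a debug kernel option */
#endif

enum verdict { FRAME_ACCEPT, FRAME_DROP_DOWN, FRAME_DROP_NOT_RUNNING };

/*
 * The administrative "up" check is always made, as early as possible.
 * The "driver running" check guards against buggy injectors and is only
 * compiled in when the diagnostic knob is enabled.
 */
enum verdict
ether_input_precheck(unsigned flags)
{
	if ((flags & TOY_IFF_UP) == 0)
		return (FRAME_DROP_DOWN);
#if ETHER_DIAGNOSTIC
	if ((flags & TOY_IFF_DRV_RUNNING) == 0)
		return (FRAME_DROP_NOT_RUNNING);
#endif
	return (FRAME_ACCEPT);
}
```

Built with the knob set to 0, conformant drivers pay only for the first test on each inbound frame.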


The code in bms_netdev as it stands bends the rules a little. The IFF_UP 
check was in ether_demux() before. The original reason for the 
ether_input()/ether_demux() split was to accommodate Netgraph. I must 
admit that I hadn't fully mapped out the possible re-entry scenarios 
with Netgraph because they may be arbitrarily complicated by its very 
nature.


Whilst Netgraph is a cool feature, and one I am very grateful that 
FreeBSD has, I wonder if it is OK that we should have checks which  
potentially pessimize performance for the main use cases to protect the 
stack against Netgraph frames which are bogons, or bugs in Netgraph nodes.


I'm open to hearing more about this, but my own resources (time, money) 
are a limiting factor as to what I can do.


Regards,
BMS


Re: kern/109815: wrong interface identifier at pfil_hooks for vlans + if_bridge

2007-03-05 Thread Bruce M. Simpson

Hi,

Eygene Ryabinkin wrote:

Sure, I can test it, but then I need to know what problems are cured
by your patch, or I just should watch if it will not break something.

  
My concern is that I want to make sure that all these changes to the 
ether_input path work OK together.


The M_PROMISC flag is set further down when it's determined that a frame 
flowing into ether_input() was received promiscuously, and therefore 
Layer 3 protocols (e.g. IP) may not want to see it.


In NetBSD, after if_bridge is given a chance to claim an input frame, the ifp 
may be changed if the bridge needs to forward locally.



In my case if_bridge drops off the packet because firewall fails to
recognize the packet as good: the interface that is passed to a
pfil_hooks is bad (I mean not the one expected).
  
The ifp which your patch changes is that of the mbuf chain when 
bridge_input determines it is not for the bridge, but should be 
forwarded locally. The patch forces a locally forwarded frame to have 
the same ifp as it had when it came into bridge_input. I can foresee 
problems if the same Ethernet destination address exists on multiple 
bridge member interfaces.


The latest version of p4 bms_netdev now updates the cached ifp in 
ether_input() if bridge_input() changed it in this way.
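
In outline, the change amounts to re-reading the receive interface after the bridge has had its turn. The model below is a sketch with invented toy_* types and functions; it is not the bms_netdev code, and the stand-in bridge simply re-homes every frame onto a given member interface.

```c
#include <stddef.h>

struct toy_ifnet { const char *name; };
struct toy_mbuf { struct toy_ifnet *rcvif; };

/*
 * Stand-in for bridge_input(): does not consume the frame, but re-homes it
 * onto 'member' to signal that it should be delivered locally there.
 */
struct toy_mbuf *
toy_bridge_input(struct toy_mbuf *m, struct toy_ifnet *member)
{
	m->rcvif = member;
	return (m);
}

/* Returns the interface that the later input checks should be run against. */
struct toy_ifnet *
toy_ether_input(struct toy_ifnet *ifp, struct toy_mbuf *m,
    struct toy_ifnet *member)
{
	m->rcvif = ifp;
	m = toy_bridge_input(m, member);
	if (m == NULL)
		return (NULL);	/* frame was consumed by the bridge */
	ifp = m->rcvif;		/* refresh the cached pointer */
	return (ifp);
}
```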


NetBSD consistently uses pfil_hooks for the if_bridge *and* ether_input 
paths, whereas FreeBSD currently calls ipfw directly for ether_input, which 
may make a difference to the behaviour which you are seeing with VLANs.


Not understanding if_bridge fully, or the coupling of ipfw with 
if_ethersubr.c, I would hope that Andrew and others have more to say on 
this.



Will try to see if your patch makes any difference for the 7-CURRENT,
but I have no system at hand to test it, sorry.
  
The patch is extracted from p4, so it should apply against 
CURRENT. I haven't updated the patch yet; the latest code is in p4.


We won't be able to eliminate the DEV_CARP checks in this spin. I did 
exchange an idea with Andrew late last night whereby a list of addresses 
other than ether_dhost is maintained for each ifnet. Input paths then 
check this in addition to or instead of ether_dhost.
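
A rough sketch of the idea follows. Nothing like this exists in the tree yet, so struct toy_ifnet, the fixed-size array and both function names are purely illustrative; a real version would need locking and most likely an ABI change to struct ifnet.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN	6
#define MAX_EXTRA_ADDRS	4	/* arbitrary limit for this sketch */

struct toy_ifnet {
	uint8_t lladdr[ETHER_ADDR_LEN];			/* primary address */
	uint8_t extra[MAX_EXTRA_ADDRS][ETHER_ADDR_LEN];	/* e.g. a carp(4) MAC */
	size_t nextra;
};

/* Record an additional address; returns 0 on success, -1 if the list is full. */
int
ifnet_add_addr(struct toy_ifnet *ifp, const uint8_t *mac)
{
	if (ifp->nextra >= MAX_EXTRA_ADDRS)
		return (-1);
	memcpy(ifp->extra[ifp->nextra++], mac, ETHER_ADDR_LEN);
	return (0);
}

/* Input-path check: is dhost the primary or any additional address? */
int
ifnet_owns_addr(const struct toy_ifnet *ifp, const uint8_t *dhost)
{
	size_t i;

	if (memcmp(ifp->lladdr, dhost, ETHER_ADDR_LEN) == 0)
		return (1);
	for (i = 0; i < ifp->nextra; i++)
		if (memcmp(ifp->extra[i], dhost, ETHER_ADDR_LEN) == 0)
			return (1);
	return (0);
}
```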


I've added this to the Wiki.

I've been working particularly hard lately so I'm not 100% clear.

Thanks,
BMS


Re: bin/94920: [rpc] rpc.statd(8) conflict with cups over tcp and udp ports 631

2007-03-04 Thread Bruce M Simpson
Synopsis: [rpc] rpc.statd(8) conflict with cups over tcp and udp ports 631

Responsible-Changed-From-To: bms->freebsd-net
Responsible-Changed-By: bms
Responsible-Changed-When: Sun Mar 4 15:03:40 UTC 2007
Responsible-Changed-Why: 
Someone else with Copious Free Time can do this -- not a priority for me.

http://www.freebsd.org/cgi/query-pr.cgi?pr=94920


Re: bin/100969: [rpc.lockd] rpc.lockd conflict with cups over udp ports 631

2007-03-04 Thread Bruce M Simpson
Synopsis: [rpc.lockd] rpc.lockd conflict with cups over udp ports 631

Responsible-Changed-From-To: bms->freebsd-net
Responsible-Changed-By: bms
Responsible-Changed-When: Sun Mar 4 15:04:14 UTC 2007
Responsible-Changed-Why: 
Someone else with Copious Free Time can do this -- not a priority for me.

http://www.freebsd.org/cgi/query-pr.cgi?pr=100969


[PATCH] IP_SENDIF option; rework ip_output() source selection logic

2007-03-04 Thread Bruce M Simpson

Hello,

Thanks to andre making a start on this, I have managed to get the 
IP_SENDIF option implemented today in p4 bms_netdev. Here's a patch 
against -CURRENT:

   http://people.freebsd.org/~bms/dump/sendif-20070304.diff

For those who are new to this work:
  IP_SENDIF is broadly an analogue of the Linux socket option 
SO_BINDTODEVICE. It is used to bypass the traditional BSD source 
interface selection logic. It is a sledgehammer hack used to output 
datagrams on a specific interface which may not yet have an address, 
e.g. for DHCP. Judicious use of this option, together with IP_ONESBCAST, 
will make it possible for dhclient to run without BPF support in the 
base system.
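
For readers who have not used IP_ONESBCAST, here is a minimal sketch of the application side, assuming the behaviour documented in ip(4): the option is set on a datagram socket, the datagram is addressed to the subnet broadcast address of the outgoing interface, and the stack rewrites the destination to 255.255.255.255 on output. The helper names sk_subnet_broadcast() and sk_enable_onesbcast() are made up for this example, and IP_SENDIF is not shown because it only exists in the patch above.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Directed broadcast address of a subnet; arguments in host byte order. */
static uint32_t
sk_subnet_broadcast(uint32_t addr, uint32_t mask)
{
	return ((addr & mask) | ~mask);
}

/*
 * Ask the stack to rewrite the destination to 255.255.255.255 on output.
 * SO_BROADCAST is enabled first, as for any broadcast send.
 * Returns 0 on success, -1 on failure or on systems without IP_ONESBCAST.
 */
static int
sk_enable_onesbcast(int s)
{
#ifdef IP_ONESBCAST
	int on = 1;

	if (setsockopt(s, SOL_SOCKET, SO_BROADCAST, &on, sizeof(on)) == -1)
		return (-1);
	return (setsockopt(s, IPPROTO_IP, IP_ONESBCAST, &on, sizeof(on)));
#else
	(void)s;	/* option not available on this platform */
	return (-1);
#endif
}
```

The caller would then sendto() the address returned by sk_subnet_broadcast() for the interface it wants the datagram to leave on.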


There are a few remaining issues around this code which need to be dealt 
with. These are:


* Fix IP_SENDIF and IP_SENDSRCADDR for unbound sockets.
This goes without saying. For these options to be useful the socket 
should not have to be bound anywhere. The fact that IP_SENDSRCADDR is 
currently broken contradicts both our documentation and UNIX Network 
Programming Vol 1 3rd Edition.


* Allow IP_SENDIF to be used from the raw IP output path.
Some people might want to do this.

* Add a specific privilege level for IP_SENDIF.
Currently it requires the 'open raw socket' privilege, as it is Not 
Normal Behaviour.


* Disable hardware checksums on output, if we have to do that.
My testing with msk(4) suggests this might not be needed.

This code will most likely have to be updated when/if we adopt NetBSD's 
source selection policy concept (e.g. for fully supporting link-local 
IPv4), and/or when/if we adopt equal-cost multipath.


The hack IP_ONESBCAST itself may eventually be eliminated by doing 
things slightly differently in the forwarding trie i.e. using interface 
preference and/or IP_SENDIF and populating the trie with 255.255.255.255 
routes.


Regards,
BMS


Re: CARP behaviour

2007-03-04 Thread Bruce M Simpson

Yar Tikhiy wrote:

We shouldn't cache route pointers anywhere anymore.  It has been completely
removed from the PCBs and things like gif and others.


Sounds like a good way to go, too! :-)  Thanks!
  
gre(4) does very funky things with the route it caches to the tunnel 
endpoint. Someone(tm) should have a look at that.


BMS


Re: kern/109815: wrong interface identifier at pfil_hooks for vlans + if_bridge

2007-03-04 Thread Bruce M. Simpson

Hi,

I haven't seen your patch, can you point me at it off-list? Thanks.

Eygene Ryabinkin wrote:


I traced the current if_bridge.c behaviour to the NetBSD's if_bridge.c
1.9. This was the first version in that the firewall hooks were
introduced. And the assumption that the MAC identifies the physical
interfaces was used in this first version.

And a question: can anyone say if my patch will break some known
good behaviour and if the current behaviour of if_bridge is based
on some logic I am currently failing to understand.
  
I would greatly appreciate it if you could look at the combined 
M_PROMISC and 802.1p patch, which rewrites ether_input() significantly. 
It sounds like the issues you are having with vlans and bridges may 
potentially be fixed by this patch, or that the fix may be incorporated 
more easily with this patch.


In NetBSD, after if_bridge is given a chance to claim an input frame, 
the ifp may be changed if the bridge needs to forward locally. M_PROMISC 
is used to indicate that a frame was received promiscuously, in case 
ether_input() re-enters itself with the same mbuf chain. Certain 
consumers of ether_input() need to punch holes in the logic used to 
detect if a frame was for us or not because they do funky things with 
Ethernet destination addresses, e.g. carp.
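
A rough sketch of the 'was this frame for us' decision described above, with the hole punched for consumers such as carp. This is not the code from the patch: the struct, the flag value, and the function names are invented for illustration, and the list of extra addresses is only an assumption about how such consumers might be accommodated.

```c
#include <stdint.h>
#include <string.h>

#define SK_ETHER_ADDR_LEN	6
#define SK_M_PROMISC		0x1	/* hypothetical stand-in for the mbuf flag */

struct sk_frame {
	uint8_t	dhost[SK_ETHER_ADDR_LEN];	/* Ethernet destination address */
	int	flags;
};

/* The group bit is set for multicast and broadcast destinations. */
static int
sk_is_multicast(const uint8_t *addr)
{
	return (addr[0] & 0x01);
}

/*
 * Mark a frame as promiscuously received when it is unicast and matches
 * neither the interface address nor any extra address (e.g. a carp MAC).
 */
static void
sk_mark_promisc(struct sk_frame *f, const uint8_t *lladdr,
    const uint8_t (*extra)[SK_ETHER_ADDR_LEN], size_t nextra)
{
	size_t i;

	if (sk_is_multicast(f->dhost))
		return;
	if (memcmp(f->dhost, lladdr, SK_ETHER_ADDR_LEN) == 0)
		return;
	for (i = 0; i < nextra; i++)
		if (memcmp(f->dhost, extra[i], SK_ETHER_ADDR_LEN) == 0)
			return;
	f->flags |= SK_M_PROMISC;
}
```

Because the mark travels with the frame, a second pass through the input routine with the same frame can test the flag instead of repeating the address comparison.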


Regards,
BMS


Re: [PATCH] Ethernet cleanup; 802.1p input and M_PROMISC

2007-03-04 Thread Bruce M Simpson

Yar Tikhiy wrote:


Now I see your point, thanks!  Well, at least in theory, the driver
shouldn't call ether_input() if the interface isn't running.  OTOH,
the interface shouldn't be getting traffic if it's !UP.  However,
I suspect that not all drivers handle IFF_UP fully or even can do
it at all due to hardware limitations.  As I understand it, in an
ideal world a !UP interface should be deaf and dumb and not interfering
in any way with the network still connected to it physically.
Therefore discarding inbound traffic from a !UP interface may be a
necessary workaround, but it may not be enough.  All that boils
down to this: The IFF_UP check in ether_input() is more to a sanity
check than to the way for IFF_UP to work.  Therefore we can add the
IFF_DRV_RUNNING sanity check there, too, for completeness.
  

Thanks for your explanation.

I'm still not sure I understand why IFF_DRV_RUNNING should be checked 
for in ether_input().


There is a pretty clear reason for checking for IFF_UP in ether_input(); 
an interface which is configured administratively down should not be 
bringing traffic into the stack, regardless of whether it is a hardware 
device or a pseudo-device. IFF_UP has been in since 4.2BSD; it is more 
or less integral to how the BSD network stack operates. There are 
situations in which a pseudo-device or hardware device could incorrectly 
call ether_input() with such traffic.


Reading net/if.h, IFF_DRV_RUNNING is documented as meaning 'resources 
are allocated for this device'. Surely such a check is redundant and not 
relevant to the operation of ether_input()? As far as I can tell it is 
similar to the old meaning of IFF_RUNNING, and there are legitimate 
situations in which the hardware or its queues may have stopped 
processing temporarily whilst the interface may be administratively up 
(and thus accepting traffic).
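
To make the distinction concrete, here is a minimal sketch of the gate being argued for: only the administrative flag decides whether input is dropped. The struct, the flag values, and the function name are stand-ins invented for this example, not the definitions in net/if.h or the code in ether_input().

```c
#define SK_IFF_UP		0x1	/* administratively up (set by the admin) */
#define SK_IFF_DRV_RUNNING	0x40	/* driver resources allocated (set by the driver) */

struct sk_ifnet {
	int	if_flags;	/* administrative state */
	int	if_drv_flags;	/* driver-private state */
};

/*
 * Nonzero if an input frame should be dropped.  The driver flag is
 * deliberately not consulted: a driver that has stopped temporarily may
 * still be draining frames that were received legitimately.
 */
static int
sk_input_should_drop(const struct sk_ifnet *ifp)
{
	return ((ifp->if_flags & SK_IFF_UP) == 0);
}
```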


Please correct me if I'm wrong or point out situations where it's 
important IFF_DRV_RUNNING state is checked outside of a driver. Sorry if 
I seem obtuse, but I'm sure I'm missing some detail here.


Regards,
BMS


Re: [PATCH] Ethernet cleanup; 802.1p input and M_PROMISC

2007-03-03 Thread Bruce M Simpson

Yar Tikhiy wrote:


In fact, there are two independent flags indicating interface's readiness:
IFF_UP and IFF_DRV_RUNNING.  The former is controlled by the admin
and the latter, by the driver.  E.g., an interface can be UP but
not really ready to operate due to h/w reasons, or vice versa.
Perhaps we should check both flags to see if the interface is, so
to say, up and running.  if_vlan.c has an obvious macro for that,
and it can go to if_var.h to avoid code duplication if we decide it's
the right way to take.
  

Thanks for looking at this.

The purpose of the IFF_UP check is to immediately drop frames destined 
for an interface which is administratively configured down.


Surely if ether_input() is called from the driver, there should be no 
need to check IFF_DRV_RUNNING? Indeed if the hardware flips to a state 
where it is not running but its internal queues or descriptor rings are 
draining, this might cause frames to be lost?


Regards,
BMS


[PATCH] Ethernet cleanup; 802.1p input and M_PROMISC

2007-03-02 Thread Bruce M Simpson

Hello all,

I would like to announce an updated version of the 802.1p input patch, 
available at:

   http://people.freebsd.org/~bms/dump/latest-8021p.diff

I have cut down the original scope of the patch. I previously ran into 
problems when I tried to move VLAN tag input and output processing into 
if_ethersubr.c.


FreeBSD should now accept VLAN 0 traffic on input with this patch. In 
addition to this, the M_PROMISC flag is now used, which considerably 
simplifies the Ethernet input path in general.
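
For anyone unfamiliar with what 'VLAN 0 traffic' means on the wire, here is a small sketch of how the 16-bit 802.1Q tag control field splits up. A VLAN ID of 0 marks a priority-tagged frame: it carries an 802.1p priority but belongs to no particular VLAN. The struct and function names are made up for this example and do not come from the patch.

```c
#include <stdint.h>

struct sk_vlan_tag {
	unsigned int	pcp;	/* 802.1p priority code point, 3 bits */
	unsigned int	cfi;	/* canonical format indicator, 1 bit */
	unsigned int	vid;	/* VLAN identifier, 12 bits */
};

/* Split a tag control field (host byte order) into its three parts. */
static struct sk_vlan_tag
sk_parse_tci(uint16_t tci)
{
	struct sk_vlan_tag t;

	t.pcp = (tci >> 13) & 0x7;
	t.cfi = (tci >> 12) & 0x1;
	t.vid = tci & 0x0fff;
	return (t);
}

/* A VLAN ID of 0 means the tag carries priority information only. */
static int
sk_is_priority_tagged(uint16_t tci)
{
	return ((tci & 0x0fff) == 0);
}
```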


I have performed some light testing on a 1Gbps COTS switch with 802.1q 
encapsulation and without, with carp and vlan, with and without hardware 
VLAN tagging, and all looks OK. I would greatly appreciate further 
testing, particularly with if_bridge and ng_ether which I have not tried.


If all goes to plan, I would hope to commit this code to -CURRENT within 
the next 10 days.


Regards,
BMS


CARP behaviour

2007-03-02 Thread Bruce M Simpson

During testing of M_PROMISC I noticed a couple of issues with our CARP.

1. carp doesn't seem to maintain input/output statistics on its ifnet.

2. carp doesn't seem to detect when the underlying route to the subnet
  its address is exposed on has changed to another interface.

Are these conditions normal / expected?

Regards,
BMS


Re: Proposal: Add M_HASCL().

2007-03-01 Thread Bruce M. Simpson

Bruce M Simpson wrote:
Much network code needs to know if the mbuf it is looking at is using 
a cluster. I propose putting M_HASCL() in sys/mbuf.h. I realise this 
is a style change, however, it seems to be a very common idiom.
I sent this, then I looked at NetBSD, having caught a glimpse of their 
MBUFTRACE code when skimming lots of diffs. That is also a good idea, 
and might help us catch problems before they go prime-time; I've added 
it to the wiki.


Point there is, M_HASCL() seems to be a hangover from the 4.4BSD era. 
NetBSD seems to treat clusters and external storage as separate 
entities. So I'm reconsidering this in the light of this new evidence.


As far as I understand it, the presence of M_EXT in an mbuf chain's 
header in FreeBSD always indicates that we are using external storage 
(not necessarily, but possibly, a cluster).
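
For reference, the idiom under discussion boils down to a one-line flag test. The sketch below uses a cut-down stand-in for struct mbuf and a made-up flag value so that it compiles on its own; the SK_ names are not the real definitions from sys/mbuf.h.

```c
#define SK_M_EXT	0x0001	/* stand-in for M_EXT: external storage attached */

/* Cut-down stand-in for struct mbuf; only the flags word matters here. */
struct sk_mbuf {
	int	m_flags;
};

/*
 * True if the mbuf uses external storage.  As noted above, that storage
 * may be a cluster, but the flag alone does not say so.
 */
#define SK_M_HASCL(m)	((m)->m_flags & SK_M_EXT)
```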


Can someone confirm this?

Regards,
BMS


Re: Inconsistencies with IP_ONESBCAST and/or IP_SENDSRCADDR

2007-03-01 Thread Bruce M. Simpson

Bruce M Simpson wrote:

Hello,

In preparation for tightening up our handling of INADDR_BROADCAST 
sends, I ran some brief tests today on the network stack with the 
attached test code.


I found some inconsistencies when run against 6.2-RELEASE;

1. IP_ONESBCAST breaks if SO_DONTROUTE is specified.

One thing appears to be consistent about the failure mode: bad UDP 
checksums.
dc(4) is being used on the destination end of the test network, so 
checksum offloading should not be an issue.
I am also seeing the wrong destination address being used in most 
cases. This is intermittent regardless of whether the socket is bound 
or unbound.
This is consistent with ip_output() treating its internal flag 
IP_SENDONES as separate from IP_ROUTETOIF. I was skimming an old patch 
of mine which attempts to implement part of SO_BINDTODEVICE which 
contains a fix related to this condition.


The fix isn't the right fix so I will revisit this now and hopefully 
commit a fix shortly.


2. IP_SENDSRCADDR has some other inconsistencies.
a. The option is always rejected if the socket is not bound.
I find this behaviour suspect; the whole point of the option is to 
specify, for SOCK_DGRAM and SOCK_RAW, the source address of a packet.

b. 0.0.0.0 is always accepted.
A regular interface lookup is used based on destination if this is 
specified. This appears suspect to me because such an option is 
redundant.
This is of course a separate issue. Because it's more involved (it 
concerns the general concept of 'ip unnumbered' in the stack) it needs 
further consideration before any fix is attempted.


udp_output() will only call in_pcbbind_setup() if a non-INADDR_ANY 
source address was specified; this is usually obtained from the socket 
being bound previously. This explains why the IP_SENDSRCADDR option is 
rejected in udp_output() for an unbound socket. It *will* be accepted if 
the option contains INADDR_ANY. In this case, normal source address 
selection takes place.


This is a good use case demonstrating the need for source address 
selection logic such as is now found in NetBSD.


There is no sanity checking on the IP_SENDSRCADDR option data containing 
INADDR_ANY; such an option is redundant and is nonsensical for an 
unbound socket. We should reject the option if it contains INADDR_ANY if 
and only if the socket is not bound. Implementing such a check is fairly 
easy and makes sense for this use case. Returning EINVAL in this case 
seems acceptable according to ip(4).
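
A minimal sketch of the sanity check proposed above, written as a standalone function rather than as a change to udp_output(). The function name and the way the bound address is passed in are assumptions made for this example. It covers only the INADDR_ANY case; the rejection of a real address on an unbound socket is the separate issue discussed earlier.

```c
#include <errno.h>
#include <stdint.h>

#define SK_INADDR_ANY	0U	/* 0.0.0.0 */

/*
 * laddr: local address the socket is bound to, or SK_INADDR_ANY if unbound.
 * src:   address carried in the IP_SENDSRCADDR option.
 * Returns 0 if the option is acceptable, EINVAL if it is nonsensical.
 */
static int
sk_check_sendsrcaddr(uint32_t laddr, uint32_t src)
{
	/* 0.0.0.0 on an unbound socket asks for nothing at all. */
	if (src == SK_INADDR_ANY && laddr == SK_INADDR_ANY)
		return (EINVAL);
	return (0);
}
```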


The option *should* be accepted if the application has bound the socket 
to a device somehow (oh dear, SO_BINDTODEVICE rears its head again) as 
DHCP for example needs to override any IPv4 address which may be 
assigned on an ifnet with 0.0.0.0.


Regards,
BMS



Re: Inconsistencies with IP_ONESBCAST and/or IP_SENDSRCADDR

2007-03-01 Thread Bruce M. Simpson

Andre Oppermann wrote:


I have some WIP here too.  I'll send it to you later this afternoon.


Thanks, I look forward to seeing it, re Issue #2 IP_SENDSRCADDR.

Dealing with dhclient is a separate issue -- here, something like 
IP_SENDIF needs to be introduced, as we are truly in an 'ip unnumbered' 
situation -- i.e. the ifnet MAY not yet have been assigned an IPv4 address 
at all, and IP_SENDSRCADDR implies that you are source routing in the 
local stack by passing the address of a numbered interface.


I have however dealt with Issue #1 by committing a fix to ip_output() 
for the case where IP_ONESBCAST and SO_DONTROUTE are both set.


This together with the fix you committed for ethernet next-hop 
resolution (thanks!) should mean that projects like OLSRD can stop using 
libnet and other hacks for sending 255.255.255.255 on FreeBSD.


The original broadtest tool has now been cleaned up and put into the 
tree under src/tools/regression/netinet/ipbroadcast.


Regards,
BMS


Re: is setsockopt SO_NOSIGPIPE work?

2007-03-01 Thread Bruce M. Simpson

Anton Yuzhaninov wrote:

RE It works, but only if you use send() instead of write().
RE Alternatively, you can control the behavior on a per
RE message basis, by passing the MSG_NOSIGNAL in the flags
RE argument to the send() call (without having to set a
RE socket option).

Thanks, with send() it works fine.
I think it should be documented in setsockopt(2).
  
AFAIK this is not a POSIX sockopt. I can only trace it back to MacOS X 
as the origin.
Most applications I know of set the handler for SIGPIPE to SIG_IGN in 
such situations.
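
A small sketch of that conventional approach, assuming nothing beyond POSIX: with SIGPIPE set to SIG_IGN, a write() to a socket whose peer has gone away fails with EPIPE instead of killing the process. The function name is made up, and a local socket pair stands in for a real connection.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Write to a stream socket whose peer has been closed, with SIGPIPE ignored.
 * Returns the errno from write(), 0 if the write unexpectedly succeeded,
 * or -1 if the setup failed.
 */
static int
sk_write_to_closed_peer(void)
{
	int fds[2];
	int error = 0;

	if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
		return (-1);
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1)
		return (-1);
	close(fds[1]);			/* the peer goes away */
	if (write(fds[0], "x", 1) == -1)
		error = errno;		/* EPIPE, and no signal delivered */
	close(fds[0]);
	return (error);
}
```

Note that this changes the disposition for the whole process, which is why a per-socket or per-call alternative is attractive.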


Call graph: write() -> dofilewrite() -> soo_write() -> pru_send()

Looking at the code for the generic write() path it looks like we would 
never squelch this kind of SIGPIPE intentionally.
In soo_write() we check the SO_NOSIGPIPE option to tell if we should 
call psignal().
However, as soon as we return from soo_write(), the EPIPE is mapped to 
psignal() by the generic code in dofilewrite() which generates the 
SIGPIPE you are seeing.


I think this may be a bug but in the absence of precise written 
requirements I can't be sure. :-)


BMS







[PATCH] Re: is setsockopt SO_NOSIGPIPE work?

2007-03-01 Thread Bruce M. Simpson

Anton Yuzhaninov wrote:


Thanks, with send() it works fine.
I think it should be documented in setsockopt(2).
Try this patch. The comment doesn't reflect what the code does. SIGPIPE 
may actually be getting queued twice in your case. It is most likely 
that the process's main thread wasn't preempted before return from the 
syscall.


Perhaps someone more familiar with the signal code than I can chime in.

--- sys_generic.c 14 Oct 2006 19:01:55 -0000 1.151
+++ sys_generic.c 1 Mar 2007 17:30:39 -0000
@@ -489,7 +489,7 @@ dofilewrite(td, fd, fp, auio, offset, fl
   error == EINTR || error == EWOULDBLOCK))
   error = 0;
   /* Socket layer is responsible for issuing SIGPIPE. */
-   if (error == EPIPE) {
+   if (fp->f_type != DTYPE_SOCKET && error == EPIPE) {
   PROC_LOCK(td->td_proc);
   psignal(td->td_proc, SIGPIPE);
   PROC_UNLOCK(td->td_proc);



Re: [PATCH] Re: is setsockopt SO_NOSIGPIPE work?

2007-03-01 Thread Bruce M. Simpson

Anton Yuzhaninov wrote:

Works for me.
  

Committed, thanks for finding this bug.

BMS


Re: [PATCH] Re: is setsockopt SO_NOSIGPIPE work?

2007-03-01 Thread Bruce M. Simpson

N.J. Mann wrote:


Could this be why mail from cron doesn't work for me in 6.2?  I got as
far as finding that cron receives a SIGPIPE while sending the mail
message to sendmail, but never worked out why.  I ended up hacking cron
to ignore SIGPIPE and then ENOTIME to investigate further.

Unlikely, unless cron were directly hooked up to a TCP socket.

BMS


Re: [PATCH] Feature request: exit netstat(1) after user specified outputs

2007-02-28 Thread Bruce M. Simpson

LI Xin wrote:

Hi,

If no one objects this change, I will commit it?

  

No objection here.

BMS


Proposal: Add M_HASCL().

2007-02-28 Thread Bruce M Simpson
Much network code needs to know if the mbuf it is looking at is using a 
cluster. I propose putting M_HASCL() in sys/mbuf.h. I realise this is a 
style change, however, it seems to be a very common idiom.


Places this macro is currently defined and used directly:
netinet/ip_mroute.c
netinet6/ip6_mroute.c
nfsclient/nfsm_subs.h
nfsserver/nfsm_subs.h

Places which use this idiom by another name:
if_ppp.c
ppp_tty.c

Places which use this idiom indirectly by its expansion:
sys/mbuf.h
sys/socketvar.h
netinet/ip6.h
dev/pdq
Many device drivers and third party code.

Head on over to http://fxr.watson.org/fxr/ident?i=M_HASCL and have a look.

Feel free to not bikeshed about this. It became apparent that this is a 
common idiom (needing to know if an mbuf is using external storage for 
whatever reason).


Thoughts?

BMS


Re: making a dumb switch into a smart one

2007-02-26 Thread Bruce M. Simpson

Luigi Rizzo wrote:

partly off topic, but maybe someone might find this interesting
given that the device can do vlan tag insertion/removal,
so it can be used to provide additional fan-in/fan-out
to freebsd-based routers in not too high-speed networks.
  
Were you trying to perform the same evil experiment on the Asound 4-port 
Ethernet 'switch' PCI card which I found in my/your old office at ICSI? ;-)


I think it was Juli Mallett who said she had code to deal with the Asound.

This is cool.

BMS


Re: Networking FreeBSD Wiki

2007-02-26 Thread Bruce M. Simpson

[EMAIL PROTECTED] wrote:


George, maybe there should be a separate category in GNATS also, for
network issues?



Instead of being in kern you mean?  I have thought that before but I
don't control GNATS and we'd have to review a lot of bugs.
  
I have noticed there has been a gradual effort over time by the 
Bugmeisters to classify bugs by putting [netinet] or other strings in 
the one-line bug synopsis.


Whilst this is a great help, it still doesn't address many of the issues 
we have with GNATS, upon which consensus has not yet been reached as to 
how to go forward.


Personally, I'd like to blow GNATS up and replace it with Bugzilla.

Regards,
BMS


Re: kern/86848: [pf][multicast] destroying active syncdev leads to panic

2007-02-25 Thread Bruce M Simpson


 Hi,


Please try the attached patch which should hopefully fix this issue 
(untested).


Regards,
BMS

? .swp
Index: if_pfsync.c
===
RCS file: /home/ncvs/src/sys/contrib/pf/net/if_pfsync.c,v
retrieving revision 1.32
diff -u -p -r1.32 if_pfsync.c
--- if_pfsync.c	29 Dec 2006 13:59:47 -0000	1.32
+++ if_pfsync.c	25 Feb 2007 16:11:03 -0000
@@ -170,6 +170,9 @@ void	pfsync_timeout(void *);
 void	pfsync_send_bus(struct pfsync_softc *, u_int8_t);
 void	pfsync_bulk_update(void *);
 void	pfsync_bulkfail(void *);
+#ifdef __FreeBSD__
+static void	pfsync_ifdetach(void *, struct ifnet *);
+#endif
 
 int	pfsync_sync_ok;
 #ifndef __FreeBSD__
@@ -191,6 +194,9 @@ pfsync_clone_destroy(struct ifnet *ifp)
 struct pfsync_softc *sc;
 
 	sc = ifp->if_softc;
+#ifdef __FreeBSD__
+	EVENTHANDLER_DEREGISTER(ifnet_departure_event, sc->sc_detachtag);
+#endif
 	callout_stop(&sc->sc_tmo);
 	callout_stop(&sc->sc_bulk_tmo);
 	callout_stop(&sc->sc_bulkfail_tmo);
@@ -225,6 +231,16 @@ pfsync_clone_create(struct if_clone *ifc
 		return (ENOSPC);
 	}
 
+#ifdef __FreeBSD__
+	sc->sc_detachtag = EVENTHANDLER_REGISTER(ifnet_departure_event,
+	    pfsync_ifdetach, sc, EVENTHANDLER_PRI_ANY);
+	if (sc->sc_detachtag == NULL) {
+		if_free(ifp);
+		free(sc, M_PFSYNC);
+		return (ENOSPC);
+	}
+#endif
+
 	pfsync_sync_ok = 1;
 	sc->sc_mbuf = NULL;
 	sc->sc_mbuf_net = NULL;
@@ -1870,6 +1886,35 @@ pfsync_sendout(sc)
 
 #ifdef __FreeBSD__
 static void
+pfsync_ifdetach(void *arg, struct ifnet *ifp)
+{
+	struct pfsync_softc *sc = (struct pfsync_softc *)arg;
+	struct ip_moptions *imo;
+
+	if (sc == NULL || sc->sc_sync_ifp != ifp)
+		return;		/* not for us; unlocked read */
+
+	PF_LOCK();
+
+	/* Deal with detaching an interface which went away. */
+	sc->sc_sync_ifp = NULL;
+	if (sc->sc_mbuf_net != NULL) {
+		s = splnet();
+		m_freem(sc->sc_mbuf_net);
+		sc->sc_mbuf_net = NULL;
+		sc->sc_statep_net.s = NULL;
+		splx(s);
+	}
+	imo = &sc->sc_imo;
+	if (imo->imo_num_memberships > 0) {
+		in_delmulti(imo->imo_membership[--imo->imo_num_memberships]);
+		imo->imo_multicast_ifp = NULL;
+	}
+
+	PF_UNLOCK();
+}
+
+static void
 pfsync_senddef(void *arg)
 {
 	struct pfsync_softc *sc = (struct pfsync_softc *)arg;
@@ -1879,6 +1924,14 @@ pfsync_senddef(void *arg)
 		IF_DEQUEUE(&sc->sc_ifq, m);
 		if (m == NULL)
 			break;
+#if 1
+		/* XXX: paranoia */
+		if (sc->sc_sync_ifp == NULL) {
+			pfsyncstats.pfsyncs_oerrors++;
+			m_freem(m);
+			continue;
+		}
+#endif
 		if (ip_output(m, NULL, NULL, IP_RAWOUTPUT, &sc->sc_imo, NULL))
 			pfsyncstats.pfsyncs_oerrors++;
 	}
Index: if_pfsync.h
===
RCS file: /home/ncvs/src/sys/contrib/pf/net/if_pfsync.h,v
retrieving revision 1.7
diff -u -p -r1.7 if_pfsync.h
--- if_pfsync.h	10 Jun 2005 17:23:49 -0000	1.7
+++ if_pfsync.h	25 Feb 2007 16:11:03 -0000
@@ -181,6 +181,7 @@ struct pfsync_softc {
 	int			 sc_maxupdates;	/* number of updates/state */
 #ifdef __FreeBSD__
 	LIST_ENTRY(pfsync_softc) sc_next;
+	eventhandler_tag	 sc_detachtag;
 #endif
 };
 #endif

Re: kern/100519: [netisr] suggestion to fix suboptimal network polling

2007-02-25 Thread Bruce M Simpson
Synopsis: [netisr] suggestion to fix suboptimal network polling

State-Changed-From-To: feedback->open
State-Changed-By: bms
State-Changed-When: Sun Feb 25 16:18:13 UTC 2007
State-Changed-Why: 
Back to the net pool


Responsible-Changed-From-To: bms->net
Responsible-Changed-By: bms
Responsible-Changed-When: Sun Feb 25 16:18:13 UTC 2007
Responsible-Changed-Why: 
Back to the net pool

http://www.freebsd.org/cgi/query-pr.cgi?pr=100519


Re: kern/86848: [pf][multicast] destroying active syncdev leads to panic

2007-02-25 Thread Bruce M. Simpson

Whups. That needs 'int s' or the spl calls removed.
I am under the weather today (dry flu type virus)...


Re: [PATCH] Re: ioctl: SIOCADDMULTI (howto?)

2007-02-24 Thread Bruce M. Simpson
I have now added a regression test for this bug in HEAD, under 
src/tools/regression/ethernet/ethermulti.


BMS


NetworkRfcCompliance is born

2007-02-21 Thread Bruce M Simpson

http://wiki.freebsd.org/NetworkRfcCompliance

Please begin wiki-whacking!

BMS


Re: NetworkRfcCompliance is born

2007-02-21 Thread Bruce M Simpson

Luigi Rizzo wrote:

On Wed, Feb 21, 2007 at 02:50:27PM +0000, Bruce M Simpson wrote:
  

http://wiki.freebsd.org/NetworkRfcCompliance



before it is too late to change, maybe it is the case to
spell RFC as all capital letters ?
  
It would surely be better named NetworkStandardsCompliance as IEEE stuff 
appears inevitably also.


I am pressed for time at the moment, so, other volunteers very welcome 
to do so...


BMS


Re: Unable to connect broadband

2007-02-20 Thread Bruce M. Simpson

satimis wrote:

Hi folks,

FreeBSD-6.2-amd64
...
The onboard NIC seems not detected.
  
In the absence of required information, I speculate your machine has 
msk(4) or another recent chipset which may be supported in 
FreeBSD-CURRENT but not FreeBSD-STABLE.


Please post the full output of 'pciconf -lv' from booting a recent 
FreeSBIE version to the list and hopefully someone can offer more help.


Regards,
BMS


If you run IS-IS please contact me

2007-02-20 Thread Bruce M Simpson

Anyone out there running IS-IS on a FreeBSD machine, please contact me.

It's my understanding that IS-IS requires link-layer multicast support. 
Therefore I would like to hear from anyone who is running an 
implementation of it on FreeBSD successfully. I want to make sure it 
continues to operate in the 6.2-STABLE and 7.0-CURRENT code bases, given 
that we plan a lot of changes to Ethernet and how it works in those code 
bases.


If you could let me know which implementation of IS-IS you're using, how 
long you've been running it for, how large the network you route with 
IS-IS is, and which FreeBSD releases you have been using, that would be 
most useful.


Thank you in advance!

Regards,
BMS


Re: ioctl: SIOCADDMULTI (howto?)

2007-02-20 Thread Bruce M. Simpson
Here is a better patch for the netstat output. I haven't had time to 
look at the kernel yet.


If this patch is good for you I'll commit it on -CURRENT. It cleans up 
the group membership output significantly and displays the Link-layer 
information separately.


If anyone 'out there' has been relying on this output in scripts, please 
tell me.


BMS
--- mcast.c.orig	Sat Feb 17 18:12:28 2007
+++ mcast.c	Tue Feb 20 23:26:41 2007
@@ -71,21 +71,39 @@
 #define MYIFNAME_SIZE 128
 
 void
-ifmalist_dump(void)
+ifmalist_dump_af(struct ifmaddrs *ifmap, int af)
 {
-	struct ifmaddrs *ifmap, *ifma;
+	struct ifmaddrs *ifma;
 	sockunion_t *psa;
 	char myifname[MYIFNAME_SIZE];
 	char addrbuf[INET6_ADDRSTRLEN];
 	char *pcolon;
 	void *addr;
-	char *pifname, *plladdr, *pgroup;
+	char *pafname, *pifname, *plladdr, *pgroup;
 
-	if (getifmaddrs(&ifmap))
-		err(EX_OSERR, "getifmaddrs");
+	if (!((af == AF_INET) || (af == AF_LINK)
+#ifdef INET6
+	|| (af == AF_INET6)
+#endif
+	))
+	return;
+
+	switch (af) {
+	case AF_INET:
+		pafname = "IPv4";
+		break;
+	case AF_INET6:
+		pafname = "IPv6";
+		break;
+	case AF_LINK:
+		pafname = "Link-layer";
+		break;
+	}
 
-	fputs("IPv4/IPv6 Multicast Group Memberships\n", stdout);
-	fprintf(stdout, "%-20s\t%-16s\t%s\n", "Group", "Gateway", "Netif");
+	fprintf(stdout, "%s Multicast Group Memberships\n", pafname);
+	fprintf(stdout, "%-20s\t%-16s\t%s\n", "Group",
+	    "Next Hop/L2 Address",
+	    "Netif");
 
 	for (ifma = ifmap; ifma; ifma = ifma->ifma_next) {
 
@@ -94,16 +112,32 @@
 
 		/* Group address */
 		psa = (sockunion_t *)ifma->ifma_addr;
+		if (psa->sa.sa_family != af)
+			continue;
 		switch (psa->sa.sa_family) {
 		case AF_INET:
 			pgroup = inet_ntoa(psa->sin.sin_addr);
 			break;
+#ifdef INET6
 		case AF_INET6:
 			addr = &psa->sin6.sin6_addr;
 			inet_ntop(psa->sa.sa_family, addr, addrbuf,
 			    sizeof(addrbuf));
 			pgroup = addrbuf;
 			break;
+#endif
+		case AF_LINK:
+			if ((psa->sdl.sdl_alen == ETHER_ADDR_LEN) ||
+			    (psa->sdl.sdl_type == IFT_ETHER)) {
+				pgroup =
+ether_ntoa((struct ether_addr *)psa->sdl.sdl_data);
+			} else {
+				pgroup = addr2ascii(AF_LINK,
+				    &psa->sdl,
+				    sizeof(struct sockaddr_dl),
+				    addrbuf);
+			}
+			break;
 		default:
 			continue;	/* XXX */
 		}
@@ -116,14 +150,20 @@
 				plladdr = inet_ntoa(psa->sin.sin_addr);
 				break;
 			case AF_LINK:
-				if (psa->sdl.sdl_type == IFT_ETHER)
-					plladdr = ether_ntoa((struct ether_addr *)psa->sdl.sdl_data);
-				else
-					plladdr = link_ntoa(&psa->sdl);
+				if (psa->sdl.sdl_type == IFT_ETHER) {
+					plladdr =
+ether_ntoa((struct ether_addr *)psa->sdl.sdl_data);
+				} else {
+					plladdr = addr2ascii(AF_LINK,
+					    &psa->sdl,
+					    sizeof(struct sockaddr_dl),
+					    addrbuf);
+				}
 				break;
 			}
-		} else
+		} else {
 			plladdr = none;
+		}
 
 		/* Interface upon which the membership exists */
 		psa = (sockunion_t *)ifma->ifma_name;
@@ -143,6 +183,23 @@
 
 		fprintf(stdout, "%-20s\t%-16s\t%s\n", pgroup, plladdr, pifname);
 	}
+}
+
+void
+ifmalist_dump(void)
+{
+	struct ifmaddrs *ifmap;
+
+	if (getifmaddrs(&ifmap))
+		err(EX_OSERR, "getifmaddrs");
+
+	ifmalist_dump_af(ifmap, AF_LINK);
+	fputs("\n", stdout);
+	ifmalist_dump_af(ifmap, AF_INET);
+	fputs("\n", stdout);
+#ifdef INET6
+	ifmalist_dump_af(ifmap, AF_INET6);
+#endif
 
 	freeifmaddrs(ifmap);
 }

[PATCH] Re: ioctl: SIOCADDMULTI (howto?)

2007-02-17 Thread Bruce M. Simpson

Jouke Witteveen wrote:


So my apologies for suggesting it doesn't work at all; it seems that
the application I'm trying to get to work (wpa_supplicant for wired
interfaces) just doesn't _send_ its packets the right way.
That's a big relief! I added an item to the Wiki for someone to write a 
regression test.


Things aren't perfect though. In if.c the if_findmulti function is
broken (always returns NULL). I presume just comparing the
*LLADDR((sockaddr *)sa) data on both sockets is a better check, though
my knowledge on these things is limited.

I think I see a possible problem, though the code looks as though it is 
behaving as expected.

I am looking at RELENG_6 if.c. I think sa_equal() may be to blame.

sa_equal() performs a binary comparison on all of sa_data up to sa_len. 
Looking at struct sockaddr_dl, this might not be the right thing at all 
in that situation... though I need another pair of eyes to look. Can 
anyone shed light on this? An AF_INET and AF_INET6 address can be 
completely specified and compared with sa_equal(). An AF_LINK address 
looks as though sa_equal() may return irrational results.
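
To illustrate the concern, here is a minimal user-space sketch. It is not the kernel code: `struct toy_sockaddr_dl` is a hypothetical, simplified stand-in for `struct sockaddr_dl`, and `toy_sa_equal()`, `toy_lladdr_equal()` and `toy_make()` are made-up names. It only shows how a whole-structure comparison can disagree with a comparison of the link-layer address bytes when some other field (here the interface index) differs. Whether that is what actually trips up if_findmulti() is not established here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct sockaddr_dl. */
struct toy_sockaddr_dl {
	uint8_t  sdl_len;	/* total length of this structure */
	uint8_t  sdl_family;	/* address family */
	uint16_t sdl_index;	/* interface index, 0 if unset */
	uint8_t  sdl_nlen;	/* length of interface name in sdl_data */
	uint8_t  sdl_alen;	/* length of link-layer address */
	uint8_t  sdl_data[12];	/* name bytes, followed by address bytes */
};

/* Whole-structure comparison, in the spirit of sa_equal(). */
static int
toy_sa_equal(const struct toy_sockaddr_dl *a, const struct toy_sockaddr_dl *b)
{
	return (a->sdl_len == b->sdl_len && memcmp(a, b, a->sdl_len) == 0);
}

/* Compare only the link-layer address bytes, skipping the name. */
static int
toy_lladdr_equal(const struct toy_sockaddr_dl *a,
    const struct toy_sockaddr_dl *b)
{
	return (a->sdl_alen == b->sdl_alen &&
	    memcmp(a->sdl_data + a->sdl_nlen, b->sdl_data + b->sdl_nlen,
	    a->sdl_alen) == 0);
}

/* Build a zeroed toy address holding a 6-byte MAC and an interface index. */
static struct toy_sockaddr_dl
toy_make(uint16_t ifindex, const uint8_t *mac)
{
	struct toy_sockaddr_dl s;

	memset(&s, 0, sizeof(s));
	s.sdl_len = (uint8_t)sizeof(s);
	s.sdl_family = 18;	/* AF_LINK's value on FreeBSD; illustrative */
	s.sdl_index = ifindex;
	s.sdl_alen = 6;
	memcpy(s.sdl_data, mac, 6);
	return (s);
}
```

With the same MAC but interface index 0 in one address and 2 in the other, `toy_lladdr_equal()` reports a match while `toy_sa_equal()` does not.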


As for netstat, I do not really know what is keeping it from showing
the Multicast addresses. Again: my knowledge on this matter is
limited. All I can think of is that getifmaddrs is forgetting
something (perhaps the lack of a group membership). Maybe you can take
a look at it (I believe you wrote it).


I wrote the libc getifmaddrs() function and integrated it into netstat 
-g; Harti Brandt wrote the NET_RT_IFMALIST support. getifmaddrs() 
*should* return sockaddr_dl as well as sockaddr_in and all the others.


netstat skips over AF_LINK addresses. Try this patch to reveal them. It 
doesn't seem to show the IPv4 link layer memberships underneath, which 
is interesting...


As I am still learning how best to contribute to a project as big as
FreeBSD and I do not think I am skilled enough yet in C I refrain from
writing a patch. I am eager to see one though, be it only out of
curiosity to know what would be considered a proper fix.

Give it a try anyway!  I like to think we have strong healthy egos round 
here.


Regards,
BMS

--- mcast.c.orig	Sat Feb 17 18:12:28 2007
+++ mcast.c	Sat Feb 17 18:14:15 2007
@@ -84,7 +84,7 @@
 	if (getifmaddrs(&ifmap))
 		err(EX_OSERR, "getifmaddrs");
 
-	fputs("IPv4/IPv6 Multicast Group Memberships\n", stdout);
+	fputs("IPv4/IPv6/Layer 2 Multicast Group Memberships\n", stdout);
 	fprintf(stdout, "%-20s\t%-16s\t%s\n", "Group", "Gateway", "Netif");
 
 	for (ifma = ifmap; ifma; ifma = ifma->ifma_next) {
@@ -103,6 +103,15 @@
 			inet_ntop(psa->sa.sa_family, addr, addrbuf,
 			    sizeof(addrbuf));
 			pgroup = addrbuf;
+			break;
+		case AF_LINK:
+			if (psa->sdl.sdl_type == IFT_ETHER) {
+				plladdr = ether_ntoa((struct ether_addr *)
+				    &psa->sdl.sdl_data);
+			} else {
+				plladdr = link_ntoa(&psa->sdl);
+			}
+			strlcpy(addrbuf, plladdr, sizeof(addrbuf));
 			break;
 		default:
 			continue;	/* XXX */

Recommendations for OSPF v3 book?

2007-02-16 Thread Bruce M Simpson
Does anyone have any good suggestions for a book which discusses OSPF v3 
architecture?


I have read the original John Moy book 'OSPF: Anatomy of an Internet 
Routing Protocol' but would very much like to know if there is a good 
text out there which discusses OSPF in the wider context of IPv6 and the 
improvements made in version 3 of the protocol.


I should be most grateful for your suggestions.

Kind regards,
BMS


Re: [PATCH] Updated 802.1p/q patch

2007-02-15 Thread Bruce M Simpson

Yar Tikhiy wrote:

Do you have any architectural reservations about nested VLANs in
the main network stack?  Presently, a one-line patch can allow a
vlan(4) to attach to another vlan(4), but I haven't heard about the
behaviour of the resulting setup yet.
  
After looking around it seems there is definite scope and demand for 
such a feature in scenarios such as ISP Metro Ethernet setups. However, 
we can't rely on M_VLANTAG alone to implement it. To do it we need to be 
sure of the following:


1. Output path in vlan(4) changes not to call ether_output_frame() 
directly if nested.
2. Output path in vlan(4) detects when it's going to re-enter the parent 
vlan(4), and makes sure the inner 802.1q header is expanded and inserted 
from M_VLANTAG before passing it down the stack.

3. That the drivers and cards out there can deal with Q-in-Q.
4. That the input path only extracts and applies M_VLANTAG for the outer 
802.1q header.
5. That the input path is able to reenter vlan(4) correctly on the way 
back up the stack. The code which produces/consumes M_VLANTAG from the 
802.1q header might need to be made common.


The priority field then becomes problematic. As a compromise I'd suggest 
the priority field in the VLAN tag is derived from the innermost 802.1q 
header, which will be the first M_VLANTAG which the Ethernet part of the 
stack sees.


This gives ALTQ/RSVP/PF a chance to do its thing without complicated 
workarounds.


BMS


Re: [PATCH] Part 2 of low level 802.1p priority support

2007-02-14 Thread Bruce M. Simpson

Pyun YongHyeon wrote:
  Further testing with drivers is needed (I can't be 100% sure it fails 
  with msk(4) because something strange is happening when vlan tagging is 
  turned off). Perhaps Pyun knows?
  


I guess I've not merged local changes before committing to HEAD.
How about attached one?
  
I can confirm that the merged VLAN tag code works OK with msk and 
VLAN_HWTAGGING disabled when using this patch.


Regards,
BMS


[PATCH] Updated 802.1p/q patch

2007-02-14 Thread Bruce M Simpson

Hi,

I have tested my 802.1p input patch with vlans configured. So far so good.

It is now available from: 
http://people.FreeBSD.org/~bms/dump/latest-8021p.diff


This updated patch moves the 802.1q encapsulation into if_ethersubr.c,
allowing M_VLANTAG to be passed up and down the stack for 802.1p priority.
I would greatly appreciate wider testing before it is committed.

I've noticed that vlan(4) will not put a parent interface into PROMISC
if the vlanhwtag capability exists but is disabled.

If the main non-vlan input path receives datagrams destined for
a layer 3 address configured on a vlan interface, the netinet stack
will quite reasonably try to reply on the vlan interface unless
net.inet.ip.check_interface is set to 1; something to be aware of.

If vlan(4) gets an mbuf which has already been tagged with M_VLANTAG
from higher up in the stack, it *should* ignore the vlan id by overwriting
it, and using the priority field already assigned to it, so that ALTQ or
PF can do its magic. This new patch should do this.

The Ethernet code will not use 802.1p by default unless it came from
higher up (by way of M_VLANTAG passed to a driver); we should insert
the 802.1p tag in the situation where we got an M_VLANTAG from further
up without a vlan(4) instance being involved. The new patch should do this.

We should also make sure the CFI bit is always cleared in bridging
situations as it has special meaning for token ring and FDDI.
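
For readers less familiar with the tag layout, here is a small sketch of the 16-bit 802.1Q tag control field: three priority bits, one CFI bit, and a 12-bit VLAN ID. The macro and function names (`TCI_*`, `tci_*`) are hypothetical and chosen for illustration; they are not the kernel's `EVL_*` macros. `tci_set_vid()` models "overwrite the VLAN ID but keep the priority already assigned", and `tci_clear_cfi()` models clearing the CFI bit.

```c
#include <stdint.h>

#define TCI_VID_MASK	0x0fffu	/* bits 0-11: VLAN ID */
#define TCI_CFI_BIT	0x1000u	/* bit 12: canonical format indicator */
#define TCI_PRI_SHIFT	13	/* bits 13-15: 802.1p priority */

/* Extract the VLAN ID from a host-order tag control field. */
static uint16_t
tci_vid(uint16_t tci)
{
	return (tci & TCI_VID_MASK);
}

/* Extract the 802.1p priority (0-7). */
static uint8_t
tci_pri(uint16_t tci)
{
	return ((uint8_t)(tci >> TCI_PRI_SHIFT));
}

/* Build a tag control field from a priority and a VLAN ID, CFI clear. */
static uint16_t
tci_make(uint8_t pri, uint16_t vid)
{
	return ((uint16_t)(((pri & 7u) << TCI_PRI_SHIFT) |
	    (vid & TCI_VID_MASK)));
}

/* Replace the VLAN ID while preserving priority and CFI. */
static uint16_t
tci_set_vid(uint16_t tci, uint16_t vid)
{
	return ((uint16_t)((tci & ~TCI_VID_MASK) | (vid & TCI_VID_MASK)));
}

/* Clear the CFI bit, leaving priority and VLAN ID untouched. */
static uint16_t
tci_clear_cfi(uint16_t tci)
{
	return ((uint16_t)(tci & ~TCI_CFI_BIT));
}
```

A priority-only tag, as used for 802.1p without a VLAN, is simply one whose VLAN ID field is 0.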

What has not been tested or considered is the situation where we have
nested VLANs. At least one individual has asked about this feature. At
the moment, I'd suggest that only Netgraph potentially deals with this
rather than the main network stack.

Regards,
BMS



Re: Gateway slowed down to barely usable

2007-02-14 Thread Bruce M. Simpson

Andrea Venturoli wrote:


Today it suddenly dropped to a bare few b/s. I checked the ISP line by 
attaching another machine in place of this and it could do full 1Mb/s, 
so this box was the problem.


After a simple reboot it started working as good as always.

Now the question is: in case this happens again, how do I find out 
what's wrong?

CPU usage was under 2% and so was swap usage... what else could I check?
What tools should I use?

Points for further investigation:
How long was the machine up for?
Exactly which network components in FreeBSD are you using?
Do you have any figures on what kind of network load the machine was 
dealing with?

Can you rule out problems with an intermediate switch?

Based on what you've said I can only speculate that the possible causes 
are either mbuf memory fragmentation or a driver problem; both are a 
total stab in the dark.


Regards,
BMS


 bye  Thanks
av.




Re: Configuring rendevous point

2007-02-12 Thread Bruce M. Simpson

[EMAIL PROTECTED] wrote:

Hi all

situation: got freebsd box working as NAT for my local network. In kernel
config there is an option PIM.
  
FYI, PIM is now the default in -CURRENT; the option has been removed. 
You should be able to load multicast routing with PIM as a loadable 
kernel module in -CURRENT.

I want my hosts behind NAT to receive multicast streams. I've seen in
Debian in pimdd.conf undocumented option rp_address, which stands for
rendezvous point IP address
(http://ftp.debian.org/debian/pool/main/p/pimd/pimd_2.1.0-alpha29.17-6.diff.gz).
  

PIM-DM (Dense mode) does not use the Rendezvous Point.

Is there any way to specify rendezvous point in freebsd via pimd.conf or
mrouted ?

Try XORP, in ports/net/xorp; it supports PIM-SM (Sparse mode) which is 
probably what you want for this kind of network configuration.


Normally the RP for a given group or set of groups is discovered using 
the Auto-RP feature of PIM-SM; however, RPs may also be statically 
configured. See the 'static-rps {}' configuration block in XORP's PIM-SM.


Regards,
BMS


Re: [PATCH] netstat(1) should print CIDR prefixes

2007-02-12 Thread Bruce M Simpson

Gary Corcoran wrote:


Since those 'classes' haven't meant anything for many years, and
interpreting them as 'special' is just plain wrong in almost all cases
these days, I think the change is the right thing to do.

I've had +3.

Any objections? If I hear none I will make this change in -CURRENT; with 
a note in UPDATING.


Regards,
BMS


[PATCH] Part 1 of low level 802.1p priority support

2007-02-10 Thread Bruce M Simpson

Hi,

Here is the first patch to bring in 802.1p Packet Priority to FreeBSD; 
this is to support Differentiated Services and Quality-of-Service. This 
builds on the M_VLANTAG support introduced by Andre last September.


This first stage enables FreeBSD to pass packets for 802.1q with VLAN 0 
to the main input path in the stack, which is the IEEE 
standards-compliant behaviour. With the attached patch and test packet, 
you can test this for yourself.
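
The input decision described above can be sketched in user space as follows. This is a simplified model, not the patch itself: `enum vlan_verdict` and `classify_tagged_frame()` are hypothetical names, and the `has_vlan_trunk` flag stands in for the `ifp->if_vlantrunk != NULL` check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Possible outcomes for a frame that carries an 802.1q tag. */
enum vlan_verdict {
	VLAN_TO_MAIN_INPUT,	/* VLAN 0: priority-tagged, use normal path */
	VLAN_TO_VLAN_INPUT,	/* numbered VLAN, hand to the VLAN handler */
	VLAN_DROP		/* numbered VLAN, but none configured */
};

/*
 * Decide what to do with a tagged frame given its host-order tag
 * control field and whether any VLANs are configured on the interface.
 */
static enum vlan_verdict
classify_tagged_frame(uint16_t tci, bool has_vlan_trunk)
{
	uint16_t vid = tci & 0x0fff;	/* low 12 bits are the VLAN ID */

	if (vid == 0)
		return (VLAN_TO_MAIN_INPUT);
	if (has_vlan_trunk)
		return (VLAN_TO_VLAN_INPUT);
	return (VLAN_DROP);
}
```

Note that a VLAN 0 frame goes to the main input path whether or not VLANs are configured, which is what allows 802.1p priority information to survive on an interface with no vlan(4) instances.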


Currently this is limited to interfaces which support VLAN_HWTAGGING. To 
make the change universal, an architectural change is needed; some of 
the inline 802.1q processing needs to be moved from if_vlan.c to 
if_ethersubr.c.


To use this:
1. Apply attached patch on a separate machine to be used as a test peer.
2. Process attached hex dump with xxd from Vim distribution 
(editors/vim) to convert back to a binary pcap file.
3. Configure test address on test peer, preferably using a separate 
physical LAN.
4. Use ports/net-mgmt/tcpreplay to inject the traffic, with the 
appropriate IP and MAC addresses.
5. Observe that you get an ICMP echo reply back WITHOUT 802.1q 
encapsulation.


Currently, the code deals only with receiving VLAN tags at a low level 
and does nothing about sending them.
This is just the low level stuff -- QoS is not magically happening right 
now.


Comments... testing... suggestions...

Regards,
BMS

? .swp
Index: if_ethersubr.c
===
RCS file: /home/ncvs/src/sys/net/if_ethersubr.c,v
retrieving revision 1.222
diff -u -p -r1.222 if_ethersubr.c
--- if_ethersubr.c	24 Dec 2006 08:52:13 -	1.222
+++ if_ethersubr.c	10 Feb 2007 16:46:42 -
@@ -618,6 +618,7 @@ ether_demux(struct ifnet *ifp, struct mb
 	struct ether_header *eh;
 	int isr;
 	u_short ether_type;
+	uint16_t vlanid;
 #if defined(NETATALK)
 	struct llc *l;
 #endif
@@ -627,6 +628,7 @@ ether_demux(struct ifnet *ifp, struct mb
 
 	KASSERT(ifp != NULL, ("ether_demux: NULL interface pointer"));
 
+	vlanid = 0;
 	eh = mtod(m, struct ether_header *);
 	ether_type = ntohs(eh->ether_type);
 
@@ -708,36 +710,44 @@ post_stats:
 	 */
 	if (m->m_flags & M_VLANTAG) {
 		/*
-		 * If no VLANs are configured, drop.
+		 * Deal with numbered 802.1q VLANs, by passing frames for
+		 * specifically numbered VLANs to the VLAN input handler.
 		 */
-		if (ifp->if_vlantrunk == NULL) {
-			ifp->if_noproto++;
-			m_freem(m);
+		vlanid = EVL_VLANOFTAG(m->m_pkthdr.ether_vtag);
+		if (ifp->if_vlantrunk != NULL && vlanid != 0) {
+			KASSERT(vlan_input_p != NULL,
+			    ("ether_input: VLAN not loaded!"));
+			(*vlan_input_p)(ifp, m);
 			return;
 		}
 		/*
-		 * vlan_input() will either recursively call ether_input()
-		 * or drop the packet.
+		 * Drop frames with VLAN encapsulation if VLANs are not
+		 * configured on this interface, if and only if they did
+		 * not contain 802.1p priority information.
+		 * Such frames are preserved, because code further up the
+		 * stack may use the 802.1p information.
 		 */
-		KASSERT(vlan_input_p != NULL,("ether_input: VLAN not loaded!"));
-		(*vlan_input_p)(ifp, m);
-		return;
+		if (ifp->if_vlantrunk == NULL && vlanid != 0) {
+			ifp->if_noproto++;
+			m_freem(m);
+			return;
+		}
 	}
 
 	/*
 	 * Handle protocols that expect to have the Ethernet header
 	 * (and possibly FCS) intact.
 	 */
-	switch (ether_type) {
-	case ETHERTYPE_VLAN:
+	if (ether_type == ETHERTYPE_VLAN && vlanid != 0) {
 		if (ifp->if_vlantrunk != NULL) {
-			KASSERT(vlan_input_p,("ether_input: VLAN not loaded!"));
+			KASSERT(vlan_input_p,
+			    ("ether_input: VLAN not loaded!"));
 			(*vlan_input_p)(ifp, m);
 		} else {
 			ifp->if_noproto++;
 			m_freem(m);
+			return;
 		}
-		return;
 	}
 
 	/* Strip off Ethernet header. */
000: d4c3 b2a1 0200 0400      
010:   0100  51ed cd45 e444 0d00  Q..E.D..
020: 6600  6600     0010  f...f...
030: c6bb 16f4 8100  0800 4500 0054 258f  ..E..T%.
040:  4001 4110 0a00 0005 0a00 0006 0800  [EMAIL PROTECTED]
050: ca41 4154  45cd ed51 000c ce3b 0809  .AAT..E..Q...;..
060: 0a0b 0c0d 0e0f 1011 1213 1415 1617 1819  
070: 1a1b 1c1d 1e1f 2021 2223 2425 2627 2829  .. !#$%'()
080: 2a2b 2c2d 2e2f 3031 3233 3435 3637   *+,-./01234567

[PATCH] Part 2 of low level 802.1p priority support

2007-02-10 Thread Bruce M. Simpson
This updated patch moves VLAN tag decapsulation into if_ethersubr.c and 
always uses M_VLANTAG, which is also passed to the upper layer.
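
As a rough illustration of the decapsulation step, here is a user-space sketch operating on a plain byte buffer rather than an mbuf. The function name `strip_vlan_tag()` and the `ETH_ADDRS_LEN` / `VLAN_ENCAP_LEN` constants are hypothetical. It uses the same trick as the patch: copy the two Ethernet addresses forward over the 4-byte 802.1q header, then treat the frame as starting 4 bytes later, so the inner type field is already in place.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDRS_LEN	12	/* destination + source MAC addresses */
#define VLAN_ENCAP_LEN	4	/* TPID (0x8100) + tag control field */

/*
 * If the frame carries an in-band 802.1q header, store its tag control
 * field (host order) in *tci, remove the header in place and return a
 * pointer to the new start of the frame (len shrinks by VLAN_ENCAP_LEN).
 * Return NULL if the frame is too short or is not 802.1q tagged.
 */
static uint8_t *
strip_vlan_tag(uint8_t *frame, size_t len, uint16_t *tci)
{
	/* Need both addresses, the tag, and the inner type field. */
	if (len < ETH_ADDRS_LEN + VLAN_ENCAP_LEN + 2)
		return (NULL);
	if (frame[12] != 0x81 || frame[13] != 0x00)
		return (NULL);

	*tci = (uint16_t)((frame[14] << 8) | frame[15]);

	/* Slide the addresses over the tag; the regions overlap. */
	memmove(frame + VLAN_ENCAP_LEN, frame, ETH_ADDRS_LEN);
	return (frame + VLAN_ENCAP_LEN);
}
```

The caller would then re-read the type field at offset 12 of the returned pointer, which corresponds to re-inspecting `ether_type` after `m_adj()` in the patch.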


Tests with ping:
fxp (no VLAN_HWTAGGING support)  OK
msk (VLAN_HWTAGGING enabled)     OK
msk (VLAN_HWTAGGING disabled)    FAIL

I am concerned that this may need review and testing to support 
situations where we do nested VLANs or with bridge(4) before it can be 
committed.


Further testing with drivers is needed (I can't be 100% sure it fails 
with msk(4) because something strange is happening when vlan tagging is 
turned off). Perhaps Pyun knows?


Regards,
BMS


Index: if_ethersubr.c
===
RCS file: /home/ncvs/src/sys/net/if_ethersubr.c,v
retrieving revision 1.222
diff -u -p -r1.222 if_ethersubr.c
--- if_ethersubr.c	24 Dec 2006 08:52:13 -	1.222
+++ if_ethersubr.c	10 Feb 2007 18:16:54 -
@@ -701,43 +701,50 @@ post_stats:
 		}
 	}
 #endif
-
 	/*
-	 * Check to see if the device performed the VLAN decapsulation and
-	 * provided us with the tag.
+	 * If the device did not perform decapsulation of the 802.1q VLAN
+	 * header itself, do this now, and tag the mbuf with M_VLANTAG.
+	 * Remove the 802.1q header by copying the Ethernet addresses over
+	 * it and adjusting the beginning of the data in the mbuf.
+	 * Re-inspect the ether_type field so we do the right thing
+	 * for VLAN 0.
 	 */
-	if (m->m_flags & M_VLANTAG) {
-		/*
-		 * If no VLANs are configured, drop.
-		 */
-		if (ifp->if_vlantrunk == NULL) {
-			ifp->if_noproto++;
-			m_freem(m);
+	if ((ether_type == ETHERTYPE_VLAN) && !(m->m_flags & M_VLANTAG)) {
+		struct ether_vlan_header *evl;
+
+		if (m->m_len < sizeof(*evl) &&
+		    (m = m_pullup(m, sizeof(*evl))) == NULL) {
+			if_printf(ifp, "cannot pullup VLAN header\n");
 			return;
 		}
-		/*
-		 * vlan_input() will either recursively call ether_input()
-		 * or drop the packet.
-		 */
-		KASSERT(vlan_input_p != NULL,("ether_input: VLAN not loaded!"));
-		(*vlan_input_p)(ifp, m);
-		return;
+
+		evl = mtod(m, struct ether_vlan_header *);
+		m->m_pkthdr.ether_vtag = ntohs(evl->evl_tag);
+		m->m_flags |= M_VLANTAG;
+		bcopy((char *)evl, (char *)evl + ETHER_VLAN_ENCAP_LEN,
+		      ETHER_HDR_LEN - ETHER_TYPE_LEN);
+		m_adj(m, ETHER_VLAN_ENCAP_LEN);
+		/* We need to see the inner type field in case of reentry. */
+		eh = mtod(m, struct ether_header *);
+		ether_type = ntohs(eh->ether_type);
 	}
 
 	/*
-	 * Handle protocols that expect to have the Ethernet header
-	 * (and possibly FCS) intact.
+	 * Deal with numbered 802.1q VLANs, by passing these frames to
+	 * the VLAN input handler. Frames destined for VLAN 0 are for
+	 * the main input path. Otherwise, drop frames with VLAN tags.
 	 */
-	switch (ether_type) {
-	case ETHERTYPE_VLAN:
+	if ((m->m_flags & M_VLANTAG) &&
+	    EVL_VLANOFTAG(m->m_pkthdr.ether_vtag) != EVL_VLAN_ZERO) {
 		if (ifp->if_vlantrunk != NULL) {
-			KASSERT(vlan_input_p,("ether_input: VLAN not loaded!"));
+			KASSERT(vlan_input_p,
+			    ("ether_input: VLAN not loaded!"));
 			(*vlan_input_p)(ifp, m);
 		} else {
 			ifp->if_noproto++;
 			m_freem(m);
+			return;
 		}
-		return;
 	}
 
 	/* Strip off Ethernet header. */
Index: if_vlan.c
===
RCS file: /home/ncvs/src/sys/net/if_vlan.c,v
retrieving revision 1.117
diff -u -p -r1.117 if_vlan.c
--- if_vlan.c	30 Dec 2006 21:10:25 -	1.117
+++ if_vlan.c	10 Feb 2007 18:16:54 -
@@ -911,51 +911,9 @@ vlan_input(struct ifnet *ifp, struct mbu
 	uint16_t tag;
 
 	KASSERT(trunk != NULL, ("%s: no trunk", __func__));
+	KASSERT((m->m_flags & M_VLANTAG),("%s: M_VLANTAG not set", __func__));
 
-	if (m->m_flags & M_VLANTAG) {
-		/*
-		 * Packet is tagged, but m contains a normal
-		 * Ethernet frame; the tag is stored out-of-band.
-		 */
-		tag = EVL_VLANOFTAG(m->m_pkthdr.ether_vtag);
-		m->m_flags &= ~M_VLANTAG;
-	} else {
-		struct ether_vlan_header *evl;
-
-		/*
-		 * Packet is tagged in-band as specified by 802.1q.
-		 */
-		switch (ifp->if_type) {
-		case IFT_ETHER:
-			if (m->m_len < sizeof(*evl) &&
-			    (m = m_pullup(m, sizeof(*evl))) == NULL) {
-				if_printf(ifp, "cannot pullup VLAN header\n");
-				return;
-			}
-			evl = mtod(m, struct ether_vlan_header *);
-			tag = EVL_VLANOFTAG(ntohs(evl->evl_tag));
-
-			/*
-			 * Remove the 802.1q header by copying the Ethernet
-			 * addresses over it and adjusting the beginning of
-			 * the data in the mbuf.  The encapsulated Ethernet
-			 * type field is already in place.
-			 */
-			bcopy((char *)evl, (char *)evl + ETHER_VLAN_ENCAP_LEN,
-			      ETHER_HDR_LEN - ETHER_TYPE_LEN);
-			m_adj(m, ETHER_VLAN_ENCAP_LEN);
-			break;
-
-		default:
-#ifdef INVARIANTS
-			panic("%s: %s has unsupported if_type %u",
-			      __func__, ifp->if_xname, ifp->if_type);
-#endif
-			m_freem(m);
-			ifp->if_noproto++;
-			return;
-		}
-	}
+	tag = EVL_VLANOFTAG(m->m_pkthdr.ether_vtag);
 
 	TRUNK_RLOCK(trunk);
 #ifdef VLAN_ARRAY
Index: if_vlan_var.h

[PATCH] Introduce M_PROMISC to lower part of Ethernet code

2007-02-10 Thread Bruce M Simpson

Hi,

Thunderbird keeps crashing whenever I draft these messages, which is 
frustrating.


Can we discuss this change? I would like to get it in as we get the 
following wins:


1. Potentially cleaner code in ether_demux()/ether_input()
2. Ways of detecting and preventing L2/L3 forwarding loops
3. Being able to do more with promiscuous mode in general e.g. using it 
to emulate broken IFF_ALLMULTI with network cards which can't support 
multicast routing properly.
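
The test that decides whether a frame was "received promiscuously" can be modelled in user space like this. It is a sketch of the condition only: `mac_is_multicast()` and `frame_is_promisc()` are hypothetical names, and the `if_promisc` flag stands in for the `IFF_PROMISC` interface flag.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN	6

/* Multicast and broadcast addresses have the low bit of octet 0 set. */
static bool
mac_is_multicast(const uint8_t *mac)
{
	return ((mac[0] & 0x01) != 0);
}

/*
 * A frame counts as promiscuously received when the interface is in
 * promiscuous mode, the destination is unicast, and the destination
 * is not the interface's own link-layer address.
 */
static bool
frame_is_promisc(const uint8_t *dhost, const uint8_t *ifaddr, bool if_promisc)
{
	return (if_promisc && !mac_is_multicast(dhost) &&
	    memcmp(dhost, ifaddr, MAC_LEN) != 0);
}
```

Marking the frame once, early in the input path, means later code can test a single flag instead of repeating the address comparison.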


I look forward to feedback. This is not a complete change; it is 
strictly development quality at the moment.


Regards,
BMS
Index: net/if_ethersubr.c
===
RCS file: /home/ncvs/src/sys/net/if_ethersubr.c,v
retrieving revision 1.222
diff -u -p -r1.222 if_ethersubr.c
--- net/if_ethersubr.c	24 Dec 2006 08:52:13 -	1.222
+++ net/if_ethersubr.c	10 Feb 2007 20:59:39 -
@@ -582,6 +582,7 @@ ether_input(struct ifnet *ifp, struct mb
 	if (IFP2AC(ifp)->ac_netgraph != NULL) {
 		KASSERT(ng_ether_input_p != NULL,
 		    ("ng_ether_input_p is NULL"));
+		m->m_flags &= ~M_PROMISC;
 		(*ng_ether_input_p)(ifp, &m);
 		if (m == NULL)
 			return;
@@ -598,6 +599,7 @@ ether_input(struct ifnet *ifp, struct mb
 	 * at the src/sys/netgraph/ng_ether.c:ng_ether_rcv_upper()
 	 */
 	if (ifp->if_bridge) {
+		m->m_flags &= ~M_PROMISC;
 		BRIDGE_INPUT(ifp, m);
 		if (m == NULL)
 			return;
@@ -634,6 +636,14 @@ ether_demux(struct ifnet *ifp, struct mb
 	if (rule)	/* packet was already bridged */
 		goto post_stats;
 #endif
+	/*
+	 * If the frame was received promiscuously, mark it as such.
+	 */
+	if ((ifp->if_flags & IFF_PROMISC) &&
+	    !ETHER_IS_MULTICAST(eh->ether_dhost) &&
+	    bcmp(eh->ether_dhost, IF_LLADDR(ifp), ETHER_ADDR_LEN) != 0) {
+		m->m_flags |= M_PROMISC;
+	}
 
 	if (!(ifp->if_bridge) &&
 	    !((ether_type == ETHERTYPE_VLAN || m->m_flags & M_VLANTAG) &&
@@ -648,8 +658,10 @@ ether_demux(struct ifnet *ifp, struct mb
 		 * evaluation, to see if the carp ether_dhost values break any
 		 * of these checks!
 		 */
-		if (ifp->if_carp && carp_forus(ifp->if_carp, eh->ether_dhost))
+		if (ifp->if_carp && carp_forus(ifp->if_carp, eh->ether_dhost)) {
+			m->m_flags &= ~M_PROMISC;
 			goto pre_stats;
+		}
 #endif
 		/*
 		 * Discard packet if upper layers shouldn't see it because it
@@ -662,14 +674,16 @@ ether_demux(struct ifnet *ifp, struct mb
 		 * give them a chance to consider it as well (e. g. in case
 		 * bridging is only active on a VLAN).  They will drop it if
 		 * it's undesired.
+		 *
+		 * XXX: There is no way this check can be invoked if
+		 * there are no VLANs attached to this parent interface,
+		 * which is likely to cause recursion if we're acting
+		 * as an IP forwarder...
 		 */
-		if ((ifp->if_flags & IFF_PROMISC) != 0
-		    && !ETHER_IS_MULTICAST(eh->ether_dhost)
-		    && bcmp(eh->ether_dhost,
-		      IF_LLADDR(ifp), ETHER_ADDR_LEN) != 0
-		    && (ifp->if_flags & IFF_PPROMISC) == 0) {
-			m_freem(m);
-			return;
+		if ((m->m_flags & M_PROMISC) &&
+		    (ifp->if_flags & IFF_PPROMISC) == 0) {
+			m_freem(m);
+			return;
 		}
 	}
 
@@ -720,6 +734,7 @@ post_stats:
 		 * or drop the packet.
 		 */
 		KASSERT(vlan_input_p != NULL,("ether_input: VLAN not loaded!"));
+		m->m_flags &= ~M_PROMISC;
 		(*vlan_input_p)(ifp, m);
 		return;
 	}
@@ -732,6 +747,7 @@ post_stats:
 	case ETHERTYPE_VLAN:
 		if (ifp->if_vlantrunk != NULL) {
 			KASSERT(vlan_input_p,("ether_input: VLAN not loaded!"));
+			m->m_flags &= ~M_PROMISC;
 			(*vlan_input_p)(ifp, m);
 		} else {
 			ifp->if_noproto++;
Index: sys/mbuf.h
===
RCS file: /home/ncvs/src/sys/sys/mbuf.h,v
retrieving revision 1.202
diff -u -p -r1.202 mbuf.h
--- sys/mbuf.h	25 Jan 2007 01:05:23 -	1.202
+++ sys/mbuf.h	10 Feb 2007 20:59:40 -
@@ -182,6 +182,7 @@ struct mbuf {
 #define	M_FIRSTFRAG	0x1000	/* packet is first fragment */
 #define	M_LASTFRAG	0x2000	/* packet is last fragment */
 #define	M_VLANTAG	0x10000	/* ether_vtag is valid */
+#define	M_PROMISC	0x20000	/* packet was not for us */
 
 /*
  * External buffer types: identify ext_buf type.
@@ -203,7 +204,7 @@ struct mbuf {
 #define	M_COPYFLAGS	(M_PKTHDR|M_EOR|M_RDONLY|M_PROTO1|M_PROTO1|M_PROTO2|\
 			M_PROTO3|M_PROTO4|M_PROTO5|M_SKIP_FIREWALL|\
 			M_BCAST|M_MCAST|M_FRAG|M_FIRSTFRAG|M_LASTFRAG|\
-			M_VLANTAG)
+			M_VLANTAG|M_PROMISC)
 
 /*
  * Flags to purge when crossing layers.

[PATCH] Make INET6 MROUTING dynamically loadable in GENERIC

2007-02-10 Thread Bruce M Simpson

Hi,

This should do what it says on the tin...

Regards,
BMS
Make IPv6 multicast forwarding dynamically loadable into a GENERIC kernel.

Index: conf/files
===
RCS file: /home/ncvs/src/sys/conf/files,v
retrieving revision 1.1175
diff -u -p -r1.1175 files
--- conf/files	7 Feb 2007 18:55:29 -	1.1175
+++ conf/files	10 Feb 2007 22:07:19 -
@@ -1759,6 +1759,7 @@ netinet/ip_icmp.c		optional inet
 netinet/ip_input.c		optional inet
 netinet/ip_ipsec.c		optional ipsec
 netinet/ip_ipsec.c		optional fast_ipsec
+netinet/ip_mroute.c		optional inet | inet6
 netinet/ip_mroute.c		optional mrouting
 netinet/ip_options.c		optional inet
 netinet/ip_output.c		optional inet
@@ -1814,7 +1815,7 @@ netinet6/in6_src.c		optional inet6
 netinet6/ip6_forward.c		optional inet6
 netinet6/ip6_id.c		optional inet6
 netinet6/ip6_input.c		optional inet6
-netinet6/ip6_mroute.c		optional inet6
+netinet6/ip6_mroute.c		optional mrouting inet6
 netinet6/ip6_output.c		optional inet6
 netinet6/ipcomp_core.c		optional ipsec
 netinet6/ipcomp_input.c		optional ipsec
Index: modules/ip_mroute_mod/Makefile
===
RCS file: /home/ncvs/src/sys/modules/ip_mroute_mod/Makefile,v
retrieving revision 1.14
diff -u -p -r1.14 Makefile
--- modules/ip_mroute_mod/Makefile	9 Feb 2007 01:42:43 -	1.14
+++ modules/ip_mroute_mod/Makefile	10 Feb 2007 22:07:19 -
@@ -1,13 +1,21 @@
 # $FreeBSD: src/sys/modules/ip_mroute_mod/Makefile,v 1.14 2007/02/09 01:42:43 bms Exp $
 
-.PATH: ${.CURDIR}/../../netinet
+.PATH: ${.CURDIR}/../../netinet ${.CURDIR}/../../netinet6
 
 KMOD=	ip_mroute
-SRCS=	ip_mroute.c opt_mac.h opt_mrouting.h
+SRCS=	ip_mroute.c
+SRCS+=	ip6_mroute.c
+SRCS+=	opt_inet.h opt_inet6.h opt_mac.h opt_mrouting.h
 
 .if !defined(KERNBUILDDIR)
+opt_inet.h:
+	echo "#define INET 1" > ${.TARGET}
+
+opt_inet6.h:
+	echo "#define INET6 1" > ${.TARGET}
+
 opt_mrouting.h:
-	echo "#define	MROUTING 1" > ${.TARGET}
+	echo "#define MROUTING 1" > ${.TARGET}
 .endif
 
 .include <bsd.kmod.mk>
Index: netinet/ip_mroute.c
===
RCS file: /home/ncvs/src/sys/netinet/ip_mroute.c,v
retrieving revision 1.128
diff -u -p -r1.128 ip_mroute.c
--- netinet/ip_mroute.c	10 Feb 2007 14:48:42 -	1.128
+++ netinet/ip_mroute.c	10 Feb 2007 22:07:20 -
@@ -55,6 +55,8 @@
  * $FreeBSD: src/sys/netinet/ip_mroute.c,v 1.128 2007/02/10 14:48:42 bms Exp $
  */
 
+#include "opt_inet.h"
+#include "opt_inet6.h"
 #include "opt_mac.h"
 #include "opt_mrouting.h"
 
@@ -217,6 +219,12 @@ struct protosw in_pim_protosw = {
 	.pr_usrreqs =		rip_usrreqs
 };
 static const struct encaptab *pim_encap_cookie;
+
+#ifdef INET6
+extern struct protosw in6_pim_protosw;	/* ip6_mroute.c: struct in6_protosw */
+static const struct encaptab *pim6_encap_cookie;
+#endif
+
 static int pim_encapcheck(const struct mbuf *, int, int, void *);
 
 /*
@@ -2737,7 +2745,7 @@ pim_register_send_rp(struct ip *ip, stru
 }
 
 /*
- * pim_encapcheck() is called by the encap4_input() path at runtime to
+ * pim_encapcheck() is called by the encap[46]_input() path at runtime to
  * determine if a packet is for PIM; allowing PIM to be dynamically loaded
  * into the kernel.
  */
@@ -2995,6 +3003,10 @@ pim_input_to_daemon:
 return;
 }
 
+/*
+ * XXX: This is common code for dealing with initialization for both
+ * the IPv4 and IPv6 multicast forwarding paths. It could do with cleanup.
+ */
 static int
 ip_mroute_modevent(module_t mod, int type, void *unused)
 {
@@ -3006,6 +3018,7 @@ ip_mroute_modevent(module_t mod, int typ
 	ip_mrouter_reset();
 	TUNABLE_ULONG_FETCH("net.inet.pim.squelch_wholepkt",
 	    &pim_squelch_wholepkt);
+
 	pim_encap_cookie = encap_attach_func(AF_INET, IPPROTO_PIM,
 	    pim_encapcheck, &in_pim_protosw, NULL);
 	if (pim_encap_cookie == NULL) {
@@ -3015,6 +3028,23 @@ ip_mroute_modevent(module_t mod, int typ
 		mtx_destroy(&mrouter_mtx);
 		return (EINVAL);
 	}
+
+#ifdef INET6
+	pim6_encap_cookie = encap_attach_func(AF_INET6, IPPROTO_PIM,
+	    pim_encapcheck, &in6_pim_protosw, NULL);
+	if (pim6_encap_cookie == NULL) {
+		printf("ip_mroute: unable to attach pim6 encap\n");
+		if (pim_encap_cookie) {
+		    encap_detach(pim_encap_cookie);
+		    pim_encap_cookie = NULL;
+		}
+		VIF_LOCK_DESTROY();
+		MFC_LOCK_DESTROY();
+		mtx_destroy(&mrouter_mtx);
+		return (EINVAL);
+	}
+	}
+#endif
+
 	ip_mcast_src = X_ip_mcast_src;
 	ip_mforward = X_ip_mforward;
 	ip_mrouter_done = X_ip_mrouter_done;
@@ -3039,6 +3069,12 @@ ip_mroute_modevent(module_t mod, int typ
 	if (ip_mrouter)
 	return EINVAL;
 
+#ifdef INET6
+	if (pim6_encap_cookie) {
+	encap_detach(pim6_encap_cookie);
+	pim6_encap_cookie = NULL;
+	}
+#endif
 	if (pim_encap_cookie) {
 	encap_detach(pim_encap_cookie);
 	pim_encap_cookie = NULL;
Index: netinet6/in6_proto.c
===
RCS file: /home/ncvs/src/sys/netinet6/in6_proto.c,v
retrieving 

[PATCH] netstat(1) should print CIDR prefixes

2007-02-10 Thread Bruce M Simpson

Hi,

This is a POLA violating 'let's move with the times' patch that gets rid 
of the special treatment of classful IPv4 network prefixes in 'netstat 
-rn' output. Comments please!
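
For comparison, here is a small stand-alone sketch of classless formatting: the full dotted quad followed by a prefix length, with no special casing of class A/B/C networks. It is not the patch below; `mask_to_prefixlen()` and `format_cidr()` are hypothetical helpers, and the sketch assumes a contiguous netmask given in host byte order.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <stdio.h>

/* Count the leading one bits of a host-order, contiguous netmask. */
static int
mask_to_prefixlen(uint32_t mask)
{
	int len = 0;

	while (mask & 0x80000000u) {
		len++;
		mask <<= 1;
	}
	return (len);
}

/*
 * Format a host-order network address and mask as "a.b.c.d/len".
 * Returns buf on success, NULL if the address cannot be formatted.
 */
static const char *
format_cidr(uint32_t net, uint32_t mask, char *buf, size_t buflen)
{
	struct in_addr in;
	char addr[INET_ADDRSTRLEN];

	in.s_addr = htonl(net & mask);
	if (inet_ntop(AF_INET, &in, addr, sizeof(addr)) == NULL)
		return (NULL);
	snprintf(buf, buflen, "%s/%d", addr, mask_to_prefixlen(mask));
	return (buf);
}
```

With classful output a route to 10.0.0.0 with mask 255.0.0.0 would be shown with the trailing octets dropped; here it is written out in full as 10.0.0.0/8.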


Regards,
BMS
Index: route.c
===
RCS file: /home/ncvs/src/usr.bin/netstat/route.c,v
retrieving revision 1.76
diff -u -p -r1.76 route.c
--- route.c	13 May 2005 16:31:10 -	1.76
+++ route.c	10 Feb 2007 22:55:50 -
@@ -865,32 +865,7 @@ netname(u_long in, u_long mask)
 		strncpy(line, cp, sizeof(line) - 1);
 		line[sizeof(line) - 1] = '\0';
 	} else {
-		switch (dmask) {
-		case IN_CLASSA_NET:
-			if ((i & IN_CLASSA_HOST) == 0) {
-				sprintf(line, "%lu", C(i >> 24));
-				break;
-			}
-			/* FALLTHROUGH */
-		case IN_CLASSB_NET:
-			if ((i & IN_CLASSB_HOST) == 0) {
-				sprintf(line, "%lu.%lu",
-					C(i >> 24), C(i >> 16));
-				break;
-			}
-			/* FALLTHROUGH */
-		case IN_CLASSC_NET:
-			if ((i & IN_CLASSC_HOST) == 0) {
-				sprintf(line, "%lu.%lu.%lu",
-					C(i >> 24), C(i >> 16), C(i >> 8));
-				break;
-			}
-			/* FALLTHROUGH */
-		default:
-			sprintf(line, "%lu.%lu.%lu.%lu",
-				C(i >> 24), C(i >> 16), C(i >> 8), C(i));
-			break;
-		}
+		inet_ntop(AF_INET, (char *)&in, line, sizeof(line) - 1);
 	}
 	domask(line + strlen(line), i, mask);
 	return (line);

Re: Networking FreeBSD Wiki

2007-02-09 Thread Bruce M. Simpson

Joel Dahl wrote:


How about moving stuff from the (outdated) dingo[*] project page to this
wiki page instead?

[*] http://www.freebsd.org/projects/dingo/
  


That's what he did. I feel a twinge of responsibility for this, and the 
stupid name. I have just totally steamrollered in and edited (merged 
some of my own tasks). I have a bunch of other stuff on my list which 
I'll add...!


BMS


Re: kern/106999: [netgraph] [patch] ng_ksocket fails to clear multicast flag on mbuf before passing to stack

2007-02-08 Thread Bruce M Simpson
Synopsis: [netgraph] [patch] ng_ksocket fails to clear multicast flag on mbuf 
before passing to stack

Responsible-Changed-From-To: freebsd-net-bms
Responsible-Changed-By: bms
Responsible-Changed-When: Fri Feb 9 02:39:10 UTC 2007
Responsible-Changed-Why: 
I'll take this

http://www.freebsd.org/cgi/query-pr.cgi?pr=106999


Re: [PATCH] tun(4) does not clean up after itself

2007-02-06 Thread Bruce M. Simpson
This change has now been committed on -CURRENT (reviewed by bz@) so it 
is now settling in.



Re: Proposal: remove encap from MROUTING

2007-02-06 Thread Bruce M. Simpson

I count no objections and +1 in favour from Andre.

To maintain POLA, I will decapitate (Argh, pun) it from HEAD with no MFC 
to begin with.


Arguments in favour:
* mrouted was removed from the base system.
* PIM does not use MROUTING's IPIP tunnels, and PIM is regarded as the 
standard these days for multicast routing.
* MROUTING's internal tunnels do not have any management capabilities; 
gif(4) has.

* It achieves some diff reduction with OpenBSD.
* Reduces locking/netisr fandango.
* The MROUTING paths could do with some overall cleanup anyway, and it 
doesn't seem appropriate to merge back such work, except as a backport.


I plan to commit this patch some time this week.

Regards,
BMS

Index: ip_mroute.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/ip_mroute.c,v
retrieving revision 1.122
diff -u -p -r1.122 ip_mroute.c
--- ip_mroute.c	6 Nov 2006 13:42:04 -0000	1.122
+++ ip_mroute.c	7 Feb 2007 02:57:07 -0000
@@ -181,33 +181,7 @@ static struct callout expire_upcalls_ch;
 static struct tbf tbftable[MAXVIFS];
 #define		TBF_REPROCESS	(hz / 100)	/* 100x / second */
 
-/*
- * 'Interfaces' associated with decapsulator (so we can tell
- * packets that went through it from ones that get reflected
- * by a broken gateway).  These interfaces are never linked into
- * the system ifnet list & no routes point to them.  I.e., packets
- * can't be sent this way.  They only exist as a placeholder for
- * multicast source verification.
- */
-static struct ifnet multicast_decap_if[MAXVIFS];
-
 #define ENCAP_TTL 64
-#define ENCAP_PROTO IPPROTO_IPIP	/* 4 */
-
-/* prototype IP hdr for encapsulated packets */
-static struct ip multicast_encap_iphdr = {
-#if BYTE_ORDER == LITTLE_ENDIAN
-	sizeof(struct ip) >> 2, IPVERSION,
-#else
-	IPVERSION, sizeof(struct ip) >> 2,
-#endif
-	0,				/* tos */
-	sizeof(struct ip),		/* total length */
-	0,				/* id */
-	0,				/* frag offset */
-	ENCAP_TTL, ENCAP_PROTO,
-	0,				/* checksum */
-};
 
 /*
  * Bandwidth meter variables and constants
@@ -287,14 +261,6 @@ static vifi_t reg_vif_num = VIFI_INVALID
  * Private variables.
  */
 static vifi_t	   numvifs;
-static const struct encaptab *encap_cookie;
-
-/*
- * one-back cache used by mroute_encapcheck to locate a tunnel's vif
- * given a datagram's src ip address.
- */
-static u_long last_encap_src;
-static struct vif *last_encap_vif;
 
 /*
  * Callout for queue processing.
@@ -325,7 +291,6 @@ static int set_assert(int);
 static void expire_upcalls(void *);
 static int ip_mdq(struct mbuf *, struct ifnet *, struct mfc *, vifi_t);
 static void phyint_send(struct ip *, struct vif *, struct mbuf *);
-static void encap_send(struct ip *, struct vif *, struct mbuf *);
 static void tbf_control(struct vif *, struct mbuf *, struct ip *, u_long);
 static void tbf_queue(struct vif *, struct mbuf *);
 static void tbf_process_q(struct vif *);
@@ -792,14 +757,6 @@ X_ip_mrouter_done(void)
     ip_mrouter = NULL;
     mrt_api_config = 0;
 
-    VIF_LOCK();
-    if (encap_cookie) {
-	const struct encaptab *c = encap_cookie;
-	encap_cookie = NULL;
-	encap_detach(c);
-    }
-    VIF_UNLOCK();
-
     callout_stop(&tbf_reprocess_ch);
 
     VIF_LOCK();
@@ -859,8 +816,6 @@ X_ip_mrouter_done(void)
     /*
      * Reset de-encapsulation cache
      */
-    last_encap_src = INADDR_ANY;
-    last_encap_vif = NULL;
 #ifdef PIM
     reg_vif_num = VIFI_INVALID;
 #endif
@@ -924,90 +879,6 @@ set_api_config(uint32_t *apival)
 }
 
 /*
- * Decide if a packet is from a tunnelled peer.
- * Return 0 if not, 64 if so.  XXX yuck.. 64 ???
- */
-static int
-mroute_encapcheck(const struct mbuf *m, int off, int proto, void *arg)
-{
-    struct ip *ip = mtod(m, struct ip *);
-    int hlen = ip->ip_hl << 2;
-
-    /*
-     * don't claim the packet if it's not to a multicast destination or if
-     * we don't have an encapsulating tunnel with the source.
-     * Note:  This code assumes that the remote site IP address
-     * uniquely identifies the tunnel (i.e., that this site has
-     * at most one tunnel with the remote site).
-     */
-    if (!IN_MULTICAST(ntohl(((struct ip *)((char *)ip+hlen))->ip_dst.s_addr)))
-	return 0;
-    if (ip->ip_src.s_addr != last_encap_src) {
-	struct vif *vifp = viftable;
-	struct vif *vife = vifp + numvifs;
-
-	last_encap_src = ip->ip_src.s_addr;
-	last_encap_vif = NULL;
-	for ( ; vifp < vife; ++vifp)
-	    if (vifp->v_rmt_addr.s_addr == ip->ip_src.s_addr) {
-		if ((vifp->v_flags & (VIFF_TUNNEL|VIFF_SRCRT)) == VIFF_TUNNEL)
-		    last_encap_vif = vifp;
-		break;
-	    }
-    }
-    if (last_encap_vif == NULL) {
-	last_encap_src = INADDR_ANY;
-	return 0;
-    }
-    return 64;
-}
-
-/*
- * De-encapsulate a packet and feed it back through ip input (this
- * routine is called whenever IP gets a packet that mroute_encap_func()
- * claimed).
- */
-static void
-mroute_encap_input(struct mbuf *m, int off)
-{
-    struct ip *ip = mtod(m, struct ip *);
-    int hlen = ip->ip_hl << 2;
-
-    if (hlen > sizeof(struct ip))
-	

Re: [PATCH] ip_fastfwd forwards directed broadcasts

2007-02-05 Thread Bruce M. Simpson

This has now been applied to -CURRENT after testing by a 3rd party.


Seeking to hear from people with broken IFF_ALLMULTI cards

2007-02-05 Thread Bruce M Simpson

Hi,

If any of you out there have network interfaces which have broken 
ALLMULTI handling (i.e. they can't handle multicast routing), I would 
love to hear from you.


Regards,
BMS


Re: ioctl: SIOCADDMULTI (howto?)

2007-02-05 Thread Bruce M. Simpson

Jouke Witteveen wrote:

Hello all,

I'm in need of some information on how to utilize SIOCADDMULTI. It is
supposed to be demonstrated by the mtest [1] program, but that doesn't
do anything (on an SIOCDELMULTI run it appears nothing was added:
ENOENT), at least not for the values I tested, 1.80.c2.0.0.1 in
particular. I presume it doesn't work because the program has not been
revised in 3 years and revision 1.4 notes that it might not work.
If this ioctl is deprecated then please tell me what is the best way
to receive multicast messages from the 01.80.c2.00.00.0x (802.1)
range? It is of course possible to go into ALLMULTI-mode and check on
all datagrams, but the NICs I use are fitted with a very nice
hardware filter (21143 chip) that should be able to do this more
effectively. Anyway, I believe Linux still programs the hardware
filter through SIOCADDMULTI so is a bit easier on this.
I tracked down the source from the ioctl call to the network driver
for some time now and could find no obvious fault, except for quite
much casting, and inconsistent use of types (checks happen on all
sorts of casts: socket, sokcet_dl, multiaddr, ...).

It's quite possible that path is broken, as hardly anyone else out there
needs to directly join a link-layer multicast group, and there is no
regression test for it.


The IP paths are known to work A-OK. If you didn't have code hooked up 
to ether_demux() to see this traffic, you'd never see it in userland anyway.


As such, it's not a priority for me to fix, but I will try to help anyway.

Are there specific performance constraints for your app? If not you 
should just be able to use pcap (or bpf) to get the traffic. Admittedly 
this is a performance hit, but with the optimization work on bpf and 
ever more powerful CPUs, this shouldn't be a big issue.


You can write a regression test for this though with getifmaddrs().

anglepoise:~/head/src/sys/net % s mtest
Password:
multicast membership test program; enter ? for list of commands
a fxp0 01.80.c2.00.00.02
ether address added

should yield the following output from route -nv monitor:

got message of size 128 on Mon Feb  5 21:23:57 2007
RTM_NEWMADDR: new multicast group membership on iface: len 128,
sockaddrs: IFP,IFA
fxp0:0.90.27.59.40.2c 1.80.c2.0.0.2

Of course, netstat -g won't show you this, because it's concerned with 
IP/IPv6 only.
netstat -ian should however tell you which link-layer multicast 
addresses are configured.


When I add an ethernet multicast address manually with mtest, I see 
vmstat -m | grep ether_multi increment as I'd expect.


It looks like there may be a missing piece somewhere. The code which I 
see is OK but the results aren't as I'd expect. I am quite tired at the 
moment so I may be way off.


Regards,
BMS



Proposal: remove encap from MROUTING

2007-02-02 Thread Bruce M Simpson
How would you all feel about removing the old encapsulation methods from 
IPv4 multicast routing as OpenBSD has done?


http://www.openbsd.org/cgi-bin/cvsweb/src/sys/netinet/ip_mroute.c.diff?r1=1.42&r2=1.43

The last time I deployed any such infrastructure, I had to use gif(4); 
in a NATted world, the encap stuff has never worked cleanly for me or 
been worth the additional effort in deployment.


Regards,
BMS


[PATCH] tun(4) does not clean up after itself

2007-02-02 Thread Bruce M Simpson

Hi,

I just saw this PR:
   http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/100080

This patch appears to fix the problem. Any obvious glaring errors?

Testers please?

Regards,
BMS
Index: if_tun.c
===================================================================
RCS file: /home/ncvs/src/sys/net/if_tun.c,v
retrieving revision 1.161
diff -u -p -r1.161 if_tun.c
--- if_tun.c	6 Nov 2006 13:42:02 -0000	1.161
+++ if_tun.c	2 Feb 2007 23:30:04 -0000
@@ -388,16 +388,21 @@ tunclose(struct cdev *dev, int foo, int 
 		splx(s);
 	}
 
+	/* Delete all addresses and routes which reference this interface. */
 	if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
 		struct ifaddr *ifa;
 
 		s = splimp();
-		/* find internet addresses and delete routes */
-		TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link)
-			if (ifa->ifa_addr->sa_family == AF_INET)
-				/* Unlocked read. */
+		TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) {
+			/* deal w/IPv4 PtP destination; unlocked read */
+			if (ifa->ifa_addr->sa_family == AF_INET) {
 				rtinit(ifa, (int)RTM_DELETE,
 				    tp->tun_flags & TUN_DSTADDR ? RTF_HOST : 0);
+			} else {
+				rtinit(ifa, (int)RTM_DELETE, 0);
+			}
+		}
+		if_purgeaddrs(ifp);
 		ifp->if_drv_flags &= ~IFF_DRV_RUNNING;
 		splx(s);
 	}
