Re: OFED stack, RDMA, ipoib help needed
At Tue, 8 May 2012 12:11:20 +0200, Gergely CZUCZY wrote:
>
> Hello,
>
> I'd like to ask a few questions in order to get some hardware to work
> that we've got recently.
>
> The hardware is the following:
> - 2x dual-port Mellanox ConnectX-3 VPI cards, with 56Gbps ports
> - 4 computing modules with a single-port Mellanox MT27500-family
>   ConnectX-3 port.
>
> The 2 dual-port cards are in a storage box, and the 4 single-port ones
> are integrated on blade-like computing nodes (4 boxes in 2U). The
> storage is running FreeBSD 9-STABLE, 2012-05-07 cvsup, and the
> computing nodes are running Linux.
>
> So far we have been able to bring up the subnet manager on the FreeBSD
> node, and one of the links got into Active state, which is quite good.
> We have been able to ibping between the nodes. The FreeBSD kernel
> config, in addition to GENERIC, is the following:
>
> options OFED
> options SDP
> device ipoib
> options IPOIB_CM
> device mlx4ib
> device mthca
> device mlxen
>
> Right now we're having problems with the following issues:
>
> 1) We assigned IP addresses to both ib interfaces (FreeBSD and Linux
> side), but weren't able to ping over IP. We've seen ICMP echo requests
> leaving the Linux box, but haven't seen any incoming traffic on the
> FreeBSD one. On the FreeBSD side, we had several issues:
> - no incoming packets seen by tcpdump on the ib interface
> - when trying to ping the other side, we got "no route to host",
>   but the routing entry existed in the routing table.
> - we had a few of these messages in our messages file: "ib2: timing
>   out; 0 sends N recieves not completed", where N started at 22,34
>   and was growing.
>

Have you looked at your arp tables? (arp -a)

Do you have any messages in dmesg on the FreeBSD side?

Can you show us the output of ifconfig on the FreeBSD side?

> 2) We're unable to find any resources on how to do RDMA on the FreeBSD
> side. We'd like to use SRP (SCSI RDMA Protocol) communication, and/or
> NFS-over-RDMA for our storage link between the boxes. Where could we
> find any info on this?

Sorry but I can't help you with this one.

> 3) Enabling connected mode, we weren't able to find a way to specify or
> query the port that connected mode is using. Could someone please point
> us in the right direction regarding this minor issue?

This ought to work in FreeBSD as it does in Linux, but I've not
personally tried it.

> 4) We were also unable to find how to switch these dual-personality
> cards between InfiniBand and Ethernet modes. Could we also get some
> pointers regarding this please?
>

It usually depends on what cable you're using, what it's plugged into,
and what driver you bring up. The mlx4 driver should be able to give
you an Ethernet device with the ConnectX-3 cards.

> Basically any help would be welcome which could help making InfiniBand
> work.
>
> As a side question, I've seen a commit for OFED in HEAD by jhb, fixing
> a few things, may I ask when that will get MFC'd to RELENG-9?
>

This I don't know about.

Best,
George
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
Interface MTU question...
Howdy,

Does anyone know the reason for this particular check in ip_output.c?

    if (rte != NULL && (rte->rt_flags & (RTF_UP|RTF_HOST))) {
        /*
         * This case can happen if the user changed the MTU
         * of an interface after enabling IP on it. Because
         * most netifs don't keep track of routes pointing to
         * them, there is no way for one to update all its
         * routes when the MTU is changed.
         */
        if (rte->rt_rmx.rmx_mtu > ifp->if_mtu)
            rte->rt_rmx.rmx_mtu = ifp->if_mtu;
        mtu = rte->rt_rmx.rmx_mtu;
    } else {
        mtu = ifp->if_mtu;
    }

To my mind the > ought to be != so that any change, up or down, of the
interface MTU is eventually reflected in the route. Also, this code
does not check whether the route is both a HOST route and UP, only
whether it is one or the other, so don't be fooled by that: this check
happens for any route we have if it's up.

My proposed change is this:

Index: ip_output.c
===
--- ip_output.c (revision 225561)
+++ ip_output.c (working copy)
@@ -320,7 +320,7 @@
          * them, there is no way for one to update all its
          * routes when the MTU is changed.
          */
-        if (rte->rt_rmx.rmx_mtu > ifp->if_mtu)
+        if (rte->rt_rmx.rmx_mtu != ifp->if_mtu)
             rte->rt_rmx.rmx_mtu = ifp->if_mtu;
         mtu = rte->rt_rmx.rmx_mtu;
     } else {

Please let me know what y'all think.

Best,
George
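A minimal sketch of the two points raised in the message above, using toy structures rather than the kernel's own. The struct names, helper names, and flag values here are assumptions made for illustration; the real definitions live in the kernel headers. It contrasts an "either flag" test with a "both flags" test, and a clamp-down-only MTU update with one that tracks the interface in both directions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag values for this sketch only. */
#define RTF_UP   0x1
#define RTF_HOST 0x4

/* Hypothetical stand-ins for a route entry and an interface. */
struct toy_route { int flags; uint32_t mtu; };
struct toy_ifnet { uint32_t mtu; };

/* True if EITHER flag is set, which is what the quoted test does. */
static int either_flag(int flags)
{
    return (flags & (RTF_UP | RTF_HOST)) != 0;
}

/* True only if BOTH flags are set. */
static int both_flags(int flags)
{
    return (flags & (RTF_UP | RTF_HOST)) == (RTF_UP | RTF_HOST);
}

/* Existing behaviour: the route MTU is only ever lowered. */
static uint32_t mtu_clamp_down(struct toy_route *rt, const struct toy_ifnet *ifp)
{
    assert(rt != NULL && ifp != NULL);
    if (rt->mtu > ifp->mtu)
        rt->mtu = ifp->mtu;
    return rt->mtu;
}

/* Proposed behaviour: the route MTU follows the interface up or down. */
static uint32_t mtu_track(struct toy_route *rt, const struct toy_ifnet *ifp)
{
    assert(rt != NULL && ifp != NULL);
    if (rt->mtu != ifp->mtu)
        rt->mtu = ifp->mtu;
    return rt->mtu;
}
```

With a route MTU of 1500 and an interface raised to 9000, the first helper leaves the route at 1500 while the second moves it to 9000.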
Re: Proposal for changes to network device drivers and network stack (RFC)
At Fri, 7 Sep 2012 01:28:16 -0700, Anuranjan Shukla wrote:
>
> >> struct socket {
> >>
> >>    int so_fibnum; /* routing domain for this socket */
> >>    uint32_t so_user_cookie;
> >> +  u_int so_oqueue; /* manage send prioritizing based on
> >>                        application needs */
> >> +  u_short so_lrid; /* logical routing */
> >> };
> >>
> >
> >I'd be interested to know how this is used.
>
> We use the first one as a 'direction' to the forwarding path to select an
> appropriate priority queue to send the packet on. In a generic (i.e.
> Something other than our specific system) system, one could consider
> interesting ways to use queues on a multi queue NIC with help from a
> driver. The second one is for a system with logical routing capabilities
> (multiple routing systems within the same chassis). It gives an
> application opening a socket an option to select the specific logical
> routing instance.

OK, that's what I guessed but thanks for confirming it.

> I'll provide smaller pieces of diffs for the kernel without networking
> patch I'd sent out. Let me know if you prefer the device driver interface
> to be in that too.

Yes, please.

Best,
George
Re: kern/123758: [panic] panic while restarting net/freenet6
Synopsis: [panic] panic while restarting net/freenet6

Responsible-Changed-From-To: gnn->n...@freebsd.org
Responsible-Changed-By: gnn
Responsible-Changed-When: Tue Jun 15 17:13:33 UTC 2010
Responsible-Changed-Why:

http://www.freebsd.org/cgi/query-pr.cgi?pr=123758
Re: kern/123758: [panic] panic while restarting net/freenet6
Synopsis: [panic] panic while restarting net/freenet6

Responsible-Changed-From-To: n...@freebsd.org->freebsd-net
Responsible-Changed-By: gnn
Responsible-Changed-When: Tue Jun 15 17:14:53 UTC 2010
Responsible-Changed-Why:
Give this one back.

http://www.freebsd.org/cgi/query-pr.cgi?pr=123758
Re: kern/86427: [lor] Deadlock with FASTIPSEC and nat
Synopsis: [lor] Deadlock with FASTIPSEC and nat

Responsible-Changed-From-To: gnn->freebsd-net
Responsible-Changed-By: gnn
Responsible-Changed-When: Tue Jun 15 17:18:21 UTC 2010
Responsible-Changed-Why:
I believe this is fixed but others can comment on it at will.

http://www.freebsd.org/cgi/query-pr.cgi?pr=86427
Re: kern/81095: IPsec connection stops working if associated network interface goes down and then up again.
Synopsis: IPsec connection stops working if associated network interface goes down and then up again.

Responsible-Changed-From-To: gnn->freebsd-net
Responsible-Changed-By: gnn
Responsible-Changed-When: Tue Jun 15 17:34:03 UTC 2010
Responsible-Changed-Why:
This is probably no longer valid given the changes in our IPSec stack
over the last 4 years. People are welcome to retest/resubmit.

http://www.freebsd.org/cgi/query-pr.cgi?pr=81095
Re: kern/78968: FreeBSD freezes on mbufs exhaustion (network interface independent)
Synopsis: FreeBSD freezes on mbufs exhaustion (network interface independent)

Responsible-Changed-From-To: gnn->freebsd-net
Responsible-Changed-By: gnn
Responsible-Changed-When: Tue Jun 15 17:35:12 UTC 2010
Responsible-Changed-Why:
5.3 bug, probably no longer relevant.

http://www.freebsd.org/cgi/query-pr.cgi?pr=78968
Re: kern/65616: IPSEC can't detunnel GRE packets after real ESP encryption
Synopsis: IPSEC can't detunnel GRE packets after real ESP encryption

Responsible-Changed-From-To: gnn->freebsd-net
Responsible-Changed-By: gnn
Responsible-Changed-When: Tue Jun 15 17:47:06 UTC 2010
Responsible-Changed-Why:
This is likely stale.

http://www.freebsd.org/cgi/query-pr.cgi?pr=65616
Re: kern/56233: IPsec tunnel (ESP) over IPv6: MTU computation is wrong
Synopsis: IPsec tunnel (ESP) over IPv6: MTU computation is wrong

Responsible-Changed-From-To: gnn->freebsd-net
Responsible-Changed-By: gnn
Responsible-Changed-When: Tue Jun 15 17:47:41 UTC 2010
Responsible-Changed-Why:
I'm not working on IPSec at the moment, handing this one back.

http://www.freebsd.org/cgi/query-pr.cgi?pr=56233
Re: Panic on boot with em1 attached
Hi,

Can you try this with fastforwarding off? It looks like a double free
somewhere in the ip_fastforward() routine. Someone frees m but does not
NULL it out, and at the drop: label the mbuf m is valid but the data
within it has already been freed.

Knowing whether this is related only to the fast forwarding case will
help.

Best,
George
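The "frees m but does not NULL it out" pattern described above can be sketched in plain C. This is a generic illustration under stated assumptions, not the ip_fastforward() code or its eventual fix: `struct toy_buf`, `toy_alloc()` and `toy_free()` are hypothetical stand-ins for an mbuf and its allocator. Clearing the caller's pointer on free lets a shared cleanup path tell "already released" apart from "still owned".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an mbuf; not the kernel structure. */
struct toy_buf {
    char *data;
};

/* Allocate a buffer with len bytes of payload, or return NULL. */
static struct toy_buf *toy_alloc(size_t len)
{
    struct toy_buf *b = malloc(sizeof(*b));

    if (b == NULL)
        return NULL;
    b->data = malloc(len);
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    return b;
}

/*
 * Free the buffer and clear the caller's pointer, so that a later
 * cleanup label sees NULL instead of a stale address.
 */
static void toy_free(struct toy_buf **bp)
{
    assert(bp != NULL);
    if (*bp == NULL)
        return;             /* already released: nothing to do */
    free((*bp)->data);
    free(*bp);
    *bp = NULL;
}
```

Calling `toy_free(&m)` twice is then harmless, because the second call finds `m == NULL` and returns.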
Re: Panic on boot with em1 attached
At Tue, 23 Dec 2008 13:57:39 +0200, Vladimir V. Kobal wrote:
>
> With fastforwarding off the system works well and boots without panicking.
>

OK, that narrows it down. Are you using any filtering such as PF, ipfw,
etc.?

Best,
George
A new tool for low level testing...
Hi,

I just checked in a small tool to HEAD in
/usr/src/tools/tools/ether_reflect which uses pcap and bpf to reflect
Ethernet packets just above the driver layer without involving the
protocol stacks. This is useful for people doing low level testing of
drivers and switches. If you happen to be lucky enough to have an
Ethernet packet generator (ixia et al) this will do what you want in
terms of reflecting the packets back.

Later,
George
Re: Panic on boot with em1 attached
At Tue, 23 Dec 2008 22:49:24 +0200, Vladimir V. Kobal wrote:
>
> We are using pf+ALTQ for shaping and ipfw for filtering, diverting into
> netgraph nodes, attaching altq queues.
>

OK, that also makes sense given what I saw in the code. Can you explain
your entire setup? That is, which filters, which interfaces, what bits
of netgraph etc.

Best,
George
Re: A new tool for low level testing...
At Tue, 23 Dec 2008 13:00:12 -0800, julian wrote:
>
> g...@freebsd.org wrote:
> > Hi,
> >
> > I just checked in a small tool to HEAD in
> > /usr/src/tools/tools/ether_reflect which uses pcap and bpf to reflect
> > Ethernet packets just above the driver layer without involving the
> > protocol stacks. This is useful for people doing low level testing of
> > drivers and switches. If you happen to be lucky enough to have an
> > Ethernet packet generator (ixia et al) this will do what you want in
> > terms of reflecting the packets back.
> >
> > Later,
> > George
>
> OR
>
> ngctl mkpeer em0: echo lower echo
>
> hm no this would leave the source and destination headers in the
> same order.. they need to be swapped..
>
> ok so I need to make a patch, but it would be much quicker than a user
> utility..

I agree that netgraph is the right long term answer. I look forward to
what you come up with. Also, +1 to an improved set of docs on netgraph.

Later,
George
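The point made above, that a reflected frame needs its source and destination addresses swapped, can be sketched as follows. This is an illustration only, assuming a plain untagged Ethernet header; `struct toy_ether_header` and `toy_ether_reflect()` are hypothetical names and are not taken from ether_reflect or from any netgraph node.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETHER_ADDR_LEN 6

/* Minimal Ethernet header layout: destination, source, EtherType. */
struct toy_ether_header {
    uint8_t  dhost[TOY_ETHER_ADDR_LEN];
    uint8_t  shost[TOY_ETHER_ADDR_LEN];
    uint16_t type;
};

/* Swap destination and source so the frame heads back to its sender. */
static void toy_ether_reflect(struct toy_ether_header *eh)
{
    uint8_t tmp[TOY_ETHER_ADDR_LEN];

    assert(eh != NULL);
    memcpy(tmp, eh->dhost, sizeof(tmp));
    memcpy(eh->dhost, eh->shost, sizeof(tmp));
    memcpy(eh->shost, tmp, sizeof(tmp));
    /* The EtherType and payload are left untouched. */
}
```

Echoing the frame without this swap would send it out with the local address still in the destination field, which is the problem noted in the quoted reply.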
Re: freebsd 7.0-RELEASE BUG ping: sendto: No buffer space available
At Sat, 24 Jan 2009 16:20:06 +, Rui Paulo wrote: > > > On 24 Jan 2009, at 12:54, Yony Yossef wrote: > > > Hi All, > > > > I'm facing a temporary network hang on my interfaces following a flood > > ping/stress udp test. > > > > I'm running a netperf UDP test which is giving results but does not > > return > > to the shell. > > client output: > > > > UDP UNIDIRECTIONAL SEND TEST from fe80::202:c9ff:fe02:e1fe%mtnic0 > > (fe80::202:c9ff:fe02:e1fe) port 0 AF_INET6 to > > fe80::202:c9ff:fe02:e1f4%mt > > nic0 (fe80::202:c9ff:fe02:e1f4) port 0 AF_INET6 > > Socket Message Elapsed Messages > > SizeSize Time Okay Errors Throughput > > bytes bytessecs# # 10^6bits/sec > > > > 327681472 10.02 547428 1694280 643.60 > > 32768 10.02 25089 29.50 > > > > > > (HANG) > > > > After a minute or two it returns to the shell with the following > > message: > > shutdown_control: no response received errno 55 > > > > 20 minutes later (!!) the interface is working again. > > > > netstat -m and vmstat -z outputs during the hang time: > > > > # netstat -m > > 25687/6578/32265 mbufs in use (current/cache/total) > > 17404/2438/19842/65536 mbuf clusters in use (current/cache/total/max) > > 0/1024 mbuf+clusters out of packet secondary zone in use (current/ > > cache) > > 2071/1369/3440/65536 4k (page size) jumbo clusters in use > > (current/cache/total/max) > > 0/0/0/65536 9k jumbo clusters in use (current/cache/total/max) > > 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max) > > 49513K/11996K/61510K bytes allocated to network (current/cache/total) > > 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters) > > 0/0/0 requests for jumbo clusters denied (4k/9k/16k) > > 0/0/0 sfbufs in use (current/peak/max) > > 0 requests for sfbufs denied > > 0 requests for sfbufs delayed > > 0 requests for I/O initiated by sendfile > > 0 calls to protocol drain routines > > I think there are too many mbufs in use. You're probably facing an > mbuf leakage and that causes an interface hang. 
>

If this is a large memory machine try upping the number of clusters and
mbufs. On 64 bit systems with large memories 1,000,000 mbufs is not
unheard of.

kern.ipc.nmbclusters: 100

Also, with UDP you can easily overrun different buffers within the
system. You might also look at:

netstat -id

and see if the driver is dropping packets, and if so you might up its
send queue.

Best,
George
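When judging whether the cluster limit needs raising, it can help to turn a `netstat -m` line into a utilisation figure. The sketch below is an assumption-laden illustration: it only handles lines of the form "current/cache/total/max ..." as seen in the quoted output, and `toy_parse_clusters()` and `toy_percent_used()` are hypothetical helpers, not part of any FreeBSD tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* The four counters printed as current/cache/total/max. */
struct toy_cluster_usage {
    unsigned long current, cache, total, max;
};

/* Parse a "current/cache/total/max ..." line; 0 on success, -1 otherwise. */
static int toy_parse_clusters(const char *line, struct toy_cluster_usage *u)
{
    assert(line != NULL && u != NULL);
    if (sscanf(line, "%lu/%lu/%lu/%lu",
        &u->current, &u->cache, &u->total, &u->max) != 4)
        return -1;
    return 0;
}

/* Whole-number percentage of the configured maximum that is allocated. */
static unsigned long toy_percent_used(const struct toy_cluster_usage *u)
{
    assert(u != NULL);
    if (u->max == 0)
        return 0;
    return u->total * 100 / u->max;
}
```

Fed the quoted line "17404/2438/19842/65536 mbuf clusters in use", this yields 30, so on those numbers alone the cluster pool was not exhausted at the moment the sample was taken.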
Canonical Packet Traces?
Howdy,

A very slightly off topic question for [EMAIL PROTECTED]

Does anyone know of a web site that collects and indexes canonical
packet traces for network protocols? I'm looking for a good storehouse
of traces to use in testing.

Best,
George
Re: Canonical Packet Traces?
Thanks to all who responded. I'll check out the links.

Best,
George
Re: [EMAIL PROTECTED]: Re: rtfree: 0xffffff00036fb1e0 has 1 refs]
At Wed, 29 Aug 2007 08:24:58 +0100, Bruce M. Simpson wrote:
>
> BTW: Casual inspection with kscope suggests there is a similar
> free-while-locked issue in nd6_ns_input() (netinet6/nd6_nbr.c) and
> in_arpinput() (netinet/if_ether.c).
>
> nd6_ns_input() references rt->rt_gateway after rtfree(), a potential
> race not to mention a use-after-free.
>
> I haven't checked Coverity for this, but it just doesn't look right.

At least in the ND6 case I think that the correct logic is:

//depot/user/gnn/ipsec_seven/src/sys/netinet6/nd6_nbr.c#1 - /sources/p4/user/gnn/ipsec_seven/src/sys/netinet6/nd6_nbr.c
@@ -215,8 +215,6 @@
             rt = rtalloc1((struct sockaddr *)&tsin6, 0, 0);
             need_proxy = (rt && (rt->rt_flags & RTF_ANNOUNCE) != 0 &&
                 rt->rt_gateway->sa_family == AF_LINK);
-            if (rt)
-                rtfree(rt);
             if (need_proxy) {
                 /*
                  * proxy NDP for single entry
@@ -228,6 +226,9 @@
                     proxydl = SDL(rt->rt_gateway);
                 }
             }
+            if (!need_proxy || ifa == NULL)
+                if (rt)
+                    rtfree(rt);
         }
         if (ifa == NULL) {
             /*

Thoughts?

Best,
George
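The ordering problem under discussion, reading rt->rt_gateway after rtfree(), can be sketched with a toy reference-counted object. This is a simplified model under stated assumptions: `struct toy_rt`, `toy_rtalloc()` and `toy_rtfree()` are hypothetical and leave out the locking the kernel routines perform. It shows only the rule that the last use must come before the release.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reference-counted route; not the kernel's struct rtentry. */
struct toy_rt {
    int refs;
    int gateway_family;
};

/* Allocate a route holding one reference, or return NULL. */
static struct toy_rt *toy_rtalloc(int family)
{
    struct toy_rt *rt = malloc(sizeof(*rt));

    if (rt != NULL) {
        rt->refs = 1;
        rt->gateway_family = family;
    }
    return rt;
}

/* Drop one reference, freeing on the last; returns the remaining count. */
static int toy_rtfree(struct toy_rt *rt)
{
    assert(rt != NULL && rt->refs > 0);
    if (--rt->refs == 0) {
        free(rt);
        return 0;
    }
    return rt->refs;
}

/* Read what is needed first, release afterwards. */
static int toy_lookup_family(int family)
{
    struct toy_rt *rt = toy_rtalloc(family);
    int result;

    if (rt == NULL)
        return -1;
    result = rt->gateway_family;    /* last use of rt */
    toy_rtfree(rt);                 /* release only after the last use */
    return result;
}
```

Swapping the last two statements of `toy_lookup_family()` would reproduce the use-after-free in miniature.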
Re: RFC: Evolution of the em driver
At Mon, 29 Oct 2007 10:45:17 -0700, Jack Vogel wrote: > > I have an important decision to make and I thought rather than just make > it and spring it on you I'd present the issues and see what opinions were. > > Our newer hardware uses new features that, more and more, require > parallel code paths in the driver. For instance, the 82575 (Zoar) uses > what are called 'advanced descriptors', this means different TX path. > The 7.0 em driver has this support in it, it just uses a function pointer > to handle it. > > When I add in multiqueue/RSS support it will add even more code > that functions this way. > > What the Linux team did was to split the newer code into a standalone > driver, they call it 'igb'. I had originally resisted doing this, but with > the development I have been working on the past month I am starting > to wonder if it might not be best to follow them. > > I see 3 possibilities and I'd like feedback, which would you prefer if > you have a preference and why. > > First, keep the driver as is and just live with multiple code paths > and features, possibly #ifdef'ed as they appear. > > Second, split the driver as Linux has into em and igb. The added > question then is how to split it, Linux made the line the use of > advanced descriptors, so Zoar and after, but I could also see a > case for having everything PCI-E/MSI capable being in the new > driver. > > Third, sort of a half-way approach, split up code but not the > driver, in other words offer different source files that can be > compiled into the driver, so you could have the one big jumbo > driver with all in there, or one that will only work with a subset > of adapters. This one would probably be the most work, because > its a new approach. As you're the main maintainer it's your choice. Whatever is easiest for you and gives us the most readable code. 
Best,
George
Re: dup code in in6.c
At Fri, 30 Nov 2007 17:00:25 -0800, julian wrote:
>
> The following diff removes some (what looks to me to be) duplicate code.
>
> Anyone care to comment before I commit it?
>
> (I'm trying to imagine a case where it does something useful to do this
> twice but not really succeeding).
>

It's a duplicate, the diff is fine.

Best,
George
Re: resend: multiple routing table roadmap (format fix)
At Wed, 26 Dec 2007 16:26:11 -0800, julian wrote: > > Resending as my mailer made a dog's breakfast of the first one > with all sorts of wierd line breaks... hopefully this will be better. > (I haven't sent it yet so I'm hoping).. > > > --- > > > > On thing where FreeBSD has been falling behind, and which by chance > I have some time to work on is "policy based routing", which allows > different packet streams to be routed by more than just the > destination address. > > Constraints: > > > I want to make some form of this available in the 6.x tree > (and by extension 7.x) , but FreeBSD in general needs it so I might as > well > do it in -current and back port the portions I need. > > One of the ways that this can be done is to have the ability to > instantiate multiple kernel routing tables (which I will now > refer to as "Forwarding Information Bases" or "FIBs" for political > correctness reasons. Which FIB a particular packet uses to make > the next hop decision can be decided by a number of mechanisms. > The policies these mechanisms implement are the "Policies" referred > to in "Policy based routing". > > One of the constraints I have if I try to back port this work to > 6.x is that it must be implemented as a EXTENSION to the existing > ABIs in 6.x so that third party applications do not need to be > recompiled in timespan of the branch. > > Implementation method, (part 1) > --- > For this reason I have implemented a "sufficient subset" of a > multiple routing table solution in Perforce, and back-ported it > to 6.x. (also in Perforce though not yet caught up with what I > have done in -current/P4). The subset allows a number of FIBs > to be defined at compile time (sufficient for my purposes in 6.x) and > implements the changes needed to allow IPV4 to use them. I have not done > the changes for ipv6 simply because I do not need it, and I do not > have enough knowledge of ipv6 (e.g. neighbor discovery) needed to do it. 
> > Other protocol families are left untouched and should there be > users with proprietary protocol families, they should continue to work > and be oblivious to the existence of the extra FIBs. > > To understand how this is done, one must know that the current FIB > code starts everything off with a single dimensional array of > pointers to FIB head structures (One per protocol family), each of > which in turn points to the trie of routes available to that family. > > The basic change in the ABI compatible version of the change is to > extent that array to be a 2 dimensional array, so that > instead of protocol family X looking at rt_tables[X] for the > table it needs, it looks at rt_tables[Y][X] when for all > protocol families except ipv4 Y is always 0. > Code that is unaware of the change always just sees the first row > of the table, which of course looks just like the one dimensional > array that existed before. > > > The entry points rtrequest(), rtalloc(), rtalloc1(), rtalloc_ign() > are all maintained, but refer only to the first row of the array, > so that existing callers in proprietary protocols can continue to > do the "right thing". > Some new entry points are added, for the exclusive use of ipv4 code > called in_rtrequest(), in_rtalloc(), in_rtalloc1() and in_rtalloc_ign(), > which have an extra argument which refers the code to the correct row. > > In addition, there are some new entry points (currently called > dom_rtalloc() and friends) that check the Address family being > looked up and call either rtalloc() (and friends) if the protocol > is not IPv4 forcing the action to row 0 or to the appropriate row > if it IS IPv4 (and that info is available). These are for calling > from code that is not specific to any particular protocol. The way > these are implemented would change in the non ABI preserving code > to be added later. 
> > One feature of the first version of the code is that for ipv4, > the interface routes show up automatically on all the FIBs, so > that no matter what FIB you select you always have the basic > direct attached hosts available to you. (rtinit() does this > automatically). > You CAN delete an interface route from one FIB should you want > to but by default it's there. ARP information is also available > in each FIB. It's assumed that the same machine would have the > same MAC address, regardless of which FIB you are using to get > to it. > > > This brings us as to how the correct FIB is selected for an outgoing > IPV4 packet. > > Packets fall into one of a number of classes. > 1/ locally generated packets, coming from a socket/PCB. > Such packets select a FIB from a number associated with the > socket/PCB. This in turn is inherited from the process, > but can be changed by a socket option. The process in turn > inherits it on fork. I have written a utility call setfib > that acts a bit like nice.. > > setfib -n 3 ping
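The table layout described in the roadmap above, a one-dimensional array of per-family heads extended to rt_tables[fib][family], can be sketched as follows. All names and sizes here (`TOY_NUM_FIBS`, `TOY_AF_MAX`, `TOY_AF_INET`, `toy_table()`, `toy_table_fib()`) are assumptions made for illustration and are not the entry points named in the proposal. The sketch shows only that legacy lookups always land in row 0, and that only IPv4 may select another row.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NUM_FIBS 4  /* hypothetical compile-time number of FIBs */
#define TOY_AF_MAX   4  /* hypothetical number of protocol families */
#define TOY_AF_INET  2  /* hypothetical index used for IPv4 */

/* Placeholder for a per-family routing table head. */
struct toy_fib_head {
    int fib;
    int af;
};

/* toy_rt_tables[fib][af]: row 0 is what FIB-unaware code sees. */
static struct toy_fib_head toy_rt_tables[TOY_NUM_FIBS][TOY_AF_MAX];

/* Label every slot with its own coordinates. */
static void toy_tables_init(void)
{
    int fib, af;

    for (fib = 0; fib < TOY_NUM_FIBS; fib++) {
        for (af = 0; af < TOY_AF_MAX; af++) {
            toy_rt_tables[fib][af].fib = fib;
            toy_rt_tables[fib][af].af = af;
        }
    }
}

/* Legacy-style lookup: always row 0. */
static struct toy_fib_head *toy_table(int af)
{
    assert(af >= 0 && af < TOY_AF_MAX);
    return &toy_rt_tables[0][af];
}

/* FIB-aware lookup: only IPv4 may use a row other than 0. */
static struct toy_fib_head *toy_table_fib(int af, int fib)
{
    assert(af >= 0 && af < TOY_AF_MAX);
    assert(fib >= 0 && fib < TOY_NUM_FIBS);
    if (af != TOY_AF_INET)
        fib = 0;
    return &toy_rt_tables[fib][af];
}
```

Because the legacy lookup and a FIB-aware lookup with fib 0 return the same slot, code that predates the change keeps seeing the table it always saw.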
Re: resend: multiple routing table roadmap (format fix)
At Fri, 28 Dec 2007 20:40:30 +0100, Marko Zec wrote:
> The thrust behind Julian's work seems to be providing multiple
> forwarding tables for purposes of traffic engineering / policy
> based routing, with a single firewall instance used as a classifier.
> vimage-style network stack virtualization provides for more strict
> isolation on both port and IP address space, independent firewall
> instances, IPSEC config / state etc., and as such might be better
> suited for providing enhanced jail-style virtual hosting environments,
> as well as for providing virtual router "slices".
>
> So once we get Julian's multi-FIB stuff in the base system, I see no
> reason why we couldn't have this functionality replicated in
> each "vimage" instance, i.e. have multiple independent virtual
> networking environments, each with multiple FIBs.
>
> Implementationwise, my hacks currently rely on macros for conditional
> virtualization of global variables / structs. As long as Julian's
> changes continue to be unconditional, i.e. without playing a similar
> macroization game, I think integrating this code (once it hits HEAD)
> into p4/projects/vimage should be more or less a straightforward job.

Cool, that's what I wanted to hear.

Best,
George
Re: Network device driver KPI/ABI and TOE
At Sun, 6 Jan 2008 13:47:24 + (GMT), rwatson wrote:
>
> There's also the opportunity to think about whether it's possible to
> harden things in such a way as to not give up our flexibility to
> keep maintaining and improving TCP (and other related subsystems),
> yet improving the quality of life for a third party TOE driver
> maintainer. For example, might we provide accessor routines for
> certain data structures, or attempt to structure things to hide more
> of TCP locking from a TOE implementation? Should we suggest that
> non-native TOE implementations rely less on our TCP code and provide
> their own where the hardware doesn't provide a complete
> implementation, in order to avoid building dependency on things that
> we know will change?
>

Given the intimacy that I just perused in the code (basically, the
driver knows a lot about internal TCP data structures), I think we need
to think about a kernel KPI just for these things. I'm not very happy
that there are things like cxgb_tcp_ctlinput(), although I do know that
cleaning that kind of thing up and making a better KPI will be hard.

Best,
George
Are there known issues with multicast on Intel Pro 1000?
Howdy,

At my current gig we find that the network interface locks up if we
subject it to a high rate of multicast traffic. Since the whole
purpose of this box is to do multicast (it absorbs a feed of data over
multicast, manipulates it, and then sends it out again over multicast)
it's a "bad thing" if this kind of thing does not work.

What I currently know is not complete but I figured I could start
here.

The symptom is that all network communication stops, but the system
itself is still responsive, so I can get to the console and get
information.

Release: 6.2 and 6.3-PRERELEASE (6.3 as of Wed Jan 16th)
Motherboard:
CPU: 2 x Intel Xeon X5365 3GHz (4 cores each)
Memory: 8G

em0: Intel PRO/1000 6.7.3 port 0x2000-0x201f mem 0xd832-0xd833
em1: Intel PRO/1000 6.7.3 port 0x2020-0x203f mem 0xd832-0xd833
em2: Intel PRO/1000 6.7.3 port 0x3000-0x303f mem 0xd824-0xd825, 0xd820-0xd823
em3: Intel PRO/1000 6.7.3 port 0x3040-0x307f mem 0xd826-0xd827

Other data:

em2 is the interface that multicasts out our digested data and it also
is receiving a lot of digested multicast traffic, which is being
recorded by a proprietary program.

sysctl dev.em.2.debug=1
em2: CTRL = 0x487c0a01 RCTL=0x8002
em2: Packet buffer = Tx=16k Rx=48k
em2: fifo workaround = 0, fifo_reset_count = 0
em2: hw tdh = 76, hw tdt = 76
em2: hw rdh = 213, hw rdt = 212
em2: Num Tx descriptors avail = 256
em2: Tx Descriptors not avail1 = 0
em2: Tx Descriptors not avail2 = 0
em2: Std mbuf failed = 0
em2: Std mbuf cluster failed = 1247383 (this number is increasing by about 1 a second)
em2: Driver dropped packets = 0
em2: Driver tx dma failure in encap = 0

sysctl dev.em.2.stats=1 (all are zero except what is recorded)
em2: Missed Packets = 4683
em2: Receive No Buffers = 46905
em2: RX overruns = 83
em2: Good Packets Rcvd = 11416687
em2: Good Packets Xmtd = 146576

em0 is the interface we receive the raw data over multicast on.

em0: hw tdh = 130, hw tdt = 130
em0: hw rdh = 13, hw rdt = 12
em0: Num Tx descriptors avail = 256
em0: Std mbuf cluster failed = 5111461 (this number is going up by about 1 a second)

sysctl dev.em.0.stats=1 (all are zero except what is recorded)
em0: Missed Packets = 292778
em0: Receive No Buffers = 96211
em0: RX overruns = 1092
em0: Good Packets Rcvd = 5386001
em0: Good Packets Xmtd = 12418

em3 receives a little data from multicast and it is recorded using a
proprietary program.

em3: hw tdh = 45, hw tdt = 45
em3: hw rdh = 216, hw rdt = 215
em3: Num Tx descriptors avail = 256
em3: Std mbuf cluster failed = 195951 (also going up by 1 very slowly)

sysctl dev.em.3.stats=1 (all are zero except what is recorded)
em3: Good Packets Rcvd = 9637851
em3: Good Packets Xmtd = 8237

One odd thing is that when the system boots, em1, which is unused in
this case, complains of:

em1: Using MSI interrupt
em1: Setup of Shared code failed

What more do people need to help debug this?

Best,
George
Re: Are there known issues with multicast on Intel Pro 1000?
At Thu, 17 Jan 2008 15:06:19 -0500, randall wrote:
>
> [EMAIL PROTECTED] wrote:
> > Howdy,
> >
> > At my current gig we find that the network interface locks up if we
> > subject it to a high rate of multicast traffic. Since the whole
> > purpose of this box is to do multicast (it absorbs a feed of data over
> > multicast manipulates and then sends it out again over multicast) it's
> > a "bad thing" if this kind of thing does not work.
> >
> > What I currently know is not complete but I figured I could start
> > here.
> >
> > The symptom is that all network communication stops, but the system
> > itself is still responsive, so I can get to the console and get
> > information.
>
> If you let it run long enough does it eventually lock up?
>
> I have seen similar behavior when a lock is not released when
> I was breaking things :-)
>
> Everything is fine EXCEPT the interface.. for a while.. then
> eventually you get a train-wreck :-)
>
> I would drop to ddb and do the show locks..
>
> Also I believe top (or ps) will tell you what locks are being
> waited on in a coarse way...

I think the ps in DDB will do this.

On closer inspection it looks like an "out of mbufs" situation and so
the right answer is to "up the nmbclusters", but there seem to be other
issues with this code and multicast, so I'm likely to jump into DDB and
look more closely at it, likely next week.

Thanks,
George
Re: tcp-md5 check for incomming connection
At Thu, 31 Jan 2008 13:15:12 +0100 (CET), Ingo Flaschberger wrote: > > Dear Andre, > > >> 2) linux method: > >> Look for CONFIG_TCP_MD5SIG in linux-2.6.24/net/ipv4/tcp_ipv4.c > >> (sorry no weblink..) > >> They check and block md5-packets early in tcp_v4_do_rcv. > >> afinet.c -> tcp_v4_rcv -> tcp_v4_do_rcv > >> -> for Freebsd: place some logic early in tcp_input function > >> and call a new function to check md5. > > > > IMHO calling a special function that does the check (like in tcp_output) > > is the way to go. This function should be run as late as possible after > > the other segment validity checks to prevent easy cpu exhaustion attacks > > with packets that only get the port numbers right. > > > > In tcp_new there is a natural place to perform the check. tcp_input will > > show up this weekend. This doesn't prevent your work on the current code > > at all as tcp_new won't show up in -current for a long time and when it > > does it will not get MFC'd. > > Ok. > I will do the first patch for freebsd 6.2 (as my system uses it) and do > the a port to current (and I thing 6.3 too). > > Regardding Bruce: > I would prefer to implement md5 via the old setkey api as I also have todo > my daily business. > > >> 3) Bruce extended method: > >> http://lists.freebsd.org/pipermail/freebsd-net/2004-April/003761.html > >> Use his code and add at severall places in tcp_input function > >> similar checks. > >> > >> Options: > >> *) enable disable it via sysctl > >> *) count total, good and bad packets via sysctl > > > > This belongs into struct tcpstat, not a new sysctl. > > Ok. > With which tool can this counters be read? > Should I add the on/off feature? Via which tool? > Enable/disable via sysctl. Read via netstat. Best, George ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Kernel compile options
At Thu, 7 Feb 2008 15:16:44 +0100, Michael Tuexen wrote:
>
> Dear all,
>
> I was able to build an IPv4 only kernel by having
> options INET
> #options INET6
> in the kernel config file.
>
> Is it supposed to work that one can build an IPv6-only
> kernel by using
> #options INET
> options INET6
>

I have not tried and I actually doubt it.

> And should I be able to compile a kernel without IPv4 and IPv6
> support by using
> #options INET
> #options INET6
>

I believe this does not work either.

Best,
George
Re: if_start() and send queue question
At Thu, 07 Feb 2008 19:45:28 +0400, Tofig Suleymanov wrote: > > Hello list, > > I will be grateful if someone could point me to the right direction > regarding the question below. > > My device driver is getting incoming packets fine, but for some reason I > am not able to send a single packet. Here is the source code: > http://www.freebsd.az/if_ib.c > > I've added several debug messages to the source and here is the output: > / > /(bringing interface up and assigning the ip/netmask combination) > / > ifconfig ib0 192.168.0.6 netmask 255.255.255.0 up > > /(and here is what I get in /var/log/messages /; /it seems to be a > standard arp broadcast) > / > Feb 7 19:14:32 schizo kernel: ib_init entered > Feb 7 19:14:32 schizo kernel: ib_start entered > Feb 7 19:14:32 schizo kernel: ib_encap entered > Feb 7 19:14:32 schizo kernel: DHOST ff ff ff ff ff ff > Feb 7 19:14:32 schizo kernel: SHOST 0 c0 ee 22 3 14 > Feb 7 19:14:32 schizo kernel: txeof entered > Feb 7 19:14:32 schizo kernel: txeof exiting > > /(now I try pinging, but no joy . I've added extra debug messages inside > ping.c) > > /schizo# ping 192.168.0.1 > PING 192.168.0.1 (192.168.0.1): 56 data bytes > packets sent: -1 > ping: sendto: Invalid argument > packets sent: -1 > ping: sendto: Invalid argument > packets sent: -1 > ping: sendto: Invalid argument > ^C > --- 192.168.0.1 ping statistics --- > 3 packets transmitted, 0 packets received, 100% packet loss > / > > I have also tied to add debug messages to sys/net/if.c and > sys/net/netisr.c and it seems that the kernel doesn't even try to run my > ib_start() function. > Some things to try: 1) Add debug statements to the ib_start() routine. 2) See if bpf works (tcpdump -i ib0) 3) Show us the output of: ifconfig ib0 netstat -i Best, George ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Problems with Chelsio driver in CURRENT...
Hi, I have two MP/Multicore Xeon boxes with CX4 based Chelsio cards in them. If I boot 7.0-RC1 the cards can talk to each other. If I build a recent kernel/world (for instance from today) I cannot ping between them. I have tried using GENERIC as well as a custom kernel. kodama8# ifconfig cxgb0 cxgb0: flags=8843 metric 0 mtu 9000 options=1bb ether 00:07:43:05:20:68 inet 172.16.0.2 netmask 0xff00 broadcast 172.16.0.255 media: Ethernet 10Gbase-CX4 (autoselect ) status: active kodama8# ping 172.16.0.1 PING 172.16.0.1 (172.16.0.1): 56 data bytes ^C --- 172.16.0.1 ping statistics --- 5 packets transmitted, 0 packets received, 100.0% packet loss kodama8# kodama$ uname -a FreeBSD kodama8.neville-neil.comA 7.0-RC1 FreeBSD 7.0-RC1 #0: Mon Dec 24 10:10:07 UTC 2007 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC amd64 kodama8# nozomi8# ifconfig cxgb0 cxgb0: flags=8843 metric 0 mtu 9000 options=1bb ether 00:07:43:05:20:43 inet 172.16.0.1 netmask 0xff00 broadcast 172.16.0.255 media: Ethernet 10Gbase-CX4 (autoselect ) status: active nozomi8# nozomi8# uname -a FreeBSD nozomi8.neville-neil.com 8.0-CURRENT FreeBSD 8.0-CURRENT #2: Wed Feb 13 15:47:05 JST 2008 [EMAIL PROTECTED]:/usr/obj/scratch/FreeBSD.HEAD/src/sys/GENERIC amd64 nozomi8# The dmesg is at the end of this mail. Thoughts? Thanks, George nozomi8# dmesg Copyright (c) 1992-2008 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 8.0-CURRENT #2: Wed Feb 13 15:47:05 JST 2008 [EMAIL PROTECTED]:/usr/obj/scratch/FreeBSD.HEAD/src/sys/GENERIC WARNING: WITNESS option enabled, expect reduced performance. 
Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Intel(R) Xeon(R) CPU X5355 @ 2.66GHz (2666.68-MHz K8-class CPU) Origin = "GenuineIntel" Id = 0x6f7 Stepping = 7 Features=0xbfebfbff Features2=0x4e3bd AMD Features=0x20100800 AMD Features2=0x1 Cores per package: 4 usable memory = 8575602688 (8178 MB) avail memory = 8306462720 (7921 MB) ACPI APIC Table: FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 1 cpu2 (AP): APIC ID: 2 cpu3 (AP): APIC ID: 3 cpu4 (AP): APIC ID: 4 cpu5 (AP): APIC ID: 5 cpu6 (AP): APIC ID: 6 cpu7 (AP): APIC ID: 7 ioapic0 irqs 0-23 on motherboard ioapic1 irqs 24-47 on motherboard kbd1 at kbdmux0 ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) acpi0: on motherboard acpi0: [ITHREAD] acpi0: Power Button (fixed) Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0 acpi_hpet0: iomem 0xfed0-0xfed003ff on acpi0 Timecounter "HPET" frequency 14318180 Hz quality 900 cpu0: on acpi0 est0: on cpu0 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 82a082a0600082a device_attach: est0 attach returned 6 p4tcc0: on cpu0 cpu1: on acpi0 est1: on cpu1 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 82a082a0600082a device_attach: est1 attach returned 6 p4tcc1: on cpu1 cpu2: on acpi0 est2: on cpu2 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 82a082a0600082a device_attach: est2 attach returned 6 p4tcc2: on cpu2 cpu3: on acpi0 est3: on cpu3 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 82a082a0600082a device_attach: est3 attach returned 6 p4tcc3: on cpu3 cpu4: on acpi0 est4: on cpu4 est: CPU supports Enhanced Speedstep, but is not recognized. 
est: cpu_vendor GenuineIntel, msr 82a082a0600082a device_attach: est4 attach returned 6 p4tcc4: on cpu4 cpu5: on acpi0 est5: on cpu5 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 82a082a0600082a device_attach: est5 attach returned 6 p4tcc5: on cpu5 cpu6: on acpi0 est6: on cpu6 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 82a082a0600082a device_attach: est6 attach returned 6 p4tcc6: on cpu6 cpu7: on acpi0 est7: on cpu7 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 82a082a0600082a device_attach: est7 attach returned 6 p4tcc7: on cpu7 pcib0: port 0xcf8-0xcff on acpi0 pci0: on pcib0 pcib1: at device 2.0 on pci0 pci1: on pcib1 pcib2: irq 16 at device 0.0 on pci1 pci2: on pcib2 pcib3: irq 16 at device 0.0 on pci2 pci3: on pcib3 pcib4: at device 0.0 on pci3 pci4: on pcib4 ahd0: port 0x2400-0x24ff,0x2000-0x20ff mem 0xd8b0-0xd8b01fff irq 16 at device 2.0 on pci4 ahd0: [ITHREAD] aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs ahd1: port 0x2c00-0x2cff,
Re: Problems with Chelsio driver in CURRENT...
At Wed, 13 Feb 2008 00:52:52 -0800, Kip Macy wrote: > > Oops sorry ... What is the output of 'sysctl dev.cxgbc.0'? > Here ya go, and thanks! Later, George nozomi8# ifconfig cxgb0 cxgb0: flags=8843 metric 0 mtu 9000 options=1bb ether 00:07:43:05:20:43 inet 172.16.0.1 netmask 0xff00 broadcast 172.16.0.255 media: Ethernet 10Gbase-CX4 (autoselect ) status: active nozomi8# sysctl dev.cxgbc.0 dev.cxgbc.0.%desc: Chelsio T310 RNIC, 1 port dev.cxgbc.0.%driver: cxgbc dev.cxgbc.0.%location: slot=0 function=0 dev.cxgbc.0.%pnpinfo: vendor=0x1425 device=0x0030 subvendor=0x1425 subdevice=0x0001 class=0x02 dev.cxgbc.0.%parent: pci9 dev.cxgbc.0.firmware_version: 4.7.0 dev.cxgbc.0.enable_debug: 0 dev.cxgbc.0.tunq_coalesce: 0 dev.cxgbc.0.txq_overrun: 0 dev.cxgbc.0.pcpu_cache_enable: 1 dev.cxgbc.0.cache_alloc: 0 dev.cxgbc.0.cached: 0 dev.cxgbc.0.ext_freed: 0 dev.cxgbc.0.mbufs_outstanding: 0 dev.cxgbc.0.pack_outstanding: 0 dev.cxgbc.0.intr_coal: 1 dev.cxgbc.0.port0.nqsets: 8 dev.cxgbc.0.port0.qs0.rspq.size: 1024 dev.cxgbc.0.port0.qs0.rspq.cidx: 2 dev.cxgbc.0.port0.qs0.rspq.credits: 2 dev.cxgbc.0.port0.qs0.rspq.phys_addr: 0x03cf dev.cxgbc.0.port0.qs0.rspq.dump_start: 0 dev.cxgbc.0.port0.qs0.rspq.dump_count: 0 dev.cxgbc.0.port0.qs0.txq_eth.dropped: 0 dev.cxgbc.0.port0.qs0.txq_eth.sendqlen: 0 dev.cxgbc.0.port0.qs0.txq_eth.queue_pidx: 0 dev.cxgbc.0.port0.qs0.txq_eth.queue_cidx: 0 dev.cxgbc.0.port0.qs0.txq_eth.processed: 0 dev.cxgbc.0.port0.qs0.txq_eth.cleaned: 0 dev.cxgbc.0.port0.qs0.txq_eth.in_use: 1 dev.cxgbc.0.port0.qs0.txq_eth.frees: 0 dev.cxgbc.0.port0.qs0.txq_eth.skipped: 0 dev.cxgbc.0.port0.qs0.txq_eth.coalesced: 0 dev.cxgbc.0.port0.qs0.txq_eth.enqueued: 1 dev.cxgbc.0.port0.qs0.txq_eth.stopped_flags: 0 dev.cxgbc.0.port0.qs0.txq_eth.phys_addr: 0x7e7c dev.cxgbc.0.port0.qs0.txq_eth.qgen: 1 dev.cxgbc.0.port0.qs0.txq_eth.hw_cidx: 0 dev.cxgbc.0.port0.qs0.txq_eth.hw_pidx: 1 dev.cxgbc.0.port0.qs0.txq_eth.dump_start: 0 dev.cxgbc.0.port0.qs0.txq_eth.dump_count: 0 
dev.cxgbc.0.port0.qs1.rspq.size: 1024 dev.cxgbc.0.port0.qs1.rspq.cidx: 0 dev.cxgbc.0.port0.qs1.rspq.credits: 0 dev.cxgbc.0.port0.qs1.rspq.phys_addr: 0x8456 dev.cxgbc.0.port0.qs1.rspq.dump_start: 0 dev.cxgbc.0.port0.qs1.rspq.dump_count: 0 dev.cxgbc.0.port0.qs1.txq_eth.dropped: 0 dev.cxgbc.0.port0.qs1.txq_eth.sendqlen: 0 dev.cxgbc.0.port0.qs1.txq_eth.queue_pidx: 0 dev.cxgbc.0.port0.qs1.txq_eth.queue_cidx: 0 dev.cxgbc.0.port0.qs1.txq_eth.processed: 0 dev.cxgbc.0.port0.qs1.txq_eth.cleaned: 0 dev.cxgbc.0.port0.qs1.txq_eth.in_use: 0 dev.cxgbc.0.port0.qs1.txq_eth.frees: 0 dev.cxgbc.0.port0.qs1.txq_eth.skipped: 0 dev.cxgbc.0.port0.qs1.txq_eth.coalesced: 0 dev.cxgbc.0.port0.qs1.txq_eth.enqueued: 0 dev.cxgbc.0.port0.qs1.txq_eth.stopped_flags: 0 dev.cxgbc.0.port0.qs1.txq_eth.phys_addr: 0x8464 dev.cxgbc.0.port0.qs1.txq_eth.qgen: 1 dev.cxgbc.0.port0.qs1.txq_eth.hw_cidx: 0 dev.cxgbc.0.port0.qs1.txq_eth.hw_pidx: 0 dev.cxgbc.0.port0.qs1.txq_eth.dump_start: 0 dev.cxgbc.0.port0.qs1.txq_eth.dump_count: 0 dev.cxgbc.0.port0.qs2.rspq.size: 1024 dev.cxgbc.0.port0.qs2.rspq.cidx: 0 dev.cxgbc.0.port0.qs2.rspq.credits: 0 dev.cxgbc.0.port0.qs2.rspq.phys_addr: 0x86b4 dev.cxgbc.0.port0.qs2.rspq.dump_start: 0 dev.cxgbc.0.port0.qs2.rspq.dump_count: 0 dev.cxgbc.0.port0.qs2.txq_eth.dropped: 0 dev.cxgbc.0.port0.qs2.txq_eth.sendqlen: 0 dev.cxgbc.0.port0.qs2.txq_eth.queue_pidx: 0 dev.cxgbc.0.port0.qs2.txq_eth.queue_cidx: 0 dev.cxgbc.0.port0.qs2.txq_eth.processed: 0 dev.cxgbc.0.port0.qs2.txq_eth.cleaned: 0 dev.cxgbc.0.port0.qs2.txq_eth.in_use: 0 dev.cxgbc.0.port0.qs2.txq_eth.frees: 0 dev.cxgbc.0.port0.qs2.txq_eth.skipped: 0 dev.cxgbc.0.port0.qs2.txq_eth.coalesced: 0 dev.cxgbc.0.port0.qs2.txq_eth.enqueued: 0 dev.cxgbc.0.port0.qs2.txq_eth.stopped_flags: 0 dev.cxgbc.0.port0.qs2.txq_eth.phys_addr: 0x86b6 dev.cxgbc.0.port0.qs2.txq_eth.qgen: 1 dev.cxgbc.0.port0.qs2.txq_eth.hw_cidx: 0 dev.cxgbc.0.port0.qs2.txq_eth.hw_pidx: 0 dev.cxgbc.0.port0.qs2.txq_eth.dump_start: 0 dev.cxgbc.0.port0.qs2.txq_eth.dump_count: 
0 dev.cxgbc.0.port0.qs3.rspq.size: 1024 dev.cxgbc.0.port0.qs3.rspq.cidx: 0 dev.cxgbc.0.port0.qs3.rspq.credits: 0 dev.cxgbc.0.port0.qs3.rspq.phys_addr: 0x8815 dev.cxgbc.0.port0.qs3.rspq.dump_start: 0 dev.cxgbc.0.port0.qs3.rspq.dump_count: 0 dev.cxgbc.0.port0.qs3.txq_eth.dropped: 0 dev.cxgbc.0.port0.qs3.txq_eth.sendqlen: 0 dev.cxgbc.0.port0.qs3.txq_eth.queue_pidx: 0 dev.cxgbc.0.port0.qs3.txq_eth.queue_cidx: 0 dev.cxgbc.0.port0.qs3.txq_eth.processed: 0 dev.cxgbc.0.port0.qs3.txq_eth.cleaned: 0 dev.cxgbc.0.port0.qs3.txq_eth.in_use: 0 dev.cxgbc.0.port0.qs3.txq_eth.frees: 0 dev.cxgbc.0.port0.qs3.txq_eth.skipped: 0 dev.cxgbc.0.port0.qs3.txq_eth.coalesced: 0 dev.cxgbc.0.port0.qs3.txq_eth.enqueued: 0 dev.cxgbc.0.port0.qs3.txq_eth.stopped_flags: 0 dev.cxgbc.0.port0.qs3.txq_eth.phys_addr: 0x8816 dev.cxgbc.0.port0.qs3.txq_eth.qgen: 1 dev.cxgbc.0.port0.qs3.txq_eth.hw_cidx:
Re: Problems with Chelsio driver in CURRENT...
OK, one more data point. The issue is somewhere between RC2 and CURRENT. I just put RC2 on the same box, and RC1 can talk to RC2 over the Chelsio cards. I have now tried RC2 and CURRENT and still no dice. Best, George
Re: panic in 6.3-RELEASE when multi-cast client exits
At Tue, 19 Feb 2008 14:00:56 +, Bruce M. Simpson wrote: > > Rob Watt wrote: > > Hi. > > > > We recently upgraded some of our machines to 6.3-RELEASE and we have been > > plagued by repeatable panics when our multi-cast client applications exit. > > Our machines have Intel X5365 processors, LSI MegaSAS 1064R cards, and Intel > > Pro 1000 MF nic cards (although we have seen this problem with the onboard > > Intel copper nics as well). We have seen this panic with machines that have > > Tyan boards as well as Super Micro. I have seen a few postings that seem to > > refer to related panics, and bug > > http://www.freebsd.org/cgi/query-pr.cgi?pr=116077 contains a patch that > > seems like it should address the problem, but our patched system still > > panics. I have attached the output from 3 of the dumps/backtraces. Dump #1 > > is probably the most useful. I am happy to provide more info if necessary. > > > > Some folk reported that they didn't see this problem occur with the code > in 7.x, which jibes as I rewrote some of the logic in that branch. It's > been nearly a year since I last had time to look at anything related to > this. > > My understanding is that 7.0 is getting closer to release status so you > may wish to try reproducing the problem there. > > The human resource situation hasn't changed much on my end, though I am > getting closer to having time to finishing IGMPv3 (it's needed for other > stuff in the future). I haven't been able to reproduce the bug in the > PR, which makes suggesting other courses of action difficult. > I can reproduce this panic with a small piece of code I've been hacking for work. The code depends on classes that are proprietary but the program itself is simple and I'll ask work if I can sanitize it in the next few days. The program is intended as a multicast jitter/latency tester, but works well as a general exerciser of the multicast code. 
The panic is basically an issue with terminating a process and handling the multicast address lists on the interface. I have not tracked down the exact cause as yet but am working on it now. Best, George ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: IPV6_TCLASS missing from ip6(4)
At Wed, 20 Feb 2008 18:25:05 +, Bruce M Simpson wrote:
>
> I just noticed that whilst the socket code appears to support
> IPV6_TCLASS, we don't document it.
>
> I haven't raised a PR for this issue yet nor have I written a patch.
>

Please do both :-)

Thanks,
George
Re: panic in 6.3-RELEASE when multi-cast client exits
FYI this is fixed by a one line change that is about to hit 6-STABLE:

@@ -991,7 +991,6 @@
          * a new record. Otherwise, we are done.
          */
         if (ifma->ifma_protospec != NULL) {
-                if_delmulti_ent(ifma);  /* We don't need another reference */
                 IN_MULTI_UNLOCK();
                 IFF_UNLOCKGIANT(ifp);
                 return ifma->ifma_protospec;

Sent to me by Stephan Uphoff. I tested it today.

Best,
George
Re: LOR icmp6_input/nd6_lookup
At Fri, 29 Feb 2008 13:44:27 -0600, Kevin Day wrote: > > This is from 7.0-RELEASE: > > lock order reversal: > 1st 0xc3bde2b8 rtentry (rtentry) @ netinet6/nd6.c:1930 > 2nd 0xc3af367c radix node head (radix node head) @ net/route.c:147 > KDB: stack backtrace: > db_trace_self_wrapper > (c08af130,e11b8600,c0662bbe,c08b1592,c3af367c,...) at > db_trace_self_wrapper+0x26 > kdb_backtrace(c08b1592,c3af367c,c08b15f3,c08b15f3,c08b9ce7,...) at > kdb_backtrace+0x29 > witness_checkorder(c3af367c,9,c08b9cde,93,e11b8624,...) at > witness_checkorder+0x6de > _mtx_lock_flags(c3af367c,0,c08b9cde,93,c066160b,...) at _mtx_lock_flags > +0xbc > rtalloc1(e11b86e0,0,0,0,c3c9d01c,...) at rtalloc1+0x63 > nd6_lookup(c3c9d024,0,c39fd800,c3bde258,c3bde258,...) at nd6_lookup+0x55 > nd6_is_addr_neighbor(c3c9d01c,c39fd800,c08c1d75,78a,c09a5ed8,...) at > nd6_is_addr_neighbor+0x3b > nd6_output(c39fd800,c39fd800,c3cf9b00,c3c9d01c,c3bde258,...) at > nd6_output+0x10f > ip6_output(c3cf9b00,0,e11b88e0,0,0,...) at ip6_output+0x1081 > icmp6_reflect(c3cf9b00,28,8,1,c08c96d0,...) at icmp6_reflect+0x42f > icmp6_input(e11b8c88,e11b8c70,3a,1d5,0,...) at icmp6_input+0x6dc > ip6_input(c3be2900,0,c08b9887,8c,c09a1e24,...) at ip6_input+0xe36 > netisr_processqueue(c0955e30,0,c08b9887,f6,c3865a40,...) at > netisr_processqueue+0x8b > swi_net(0,0,c08a938d,471,c3870364,...) at swi_net+0x9b > ithread_loop(c383ac90,e11b8d38,c08a9115,305,c3873000,...) at > ithread_loop+0x1b5 > fork_exit(c060fbe0,c383ac90,e11b8d38) at fork_exit+0xb8 > fork_trampoline() at fork_trampoline+0x8 > --- trap 0, eip = 0, esp = 0xe11b8d70, ebp = 0 --- > > Are LOR's still PR-worthy? Yes, can you file one? Best, George ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: [PATCH] kern/120958: no response to ICMP traffic on interface configured with a link-local address
At Thu, 13 Mar 2008 20:58:25 -0400, James Snow wrote: > > [1 ] > On Thu, Mar 13, 2008 at 08:40:07PM -0400, James Snow wrote: > > > > Also, I took a cue from the IN_LINKLOCAL() macro and added two new > > macros to sys/netinet/in.h to perform checks for the loopback network > > and the "zero" network. IN_LOOPBACK() and IN_ZERONET(), respectively. > > Woops. I suppose the macros are more useful when they're actually > called. > > Attached is a revised patch that performs the check for loopback > addresses less than twice but more than never. > This looks good. Best, George ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
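For readers who have not opened the patch, the three address-class checks named in the quoted mail can be sketched in user space. This is an illustration only and is not taken from the patch: the function names simply mirror the macro names mentioned above, and the prefixes (169.254.0.0/16, 127.0.0.0/8, 0.0.0.0/8) are the standard IPv4 assignments, assumed here rather than copied from sys/netinet/in.h.

```python
import ipaddress

# Standard IPv4 prefixes; assumed, not copied from the patch under discussion.
LINKLOCAL_NET = ipaddress.ip_network("169.254.0.0/16")
LOOPBACK_NET = ipaddress.ip_network("127.0.0.0/8")
ZERO_NET = ipaddress.ip_network("0.0.0.0/8")


def in_linklocal(addr):
    """True if addr is in the IPv4 link-local range."""
    return ipaddress.ip_address(addr) in LINKLOCAL_NET


def in_loopback(addr):
    """True if addr is on the loopback network."""
    return ipaddress.ip_address(addr) in LOOPBACK_NET


def in_zeronet(addr):
    """True if addr is on the "zero" network."""
    return ipaddress.ip_address(addr) in ZERO_NET


if __name__ == "__main__":
    for a in ("169.254.10.1", "127.0.0.1", "0.1.2.3", "172.16.0.1"):
        print(a, in_linklocal(a), in_loopback(a), in_zeronet(a))
```

Each check is a plain prefix match, which is presumably why a macro per network is enough in the kernel.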
A new tool to measure multicast performance...
Howdy,

I have just finished updating a new tool in src/tools/tools/mctest, which is a multicast test program. The mctest program works by sending packets from a source to a sink using a multicast address; the sink then reflects the packets it receives back to the source. The source records the transmission and reception time of each packet and reports the round trip time, while the sink prints out the time between packets, in microseconds.

The program is best used to debug Ethernet drivers as well as our multicast and UDP code. For more information please read the manual page.

Sorry, IPv6 is not supported as yet, only IPv4.

Best,
George
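The two measurements described in the announcement can be sketched as plain arithmetic on timestamps. This is not code from mctest; the function names and the use of integer microsecond timestamps keyed by a sequence number are assumptions made for the illustration.

```python
def round_trip_times(sent_us, received_us):
    """Source side: per-packet round trip time in microseconds.

    sent_us and received_us map a (hypothetical) sequence number to a
    timestamp in microseconds. Packets that were never reflected back
    are simply left out of the result.
    """
    return {seq: received_us[seq] - t
            for seq, t in sent_us.items()
            if seq in received_us}


def inter_packet_gaps(arrivals_us):
    """Sink side: time between consecutive arrivals, in microseconds."""
    return [later - earlier
            for earlier, later in zip(arrivals_us, arrivals_us[1:])]


if __name__ == "__main__":
    sent = {0: 1000, 1: 2000, 2: 3000}
    received = {0: 1250, 2: 3400}          # packet 1 was lost
    print(round_trip_times(sent, received))
    print(inter_packet_gaps([1000, 2010, 2990]))
```

Keeping the timestamps as integers avoids floating point rounding when the differences are only a few microseconds.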
Re: kern/120958: no response to ICMP traffic on interface configured with a link-local address
Synopsis: no response to ICMP traffic on interface configured with a link-local address

State-Changed-From-To: open->patched
State-Changed-By: gnn
State-Changed-When: Thu Apr 17 12:51:46 UTC 2008
State-Changed-Why:
User submitted a patch which is now applied and tested. Take over bug until closed.

Responsible-Changed-From-To: freebsd-net->gnn
Responsible-Changed-By: gnn
Responsible-Changed-When: Thu Apr 17 12:51:46 UTC 2008
Responsible-Changed-Why:
The user's suggested patch has been applied. Take the bug over until it's closed.

http://www.freebsd.org/cgi/query-pr.cgi?pr=120958
Re: Regarding if_alloc()
At Thu, 17 Apr 2008 18:35:23 -0700 (PDT), vijay singh wrote:
>
> Hi all. How do we avoid a race in populating the ifindex_table? If
> this is a TODO, as it seems from the code below, would it be
> acceptable if I wrote a patch and reused the ifnet_lock
> [IFNET_WLOCK, IFNET_WUNLOCK]?
>

It is almost always acceptable to submit a patch :-)

>
> if_alloc(u_char type)
> {
>         struct ifnet *ifp;
>
>         ifp = malloc(sizeof(struct ifnet), M_IFNET, M_WAITOK|M_ZERO);
>
>         /*
>          * Try to find an empty slot below if_index. If we fail, take
>          * the next slot.
>          *
>          * XXX: should be locked!
>          */
>         for (ifp->if_index = 1; ifp->if_index <= if_index; ifp->if_index++) {
>                 if (ifnet_byindex(ifp->if_index) == NULL)
>                         break;
>         }
>
>

There are still parts of the network device infrastructure that need some locking, and it would seem that this is one of them. I know Brooks Davis was also looking at this stuff so he may comment as well.

Best,
George
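The race in question is that two callers can scan the table at the same time, both find the same empty slot, and both claim it. A minimal user-space sketch of the locked version follows. It is not kernel code and not a proposed patch: the IndexTable class and its method names are hypothetical, and a threading.Lock stands in for IFNET_WLOCK/IFNET_WUNLOCK. The point it illustrates is that the search for a free slot and the store into that slot have to happen under the same lock hold.

```python
import threading


class IndexTable:
    """Hypothetical stand-in for the interface index table."""

    def __init__(self):
        self._lock = threading.Lock()   # plays the role of the ifnet lock
        self._slots = {}                # index -> object
        self._highest = 0               # plays the role of if_index

    def alloc(self, obj):
        """Claim the lowest free index, or the next one past the end."""
        with self._lock:
            index = 1
            while index <= self._highest and index in self._slots:
                index += 1
            if index > self._highest:
                self._highest = index
            # Store while still holding the lock, so that no other
            # caller can be handed the same index.
            self._slots[index] = obj
            return index

    def free(self, index):
        """Release an index so that a later alloc() can reuse it."""
        with self._lock:
            self._slots.pop(index, None)


if __name__ == "__main__":
    table = IndexTable()
    print([table.alloc("if%d" % i) for i in range(3)])
    table.free(2)
    print(table.alloc("reused"))
```

In the quoted if_alloc() the scan and the later table update are separate steps, so taking the lock around the scan alone would still leave a window between finding a slot and filling it.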
Re: MFC of TOE support to RELENG_7
At Thu, 17 Apr 2008 21:00:04 -0700, Kip Macy wrote: > > I would like to MFC TOE and RDMA support in the last week of May / > first week of June. My primary objective is that it be present in 7.1. > The re team has not yet decided when the freeze date for 7.1 will be, > so I may end up asking to do it earlier. > > The reason I'm bringing it up roughly 6 weeks in advance is that there > is a certain amount of debate with regards to the ABI guarantees that > FreeBSD network developers are willing to commit to for the remaining > life of the RELENG_7 branch. > > I've made the following two simplifying assumptions: >- struct tcpcb and struct sockbuf are append only - i.e. if members > are added, they will be added to the end >- lock ordering will not change, e.g. the inpcb lock will always be > acquired before the sockbuf lock > > Is there any reason to believe that these simplifying assumptions are > not acceptable? If so, why? > > > I've added the following sets of accessor functions: >- lock acquire/release for socket, sockbuf, inpcb >- higher level functions for tcp shutdown and syncache to abstract > away the tcbinfo lock >- accessor functions for all the accessed fields in socket and > inpcb so that none of the members are referenced as offsets from the > base of the structure I apologize for not yet reviewing all the code. I take that last bit to mean the drivers can reach up into sockets given those functions? I gather this is due to the work necessary to implement RDMA over TCP? > The current state of the code can be seen at: > http://157.22.130.171/svn/branches/projects/iwarp/sys/ Is there a simple way to get just that directory without doing a svn on your whole repo? And if not, what's the easiest way to just grab that stuff? Best, George ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
zonelimit issues...
Hi, I am wondering why this patch was never committed? http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround It does seem to address an issue I'm seeing where processes get into the zonelimit state through the use of mbufs (a high speed UDP packet receiver) but even after network pressure is reduced/removed the process never gets out of that state again. Applying the patch fixed the issue, but I'd like to have some discussion as to the general merits of the approach. Unfortunately the test that currently causes this is tied very tightly to code at work that I can't share, but I will hopefully be improving mctest to try to exhibit this behavior. Best, George ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: zonelimit issues...
At Sun, 20 Apr 2008 09:53:49 -0700, Chris Pratt wrote: > > > On Apr 20, 2008, at 2:43 AM, Robert Watson wrote: > > > > > On Fri, 18 Apr 2008, Chris Pratt wrote: > > > >> Doesn't 7.0 fix this? I'd like to see an official definitive > >> answer and all I've been going on is that the problem description > >> is no longer in the errata. > > > > Unfortunately, bugs of this sort don't really "work" that way -- > > specific bugs are a property of a problem in code (or a problem in > > design), but what we have right now is a report of a symptom that > > might reflect zero or more specific bugs. It's unclear that the > > problem described in errata is the problem you've been > > experiencing, or that the (at least one) fixed bug with the same > > symptoms is that one you've been experiencing. For better or > > worse, the only way to really tell of a generic class of hang or > > wedging is fixed is to try out the new version and see. In most > > cases, "zonelimit" wedging reflects one of two things: > > > > (1) Inadequate resource allocation to the network stack or some other > > component, try tuning up the memory tunable for clusters (for > > example). > > > For several months I did quite a bit of tuning. I never increased > nmbclusters beyond the 32768 shown in the docs because man > tuning doesn't define it's use of "arbitrarily high". Inability to boot > could mean travel. Kris Kenneway had provided instructions to > get a dump. I set up for that but have never had a dump. The > only respite came from adding another circuit, another NIC and > spreading traffic. We increased our lock time from every couple > of days during the heavy bot period of late 2006 to now every > month or during traditionally slow months, even two months. > For example, we ran a record 72 days last summer. It was a > very dead summer traffic wise. 
> > I will try to increase the nmbclusters dramatically if I can figure > out what a safe top limit is but it sounds like the jump to > 7.0 RELEASE may be worth the effort. I would want to wait > until this issue with TCP, Windows and certain routers is well > past. I had not seen that applied to 7_0_0 yet and that would be > a show stopper. Is there a way to know what is safe for > nmbclusters given an 8GB ram system? On "big" systems I am currently using 65000, and that seems safe so far. This is on an 8 core (2P) Xeon box with 8G of RAM. > I did vmstats data collection for a couple of months when things > were at their worst. The results were nebulous to me based > on lack of code knowledge. All I actually found was that a > certain counter would drop to 0 and never recover. I didn't > know if it was meaningful and received no replies when I > asked FreeBSD-Questions. It was 128-Bucket or something > like that. > > > (2) A memory leak in a network device driver or other network part, > > which > > needs to be debugged and fixed. > > > > Initially I thought there may be something related to the bge > driver and moved the high traffic apps on an em. This didn't > seem to help much, nor did polling. > > I am most willing to collect data if I could figure out how to > collect something meaningful. I gather from what you say, > that 7.0 would provide this. > > I really appreciate both of your responses. Just based on > this one problem, 6.x has been a bad experience after > years of seemingly impossible uptime on 4 and 5.x > FreeBSD. Well there are plenty of us motivated to get at these issues. Can you do me a favor and characterize your traffic a bit? Is it mostly TCP, or heavily UDP or some sort of mix? The issues I see are UDP based, which is less surprising as UDP has no backpressure and it is easy to over commit the system by upping the socket buffer space allocated without upping the number of clusters to compensate. 
Best, George ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
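The over-commit described above is easy to check with rough arithmetic. The sketch below is only an estimate under stated assumptions: it takes the standard cluster size to be 2048 bytes (MCLBYTES), assumes every socket buffer fills completely with cluster-backed data, and uses a hypothetical socket count and buffer size. Only the 65000 figure comes from the mail. Real demand can be higher, since a small datagram may still occupy a whole cluster.

```python
MCLBYTES = 2048  # assumed size of a standard mbuf cluster, in bytes


def clusters_needed(num_sockets, sockbuf_bytes, cluster_bytes=MCLBYTES):
    """Clusters consumed if every socket buffer fills completely."""
    per_socket = -(-sockbuf_bytes // cluster_bytes)  # ceiling division
    return num_sockets * per_socket


def is_overcommitted(num_sockets, sockbuf_bytes, nmbclusters):
    """True if full socket buffers alone could exhaust the cluster pool."""
    return clusters_needed(num_sockets, sockbuf_bytes) > nmbclusters


if __name__ == "__main__":
    # Hypothetical workload: 50 UDP sockets with 4 MiB receive buffers.
    need = clusters_needed(50, 4 * 1024 * 1024)
    print(need, is_overcommitted(50, 4 * 1024 * 1024, 65000))
```

With these assumed numbers the receive buffers alone could ask for more clusters than the pool holds, which is the situation where either the buffers come down or nmbclusters goes up.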
Re: zonelimit issues...
At Sun, 20 Apr 2008 10:32:25 +0100 (BST), rwatson wrote:
>
> On Fri, 18 Apr 2008, [EMAIL PROTECTED] wrote:
>
> > I am wondering why this patch was never committed?
> >
> > http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround
> >
> > It does seem to address an issue I'm seeing where processes get into the
> > zonelimit state through the use of mbufs (a high speed UDP packet receiver)
> > but even after network pressure is reduced/removed the process never gets
> > out of that state again. Applying the patch fixed the issue, but I'd like
> > to have some discussion as to the general merits of the approach.
> >
> > Unfortunately the test that currently causes this is tied very tightly to
> > code at work that I can't share, but I will hopefully be improving mctest to
> > try to exhibit this behavior.
>
> When you take all load off the system, do mbufs and clusters get properly
> freed back to UMA (as visible in netstat -m)? If not, continuing to bump up
> against the zonelimit would suggest an mbuf/cluster leak, in which case we
> need to track that bug.
>

This is unclear as the process that creates the issue opens 50 UDP
multicast sockets with very large socket buffers. I am investigating
this aspect some more.

> You might consider adding a debugging-only zonelimit waiter count to
> the UMA zone, and checks/assertions that a wakeup is being generated
> properly.

Yes. Do you have an example I can easily steal?

> That is, to confirm that the wakeup is generated when memory is
> freed up if there are threads waiting. There is at least one as-yet
> MFC'd fix to the sleep/wakeup code, I believe, that might be
> relevant here. Is the problem you're reporting on 7.x, or on 8.x?
> If 8.x, that's probably not it, but if 7.x, it could be. (This same
> sleep/wakeup bug occasionally leads to wedging of dump(8), I
> believe).

I have seen this on 7.0 RELEASE, and STABLE and on CURRENT (8). I am
currently working on it on CURRENT because if I have a fix it's going
to have to go there first.

Best,
George
Re: zonelimit issues...
At Mon, 21 Apr 2008 16:46:00 +0900, [EMAIL PROTECTED] wrote:
>
> At Sun, 20 Apr 2008 10:32:25 +0100 (BST),
> rwatson wrote:
> >
> > On Fri, 18 Apr 2008, [EMAIL PROTECTED] wrote:
> >
> > > I am wondering why this patch was never committed?
> > >
> > > http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround
> > >
> > > It does seem to address an issue I'm seeing where processes get into the
> > > zonelimit state through the use of mbufs (a high speed UDP packet receiver)
> > > but even after network pressure is reduced/removed the process never gets
> > > out of that state again. Applying the patch fixed the issue, but I'd like
> > > to have some discussion as to the general merits of the approach.
> > >
> > > Unfortunately the test that currently causes this is tied very tightly to
> > > code at work that I can't share, but I will hopefully be improving mctest to
> > > try to exhibit this behavior.
> >
> > When you take all load off the system, do mbufs and clusters get properly
> > freed back to UMA (as visible in netstat -m)? If not, continuing to bump up
> > against the zonelimit would suggest an mbuf/cluster leak, in which case we
> > need to track that bug.
> >
>
> This is unclear as the process that creates the issue opens 50 UDP
> multicast sockets with very large socket buffers. I am investigating
> this aspect some more.
>

OK, yes, the clusters etc. go back to normal when the incoming
pressure is released. I do not believe we have a cluster/mbuf leak.

Best,
George
Re: zonelimit issues...
At Tue, 22 Apr 2008 06:35:38 -0700, Chris Pratt wrote:
>
> On Apr 21, 2008, at 12:43 AM, [EMAIL PROTECTED] wrote:
>
> > ...snip
> >
> > Well there are plenty of us motivated to get at these issues. Can you
> > do me a favor and characterize your traffic a bit? Is it mostly TCP,
>
> The traffic that seems to take us out is TCP port 80. I'll make a
> generalized guess but it does seem to follow. We freeze on one of
> two dramatically heavy use days for our industry (Sunday and Monday
> evening). The hang will actually occur on Monday or Tuesday
> following these days if sufficient traffic hits us. It has not
> always followed this pattern but most frequently. There is always a
> high presence of high frequency attacks of various sorts. For
> example referer spam posts which hit us hard on our busy
> evenings. So it is TCP and I would presume we usually have the
> establishment of many useless sessions that could cause us to bump
> up against limits and cause exhaustion coupled with our real traffic
> peaks.
>

Interesting, but with TCP it should be easier to tune this, in
particular because TCP backs off once a packet drops. I gather you
are using facilities, like accept filters, that make it easy to drop
less useful traffic?

> This thread has given me several things to try and I'm adjusting (e.g.,
> nmbclusters) upward to see what happens.

Sounds good. netstat -m and netstat -an are good ways to watch this
issue. -m shows the number of mbufs/clusters in use and -an shows you
all sockets, but what you want to check is the number of bytes in the
recv and send socket buffers, which are the 2nd and 3rd columns.

> I should also mention that this system has the natural limitations
> on it's traffic ceiling of two T1s on two NICs and a 3rd LAN NIC
> fielding continuous round-robin mysql replication and rsync style
> mirroring. It uses two bge interfaces and one server type em
> interface. It's always troubled me that the zonelimit issues have
> always been associated with higher volume circuits (in what I've
> read). But since our issue is very directly related to traffic
> levels and seem to occur at times where my monitors show us way over
> committed on the two outward facing T1s, I'm still going to proceed
> with the adjustments and see if it increases our survivability.

Since zonelimit is a state reached when your system is out of
resources, it makes sense that the higher the traffic the sooner
you'll reach it.

> Thanks for your time on this.
>

No problem, it's what I like to do :-)

Best,
George
Re: Change from BSDL to GPL
At Mon, 05 May 2008 06:31:25 +0800, kevin wrote:
>
> Hi, all
> I want to port 4.4BSD-Lite's TCP/IP source code to my own OS kernel.
> My OS kernel is GPL licenced.
> Is it possible for me to modify 4.4BSD-Lite's source code and change its
> licence from 4.4BSD-Lite licence to GPL licence?
>

Alas, the short answer is "Consult an IP lawyer."

Best,
George
Proposed patch to the kernel and to netstat...
Howdy,

I have developed the attached patch which extends the functionality of
netstat (via the -x flag) to show us all the socket buffer statistics.
The kernel change counts mbufs, as well as clusters (at the moment of
any size) and gives output like this:

Proto Recv-Q Send-Q Local Address         Foreign Address       R-MBUF S-MBUF R-CLUS S-CLUS R-HIWA S-HIWA R-LOWA S-LOWA R-BCNT S-BCNT R-BMAX S-BMAX (state)
tcp4       0      0 127.0.0.1.6010        *.*                        0      0      0      0  65536  32768      1   2048      0      0 262144 262144 LISTEN
tcp6       0      0 ::1.6010              *.*                        0      0      0      0  65536  32768      1   2048      0      0 262144 262144 LISTEN
tcp4       0      0 172.16.186.130.22     172.16.186.1.53443         0      0      0      0  66608  33304      1   2048      0      0 262144 262144 ESTABLISHED
tcp4       0      0 172.16.186.130.29178  172.16.186.1.22            0      0      0      0      0      0      0      0      0      0      0      0 TIME_WAIT
tcp4       0      0 172.16.186.130.62302  69.147.83.41.22            0      0      0      0  65700  74540      1   2048      0      0 262144 262144 ESTABLISHED
tcp4       0      0 127.0.0.1.62415       127.0.0.1.6010             0      0      0      0      0      0      0      0      0      0      0      0 TIME_WAIT

Note you need a very wide screen to read that.

The man page is also updated but the relevant bits are:

The -x flag causes netstat to output all the information recorded
about data stored in the socket buffers. The fields are:

R-MBUF  Number of mbufs in the receive queue.
S-MBUF  Number of mbufs in the send queue.
R-CLUS  Number of clusters, of any type, in the receive queue.
S-CLUS  Number of clusters, of any type, in the send queue.
R-HIWA  Receive buffer high water mark, in bytes.
S-HIWA  Send buffer high water mark, in bytes.
R-LOWA  Receive buffer low water mark, in bytes.
S-LOWA  Send buffer low water mark, in bytes.
R-BCNT  Receive buffer byte count.
S-BCNT  Send buffer byte count.
R-BMAX  Maximum bytes that can be used in the receive buffer.
S-BMAX  Maximum bytes that can be used in the send buffer.

Please email me comments. I'd like to commit this to HEAD soon. It
can't be put into 7 without removing the cluster and mbuf counting,
but I might do that as well if there is interest.

Best,
George

netstat.diff Description: Binary data
Anyone seen this error on em ?
Jun 9 18:23:59 ... kernel: em0: port 0x2000-0x201f mem 0xd802-0xd803,0xd800 -0xd801 irq 18 at device 0.0 on pci4
Jun 9 18:23:59 ... kernel: em0: Using MSI interrupt
Jun 9 18:23:59 ... kernel: em0: Setup of Shared code failed
Jun 9 18:23:59 ... kernel: device_attach: em0 attach returned 6

I've never seen the "returned 6" thing. Plugging a cable into the
other em device on the motherboard, a Supermicro, currently works,
but it looks like bad hardware to me.

Thoughts?

Later,
George
Re: Weirdness - FBSD 7, Routing, Packet generator, em taskq
At Thu, 26 Jun 2008 23:25:18 -0400, Paul wrote:
>
> I have a FreeBSD router set up with Full BGP routes and I'm doing some
> tests on using it for routing.
>
> 7.0-RELEASE-p1 FreeBSD 7.0-RELEASE-p1 #6: Thu Apr 17 18:11:49 EDT 2008
> amd64
>
> oddness..:
>
> Use a packet generator to generate random source ips and ports and send
> traffic through the router to a destination on the other side, single ip.
> What happens is the 'em0 taskq' starts to eat cpu... but the funny
> thing is immediately when I start the traffic (say, 100,000 pps) em0
> taskq is about 15% cpu.. and then over the course of 2 minutes or so it
> climbs to 60% cpu.. This makes no sense.. The packets per second are
> continuous and it just routed 100kpps for 60 seconds with less cpu so
> why in the world would it slowly climb like that?
>
> It's an observation I suppose and I was hoping if someone could
> enlighten me on WHY.. :) I did test it on 3 different machines by the way.
> It even does this with just a handful of routes in the routing table , I
> tried that too just to rule that out.
> I don't remember Freebsd 4/5 doing this??
>

What are you using to measure the CPU time? Some tools take time to
gather up enough samples. Also, have you tried to do any profiling on
the kernel to see why this might be the case?

http://www.watson.org/~robert/freebsd/netperf/profile/

Best,
George
What's the deal with hardware checksum and net.inet.udp.checksum?
I would assume that if a card, say the em, has hardware TX checksum
offload, then the UDP checksum could be calculated by the hardware,
but this seems not to be the case. The manual pages are unhelpful in
this regard.

Thanks,
George
Re: What's the deal with hardware checksum and net.inet.udp.checksum?
At Thu, 10 Jul 2008 11:43:23 +0100 (BST), rwatson wrote:
>
> On Wed, 9 Jul 2008, [EMAIL PROTECTED] wrote:
>
> > I would assume that if a card, say the em, has hardware TX checksum that the
> > UDP checksum could be calculated by the hardware, but this seems not to be
> > the case. The manual pages are unhelpful in this regard.
>
> On the whole, they should be generated in hardware as long as it's
> not administratively disabled with ifconfig, and as long as there
> aren't know bugs in the hardware for the rev you're using. Just for
> example, hardware checksumming is disabled in software for quite a
> few early 1gbps cards due to bugs in the hardware causing rather
> nasty side effects. What specific problem are you seeing? We do do
> a software checksum of the pseudo-header, but the UDP data should be
> checksummed by hardware.
>
> (The usual test for hardware checksum being enabled on transmit is
> to tcpdump the interface and see tcpdump reporting lots of bad
> checksums, as the BPF capture happens before hardware checksumming
> is run -- in principle on the receive side that shouldn't happen!)
>

If the sysctl is turned off on the transmitter then the receiving
machine sees UDP checksums of 0.

Best,
George
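
For readers following the thread, the split described above can be sketched in plain userland C. This is not the kernel code: it only models the arrangement rwatson describes, where software sums the pseudo-header and a second stage (standing in for the NIC) finishes the sum over the UDP header and payload. The function names sum_bytes, fold, pseudo_sum and udp_finish are invented for illustration. It also shows why a receiver sees 0 when checksumming is switched off on the sender: in UDP over IPv4 a checksum field of 0 means "no checksum computed", and a computed value of 0 is transmitted as 0xffff instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add a byte range to a running one's-complement sum of big-endian 16-bit words. */
uint32_t
sum_bytes(uint32_t sum, const uint8_t *p, size_t len)
{
    assert(len <= 0xffff);      /* keeps the 32-bit accumulator from overflowing */
    while (len > 1) {
        sum += (uint32_t)p[0] << 8 | p[1];
        p += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)p[0] << 8;     /* odd trailing byte is padded with zero */
    return (sum);
}

/* Fold the carries back into the low 16 bits. */
uint16_t
fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return ((uint16_t)sum);
}

/* "Software" stage: sum of the pseudo-header (src, dst, protocol, UDP length). */
uint32_t
pseudo_sum(uint32_t src, uint32_t dst, uint16_t udp_len)
{
    uint32_t sum = 0;

    sum += (src >> 16) + (src & 0xffff);
    sum += (dst >> 16) + (dst & 0xffff);
    sum += 17;                  /* IPPROTO_UDP */
    sum += udp_len;
    return (sum);
}

/*
 * "Hardware" stage: continue from the pseudo-header sum over the UDP header
 * and payload. The checksum field inside udp[] must be zero on entry.
 */
uint16_t
udp_finish(uint32_t pseudo, const uint8_t *udp, size_t len, int checksum_enabled)
{
    uint16_t c;

    if (!checksum_enabled)
        return (0);             /* 0 on the wire means "no checksum" */
    c = (uint16_t)~fold(sum_bytes(pseudo, udp, len));
    return (c == 0 ? 0xffff : c);       /* a computed 0 is sent as 0xffff */
}
```

A receiver verifies by summing the pseudo-header and the whole datagram, checksum field included; a correct packet folds to 0xffff.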
Re: What's the deal with hardware checksum and net.inet.udp.checksum?
Ah, thanks,
George
igb doesn't compile in STABLE?
Howdy,

As of today, this afternoon, I see the following:

linking kernel.debug
e1000_api.o(.text+0xad9): In function `e1000_setup_init_funcs':
../../../dev/em/e1000_api.c:343: undefined reference to `e1000_init_function_pointers_80003es2lan'
e1000_api.o(.text+0xae8):../../../dev/em/e1000_api.c:340: undefined reference to `e1000_init_function_pointers_82571'
e1000_api.o(.text+0xafa):../../../dev/em/e1000_api.c:334: undefined reference to `e1000_init_function_pointers_82541'
e1000_api.o(.text+0xb0c):../../../dev/em/e1000_api.c:328: undefined reference to `e1000_init_function_pointers_82540'
e1000_api.o(.text+0xb1e):../../../dev/em/e1000_api.c:321: undefined reference to `e1000_init_function_pointers_82543'
e1000_api.o(.text+0xb30):../../../dev/em/e1000_api.c:316: undefined reference to `e1000_init_function_pointers_82542'
e1000_ich8lan.o(.text+0x98c): In function `e1000_valid_nvm_bank_detect_ich8lan':
../../../dev/em/e1000_ich8lan.c:1032: undefined reference to `e1000_translate_register_82542'
e1000_ich8lan.o(.text+0xc32): In function `e1000_acquire_swflag_ich8lan':
../../../dev/em/e1000_ich8lan.c:424: undefined reference to `e1000_translate_register_82542'
e1000_ich8lan.o(.text+0xc6e):../../../dev/em/e1000_ich8lan.c:426: undefined reference to `e1000_translate_register_82542'
e1000_ich8lan.o(.text+0xc9d):../../../dev/em/e1000_ich8lan.c:422: undefined reference to `e1000_translate_register_82542'
e1000_ich8lan.o(.text+0xced):../../../dev/em/e1000_ich8lan.c:436: undefined reference to `e1000_translate_register_82542'
e1000_ich8lan.o(.text+0x16bf):../../../dev/em/e1000_ich8lan.c:2700: more undefined references to `e1000_translate_register_82542' follow
*** Error code 1

Thoughts?

Later,
George
Re: igb doesn't compile in STABLE?
At Mon, 14 Jul 2008 14:53:16 -0700, Jack Vogel wrote:
>
> Just guessing, did someone change conf/files maybe??
>

If you build a STABLE kernel with igb AND em then things work and the
kernel uses em. I'm not sure which thing needs to be changed in
conf/files or otherwise though.

Later,
George
Re: igb doesn't compile in STABLE?
At Tue, 15 Jul 2008 10:07:22 -0700, Jack Vogel wrote:
>
> Oh, so the problem is if igb alone is defined?
>

Yes.

Best,
George
Re: igb doesn't compile in STABLE?
At Tue, 15 Jul 2008 10:35:57 -0700, Jack Vogel wrote:
>
> OK, will put on my todo list :)
>

Thanks. A kernel built that way (i.e. with igb and em) does actually
work, which is good, but if you're going to split them up we should
get this right before 7.1.

Best,
George
Re: moving sockbuf in to its own header
At Sun, 20 Jul 2008 16:07:29 -0700, Kip Macy wrote:
>
> Actually, I'd like to re-factor multiple parts of socketvar in to
> separate files.
>
> Please provide feedback on the following:
>
> http://www.fsmware.com/socketvar_refactor.diff
>

Looks good to me.

Best,
George
Re: HEADS UP: E1000 networking changes in STABLE/7.1 RELEASE
Hi Jack,

Thanks for this and for the concise pciconf line. We use em (soon to
be igb) interfaces extensively at work.

Best,
George
Small patch to multicast code...
Hi,

Turns out there is a bug in the code that loops back multicast
packets. If the underlying device driver supports checksum offloading
then the packet that is looped back, when it is transmitted on the
wire, is incorrect, due to the fact that the packet is not fully
copied.

Here is a patch. Comments welcome.

Best,
George

Index: ip_output.c
===
--- ip_output.c (revision 181731)
+++ ip_output.c (working copy)
@@ -1135,7 +1135,7 @@
         register struct ip *ip;
         struct mbuf *copym;
 
-        copym = m_copy(m, 0, M_COPYALL);
+        copym = m_dup(m, M_DONTWAIT);
         if (copym != NULL && (copym->m_flags & M_EXT || copym->m_len < hlen))
                 copym = m_pullup(copym, hlen);
         if (copym != NULL) {
Re: Small patch to multicast code...
At Thu, 21 Aug 2008 22:35:19 +0200, Luigi Rizzo wrote:
>
> On Thu, Aug 21, 2008 at 03:11:56PM -0400, [EMAIL PROTECTED] wrote:
> > Hi,
> >
> > Turns out there is a bug in the code that loops back multicast
> > packets. If the underlying device driver supports checksum offloading
> > then the packet that is looped back, when it is transmitted on the
> > wire, is incorrect, due to the fact that the packet is not fully
> > copied.
> >
> > Here is a patch. Comments welcome.
> >
> > Best,
> > George
> >
> > Index: ip_output.c
> > ===
> > --- ip_output.c (revision 181731)
> > +++ ip_output.c (working copy)
> > @@ -1135,7 +1135,7 @@
> >          register struct ip *ip;
> >          struct mbuf *copym;
> >
> > -        copym = m_copy(m, 0, M_COPYALL);
> > +        copym = m_dup(m, M_DONTWAIT);
> >          if (copym != NULL && (copym->m_flags & M_EXT || copym->m_len < hlen))
> >                  copym = m_pullup(copym, hlen);
> >          if (copym != NULL) {
>
> I am slightly puzzled -- what is exactly the problem, i.e. what part
> of the packet on the wire is incorrect ? The IP header is within hlen so
> the m_pullup() should be enough to leave the original content intact.
>
> The only thing i can think of is that it's the UDP checksum,
> residing beyond hlen, which is overwritten somewhere in the
> call to if_simloop -- in which case perhaps a better fix is
> to m_pullup() the udp header as well ?

It is the checksum that gets trashed, yes.

> (in any case, it is worthwhile to add a comment to explain
> what should be done -- the code paths using m_*() have become
> quite fragile with these hw support enhancements that now
> require selective modifications on previously shared, readonly buffers).

The m_*() routines actually have reasonable comments, it just seems
the wrong one was used here.

Best,
George
Re: Small patch to multicast code...
At Fri, 22 Aug 2008 03:27:11 +0100, Bruce M. Simpson wrote:
>
> [EMAIL PROTECTED] wrote:
> >> The only thing i can think of is that it's the UDP checksum,
> >> residing beyond hlen, which is overwritten somewhere in the
> >> call to if_simloop -- in which case perhaps a better fix is
> >> to m_pullup() the udp header as well ?
> >>
> >
> > It is the checksum that gets trashed, yes.
> > ...
> > The m_*() routines actually have reasonable comments, it just seems
> > the wrong one was used here.
> >
>
> Actually, m_copy() has been legacy for some time now -- see comments.
>
> I'd be concerned that the change to m_dup() (which makes a full mbuf
> chain copy) rather than m_copym() (which bumps refcounts) is going to
> eat into the mbuf clusters on fast links, though it's an easy band-aid
> for the problem.

I gather you mean that a fast link on which also we're looping back
the packet will be an issue? Since this packet is only going into the
simloop() routine.

> I agree with Luigi that some of the API contract for mbuf(9) doesn't
> hold any more now that we have TSO and other offload.

I was actually hoping, as the person who last hacked this code, that
you might have a suggestion as to a "right" fix.

Best,
George
Re: Small patch to multicast code...
At Fri, 22 Aug 2008 21:42:00 +0200, Luigi Rizzo wrote:
>
> On Fri, Aug 22, 2008 at 07:43:03PM +0100, Bruce M. Simpson wrote:
> > [EMAIL PROTECTED] wrote:
> > >I gather you mean that a fast link on which also we're looping back
> > >the packet will be an issue? Since this packet is only going into the
> > >simloop() routine.
> > >
> >
> > We end up calling if_simloop() from a few "interesting" places, in
> > particular the kernel PIM packet handler.
> >
> > In this particular case we're going to take a full mbuf chain copy every
> > time we send a packet which needs to be looped back to userland.
> ...
> > In the case of ip_mloopback(), somehow we are stomping on a read-only
> > copy of an mbuf chain. The use of m_copy() with m_pullup() there is fine
> > according to the documented uses of mbuf(9), although as Luigi pointed
> > out, most likely we need to look at the upper-layer protocol too, e.g.
> > where UDP checksums are also being offloaded.
>
> in fact, george, if you have an easy way to reproduce the error,
> could you see if reverting your change and instead adding
> sizeof(struct udphdr) to the length argument in the call to m_pullup()
> fixes the problem ?

I don't have sample code I can give but it's simple to set up and
test. On machine A set up a sender and a listener for the same
multicast group/port. On machine B set up a listener. Send from A
with the listener on. B should see nothing and its "bad checksums"
counter should increase. Turn off listener on A. Send again, B
should get the packet.

If you listen to the traffic with tcpdump on a 3rd machine you'll see
that the checksum is constant, even if the data in the packet, like
the ports, is not.

Your ethernet cards have to have hardware checksum offloading. I'm
using em/igb in 7-STABLE.

Best,
George
Re: Small patch to multicast code...
At Fri, 22 Aug 2008 19:43:03 +0100, Bruce M. Simpson wrote:
>
> We end up calling if_simloop() from a few "interesting" places, in
> particular the kernel PIM packet handler.
>
> In this particular case we're going to take a full mbuf chain copy every
> time we send a packet which needs to be looped back to userland.

Right, I know the penalty.

> It's been a while since I've done any in-depth FreeBSD work other
> than hacking on the IGMPv3 snap, and my time is largely tied up with
> other work these days, sadly.
>
> It doesn't seem right to my mind that we need to make a full copy of
> an mbuf chain with m_dup() to workaround this kind of problem.
>
> Whilst it may suffice for a band-aid workaround, we may see mbuf
> pool fragmentation as packet rates go up.
>
> However we are now in a "new world order" where mbuf chains may be
> very tied to the device where they've originated or to where they're
> going. It isn't clear to me where this kind of intrusion is
> happening.
>
> In the case of ip_mloopback(), somehow we are stomping on a
> read-only copy of an mbuf chain. The use of m_copy() with m_pullup()
> there is fine according to the documented uses of mbuf(9), although
> as Luigi pointed out, most likely we need to look at the upper-layer
> protocol too, e.g. where UDP checksums are also being offloaded.
>
> Some of the code in the IGMPv3 branch actually reworks how loopback
> happens i.e. the preference is not to loop back wherever possible
> because of the locking implications. Check the bms_netdev branch
> history for more info.

Well, what I suspect is the problem are these bits:

udp_output():

        /*
         * Set up checksum and output datagram.
         */
        if (udp_cksum) {
                if (inp->inp_flags & INP_ONESBCAST)
                        faddr.s_addr = INADDR_BROADCAST;
                ui->ui_sum = in_pseudo(ui->ui_src.s_addr, faddr.s_addr,
                    htons((u_short)len + sizeof(struct udphdr) + IPPROTO_UDP));
                m->m_pkthdr.csum_flags = CSUM_UDP;
                m->m_pkthdr.csum_data = offsetof(struct udphdr, uh_sum);
        } else

ip_mloopback():

        copym = m_copy(m, 0, M_COPYALL);
        if (copym != NULL && (copym->m_flags & M_EXT || copym->m_len < hlen))
                copym = m_pullup(copym, hlen);
        if (copym != NULL) {
                /* If needed, compute the checksum and mark it as valid. */
                if (copym->m_pkthdr.csum_flags & CSUM_DELAY_DATA) {
                        in_delayed_cksum(copym);
                        copym->m_pkthdr.csum_flags &= ~CSUM_DELAY_DATA;
                        copym->m_pkthdr.csum_flags |=
                            CSUM_DATA_VALID | CSUM_PSEUDO_HDR;
                        copym->m_pkthdr.csum_data = 0x;
                }

and:

in_delayed_cksum(struct mbuf *m)
{
        struct ip *ip;
        u_short csum, offset;

        ip = mtod(m, struct ip *);
        offset = ip->ip_hl << 2 ;
        csum = in_cksum_skip(m, ip->ip_len, offset);
        if (m->m_pkthdr.csum_flags & CSUM_UDP && csum == 0)
                csum = 0x;
        offset += m->m_pkthdr.csum_data;        /* checksum offset */

Somehow the data that the device needs to do the proper checksum
offload is getting trashed here. Now, since it's clear we need a
writable packet structure so that we don't trash the original, I'm
wondering if the m_pullup() will be sufficient.

Best,
George
Re: Small patch to multicast code...
At Fri, 22 Aug 2008 22:43:39 +0100, Bruce M. Simpson wrote:
>
> [EMAIL PROTECTED] wrote:
> > Somehow the data that the device needs to do the proper checksum
> > offload is getting trashed here. Now, since it's clear we need a
> > writable packet structure so that we don't trash the original, I'm
> > wondering if the m_pullup() will be sufficient.
> >
>
> If it's serious enough to break UDP checksumming on the wire, perhaps we
> should just swallow the mbuf allocator heap churn and do the m_dup() for
> now, but slap in a big comment about why it's there.

I think if none of us finds a better way before early next week
that's what I'll do so that this at least works in 7.1.

Best,
George
Re: Small patch to multicast code...
At Tue, 26 Aug 2008 14:50:33 + (UTC), Bjoern A. Zeeb wrote:
>
> On Tue, 26 Aug 2008, George V. Neville-Neil wrote:
>
> Hi,
>
> > At Mon, 25 Aug 2008 21:40:38 +0200,
> > John Hay wrote:
> >>
> >> I have tried it and it does fix my problem. RIP2 over multicast works
> >> again. :-)
> >
> > Good to hear. I'm waiting on a bit more feedback but I think I'll be
> > checking this in soon, with a big comment talking about the
> > performance implications etc.
>
> So wait a second; what was the m_pullup vs. m_dup thing? Has anyone
> actually tried that? I mean using a sledgehammer if a mitten would be
> enough is kind of .. uhm. You get it.

Perhaps I'm confused, I've been off dealing with other issues for a
few days, but m_pullup doesn't make a copy of the packet or its
fields, only makes sure that it's contiguous in memory. Am I wrong in
that?

Since the bug is that two pieces of code modify the same data, in ways
that interfere, I'm not sure how we can avoid making a copy. It might
be nice to limit the copy, but we'd still need two copies, one for the
loopback device and one for the real device.

Best,
George
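
The "two pieces of code modify the same data" problem can be shown with a small userland model. This is a sketch only, not the mbuf(9) API: struct buf and struct pkt are hypothetical stand-ins for a cluster and an mbuf that points at it, share_copy() mimics a reference-counted copy in the spirit of m_copy(), and deep_copy() mimics a full duplicate in the spirit of m_dup().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct buf {                    /* hypothetical stand-in for an mbuf cluster */
    int refs;
    size_t len;
    unsigned char data[64];
};

struct pkt {                    /* hypothetical stand-in for an mbuf using external storage */
    struct buf *ext;
};

/* Allocate a packet that owns a fresh buffer holding a copy of src. */
struct pkt
pkt_new(const void *src, size_t len)
{
    struct pkt p;

    assert(len <= sizeof(p.ext->data));
    p.ext = calloc(1, sizeof(*p.ext));
    if (p.ext != NULL) {
        p.ext->refs = 1;
        p.ext->len = len;
        memcpy(p.ext->data, src, len);
    }
    return (p);
}

/* Shallow copy: bump the reference count, both packets see the same bytes. */
struct pkt
share_copy(struct pkt *p)
{
    struct pkt c;

    p->ext->refs++;
    c.ext = p->ext;
    return (c);
}

/* Deep copy: private storage, so writes never reach the original. */
struct pkt
deep_copy(const struct pkt *p)
{
    struct pkt c;

    c.ext = malloc(sizeof(*c.ext));
    if (c.ext != NULL) {
        memcpy(c.ext, p->ext, sizeof(*c.ext));
        c.ext->refs = 1;
    }
    return (c);
}

/* Drop one reference and release the storage when the last one goes. */
void
pkt_free(struct pkt *p)
{
    if (p->ext != NULL && --p->ext->refs == 0)
        free(p->ext);
    p->ext = NULL;
}
```

Writing a checksum through the shared copy changes what the original holder will transmit, which is the failure seen on the wire; the deep copy avoids that at the cost of an extra allocation and copy per looped-back packet.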
Re: Small patch to multicast code...
At Tue, 26 Aug 2008 17:56:13 -0700, Sam Leffler wrote:
>
> [EMAIL PROTECTED] wrote:
> > At Tue, 26 Aug 2008 14:50:33 + (UTC),
> > Bjoern A. Zeeb wrote:
> >
> >> On Tue, 26 Aug 2008, George V. Neville-Neil wrote:
> >>
> >> Hi,
> >>
> >>
> >>> At Mon, 25 Aug 2008 21:40:38 +0200,
> >>> John Hay wrote:
> >>>
> >>>> I have tried it and it does fix my problem. RIP2 over multicast works
> >>>> again. :-)
> >>>>
> >>> Good to hear. I'm waiting on a bit more feedback but I think I'll be
> >>> checking this in soon, with a big comment talking about the
> >>> performance implications etc.
> >>>
> >> So wait a second; what was the m_pullup vs. m_dup thing? Has anyone
> >> actually tried that? I mean using a sledgehammer if a mitten would be
> >> enough is kind of .. uhm. You get it.
> >>
> >
> > Perhaps I'm confused, I've been off dealing with other issues for a
> > few days, but m_pullup doesn't make a copy of the packet or its
> > fields, only makes sure that it's contiguous in memory. Am I wrong in that?
> >
> > Since the bug is that two pieces of code modify the same data, in ways
> > that interfere, I'm not sure how we can avoid making a copy. It might
> > be nice to limit the copy, but we'd still need two copies, one for the
> > loopback device and one for the real device.
> >
> >
> pull the headers up. copy just the headers. no deep copy.
>

I'm confused, if it's these lines that are screwed up:

        /* If needed, compute the checksum and mark it as valid. */
        if (copym->m_pkthdr.csum_flags & CSUM_DELAY_DATA) {
                in_delayed_cksum(copym);
                copym->m_pkthdr.csum_flags &= ~CSUM_DELAY_DATA;
                copym->m_pkthdr.csum_flags |=
                    CSUM_DATA_VALID | CSUM_PSEUDO_HDR;
                copym->m_pkthdr.csum_data = 0x;

in particular that last line, then how does pulling up the header
help? That's not part of the packet, that's the checksum data in the
pkthdr itself.

Best,
George
Re: Small patch to multicast code...
At Fri, 29 Aug 2008 18:28:53 +0200, Luigi Rizzo wrote:
>
> and to be more explicit - the result of m_pullup is that
> the number of bytes specified as m_pullup argument are in
> a private piece of memory -- the 'data' region within the mbuf -- so
> you can freely play with them without trouble.
>
> That is why i suggested to just increase the argument to m_pullup
> by the size of the udp header so one can overwrite the checksum
> within the mbuf without touching the shared part in the cluster
> (if any).

I tried various versions of that, but then I noticed that I also had
to save out the pkthdr structure as well. Did you come up with a
faster workable patch? For now I'm going to commit the patch I sent
originally.

Best,
George
Proposed patch, convert IFQ_MAXLEN to kernel tunable...
Hi,

It turns out that the last time anyone looked at this constant was
before 1994 and it's very likely time to turn it into a kernel
tunable.  On hosts that have a high rate of packet transmission
packets can be dropped at the interface queue because this value is
too small.  Rather than make a sweeping code change I propose the
following change to the macro and updating a couple of places in the
IP and IPv6 stacks that were using this macro to set their own global
variables.

I have tested this in my test lab at work, it is not as yet in
production at my day job, but will be soon.

Best,
George

Index: netinet/ip_input.c
===================================================================
--- netinet/ip_input.c	(revision 183299)
+++ netinet/ip_input.c	(working copy)
@@ -133,7 +133,6 @@
 struct pfil_head inet_pfil_hook;	/* Packet filter hooks */
 
 static struct ifqueue ipintrq;
-static int ipqmaxlen = IFQ_MAXLEN;
 
 extern struct domain inetdomain;
 extern struct protosw inetsw[];
@@ -265,7 +264,7 @@
 
 	/* Initialize various other remaining things. */
 	ip_id = time_second & 0xffff;
-	ipintrq.ifq_maxlen = ipqmaxlen;
+	ipintrq.ifq_maxlen = IFQ_MAXLEN;
 	mtx_init(&ipintrq.ifq_mtx, "ip_inq", NULL, MTX_DEF);
 	netisr_register(NETISR_IP, ip_input, &ipintrq, NETISR_MPSAFE);
 }
Index: net/if.c
===================================================================
--- net/if.c	(revision 183299)
+++ net/if.c	(working copy)
@@ -135,7 +135,14 @@
 #endif
 
 int	if_index = 0;
-int	ifqmaxlen = IFQ_MAXLEN;
+
+int ifqmaxlen = 50;
+TUNABLE_INT("net.ifqmaxlen", &ifqmaxlen);
+
+SYSCTL_INT(_net, OID_AUTO, ifqmaxlen, CTLFLAG_RD,
+	   &ifqmaxlen, 0,
+	   "interface queue length");
+
 struct	ifnethead ifnet;	/* depend on static init XXX */
 struct	ifgrouphead ifg_head;
 struct	mtx ifnet_lock;
Index: net/if.h
===================================================================
--- net/if.h	(revision 183299)
+++ net/if.h	(working copy)
@@ -221,7 +221,7 @@
 #define	IFCAP_WOL	(IFCAP_WOL_UCAST | IFCAP_WOL_MCAST | IFCAP_WOL_MAGIC)
 #define	IFCAP_TOE	(IFCAP_TOE4 | IFCAP_TOE6)
 
-#define	IFQ_MAXLEN	50
+#define	IFQ_MAXLEN	ifqmaxlen
 #define	IFNET_SLOWHZ	1		/* granularity is 1 second */
 
 /*
Index: netinet6/ip6_input.c
===================================================================
--- netinet6/ip6_input.c	(revision 183299)
+++ netinet6/ip6_input.c	(working copy)
@@ -115,7 +115,6 @@
 u_char ip6_protox[IPPROTO_MAX];
 
 static struct ifqueue ip6intrq;
-static int ip6qmaxlen = IFQ_MAXLEN;
 
 struct in6_ifaddr *in6_ifaddr;
 extern struct callout in6_tmpaddrtimer_ch;
@@ -178,7 +177,7 @@
 		printf("%s: WARNING: unable to register pfil hook, "
 			"error %d\n", __func__, i);
 
-	ip6intrq.ifq_maxlen = ip6qmaxlen;
+	ip6intrq.ifq_maxlen = IFQ_MAXLEN;
 	mtx_init(&ip6intrq.ifq_mtx, "ip6_inq", NULL, MTX_DEF);
 	netisr_register(NETISR_IPV6, ip6_input, &ip6intrq, 0);
 	scope6_init();
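The motivation for the patch above (packets dropped at the interface queue when the limit is too small for the offered load) can be illustrated with a toy model. This is only a sketch in Python, not kernel code: `IfQueue`, `run_burst`, and the burst and drain numbers are hypothetical names and values chosen for illustration, and the real queueing macros behave differently in detail.

```python
from collections import deque


class IfQueue:
    """Toy model of a fixed-length interface queue with tail drop."""

    def __init__(self, maxlen):
        self.maxlen = maxlen
        self.packets = deque()
        self.drops = 0

    def enqueue(self, pkt):
        # A packet that arrives while the queue is full is dropped.
        if len(self.packets) >= self.maxlen:
            self.drops += 1
            return False
        self.packets.append(pkt)
        return True

    def dequeue(self):
        return self.packets.popleft() if self.packets else None


def run_burst(maxlen, burst, drained_per_tick, ticks):
    """Offer `burst` packets per tick, then let the driver drain up to
    `drained_per_tick` packets, and return the total number of drops."""
    q = IfQueue(maxlen)
    for _ in range(ticks):
        for n in range(burst):
            q.enqueue(n)
        for _ in range(drained_per_tick):
            q.dequeue()
    return q.drops


# Same average load, two queue limits (illustrative numbers only).
drops_50 = run_burst(maxlen=50, burst=200, drained_per_tick=200, ticks=10)
drops_512 = run_burst(maxlen=512, burst=200, drained_per_tick=200, ticks=10)
```

In this model the driver keeps up on average in both runs; only the queue limit decides whether a burst is absorbed or partly dropped.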
Re: Proposed patch, convert IFQ_MAXLEN to kernel tunable...
At Wed, 24 Sep 2008 00:17:18 +0400, Ruslan Ermilov wrote:
>
> Hi,
>
> On Tue, Sep 23, 2008 at 03:29:06PM -0400, [EMAIL PROTECTED] wrote:
> > It turns out that the last time anyone looked at this constant was
> > before 1994 and it's very likely time to turn it into a kernel
> > tunable.  On hosts that have a high rate of packet transmission
> > packets can be dropped at the interface queue because this value is
> > too small.  Rather than make a sweeping code change I propose the
> > following change to the macro and updating a couple of places in the
> > IP and IPv6 stacks that were using this macro to set their own global
> > variables.
> >
> > I have tested this in my test lab at work, it is not as yet in
> > production at my day job, but will be soon.
> >
> It's not that bad -- most modern Ethernet drivers initialize interface
> input queues themselves, and don't depend on IFQ_MAXLEN.  The IPv4
> input queue is tunable via net.inet.ip.intr_queue_maxlen.  The IPv6
> queue can similarly be made tunable.  I agree that ifqmaxlen can be
> made tunable because there's still a lot of (mostly for old hardware)
> drivers that use ifqmaxlen and IFQ_MAXLEN, but I'm against changing
> the definition of IFQ_MAXLEN.  Imagine some code like this:
>

Sorry, this is about the output queue, not the input queue.  Though
there are both input and output queues that depend on this.

> void *x[IFQ_MAXLEN];	// here it's 50
>
> And some function that does:
>
> for (i = 0; i < IFQ_MAXLEN; i++) {	// not necessarily 50
> 	x[i] = NULL;
> }
>

I found no occurrences of the above in our code base.  I used cscope
to search all of src/sys.  Are you aware of any occurrences of this?

Best,
George
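The concern Ruslan raises can be modelled outside the kernel. The following is a rough Python analogy, not FreeBSD code: `BUILD_TIME_LEN`, `ifq_maxlen()`, `clear_slots()` and `overruns()` are hypothetical names, and the list stands in for an array whose size was fixed while the macro still meant 50. Python raises `IndexError` on the overrun; C performs no such bounds check, which is what makes the pattern dangerous there.

```python
BUILD_TIME_LEN = 50          # what IFQ_MAXLEN meant when x was sized

x = [None] * BUILD_TIME_LEN  # stands in for: void *x[IFQ_MAXLEN];

ifqmaxlen = 50               # the proposed run-time variable


def ifq_maxlen():
    """Stands in for the redefined macro: IFQ_MAXLEN -> ifqmaxlen."""
    return ifqmaxlen


def clear_slots(slots):
    # Stands in for: for (i = 0; i < IFQ_MAXLEN; i++) x[i] = NULL;
    for i in range(ifq_maxlen()):
        slots[i] = None


def overruns(new_len):
    """Set the tunable, then report whether clearing walks off the end of x."""
    global ifqmaxlen
    ifqmaxlen = new_len
    try:
        clear_slots(x)
    except IndexError:
        return True
    return False
```

With the tunable left at 50 the loop stays in bounds; raised to 128 it does not. A smaller value stays in bounds but leaves part of the array untouched. As George notes above, he found no code of this shape in src/sys, so this only shows what the objection is about.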
Re: Proposed patch, convert IFQ_MAXLEN to kernel tunable...
At Wed, 24 Sep 2008 15:50:32 +0100, Bruce M. Simpson wrote:
>
> Hi,
>
> I agree with the intent of the change that IPv4 and IPv6 input queues
> should have a tunable queue length. However, the change provided is
> going to make the definition of IFQ_MAXLEN global and dependent upon a
> variable.
>
> [EMAIL PROTECTED] wrote:
> > Hi,
> >
> > It turns out that the last time anyone looked at this constant was
> > before 1994 and it's very likely time to turn it into a kernel
> > tunable.  On hosts that have a high rate of packet transmission
> > packets can be dropped at the interface queue because this value is
> > too small.  Rather than make a sweeping code change I propose the
> > following change to the macro and updating a couple of places in the
> > IP and IPv6 stacks that were using this macro to set their own global
> > variables.
> >
>
> This isn't appropriate for many uses of ifq's which might be internal to
> a given driver or subsystem, and which may use IFQ_MAXLEN for
> convenience, as Ruslan has pointed out. I have code elsewhere which does
> this.
>
> Can you please do this on a per-protocol stack basis? i.e. give IPv4 and
> IPv6 their own TUNABLE queue length.
>

Actually what we'd need is N of these, since my target is actually the
send queue, not the input queue.  Let me look at this some more.

Best,
George
Re: Proposed patch, convert IFQ_MAXLEN to kernel tunable...
At Wed, 24 Sep 2008 12:53:31 -0700, John-Mark Gurney wrote:
>
> George V. Neville-Neil wrote this message on Tue, Sep 23, 2008 at 15:29 -0400:
> > It turns out that the last time anyone looked at this constant was
> > before 1994 and it's very likely time to turn it into a kernel
> > tunable.  On hosts that have a high rate of packet transmission
> > packets can be dropped at the interface queue because this value is
> > too small.  Rather than make a sweeping code change I propose the
> > following change to the macro and updating a couple of places in the
> > IP and IPv6 stacks that were using this macro to set their own global
> > variables.
>
> The better solution is to resurrect rwatson's patch that eliminates the
> interface queue, and does direct dispatch to the ethernet driver..
> Usually the driver has a queue of 512 or more packets already, so putting
> them into a second queue doesn't provide much benefit besides increasing
> the amount of locking necessary to deliver packets...

Actually I am making this change because I found on 10G hardware the
queue is too small.  Also, there are many systems where you might want
to up this, usually ones that are highly biased towards transmit only,
like a multicast repeater of some sort.

Best,
George
Re: if_bridge.ko requires INET6...
At Sat, 4 Feb 2006 16:16:49 +0100, Max Laier wrote:
> Here it is.  I'd appreciate feedback.  pflog_packet() uses a lot of complex
> types which makes it necessary to include pfvar.h.  This is ugly, but I don't
> know how to work around this.

I gave this a quick read and it looked OK to me.

Later,
George
Re: Zeroing wrong union member in in6_control()
At Tue, 07 Feb 2006 11:38:09 -0500, James Juran wrote:
>
> In what looks like a copy&paste remnant from the preceding case, the
> wrong union member is used as the first argument to bzero in
> in6_control().  This doesn't cause an actual bug, but making this change
> would improve code clarity and robustness to change and also avoids a
> warning from a certain static analysis tool.
>
> I'm not a regular FreeBSD contributor, so if this patch is worthwhile
> can someone please apply it?  If I should send things like this to a
> different mailing list in the future, please let me know.

The Kame list is still the best one for this (kame <[EMAIL PROTECTED]>)
but I've forwarded it for you.  I'll take care of getting this into
FreeBSD though.

Thanks,
George
Separating the kernel and user land versions of PF_KEY.
Hi Folks,

The attached patch makes it so that the user land and kernel land
versions of the pf_key structures are different and therefore no
longer dependent.  This is one step in moving us away from the place
we're in now where changes to one side require changes in the other.
At some point soon a more full overhaul of the code will take place,
likely along the lines of the code found in OpenBSD (look at
net/pfkey*.[ch] there).

Please send feedback etc. on this patch along to me.

BTW Although this patch contains a lot of p4 cruft it applies cleanly
against HEAD, at least as of a few days ago and passes the CT test
suite ipsec4 which is available by installing the ct port.

Thanks,
George

Change 89716 by [EMAIL PROTECTED] on 2006/01/15 02:41:52

	First cut at removing PF_KEY data structures from the keydb.
	This code does not work completely yet but needs to be saved.

Affected files ...

... //depot/projects/gnn_fast_ipsec/src/sys/netipsec/ipsec.c#2 edit
... //depot/projects/gnn_fast_ipsec/src/sys/netipsec/key.c#2 edit
... //depot/projects/gnn_fast_ipsec/src/sys/netipsec/key_var.h#2 edit
... //depot/projects/gnn_fast_ipsec/src/sys/netipsec/keydb.h#2 edit
... //depot/projects/gnn_fast_ipsec/src/sys/netipsec/xform_ah.c#2 edit
... //depot/projects/gnn_fast_ipsec/src/sys/netipsec/xform_esp.c#2 edit
... //depot/projects/gnn_fast_ipsec/src/sys/netipsec/xform_tcp.c#2 edit

Differences ...

==== //depot/projects/gnn_fast_ipsec/src/sys/netipsec/ipsec.c#2 (text+ko) ====

Index: sys/netipsec/ipsec.c
--- sys/netipsec/ipsec.c.~1~	Sun Feb  5 15:06:16 2006
+++ sys/netipsec/ipsec.c	Sun Feb  5 15:06:16 2006
@@ -92,6 +92,7 @@
 
 #include 
 
+#define IPSEC_DEBUG
 #ifdef IPSEC_DEBUG
 int ipsec_debug = 1;
 #else

==== //depot/projects/gnn_fast_ipsec/src/sys/netipsec/key.c#2 (text+ko) ====

Index: sys/netipsec/key.c
--- sys/netipsec/key.c.~1~	Sun Feb  5 15:06:16 2006
+++ sys/netipsec/key.c	Sun Feb  5 15:06:16 2006
@@ -420,7 +420,10 @@
 static struct mbuf *key_setsadbxsa2 __P((u_int8_t, u_int32_t, u_int32_t));
 static struct mbuf *key_setsadbxpolicy __P((u_int16_t, u_int8_t,
 	u_int32_t));
-static void *key_dup(const void *, u_int, struct malloc_type *);
+static struct seckey *key_dup_keymsg(const struct sadb_key *, u_int,
+				     struct malloc_type *);
+static struct seclifetime *key_dup_lifemsg(const struct sadb_lifetime *src,
+					    struct malloc_type *type);
 #ifdef INET6
 static int key_ismyaddr6 __P((struct sockaddr_in6 *));
 #endif
@@ -488,6 +491,10 @@
 static int key_senderror __P((struct socket *, struct mbuf *, int));
 static int key_validate_ext __P((const struct sadb_ext *, int));
 static int key_align __P((struct mbuf *, struct sadb_msghdr *));
+static struct mbuf *key_setlifetime(struct seclifetime *src,
+				     u_int16_t exttype);
+static struct mbuf *key_setkey(struct seckey *src, u_int16_t exttype);
+
 #if 0
 static const char *key_getfqdn __P((void));
 static const char *key_getuserfqdn __P((void));
@@ -909,8 +916,8 @@
 
 			/* What the best method is to compare ? */
 			if (key_preferred_oldsa) {
-				if (candidate->lft_c->sadb_lifetime_addtime >
-				    sav->lft_c->sadb_lifetime_addtime) {
+				if (candidate->lft_c->addtime >
+				    sav->lft_c->addtime) {
 					candidate = sav;
 				}
 				continue;
@@ -918,8 +925,8 @@
 			}
 
 			/* preferred new sa rather than old sa */
-			if (candidate->lft_c->sadb_lifetime_addtime <
-			    sav->lft_c->sadb_lifetime_addtime) {
+			if (candidate->lft_c->addtime <
+			    sav->lft_c->addtime) {
 				d = candidate;
 				candidate = sav;
 			} else
@@ -930,7 +937,7 @@
 			 * suitable candidate and the lifetime of the SA is not
 			 * permanent.
 			 */
-			if (d->lft_c->sadb_lifetime_addtime != 0) {
+			if (d->lft_c->addtime != 0) {
 				struct mbuf *m, *result;
 				u_int8_t satype;
 
@@ -2787,9 +2794,9 @@
 	} else {
 		KASSERT(sav->iv == NULL, ("iv but no xform"));
 		if (sav->key_auth != NULL)
-			bzero(_KEYBUF(sav->key_auth), _KEYLEN(sav->key_auth));
+			bzero(sav->key_auth->key_data, _KEYLEN(sav->key_auth));
 		if (sav->key_enc != NULL)
-			bzero(_KEYBUF(sav->key_enc), _KEYLEN(sav->key_enc));
+			bzero(sav->key_enc->key_data, _KEYLEN(sav->key_enc));
 	}
 	if (sav->key_auth != NULL) {
 		free(sav->key_auth, M_IPSEC_MISC);
@@ -3038,9 +3045,11 @@
Re: FAST_IPSEC and tunnelled packets processing
At Thu, 9 Mar 2006 15:53:03 +0100, VANHULLEBUS Yvan wrote:
>
> On Wed, Mar 08, 2006 at 08:02:36PM -0800, Sam Leffler wrote:
> [.]
> > If I recall the IPIP handling is different from KAME because there is
> > support for IPIP encapsulation independent of the IPsec protocols while
> > KAME only handles IPIP as part of the ESP tunnel configuration.  As to
> > overhead, in practice, at least back in 4.x where this work was
> > originally done, the netisr dispatch was effectively shortcircuited
> > because the dispatch was done from the netisr thread so the net cost was
> > a enqueue+dequeue of the packet.  I'm not sure about extraneous trips
> > through ip_input or not stripping headers; this stuff used to work right
> > but I've not looked at the code in years.
>
> There IS some code to remove the IPIP header, but it doesn't work.
>
> I just reported pr kern/94273 with a patch which solves it.
>

Bug taken by me :-)  I'll try your patch and commit as necessary.

Later,
George
Re: IP_SENDIF?
At Sun, 19 Mar 2006 21:34:19 -1000 (HST), Dave Cornejo wrote:
>
> Hi,
>
> Some time ago (Oct 2004) there was some talk of implementing
> IP_SENDIF, a search of the mailing list turns up nothing since then.
> Did anything ever happen with this?
>

No, but if you have a patch we're up for reviewing it ;-)  It remains
on a long list of things to do.

Later,
George
Re: IPv6 raw socket to send original udp
At Mon, 08 May 2006 05:44:51 +0900 (JST), Hideki Yamamoto wrote:
>
>
> Hi,
>
> I tried to use pf as a traffic shaper for a streaming server, but
> it does not work well.  Input of pf is bursted packets within around 20
> msec, but is not bursted packets within around 100 msec or longer.
> This traffic pattern is the feature of the streaming server.
>
> As pf does not work well, I am thinking of designing an original shaper
> command on a bridge-like freebsd box, and that the command will receive
> the server packet via libpcap, shape it and then send it constantly to
> another device.  To send packets from the bridge-like freebsd box, I plan
> to use a RAW IPV6 socket.  However in my small experiment, it does not
> seem good, the IP_HDRINCL option does not work.
>
> I wonder if IPv6 raw socket can be used only for ICMPv6.
> I would like to use IPv6 raw socket for original udp packet.
>
> Thanks in advance.
>

Hi,

I have trimmed the cc to just -net because I am concerned mostly about
the possibility of a bug in the networking code.

Can you provide more information on what you're seeing on the raw IPv6
socket?  If you could send a chunk of code, that might help as well.

Best,
George
Re: nd6_lookup prints bogus messages with point to point devices
At Tue, 23 May 2006 13:43:01 +0900, jinmei wrote:
> Thanks, please do so.  I believe the patch also fixes this problem
> report:
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/93220
>
> So, could you also confirm this and give feedback to (or close) the
> report?  (I'll send a follow-up message to the report by myself if
> it's appropriate).
>

Will do.  That was on my list to look at anyways.

Thanks,
George
Re: enc0 patch for ipsec
I knew there was something bothering me about enc, now I know what it
was.  I'm glad someone else caught this and that you fixed it.
Thanks.  I'll be testing the patch today.

Later,
George
Re: SCTP
At Fri, 30 Jun 2006 12:36:10 -0400, randall wrote:
>
> Hi all:
>
> The following link:
>
> http://www.sctp.org/cvs_diff_6_30.bz2
>
> Will get you a large patch that you can apply to Current that will
> add SCTP.
>
> It's a bzip2 patch file since it is so large :-D
>
> It includes the changes to a few base files.. and mainly it's the
> complete files diff'd against this morning's current cvs...
>
> Yes, I know that the build is broken in acpi/acpi_asus but the sctp
> code did compile and build a kernel for me... so once the above is
> fixed.. you should be able to use the patch and check it out :-D
>
> Oh, you will need to add
>
> option SCTP
>
> to your kernel conf... and it might not
> hurt to do a make sysent in sys/kern
>
> I will prepare a separate file for the overall libsctp.a
> once I figure out where it should go :-D
>
> Happy SCTPing.. and if you have any problems with the patch please
> send me an email :-D

And please start testing this because many of us want to integrate
this in the near future :-)

Thanks,
George
Re: SCTP
At Mon, 3 Jul 2006 09:48:06 +0200, Marcin Jessa wrote:
> > And please start testing this because many of us want to integrate
> > this in the near future :-)
>
> Any hints on how to test SCTP ?
> Not much really about any practical implementation of it
> on http://www.sctp.org/

One trivial toy to play with is NetPIPE, but that's just a bandwidth
tester.  It does show socket programming with SCTP, which is
relatively the same as TCP, until you get to the advanced features,
which NetPIPE doesn't cover.

You'll need my updated NetPIPE until the patches are committed to
their project:

http://www.freebsd.org/~gnn/netpipe.tar.gz

I suspect Randall has a better list of things to try.

Best,
George
Re: Possible inconsistency in the use of in6_delmulti()
At Tue, 18 Jul 2006 12:03:20 -0700, Tom Parker wrote:
>
> Hi,
>
> New to the list here, but fairly familiar with the innards of (at
> least an older) version of the fbsd networking code.  I'm fortunate in
> my ability to run purify on a simulated instance of our ported version
> of the networking code.  Purify has picked up a problem that I'm a bit
> mystified as how it can be fixed.  It is present in current versions
> also, I'm interested in any comments people have (I think ours is 4.4
> vintage, but it is hard to tell).
>
> As far as I can tell, in most calling paths when in6_delmulti() is
> called, it is done after the in6_multi_mship structure has been
> removed from the im6o_memberships list in the relevant PCB.  This
> applies to in6_ifdetach(), in6_pcbpurgeif0, ip6_setmoptions() etc.
> However in in6_purgeaddr() in6_delmulti is called straight off.  I'm
> not sure if we've violated some usage convention, but purify is
> telling me this causes access violations when we then leave the same
> group using setsockopt().  in6_purgeaddr is called when we remove the
> address from the interface.
>
> This should be possible in a real kernel.  Add a multicast address to
> an interface, open a socket and listen to the address, then remove the
> address from the interface.
>
> Am I missing something here or is this a nasty problem in both the
> kernel and our stack port?
>

It sounds like a bug to me.  Can you file a PR?

Thanks,
George
Packet Construction and Protocol Testing...
Hi,

Sorry for the length of this email but I figured I'd get this out
early in case there was anyone else who wanted to play with this.

I have now gotten out version 0.1 of the Packet Construction Set.
This is a set of Python libraries which make writing protocol testing
software much easier.  Of course, you have to know Python, but many
people do, and I favor it strongly over other scripting choices.  The
Summer of Code student I'm working with has also been using this
library, with favorable results.

The Source Forge page is here:

http://sourceforge.net/projects/pcs

and the shar files submitted to get the ports created are now on:

http://www.freebsd.org/~gnn/pcs.port.shar
http://www.freebsd.org/~gnn/py-pypcap.shar

The point of all this is to be able to write better protocol level
tests for our network stack.  Examples are in the scripts/ and tests/
directories of the package but a quick snippet may give a good idea of
what I'm getting at:

    def test_icmpv4_ping(self):
        ip = ipv4()
        ip.version = 4
        ip.hlen = 5
        ip.tos = 0
        ip.length = 84
        ip.id = 1
        ip.flags = 0
        ip.offset = 0
        ip.ttl = 33
        ip.protocol = IPPROTO_ICMP
        ip.src = 2130706433
        ip.dst = 2130706433

        icmp = icmpv4()
        icmp.type = 8
        icmp.code = 0
        icmp.cksum = 0

        echo = icmpv4echo()
        echo.id = 32767
        echo.seq = 1

        lo = localhost()
        lo.type = 2
        packet = Chain([lo, ip, icmp, echo])

        input = PcapConnector("lo0")
        input.setfilter("icmp")

        output = PcapConnector("lo0")
        assert (ip != None)

        out = output.write(packet.bytes, 88)
        assert (out == 88)

This code sends a quick and dirty ICMPv4 ping packet on localhost.
The point of all this is to be able to specify packets easily (see
pcs/packets/xxx.py) and then to treat the packet as an object.

I intend to write up a paper on this stuff as well.  There is
currently a simple manual (PDF and LaTeX) in the package.

Later,
George
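The literal numbers in the ping snippet can be checked with the standard library alone. This sketch does not use PCS. The payload size and the loopback header size are assumptions made here to explain 84 and 88 (the snippet itself does not state where they come from), and the constant names are hypothetical.

```python
import socket
import struct

# 2130706433 in the snippet is 127.0.0.1 written as a 32-bit integer.
loopback_as_int = struct.unpack("!I", socket.inet_aton("127.0.0.1"))[0]
loopback_as_text = socket.inet_ntoa(struct.pack("!I", 2130706433))

IP_HEADER_LEN = 5 * 4       # ip.hlen = 5 counts 32-bit words
ICMP_ECHO_HEADER_LEN = 8    # type, code, cksum, id, seq
PING_PAYLOAD_LEN = 56       # assumption: the usual ping(8) payload size
LOOPBACK_HEADER_LEN = 4     # assumption: 4-byte address-family header on lo0

# One reading of ip.length = 84 and of the 88 passed to output.write().
ip_total_length = IP_HEADER_LEN + ICMP_ECHO_HEADER_LEN + PING_PAYLOAD_LEN
frame_length = LOOPBACK_HEADER_LEN + ip_total_length
```

Under those assumptions the IP total length comes to 84 and the frame written to lo0 to 88, which matches the values used in the test.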
Re: Packet Construction and Protocol Testing...
At Thu, 20 Jul 2006 10:40:41 -0400, Chuck Swiger wrote:
> This strikes me as a pretty cool thing, thank you for putting the source out
> there...given a bit of free time, I'd like to at least test this, if not
> contribute. [1] :-)

Thanks :-)

> The port is missing a dependency on net/py-pcap, BTW, which makes most of the
> tests fail if one simply downloads the shar file and tries to run them:
>

For now I wanted to make them separate though the documentation points
out that you can't use the PCAP connector without py-pypcap.  I may
add the dependency in a future release.  Thanks for the patch!

> [1]: If I could only get net/py-pcap to build, I might be able to do a little
> more... :-)

You only need net/py-pypcap, but if that's what you meant please let
me know what the build problem is.

Later,
George
Re: Packet Construction and Protocol Testing...
At Thu, 20 Jul 2006 10:48:14 -0400 (EDT), Andrew R. Reiter wrote:
>
>
> Aren't there already tools for doing this -- libnet / libdnet that both
> have py wrappers?

I looked at all those, and more, but they miss an important point.
That is, in PCS you define a packet like this (from
pcs/packets/ipv4.py):

    def __init__(self, bytes = None):
        """ define the fields of an IPv4 packet, from RFC 791
        This version does not include options."""
        version = pcs.Field("version", 4, default = 4)
        hlen = pcs.Field("hlen", 4)
        tos = pcs.Field("tos", 8)
        length = pcs.Field("length", 16)
        id = pcs.Field("id", 16)
        flags = pcs.Field("flags", 3)
        offset = pcs.Field("offset", 13)
        ttl = pcs.Field("ttl", 8, default = 64)
        protocol = pcs.Field("protocol", 8)
        checksum = pcs.Field("checksum", 16)
        src = pcs.Field("src", 32)
        dst = pcs.Field("dst", 32)
        pcs.Packet.__init__(self,
                            [version, hlen, tos, length, id, flags, offset,
                             ttl, protocol, checksum, src, dst],
                            bytes = bytes)
        # Description MUST be set after the PCS layer init
        self.description = "IPv4"

which creates a property in the object to hold each named field.
This is what makes it possible to do:

    ip = ipv4()
    ip.ttl = 64
    ip.src = inet_pton("128.32.1.1")

etc. in your program.  Also note that the bit lengths can be odd, such
as getting the 13 bit offset field.  So, PCS is doing all the packing
and unpacking of the bytes for you.  I intend to put in automatic
bounds checking in an upcoming version.

There is much more about this in the documentation, docs/pcs.pdf in
the package.

Future versions will allow import/export to various formats as well,
such as XML, so that defining packets will be even easier as will
writing tools.

Later,
George
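To make the "odd bit lengths" point concrete, here is a small stand-alone sketch of packing and unpacking fields by bit width. It is not how PCS is implemented; `pack_fields` and `unpack_fields` are hypothetical helpers written for this illustration, and the range check is added here rather than taken from PCS.

```python
def pack_fields(fields):
    """Pack (name, width_in_bits, value) tuples, first field in the most
    significant bits, and return the result as big-endian bytes."""
    total_bits = 0
    acc = 0
    for name, width, value in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        acc = (acc << width) | value
        total_bits += width
    if total_bits % 8:
        raise ValueError("fields must add up to a whole number of bytes")
    return acc.to_bytes(total_bits // 8, "big")


def unpack_fields(layout, data):
    """Inverse of pack_fields; layout is a list of (name, width) pairs."""
    acc = int.from_bytes(data, "big")
    remaining = len(data) * 8
    out = {}
    for name, width in layout:
        remaining -= width
        out[name] = (acc >> remaining) & ((1 << width) - 1)
    return out


# version/hlen share one byte; flags (3 bits) and offset (13 bits) share two.
first_byte = pack_fields([("version", 4, 4), ("hlen", 4, 5)])
frag_word = pack_fields([("flags", 3, 2), ("offset", 13, 185)])
decoded = unpack_fields([("flags", 3), ("offset", 13)], frag_word)
```

The widths are the ones listed in the ipv4.py excerpt; the values (flags 2, offset 185) are arbitrary examples.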
Re: Packet Construction and Protocol Testing...
At Fri, 21 Jul 2006 21:17:39 +0200, troglocan wrote:
>
> Hi,
>
> Sorry for the late reply, I just read the thread.  Did you take a look
> at Scapy (http://www.secdev.org/scapy).  It does exactly (and more)
> what you are trying to do ...
>
> a+
>
> ps : also, Scapy6 (http://namabiiru.hongo.wide.ad.jp/scapy6/) provides
> extension of Scapy for IPv6 (some parts of what is advertised on main
> page are currently being reviewed and have been extracted of main file
> temporarily).
>

Yup, looked at it.  A single file, hard to maintain, and does not
support creating arbitrary packets in the way PCS does, but it does
have some interesting features.

Thanks,
George
Re: Changes in the network interface queueing handoff model
At Sun, 30 Jul 2006 15:04:48 +0100 (BST), rwatson wrote:
> Conceptual review as well as benchmarking, etc, would be most welcome.
>

I remember talking about this at BSDCan and certainly for high end
hardware it seems that it's the right way to go.

Later,
George
Re: ipv6 in ipv6 tunnel with FreeBSD 4.11
At Fri, 18 Aug 2006 15:28:11 +0000 (GMT), Julien Abeillé wrote:
> Hi,
>
> I am using freebsd 4.11 and trying to setup ipv6 in ipv6 tunnels.

All my stuff is on HEAD and 6 so I don't know if this applies but I
think it should.

> I have the following testbed
> 4 machines connected in line:
>
> M1---M2----FreeBSD---M3
> c::1---c::2 | b::2----b::1 | a::1---a::2
>
> I want to create a tunnel between FreeBSD (b::1) and M2 (b::2)
>
> Here is my configuration on the FreeBSD machine:
> em0 : a::1/64
> em1 b::1/64
>
> I do the following to setup the tunnel:
>
> ifconfig gif0 create
> ifconfig gif0 tunnel b::1 b::2
> ifconfig gif0 d::1/64
> route add -inet6 -host c::1 -interface gif0
>
> I am not sure about what is the gif0 address d::1/64 used for.
>

Nor am I.  What directions are you following?  I believe that may be
there because the gif tunnel instructions talked about setting up IPv4
tunnels for IPv6.

> the problem is: when i ping or send any traffic from a::2 to c::1,
> the FreeBSD machine adds an ipv6 header with b::1 as source, b::2 as
> destination, but with hop count limit=0
>
> Is my configuration ok?

A few things to note:

1) You need to have ipv6_gateway_enable="YES" set to forward packets.

2) Are you trying to tunnel between two interfaces on the same
   machine?  It's hard to tell from your description.

If the FreeBSD box is a router between two tunnels then you need two
tunnel endpoints.  One pointing at M2 and one pointing at M3.

Best,
George
Re: Re : ipv6 in ipv6 tunnel with FreeBSD 4.11
At Sat, 19 Aug 2006 11:45:13 +0000 (GMT), Julien Abeillé wrote:
>
> Hi George,
>
> thanks for your answer.  A few precisions then: I do two setups in
> fact, one on IMUNES network emulator (this is why I use FreeBSD
> 4.11), one with 4 real machines.  The one with four real machines has
> no tunnel endpoint.  I know it is a bit weird, but the other machines
> are linux machines, and I did not want to go in compatibility
> problems (if there are some?).

I don't know if there are compatibility issues with Linux but I doubt
it as the same people developed the protocol stacks, at least
initially.

> On this testbed (with the real machines), I just send traffic from M3
> through the FreeBSD machine.  I did not set
> ipv6_gateway_enable="YES", but use sysctl.  I do not have a BSD here
> (internet cafe) so i do not remember the exact parameter
> (net.inet6.ip6.forwarding?) but i set ipv6 forwarding to one and
> without tunnels I can ping from one end to the other.  One question:
> are the two tunnel endpoints supposed to negotiate something?  If
> yes, I do need another endpoint.

Nope, they don't need to negotiate anything, the machines are just
acting as routers.  You also need to have appropriate routes set.

> In the IMUNES simulation, I have the 4 machines inline the same way
> (M1 M2 M3 M4 ) and setup the tunnel on M2 and M3 (between b::1 and
> b::2).  It works but with hop count limit=0.  I did the same setup
> with 5 machines inline (M1 M2 M3 M4 M5) and a tunnel between M2 and
> M4.  It does not work anymore: if i send traffic through the tunnel
> from M2 to M4, M3 discards the packets and sends an icmpv6 "time
> exceeded..." message to M2.
>

That is odd, but it may be that one of the machines is considering the
next hop address to be link local, and not global, in which case it
might set the hop limit to be 1, and then it would be decremented to 0
at the other end of the tunnel.  Make sure you're not using link local
addresses on your tunnel endpoints.

> I will try on monday without giving an IPv6 address to the gif
> interface.  Indeed I followed the instructions on the FreeBSD
> handbook section IPv6 for IPv6 in IPv4 tunnels.  The problem is I did
> not find any instructions for IPv6 in IPv6.  The only thing I found
> in kame was: "be careful with IPv6 in IPv6 and IPv4 in IPv4 tunnels
> which often result in infinite routing in the kernel".  Maybe it is
> what is happening here.

It could be, but I don't have a setup like that to test.

You might also ask on the [EMAIL PROTECTED] mailing list as well.

Also, keep freebsd-net@freebsd.org cc'd as someone else might be able
to answer this better than I.

Later,
George
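The hop limit reasoning in this exchange can be traced with a tiny model. This is a sketch of the general IPv6 forwarding rule (each forwarding node decrements the hop limit and discards the packet when it reaches zero), not of the gif code; `send` is a hypothetical helper, and the router lists mirror the 4-machine and 5-machine layouts Julien describes.

```python
def send(hop_limit, routers):
    """Walk an outer (tunnel) packet through `routers` in order.

    Each forwarding node decrements the hop limit by one and discards
    the packet if the result is zero; the final destination does not
    decrement it.
    """
    for name in routers:
        hop_limit -= 1
        if hop_limit <= 0:
            return f"discarded at {name} (ICMPv6 time exceeded)"
    return f"delivered with hop limit {hop_limit}"


# Tunnel between adjacent machines: nothing forwards the outer packet.
adjacent = send(1, [])
# Tunnel M2 -> M4 with M3 forwarding in between.
one_router_small = send(1, ["M3"])
one_router_default = send(64, ["M3"])
```

With no router between the endpoints a hop limit of 1 (or even 0) still arrives, while a single router in between is enough to trigger the "time exceeded" reply reported above. This only illustrates why a small outer hop limit would produce both observations; it does not establish where the small value comes from.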
Re: RFC: TSO patch for current
At Fri, 1 Sep 2006 15:51:21 -0700, Jack Vogel wrote:
>
> This is a patch for the stack and the em driver to enable TSO
> on CURRENT.  Previously I had problems getting it to work, but
> this is functional.
>
> I should note that CURRENT is being a pain right now, when
> I comment out em in the config the kernel panics coming up,
> so I had to substitute this code into the tree.  Rather bizarre :)
>
> I have this functionality running on a 6.1 based system, and
> our test group is already testing against that driver, so far
> things are looking good.
>
> I have designed it so the driver can continue to be built
> without support.  There is also a sysctl in the stack code
> so you can set net.inet.tcp.tso_enable on or off and
> compare.
>
> I know there may be some refinements to add in, but I
> would like to get this into CURRENT as a start.
>
> Comments?

A single read through of the patch looks OK to me.

Later,
George
Re: ipv6 host routes
At Sun, 3 Sep 2006 15:22:14 +0200, John Hay wrote:
>
> Hi,
>
> Does anybody know how to add a direct IPv6 host route that actually works?
> What I mean is not through a gateway, but for one directly reachable.
>
> I know it normally isn't needed because it will just work, but I'm
> trying to add FreeBSD IPv6 capability to net/olsrd.  It looks like I have
> most of the rest working, but this is one of the last things tripping
> me up.
>
> I have played for most of the morning with various incantations of
> "route add -inet6 -host ..." and just get various non working
> routes.  For my test I have the machines configured on the same IPv6
> subnet and without adding anything special I can ping them, but not
> after adding a route.
>
> The reason they (the olsr guys) do it is so that a router can have
> multiple WiFi interfaces all configured on the same subnet.  Then
> when they get comms with a machine, they can add a route to it
> through that interface.
>
> At the moment I'm not even at the point of trying to get multiple
> interfaces on the same subnet working, although I would like to
> do that in the future.  It would help if you have a high-site with
> multiple antennas and radios.
>
> So anybody that know how to add a direct IPv6 host route on FreeBSD?
>

Can you show us the commands, network layout and the output of
netstat -r and ndp -a?

Best,
George
Alpha Release 0.2 of Packet Construction Set
This release includes checksumming for IP and ICMP packets (based on
the algorithm in RFC 792) and LengthValue fields so you can easily
encode things like DNS labels.

About half the work was done by Clement, our SoC student working on
IPv6 security issues.

As always, comments welcome.

http://pcs.sourceforge.net

I hope to start writing some actual tests now that I have the ability
to handle most of the relevant packet level code.

BTW, the package comes with quite a bit of documentation for an alpha
release, as well as demo and test scripts to play with.

Later,
George

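For readers who want to see the kind of checksum the announcement refers to, here is a minimal sketch in Python. It is not taken from PCS; the function name `internet_checksum` and the sample packets are assumptions made for illustration. It implements the rule stated in RFC 792: the 16-bit one's complement of the one's complement sum of the message, computed with the checksum field set to zero.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's complement checksum of data.

    Hypothetical helper for illustration; not part of PCS.
    """
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a trailing zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word  # plain sum of big-endian 16-bit words
    while total >> 16:
        # fold any carry bits back into the low 16 bits (end-around carry)
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# ICMP echo request header: type 8, code 0, checksum 0, id 1, sequence 1
header = struct.pack("!BBHHH", 8, 0, 0, 1, 1)
csum = internet_checksum(header)

# Re-pack with the computed checksum; a receiver that checksums the whole
# message, checksum field included, should get zero.
packet = struct.pack("!BBHHH", 8, 0, csum, 1, 1)
```

A quick sanity check on any such routine is that recomputing the checksum over a message that already carries its checksum yields zero, as `internet_checksum(packet)` does here.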
Re: Can someone take a look at PR 89061 (ipv6 autoconfigure 6to4)
At Fri, 8 Sep 2006 17:14:18 -0700, Matt Reimer wrote:
>
> Can someone take a look at PR 89061
> (http://www.freebsd.org/cgi/query-pr.cgi?pr=89061). It contains a
> patch adding an /etc/rc.conf knob to autoconfigure an RFC 3068 6to4
> address.
>

The comments in the PR indicate that awk can't be used at that point,
so if that's true, while I think it's a good idea, the implementation
will have to be changed. From an IPv6 standpoint it's fine though.

Later,
George

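As background on what autoconfiguring a 6to4 address involves, the sketch below shows only the address arithmetic, in Python. It is not the PR's patch, which is an rc script; the function name `sixtofour_prefix` and the documentation address 192.0.2.1 are assumptions for illustration. It assumes the standard 6to4 layout from RFC 3056, where the site prefix is 2002:V4ADDR::/48 (RFC 3068, cited in the quoted message, defines the anycast address for 6to4 relay routers).

```python
import ipaddress


def sixtofour_prefix(v4: str) -> ipaddress.IPv6Network:
    """Derive the 6to4 /48 prefix for a public IPv4 address.

    Hypothetical helper for illustration; not from PR 89061.
    """
    addr = ipaddress.IPv4Address(v4)
    # 16 bits of 0x2002, then the 32-bit IPv4 address, then 80 zero bits
    value = (0x2002 << 112) | (int(addr) << 80)
    return ipaddress.IPv6Network(f"{ipaddress.IPv6Address(value)}/48")


prefix = sixtofour_prefix("192.0.2.1")
```

Whatever tool the rc script ends up using, the work is the same: print the four IPv4 octets as two hexadecimal groups after `2002:`.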
AsiaBSDCon 2007
Hi Folks,

Sorry for the slightly OT email but I'm hoping some of the people
diligently working away on FreeBSD will submit papers and
presentations to the upcoming AsiaBSDCon 2007, to be held in Tokyo,
Japan in March 2007.

See this link: http://asiabsdcon.org/

Thanks, and now back to our regularly scheduled program :-)

Later,
George

Tentative first patch for FAST_IPSEC with IPv6
Howdy,

There is now a patch at http://people.freebsd.org/~gnn/fast_ipv6.patch
which should allow you to run FAST_IPSEC with IPv6. It is very new,
it has passed most TAHI tests, and does not, so far as I know, panic
the kernel. This is a patch against HEAD.

Please test and send feedback. There is still more to do but at least
this is now starting to work.

Later,
George

HEADS UP, minor change to IPv6 link local address setup
Hi Folks,

I just committed to HEAD a minor change to our IPv6 support. Unless a
user sets ipv6_enable to YES in rc.conf, link local addresses will NOT
appear on any interface. This seems to make some sense because you
shouldn't have them if you didn't ask for IPv6 to be enabled. IPv6
remains in the kernel by default.

Please let me know if there are any issues with this change. I did
test this, but of course not as extensively as all of you can.

I intend to MFC this in 3 days if re@ is willing to let me.

Thanks,
George

Re: Problems under test of IPv6 Ready Logo Program Phase-2
At Wed, 18 Oct 2006 19:08:57 -0800, chenxiaochen wrote:
>
> Dear all, This is my second letter here, I am beginning to love here
> for there are many kindly friends, such as SUZUKI
> Shinsuke<[EMAIL PROTECTED]> :) Ok,questions follows... Does someone
> do research on IPv6 Ready Logo Program? Now I am doing IPv6
> conformance test under TAHI platform and I meet some problems.

Though some of us are using TAHI, I do not believe the project itself
is going for the Logo Program. I am working on IPv6 and IPSec and
using TAHI regularly.

> My test setup is below:
> -+---------+- Link1
>  |         |
>  |         |
>  |rl0      | rl1
>  TN        NUT
>  |bge0     | rl0
>  |         |
>  |         |
> -+---------+- Link0
> TN:IBM desktop PC,OS is FreeBSD6.1;
> NUT:IBM desktop PC,OS is FreeBSD6.1
> --- rl0,rl1,bge0 stand for the NICs of TN and NUT.
> My test software is v6eval-3.0.10 and package are Self_Test_1-4-2 and
> v6eval-remotes-3.0.
>
> 1. Section 5: RFC 2463 - ICMPv6
>    "case 11 Part B: Multicast Destination" --- fail
>    After TN send Echo Request to global multicast address(ff1e::1:2), the
> following words appear on NUT's screen-rl1:discard oversize frame (ether
> type 86dd flags 3 len 1514 > max 1294 )
>    However, "case 10 Part A: Unicast Destination" passes.
>
> 2. Section 2: RFC 2461 - Neighbor Discovery for IPv6
>    "127 Part C: Sending Unsolicited RA (Min Values)" --- fail
>    After NUT executes rtadvd, TN says "Could't observe RA".
>    The corresponding rtadvd.conf is

I don't believe that you need to run your own RA. TAHI is usually
self contained.

>    But when I use Ethereal to capture the IP package, I get RA about 6
> seconds later after rtadvd is executed.
>    The captured RA's parameters are:
>    cur hop limit--64
>    router lifetime--1800
>    reachable time--0
>    retrans time--0
>    valid lifetime--0x00278d00
>    preferred lifetime--0x00093a80
>

You should check if this "just works" without the RA.

> 3. Section 3: RFC 2462 - IPv6 Stateless Address Autoconfiguration
>    All cases fail
>    Reason: TN can't observe DAD process.
>    I can't capture DAD packages by Ethereal in the network start process.
>
>    But I can get DAD packages on IBM T43(NIC is bge0, OS is FreeBSD
>    6.1) and T30(NIC is fxp0, OS is FreeBSD 5.4) when the network
>    start( host test). Someone ever told me that --- "there is a bug
>    in FreeBSD's kernel which prevents DAD being sent. You have to
>    force ethernet card into any mode rather than auto-select before
>    it is activated, by modifying rc.network" As if rc.network has
>    been change to netstart in FreeBSD 6.1. But I don't know how to
>    modify it.

I have not heard of this and don't have that hardware so can't check
it.

> By the way, there is a bug I found about IPv6 Ready Logo Program
> Phase-2 automatic test. Hope this information below will be useful
> to you.
>
> 1.install v6eval-remotes-3.0
> 2.# cd /usr/local/v6eval/bin/freebsd-i386/
> 3.# ee racontrol.rmt
> --
> line 288
> "\t:rtime#$rOpt_retrans:" should be changed into "\t:retrans#$rOpt_retrans:"
> --
>

Best,
George

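The valid and preferred lifetimes in the captured RA quoted in the message above are given in hexadecimal, which makes them hard to read at a glance. A small Python sketch, assuming only that the two values are lifetimes in seconds (as the prefix information option defines them), converts them to days. The helper name `lifetime_days` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def lifetime_days(seconds: int) -> float:
    """Convert a lifetime in seconds to days (illustrative helper)."""
    return seconds / SECONDS_PER_DAY


# Values as reported in the quoted capture
valid_lifetime = 0x00278D00
preferred_lifetime = 0x00093A80

valid_days = lifetime_days(valid_lifetime)
preferred_days = lifetime_days(preferred_lifetime)
```

The values decode to 2592000 seconds (30 days) and 604800 seconds (7 days), so the RA carries ordinary lifetimes rather than the minimum values the test case name suggests.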
Re: check internet connection
At Fri, 27 Oct 2006 09:43:17 +1000, Sam Wun wrote:
>
> Hi,
>
> I want to write a C program to check freebsd's internet connection.
> What's the best way to achieve this checking in layer 2 or 3 of the tcp/ip
> stacks in freebsd?

What do you want to check? There are many layers of connectivity.

Later,
George

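To make the "many layers" point concrete, here is a rough sketch of two separate checks, written in Python for brevity although the question asks about C; the equivalent calls (getaddrinfo, connect) exist in the C sockets API. The function names and the idea of probing a caller-chosen host and port are assumptions for illustration, not something proposed in the thread. Each check can fail independently: name resolution can work while a TCP connection fails, and the reverse.

```python
import socket


def can_resolve(name: str) -> bool:
    """Name-service check: does the resolver return any address for name?"""
    try:
        return bool(socket.getaddrinfo(name, None))
    except socket.gaierror:
        return False


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Transport check: can a TCP connection to host:port be established?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts and unreachable networks
        return False
```

A caller might run `can_resolve("www.freebsd.org")` and then `can_connect("www.freebsd.org", 80)`, reporting which step failed. Neither check says anything about link state at layer 2, which would need interface-level information instead.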
Re: Path MTU discovery broken in IPSec
Hi Khetan,

I'm confused as to why you attribute this to PMTU discovery. Do you
see ICMP errors indicating that? Have you run traceroutes in both
directions from each host?

Thanks,
George