Re: OFED stack, RDMA, ipoib help needed

2012-05-15 Thread gnn
At Tue, 8 May 2012 12:11:20 +0200,
Gergely CZUCZY wrote:
> 
> Hello,
> 
> I'd like to ask a few questions in order to get some hardware to work
> we've got recently.
> 
> The hardware is the following:
>   - 2x dualport Mellanox ConnectX-3 VPI cards, with 56Gbps ports
>   - 4 computing modules with a singleport Mellanox MT27500-family
> ConnectX-3 port.
> 
> The 2 dualport cards are in a storage box, and the 4 singleport ones
> are integrated on blade-like computing nodes (4 boxes in 2U). The
> storage is running FreeBSD 9-STABLE, 2012-05-07 cvsup, and the
> computing nodes are running linux.
> 
> So far we had been able to bring up the subnet-manager on the FreeBSD
> node, and one of the links got into Active state, which is quite good.
> We had been able to ibping between the nodes. The FreeBSD kernel
> config, in addition to GENERIC, is the following:
> 
> options OFED
> options SDP
> device ipoib
> options IPOIB_CM
> device mlx4ib
> device mthca
> device mlxen
> 
> Right now we're having problems with the following issues, situations:
> 
> 1) we assigned IP addresses to both ib interfaces (fbsd, linux side),
> but weren't able to ping over IP. We've seen icmp-echo-requests leaving
> the box on the linux box, but haven't seen any incoming traffic on the
> freebsd one. On the freebsd side, we had several issues:
>  - no incoming packets seen by tcpdump on the ib interface
>  - when trying to ping the other side, we've got "no route to host",
>    but the routing entry existed in the routing table.
>  - we had a few of these messages in our messages: "ib2: timing out; 0
>    sends N receives not completed", where N started at 22, 34 and was
>    growing.
> 

Have you looked at your arp tables?  (arp -a)

Do you have any messages in dmesg on the FreeBSD side?

Can you show us the output of ifconfig on the FreeBSD side?


> 2) We're unable to find any resources on how to do RDMA on the FreeBSD
> side. We'd like to use SRP (SCSI RDMA Protocol) communication, and/or
> NFS-over-RDMA for our storage link between the boxes. Where could we
> find any info on this?


Sorry but I can't help you with this one.

> 3) Enabling connected-mode, we weren't able to find a way to specify or
> query the port that connected mode is using. Could someone please point
> us to the right direction regarding this minor issue?

This ought to work in FreeBSD as it does in Linux, but I've not
personally tried it.

> 4) We were also unable to find how to switch these dual-personality
> cards between infiniband and ethernet modes. Could we also get some
> pointers regarding this please?
> 

It usually depends on what cable you're using, what it's plugged into,
and what driver you bring up.  The mlx4 driver should be able to give
you an Ethernet device with the Connect X-3 cards.

> Basically any help would be welcome which could help making infiniband
> work.
> 
> As a side question, I've seen a commit for OFED in HEAD by jhb, fixing
> a few things, may I ask when will that get MFC'd to RELENG-9?
> 

This I don't know about.

Best,
George
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Interface MTU question...

2012-07-11 Thread gnn
Howdy,

Does anyone know the reason for this particular check in
ip_output.c?

        if (rte != NULL && (rte->rt_flags & (RTF_UP|RTF_HOST))) {
                /*
                 * This case can happen if the user changed the MTU
                 * of an interface after enabling IP on it.  Because
                 * most netifs don't keep track of routes pointing to
                 * them, there is no way for one to update all its
                 * routes when the MTU is changed.
                 */
                if (rte->rt_rmx.rmx_mtu > ifp->if_mtu)
                        rte->rt_rmx.rmx_mtu = ifp->if_mtu;
                mtu = rte->rt_rmx.rmx_mtu;
        } else {
                mtu = ifp->if_mtu;
        }

To my mind the > ought to be != so that any change, up or down, of the
interface MTU is eventually reflected in the route.  Also, this code
does not check if it is both a HOST route and UP, but only if it is
one or the other, so don't be fooled by that: this check happens
for any route we have if it's up.
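
A small userland sketch may make the two points concrete.  It is not
kernel code: the helper names are invented for this illustration and the
flag values are only placeholders for the ones in <net/route.h>.  Only
the comparisons mirror the snippet above.

```c
#include <assert.h>

/* Placeholder flag values for the sketch (see <net/route.h> for the real ones). */
#define RTF_UP   0x1
#define RTF_HOST 0x4

/* The test as written today: true if EITHER flag is set. */
int
matches_any(int rt_flags)
{
	return ((rt_flags & (RTF_UP | RTF_HOST)) != 0);
}

/* Hypothetical stricter test: true only if BOTH flags are set. */
int
matches_both(int rt_flags)
{
	return ((rt_flags & (RTF_UP | RTF_HOST)) == (RTF_UP | RTF_HOST));
}

/* Existing behaviour (>): the route MTU can only shrink. */
unsigned long
route_mtu_clamp(unsigned long rmx_mtu, unsigned long if_mtu)
{
	assert(if_mtu > 0);
	if (rmx_mtu > if_mtu)
		rmx_mtu = if_mtu;
	return (rmx_mtu);
}

/* Proposed behaviour (!=): the route MTU follows the interface up or down. */
unsigned long
route_mtu_track(unsigned long rmx_mtu, unsigned long if_mtu)
{
	assert(if_mtu > 0);
	if (rmx_mtu != if_mtu)
		rmx_mtu = if_mtu;
	return (rmx_mtu);
}
```

With a route that recorded 1500 and an interface later raised to 9000,
route_mtu_clamp() leaves the route at 1500 while route_mtu_track()
moves it to 9000.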

My proposed change is this:

Index: ip_output.c
===
--- ip_output.c (revision 225561)
+++ ip_output.c (working copy)
@@ -320,7 +320,7 @@
                 * them, there is no way for one to update all its
                 * routes when the MTU is changed.
                 */
-               if (rte->rt_rmx.rmx_mtu > ifp->if_mtu)
+               if (rte->rt_rmx.rmx_mtu != ifp->if_mtu)
                        rte->rt_rmx.rmx_mtu = ifp->if_mtu;
                mtu = rte->rt_rmx.rmx_mtu;
        } else {

Please let me know what y'all think.

Best,
George


Re: Proposal for changes to network device drivers and network stack (RFC)

2012-09-07 Thread gnn
At Fri, 7 Sep 2012 01:28:16 -0700,
Anuranjan Shukla wrote:
> 
> 
> >
> >> struct socket {
> >> 
> >>    int so_fibnum;  /* routing domain for this socket */
> >>    uint32_t so_user_cookie;
> >> +  u_int   so_oqueue; /* manage send prioritizing based on
> >>                          application needs */
> >> +  u_short so_lrid; /* logical routing */
> >> };
> >> 
> >
> >I'd be interested to know how this is used.
> 
> We use the first one as a 'direction' to the forwarding path to select an
> appropriate priority queue to send the packet on. In a generic system (i.e.
> something other than our specific system), one could consider
> interesting ways to use queues on a multi queue NIC with help from a
> driver. The second one is for a system with logical routing capabilities
> (multiple routing systems within the same chassis). It gives an
> application opening a socket an option to select the specific logical
> routing instance.

OK, that's what I guessed but thanks for confirming it.

> I'll provide smaller pieces of diffs for the kernel without networking
> patch I'd sent out. Let me know if you prefer the device driver interface
> to be in that too.

Yes, please.

Best,
George


Re: kern/123758: [panic] panic while restarting net/freenet6

2010-06-15 Thread gnn
Synopsis: [panic] panic while restarting net/freenet6

Responsible-Changed-From-To: gnn->n...@freebsd.org
Responsible-Changed-By: gnn
Responsible-Changed-When: Tue Jun 15 17:13:33 UTC 2010
Responsible-Changed-Why: 


http://www.freebsd.org/cgi/query-pr.cgi?pr=123758


Re: kern/123758: [panic] panic while restarting net/freenet6

2010-06-15 Thread gnn
Synopsis: [panic] panic while restarting net/freenet6

Responsible-Changed-From-To: n...@freebsd.org->freebsd-net
Responsible-Changed-By: gnn
Responsible-Changed-When: Tue Jun 15 17:14:53 UTC 2010
Responsible-Changed-Why: 
Give this one back.

http://www.freebsd.org/cgi/query-pr.cgi?pr=123758


Re: kern/86427: [lor] Deadlock with FASTIPSEC and nat

2010-06-15 Thread gnn
Synopsis: [lor] Deadlock with FASTIPSEC and nat

Responsible-Changed-From-To: gnn->freebsd-net
Responsible-Changed-By: gnn
Responsible-Changed-When: Tue Jun 15 17:18:21 UTC 2010
Responsible-Changed-Why: 
I believe this is fixed but others can comment on it at will.

http://www.freebsd.org/cgi/query-pr.cgi?pr=86427


Re: kern/81095: IPsec connection stops working if associated network interface goes down and then up again.

2010-06-15 Thread gnn
Synopsis: IPsec connection stops working if associated network interface goes 
down and then up again.

Responsible-Changed-From-To: gnn->freebsd-net
Responsible-Changed-By: gnn
Responsible-Changed-When: Tue Jun 15 17:34:03 UTC 2010
Responsible-Changed-Why: 
This is probably no longer valid given the changes in our
IPSec stack over the last 4 years.  People are welcome to
retest/resubmit.

http://www.freebsd.org/cgi/query-pr.cgi?pr=81095


Re: kern/78968: FreeBSD freezes on mbufs exhaustion (network interface independent)

2010-06-15 Thread gnn
Synopsis: FreeBSD freezes on mbufs exhaustion (network interface independent)

Responsible-Changed-From-To: gnn->freebsd-net
Responsible-Changed-By: gnn
Responsible-Changed-When: Tue Jun 15 17:35:12 UTC 2010
Responsible-Changed-Why: 
5.3 bug, probably no longer relevant.

http://www.freebsd.org/cgi/query-pr.cgi?pr=78968


Re: kern/65616: IPSEC can't detunnel GRE packets after real ESP encryption

2010-06-15 Thread gnn
Synopsis: IPSEC can't detunnel GRE packets after real ESP encryption

Responsible-Changed-From-To: gnn->freebsd-net
Responsible-Changed-By: gnn
Responsible-Changed-When: Tue Jun 15 17:47:06 UTC 2010
Responsible-Changed-Why: 
This is likely stale.

http://www.freebsd.org/cgi/query-pr.cgi?pr=65616


Re: kern/56233: IPsec tunnel (ESP) over IPv6: MTU computation is wrong

2010-06-15 Thread gnn
Synopsis: IPsec tunnel (ESP) over IPv6: MTU computation is wrong

Responsible-Changed-From-To: gnn->freebsd-net
Responsible-Changed-By: gnn
Responsible-Changed-When: Tue Jun 15 17:47:41 UTC 2010
Responsible-Changed-Why: 
I'm not working on IPSec at the moment, handing this one back.

http://www.freebsd.org/cgi/query-pr.cgi?pr=56233


Re: Panic on boot with em1 attached

2008-12-22 Thread gnn
Hi,

Can you try this with fastforwarding off?  It looks like a double free
somewhere in the ip_fastforward() routine.  Someone frees m but does
not NULL it out and at the drop: label the mbuf m is valid but the
data within it has already been freed.  Knowing if this is related
only to the fast forwarding case will help.
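
The pattern that avoids this kind of double free can be sketched in
userland as below.  struct pkt, pkt_free() and forward() are made-up
stand-ins, not the mbuf API or the real ip_fastforward() code; the only
point is that whoever frees the buffer also clears the pointer, so a
shared drop: label cannot free it a second time.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical packet type standing in for an mbuf. */
struct pkt {
	int len;
};

int frees;	/* number of times memory was actually released */

/* Free through the caller's pointer and NULL it out. */
void
pkt_free(struct pkt **pp)
{
	assert(pp != NULL);
	if (*pp != NULL) {
		free(*pp);
		*pp = NULL;
		frees++;
	}
}

/*
 * Returns 0 if the packet was "forwarded", -1 if it was dropped.
 * consumed_early simulates a callee that frees the packet itself.
 */
int
forward(int len, int consumed_early)
{
	struct pkt *m = malloc(sizeof(*m));

	if (m == NULL)
		return (-1);
	m->len = len;
	if (consumed_early) {
		pkt_free(&m);	/* callee freed it and cleared the pointer */
		goto drop;
	}
	if (m->len <= 0)
		goto drop;
	pkt_free(&m);		/* stands in for handing off to the driver */
	return (0);
drop:
	pkt_free(&m);		/* no-op if m was already freed */
	return (-1);
}
```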

Best,
George


Re: Panic on boot with em1 attached

2008-12-23 Thread gnn
At Tue, 23 Dec 2008 13:57:39 +0200,
Vladimir V. Kobal wrote:
> 
> With fastforwarding off the system works well and boots without panicking.
> 
OK, that narrows it down.  Are you using any filtering such as PF,
ipfw, etc.?

Best,
George


A new tool for low level testing...

2008-12-23 Thread gnn
Hi,

I just checked in a small tool to HEAD in
/usr/src/tools/tools/ether_reflect which uses pcap and bpf to reflect
ethernet packets just above the driver layer without involving the
protocol stacks.  This is useful for people doing low level testing of
drivers and switches.  If you happen to be lucky enough to have an
ethernet packet generator (ixia et al) this will do what you want in
terms of reflecting the packets back.
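
For anyone curious what reflecting a frame involves, here is a minimal
sketch of the address swap, assuming an untagged Ethernet II frame with
the destination MAC in bytes 0-5 and the source MAC in bytes 6-11.
reflect_frame() is a name invented for this example and is not lifted
from ether_reflect; the pcap capture and inject calls around it are
left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN	6	/* bytes in an Ethernet address */
#define HDR_LEN	14	/* dst MAC + src MAC + ethertype */

/*
 * Swap the destination and source MAC addresses in place.
 * Returns 0 on success, -1 if the frame is too short to hold a header.
 */
int
reflect_frame(uint8_t *frame, size_t len)
{
	uint8_t tmp[MAC_LEN];

	assert(frame != NULL);
	if (len < HDR_LEN)
		return (-1);
	memcpy(tmp, frame, MAC_LEN);			/* save dst */
	memcpy(frame, frame + MAC_LEN, MAC_LEN);	/* dst <- src */
	memcpy(frame + MAC_LEN, tmp, MAC_LEN);		/* src <- old dst */
	return (0);
}
```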

Later,
George


Re: Panic on boot with em1 attached

2008-12-24 Thread gnn
At Tue, 23 Dec 2008 22:49:24 +0200,
Vladimir V. Kobal wrote:
> 
> We are using pf+ALTQ for shaping and ipfw for filtering, diverting into
> netgraph nodes, attaching altq queues.
> 

OK, that also makes sense given what I saw in the code.  Can you
explain your entire setup?  That is, which filters, which interfaces,
what bits of netgraph etc.

Best,
George


Re: A new tool for low level testing...

2008-12-24 Thread gnn
At Tue, 23 Dec 2008 13:00:12 -0800,
julian wrote:
> 
> g...@freebsd.org wrote:
> > Hi,
> > 
> > I just checked in a small tool to HEAD in
> > /usr/src/tools/tools/ether_reflect which uses pcap and bpf to reflect
> > ethernet packets just above the driver layer without involving the
> > protocol stacks.  This is useful for people doing low level testing of
> > drivers and switches.  If you happen to be lucky enough to have an
> > ethernet packet generator (ixia et al) this will do what you want in
> > terms of reflecting the packets back.
> > 
> > Later,
> > George
> > ___
> > freebsd-net@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> > To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
> 
> 
> OR
> 
> ngctl mkpeer em0: echo lower echo
> 
> 
> hm no this would leave the source and destination headers in the
> same order.. they need to be swapped..
> 
> ok so I need to make a patch, but it would be much quicker than a user 
> utility..

I agree that netgraph is the right long term answer.  I look forward
to what you come up with.

Also, +1 to an improved set of docs on netgraph.

Later,
George


Re: freebsd 7.0-RELEASE BUG ping: sendto: No buffer space available

2009-01-26 Thread gnn
At Sat, 24 Jan 2009 16:20:06 +,
Rui Paulo wrote:
> 
> 
> On 24 Jan 2009, at 12:54, Yony Yossef wrote:
> 
> > Hi All,
> >
> > I'm facing a temporary network hang on my interfaces following a flood
> > ping/stress udp test.
> >
> > I'm running a netperf UDP test which is giving results but does not  
> > return
> > to the shell.
> > client output:
> >
> > UDP UNIDIRECTIONAL SEND TEST from fe80::202:c9ff:fe02:e1fe%mtnic0
> > (fe80::202:c9ff:fe02:e1fe) port 0 AF_INET6 to  
> > fe80::202:c9ff:fe02:e1f4%mt
> > nic0 (fe80::202:c9ff:fe02:e1f4) port 0 AF_INET6
> > Socket  Message  Elapsed      Messages
> > Size    Size     Time         Okay Errors   Throughput
> > bytes   bytes    secs            #      #   10^6bits/sec
> >
> >  32768    1472   10.02      547428 1694280     643.60
> >  32768           10.02       25089              29.50
> >
> >
> > (HANG)
> >
> > After a minute or two it returns to the shell with the following  
> > message:
> > shutdown_control: no response received  errno 55
> >
> > 20 minutes later (!!) the interface is working again.
> >
> > netstat -m and vmstat -z outputs during the hang time:
> >
> > # netstat -m
> > 25687/6578/32265 mbufs in use (current/cache/total)
> > 17404/2438/19842/65536 mbuf clusters in use (current/cache/total/max)
> > 0/1024 mbuf+clusters out of packet secondary zone in use (current/ 
> > cache)
> > 2071/1369/3440/65536 4k (page size) jumbo clusters in use
> > (current/cache/total/max)
> > 0/0/0/65536 9k jumbo clusters in use (current/cache/total/max)
> > 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
> > 49513K/11996K/61510K bytes allocated to network (current/cache/total)
> > 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> > 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> > 0/0/0 sfbufs in use (current/peak/max)
> > 0 requests for sfbufs denied
> > 0 requests for sfbufs delayed
> > 0 requests for I/O initiated by sendfile
> > 0 calls to protocol drain routines
> 
> I think there are too many mbufs in use. You're probably facing an  
> mbuf leakage and that causes an interface hang.
> 
If this is a large memory machine try upping the number of clusters
and mbufs.  On 64 bit systems with large memories 1,000,000 mbufs is
not unheard of.

kern.ipc.nmbclusters: 100

Also, with UDP you can easily overrun different buffers within the
system.  You might also look at:

netstat -id

and see if the driver is dropping packets, and if so you might up its
send queue.  
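
To put a number on how close a box is to the cluster limit, the
"mbuf clusters in use" line from netstat -m can be reduced to a
percentage.  The sketch below assumes the current/cache/total/max
layout shown in the output quoted above; cluster_pct() is a made-up
helper, not part of any FreeBSD tool.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse a "current/cache/total/max ..." line and return total as a
 * whole percentage of max, or -1 if the line does not parse.
 */
int
cluster_pct(const char *line)
{
	unsigned long cur, cache, total, max;

	assert(line != NULL);
	if (sscanf(line, "%lu/%lu/%lu/%lu", &cur, &cache, &total, &max) != 4)
		return (-1);
	if (max == 0)
		return (-1);
	return ((int)(total * 100 / max));
}
```

Fed the cluster line from the quoted output (17404/2438/19842/65536),
it returns 30.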

Best,
George




Canonical Packet Traces?

2007-08-19 Thread gnn
Howdy,

A very slightly off topic question for [EMAIL PROTECTED]  Does anyone know of a
web site that collects and indexes canonical packet traces for network
protocols?  I'm looking for a good storehouse of traces to use in
testing.

Best,
George


Re: Canonical Packet Traces?

2007-08-20 Thread gnn
Thanks to all who responded.  I'll check out the links.

Best,
George


Re: [EMAIL PROTECTED]: Re: rtfree: 0xffffff00036fb1e0 has 1 refs]

2007-09-01 Thread gnn
At Wed, 29 Aug 2007 08:24:58 +0100,
Bruce M. Simpson wrote:
> 
> BTW: Casual inspection with kscope suggests there is a similar
> free-while-locked issue in nd6_ns_input() (netinet6/nd6_nbr.c) and
> in_arpinput() (netinet/if_ether.c).
> 
> nd6_ns_input() references rt->rt_gateway after rtfree(), a potential
> race not to mention a use-after-free.
> 
> I haven't checked Coverity for this, but it just doesn't look right.

At least in the ND6 case I think that the correct logic is:

 //depot/user/gnn/ipsec_seven/src/sys/netinet6/nd6_nbr.c#1 - 
/sources/p4/user/gnn/ipsec_seven/src/sys/netinet6/nd6_nbr.c 
@@ -215,8 +215,6 @@
                rt = rtalloc1((struct sockaddr *)&tsin6, 0, 0);
                need_proxy = (rt && (rt->rt_flags & RTF_ANNOUNCE) != 0 &&
                    rt->rt_gateway->sa_family == AF_LINK);
-               if (rt)
-                       rtfree(rt);
                if (need_proxy) {
                        /*
                         * proxy NDP for single entry
@@ -228,6 +226,9 @@
                                proxydl = SDL(rt->rt_gateway);
                        }
                }
+               if (!need_proxy || ifa == NULL)
+                       if (rt)
+                               rtfree(rt);
        }
        if (ifa == NULL) {
                /*

Thoughts?
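
The ordering at stake can be shown with a toy reference-counted object.
struct route, route_hold() and route_release() below are illustrative
stand-ins, not the rtentry API, and the sketch is simpler than the diff
above: it only shows the reference being dropped after the last read of
the gateway field rather than before it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a routing entry. */
struct route {
	int refs;
	int gateway;	/* stands in for rt->rt_gateway */
};

struct route *
route_alloc(int gateway)
{
	struct route *rt = malloc(sizeof(*rt));

	if (rt != NULL) {
		rt->refs = 1;
		rt->gateway = gateway;
	}
	return (rt);
}

void
route_hold(struct route *rt)
{
	assert(rt != NULL && rt->refs > 0);
	rt->refs++;
}

/* Drop one reference; returns 1 if the object was freed. */
int
route_release(struct route *rt)
{
	assert(rt != NULL && rt->refs > 0);
	if (--rt->refs == 0) {
		free(rt);
		return (1);
	}
	return (0);
}

/* Read the gateway first, release second. */
int
lookup_gateway(int gateway)
{
	struct route *rt = route_alloc(gateway);
	int gw;

	if (rt == NULL)
		return (-1);
	gw = rt->gateway;		/* last use of rt */
	(void)route_release(rt);	/* only now drop the reference */
	return (gw);
}
```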

Best,
George


Re: RFC: Evolution of the em driver

2007-10-30 Thread gnn
At Mon, 29 Oct 2007 10:45:17 -0700,
Jack Vogel wrote:
> 
> I have an important decision to make and I thought rather than just make
> it and spring it on you I'd present the issues and see what opinions were.
> 
> Our newer hardware uses new features that, more and more, require
> parallel code paths in the driver. For instance, the 82575 (Zoar) uses
> what are called 'advanced descriptors', this means different TX path.
> The 7.0 em driver has this support in it, it just uses a function pointer
> to handle it.
> 
> When I add in multiqueue/RSS support it will add even more code
> that functions this way.
> 
> What the Linux team did was to split the newer code into a standalone
> driver, they call it 'igb'. I had originally resisted doing this, but with
> the development I have been working on the past month I am starting
> to wonder if it might not be best to follow them.
> 
> I see 3 possibilities and I'd like feedback, which would you prefer if
> you have a preference and why.
> 
> First, keep the driver as is and just live with multiple code paths
> and features, possibly #ifdef'ed as they appear.
> 
> Second, split the driver as Linux has into em and igb. The added
> question then is how to split it, Linux made the line the use of
> advanced descriptors, so Zoar and after, but I could also see a
> case for having everything PCI-E/MSI capable being in the new
> driver.
> 
> Third, sort of a half-way approach, split up code but not the
> driver, in other words offer different source files that can be
> compiled into the driver, so you could have the one big jumbo
> driver with all in there, or one that will only work with a subset
> of adapters. This one would probably be the most work, because
> its a new approach.

As you're the main maintainer it's your choice.  Whatever is easiest
for you and gives us the most readable code.
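
For readers following along, the function-pointer arrangement Jack
mentions for the 7.0 driver can be pictured roughly as below.  Every
name here (struct adapter_sketch, tx_legacy(), tx_advanced(), the
uses_adv_desc flag) is invented for the illustration and is not taken
from the em driver; the byte counters exist only so the chosen path is
observable.

```c
#include <assert.h>
#include <stddef.h>

struct adapter_sketch;
typedef int (*tx_fn)(struct adapter_sketch *, int);

/* Hypothetical per-adapter state. */
struct adapter_sketch {
	tx_fn	xmit;		/* TX path, chosen once at attach time */
	int	legacy_bytes;
	int	adv_bytes;
};

/* TX path for hardware using legacy descriptors. */
int
tx_legacy(struct adapter_sketch *sc, int len)
{
	sc->legacy_bytes += len;
	return (0);
}

/* TX path for hardware using advanced descriptors. */
int
tx_advanced(struct adapter_sketch *sc, int len)
{
	sc->adv_bytes += len;
	return (0);
}

/* Pick the TX path once, based on what the hardware supports. */
void
adapter_attach(struct adapter_sketch *sc, int uses_adv_desc)
{
	assert(sc != NULL);
	sc->legacy_bytes = 0;
	sc->adv_bytes = 0;
	sc->xmit = uses_adv_desc ? tx_advanced : tx_legacy;
}

/* Callers never need to know which path is in use. */
int
adapter_send(struct adapter_sketch *sc, int len)
{
	assert(sc != NULL && sc->xmit != NULL);
	return (sc->xmit(sc, len));
}
```

Splitting the driver, as with igb on Linux, would replace this runtime
dispatch with two separate sets of functions.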

Best,
George


Re: dup code in in6.c

2007-12-04 Thread gnn
At Fri, 30 Nov 2007 17:00:25 -0800,
julian wrote:
> 
> The following diff removes some (what looks to me to be) duplicate code.
> 
> Anyone care  to comment before I commit it?
> 
> (I'm trying to imagine a case where it does something useful to do this twice
> but not really succeeding).
> 

It's a duplicate, the diff is fine. 

Best,
George


Re: resend: multiple routing table roadmap (format fix)

2007-12-28 Thread gnn
At Wed, 26 Dec 2007 16:26:11 -0800,
julian wrote:
> 
> Resending as my mailer made a dog's breakfast of the first one
> with all sorts of weird line breaks... hopefully this will be better.
> (I haven't sent it yet so I'm hoping)..
> 
> 
> ---
> 
> 
> 
> One thing where FreeBSD has been falling behind, and which by chance
> I have some time to work on is "policy based routing", which allows
> different packet streams to be routed by more than just the
> destination address.
> 
> Constraints:
> 
> 
> I want to make some form of this available in the 6.x tree
> (and by extension 7.x), but FreeBSD in general needs it so I might as
> well do it in -current and back port the portions I need.
> 
> One of the ways that this can be done is to have the ability to
> instantiate multiple kernel routing tables (which I will now
> refer to as "Forwarding Information Bases" or "FIBs" for political
> correctness reasons). Which FIB a particular packet uses to make
> the next hop decision can be decided by a number of mechanisms.
> The policies these mechanisms implement are the "Policies" referred
> to in "Policy based routing".
> 
> One of the constraints I have if I try to back port this work to
> 6.x is that it must be implemented as an EXTENSION to the existing
> ABIs in 6.x so that third party applications do not need to be
> recompiled in the timespan of the branch.
> 
> Implementation method, (part 1)
> ---
> For this reason I have implemented a "sufficient subset" of a
> multiple routing table solution in Perforce, and back-ported it
> to 6.x. (also in Perforce though not yet caught up with what I
> have done in -current/P4). The subset allows a number of FIBs
> to be defined at compile time (sufficient for my purposes in 6.x) and
> implements the changes needed to allow IPV4 to use them. I have not done
> the changes for ipv6 simply because I do not need it, and I do not
> have enough knowledge of ipv6 (e.g. neighbor discovery) needed to do it.
> 
> Other protocol families are left untouched and should there be
> users with proprietary protocol families, they should continue to work
> and be oblivious to the existence of the extra FIBs.
> 
> To understand how this is done, one must know that the current FIB
> code starts everything off with a single dimensional array of
> pointers to FIB head structures (One per protocol family), each of
> which in turn points to the trie of routes available to that family.
> 
> The basic change in the ABI compatible version of the change is to
> extend that array to be a 2 dimensional array, so that
> instead of protocol family X looking at rt_tables[X] for the
> table it needs, it looks at rt_tables[Y][X], where for all
> protocol families except ipv4 Y is always 0.
> Code that is unaware of the change always just sees the first row
> of the table, which of course looks just like the one dimensional
> array that existed before.
> 
> 
> The entry points rtrequest(), rtalloc(), rtalloc1(), rtalloc_ign()
> are all maintained, but refer only to the first row of the array,
> so that existing callers in proprietary protocols can continue to
> do the "right thing".
> Some new entry points are added, for the exclusive use of ipv4 code
> called in_rtrequest(), in_rtalloc(), in_rtalloc1() and in_rtalloc_ign(),
> which have an extra argument which refers the code to the correct row.
> 
> In addition, there are some new entry points (currently called
> dom_rtalloc() and friends) that check the Address family being
> looked up and call either rtalloc() (and friends) if the protocol
> is not IPv4 forcing the action to row 0 or to the appropriate row
> if it IS IPv4 (and that info is available). These are for calling
> from code that is not specific to any particular protocol. The way
> these are implemented would change in the non ABI preserving code
> to be added later.
> 
> One feature of the first version of the code is that for ipv4,
> the interface routes show up automatically on all the FIBs, so
> that no matter what FIB you select you always have the basic
> direct attached hosts available to you. (rtinit() does this
> automatically).
> You CAN delete an interface route from one FIB should you want
> to but by default it's there. ARP information is also available
> in each FIB. It's assumed that the same machine would have the
> same MAC address, regardless of which FIB you are using to get
> to it.
> 
> 
> This brings us as to how the correct FIB is selected for an outgoing
> IPV4 packet.
> 
> Packets fall into one of a number of classes.
> 1/ locally generated packets, coming from a socket/PCB.
> Such packets select a FIB from a number associated with the
> socket/PCB. This in turn is inherited from the process,
> but can be changed by a socket option. The process in turn
> inherits it on fork. I have written a utility called setfib
> that acts a bit like nice..
> 
> setfib -n 3 ping
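
The two-dimensional table described in the roadmap can be sketched as
follows.  This is an illustration only: NUM_FIBS, the FAM_* constants,
rt_tables_sketch and fib_table_for() are invented for the example, the
slots are plain integers rather than radix-tree heads, and falling back
to row 0 for an out-of-range FIB number is an assumption of the sketch.

```c
#include <assert.h>

#define NUM_FIBS 4	/* hypothetical compile-time FIB count */

/* Hypothetical protocol-family indices for the sketch. */
enum { FAM_UNSPEC, FAM_INET, FAM_INET6, FAM_MAX };

/* rt_tables[Y][X]: row Y is the FIB, column X the protocol family. */
int rt_tables_sketch[NUM_FIBS][FAM_MAX];

/*
 * Return the table slot for (fibnum, family).  Only IPv4 honours the
 * FIB number; every other family is forced to row 0, which is what
 * code unaware of the extra rows would see anyway.
 */
int *
fib_table_for(int fibnum, int family)
{
	assert(family >= 0 && family < FAM_MAX);
	if (family != FAM_INET || fibnum < 0 || fibnum >= NUM_FIBS)
		fibnum = 0;
	return (&rt_tables_sketch[fibnum][family]);
}
```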

Re: resend: multiple routing table roadmap (format fix)

2007-12-28 Thread gnn
At Fri, 28 Dec 2007 20:40:30 +0100,
Marko Zec wrote:
> The thrust behind Julian's work seems to be providing multiple
> forwarding tables for purposes of traffic engineering / policy
> based routing, with a single firewall instance used as a classifier.  
> vimage-style network stack virtualization provides for more strict 
> isolation on both port and IP address space, independent firewall 
> instances, IPSEC config / state etc., and as such might be better 
> suited for providing enhanced jail-style virtual hosting environments, 
> as well as for providing virtual router "slices".
> 
> So once we get Julian's multi-FIB stuff in the base system, I see no 
> reason why we couldn't have this functionality replicated in 
> each "vimage" instance, i.e. have multiple independent virtual 
> networking environments, each with multiple FIBs.
> 
> Implementationwise, my hacks currently rely on macros for conditional 
> virtualization of global variables / structs.  As long as Julian's 
> changes continue to be unconditional, i.e. without playing a similar 
> macroization game, I think integrating this code (once it hits HEAD) 
> into p4/projects/vimage should be more or less a straightforward job.

Cool, that's what I wanted to hear.

Best,
George


Re: Network device driver KPI/ABI and TOE

2008-01-10 Thread gnn
At Sun, 6 Jan 2008 13:47:24 + (GMT),
rwatson wrote:
> 
> 
> There's also the opportunity to think about whether it's possible to
> harden things in such a way as to not give up our flexibility to
> keep maintaining and improving TCP (and other related subsystems),
> yet improving the quality of life for a third party TOE driver
> maintainer.  For example, might we provide accessor routines for
> certain data structures, or attempt to structure things to hide more
> of TCP locking from a TOE implementation?  Should we suggest that
> non-native TOE implementations rely less on our TCP code and provide
> their own where the hardware doesn't provide a complete
> implementation, in order to avoid building dependency on things that
> we know will change?
> 

Given the intimacy I just saw while perusing the code (basically, the
driver knows a lot about internal TCP data structures), I think we need
to think about a KPI just for these things.  I'm not very happy
that there are things like cxgb_tcp_ctlinput(), although I do know that
cleaning that kind of thing up and making a better KPI will be hard.

Best,
George


Are there known issues with multicast on Intel Pro 1000?

2008-01-17 Thread gnn
Howdy,

At my current gig we find that the network interface locks up if we
subject it to a high rate of multicast traffic.  Since the whole
purpose of this box is to do multicast (it absorbs a feed of data over
multicast, manipulates it, and then sends it out again over multicast), it's
a "bad thing" if this kind of thing does not work.

What I currently know is not complete but I figured I could start
here.

The symptom is that all network communication stops, but the system
itself is still responsive, so I can get to the console and get
information.

Release: 6.2 and 6.3-PRERELEASE (6.3 as of Wed Jan 16th)

Motherboard:

CPU: 2 x Intel Xeon X5365 3GHz (4 cores each)

Memory: 8G

em0: Intel PRO/1000 6.7.3 port 0x2000-0x201f mem 0xd832-0xd833
em1: Intel PRO/1000 6.7.3 port 0x2020-0x203f mem 0xd832-0xd833
em2: Intel PRO/1000 6.7.3 port 0x3000-0x303f mem 0xd824-0xd825, 
0xd820-0xd823
em3: Intel PRO/1000 6.7.3 port 0x3040-0x307f mem 0xd826-0xd827

Other data:

em2 is the interface that multicasts out our digested data and it also
is receiving a lot of digested multicast traffic, which is being
recorded by a proprietary program

sysctl dev.em.2.debug=1
em2: CTRL = 0x487c0a01 RCTL=0x8002
em2: Packet buffer = Tx=16k Rx=48k
em2: fifo workaround = 0, fifo_reset_count = 0
em2: hw tdh = 76, hw tdt = 76
em2: hw rdh = 213, hw rdt = 212
em2: Num Tx descriptors avail = 256
em2: Tx Descriptors not avail1 = 0
em2: Tx Descriptors not avail2 = 0
em2: Std mbuf failed = 0
em2: Std mbuf cluster failed = 1247383 (this number is increasing by about 1 a
second)
em2: Driver dropped packets = 0
em2: Driver tx dma failure in encap = 0
sysctl dev.em.2.stats=1
(all are zero except what is recorded)
em2: Missed Packets = 4683
em2: Receive No Buffers = 46905
em2: RX overruns = 83
em2: Good Packets Rcvd = 11416687
em2: Good Packets Xmtd = 146576

em0 is the interface we receive the raw data over multicast on

em0: hw tdh = 130, hw tdt = 130
em0: hw rdh = 13, hw rdt = 12
em0: Num Tx descriptors avail = 256
em0: Std mbuf cluster failed = 5111461 (this number is going up by about 1 a
second)
sysctl dev.em.0.stats=1
(all are zero except what is recorded)
em0: Missed Packets = 292778
em0: Receive No Buffers = 96211
em0: RX overruns = 1092
em0: Good Packets Rcvd = 5386001
em0: Good Packets Xmtd = 12418

em3 receives a little data from multicast and it is recorded using
a proprietary program

em3: hw tdh = 45, hw tdt = 45
em3: hw rdh = 216, hw rdt = 215
em3: Num Tx descriptors avail = 256
em3: Std mbuf cluster failed = 195951 (also going up by 1 very slowly)

sysctl dev.em.3.stats=1
(all are zero except what is recorded)
em3: Good Packets Rcvd = 9637851
em3: Good Packets Xmtd = 8237



One odd thing is that when the system boots, em1, which is unused in
this case complains of:

em1: Using MSI interrupt
em1: Setup of Shared code failed



What more do people need to help debug this?  

Best,
George
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Are there known issues with multicast on Intel Pro 1000?

2008-01-20 Thread gnn
At Thu, 17 Jan 2008 15:06:19 -0500,
randall wrote:
> 
> [EMAIL PROTECTED] wrote:
> > Howdy,
> > 
> > At my current gig we find that the network interface locks up if we
> > subject it to a high rate of multicast traffic.  Since the whole
> > purpose of this box is to do multicast (it absorbs a feed of data over
> > multicast, manipulates it, and then sends it out again over multicast) it's
> > a "bad thing" if this kind of thing does not work.
> > 
> > What I currently know is not complete but I figured I could start
> > here.
> > 
> > The symptom is that all network communication stops, but the system
> > itself is still responsive, so I can get to the console and get
> > information.
> 
> If you let it run long enough does it eventually lock up?
> 
> I have seen similar behavior when a lock is not released when
> I was breaking things :-)
> 
> Everything is fine EXCEPT the interface.. for a while.. then
> eventually you get a train-wreck :-)
> 
> I would drop to ddb and do the show locks..
> 
> Also I believe top (or ps) will tell you what locks are being
> waited on in a coarse way... I think the ps in DDB will do this.

On closer inspection it looks like an "out of mbufs" situation and so
the right answer is to "up the nmbclusters" but there seem to be other
issues with this code and multicast so I'm likely to jump into DDB and
look more closely at it, likely next week.

Thanks,
George




Re: tcp-md5 check for incoming connection

2008-01-31 Thread gnn
At Thu, 31 Jan 2008 13:15:12 +0100 (CET),
Ingo Flaschberger wrote:
> 
> Dear Andre,
> 
> >> 2) linux method:
> >> Look for CONFIG_TCP_MD5SIG in linux-2.6.24/net/ipv4/tcp_ipv4.c
> >> (sorry no weblink..)
> >> They check and block md5-packets early in tcp_v4_do_rcv.
> >> afinet.c -> tcp_v4_rcv -> tcp_v4_do_rcv
> >> -> for Freebsd: place some logic early in tcp_input function
> >> and call a new function to check md5.
> >
> > IMHO calling a special function that does the check (like in tcp_output)
> > is the way to go.  This function should be run as late as possible after
> > the other segment validity checks to prevent easy cpu exhaustion attacks
> > with packets that only get the port numbers right.
> >
> > In tcp_new there is a natural place to perform the check.  tcp_input will
> > show up this weekend.  This doesn't prevent your work on the current code
> > at all as tcp_new won't show up in -current for a long time and when it
> > does it will not get MFC'd.
> 
> Ok.
> I will do the first patch for freebsd 6.2 (as my system uses it) and do 
> a port to current (and I think 6.3 too).
> 
> Regarding Bruce:
> I would prefer to implement md5 via the old setkey api as I also have to do 
> my daily business.
> 
> >> 3) Bruce extended method:
> >> http://lists.freebsd.org/pipermail/freebsd-net/2004-April/003761.html
> >> Use his code and add at several places in tcp_input function
> >> similar checks.
> >> 
> >> Options:
> >> *) enable disable it via sysctl
> >> *) count total, good and bad packets via sysctl
> >
> > This belongs into struct tcpstat, not a new sysctl.
> 
> Ok.
> With which tool can these counters be read?
> Should I add the on/off feature? Via which tool?
> 

Enable/disable via sysctl.

Read via netstat.
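
For readers following the thread, here is a minimal sketch of how the pieces discussed above might fit together: the signature check runs only after the cheaper segment checks have passed, an on/off flag stands in for the sysctl, and the good/bad counts live in a stats structure that a tool such as netstat could report. All names (`sketch_tcpstat`, `sketch_md5_enabled`, `sketch_md5_check`) are hypothetical and are not FreeBSD identifiers, and the digest comparison itself is reduced to a boolean.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counters; in the kernel these would be fields in struct tcpstat. */
struct sketch_tcpstat {
    uint64_t sig_rcvgood;   /* segments whose signature verified */
    uint64_t sig_rcvbad;    /* segments dropped for a bad signature */
};

struct sketch_tcpstat sketch_stat;

/* Stands in for an on/off sysctl knob (assumption, not a real sysctl name). */
bool sketch_md5_enabled = true;

/*
 * Decide whether to accept a segment.  The cheap validity checks come
 * first so that forged packets are rejected before any digest work is done.
 */
bool
sketch_md5_check(bool cheap_checks_passed, bool digest_matches)
{
    if (!cheap_checks_passed)
        return false;           /* dropped early, nothing counted */
    if (!sketch_md5_enabled)
        return true;            /* feature switched off */
    if (digest_matches) {
        sketch_stat.sig_rcvgood++;
        return true;
    }
    sketch_stat.sig_rcvbad++;
    return false;
}
```

Keeping the counters in one structure means a single reader can print all of them, rather than each needing its own sysctl node.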

Best,
George


Re: Kernel compile options

2008-02-12 Thread gnn
At Thu, 7 Feb 2008 15:16:44 +0100,
Michael Tuexen wrote:
> 
> Dear all,
> 
> I was able to build an IPv4 only kernel by having
> options INET
> #options INET6
> in the kernel config file.
> 
> Is it supposed to work that one can build an IPv6-only
> kernel by using
> #options INET
> options INET6
> 

I have not tried and I actually doubt it.

> And should I be able to compile a kernel without IPv4 and IPv6
> support by using
> #options INET
> #options INET6
> 

I believe this does not work either.

Best,
George


Re: if_start() and send queue question

2008-02-12 Thread gnn
At Thu, 07 Feb 2008 19:45:28 +0400,
Tofig Suleymanov wrote:
> 
> Hello list,
> 
> I will be grateful if someone could point me to the right direction 
> regarding the question below.
> 
> My device driver is getting incoming packets fine, but for some reason I 
> am not able to send a single  packet. Here is the source code: 
> http://www.freebsd.az/if_ib.c
> 
> I've added several debug messages to the source and here is the output:
> 
> (bringing interface up and assigning the ip/netmask combination)
> 
> ifconfig ib0 192.168.0.6 netmask 255.255.255.0 up
> 
> (and here is what I get in /var/log/messages; it seems to be a 
> standard arp broadcast)
> 
> Feb  7 19:14:32 schizo kernel: ib_init entered
> Feb  7 19:14:32 schizo kernel: ib_start entered
> Feb  7 19:14:32 schizo kernel: ib_encap entered
> Feb  7 19:14:32 schizo kernel: DHOST ff ff ff ff ff ff
> Feb  7 19:14:32 schizo kernel: SHOST  0 c0 ee 22  3 14
> Feb  7 19:14:32 schizo kernel: txeof entered
> Feb  7 19:14:32 schizo kernel: txeof exiting
> 
> (now I try pinging, but no joy. I've added extra debug messages inside 
> ping.c)
> 
> schizo# ping 192.168.0.1
> PING 192.168.0.1 (192.168.0.1): 56 data bytes
> packets sent: -1
> ping: sendto: Invalid argument
> packets sent: -1
> ping: sendto: Invalid argument
> packets sent: -1
> ping: sendto: Invalid argument
> ^C
> --- 192.168.0.1 ping statistics ---
> 3 packets transmitted, 0 packets received, 100% packet loss
> 
> 
> I have also tried to add debug messages to sys/net/if.c and 
> sys/net/netisr.c and it seems that the kernel doesn't even try to run my 
> ib_start() function.
> 

Some things to try:

1) Add debug statements to the ib_start() routine.

2) See if bpf works (tcpdump -i ib0)

3) Show us the output of:

ifconfig ib0

netstat -i

Best,
George


Problems with Chelsio driver in CURRENT...

2008-02-12 Thread gnn
Hi,

I have two MP/Multicore Xeon boxes with CX4 based Chelsio cards in
them.  If I boot 7.0-RC1 the cards can talk to each other.  If I build
a recent kernel/world (for instance from today) I cannot ping between
them.  I have tried using GENERIC as well as a custom kernel.

kodama8# ifconfig cxgb0
cxgb0: flags=8843 metric 0 mtu 9000

options=1bb
ether 00:07:43:05:20:68
inet 172.16.0.2 netmask 0xff00 broadcast 172.16.0.255
media: Ethernet 10Gbase-CX4  (autoselect )
status: active
kodama8# ping 172.16.0.1
PING 172.16.0.1 (172.16.0.1): 56 data bytes
^C
--- 172.16.0.1 ping statistics ---
5 packets transmitted, 0 packets received, 100.0% packet loss
kodama8# 

kodama$ uname -a
FreeBSD kodama8.neville-neil.comA 7.0-RC1 FreeBSD 7.0-RC1 #0: Mon Dec 24 
10:10:07 UTC 2007 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC  amd64
kodama8# 

nozomi8# ifconfig cxgb0
cxgb0: flags=8843 metric 0 mtu 9000

options=1bb
ether 00:07:43:05:20:43
inet 172.16.0.1 netmask 0xff00 broadcast 172.16.0.255
media: Ethernet 10Gbase-CX4  (autoselect )
status: active
nozomi8# 

nozomi8# uname -a
FreeBSD nozomi8.neville-neil.com 8.0-CURRENT FreeBSD 8.0-CURRENT #2: Wed Feb 13 
15:47:05 JST 2008 [EMAIL 
PROTECTED]:/usr/obj/scratch/FreeBSD.HEAD/src/sys/GENERIC  amd64
nozomi8# 


The dmesg is at the end of this mail.

Thoughts?

Thanks,
George


nozomi8# dmesg
Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.0-CURRENT #2: Wed Feb 13 15:47:05 JST 2008
[EMAIL PROTECTED]:/usr/obj/scratch/FreeBSD.HEAD/src/sys/GENERIC
WARNING: WITNESS option enabled, expect reduced performance.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU   X5355  @ 2.66GHz (2666.68-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x6f7  Stepping = 7
  
Features=0xbfebfbff
  Features2=0x4e3bd
  AMD Features=0x20100800
  AMD Features2=0x1
  Cores per package: 4
usable memory = 8575602688 (8178 MB)
avail memory  = 8306462720 (7921 MB)
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
 cpu2 (AP): APIC ID:  2
 cpu3 (AP): APIC ID:  3
 cpu4 (AP): APIC ID:  4
 cpu5 (AP): APIC ID:  5
 cpu6 (AP): APIC ID:  6
 cpu7 (AP): APIC ID:  7
ioapic0  irqs 0-23 on motherboard
ioapic1  irqs 24-47 on motherboard
kbd1 at kbdmux0
ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
acpi0:  on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
acpi_hpet0:  iomem 0xfed0-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
cpu0:  on acpi0
est0:  on cpu0
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 82a082a0600082a
device_attach: est0 attach returned 6
p4tcc0:  on cpu0
cpu1:  on acpi0
est1:  on cpu1
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 82a082a0600082a
device_attach: est1 attach returned 6
p4tcc1:  on cpu1
cpu2:  on acpi0
est2:  on cpu2
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 82a082a0600082a
device_attach: est2 attach returned 6
p4tcc2:  on cpu2
cpu3:  on acpi0
est3:  on cpu3
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 82a082a0600082a
device_attach: est3 attach returned 6
p4tcc3:  on cpu3
cpu4:  on acpi0
est4:  on cpu4
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 82a082a0600082a
device_attach: est4 attach returned 6
p4tcc4:  on cpu4
cpu5:  on acpi0
est5:  on cpu5
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 82a082a0600082a
device_attach: est5 attach returned 6
p4tcc5:  on cpu5
cpu6:  on acpi0
est6:  on cpu6
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 82a082a0600082a
device_attach: est6 attach returned 6
p4tcc6:  on cpu6
cpu7:  on acpi0
est7:  on cpu7
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 82a082a0600082a
device_attach: est7 attach returned 6
p4tcc7:  on cpu7
pcib0:  port 0xcf8-0xcff on acpi0
pci0:  on pcib0
pcib1:  at device 2.0 on pci0
pci1:  on pcib1
pcib2:  irq 16 at device 0.0 on pci1
pci2:  on pcib2
pcib3:  irq 16 at device 0.0 on pci2
pci3:  on pcib3
pcib4:  at device 0.0 on pci3
pci4:  on pcib4
ahd0:  port 0x2400-0x24ff,0x2000-0x20ff 
mem 0xd8b0-0xd8b01fff irq 16 at device 2.0 on pci4
ahd0: [ITHREAD]
aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs
ahd1:  port 0x2c00-0x2cff,

Re: Problems with Chelsio driver in CURRENT...

2008-02-13 Thread gnn
At Wed, 13 Feb 2008 00:52:52 -0800,
Kip Macy wrote:
> 
> Oops sorry ... What is the output of 'sysctl dev.cxgbc.0'?
> 

Here ya go, and thanks!

Later,
George


nozomi8# ifconfig cxgb0
cxgb0: flags=8843 metric 0 mtu 9000

options=1bb
ether 00:07:43:05:20:43
inet 172.16.0.1 netmask 0xff00 broadcast 172.16.0.255
media: Ethernet 10Gbase-CX4  (autoselect )
status: active
nozomi8# sysctl dev.cxgbc.0
dev.cxgbc.0.%desc: Chelsio T310 RNIC, 1 port
dev.cxgbc.0.%driver: cxgbc
dev.cxgbc.0.%location: slot=0 function=0
dev.cxgbc.0.%pnpinfo: vendor=0x1425 device=0x0030 subvendor=0x1425 
subdevice=0x0001 class=0x02
dev.cxgbc.0.%parent: pci9
dev.cxgbc.0.firmware_version: 4.7.0
dev.cxgbc.0.enable_debug: 0
dev.cxgbc.0.tunq_coalesce: 0
dev.cxgbc.0.txq_overrun: 0
dev.cxgbc.0.pcpu_cache_enable: 1
dev.cxgbc.0.cache_alloc: 0
dev.cxgbc.0.cached: 0
dev.cxgbc.0.ext_freed: 0
dev.cxgbc.0.mbufs_outstanding: 0
dev.cxgbc.0.pack_outstanding: 0
dev.cxgbc.0.intr_coal: 1
dev.cxgbc.0.port0.nqsets: 8
dev.cxgbc.0.port0.qs0.rspq.size: 1024
dev.cxgbc.0.port0.qs0.rspq.cidx: 2
dev.cxgbc.0.port0.qs0.rspq.credits: 2
dev.cxgbc.0.port0.qs0.rspq.phys_addr: 0x03cf
dev.cxgbc.0.port0.qs0.rspq.dump_start: 0
dev.cxgbc.0.port0.qs0.rspq.dump_count: 0
dev.cxgbc.0.port0.qs0.txq_eth.dropped: 0
dev.cxgbc.0.port0.qs0.txq_eth.sendqlen: 0
dev.cxgbc.0.port0.qs0.txq_eth.queue_pidx: 0
dev.cxgbc.0.port0.qs0.txq_eth.queue_cidx: 0
dev.cxgbc.0.port0.qs0.txq_eth.processed: 0
dev.cxgbc.0.port0.qs0.txq_eth.cleaned: 0
dev.cxgbc.0.port0.qs0.txq_eth.in_use: 1
dev.cxgbc.0.port0.qs0.txq_eth.frees: 0
dev.cxgbc.0.port0.qs0.txq_eth.skipped: 0
dev.cxgbc.0.port0.qs0.txq_eth.coalesced: 0
dev.cxgbc.0.port0.qs0.txq_eth.enqueued: 1
dev.cxgbc.0.port0.qs0.txq_eth.stopped_flags: 0
dev.cxgbc.0.port0.qs0.txq_eth.phys_addr: 0x7e7c
dev.cxgbc.0.port0.qs0.txq_eth.qgen: 1
dev.cxgbc.0.port0.qs0.txq_eth.hw_cidx: 0
dev.cxgbc.0.port0.qs0.txq_eth.hw_pidx: 1
dev.cxgbc.0.port0.qs0.txq_eth.dump_start: 0
dev.cxgbc.0.port0.qs0.txq_eth.dump_count: 0
dev.cxgbc.0.port0.qs1.rspq.size: 1024
dev.cxgbc.0.port0.qs1.rspq.cidx: 0
dev.cxgbc.0.port0.qs1.rspq.credits: 0
dev.cxgbc.0.port0.qs1.rspq.phys_addr: 0x8456
dev.cxgbc.0.port0.qs1.rspq.dump_start: 0
dev.cxgbc.0.port0.qs1.rspq.dump_count: 0
dev.cxgbc.0.port0.qs1.txq_eth.dropped: 0
dev.cxgbc.0.port0.qs1.txq_eth.sendqlen: 0
dev.cxgbc.0.port0.qs1.txq_eth.queue_pidx: 0
dev.cxgbc.0.port0.qs1.txq_eth.queue_cidx: 0
dev.cxgbc.0.port0.qs1.txq_eth.processed: 0
dev.cxgbc.0.port0.qs1.txq_eth.cleaned: 0
dev.cxgbc.0.port0.qs1.txq_eth.in_use: 0
dev.cxgbc.0.port0.qs1.txq_eth.frees: 0
dev.cxgbc.0.port0.qs1.txq_eth.skipped: 0
dev.cxgbc.0.port0.qs1.txq_eth.coalesced: 0
dev.cxgbc.0.port0.qs1.txq_eth.enqueued: 0
dev.cxgbc.0.port0.qs1.txq_eth.stopped_flags: 0
dev.cxgbc.0.port0.qs1.txq_eth.phys_addr: 0x8464
dev.cxgbc.0.port0.qs1.txq_eth.qgen: 1
dev.cxgbc.0.port0.qs1.txq_eth.hw_cidx: 0
dev.cxgbc.0.port0.qs1.txq_eth.hw_pidx: 0
dev.cxgbc.0.port0.qs1.txq_eth.dump_start: 0
dev.cxgbc.0.port0.qs1.txq_eth.dump_count: 0
dev.cxgbc.0.port0.qs2.rspq.size: 1024
dev.cxgbc.0.port0.qs2.rspq.cidx: 0
dev.cxgbc.0.port0.qs2.rspq.credits: 0
dev.cxgbc.0.port0.qs2.rspq.phys_addr: 0x86b4
dev.cxgbc.0.port0.qs2.rspq.dump_start: 0
dev.cxgbc.0.port0.qs2.rspq.dump_count: 0
dev.cxgbc.0.port0.qs2.txq_eth.dropped: 0
dev.cxgbc.0.port0.qs2.txq_eth.sendqlen: 0
dev.cxgbc.0.port0.qs2.txq_eth.queue_pidx: 0
dev.cxgbc.0.port0.qs2.txq_eth.queue_cidx: 0
dev.cxgbc.0.port0.qs2.txq_eth.processed: 0
dev.cxgbc.0.port0.qs2.txq_eth.cleaned: 0
dev.cxgbc.0.port0.qs2.txq_eth.in_use: 0
dev.cxgbc.0.port0.qs2.txq_eth.frees: 0
dev.cxgbc.0.port0.qs2.txq_eth.skipped: 0
dev.cxgbc.0.port0.qs2.txq_eth.coalesced: 0
dev.cxgbc.0.port0.qs2.txq_eth.enqueued: 0
dev.cxgbc.0.port0.qs2.txq_eth.stopped_flags: 0
dev.cxgbc.0.port0.qs2.txq_eth.phys_addr: 0x86b6
dev.cxgbc.0.port0.qs2.txq_eth.qgen: 1
dev.cxgbc.0.port0.qs2.txq_eth.hw_cidx: 0
dev.cxgbc.0.port0.qs2.txq_eth.hw_pidx: 0
dev.cxgbc.0.port0.qs2.txq_eth.dump_start: 0
dev.cxgbc.0.port0.qs2.txq_eth.dump_count: 0
dev.cxgbc.0.port0.qs3.rspq.size: 1024
dev.cxgbc.0.port0.qs3.rspq.cidx: 0
dev.cxgbc.0.port0.qs3.rspq.credits: 0
dev.cxgbc.0.port0.qs3.rspq.phys_addr: 0x8815
dev.cxgbc.0.port0.qs3.rspq.dump_start: 0
dev.cxgbc.0.port0.qs3.rspq.dump_count: 0
dev.cxgbc.0.port0.qs3.txq_eth.dropped: 0
dev.cxgbc.0.port0.qs3.txq_eth.sendqlen: 0
dev.cxgbc.0.port0.qs3.txq_eth.queue_pidx: 0
dev.cxgbc.0.port0.qs3.txq_eth.queue_cidx: 0
dev.cxgbc.0.port0.qs3.txq_eth.processed: 0
dev.cxgbc.0.port0.qs3.txq_eth.cleaned: 0
dev.cxgbc.0.port0.qs3.txq_eth.in_use: 0
dev.cxgbc.0.port0.qs3.txq_eth.frees: 0
dev.cxgbc.0.port0.qs3.txq_eth.skipped: 0
dev.cxgbc.0.port0.qs3.txq_eth.coalesced: 0
dev.cxgbc.0.port0.qs3.txq_eth.enqueued: 0
dev.cxgbc.0.port0.qs3.txq_eth.stopped_flags: 0
dev.cxgbc.0.port0.qs3.txq_eth.phys_addr: 0x8816
dev.cxgbc.0.port0.qs3.txq_eth.qgen: 1
dev.cxgbc.0.port0.qs3.txq_eth.hw_cidx: 

Re: Problems with Chelsio driver in CURRENT...

2008-02-13 Thread gnn
OK, one more data point.  

The issue is somewhere between RC2 and CURRENT.  I just put RC2 on the
same box, and RC1 can talk to RC2 over the Chelsio cards.

I have now tried RC2 and CURRENT and still no dice.

Best,
George


Re: panic in 6.3-RELEASE when multi-cast client exits

2008-02-19 Thread gnn
At Tue, 19 Feb 2008 14:00:56 +,
Bruce M. Simpson wrote:
> 
> Rob Watt wrote:
> > Hi.
> >
> > We recently upgraded some of our machines to 6.3-RELEASE and we have been
> > plagued by repeatable panics when our multi-cast client applications exit.
> > Our machines have Intel X5365 processors, LSI MegaSAS 1064R cards, and Intel
> > Pro 1000 MF nic cards (although we have seen this problem with the onboard
> > Intel copper nics as well). We have seen this panic with machines that have
> > Tyan boards as well as Super Micro. I have seen a few postings that seem to
> > refer to related panics, and bug
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=116077 contains a patch that
> > seems like it should address the problem, but our patched system still
> > panics. I have attached the output from 3 of the dumps/backtraces. Dump #1
> > is probably the most useful. I am happy to provide more info if necessary.
> >   
> 
> Some folk reported that they didn't see this problem occur with the code 
> in 7.x, which jibes as I rewrote some of the logic in that branch. It's 
> been nearly a year since I last had time to look at anything related to 
> this.
> 
> My understanding is that 7.0 is getting closer to release status so you 
> may wish to try reproducing the problem there.
> 
> The human resource situation hasn't changed much on my end, though I am 
> getting closer to having time to finishing IGMPv3 (it's needed for other 
> stuff in the future). I haven't been able to reproduce the bug in the 
> PR, which makes suggesting other courses of action difficult.
> 

I can reproduce this panic with a small piece of code I've been
hacking for work.  The code depends on classes that are proprietary
but the program itself is simple and I'll ask work if I can sanitize
it in the next few days.  The program is intended as a multicast
jitter/latency tester, but works well as a general exerciser of the
multicast code.

The panic is basically an issue with terminating a process and
handling the multicast address lists on the interface.  I have not
tracked down the exact cause as yet but am working on it now.

Best,
George


Re: IPV6_TCLASS missing from ip6(4)

2008-02-20 Thread gnn
At Wed, 20 Feb 2008 18:25:05 +,
Bruce M Simpson wrote:
> 
> I just noticed that whilst the socket code appears to support 
> IPV6_TCLASS, we don't document it.
> 
> I  haven't raised a PR for this issue yet nor have I written a patch.
> 

Please do both :-)

Thanks,
George


Re: panic in 6.3-RELEASE when multi-cast client exits

2008-02-21 Thread gnn
FYI this is fixed by a one line change that is about to hit 6-STABLE:

@@ -991,7 +991,6 @@
          * a new record.  Otherwise, we are done.
          */
         if (ifma->ifma_protospec != NULL) {
-                if_delmulti_ent(ifma);  /* We don't need another reference */
                 IN_MULTI_UNLOCK();
                 IFF_UNLOCKGIANT(ifp);
                 return ifma->ifma_protospec;

Sent to me by Stephan Uphoff.

I tested it today.

Best,
George


Re: LOR icmp6_input/nd6_lookup

2008-03-03 Thread gnn
At Fri, 29 Feb 2008 13:44:27 -0600,
Kevin Day wrote:
> 
> This is from 7.0-RELEASE:
> 
> lock order reversal:
>   1st 0xc3bde2b8 rtentry (rtentry) @ netinet6/nd6.c:1930
>   2nd 0xc3af367c radix node head (radix node head) @ net/route.c:147
> KDB: stack backtrace:
> db_trace_self_wrapper
> (c08af130,e11b8600,c0662bbe,c08b1592,c3af367c,...) at  
> db_trace_self_wrapper+0x26
> kdb_backtrace(c08b1592,c3af367c,c08b15f3,c08b15f3,c08b9ce7,...) at  
> kdb_backtrace+0x29
> witness_checkorder(c3af367c,9,c08b9cde,93,e11b8624,...) at  
> witness_checkorder+0x6de
> _mtx_lock_flags(c3af367c,0,c08b9cde,93,c066160b,...) at _mtx_lock_flags 
> +0xbc
> rtalloc1(e11b86e0,0,0,0,c3c9d01c,...) at rtalloc1+0x63
> nd6_lookup(c3c9d024,0,c39fd800,c3bde258,c3bde258,...) at nd6_lookup+0x55
> nd6_is_addr_neighbor(c3c9d01c,c39fd800,c08c1d75,78a,c09a5ed8,...) at  
> nd6_is_addr_neighbor+0x3b
> nd6_output(c39fd800,c39fd800,c3cf9b00,c3c9d01c,c3bde258,...) at  
> nd6_output+0x10f
> ip6_output(c3cf9b00,0,e11b88e0,0,0,...) at ip6_output+0x1081
> icmp6_reflect(c3cf9b00,28,8,1,c08c96d0,...) at icmp6_reflect+0x42f
> icmp6_input(e11b8c88,e11b8c70,3a,1d5,0,...) at icmp6_input+0x6dc
> ip6_input(c3be2900,0,c08b9887,8c,c09a1e24,...) at ip6_input+0xe36
> netisr_processqueue(c0955e30,0,c08b9887,f6,c3865a40,...) at  
> netisr_processqueue+0x8b
> swi_net(0,0,c08a938d,471,c3870364,...) at swi_net+0x9b
> ithread_loop(c383ac90,e11b8d38,c08a9115,305,c3873000,...) at  
> ithread_loop+0x1b5
> fork_exit(c060fbe0,c383ac90,e11b8d38) at fork_exit+0xb8
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0, eip = 0, esp = 0xe11b8d70, ebp = 0 ---
> 
> Are LOR's still PR-worthy?

Yes, can you file one?

Best,
George


Re: [PATCH] kern/120958: no response to ICMP traffic on interface configured with a link-local address

2008-03-20 Thread gnn
At Thu, 13 Mar 2008 20:58:25 -0400,
James Snow wrote:
> 
> [1  ]
> On Thu, Mar 13, 2008 at 08:40:07PM -0400, James Snow wrote:
> > 
> > Also, I took a cue from the IN_LINKLOCAL() macro and added two new
> > macros to sys/netinet/in.h to perform checks for the loopback network
> > and the "zero" network.  IN_LOOPBACK() and IN_ZERONET(), respectively.
> 
> Woops.  I suppose the macros are more useful when they're actually
> called.
> 
> Attached is a revised patch that performs the check for loopback
> addresses less than twice but more than never.
> 

This looks good.
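
For readers without the patch at hand, the following is a sketch of what such macros could look like, modelled on the mask-and-compare shape of IN_LINKLOCAL().  It is not the text of the attached patch: the SKETCH_ prefix and the helper `sketch_is_special` are made up for illustration, and addresses are assumed to be in host byte order.

```c
#include <stdint.h>

/* Addresses are in host byte order. */
#define SKETCH_IN_LINKLOCAL(i) (((uint32_t)(i) & 0xffff0000u) == 0xa9fe0000u) /* 169.254.0.0/16 */
#define SKETCH_IN_LOOPBACK(i)  (((uint32_t)(i) & 0xff000000u) == 0x7f000000u) /* 127.0.0.0/8 */
#define SKETCH_IN_ZERONET(i)   (((uint32_t)(i) & 0xff000000u) == 0x00000000u) /* 0.0.0.0/8 */

/* Hypothetical helper: is the address on the loopback or "zero" network? */
int
sketch_is_special(uint32_t host_order_addr)
{
    return SKETCH_IN_LOOPBACK(host_order_addr) ||
        SKETCH_IN_ZERONET(host_order_addr);
}
```

A caller holding a struct in_addr would convert with ntohl() before applying any of these.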

Best,
George


A new tool to measure multicast performance...

2008-04-03 Thread gnn
Howdy,

I have just finished updating a new tool in src/tools/tools/mctest
which is a multicast test program.  The mctest program works by
sending packets from a source to a sink over a multicast address
and then the sink reflects the packets it receives back to the
source.  The source records the transmission and reception time of
each packet and reports the round trip time, while the sink prints out
the time between packets, in microseconds.  The program is best used
to debug ethernet drivers as well as our multicast and UDP code.
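
As a rough illustration of the timing arithmetic described above (this is a sketch, not code taken from mctest), both the round trip time at the source and the gap between packets at the sink reduce to the difference of two timestamps expressed in microseconds.  The function name `sketch_elapsed_usec` and the use of struct timespec are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Microseconds elapsed from *start to *end; end must not precede start. */
int64_t
sketch_elapsed_usec(const struct timespec *start, const struct timespec *end)
{
    int64_t total_nsec;

    total_nsec = ((int64_t)end->tv_sec - (int64_t)start->tv_sec) * 1000000000LL +
        ((int64_t)end->tv_nsec - (int64_t)start->tv_nsec);
    assert(total_nsec >= 0);
    return total_nsec / 1000;
}
```

The source would call this with the send and receive times of one packet; the sink would call it with the arrival times of two consecutive packets.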

For more information please read the manual page.

Sorry, IPv6 is not supported as yet, only IPv4.

Best,
George


Re: kern/120958: no response to ICMP traffic on interface configured with a link-local address

2008-04-17 Thread gnn
Synopsis: no response to ICMP traffic on interface configured with a link-local 
address

State-Changed-From-To: open->patched
State-Changed-By: gnn
State-Changed-When: Thu Apr 17 12:51:46 UTC 2008
State-Changed-Why: 
User submitted a patch which is now applied and tested.

Take over bug until closed.


Responsible-Changed-From-To: freebsd-net->gnn
Responsible-Changed-By: gnn
Responsible-Changed-When: Thu Apr 17 12:51:46 UTC 2008
Responsible-Changed-Why: 
The user's suggested patch has been applied.

Take the bug over until it's closed.

http://www.freebsd.org/cgi/query-pr.cgi?pr=120958


Re: Regarding if_alloc()

2008-04-17 Thread gnn
At Thu, 17 Apr 2008 18:35:23 -0700 (PDT),
vijay singh wrote:
> 
> Hi all. How do we avoid a race in populating the ifindex_table? If
> this is a TODO, as it seems from the code below, would it be
> acceptable if I wrote a patch and reused the ifnet_lock
> [IFNET_WLOCK, IFNET_WUNLOCK]?
> 

It is almost always acceptable to submit a patch :-)

> 
> if_alloc(u_char type)
> {
>     struct ifnet *ifp;
> 
>     ifp = malloc(sizeof(struct ifnet), M_IFNET, M_WAITOK|M_ZERO);
> 
>     /*
>      * Try to find an empty slot below if_index.  If we fail, take
>      * the next slot.
>      *
>      * XXX: should be locked!
>      */
>     for (ifp->if_index = 1; ifp->if_index <= if_index; ifp->if_index++) {
>         if (ifnet_byindex(ifp->if_index) == NULL)
>             break;
>     }
> 
> 

There are still parts of the network device infrastructure that need
some locking, and it would seem that this is one of them.  I know
Brooks Davis was also looking at this stuff so he may comment as well.

Best,
George


Re: MFC of TOE support to RELENG_7

2008-04-17 Thread gnn
At Thu, 17 Apr 2008 21:00:04 -0700,
Kip Macy wrote:
> 
> I would like to MFC TOE and RDMA support in the last week of May /
> first week of June. My primary objective is that it be present in 7.1.
> The re team has not yet decided when the freeze date for 7.1 will be,
> so I may end up asking to do it earlier.
> 
> The reason I'm bringing it up roughly 6 weeks in advance is that there
> is a certain amount of debate with regards to the ABI guarantees that
> FreeBSD network developers are willing to commit to for the remaining
> life of the RELENG_7 branch.
> 
> I've made the following two simplifying assumptions:
>- struct tcpcb and struct sockbuf are append only - i.e. if members
> are added, they will be added to the end
>- lock ordering will not change, e.g. the inpcb lock will always be
> acquired before the sockbuf lock
> 
> Is there any reason to believe that these simplifying assumptions are
> not acceptable? If so, why?
> 
> 
> I've added the following sets of accessor functions:
>- lock acquire/release for socket, sockbuf, inpcb
>- higher level functions for tcp shutdown and syncache to abstract
> away the tcbinfo lock
>- accessor functions for all the accessed fields in socket and
> inpcb so that none of the members are referenced as offsets from the
> base of the structure

I apologize for not yet reviewing all the code.  I take that last bit
to mean the drivers can reach up into sockets given those functions?
I gather this is due to the work necessary to implement RDMA over TCP?
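
For readers less familiar with the ABI concern in the quoted text, a small sketch may help.  It uses two invented structures (`sketch_pcb_v1`, `sketch_pcb_v2`) rather than the real tcpcb or inpcb, and shows the two ideas side by side: appending a member leaves the offsets of existing members unchanged, and an accessor function means a module never depends on an offset in the first place.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure as an older module was compiled against it. */
struct sketch_pcb_v1 {
    int   state;
    long  snd_next;
};

/* The same structure after a later change appended one member at the end. */
struct sketch_pcb_v2 {
    int   state;
    long  snd_next;
    long  new_counter;      /* appended, so earlier offsets are unchanged */
};

/* Accessor: callers that go through this never hard-code an offset. */
long
sketch_pcb_snd_next(const struct sketch_pcb_v2 *pcb)
{
    assert(pcb != NULL);
    return pcb->snd_next;
}
```

Append-only growth protects code that reads existing members directly, but not code that embeds the structure or depends on its size, which is one reason accessors are the stronger guarantee.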

> The current state of the code can be seen at:
> http://157.22.130.171/svn/branches/projects/iwarp/sys/

Is there a simple way to get just that directory without doing an svn
checkout of your whole repo?  And if not, what's the easiest way to just grab
that stuff?  

Best,
George


zonelimit issues...

2008-04-18 Thread gnn
Hi,

I am wondering why this patch was never committed?  

 http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround

It does seem to address an issue I'm seeing where processes get into
the zonelimit state through the use of mbufs (a high speed UDP packet
receiver) but even after network pressure is reduced/removed the
process never gets out of that state again.  Applying the patch fixed
the issue, but I'd like to have some discussion as to the general
merits of the approach.

Unfortunately the test that currently causes this is tied very tightly
to code at work that I can't share, but I will hopefully be improving
mctest to try to exhibit this behavior.

Best,
George


Re: zonelimit issues...

2008-04-21 Thread gnn
At Sun, 20 Apr 2008 09:53:49 -0700,
Chris Pratt wrote:
> 
> 
> On Apr 20, 2008, at 2:43 AM, Robert Watson wrote:
> 
> >
> > On Fri, 18 Apr 2008, Chris Pratt wrote:
> >
> >> Doesn't 7.0 fix this? I'd like to see an official definitive  
> >> answer and all I've been going on is that the problem description  
> >> is no longer in the errata.
> >
> > Unfortunately, bugs of this sort don't really "work" that way --  
> > specific bugs are a property of a problem in code (or a problem in  
> > design), but what we have right now is a report of a symptom that  
> > might reflect zero or more specific bugs.  It's unclear that the  
> > problem described in errata is the problem you've been  
> > experiencing, or that the (at least one) fixed bug with the same  
> > symptoms is the one you've been experiencing.  For better or  
> > worse, the only way to really tell if a generic class of hang or  
> > wedging is fixed is to try out the new version and see.  In most  
> > cases, "zonelimit" wedging reflects one of two things:
> >
> > (1) Inadequate resource allocation to the network stack or some other
> > component, try tuning up the memory tunable for clusters (for  
> > example).
> >
> For several months I did quite a bit of tuning. I never increased
> nmbclusters beyond the 32768 shown in the docs because man
> tuning doesn't define its use of "arbitrarily high". Inability to boot
> could mean travel. Kris Kennaway had provided instructions to
> get a dump. I set up for that but have never had a dump. The
> only respite came from adding another circuit, another NIC and
> spreading traffic. We increased our lock time from every couple
> of days during the heavy bot period of late 2006 to now every
> month or during traditionally slow months, even two months.
> For example, we ran a record 72 days last summer. It was a
> very dead summer traffic wise.
> 
> I will try to increase the nmbclusters dramatically if I can figure
> out what a safe top limit is but it sounds like the jump to
> 7.0 RELEASE may be worth the effort. I would want to wait
> until this issue with TCP, Windows and certain routers is well
> past. I had not seen that applied to 7_0_0 yet and that would be
> a show stopper. Is there a way to know what is safe for
> nmbclusters given an 8GB ram system?

On "big" systems I am currently using 65000, and that seems safe so
far.  This is on an 8 core (2P) Xeon box with 8G of RAM.
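
As a back-of-the-envelope check on why a figure like that is modest on such a machine, here is a sketch that assumes each mbuf cluster is 2048 bytes and counts only cluster storage, ignoring mbuf headers and every other kernel zone.  The names `SKETCH_MCLBYTES` and `sketch_cluster_pool_bytes` are made up for the example.

```c
#include <stdint.h>

#define SKETCH_MCLBYTES 2048u   /* assumed size of one mbuf cluster, in bytes */

/* Bytes of cluster storage if all nclusters clusters are in use. */
uint64_t
sketch_cluster_pool_bytes(uint32_t nclusters)
{
    return (uint64_t)nclusters * SKETCH_MCLBYTES;
}
```

Under that assumption 65000 clusters come to 133,120,000 bytes, roughly 127 MB, which is under 2% of 8G; 32768 clusters come to 64 MB.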

> I did vmstats data collection for a couple of months when things
> were at their worst. The results were nebulous to me based
> on lack of code knowledge. All I actually found was that a
> certain counter would drop to 0 and never recover. I didn't
> know if it was meaningful and received no replies when I
> asked FreeBSD-Questions. It was 128-Bucket or something
> like that.
> 
> > (2) A memory leak in a network device driver or other network part,  
> > which
> > needs to be debugged and fixed.
> >
> 
> Initially I thought there may be something related to the bge
> driver and moved the high traffic apps on an em. This didn't
> seem to help much, nor did polling.
> 
> I am most willing to collect data if I could figure out how to
> collect something meaningful. I gather from what you say,
> that 7.0 would provide this.
> 
> I really appreciate both of your responses. Just based on
> this one problem, 6.x has been a bad experience after
> years of seemingly impossible uptime on 4 and 5.x
> FreeBSD.

Well there are plenty of us motivated to get at these issues.  Can you
do me a favor and characterize your traffic a bit?  Is it mostly TCP,
or heavily UDP or some sort of mix?  The issues I see are UDP based,
which is less surprising as UDP has no backpressure and it is easy to
over commit the system by upping the socket buffer space allocated
without upping the number of clusters to compensate.
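
A minimal sketch of that over-commit arithmetic, assuming 2048-byte clusters (MCLBYTES on these systems) and the worst case where every receive buffer fills completely. The helper name is hypothetical and not a kernel function, and it ignores mbuf header overhead:

```c
#include <stddef.h>

/* Assumed size of a standard mbuf cluster in bytes (MCLBYTES). */
#define CLUSTER_BYTES 2048UL

/*
 * Worst-case number of clusters needed if every one of `nsockets`
 * receive buffers, each `sockbuf_bytes` large, fills up completely.
 * Hypothetical helper for illustration only.
 */
unsigned long
worst_case_clusters(unsigned long nsockets, unsigned long sockbuf_bytes)
{
    /* Round up: a partly used cluster is still a whole cluster. */
    unsigned long per_socket =
        (sockbuf_bytes + CLUSTER_BYTES - 1) / CLUSTER_BYTES;

    return (nsockets * per_socket);
}
```

Comparing the result against kern.ipc.nmbclusters gives a rough idea of whether the configured socket buffer space could exhaust the cluster pool before any sender slows down.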

Best,
George

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: zonelimit issues...

2008-04-21 Thread gnn
At Sun, 20 Apr 2008 10:32:25 +0100 (BST),
rwatson wrote:
> 
> 
> On Fri, 18 Apr 2008, [EMAIL PROTECTED] wrote:
> 
> > I am wondering why this patch was never committed?
> >
> > http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround
> >
> > It does seem to address an issue I'm seeing where processes get into the 
> > zonelimit state through the use of mbufs (a high speed UDP packet receiver) 
> > but even after network pressure is reduced/removed the process never gets 
> > out of that state again.  Applying the patch fixed the issue, but I'd like 
> > to have some discussion as to the general merits of the approach.
> >
> > Unfortunately the test that currently causes this is tied very tightly to
> > code at work that I can't share, but I will hopefully be improving mctest
> > to try to exhibit this behavior.
> 
> When you take all load off the system, do mbufs and clusters get properly 
> freed back to UMA (as visible in netstat -m)?  If not, continuing to bump up 
> against the zonelimit would suggest an mbuf/cluster leak, in which case we 
> need to track that bug.
> 

This is unclear as the process that creates the issue opens 50 UDP
multicast sockets with very large socket buffers.  I am investigating
this aspect some more.

> You might consider adding a debugging-only zonelimit waiter count to
> the UMA zone, and checks/assertions that a wakeup is being generated
> properly.  

Yes.  Do you have an example I can easily steal?

> That is, to confirm that the wakeup is generated when memory is
> freed up if there are threads waiting.  There is at least one as-yet
> MFC'd fix to the sleep/wakeup code, I believe, that might be
> relevant here.  Is the problem you're reporting on 7.x, or on 8.x?
> If 8.x, that's probably not it, but if 7.x, it could be.  (This same
> sleep/wakeup bug occasionally leads to wedging of dump(8), I
> believe).

I have seen this on 7.0 RELEASE, on STABLE, and on CURRENT (8).

I am currently working on it on CURRENT because if I have a fix it's
going to have to go there first.

Best,
George




Re: zonelimit issues...

2008-04-21 Thread gnn
At Mon, 21 Apr 2008 16:46:00 +0900,
[EMAIL PROTECTED] wrote:
> 
> At Sun, 20 Apr 2008 10:32:25 +0100 (BST),
> rwatson wrote:
> > 
> > 
> > On Fri, 18 Apr 2008, [EMAIL PROTECTED] wrote:
> > 
> > > I am wondering why this patch was never committed?
> > >
> > > http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround
> > >
> > > It does seem to address an issue I'm seeing where processes get into the
> > > zonelimit state through the use of mbufs (a high speed UDP packet
> > > receiver) but even after network pressure is reduced/removed the process
> > > never gets out of that state again.  Applying the patch fixed the issue,
> > > but I'd like to have some discussion as to the general merits of the
> > > approach.
> > >
> > > Unfortunately the test that currently causes this is tied very tightly to
> > > code at work that I can't share, but I will hopefully be improving mctest
> > > to try to exhibit this behavior.
> >
> > When you take all load off the system, do mbufs and clusters get properly
> > freed back to UMA (as visible in netstat -m)?  If not, continuing to bump
> > up against the zonelimit would suggest an mbuf/cluster leak, in which case
> > we need to track that bug.
> > 
> 
> This is unclear as the process that creates the issue opens 50 UDP
> multicast sockets with very large socket buffers.  I am investigating
> this aspect some more.
> 

OK, yes, the clusters etc. go back to normal when the incoming
pressure is released.  I do not believe we have a cluster/mbuf leak.

Best,
George


Re: zonelimit issues...

2008-04-22 Thread gnn
At Tue, 22 Apr 2008 06:35:38 -0700,
Chris Pratt wrote:
> 
> 
> On Apr 21, 2008, at 12:43 AM, [EMAIL PROTECTED] wrote:
> 
> > ...snip
> >
> > Well there are plenty of us motivated to get at these issues.  Can you
> > do me a favor and characterize your traffic a bit?  Is it mostly TCP,
> 
> The traffic that seems to take us out is TCP port 80. I'll make a
> generalized guess but it does seem to follow. We freeze on one of
> two dramatically heavy use days for our industry (Sunday and Monday
> evening). The hang will actually occur on Monday or Tuesday
> following these days if sufficient traffic hits us. It has not
> always followed this pattern but most frequently. There is always a
> high presence of high frequency attacks of various sorts. For
> example referer spam posts which hit us hard on our busy
> evenings. So it is TCP and I would presume we usually have the
> establishment of many useless sessions that could cause us to bump
> up against limits and cause exhaustion coupled with our real traffic
> peaks.
> 

Interesting, but with TCP it should be easier to tune this, in
particular because TCP has backoff once a packet drops.  I gather you
are using facilities, like accept filters, that make it easy to drop
less useful traffic?

> This thread has given me several things to try and I'm adjusting (e.g.,
> nmbclusters) upward to see what happens.

Sounds good.  Using netstat -m and netstat -an is a good way to watch
this issue.  -m shows the number of mbufs/clusters in use and -an will
show you all sockets, but what you want to check on is the number of
bytes in the recv and send socket buffers, which are the 2nd and 3rd
columns.
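
For watching those two columns from a script or a small monitor, here is a minimal sketch that pulls Recv-Q and Send-Q out of one line of netstat -an output. The helper name is hypothetical, and it assumes the usual layout where the protocol is the first whitespace-separated field:

```c
#include <stdio.h>

/*
 * Extract Recv-Q and Send-Q (2nd and 3rd columns) from one line of
 * `netstat -an` output.  Returns 0 on success, -1 if the line does not
 * look like a socket entry (for example, the column header line).
 * Hypothetical helper for illustration only.
 */
int
parse_queues(const char *line, unsigned long *recvq, unsigned long *sendq)
{
    char proto[16];

    if (sscanf(line, "%15s %lu %lu", proto, recvq, sendq) != 3)
        return (-1);
    return (0);
}
```

A Recv-Q value that stays near the socket's receive buffer limit suggests the application is not draining the socket fast enough.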

> I should also mention that this system has the natural limitations
> on its traffic ceiling of two T1s on two NICs and a 3rd LAN NIC
> fielding continuous round-robin mysql replication and rsync style
> mirroring.  It uses two bge interfaces and one server type em
> interface.  It's always troubled me that the zonelimit issues have
> always been associated with higher volume circuits (in what I've
> read). But since our issue is very directly related to traffic
> levels and seem to occur at times where my monitors show us way over
> committed on the two outward facing T1s, I'm still going to proceed
> with the adjustments and see if it increases our survivability.

Since zonelimit is a state reached when your system is out of
resources it makes sense that the higher the traffic the sooner you'll
reach it.  

> Thanks for your time on this.
> 

No problem, it's what I like to do :-)

Best,
George


Re: Change from BSDL to GPL

2008-05-08 Thread gnn
At Mon, 05 May 2008 06:31:25 +0800,
kevin wrote:
> 
> Hi, all
> I want to port 4.4BSD-Lite's TCP/IP source code to my own OS kernel.
> My OS kernel is GPL licenced.
> Is it possible for me to modify 4.4BSD-Lite's source code and change its 
> licence from 4.4BSD-Lite licence to GPL licence?
> 

Alas, the short answer is "Consult an IP lawyer."

Best,
George




Proposed patch to the kernel and to netstat...

2008-05-14 Thread gnn
Howdy,

I have developed the attached patch which extends the functionality of
netstat (via the -x flag) to show us all the socket buffer
statistics.  The kernel change counts mbufs, as well as clusters (at
the moment of any size) and gives output like this:

Proto Recv-Q Send-Q Local Address          Foreign Address        R-MBUF S-MBUF R-CLUS S-CLUS R-HIWA S-HIWA R-LOWA S-LOWA R-BCNT S-BCNT R-BMAX S-BMAX (state)
tcp4       0      0 127.0.0.1.6010         *.*                         0      0      0      0  65536  32768      1   2048      0      0 262144 262144 LISTEN
tcp6       0      0 ::1.6010               *.*                         0      0      0      0  65536  32768      1   2048      0      0 262144 262144 LISTEN
tcp4       0      0 172.16.186.130.22      172.16.186.1.53443          0      0      0      0  66608  33304      1   2048      0      0 262144 262144 ESTABLISHED
tcp4       0      0 172.16.186.130.29178   172.16.186.1.22             0      0      0      0      0      0      0      0      0      0      0      0 TIME_WAIT
tcp4       0      0 172.16.186.130.62302   69.147.83.41.22             0      0      0      0  65700  74540      1   2048      0      0 262144 262144 ESTABLISHED
tcp4       0      0 127.0.0.1.62415        127.0.0.1.6010              0      0      0      0      0      0      0      0      0      0      0      0 TIME_WAIT


Note you need a very wide screen to read that.

The man page is also updated but the relevant bits are:

 The -x flag causes netstat to output all the information recorded about
 data stored in the socket buffers.  The fields are:

 R-MBUF  Number of mbufs in the receive queue.
 S-MBUF  Number of mbufs in the send queue.
 R-CLUS  Number of clusters, of any type, in the receive queue.
 S-CLUS  Number of clusters, of any type, in the send queue.
 R-HIWA  Receive buffer high water mark, in bytes.
 S-HIWA  Send buffer high water mark, in bytes.
 R-LOWA  Receive buffer low water mark, in bytes.
 S-LOWA  Send buffer low water mark, in bytes.
 R-BCNT  Receive buffer byte count.
 S-BCNT  Send buffer byte count.
 R-BMAX  Maximum bytes that can be used in the receive buffer.
 S-BMAX  Maximum bytes that can be used in the send buffer.


Please email me comments.  I'd like to commit this to HEAD soon.  It
can't be put into 7 without removing the cluster and mbuf counting,
but I might do that as well if there is interest.

Best,
George



netstat.diff
Description: Binary data

Anyone seen this error on em ?

2008-06-09 Thread gnn



Jun  9 18:23:59 ... kernel: em0:  port 0x2000-0x201f mem 0xd802-0xd803,0xd800
-0xd801 irq 18 at device 0.0 on pci4
Jun  9 18:23:59 ... kernel: em0: Using MSI interrupt
Jun  9 18:23:59 ... kernel: em0: Setup of Shared code failed
Jun  9 18:23:59 ... kernel: device_attach: em0 attach returned 6

I've never seen the "returned 6" thing.  Plugging a cable into the
other em device on the motherboard, a Supermicro, currently works,
but it looks like bad hardware to me.  Thoughts?

Later,
George


Re: Weirdness - FBSD 7, Routing, Packet generator, em taskq

2008-06-27 Thread gnn
At Thu, 26 Jun 2008 23:25:18 -0400,
Paul wrote:
> 
> I have a FreeBSD router set up with Full BGP routes and I'm doing some 
> tests on  using it for routing.
> 
> 7.0-RELEASE-p1 FreeBSD 7.0-RELEASE-p1 #6: Thu Apr 17 18:11:49 EDT 2008  
> amd64
> 
> oddness..:
> 
> Use a packet generator to generate random source ips and ports and send 
> traffic through the router to a destination on the other side, single ip.
> What happens is the 'em0 taskq'  starts to eat cpu... but the funny 
> thing is immediately when I start the traffic (say, 100,000 pps) em0 
> taskq is about 15% cpu.. and then over the course of 2 minutes or so it 
> climbs to 60% cpu..  This makes no sense.. The packets per second are 
> continuous and it just routed 100kpps for 60 seconds with less cpu so 
> why in the world would it slowly climb like that? 
> 
> It's an observation I suppose and I was hoping if someone could 
> enlighten me on WHY.. :)   I did test it on 3 different machines by the way.
> It even does this with just a handful of routes in the routing table , I 
> tried that too just to rule that out.  
> I don't remember Freebsd 4/5 doing this??
> 

What are you using to measure the CPU time?  Some tools take time to
gather up enough samples.  Also, have you tried to do any profiling on
the kernel to see why this might be the case?

http://www.watson.org/~robert/freebsd/netperf/profile/

Best,
George


What's the deal with hardware checksum and net.inet.udp.checksum?

2008-07-09 Thread gnn
I would assume that if a card, say the em, has hardware TX checksum
that the UDP checksum could be calculated by the hardware, but this
seems not to be the case.  The manual pages are unhelpful in this regard.

Thanks,
George


Re: What's the deal with hardware checksum and net.inet.udp.checksum?

2008-07-10 Thread gnn
At Thu, 10 Jul 2008 11:43:23 +0100 (BST),
rwatson wrote:
> 
> On Wed, 9 Jul 2008, [EMAIL PROTECTED] wrote:
> 
> > I would assume that if a card, say the em, has hardware TX checksum that
> > the UDP checksum could be calculated by the hardware, but this seems not
> > to be the case.  The manual pages are unhelpful in this regard.
> 
> On the whole, they should be generated in hardware as long as it's
> not administratively disabled with ifconfig, and as long as there
> aren't known bugs in the hardware for the rev you're using.  Just for
> example, hardware checksumming is disabled in software for quite a
> few early 1gbps cards due to bugs in the hardware causing rather
> nasty side effects.  What specific problem are you seeing?  We do do
> a software checksum of the pseudo-header, but the UDP data should be
> checksummed by hardware.
> 
> (The usual test for hardware checksum being enabled on transmit is
> to tcpdump the interface and see tcpdump reporting lots of bad
> checksums, as the BPF capture happens before hardware checksumming
> is run -- in principle on the receive side that shouldn't happen!)
> 

If the sysctl is turned off on the transmitter then the receiving
machine sees UDP checksums of 0.
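
For reference, in UDP over IPv4 a checksum field of 0 means the sender did not compute one, and a computed value of zero is sent as 0xffff instead. A minimal userland sketch of the full calculation follows, covering the pseudo-header plus the UDP header and data. The function names are hypothetical, and this is not the kernel's in_pseudo()/in_cksum() code:

```c
#include <stddef.h>
#include <stdint.h>

/* Add `len` bytes to a running one's-complement sum, as big-endian words. */
static uint32_t
sum_bytes(uint32_t sum, const uint8_t *p, size_t len)
{
    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)p[0] << 8;    /* pad the odd byte with zero */
    return (sum);
}

/*
 * UDP checksum over IPv4.  `udp` points at the UDP header followed by
 * the data, `udp_len` is header plus data length (at least 8).  Bytes
 * 6-7 of the header (the checksum field) are treated as zero.
 * Returns the checksum in host byte order; never returns 0.
 */
uint16_t
udp4_checksum(const uint8_t src[4], const uint8_t dst[4],
    const uint8_t *udp, uint16_t udp_len)
{
    uint32_t sum = 0;
    uint16_t csum;

    /* Pseudo-header: source, destination, protocol 17, UDP length. */
    sum = sum_bytes(sum, src, 4);
    sum = sum_bytes(sum, dst, 4);
    sum += 17;
    sum += udp_len;

    /* UDP header without the checksum field, then the data. */
    sum = sum_bytes(sum, udp, 6);
    sum = sum_bytes(sum, udp + 8, (size_t)udp_len - 8);

    /* Fold the carries back in and complement. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    csum = (uint16_t)(~sum & 0xffff);

    /* A computed 0 is transmitted as all ones; 0 means "no checksum". */
    return (csum == 0 ? 0xffff : csum);
}
```

With hardware offload the stack only seeds the field with the pseudo-header sum and the card finishes the rest, which is why a capture taken before the hardware runs shows values that look wrong.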

Best,
George


Re: What's the deal with hardware checksum and net.inet.udp.checksum?

2008-07-14 Thread gnn
A, thanks,
George


igb doesn't compile in STABLE?

2008-07-14 Thread gnn
Howdy,

As of today, this afternoon, I see the following:

linking kernel.debug
e1000_api.o(.text+0xad9): In function `e1000_setup_init_funcs':
../../../dev/em/e1000_api.c:343: undefined reference to 
`e1000_init_function_pointers_80003es2lan'
e1000_api.o(.text+0xae8):../../../dev/em/e1000_api.c:340: undefined reference 
to `e1000_init_function_pointers_82571'
e1000_api.o(.text+0xafa):../../../dev/em/e1000_api.c:334: undefined reference 
to `e1000_init_function_pointers_82541'
e1000_api.o(.text+0xb0c):../../../dev/em/e1000_api.c:328: undefined reference 
to `e1000_init_function_pointers_82540'
e1000_api.o(.text+0xb1e):../../../dev/em/e1000_api.c:321: undefined reference 
to `e1000_init_function_pointers_82543'
e1000_api.o(.text+0xb30):../../../dev/em/e1000_api.c:316: undefined reference 
to `e1000_init_function_pointers_82542'
e1000_ich8lan.o(.text+0x98c): In function `e1000_valid_nvm_bank_detect_ich8lan':
../../../dev/em/e1000_ich8lan.c:1032: undefined reference to 
`e1000_translate_register_82542'
e1000_ich8lan.o(.text+0xc32): In function `e1000_acquire_swflag_ich8lan':
../../../dev/em/e1000_ich8lan.c:424: undefined reference to 
`e1000_translate_register_82542'
e1000_ich8lan.o(.text+0xc6e):../../../dev/em/e1000_ich8lan.c:426: undefined 
reference to `e1000_translate_register_82542'
e1000_ich8lan.o(.text+0xc9d):../../../dev/em/e1000_ich8lan.c:422: undefined 
reference to `e1000_translate_register_82542'
e1000_ich8lan.o(.text+0xced):../../../dev/em/e1000_ich8lan.c:436: undefined 
reference to `e1000_translate_register_82542'
e1000_ich8lan.o(.text+0x16bf):../../../dev/em/e1000_ich8lan.c:2700: more 
undefined references to `e1000_translate_register_82542' follow
*** Error code 1


Thoughts?

Later,
George


Re: igb doesn't compile in STABLE?

2008-07-15 Thread gnn
At Mon, 14 Jul 2008 14:53:16 -0700,
Jack Vogel wrote:
> 
> Just guessing, did someone change conf/files maybe??
> 

If you build a STABLE kernel with igb AND em then things work and the
kernel uses em.

I'm not sure which thing needs to be changed in conf/files or
otherwise though.

Later,
George


Re: igb doesn't compile in STABLE?

2008-07-15 Thread gnn
At Tue, 15 Jul 2008 10:07:22 -0700,
Jack Vogel wrote:
> 
> Oh, so the problem is if igb alone is defined?
> 

Yes.

Best,
George


Re: igb doesn't compile in STABLE?

2008-07-16 Thread gnn
At Tue, 15 Jul 2008 10:35:57 -0700,
Jack Vogel wrote:
> 
> OK, will put on my todo list :)
> 

Thanks.  A kernel built that way (i.e. with igb and em) does actually
work, which is good, but if you're going to split them up we should
get this right before 7.1.

Best,
George


Re: moving sockbuf in to its own header

2008-07-22 Thread gnn
At Sun, 20 Jul 2008 16:07:29 -0700,
Kip Macy wrote:
> 
> Actually, I'd like to re-factor multiple parts of socketvar in to
> separate files.
> 
> Please provide feedback on the following:
> 
> http://www.fsmware.com/socketvar_refactor.diff
> 

Looks good to me.

Best,
George


Re: HEADS UP: E1000 networking changes in STABLE/7.1 RELEASE

2008-08-14 Thread gnn
Hi Jack,

Thanks for this and for the concise pciconf line.  We use em (soon to
be igb) interfaces extensively at work.

Best,
George


Small patch to multicast code...

2008-08-21 Thread gnn
Hi,

Turns out there is a bug in the code that loops back multicast
packets.  If the underlying device driver supports checksum offloading
then the packet that is looped back, when it is transmitted on the
wire, is incorrect, due to the fact that the packet is not fully
copied.

Here is a patch.  Comments welcome.

Best,
George

Index: ip_output.c
===
--- ip_output.c (revision 181731)
+++ ip_output.c (working copy)
@@ -1135,7 +1135,7 @@
register struct ip *ip;
struct mbuf *copym;
 
-   copym = m_copy(m, 0, M_COPYALL);
+   copym = m_dup(m, M_DONTWAIT);
if (copym != NULL && (copym->m_flags & M_EXT || copym->m_len < hlen))
copym = m_pullup(copym, hlen);
if (copym != NULL) {


Re: Small patch to multicast code...

2008-08-21 Thread gnn
At Thu, 21 Aug 2008 22:35:19 +0200,
Luigi Rizzo wrote:
> 
> On Thu, Aug 21, 2008 at 03:11:56PM -0400, [EMAIL PROTECTED] wrote:
> > Hi,
> > 
> > Turns out there is a bug in the code that loops back multicast
> > packets.  If the underlying device driver supports checksum offloading
> > then the packet that is looped back, when it is transmitted on the
> > wire, is incorrect, due to the fact that the packet is not fully
> > copied.
> > 
> > Here is a patch.  Comments welcome.
> > 
> > Best,
> > George
> > 
> > Index: ip_output.c
> > ===
> > --- ip_output.c (revision 181731)
> > +++ ip_output.c (working copy)
> > @@ -1135,7 +1135,7 @@
> > register struct ip *ip;
> > struct mbuf *copym;
> >  
> > -   copym = m_copy(m, 0, M_COPYALL);
> > +   copym = m_dup(m, M_DONTWAIT);
> > if (copym != NULL && (copym->m_flags & M_EXT || copym->m_len < hlen))
> > copym = m_pullup(copym, hlen);
> > if (copym != NULL) {
> 
> I am slightly puzzled -- what is exactly the problem, i.e. what part
> of the packet on the wire is incorrect ? The IP header is within hlen so
> the m_pullup() should be enough to leave the original content intact.
> 
> The only thing i can think of is that it's the UDP checksum,
> residing beyond hlen, which is overwritten somewhere in the
> call to if_simloop -- in which case perhaps a better fix is
> to m_pullup() the udp header as well ?

It is the checksum that gets trashed, yes.

> (in any case, it is worthwhile to add a comment to explain
> what should be done -- the code paths using m_*() have become
> quite fragile with these hw support enhancements that now
> require selective modifications on previously shared, readonly buffers).

The m_*() routines actually have reasonable comments, it just seems
the wrong one was used here.

Best,
George


Re: Small patch to multicast code...

2008-08-22 Thread gnn
At Fri, 22 Aug 2008 03:27:11 +0100,
Bruce M. Simpson wrote:
> 
> [EMAIL PROTECTED] wrote:
> >> The only thing i can think of is that it's the UDP checksum,
> >> residing beyond hlen, which is overwritten somewhere in the
> >> call to if_simloop -- in which case perhaps a better fix is
> >> to m_pullup() the udp header as well ?
> >> 
> >
> > It is the checksum that gets trashed, yes.
> > ...
> > The m_*() routines actually have reasonable comments, it just seems
> > the wrong one was used here.
> >   
> 
> Actually, m_copy() has been legacy for some time now -- see comments.
> 
> I'd be concerned that the change to m_dup() (which makes a full mbuf 
> chain copy) rather than m_copym() (which bumps refcounts) is going to 
> eat into the mbuf clusters on fast links, though it's an easy band-aid 
> for the problem.

I gather you mean that a fast link on which we're also looping back
the packet will be an issue?  I ask since this packet is only going
into the simloop() routine.

> I agree with Luigi that some of the API contract for mbuf(9) doesn't 
> hold any more now that we have TSO and other offload.

I was actually hoping, as the person who last hacked this code, that
you might have a suggestion as to a "right" fix.  

Best,
George


Re: Small patch to multicast code...

2008-08-22 Thread gnn
At Fri, 22 Aug 2008 21:42:00 +0200,
Luigi Rizzo wrote:
> 
> On Fri, Aug 22, 2008 at 07:43:03PM +0100, Bruce M. Simpson wrote:
> > [EMAIL PROTECTED] wrote:
> > >I gather you mean that a fast link on which also we're looping back
> > >the packet will be an issue?  Since this packet is only going into the
> > >simloop() routine.
> > >  
> > 
> > We end up calling if_simloop() from a few "interesting" places, in 
> > particular the kernel PIM packet handler.
> > 
> > In this particular case we're going to take a full mbuf chain copy every 
> > time we send a packet which needs to be looped back to userland.
> ...
> > In the case of ip_mloopback(), somehow we are stomping on a read-only 
> > copy of an mbuf chain. The use of m_copy() with m_pullup() there is fine 
> > according to the documented uses of mbuf(9), although as Luigi pointed 
> > out, most likely we need to look at the upper-layer protocol too, e.g. 
> > where UDP checksums are also being offloaded.
> 
> in fact, george, if you have an easy way to reproduce the error,
> could you see if reverting your change and instead adding
> sizeof(struct udphdr) to the length argument in the call to m_pullup()
> fixes the problem ?

I don't have sample code I can give but it's simple to set up and
test.

On machine A set up a sender and a listener for the same multicast
group/port.

On machine B set up a listener.

Send from A with the listener on.  B should see nothing and its "bad
checksums" counter should increase.

Turn off listener on A.

Send again, B should get the packet.

If you listen to the traffic with tcpdump on a 3rd machine you'll see
that the checksum is constant, even if the data in the packet, like
the ports, is not.

Your ethernet cards have to have hardware checksum offloading.  I'm
using em/igb in 7-STABLE.

Best,
George


Re: Small patch to multicast code...

2008-08-22 Thread gnn
At Fri, 22 Aug 2008 19:43:03 +0100,
Bruce M. Simpson wrote:
> 
> We end up calling if_simloop() from a few "interesting" places, in 
> particular the kernel PIM packet handler.
> 
> In this particular case we're going to take a full mbuf chain copy every 
> time we send a packet which needs to be looped back to userland.

Right, I know the penalty.

> It's been a while since I've done any in-depth FreeBSD work other
> than hacking on the IGMPv3 snap, and my time is largely tied up with
> other work these days, sadly.
> 
> It doesn't seem right to my mind that we need to make a full copy of
> an mbuf chain with m_dup() to workaround this kind of problem.
> 
> Whilst it may suffice for a band-aid workaround, we may see mbuf
> pool fragmentation as packet rates go up.
> 
> However we are now in a "new world order" where mbuf chains may be
> very tied to the device where they've originated or to where they're
> going.  It isn't clear to me where this kind of intrusion is
> happening.
> 
> In the case of ip_mloopback(), somehow we are stomping on a
> read-only copy of an mbuf chain. The use of m_copy() with m_pullup()
> there is fine according to the documented uses of mbuf(9), although
> as Luigi pointed out, most likely we need to look at the upper-layer
> protocol too, e.g.  where UDP checksums are also being offloaded.
> 
> Some of the code in the IGMPv3 branch actually reworks how loopback
> happens i.e. the preference is not to loop back wherever possible
> because of the locking implications. Check the bms_netdev branch
> history for more info.


Well, what I suspect is the problem are these bits:

udp_output():

    /*
     * Set up checksum and output datagram.
     */
    if (udp_cksum) {
        if (inp->inp_flags & INP_ONESBCAST)
            faddr.s_addr = INADDR_BROADCAST;
        ui->ui_sum = in_pseudo(ui->ui_src.s_addr, faddr.s_addr,
            htons((u_short)len + sizeof(struct udphdr) + IPPROTO_UDP));
        m->m_pkthdr.csum_flags = CSUM_UDP;
        m->m_pkthdr.csum_data = offsetof(struct udphdr, uh_sum);
    } else

ip_mloopback():


    copym = m_copy(m, 0, M_COPYALL);
    if (copym != NULL && (copym->m_flags & M_EXT || copym->m_len < hlen))
        copym = m_pullup(copym, hlen);
    if (copym != NULL) {
        /* If needed, compute the checksum and mark it as valid. */
        if (copym->m_pkthdr.csum_flags & CSUM_DELAY_DATA) {
            in_delayed_cksum(copym);
            copym->m_pkthdr.csum_flags &= ~CSUM_DELAY_DATA;
            copym->m_pkthdr.csum_flags |=
                CSUM_DATA_VALID | CSUM_PSEUDO_HDR;
            copym->m_pkthdr.csum_data = 0xffff;
        }

and:

in_delayed_cksum(struct mbuf *m)
{
    struct ip *ip;
    u_short csum, offset;

    ip = mtod(m, struct ip *);
    offset = ip->ip_hl << 2;
    csum = in_cksum_skip(m, ip->ip_len, offset);
    if (m->m_pkthdr.csum_flags & CSUM_UDP && csum == 0)
        csum = 0xffff;
    offset += m->m_pkthdr.csum_data;    /* checksum offset */


Somehow the data that the device needs to do the proper checksum
offload is getting trashed here.  Now, since it's clear we need a
writable packet structure so that we don't trash the original, I'm
wondering if the m_pullup() will be sufficient.

Best,
George


Re: Small patch to multicast code...

2008-08-22 Thread gnn
At Fri, 22 Aug 2008 22:43:39 +0100,
Bruce M. Simpson wrote:
> 
> [EMAIL PROTECTED] wrote:
> > Somehow the data that the device needs to do the proper checksum
> > offload is getting trashed here.  Now, since it's clear we need a
> > writable packet structure so that we don't trash the original, I'm
> > wondering if the m_pullup() will be sufficient.
> >   
> 
> If it's serious enough to break UDP checksumming on the wire, perhaps we 
> should just swallow the mbuf allocator heap churn and do the m_dup() for 
> now, but slap in a big comment about why it's there.

I think if none of us finds a better way before early next week that's
what I'll do so that this at least works in 7.1.

Best,
George



Re: Small patch to multicast code...

2008-08-26 Thread gnn
At Tue, 26 Aug 2008 14:50:33 +0000 (UTC),
Bjoern A. Zeeb wrote:
> 
> On Tue, 26 Aug 2008, George V. Neville-Neil wrote:
> 
> Hi,
> 
> > At Mon, 25 Aug 2008 21:40:38 +0200,
> > John Hay wrote:
> >>
> >> I have tried it and it does fix my problem. RIP2 over multicast works
> >> again. :-)
> >
> > Good to hear.  I'm waiting on a bit more feedback but I think I'll be
> > checking this in soon, with a big comment talking about the
> > performance implications etc.
> 
> So wait a second; what was the m_pullup vs. m_dup thing? Has anyone
> actually tried that? I mean using a sledgehammer if a mitten would be
> enough is kind of .. uhm. You get it.

Perhaps I'm confused, I've been off dealing with other issues for a
few days, but m_pullup doesn't make a copy of the packet or its
fields, only makes sure that it's contiguous in memory.  Am I wrong in that?

Since the bug is that two pieces of code modify the same data, in ways
that interfere, I'm not sure how we can avoid making a copy.  It might
be nice to limit the copy, but we'd still need two copies, one for the
loopback device and one for the real device.
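
The difference between a reference-counted copy and a deep copy can be shown with a toy userland model. This is only a sketch: struct buf and its helpers are hypothetical stand-ins, not the mbuf(9) API, with buf_share() playing the role of a copy that shares external storage and buf_dup() the role of m_dup().

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an mbuf with external, reference-counted storage. */
struct buf {
    unsigned char *data;    /* payload, possibly shared */
    size_t         len;     /* payload length, assumed > 0 */
    int           *refcnt;  /* number of bufs pointing at data */
};

/* Allocate a buf holding a private copy of `len` bytes from `src`. */
struct buf *
buf_new(const void *src, size_t len)
{
    struct buf *b = malloc(sizeof(*b));

    if (b == NULL)
        return (NULL);
    b->data = malloc(len);
    b->refcnt = malloc(sizeof(*b->refcnt));
    if (b->data == NULL || b->refcnt == NULL) {
        free(b->data);
        free(b->refcnt);
        free(b);
        return (NULL);
    }
    memcpy(b->data, src, len);
    b->len = len;
    *b->refcnt = 1;
    return (b);
}

/* Shallow copy: new descriptor, same storage, refcount bumped. */
struct buf *
buf_share(struct buf *b)
{
    struct buf *c = malloc(sizeof(*c));

    if (c == NULL)
        return (NULL);
    *c = *b;
    (*c->refcnt)++;
    return (c);
}

/* Deep copy: new descriptor and new, writable storage. */
struct buf *
buf_dup(const struct buf *b)
{
    return (buf_new(b->data, b->len));
}

/* Drop one reference; free the storage with the last one. */
void
buf_free(struct buf *b)
{
    if (--(*b->refcnt) == 0) {
        free(b->data);
        free(b->refcnt);
    }
    free(b);
}
```

Writing a checksum through a buf_share() copy changes what the original holder sees, which is the kind of interference described above; a buf_dup() copy avoids it at the cost of a second allocation per packet.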

Best,
George


Re: Small patch to multicast code...

2008-08-26 Thread gnn
At Tue, 26 Aug 2008 17:56:13 -0700,
Sam Leffler wrote:
> 
> [EMAIL PROTECTED] wrote:
> > At Tue, 26 Aug 2008 14:50:33 +0000 (UTC),
> > Bjoern A. Zeeb wrote:
> >   
> >> On Tue, 26 Aug 2008, George V. Neville-Neil wrote:
> >>
> >> Hi,
> >>
> >> 
> >>> At Mon, 25 Aug 2008 21:40:38 +0200,
> >>> John Hay wrote:
> >>>   
> >>>> I have tried it and it does fix my problem. RIP2 over multicast works
> >>>> again. :-)
> >>>>
> >>> Good to hear.  I'm waiting on a bit more feedback but I think I'll be
> >>> checking this in soon, with a big comment talking about the
> >>> performance implications etc.
> >>>   
> >> So wait a second; what was the m_pullup vs. m_dup thing? Has anyone
> >> actually tried that? I mean using a sledgehammer if a mitten would be
> >> enough is kind of .. uhm. You get it.
> >> 
> >
> > Perhaps I'm confused, I've been off dealing with other issues for a
> > few days, but m_pullup doesn't make a copy of the packet or its
> > fields, only makes sure that it's contiguous in memory.  Am I wrong in that?
> >
> > Since the bug is that two pieces of code modify the same data, in ways
> > that interfere, I'm not sure how we can avoid making a copy.  It might
> > be nice to limit the copy, but we'd still need two copies, one for the
> > loopback device and one for the real device.
> >
> >   
> pull the headers up.  copy just the headers.  no deep copy.
> 

I'm confused, if it's these lines that are screwed up:

    /* If needed, compute the checksum and mark it as valid. */
    if (copym->m_pkthdr.csum_flags & CSUM_DELAY_DATA) {
        in_delayed_cksum(copym);
        copym->m_pkthdr.csum_flags &= ~CSUM_DELAY_DATA;
        copym->m_pkthdr.csum_flags |=
            CSUM_DATA_VALID | CSUM_PSEUDO_HDR;
        copym->m_pkthdr.csum_data = 0xffff;

in particular that last line, then how does pulling up the header
help?  That's not part of the packet, that's the checksum data in the
pkthdr itself.

Best,
George
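
Note: for readers who have not looked at what a delayed checksum step has to compute, here is a minimal user-space sketch of the standard Internet checksum (RFC 1071). It is an illustration only and is not the kernel's in_delayed_cksum(); the function name `internet_checksum` is an assumption made for this example.

```python
def internet_checksum(data):
    """One's-complement sum of 16-bit big-endian words, then complemented."""
    if len(data) % 2:
        data += b"\x00"          # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:           # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Worked example from RFC 1071: words 0001 f203 f4f5 f6f7
example = internet_checksum(bytes.fromhex("0001f203f4f5f6f7"))
```

The result is written into the packet data itself, whereas the csum_flags and csum_data fields quoted above live in the packet header metadata, which is the distinction the message is asking about.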



Re: Small patch to multicast code...

2008-08-29 Thread gnn
At Fri, 29 Aug 2008 18:28:53 +0200,
Luigi Rizzo wrote:
> 
> and to be more explicit - the result of m_pullup is that
> the number of bytes specified as m_pullup argument are in
> a private piece of memory -- the 'data' region within the mbuf -- so
> you can freely play with them without trouble.
> 
> That is why i suggested to just increase the argument to m_pullup
> by the size of the udp header so one can overwrite the checksum
> within the mbuf without touching the shared part in the cluster
> (if any).

I tried various versions of that, but then I noticed that I also had
to save out the pkthdr structure as well.  Did you come up with a
faster workable patch?

For now I'm going to commit the patch I sent originally.

Best,
George


Proposed patch, convert IFQ_MAXLEN to kernel tunable...

2008-09-23 Thread gnn
Hi,

It turns out that the last time anyone looked at this constant was
before 1994 and it's very likely time to turn it into a kernel
tunable.  On hosts that have a high rate of packet transmission
packets can be dropped at the interface queue because this value is
too small.  Rather than make a sweeping code change I propose the
following change to the macro and updating a couple of places in the
IP and IPv6 stacks that were using this macro to set their own global
variables.

I have tested this in my test lab at work, it is not as yet in
production at my day job, but will be soon.

Best,
George


Index: netinet/ip_input.c
===
--- netinet/ip_input.c  (revision 183299)
+++ netinet/ip_input.c  (working copy)
@@ -133,7 +133,6 @@
 struct pfil_head inet_pfil_hook;   /* Packet filter hooks */
 
 static struct  ifqueue ipintrq;
-static int ipqmaxlen = IFQ_MAXLEN;
 
 extern struct domain inetdomain;
 extern struct protosw inetsw[];
@@ -265,7 +264,7 @@
 
/* Initialize various other remaining things. */
ip_id = time_second & 0xffff;
-   ipintrq.ifq_maxlen = ipqmaxlen;
+   ipintrq.ifq_maxlen = IFQ_MAXLEN;
mtx_init(&ipintrq.ifq_mtx, "ip_inq", NULL, MTX_DEF);
netisr_register(NETISR_IP, ip_input, &ipintrq, NETISR_MPSAFE);
 }
Index: net/if.c
===
--- net/if.c(revision 183299)
+++ net/if.c(working copy)
@@ -135,7 +135,14 @@
 #endif
 
 int if_index = 0;
-int ifqmaxlen = IFQ_MAXLEN;
+
+int ifqmaxlen = 50;
+TUNABLE_INT("net.ifqmaxlen", &ifqmaxlen);
+
+SYSCTL_INT(_net, OID_AUTO, ifqmaxlen, CTLFLAG_RD,
+  &ifqmaxlen, 0,
+  "interface queue length");
+
 struct ifnethead ifnet;/* depend on static init XXX */
 struct ifgrouphead ifg_head;
 struct mtx ifnet_lock;
Index: net/if.h
===
--- net/if.h(revision 183299)
+++ net/if.h(working copy)
@@ -221,7 +221,7 @@
 #define IFCAP_WOL   (IFCAP_WOL_UCAST | IFCAP_WOL_MCAST | IFCAP_WOL_MAGIC)
 #define IFCAP_TOE   (IFCAP_TOE4 | IFCAP_TOE6)
 
-#define IFQ_MAXLEN  50
+#define IFQ_MAXLEN  ifqmaxlen
 #define IFNET_SLOWHZ    1   /* granularity is 1 second */
 
 /*
Index: netinet6/ip6_input.c
===
--- netinet6/ip6_input.c(revision 183299)
+++ netinet6/ip6_input.c(working copy)
@@ -115,7 +115,6 @@
 
 u_char ip6_protox[IPPROTO_MAX];
 static struct ifqueue ip6intrq;
-static int ip6qmaxlen = IFQ_MAXLEN;
 struct in6_ifaddr *in6_ifaddr;
 
 extern struct callout in6_tmpaddrtimer_ch;
@@ -178,7 +177,7 @@
printf("%s: WARNING: unable to register pfil hook, "
"error %d\n", __func__, i);
 
-   ip6intrq.ifq_maxlen = ip6qmaxlen;
+   ip6intrq.ifq_maxlen = IFQ_MAXLEN;
mtx_init(&ip6intrq.ifq_mtx, "ip6_inq", NULL, MTX_DEF);
netisr_register(NETISR_IPV6, ip6_input, &ip6intrq, 0);
scope6_init();


Re: Proposed patch, convert IFQ_MAXLEN to kernel tunable...

2008-09-23 Thread gnn
At Wed, 24 Sep 2008 00:17:18 +0400,
Ruslan Ermilov wrote:
> 
> Hi,
> 
> On Tue, Sep 23, 2008 at 03:29:06PM -0400, [EMAIL PROTECTED] wrote:
> > It turns out that the last time anyone looked at this constant was
> > before 1994 and it's very likely time to turn it into a kernel
> > tunable.  On hosts that have a high rate of packet transmission
> > packets can be dropped at the interface queue because this value is
> > too small.  Rather than make a sweeping code change I propose the
> > following change to the macro and updating a couple of places in the
> > IP and IPv6 stacks that were using this macro to set their own global
> > variables.
> > 
> > I have tested this in my test lab at work, it is not as yet in
> > production at my day job, but will be soon.
> > 
> It's not that bad -- most modern Ethernet drivers initialize interface
> input queues themselves, and don't depend on IFQ_MAXLEN.  The IPv4
> input queue is tunable via net.inet.ip.intr_queue_maxlen.  The IPv6
> queue can similarly be made tunable.  I agree that ifqmaxlen can be
> made tunable because there's still a lot of (mostly for old hardware)
> drivers that use ifqmaxlen and IFQ_MAXLEN, but I'm against changing
> the definition of IFQ_MAXLEN.  Imagine some code like this:
> 

Sorry, this is about the output queue, not the input queue.

Though there are both input and output queues that depend on this.

> void *x[IFQ_MAXLEN];  // here it's 50
> 
> And some function that does:
> 
> for (i = 0; i < IFQ_MAXLEN; i++) {// not necessarily 50
>   x[i] = NULL;
> }
> 

I found no occurrences of the above in our code base.  I used cscope
to search all of src/sys.  Are you aware of any occurrences of this?

Best,
George
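
Note: the hazard Ruslan describes is a table sized once with the old constant and later walked with a bound that has become a run-time value. A minimal Python model of that mismatch follows; `BUILD_TIME_MAXLEN`, `slots` and `clear_slots` are hypothetical names, and the out-of-range counter stands in for what would be a write past the end of the array in C.

```python
BUILD_TIME_MAXLEN = 50                      # plays the old "#define IFQ_MAXLEN 50"
slots = [object()] * BUILD_TIME_MAXLEN      # table sized once, with the old value


def clear_slots(table, limit):
    """Clear `limit` entries; return how many indexes fell outside the table."""
    out_of_range = 0
    for i in range(limit):
        if i < len(table):
            table[i] = None
        else:
            out_of_range += 1   # in C this would be a write past the array
    return out_of_range


same_bound = clear_slots(slots, 50)     # bound still matches the table size
tuned_bound = clear_slots(slots, 128)   # bound raised through a tunable
```

With the bound left at 50 nothing falls outside the table; with the bound tuned to 128 there are 78 out-of-range indexes, so the concern only bites where code sizes storage with the macro, which is what the cscope search was checking for.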


Re: Proposed patch, convert IFQ_MAXLEN to kernel tunable...

2008-09-24 Thread gnn
At Wed, 24 Sep 2008 15:50:32 +0100,
Bruce M. Simpson wrote:
> 
> Hi,
> 
> I agree with the intent of the change that IPv4 and IPv6 input queues 
> should have a tunable queue length. However, the change provided is 
> going to make the definition of IFQ_MAXLEN global and dependent upon a 
> variable.
> 
> [EMAIL PROTECTED] wrote:
> > Hi,
> >
> > It turns out that the last time anyone looked at this constant was
> > before 1994 and it's very likely time to turn it into a kernel
> > tunable.  On hosts that have a high rate of packet transmission
> > packets can be dropped at the interface queue because this value is
> > too small.  Rather than make a sweeping code change I propose the
> > following change to the macro and updating a couple of places in the
> > IP and IPv6 stacks that were using this macro to set their own global
> > variables.
> >   
> 
> This isn't appropriate for many uses of ifq's which might be internal to 
> a given driver or subsystem, and which may use IFQ_MAXLEN for 
> convenience, as Ruslan has pointed out. I have code elsewhere which does 
> this.
> 
> Can you please do this on a per-protocol stack basis? i.e. give IPv4 and 
> IPv6 their own TUNABLE queue length.
> 

Actually what we'd need is N of these, since my target is actually the
send queue, not the input queue.  Let me look at this some more.

Best,
George


Re: Proposed patch, convert IFQ_MAXLEN to kernel tunable...

2008-09-24 Thread gnn
At Wed, 24 Sep 2008 12:53:31 -0700,
John-Mark Gurney wrote:
> 
> George V. Neville-Neil wrote this message on Tue, Sep 23, 2008 at 15:29 -0400:
> > It turns out that the last time anyone looked at this constant was
> > before 1994 and it's very likely time to turn it into a kernel
> > tunable.  On hosts that have a high rate of packet transmission
> > packets can be dropped at the interface queue because this value is
> > too small.  Rather than make a sweeping code change I propose the
> > following change to the macro and updating a couple of places in the
> > IP and IPv6 stacks that were using this macro to set their own global
> > variables.
> 
> The better solution is to resurrect rwatson's patch that eliminates the
> interface queue, and does direct dispatch to the ethernet driver..
> Usually the driver has a queue of 512 or more packets already, so putting
> them into a second queue doesn't provide much benefit besides increasing
> the amount of locking necessary to deliver packets...

Actually I am making this change because I found on 10G hardware the
queue is too small.  Also, there are many systems where you might want
to up this, usually ones that are highly biased towards transmit only,
like a multicast repeater of some sort.

Best,
George
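
Note: the effect of a short send queue on a transmit-heavy host can be sketched with a drop-tail queue. This is a user-space model under simplifying assumptions (the whole burst arrives before the driver drains anything); `transmit_burst` is a hypothetical helper, and the burst and queue sizes are illustrative, not measurements.

```python
from collections import deque


def transmit_burst(burst, maxlen):
    """Enqueue `burst` packets into a drop-tail queue; return (queued, dropped)."""
    queue = deque()
    dropped = 0
    for _ in range(burst):
        if len(queue) >= maxlen:
            dropped += 1            # queue full: the packet is discarded
        else:
            queue.append(object())
    return len(queue), dropped


small_queue = transmit_burst(200, 50)    # queue limited to the old default of 50
large_queue = transmit_burst(200, 512)   # queue raised through a tunable
```

In this model a burst of 200 packets loses 150 of them with a limit of 50 and none with a limit of 512.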


Re: if_bridge.ko requires INET6...

2006-02-05 Thread gnn
At Sat, 4 Feb 2006 16:16:49 +0100,
Max Laier wrote:
> Here it is.  I'd appreciate feedback.  pflog_packet() uses a lot of complex 
> types which makes it necessary to include pfvar.h.  This is ugly, but I don't 
> know how to work around this.

I gave this a quick read and it looked OK to me.

Later,
George


Re: Zeroing wrong union member in in6_control()

2006-02-07 Thread gnn
At Tue, 07 Feb 2006 11:38:09 -0500,
James Juran wrote:
> 
> [1  ]
> In what looks like a copy&paste remnant from the preceding case, the
> wrong union member is used as the first argument to bzero in
> in6_control().  This doesn't cause an actual bug, but making this change
> would improve code clarity and robustness to change and also avoids a
> warning from a certain static analysis tool.
> 
> I'm not a regular FreeBSD contributor, so if this patch is worthwhile
> can someone please apply it?  If I should send things like this to a
> different mailing list in the future, please let me know.

The Kame list is still the best one for this (kame
<[EMAIL PROTECTED]>) but I've forwarded it for you.

I'll take care of getting this into FreeBSD though.

Thanks,
George


Separating the kernel and user land versions of PF_KEY.

2006-02-11 Thread gnn
Hi Folks,

The attached patch makes it so that the user land and kernel land
versions of the pf_key structures are different and therefore no
longer dependent.  This is one step in moving us away from the place
we're in now where changes to one side require changes in the
other. At some point soon a more full overhaul of the code will take
place, likely along the lines of the code found in OpenBSD (look at
net/pfkey*.[ch] there).  Please send feedback etc. on this patch along
to me.

BTW Although this patch contains a lot of p4 cruft it applies cleanly
against HEAD, at least as of a few days ago and passes the CT test
suite ipsec4 which is available by installing the ct port.

Thanks,
George

Change 89716 by [EMAIL PROTECTED] on 2006/01/15 02:41:52

First cut at removing PF_KEY data structures from the keydb.  This code
does not work completely yet but needs to be saved.

Affected files ...

... //depot/projects/gnn_fast_ipsec/src/sys/netipsec/ipsec.c#2 edit
... //depot/projects/gnn_fast_ipsec/src/sys/netipsec/key.c#2 edit
... //depot/projects/gnn_fast_ipsec/src/sys/netipsec/key_var.h#2 edit
... //depot/projects/gnn_fast_ipsec/src/sys/netipsec/keydb.h#2 edit
... //depot/projects/gnn_fast_ipsec/src/sys/netipsec/xform_ah.c#2 edit
... //depot/projects/gnn_fast_ipsec/src/sys/netipsec/xform_esp.c#2 edit
... //depot/projects/gnn_fast_ipsec/src/sys/netipsec/xform_tcp.c#2 edit

Differences ...

 //depot/projects/gnn_fast_ipsec/src/sys/netipsec/ipsec.c#2 (text+ko) 
Index: sys/netipsec/ipsec.c
--- sys/netipsec/ipsec.c.~1~Sun Feb  5 15:06:16 2006
+++ sys/netipsec/ipsec.cSun Feb  5 15:06:16 2006
@@ -92,6 +92,7 @@
 
 #include 
 
+#define IPSEC_DEBUG
 #ifdef IPSEC_DEBUG
 int ipsec_debug = 1;
 #else

 //depot/projects/gnn_fast_ipsec/src/sys/netipsec/key.c#2 (text+ko) 
Index: sys/netipsec/key.c
--- sys/netipsec/key.c.~1~  Sun Feb  5 15:06:16 2006
+++ sys/netipsec/key.c  Sun Feb  5 15:06:16 2006
@@ -420,7 +420,10 @@
 static struct mbuf *key_setsadbxsa2 __P((u_int8_t, u_int32_t, u_int32_t));
 static struct mbuf *key_setsadbxpolicy __P((u_int16_t, u_int8_t,
u_int32_t));
-static void *key_dup(const void *, u_int, struct malloc_type *);
+static struct seckey *key_dup_keymsg(const struct sadb_key *, u_int, 
+struct malloc_type *);
+static struct seclifetime *key_dup_lifemsg(const struct sadb_lifetime *src,
+   struct malloc_type *type);
 #ifdef INET6
 static int key_ismyaddr6 __P((struct sockaddr_in6 *));
 #endif
@@ -488,6 +491,10 @@
 static int key_senderror __P((struct socket *, struct mbuf *, int));
 static int key_validate_ext __P((const struct sadb_ext *, int));
 static int key_align __P((struct mbuf *, struct sadb_msghdr *));
+static struct mbuf *key_setlifetime(struct seclifetime *src, 
+u_int16_t exttype);
+static struct mbuf *key_setkey(struct seckey *src, u_int16_t exttype);
+
 #if 0
 static const char *key_getfqdn __P((void));
 static const char *key_getuserfqdn __P((void));
@@ -909,8 +916,8 @@
 
/* What the best method is to compare ? */
if (key_preferred_oldsa) {
-   if (candidate->lft_c->sadb_lifetime_addtime >
-   sav->lft_c->sadb_lifetime_addtime) {
+   if (candidate->lft_c->addtime >
+   sav->lft_c->addtime) {
candidate = sav;
}
continue;
@@ -918,8 +925,8 @@
}
 
/* preferred new sa rather than old sa */
-   if (candidate->lft_c->sadb_lifetime_addtime <
-   sav->lft_c->sadb_lifetime_addtime) {
+   if (candidate->lft_c->addtime <
+   sav->lft_c->addtime) {
d = candidate;
candidate = sav;
} else
@@ -930,7 +937,7 @@
 * suitable candidate and the lifetime of the SA is not
 * permanent.
 */
-   if (d->lft_c->sadb_lifetime_addtime != 0) {
+   if (d->lft_c->addtime != 0) {
struct mbuf *m, *result;
u_int8_t satype;
 
@@ -2787,9 +2794,9 @@
} else {
KASSERT(sav->iv == NULL, ("iv but no xform"));
if (sav->key_auth != NULL)
-   bzero(_KEYBUF(sav->key_auth), _KEYLEN(sav->key_auth));
+   bzero(sav->key_auth->key_data, _KEYLEN(sav->key_auth));
if (sav->key_enc != NULL)
-   bzero(_KEYBUF(sav->key_enc), _KEYLEN(sav->key_enc));
+   bzero(sav->key_enc->key_data, _KEYLEN(sav->key_enc));
}
if (sav->key_auth != NULL) {
free(sav->key_auth, M_IPSEC_MISC);
@@ -3038,9 +3045,11 @@
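
Note: the idea behind the patch, keeping a kernel-private lifetime record and converting to and from the PF_KEY wire structure only at the boundary, can be sketched in user space. The wire layout below follows struct sadb_lifetime from RFC 2367. The `SecLifetime` field set is an assumption (the diff above only shows `addtime`), and `dup_lifemsg` and `set_lifetime` merely echo the prototype names key_dup_lifemsg() and key_setlifetime(); their bodies here are illustrative, not the patch's code.

```python
import struct
from dataclasses import dataclass

# struct sadb_lifetime (RFC 2367): len and exttype (16 bits each),
# allocations (32 bits), then bytes, addtime, usetime (64 bits each).
SADB_LIFETIME = struct.Struct("=HHIQQQ")
SADB_EXT_LIFETIME_CURRENT = 2


@dataclass
class SecLifetime:
    """Kernel-private form: no PF_KEY length/exttype header (assumed fields)."""
    allocations: int
    nbytes: int
    addtime: int
    usetime: int


def dup_lifemsg(wire):
    """Convert a wire-format lifetime extension into the private record."""
    _len, _exttype, alloc, nbytes, addtime, usetime = SADB_LIFETIME.unpack(wire)
    return SecLifetime(alloc, nbytes, addtime, usetime)


def set_lifetime(src, exttype):
    """Build a wire-format extension from the private record."""
    length_in_words = SADB_LIFETIME.size // 8   # sadb_lifetime_len counts 64-bit units
    return SADB_LIFETIME.pack(length_in_words, exttype, src.allocations,
                              src.nbytes, src.addtime, src.usetime)


wire = SADB_LIFETIME.pack(4, SADB_EXT_LIFETIME_CURRENT, 1, 2048, 1139000000, 0)
private = dup_lifemsg(wire)
round_trip = set_lifetime(private, SADB_EXT_LIFETIME_CURRENT)
```

With this split, code such as the SA comparison in the hunks above can read `lft_c->addtime` directly, and the sadb_* names appear only where messages are built or parsed.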
  

Re: FAST_IPSEC and tunnelled packets processing

2006-03-10 Thread gnn
At Thu, 9 Mar 2006 15:53:03 +0100,
VANHULLEBUS Yvan wrote:
> 
> On Wed, Mar 08, 2006 at 08:02:36PM -0800, Sam Leffler wrote:
> [.]
> > If I recall the IPIP handling is different from KAME because there is 
> > support for IPIP encapsulation independent of the IPsec protocols while 
> > KAME only handles IPIP as part of the ESP tunnel configuration.  As to 
> > overhead, in practice, at least back in 4.x where this work was 
> > originally done, the netisr dispatch was effectively shortcircuited 
> > because the dispatch was done from the netisr thread so the net cost was 
> > a enqueue+dequeue of the packet.  I'm not sure about extraneous trips 
> > through ip_input or not stripping headers; this stuff used to work right 
> > but I've not looked at the code in years.
> 
> There IS some code to remove the IPIP header, but it doesn't work.
> 
> I just reported pr kern/94273 with a patch which solves it.
> 

Bug taken by me :-)  I'll try your patch and commit as necessary.

Later,
George


Re: IP_SENDIF?

2006-03-20 Thread gnn
At Sun, 19 Mar 2006 21:34:19 -1000 (HST),
Dave Cornejo wrote:
> 
> Hi,
> 
> Some time ago (Oct 2004) there was some talk of implementing
> IP_SENDIF, a search of the mailing list turns up nothing since then.
> Did anything ever happen with this?
> 

No, but if you have a patch we're up for reviewing it ;-)

It remains on a long list of things to do.


Later,
George


Re: IPv6 raw socket to send original udp

2006-05-09 Thread gnn
At Mon, 08 May 2006 05:44:51 +0900 (JST),
Hideki Yamamoto wrote:
> 
> 
> Hi,
> 
> I tried to use pf as a traffic shaper for a streaming server, but
> it does not work well.  Input of pf is bursted packets within around 20
> msec, but is not bursted packets within around 100 msec or longer.
> This traffic pattern is the feature of the streaming server.
> 
> As pf does not work well, I am thinking of designing an original shaper
> command for a bridge-like FreeBSD box. The command will receive the
> server packets via libpcap, shape them, and then send them at a constant
> rate to another device.  To send packets from the bridge-like FreeBSD box,
> I plan to use a raw IPv6 socket.  However, in my small experiment it does
> not seem to work: the IP_HDRINCL option does not work.
> 
> I wonder if IPv6 raw socket can be used only for ICMPv6.
> I would like to use IPv6 raw socket for original udp packet.
> 
> Thanks in advance.
> 

Hi,

I have trimmed the cc to just -net because I am concerned mostly about
the possibility of a bug in the networking code.  Can you provide more
information on what you're seeing on the raw IPv6 socket?  If you
could send a chunk of code, that might help as well.

Best,
George


Re: nd6_lookup prints bogus messages with point to point devices

2006-05-23 Thread gnn
At Tue, 23 May 2006 13:43:01 +0900,
jinmei wrote:
> Thanks, please do to.  I believe the patch also fixes this problem
> report:
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/93220
> 
> So, could you also confirm this and give feedback to (or close) the
> report?  (I'll send a follow-up message to the report myself if
> it's appropriate).
> 

Will do.  That was on my list to look at anyways.

Thanks,
George


Re: enc0 patch for ipsec

2006-06-16 Thread gnn
I knew there was something bothering me about enc, now I know what it
was.  I'm glad someone else caught this and that you fixed it.
Thanks.

I'll be testing the patch today.

Later,
George


Re: SCTP

2006-07-01 Thread gnn
At Fri, 30 Jun 2006 12:36:10 -0400,
randall wrote:
> 
> Hi all:
> 
> The following link:
> 
> http://www.sctp.org/cvs_diff_6_30.bz2
> 
> Will get you a large patch that you can apply to Current that will
> add SCTP.
> 
> Its a bzip2 patch file since it is so large :-D
> 
> It includes the changes to a few base files.. and mainly its the
> complete files diff'd against this mornings current cvs...
> 
> Yes, I know that the build is broken in acpi/acpi_asus but the sctp
> code did compile and build a kernel for me... so once the above is
> fixed.. you should be able to use the patch and check it out :-D
> 
> Oh, you will need to add
> 
> option SCTP
> 
> to your kernel conf... and it might not
> hurt to do a make sysent in sys/kern
> 
> I will prepare a separate file for the overall libsctp.a
> once I figure out where it should go :-D
> 
> Happy SCTPing.. and if you have any problems with the patch please
> send me an email :-D

And please start testing this because many of us want to integrate
this in the near future :-)

Thanks,
George


Re: SCTP

2006-07-03 Thread gnn
At Mon, 3 Jul 2006 09:48:06 +0200,
Marcin Jessa wrote:
> > And please start testing this because many of us want to integrate
> > this in the near future :-)
> 
> Any hints on how to test SCTP ?
> Not much really about any practical implementation of it
> on http://www.sctp.org/

One trivial toy to play with is NetPIPE, but that's just a bandwidth
tester.  It does show socket programming with SCTP, which is much the
same as with TCP until you get to the advanced features, which NetPIPE
doesn't cover.

You'll need my updated NetPIPE until the patches are committed to
their project:

http://www.freebsd.org/~gnn/netpipe.tar.gz

I suspect Randall has a better list of things to try.

Best,
George
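
Note: to illustrate how close basic SCTP socket code is to TCP, here is a minimal Python sketch of opening a one-to-one style socket for each protocol. `open_stream_socket` is a hypothetical helper; the SCTP call only succeeds on a kernel built with SCTP support, so the sketch reports failure instead of assuming it works. The fallback value 132 is the IANA protocol number for SCTP.

```python
import socket

# Not every Python build exports the constant, so fall back to the number.
IPPROTO_SCTP = getattr(socket, "IPPROTO_SCTP", 132)


def open_stream_socket(proto):
    """Open an IPv4 stream socket for `proto`, or return None if unsupported."""
    try:
        return socket.socket(socket.AF_INET, socket.SOCK_STREAM, proto)
    except OSError:
        return None


for name, proto in (("TCP", socket.IPPROTO_TCP), ("SCTP", IPPROTO_SCTP)):
    sock = open_stream_socket(proto)
    print(name, "available" if sock else "not available")
    if sock:
        sock.close()
```

Only the protocol argument differs between the two calls; the advanced features mentioned above need SCTP-specific calls that this sketch does not attempt.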



Re: Possible inconsistency in the use of in6_delmulti()

2006-07-19 Thread gnn
At Tue, 18 Jul 2006 12:03:20 -0700,
Tom Parker wrote:
> 
> Hi,
> 
> New to the list here, but fairly familiar with the innards of (at
> least an older) version of the fbsd networking code. I'm fortunate in
> my ability to run purify on a simulated instance of our ported version
> of the networking code.  Purify has picked up a problem that I'm a bit
> mystified as how it can be fixed.  It is present in current versions
> also, I'm interested in any comments people have (I think ours is 4.4
> vintage, but it is hard to tell).
> 
> As far as I can tell, in most calling paths when in6_delmulti() is
> called, it is done after the in6_multi_mship structure has been
> removed from the im6o_memberships list in the relevant PCB.  This
> applies to in6_ifdetach(), in6_pcbpurgeif0, ip6_setmoptions()  etc.
> However in in6_purgeaddr() in6_delmulti is called straight off.  I'm
> not sure if we've violated some usage convention, but purify is
> telling me this causes access violations when we then leave the same
> group using setsockopt().  in6_purgeaddr is called when we remove the
> address from the interface.
> 
> This should be possible in a real kernel.  Add a multicast address to
> an interface, open a socket and listen to the address, then remove the
> address from the interface.
> 
> Am I missing something here or is this a nasty problem in both the
> kernel and our stack port?
> 

It sounds like a bug to me.  Can you file a PR?

Thanks,
George


Packet Construction and Protocol Testing...

2006-07-20 Thread gnn
Hi,

Sorry for the length of this email but I figured I'd get this out
early in case there was anyone else who wanted to play with this.

I have now gotten out version 0.1 of the Packet Construction Set.
This is a set of Python libraries which make writing protocol testing
software much easier.  Of course, you have to know Python, but many
people do, and I favor it strongly over other scripting choices.  The
Summer of Code student I'm working with has also been using this
library, with favorable results.

The Source Forge page is here:

http://sourceforge.net/projects/pcs

and the shar files submitted to get the ports created are now on:

http://www.freebsd.org/~gnn/pcs.port.shar
http://www.freebsd.org/~gnn/py-pypcap.shar

The point of all this is to be able to write better protocol level
tests for our network stack.  Examples are in the scripts/ and tests/
directories of the package but a quick snippet may give a good idea of
what I'm getting at:

    def test_icmpv4_ping(self):
        ip = ipv4()
        ip.version = 4
        ip.hlen = 5
        ip.tos = 0
        ip.length = 84
        ip.id = 1
        ip.flags = 0
        ip.offset = 0
        ip.ttl = 33
        ip.protocol = IPPROTO_ICMP
        ip.src = 2130706433
        ip.dst = 2130706433

        icmp = icmpv4()
        icmp.type = 8
        icmp.code = 0
        icmp.cksum = 0

        echo = icmpv4echo()
        echo.id = 32767
        echo.seq = 1

        lo = localhost()
        lo.type = 2
        packet = Chain([lo, ip, icmp, echo])

        input = PcapConnector("lo0")
        input.setfilter("icmp")

        output = PcapConnector("lo0")
        assert (ip != None)

        out = output.write(packet.bytes, 88)
        assert (out == 88)

This code sends a quick and dirty ICMPv4 ping packet on localhost.
The point of all this is to be able to specify packets easily (see
pcs/packets/xxx.py) and then to treat the packet as an object.

I intend to write up a paper on this stuff as well.  There is
currently a simple manual (PDF and LaTeX) in the package.

Later,
George
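
Note: the numeric constants in the ping snippet can be checked with the standard library. This is a small sketch that does not use PCS; the breakdown of 84 into header plus a 56-byte payload is an assumption (it matches the default size used by ping, but the snippet itself does not attach a payload), as is reading the extra 4 bytes as the loopback link header.

```python
import socket
import struct


def ipv4_to_int(dotted):
    """Convert a dotted-quad address to the 32-bit integer form."""
    return struct.unpack("!I", socket.inet_aton(dotted))[0]


LOOPBACK = ipv4_to_int("127.0.0.1")   # the value used for ip.src and ip.dst

IP_HEADER = 5 * 4        # hlen = 5 counts 32-bit words
ICMP_ECHO_HEADER = 8     # type, code, checksum, id, sequence
PAYLOAD = 56             # assumed: default ping payload size
LOOPBACK_HEADER = 4      # assumed: address-family word in front of the packet on lo0

IP_TOTAL = IP_HEADER + ICMP_ECHO_HEADER + PAYLOAD    # ip.length
ON_THE_WIRE = LOOPBACK_HEADER + IP_TOTAL             # length passed to write()
```

Under those assumptions the totals come out to the 84 and 88 used in the snippet, and `lo.type = 2` corresponds to AF_INET.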



Re: Packet Construction and Protocol Testing...

2006-07-20 Thread gnn
At Thu, 20 Jul 2006 10:40:41 -0400,
Chuck Swiger wrote:
> This strikes me as a pretty cool thing, thank you for putting the source out 
> there...given a bit of free time, I'd like to at least test this, if not 
> contribute. [1] :-)

Thanks :-)

> The port is missing a dependency on net/py-pcap, BTW, which makes most of the 
> tests fail if one simply downloads the shar file and tries to run them:
> 

For now I wanted to make them separate though the documentation points
out that you can't use the PCAP connector without py-pypcap.

I may add the dependency in a future release.

Thanks for the patch!

> [1]: If I could only get net/py-pcap to build, I might be able to do a little 
> more...  :-)

You only need net/py-pypcap, but if that's what you meant please let
me know what the build problem is.

Later,
George



Re: Packet Construction and Protocol Testing...

2006-07-20 Thread gnn
At Thu, 20 Jul 2006 10:48:14 -0400 (EDT),
Andrew R. Reiter wrote:
> 
> 
> Aren't there already tools for doing this -- libnet / libdnet that both 
> have py wrappers?

I looked at all those, and more, but they miss an important point.
That is, in PCS you define a packet like this (from
pcs/packets/ipv4.py):

    def __init__(self, bytes = None):
        """ define the fields of an IPv4 packet, from RFC 791
        This version does not include options."""
        version = pcs.Field("version", 4, default = 4)
        hlen = pcs.Field("hlen", 4)
        tos = pcs.Field("tos", 8)
        length = pcs.Field("length", 16)
        id = pcs.Field("id", 16)
        flags = pcs.Field("flags", 3)
        offset = pcs.Field("offset", 13)
        ttl = pcs.Field("ttl", 8, default = 64)
        protocol = pcs.Field("protocol", 8)
        checksum = pcs.Field("checksum", 16)
        src = pcs.Field("src", 32)
        dst = pcs.Field("dst", 32)
        pcs.Packet.__init__(self,
                            [version, hlen, tos, length, id, flags, offset,
                             ttl, protocol, checksum, src, dst],
                            bytes = bytes)
        # Description MUST be set after the PCS layer init
        self.description = "IPv4"

which creates a property in the object for each named field.
This is what makes it possible to do:

ip = ipv4()
ip.ttl = 64
ip.src = inet_pton("128.32.1.1")

etc. in your program.  Also note that the bit lengths can be odd, such
as getting the 13 bit offset field.  So, PCS is doing all the packing
and unpacking of the bytes for you.  I intend to put in automatic
bounds checking in an upcoming version.

There is much more about this in the documentation, docs/pcs.pdf in
the package.

Future versions will allow import/export to various formats as well,
such as XML, so that defining packets will be even easier as will
writing tools.

Later,
George
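
Note: the packing of fields with odd bit widths can be sketched without PCS. The code below is an illustration of the general technique, not PCS's implementation; `pack_fields` and `unpack_fields` are hypothetical names, and fields are laid out most significant first, as in network byte order.

```python
def pack_fields(fields):
    """Pack (value, bit_width) pairs, first field most significant, into bytes."""
    total_bits = sum(width for _, width in fields)
    if total_bits % 8:
        raise ValueError("fields must add up to a whole number of bytes")
    acc = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        acc = (acc << width) | value
    return acc.to_bytes(total_bits // 8, "big")


def unpack_fields(data, widths):
    """Split bytes back into integers of the given bit widths."""
    acc = int.from_bytes(data, "big")
    remaining = len(data) * 8
    values = []
    for width in widths:
        remaining -= width
        values.append((acc >> remaining) & ((1 << width) - 1))
    return values


first_byte = pack_fields([(4, 4), (5, 4)])          # version = 4, hlen = 5
flags_offset = pack_fields([(2, 3), (185, 13)])     # 3-bit flags, 13-bit offset
decoded = unpack_fields(flags_offset, [3, 13])
```

The range check in `pack_fields` is the kind of bounds checking the message says is planned for a later PCS version.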


Re: Packet Construction and Protocol Testing...

2006-07-21 Thread gnn
At Fri, 21 Jul 2006 21:17:39 +0200,
troglocan wrote:
> 
> Hi,
> 
> Sorry for the late reply, I just read the thread. Did you take a look
> at Scapy (http://www.secdev.org/scapy). It does exactly (and more)
> what you are trying to do ...
> 
> a+
> 
> ps : also, Scapy6 (http://namabiiru.hongo.wide.ad.jp/scapy6/) provides
> extension of Scapy for IPv6 (some parts of what is advertised on main
> page are currently being reviewed and have been extracted of main file
> temporarily).
> 

Yup, looked at it.  A single file, hard to maintain, and does not
support creating arbitrary packets in the way PCS does, but it does
have some interesting features.

Thanks,
George


Re: Changes in the network interface queueing handoff model

2006-07-30 Thread gnn
At Sun, 30 Jul 2006 15:04:48 +0100 (BST),
rwatson wrote:
> Conceptual review as well as banchmarking, etc, would be most welcome.
> 

I remember talking about this at BSDCan and certainly for high end
hardware it seems that it's the right way to go. 

Later,
George


Re: ipv6 in ipv6 tunnel with FreeBSD 4.11

2006-08-18 Thread gnn
At Fri, 18 Aug 2006 15:28:11 + (GMT),
Julien Abeillé wrote:
> Hi,
>  
>  I am using freebsd 4.11 and trying to setup ipv6 in ipv6 tunnels.

All my stuff is on HEAD and 6 so I don't know if this applies but I
think it should.

>  I have the following testbed
>  4 machines connected in line:
>  
>   M1 ---------- M2 ---------- FreeBSD ---------- M3
>  c::1 --- c::2  |  b::2 --- b::1  |  a::1 --- a::2
>  
>  I want to create a tunnel between FreeBSD (b::1) and M2 (b::2)
>  
>  Here is my configuration on the FreeBSD machine:
>  em0 : a::1/64
>  em1 b::1/64
>  
>  I do the following to set up the tunnel:
>  
>  ifconfig gif0 create
>  ifconfig gif0 tunnel b::1 b::2
>  ifconfig gif0 d::1/64
>  route add -inet6 -host c::1 -interface gif0
>  
>  I am not sure about what is the gif0 address d::1/64 used for.
>  

Nor am I.  What directions are you following?  I believe that may be
there because the gif tunnel instructions talked about setting up IPv4
tunnels for IPv6.

>  the problem is: when i ping or send any traffic from a::2 to c::1,
>  the FreeBSD machine adds an ipv6 header with b::1 as source, b::2 as 
> destination, but with hop count limit=0
>  
>  Is my configuration ok? 

A few things to note:

1) You need to have

ipv6_gateway_enable="YES"

set to forward packets.

2) Are you trying to tunnel between two interfaces on the same
   machine?  It's hard to tell from your description.  If the FreeBSD
   box is a router between two tunnels then you need two tunnel
   endpoints.  One pointing at M2 and one pointing at M3.

Best,
George


Re: Re : ipv6 in ipv6 tunnel with FreeBSD 4.11

2006-08-19 Thread gnn
At Sat, 19 Aug 2006 11:45:13 + (GMT),
Julien Abeillé wrote:
> 
> Hi George,
>  
> thanks for your answer. A few clarifications then: I do two setups in
> fact, one on the IMUNES network emulator (this is why I use FreeBSD
> 4.11), one with 4 real machines. The one with four real machines has
> no tunnel endpoint. I know it is a bit weird, but the other machines
> are Linux machines, and I did not want to run into compatibility
> problems (if there are any?).

I don't know if there are compatibility issues with Linux but I doubt
it as the same people developed the protocol stacks, at least
initially.

> On this testbed (with the real machines), I just send traffic from M3
> through the FreeBSD machine. I did not set
> ipv6_gateway_enable="YES", but use sysctl. I do not have a BSD here
> (internet cafe) so I do not remember the exact parameter
> (net.inet6.ip6.forwarding?) but I set IPv6 forwarding to one and
> without tunnels I can ping from one end to the other. One question:
> are the two tunnel endpoints supposed to negotiate something? If
> yes, I do need another endpoint.

Nope, they don't need to negotiate anything; the machines are just
acting as routers.  You also need to have appropriate routes set.

> In the IMUNES simulation, I have the 4 machines inline the same way
> (M1 M2 M3 M4) and set up the tunnel on M2 and M3 (between b::1 and
> b::2). It works but with hop limit = 0. I did the same setup
> with 5 machines inline (M1 M2 M3 M4 M5) and a tunnel between M2 and
> M4. It does not work anymore: if I send traffic through the tunnel
> from M2 to M4, M3 discards the packets and sends an ICMPv6 "time
> exceeded..." message to M2.
>  

That is odd, but it may be that one of the machines is considering the
next hop address to be link local, and not global, in which case it
might set the hop limit to be 1, and then it would be decremented to 0
at the other end of the tunnel.  Make sure you're not using link local
addresses on your tunnel endpoints.
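
A quick way to rule out the link-local case described above is to test
the tunnel endpoint addresses against fe80::/10.  The Python sketch
below is illustrative only: it uses the standard ipaddress module, the
helper names are made up for this note, and forward() is a simplified
model of how a router treats the hop limit, not FreeBSD's forwarding
code.

```python
import ipaddress


def link_local_endpoints(*endpoints):
    """Return those endpoint addresses that fall inside fe80::/10."""
    return [e for e in endpoints if ipaddress.IPv6Address(e).is_link_local]


def forward(hop_limit):
    """Simplified model of one router hop (hypothetical helper).

    A router decrements the hop limit before forwarding.  A packet that
    arrives with a hop limit of 0 or 1 cannot be forwarded: the router
    drops it and answers with an ICMPv6 Time Exceeded message.
    Returns the new hop limit, or None when the packet is dropped.
    """
    if hop_limit <= 1:
        return None
    return hop_limit - 1


# Addresses taken from the testbed description in this thread.
print(link_local_endpoints("b::1", "b::2"))
# A link-local endpoint would be reported like this.
print(link_local_endpoints("fe80::1", "b::2"))
print(forward(64))
print(forward(0))
```

Under this model, an outer header sent with hop limit 0 is dropped by
the first router inside the tunnel, which would be consistent with the
"time exceeded" message seen from M3 in the five-machine setup.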

> I will try on Monday without giving an IPv6 address to the gif
> interface. Indeed I followed the instructions in the FreeBSD
> Handbook IPv6 section for IPv6-in-IPv4 tunnels. The problem is I did
> not find any instructions for IPv6 in IPv6. The only thing I found
> in KAME was: "be careful with IPv6 in IPv6 and IPv4 in IPv4 tunnels
> which often result in infinite routing in the kernel". Maybe that is
> what is happening here.

It could be, but I don't have a setup like that to test.

You might also ask on the [EMAIL PROTECTED] mailing list.

Also, keep freebsd-net@freebsd.org cc'd as someone else might be able
to answer this better than I.

Later,
George


Re: RFC: TSO patch for current

2006-09-01 Thread gnn
At Fri, 1 Sep 2006 15:51:21 -0700,
Jack Vogel wrote:
> 
> This is a patch for the stack and the em driver to enable TSO
> on CURRENT. Previously I had problems getting it to work, but
> this is functional.
> 
> I should note that CURRENT is being a pain right now, when
> I comment out em in the config the kernel panics coming up,
> so I had to substitute this code into the tree. Rather bizarre :)
> 
> I have this functionality running on a 6.1 based system, and
> our test group is already testing against that driver, so far
> things are looking good.
> 
> I have designed it so the driver can continue to be built
> without support. There is also a sysctl in the stack code
> so you can set net.inet.tcp.tso_enable on or off and
> compare.
> 
> I know there may be some refinements to add in, but I
> would like to get this into CURRENT as a start.
> 
> Comments?

A single read through of the patch looks OK to me.

Later,
George


Re: ipv6 host routes

2006-09-03 Thread gnn
At Sun, 3 Sep 2006 15:22:14 +0200,
John Hay wrote:
> 
> Hi,
> 
> Does anybody know how to add a direct IPv6 host route that actually works?
> What I mean is not through a gateway, but for one directly reachable.
> 
> I know it normally isn't needed because it will just work, but I'm
> trying to add FreeBSD IPv6 capability to net/olsrd. It looks like I have
> most of the rest working, but this is one of the last things tripping
> me up.
> 
> I have played for most of the morning with various incantations of
> "route add -inet6 -host ..." and just get various non-working
> routes.  For my test I have the machines configured on the same IPv6
> subnet and without adding anything special I can ping them, but not
> after adding a route.
> 
> The reason they (the olsr guys) do it is so that a router can have
> multiple WiFi interfaces all configured on the same subnet. Then
> when they get comms with a machine, they can add a route to it
> through that interface.
> 
> At the moment I'm not even at the point of trying to get multiple
> interfaces on the same subnet working, although I would like to
> do that in the future. It would help if you have a high-site with
> multiple antennas and radios.
> 
> So does anybody know how to add a direct IPv6 host route on FreeBSD?
> 

Can you show us the commands, network layout and the output of netstat
-r and ndp -a?

Best,
George


Alpha Release 0.2 of Packet Construction Set

2006-09-06 Thread gnn
This release includes checksumming for IP and ICMP packets (based on
the algorithm in RFC 792) and LengthValue fields so you can easily
encode things like DNS labels and the like.
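
For readers who have not seen it, the checksum used by IP and ICMP is a
16-bit one's complement sum.  The following is a minimal stand-alone
sketch of that algorithm in Python; it is not PCS code and does not use
the PCS API, and the function name is chosen here for illustration.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of `data`."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    words = struct.unpack("!%dH" % (len(data) // 2), data)
    total = sum(words)
    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# ICMP echo request header: type 8, code 0, checksum field zeroed,
# identifier 1, sequence number 1.
header = bytes.fromhex("0800000000010001")
csum = internet_checksum(header)
print(hex(csum))

# A receiver recomputes the sum over the header with the checksum filled
# in; a result of 0 means the header is intact.
filled = header[:2] + struct.pack("!H", csum) + header[4:]
print(internet_checksum(filled))
```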

About half the work was done by Clement, our SoC student working on
IPv6 security issues.

As always, comments are welcome.

http://pcs.sourceforge.net

I hope to start writing some actual tests now that I have the ability
to handle most of the relevant packet level code.

BTW The package comes with quite a bit of documentation for an alpha
release, as well as demo and test scripts to play with.  

Later,
George


Re: Can someone take a look at PR 89061 (ipv6 autoconfigure 6to4)

2006-09-17 Thread gnn
At Fri, 8 Sep 2006 17:14:18 -0700,
Matt Reimer wrote:
> 
> Can someone take a look at PR 89061
> (http://www.freebsd.org/cgi/query-pr.cgi?pr=89061). It contains a
> patch adding an /etc/rc.conf knob to autoconfigure an RFC 3068 6to4
> address.
> 

The comments in the PR indicate that awk can't be used at that point,
so if that's true, while I think it's a good idea, the implementation
will have to be changed.  From an IPv6 standpoint it's fine though.
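
For reference, a 6to4 prefix is formed by placing the 32-bit IPv4
address directly after 2002::/16, giving a /48 (this layout comes from
RFC 3056; RFC 3068, cited above, covers the anycast relay address).
The sketch below shows only that derivation in Python.  It is not the
patch from the PR, which has to work in the rc scripts, and the function
name and the example address 192.0.2.1 are illustrative.

```python
import ipaddress


def sixtofour_prefix(ipv4: str) -> ipaddress.IPv6Network:
    """Return the 2002:V4ADDR::/48 prefix for a public IPv4 address."""
    v4 = int(ipaddress.IPv4Address(ipv4))
    # 2002 occupies the top 16 bits, the IPv4 address the next 32 bits.
    prefix = (0x2002 << 112) | (v4 << 80)
    return ipaddress.IPv6Network((prefix, 48))


print(sixtofour_prefix("192.0.2.1"))
```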

Later,
George


AsiaBSDCon 2007

2006-10-01 Thread gnn
Hi Folks,

Sorry for the slightly OT email but I'm hoping some of the people
diligently working away on FreeBSD will submit papers and
presentations to the upcoming AsiaBSDCon 2007, to be held in Tokyo,
Japan in March 2007.

See this link:  http://asiabsdcon.org/

Thanks, and now back to our regularly scheduled program :-)

Later,
George





Tentative first patch for FAST_IPSEC with IPv6

2006-10-01 Thread gnn
Howdy,

There is now a patch at

http://people.freebsd.org/~gnn/fast_ipv6.patch

which should allow you to run FAST_IPSEC with IPv6.  It is very new,
it has passed most TAHI tests, and does not, so far as I know, panic
the kernel.  This is a patch against HEAD.

Please test and send feedback.

There is still more to do but at least this is now starting to work.

Later,
George


HEADS UP, minor change to IPv6 link local address setup

2006-10-02 Thread gnn
Hi Folks,

I just committed to HEAD a minor change to our IPv6 support.  Unless a
user sets ipv6_enable to YES in rc.conf, link-local addresses will NOT
appear on any interface.  This seems to make some sense because you
shouldn't have them if you didn't ask for IPv6 to be enabled.  IPv6
remains in the kernel by default.

Please let me know if there are any issues with this change.  I did
test this but of course not as extensively as all of you can.

I intend to MFC this in 3 days if re@ is willing to let me.

Thanks,
George


Re: Problems under test of IPv6 Ready Logo Program Phase-2

2006-10-18 Thread gnn
At Wed, 18 Oct 2006 19:08:57 -0800,
chenxiaochen wrote:
> 
> Dear all, this is my second letter here. I am beginning to love it
> here, for there are many kind friends, such as SUZUKI
> Shinsuke<[EMAIL PROTECTED]> :) OK, questions follow...  Does anyone
> do research on the IPv6 Ready Logo Program? I am now doing IPv6
> conformance tests on the TAHI platform and have met some problems.

Though some of us are using TAHI I do not believe the project itself
is going for the Logo Program.

I am working on IPv6 and IPSec and using TAHI regularly.

> My test setup is below:
>   ---+--------------+--- Link1
>      |              |
>      |              |
>      | rl0          | rl1
>      TN            NUT
>      | bge0         | rl0
>      |              |
>      |              |
>   ---+--------------+--- Link0
>  TN: IBM desktop PC, OS is FreeBSD 6.1;
>  NUT: IBM desktop PC, OS is FreeBSD 6.1
> ---   rl0, rl1, bge0 stand for the NICs of TN and NUT.
> My test software is v6eval-3.0.10 and the packages are Self_Test_1-4-2 and
> v6eval-remotes-3.0.
> 
> 1. Section 5: RFC 2463 - ICMPv6
>    "case 11 Part B: Multicast Destination"  ---  fail
>    After TN sends an Echo Request to the global multicast address
>    (ff1e::1:2), the following message appears on NUT's screen:
>    "rl1: discard oversize frame (ether type 86dd flags 3 len 1514 > max 1294)"
>    However, "case 10 Part A: Unicast Destination" passes.
> 
> 2. Section 2: RFC 2461 - Neighbor Discovery for IPv6
>    "127 Part C: Sending Unsolicited RA (Min Values)"  --- fail
>    After NUT executes rtadvd, TN says "Could't observe RA".
>    The corresponding rtadvd.conf is

I don't believe that you need to run your own RA.  TAHI is usually
self contained.

>    But when I use Ethereal to capture the IP packets, I get an RA about 6
>    seconds after rtadvd is executed.
>    The captured RA's parameters are:
>    cur hop limit--64
>    router lifetime--1800
>    reachable time--0
>    retrans time--0
>    valid lifetime--0x00278d00
>    preferred lifetime--0x00093a80
>

You should check if this "just works" without the RA.
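
As an aside, the two hexadecimal lifetimes in the captured RA above are
counts of seconds.  A short Python sketch (the helper name is made up
for this note) converts them to something easier to read:

```python
from datetime import timedelta


def decode_lifetime(raw: str) -> timedelta:
    """Interpret a hexadecimal lifetime field as a number of seconds."""
    return timedelta(seconds=int(raw, 16))


for name, raw in (("valid lifetime", "0x00278d00"),
                  ("preferred lifetime", "0x00093a80")):
    life = decode_lifetime(raw)
    print(name, int(raw, 16), "seconds =", life.days, "days")
```

These come out to 30 days and 7 days respectively.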

> 3. Section 3: RFC 2462 - IPv6 Stateless Address Autoconfiguration
>    All cases fail
>    Reason: TN can't observe the DAD process.
>    I can't capture DAD packets with Ethereal during network startup.
>
>    But I can get DAD packets on an IBM T43 (NIC is bge0, OS is FreeBSD
>    6.1) and a T30 (NIC is fxp0, OS is FreeBSD 5.4) when the network
>    starts (host test).  Someone once told me that --- "there is a bug
>    in FreeBSD's kernel which prevents DAD being sent. You have to
>    force ethernet card into any mode rather than auto-select before
>    it is activated, by modifying rc.network". It seems rc.network has
>    been changed to netstart in FreeBSD 6.1, but I don't know how to
>    modify it.

I have not heard of this and don't have that hardware so can't check
it.

>  By the way, there is a bug I found in the IPv6 Ready Logo Program
>  Phase-2 automatic test. I hope the information below will be useful
>  to you.
> 
> 1. install v6eval-remotes-3.0
> 2. # cd /usr/local/v6eval/bin/freebsd-i386/
> 3. # ee racontrol.rmt
> --
> line 288
> "\t:rtime#$rOpt_retrans:"  should be changed into  "\t:retrans#$rOpt_retrans:"
> --
>  
> 

Best,
George


Re: check internet connection

2006-10-27 Thread gnn
At Fri, 27 Oct 2006 09:43:17 +1000,
Sam Wun wrote:
> 
> Hi,
> 
> I want to write a C program to check FreeBSD's internet connection.
> What's the best way to do this check at layer 2 or 3 of the TCP/IP
> stack in FreeBSD?

What do you want to check?  There are many layers of connectivity.

Later,
George


Re: Path MTU discovery broken in IPSec

2006-10-30 Thread gnn
Hi Khetan,

I'm confused as to why you attribute this to PMTU discovery.  Do you
see ICMP errors indicating that?  Have you run traceroutes in both
directions from each host?

Thanks,
George

