RE: Sending Ethernet frames

2005-03-21 Thread Don Bowman
From: [EMAIL PROTECTED] On Behalf Of Patrik Arlos
 Hi,
 
  
 
 I'm trying to send 'raw' Ethernet frames. I have however not 
 found any examples of how to do this in BSD. 
 
 Is it possible to open an 'ethernet' socket, similar to an
 AF_INET one?  I need to be able to control the destination
 address and type/len field in the Ethernet header.
 
 In Linux it is possible to open a SOCK_RAW and bind it to a
 particular interface. I've tried to use sockaddr_dl, but in
 this case bind dies with error 22. Is there any way to do this?

You can chmod +w /dev/bpf* and then open and write to a bpf
device.
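
A minimal sketch of the frame layout a bpf write expects, assuming the
usual 14-byte Ethernet header (destination, source, type/len). The helper
name build_frame and the names /dev/bpf0 and em0 in the comment are
illustrative only, not from the original post. The bpf part is left as a
comment because it only builds on BSD. Depending on the BIOCSHDRCMPLT
setting, the kernel may fill in the source address itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_ADDR_LEN 6    /* bytes in a MAC address */
#define FRAME_HDR_LEN  14   /* dst(6) + src(6) + type/len(2) */

/*
 * Build a raw Ethernet frame in buf: destination MAC, source MAC,
 * 16-bit type/len in network byte order, then the payload.
 * Returns the frame length, or 0 if buf is too small.
 */
size_t
build_frame(uint8_t *buf, size_t buflen,
            const uint8_t dst[FRAME_ADDR_LEN],
            const uint8_t src[FRAME_ADDR_LEN],
            uint16_t type_or_len,
            const void *payload, size_t payload_len)
{
    assert(buf != NULL && dst != NULL && src != NULL);
    if (buflen < FRAME_HDR_LEN + payload_len)
        return 0;
    memcpy(buf, dst, FRAME_ADDR_LEN);
    memcpy(buf + FRAME_ADDR_LEN, src, FRAME_ADDR_LEN);
    buf[12] = (uint8_t)(type_or_len >> 8);      /* high byte first */
    buf[13] = (uint8_t)(type_or_len & 0xff);
    if (payload_len > 0)
        memcpy(buf + FRAME_HDR_LEN, payload, payload_len);
    return FRAME_HDR_LEN + payload_len;
}

/*
 * On FreeBSD the frame would then be sent roughly like this (not compiled
 * here; needs <net/bpf.h>, <net/if.h>, <sys/ioctl.h>, <fcntl.h>):
 *
 *   int fd = open("/dev/bpf0", O_RDWR);
 *   struct ifreq ifr;
 *   strlcpy(ifr.ifr_name, "em0", sizeof(ifr.ifr_name));
 *   ioctl(fd, BIOCSETIF, &ifr);        bind the descriptor to an interface
 *   write(fd, buf, frame_len);         one write sends one frame
 */
```

Each write() on the bound descriptor sends exactly one frame, so the
destination address and type/len field are whatever the buffer contains.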


___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


RE: Underutilisation of CPU --- am I PCI bus bandwidth limited?

2004-10-25 Thread Don Bowman
From: [EMAIL PROTECTED]
 
 ...
 
 This is rather confusing, as I cannot tell if the system is 
 IO bound or CPU 
 bound. Certainly I would not have expected the 133/64 PCI bus 
 to be saturated 
 given that peak throughput is around 550Mbit/s with 1024-byte 
 packets. (Such a 
 low figure is not unexpected given there are 2 syscalls per packet).

You may find you have not loaned the em driver enough buffers
(max_rxd, max_txd).
You may also find you want to use device polling, poll on idle, and
play with the polling parameters.
In this config I have achieved ~2Gbps of throughput with these
large packets, so I know it can be done.
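
For reference, the arithmetic behind the figures in this thread can be
sketched as below. The function names are made up for illustration, and
the bus figure is the raw width-times-clock peak, ignoring arbitration and
protocol overhead.

```c
#include <assert.h>
#include <stdint.h>

/* Theoretical peak of a parallel PCI/PCI-X bus in Mbit/s: width x clock. */
uint64_t
bus_peak_mbps(unsigned width_bits, unsigned clock_mhz)
{
    assert(width_bits > 0 && clock_mhz > 0);
    return (uint64_t)width_bits * clock_mhz;
}

/* Packets per second needed to carry rate_bps with payload_bytes per packet. */
uint64_t
packets_per_second(uint64_t rate_bps, unsigned payload_bytes)
{
    assert(payload_bytes > 0);
    return rate_bps / ((uint64_t)payload_bytes * 8);
}
```

A 64-bit bus at 133MHz gives 8512 Mbit/s raw, so 550Mbit/s is well under a
tenth of that. At 1024 bytes per packet, 550Mbit/s is about 67K packets
per second, or roughly 134K syscalls per second at 2 syscalls per packet.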



RE: packet generator

2004-09-14 Thread Don Bowman
From: Andrew Gallatin [mailto:[EMAIL PROTECTED]
 Andrew Gallatin writes:
 
   xmit routine was called 683441 times.  This means that the 
 queue was
   only a little over two packets deep on average, and vmstat 
 shows idle
   time.  I've tried piping additional packets to nghook mx0:orphans
   input, but that does not seem to increase the queue depth.
   
 
 The problem here seems to be that rather than just slapping the
 packets onto the driver's queue, ng_source passes the mbuf down
 to more of netgraph, where there is at least one spinlock,
 and the driver's ifq lock is taken and released a zillion times
 by ether_output_frame(), etc.
 
 A quick hack (appended) to just slap the mbufs onto the if_snd queue
 gets me from ~410Kpps to 1020Kpps.  I also see very deep queues
 with this (because I'm slamming 4K pkts onto the queue at once..).
 
 This is nearly identical to the linux pktgen figure on the same
 hardware, which makes me feel comfortable that there is a lot of
 headroom in the driver/firmware API and I'm not botching something
 in the FreeBSD driver.
 
 BTW, did you see your 800Kpps on 4.x or 5.x?  If it was 4.x, what do
 you see on 5.x if you still have the same setup handy?
 
 Thanks,

800Kpps was on 4.7, on a dual 2.8GHz Xeon with 100MHz PCI-X on
em. I will try 5.3.

--don


RE: dyn buckets

2004-09-10 Thread Don Bowman
From: [EMAIL PROTECTED]
 I have a firewall running 4.10 that handles around 
 20mbits/sec of traffic 
 and has around 500 ipfw rules.
 
 Lately I've noticed that net.inet.ip.fw.curr_dyn_buckets 
 seems to be maxing 
 out.  I've increased net.inet.ip.fw.dyn_buckets a few times, 
 but they seem 
 to max out each time.
 
 Is there any problem with increasing 
 net.inet.ip.fw.dyn_buckets far beyond 
 the default?  (I'm at 2048 now)

I use 
net.inet.ip.fw.dyn_buckets=16384
net.inet.ip.fw.dyn_syn_lifetime=5
net.inet.ip.fw.dyn_max=32000




RE: packet generator

2004-09-10 Thread Don Bowman
From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] Behalf Of Andrew Gallatin
 Sent: September 10, 2004 19:08 PM
 To: [EMAIL PROTECTED]
 Subject: packet generator
 
 Does anybody have a free, in-kernel tool to generate packets quickly
 and send them out a particular ethernet interface on FreeBSD?
 Something similar to pktgen on linux?
 
 I'm trying to exercise just the send side of a programmable firmware
 based NIC.  The receive side of the NIC firmware is not yet written,
 but I want to get started tuning and shaking the bugs out of the send
 side while the firmware author does the receive path.  The packets
 just get dropped on the floor by the NIC, so it's a good way to test
 the interface.
 

ng_source was a netgraph module we wrote and contributed.
It can transmit ~800Kpps on a PCI-X system. The code is in
src/sys/netgraph/ng_source.c.
I drive it with a tcl library that can create arbitrary
packets with an object-oriented model, let me know if you'd
like to try that.

--don


RE: device polling takes more CPU hits??

2004-07-26 Thread Don Bowman
From: James [mailto:[EMAIL PROTECTED]
 Hi all,
 
 ...

 
 Any idea why device polling is kind of having... negative 
 impact? Is this b/c
 I have SMP compiled on a box that really doesn't have two 
 cpu's?? Is SMP+APIC_IO
 support even required for HTT use?

I would post the output of 'sysctl kern.polling'; it's likely
some of the tuning there is insufficient.
What do you have HZ set to (sysctl kern.clockrate)? I would
probably have it set to ~1000.
You will want 'machdep.cpu_idle_hlt=1'.

--don


RE: device polling takes more CPU hits??

2004-07-26 Thread Don Bowman
From: James [mailto:[EMAIL PROTECTED]
 Hi Don,
 [EMAIL PROTECTED] sysctl kern.clockrate
 kern.clockrate: { hz = 4000, tick = 250, tickadj = 1, profhz = 1024, stathz = 128 }

That's a pretty high HZ, here's what I have:
kern.clockrate: { hz = 2500, tick = 400, tickadj = 1, profhz = 1024, stathz = 128 }

I have the same box spec as you, only with em (bge doesn't
support polling, but it has its own interrupt coalescer that works...
you can tune that in if_bge.h I think; there are some comments).
I'm doing ~800Kpps with polling. My polling params are below.

 
 [EMAIL PROTECTED] sysctl kern.polling
 kern.polling.burst: 150
 kern.polling.each_burst: 5
 kern.polling.burst_max: 150
 kern.polling.idle_poll: 1
 kern.polling.poll_in_trap: 1
 kern.polling.user_frac: 50
 kern.polling.reg_frac: 20
 kern.polling.short_ticks: 4909
 kern.polling.lost_polls: 11464
 kern.polling.pending_polls: 0
 kern.polling.residual_burst: 0
 kern.polling.handlers: 1
 kern.polling.enable: 1
 kern.polling.phase: 0
 kern.polling.suspect: 10249
 kern.polling.stalled: 3
   
   [EMAIL PROTECTED] sysctl machdep.cpu_idle_hlt
 machdep.cpu_idle_hlt: 1
 

kern.polling.burst: 1000
kern.polling.each_burst: 80
kern.polling.burst_max: 1000
kern.polling.idle_poll: 1
kern.polling.poll_in_trap: 0
kern.polling.user_frac: 5
kern.polling.reg_frac: 120
kern.polling.short_ticks: 29
kern.polling.lost_polls: 55004
kern.polling.pending_polls: 0
kern.polling.residual_burst: 0
kern.polling.handlers: 4
kern.polling.enable: 1
kern.polling.phase: 0
kern.polling.suspect: 50690
kern.polling.stalled: 25
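
One way to compare the two sets of parameters above is a back-of-envelope
ceiling, sketched here under a simplifying assumption: each interface is
polled once per clock tick for at most burst_max packets, with idle
polling and poll_in_trap ignored. The function names are illustrative and
this is not how the kernel computes anything.

```c
#include <assert.h>
#include <stdint.h>

/* Length of one clock tick in microseconds, as shown in kern.clockrate. */
unsigned
tick_usec(unsigned hz)
{
    assert(hz > 0);
    return 1000000u / hz;
}

/*
 * Rough ceiling on packets per second one polled interface can take in,
 * assuming one poll per tick of at most burst_max packets.
 */
uint64_t
poll_ceiling_pps(unsigned hz, unsigned burst_max)
{
    return (uint64_t)hz * burst_max;
}
```

With hz = 4000 and burst_max = 150 the ceiling is about 600Kpps per
interface; with hz = 2500 and burst_max = 1000 it is about 2.5Mpps, which
leaves plenty of room above the ~800Kpps mentioned above.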


RE: device polling takes more CPU hits??

2004-07-26 Thread Don Bowman
From: Marko Zec [mailto:[EMAIL PROTECTED]
 On Monday 26 July 2004 17:35, Don Bowman wrote:
 [EMAIL PROTECTED] sysctl machdep.cpu_idle_hlt
   machdep.cpu_idle_hlt: 1
 
 
 At least on -STABLE, machdep.cpu_idle_hlt setting is ignored 
 / irrelevant when 
 both kern.polling.enable and kern.polling.idle_poll are set.
 

Hmm, this is more interesting.
Since you are SMP, and using POLLING, I assume you did
like me and commented out the !POLLING in SMP #error statement.
You definitely want the halt on idle. The polling in idle
doesn't work anyway, so try disabling it.

James, not sure if you saw the rest of my email with
my params:

 kern.polling.burst: 1000
 kern.polling.each_burst: 80
 kern.polling.burst_max: 1000
 kern.polling.idle_poll: 0
 kern.polling.poll_in_trap: 0
 kern.polling.user_frac: 5
 kern.polling.reg_frac: 120
 kern.polling.short_ticks: 29
 kern.polling.lost_polls: 55004
 kern.polling.pending_polls: 0
 kern.polling.residual_burst: 0
 kern.polling.handlers: 4
 kern.polling.enable: 1
 kern.polling.phase: 0
 kern.polling.suspect: 50690
 kern.polling.stalled: 25



RE: device polling takes more CPU hits??

2004-07-26 Thread Don Bowman
From: James [mailto:[EMAIL PROTECTED]

 
 I have two boxes behind em0 that I can use to generate 
 250kpps to another vlan
 within em0 card as a test, so that bge0 is not involved in 
 the stress test.
 Even when doing so, CPU load climbs higher with device 
 polling turned on.
 Opened up systat, etc to check the interrupts, and em0 is 
 generating 0 
 interrupts with device polling on (as obvious), but general 
 interrupt load
 climbs rock high.. so I don't know what's causing it to 
 climb. Cleared the
 firewall rules as well as a test... no difference :(
 
 Oh also, just FYI, each vlan interface has link0 set, since 
 em(4) supports
 hardware 802.1q tag/detagging.
 

The CPU time during the 'polling' is charged to interrupt,
even though it occurs during softclock. That's why you
see 0 interrupts, but high CPU usage in interrupt.
Did you try lowering the 'register' access?

--don


RE: device polling takes more CPU hits??

2004-07-26 Thread Don Bowman
From: Luigi Rizzo [mailto:[EMAIL PROTECTED]
 On Mon, Jul 26, 2004 at 01:18:46PM -0700, Kelly Yancey wrote:
 ...
  Out of curiosity, what sort of testing did you do to
 arrive at these
  settings?  I did some testing a while back with a SmartBits 
 box pumping
  packets through a FreeBSD 2.8GHz box configured to route
 between two em
  gigabit interfaces; I found that changing the burst_max and 
 each_burst
  parameters had almost no effect on throughput (maximum 1% 
 difference).
 
 fast boxes are pci-bus limited, not CPU limited(*) so 
 changing the burst
 size (which basically amortizes some CPU costs) has little if any
 effect.

The PCI-X bus will probably be 64-bit 133MHz in this case,
the limit moves up to the P64H2 hub for large packets,
to the CPU for small packets. Polling becomes quite
critical to prevent livelock.

--don


Question on SOCK_RAW, implement a bpf->other host tee

2004-07-17 Thread Don Bowman

I'm trying to implement a 'tee' which reads
from bpf, and sends matching packets to
another layer-2 adjacent host.

I'm doing this with SOCK_RAW to try and write
the packet back out. The 'sendto' passes,
but I don't see a packet anywhere.

Am I correct that I can hand an arbitrarily
crafted IP packet into sendto, and the stack
will write the ethernet header on, pick an
interface, etc, based on the address in
the sendto?

I have swapped the ip_len, ip_off fields.

The program I have is below. This is on 4.7.
The handler gets called, the packet there looks 
correct, no error on any system call, yet no
output :(

Suggestions?

/*
 * Copyright 2004 Sandvine Incorporated. All rights reserved
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <err.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/in_systm.h>
#include <netinet/ip.h>
#include <arpa/inet.h>
#include <pcap.h>

void
usage(const char *name)
{
    fprintf(stderr, "Usage: %s [-I input_interface] [-O output_interface] "
            "[-i output_ip(arp for mac)] [-v]\n", name);
    exit(1);
}

/* State handed to the pcap callback: raw socket and destination address. */
typedef struct
{
    int s;
    struct in_addr output_ip;
}
context;

static int verbose;

/* Called by pcap_loop() for each captured packet. */
static void
handler(unsigned char *ct,
        const struct pcap_pkthdr *hdr,
        const unsigned char *pkt)
{
    /* Skip the 14-byte ethernet header to reach the IP header. */
    struct ip *ip = (struct ip *)(pkt + 14);
    context *ctxt = (context *)ct;
    struct sockaddr_in to;
    memset(&to, 0, sizeof(to));
    to.sin_family = AF_INET;
    to.sin_addr = ctxt->output_ip;
    if (verbose)
    {
        fprintf(stderr, "Send %d byte packet\n", hdr->len);
    }
    /* Swap ip_len and ip_off before handing the packet to the raw socket. */
    ip->ip_len = htons(ip->ip_len);
    ip->ip_off = htons(ip->ip_off);
    if (sendto(ctxt->s,
               ip,
               hdr->len - 14,
               0,
               (struct sockaddr *)&to,
               sizeof(to)) != (hdr->len - 14))
    {
        err(1, "sendto");
    }
}

static int
doit(const char *input_interface,
     const char *output_interface,
     struct in_addr output_ip)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    pcap_t *in_d, *out_d;
    context ctxt;
    int on = 1;
    struct bpf_program fp;

    in_d = pcap_open_live((char *)input_interface, 1600, 1, 20, errbuf);
    if (in_d == 0)
    {
        errx(1, "open of %s failed: %s", input_interface, errbuf);
    }

    ctxt.output_ip.s_addr = htonl(output_ip.s_addr);
    ctxt.s = socket(PF_INET, SOCK_RAW, IPPROTO_RAW);
    if (ctxt.s < 0)
        errx(1, "can't open raw socket");
    /* We supply the IP header ourselves. */
    if (setsockopt(ctxt.s, IPPROTO_IP, IP_HDRINCL, (char *)&on, sizeof(on))
        < 0)
    {
        err(1, "setsockopt");
    }

    /* Capture IP packets only. */
    memset(&fp, 0, sizeof(fp));
    if (pcap_compile(in_d, &fp, "ip", 0, 0xfff0) < 0)
    {
        errx(1, "failed to compile: %s", pcap_geterr(in_d));
    }
    if (pcap_setfilter(in_d, &fp) < 0)
    {
        errx(1, "failed to set filter");
    }

    pcap_loop(in_d, -1, handler, (unsigned char *)&ctxt);
    return 0;
}

int
main(int argc, char *argv[])
{
    int ch;
    char *input_interface = "ipfw0";
    char *output_interface = "em2";
    struct in_addr output_ip;
    output_ip.s_addr = 0;

    while ((ch = getopt(argc, argv, "I:O:i:vh?")) != -1)
    {
        switch (ch)
        {
            case 'I':
                input_interface = optarg;
                break;
            case 'O':
                output_interface = optarg;
                break;
            case 'i':
                /* inet_aton() returns 0 on failure */
                if (inet_aton(optarg, &output_ip) == 0)
                {
                    errx(1, "unknown ip %s", optarg);
                }
                break;
            case 'v':
                verbose = 1;
                break;
            case 'h':
            case '?':
            default:
                usage(argv[0]);
        }
    }
    if (verbose)
        fprintf(stderr, "%s->%s(%s)\n",
                input_interface, output_interface, inet_ntoa(output_ip));
    return doit(input_interface, output_interface, output_ip);
}
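
One thing worth checking in the program above, offered as a hedged
observation and not as a confirmed diagnosis: inet_aton() already stores
the address in network byte order, so passing s_addr through htonl() again
byte-swaps it on a little-endian machine. Whether that explains the
missing output is not established in this thread. The sketch below uses
inet_pton() for portability (it stores the address the same way), and the
helper name parse_ipv4_wire_bytes is made up for illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Parse a dotted-quad string and copy the four address bytes, in the
 * order they appear on the wire, into out[]. Returns 1 on success and
 * 0 on failure.
 */
int
parse_ipv4_wire_bytes(const char *text, unsigned char out[4])
{
    struct in_addr a;

    assert(text != NULL && out != NULL);
    if (inet_pton(AF_INET, text, &a) != 1)  /* 1 means a valid address */
        return 0;
    /* s_addr is already in network byte order: no htonl() is needed. */
    memcpy(out, &a.s_addr, 4);
    return 1;
}
```

For "10.1.2.3" the bytes come out as 10, 1, 2, 3 on any host, which is
what sin_addr expects as-is.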



RE: Question on SOCK_RAW, implement a bpf->other host tee

2004-07-17 Thread Don Bowman
From: Don Bowman [mailto:[EMAIL PROTECTED]
 I'm trying to implement a 'tee' which reads
 from bpf, and sends matching packets to
 another layer-2 adjacent host.
 

Sorry to follow up my own post, but...
More specifically, it appears the packet does
try and transmit, but the destination MAC is
(uninitialised?) somewhat random, different
on each packet, not legal.

I can capture it on the correct output interface
with tcpdump. The interface type is xl.

Shouldn't the stack ARP for the destination
in my 'sendto', and fill in the ether header?
The ether-source is filled in, presumably by
the driver.



RE: Looking for a Broadcom BCM5704 datasheet

2004-05-14 Thread Don Bowman
From: Ruslan Ermilov [mailto:[EMAIL PROTECTED]
 On Fri, May 14, 2004 at 09:40:07AM -0700, Paul Saab wrote:
  Ruslan Ermilov wrote:
  
  Dear networkers,
  
  I'm looking for a Broadcom BCM5704[S] technical datasheet. 
  If anyone has
  such a beast, or knows how one could obtain it, please let me know.
  
   
  
  As john pointed out, you can only get this under NDA from 
 broadcom.  
  What exactly are you trying to solve?  I have the latest 
 documentation 
  so I may be able to help you, but I can't give you the docs.
  
 We hoped that with dual-channel NIC we could be able to just move
 the received frame from one port for TX on another port, to overcome
 the 32-bit PCI bus speed limitation, to get better throughput with
 GigE.  Bill Paul already explained in private that they are actually
 two distinct SRAMs, and the operation we needed is not supported
 (without PCI involved).
 

I believe it is 64-bit 133MHz PCI-X.


RE: Can't compile Intel gigabit em driver

2004-05-09 Thread Don Bowman
From: Gary Corcoran [mailto:[EMAIL PROTECTED]
 
 Quick background:
 I'm running FreeBSD 4.8-Release and have a new Intel Pro/1000 MT
 NIC I want to install.  While there is a man page for the em
 driver which should be usable, there is no em listed in LINT
 or GENERIC.  Nor is the source code for if_em.c anywhere on my
 system.  So I downloaded the FreeBSD driver source from Intel,
 which is listed as being for FreeBSD 4.7.  It's their latest code.

em is in the standard source tree for 4.8:

src/sys/dev/em

You add 'device em' to your kernel config to compile
it in, or you can load the module by adding 'if_em_load="YES"' to
loader.conf.

If you installed from the 4.8 CD, you will have the module
present in /modules/if_em.ko.

You can type 'kldload if_em' to try that theory; it will
load the driver, and it should now show in 'ifconfig'.

--don


RE: Stupid question about managed switches

2004-04-08 Thread Don Bowman
From: Marc G. Fournier [mailto:[EMAIL PROTECTED]
 On Thu, 8 Apr 2004, Don Bowman wrote:
 
  From: Marc G. Fournier [mailto:[EMAIL PROTECTED]
  
   Please excuse this, but my experience with them is zilch ...
   am going with
   the HP Procurve 2826(?) Layer2/Layer3 switch, as was
   suggested, but I'm
   curious as to how they work ...
  
   For instance, I know when I setup a router, I have an IN IP
   and an OUT IP
   configured ... but, with a managed switch, what do I have?
  
   For instance, right now, I have a default gateway on the
   providers switch
   of 200.46.204.1 ... and my servers are .2, .3, .4 and .5 ...
   if I put a
   managed switch, vs the unmanaged we have now, between the
   providers switch
   and the servers, does my default route then change to be 
 the switch
   itself?  Or is the 'login part' of the switch thought of the
   same way as
   adding just another server to the network, for 
 connectivity purposes?
  
   As I said, stupid question, but for someone whose never 
 played with a
   managed switch before ... :(
  
   Thanks ..
 
  In layer-2 mode, it's nothing but a hub. It doesn't change your
  default route or anything. Pretend it's not there.
 
  you will need a router connected to this switch, and its
  IP will remain your default route (likely).
 
 'k, but I want to use the managed aspect of it to be able to 
 hard code the
 port rates (ie. to fix this full-duplex issue initially) as well as be
 able to access SNMP so that I can do bandwidth monitoring of external
 traffic ... I have SNMP setup on the FreeBSD boxes right now 
 so that I can
 see network load per server, but I want to be able to isolate the
 'external' traffic from 'internal', by monitoring the 
 specific port that
 is connected to the providers switch ...
 
 So, in both cases, I need to assign an IP somewhere, correct?

Assign the switch an IP address on the same subnet as the router
port it's connected to, and on the same subnet as the PCs.

The ProCurve has a really nice serial interface that auto-detects
the baud rate. Slap a cable in, hit space twice, and it's obvious
from there. Assign it a management IP and route, and an SNMP
community.

In the switch, you can create complete isolation using
vlans. This makes complete virtual switches. Although you
can assign a management IP on each vlan, i never bother.
It doesn't sound like this is what you are looking for.

Also on this management interface (available via telnet
after you set the IP) you can set the params for each
port (duplex, speed). You can also connect a browser to
it to see some basic stats etc.

Now run something like 'mrtg' cfgmaker against the management
IP of the switch, and you'll have a chart per port.

--don


RE: Stupid question about managed switches

2004-04-07 Thread Don Bowman
From: Marc G. Fournier [mailto:[EMAIL PROTECTED]
 
 Please excuse this, but my experience with them is zilch ... 
 am going with
 the HP Procurve 2826(?) Layer2/Layer3 switch, as was 
 suggested, but I'm
 curious as to how they work ...
 
 For instance, I know when I setup a router, I have an IN IP 
 and an OUT IP
 configured ... but, with a managed switch, what do I have?
 
 For instance, right now, I have a default gateway on the 
 providers switch
 of 200.46.204.1 ... and my servers are .2, .3, .4 and .5 ... 
 if I put a
 managed switch, vs the unmanaged we have now, between the 
 providers switch
 and the servers, does my default route then change to be the switch
 itself?  Or is the 'login part' of the switch thought of the 
 same way as
 adding just another server to the network, for connectivity purposes?
 
 As I said, stupid question, but for someone whose never played with a
 managed switch before ... :(
 
 Thanks ..

In layer-2 mode, it's nothing but a hub. It doesn't change your
default route or anything. Pretend it's not there.

you will need a router connected to this switch, and its 
IP will remain your default route (likely).



RE: FIN_WAIT_[1,2] and LAST_ACK

2004-04-04 Thread Don Bowman
From: Brandon Erhart [mailto:[EMAIL PROTECTED]
 Hello everyone,
 
 I am writing a network application that mirrors a given 
 website (such as a 
 souped-up wget). I use a lot of FDs, and was getting
 connect() errors when 
 I would run out of local_ip:local_port tuples. I lowered the 
 MSL so that 
 TIME_WAIT would timeout very quick (yes, I know, this is 
 bad, but I'm 
 going for sheer speed here), and it alleviated the problem a bit.
 
 However, I have run into a new problem. I am getting a good amount of 
 blocks stuck in FIN_WAIT_1, FIN_WAIT_2 or LAST_ACK that stick 
 around for a 
 long while. I have been unable to find much information on a
 timeout for 
 these states. I came across a small patch that modified 
 tcp_timer.c in 
 /usr/src/sys/netinet. It changed line #484 (in FreeBSD 4.9-REL) from:
 
 if (tp->t_state != TCPS_TIME_WAIT &&
 
 to
 
 if (tp->t_state < FIN_WAIT_2 &&
 
 I also tried changing that to .. <= FIN_WAIT_2 ..
 
 However, I still end up with quite a few stuck in FIN_WAIT_1, 
 FIN_WAIT_2 or 
 LAST_ACK after the program exits (and whilst the program is 
 running of 
 course). They don't seem to timeout in the same interval that 
 TIME_WAIT does.
 
 Any ideas? Did I modify the right piece of code? I was told 
 to post here as 
 you all would more than likely know!
 

Perhaps you want to lower net.inet.tcp.msl sysctl?



RE: Odd network issue ... *very* slow scp between two servers

2004-03-07 Thread Don Bowman
From: Marc G. Fournier [mailto:[EMAIL PROTECTED]
 On Sat, 6 Mar 2004, Tim Wilde wrote:
 
  On Sat, 6 Mar 2004, Marc G. Fournier wrote:
 
   I have two servers on the same network switch, sitting 
 one on top of the
   other ... one is running an em (Dual-Xeon 2.4Ghz) device, 
 the other an fxp
   (Dual-PIII 1.3Ghz) device ...
 
  Is it a Cisco Catalyst switch?  If so, you need to switch 
 the em's to
  autoselect, on both the server and switch end.  For some 
 reason, the em
  driver will not properly lock down its rate when talking to a Cisco
  Catalyst switch.  At least, I had an identical problem with 
 em's talking
  to a Catalyst 2950 and that was the fix I came up with.  
 Give it a try and
  see how your results go.
 
 Note that forcing it to 100baseT half-duplex (or 10baseT/UTP 
 half-duplex)
 corrects the problem ... turns out it is only in full-duplex 
 mode that its
 hosed ...

Actually, this is normal behaviour according to the 802.3u spec.
If a device in 'auto' mode is connected to one that is
forced 100FDX, the auto one will negotiate 100HDX.

For example, see HP faq:
http://www.hp.com/rnd/support/faqs/2700.htm#question6

http://roger.friendex.net/duplex_mismatch.htm
has a nice table of this.
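
The rule described above can be sketched as a small lookup. This is a
simplified model, assuming both ends would advertise full duplex when they
do negotiate; the type and function names are made up for illustration.

```c
#include <assert.h>

enum duplex { HALF_DUPLEX, FULL_DUPLEX };

struct port_cfg {
    int autoneg;            /* 1 = autonegotiate, 0 = forced */
    enum duplex forced;     /* used only when autoneg == 0 */
};

/*
 * Duplex that `self` ends up using when linked to `peer`.
 * A forced port uses its forced setting. An autonegotiating port gets
 * full duplex from a peer that also negotiates, and otherwise falls
 * back to half duplex, because a forced peer sends no negotiation
 * information.
 */
enum duplex
resolved_duplex(struct port_cfg self, struct port_cfg peer)
{
    assert(self.forced == HALF_DUPLEX || self.forced == FULL_DUPLEX);
    if (!self.autoneg)
        return self.forced;
    return peer.autoneg ? FULL_DUPLEX : HALF_DUPLEX;
}

/* 1 if the two ends of the link disagree on duplex. */
int
duplex_mismatch(struct port_cfg a, struct port_cfg b)
{
    return resolved_duplex(a, b) != resolved_duplex(b, a);
}
```

In this model auto against forced full duplex is the only mismatching
combination, which fits the report that forcing half duplex made the
problem go away.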

--don


RE: DEVICE_POLLING with SMP

2004-01-29 Thread Don Bowman
 From: Kevin Day [mailto:[EMAIL PROTECTED]
 On Jan 29, 2004, at 1:04 AM, Vlad Galu wrote:
 
  I see no reason for it. Having to switch between multiple kernel
  threads to handle polling may bring too much overhead.
 
 
 
 Would that really be happening though?
 
 If polling is happening in the idle loop, extra overhead 
 doesn't really 
 matter all that much, the CPU is idle, and I can't imagine it 
 being any 
 worse during a livelock inducing amount of traffic.
 
 If it's polling during any other time, the code is exactly the same 
 between the UP and SMP case, and I can't imagine the overhead 
 being all 
 THAT much worse, would it?
 
 My primary goal with it is to stop thrashing context switches 
 when I've 
 got a system acting as a router with 8 network interfaces on it. Even 
 with network card interrupt coalescing there is a whole lot of 
 interrupt activity going on, and polling seems to make a noticeable
 difference to that. I'm also very interested in
 polling's 
 ability to more gracefully handle extremely heavy network traffic 
 without getting into livelock, which may be worth it to some people 
 prone to DoS activity when they have a whole lot of bandwidth to deal 
 with.
 
 I'd be willing to chip in a few bucks for development time if anyone 
 wants to make the changes to try it out. It didn't look that 
 difficult, 
 but my time is pretty booked right now.
 
 -- Kevin

On 4.X, you can simply comment out the check for device polling
and MP operation. The system will now work fine. It will not,
however, poll on idle.

We are running this way and it works very well.

Polling on idle for MP requires a bit more work. If you do that
work, you will have some locking issues to solve.

I have not tried this on current yet.

--don


RE: crossover between gigE?

2003-12-20 Thread Don Bowman
From: Luigi Rizzo [mailto:[EMAIL PROTECTED]
 
 On Sat, Dec 20, 2003 at 07:11:22AM -0800, Alfred Perlstein wrote:
  Any suggestion of the kind of cable one should look for at Frys
  to run between two gigE card (intel em0) to function as a crossover?
 
 A straight cable with all 4 pairs wired will work. GigE (and 
 many modern
 100Mbit switches) have auto polarity detection.
 
   cheers
   luigi

One caveat on that: if you force any of the phy parameters
(e.g. speed, duplex), this defeats the auto polarity (MDI/MDI-X)
detection on em, bge, and maybe others. I.e., to use the technique
above you need to have autoselect enabled for duplex and speed.

--don


RE: how to saturate 100Mbit

2003-12-13 Thread Don Bowman
From: DrumFire [mailto:[EMAIL PROTECTED]
 
  dd if=holey-file of=/dev/null bs=10m
  
  I've got about 30% of CPU load for the server (P-133) and less than
  35mbit/s on wire.
 
 Also you can try to dump traffic with tcpdump and send it with
 
 /usr/ports/net/tcpreplay
 
 I'm trying to send 100Mbit/s for 5-6 minutes with Ethernet 
 frame size at
 64 bytes, but I need very good hardware to make this.

There is a netgraph module called ng_source which can do this.
It can achieve about 400Kpps or 1Gbps on a Xeon system with
a gigabit card, and should be able to saturate an fxp.
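
To put the 64-byte case in numbers, here is a small sketch of the
line-rate calculation, assuming standard Ethernet framing (7-byte
preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap). The
function name is illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Per-frame overhead on the wire: preamble (7) + SFD (1) + gap (12). */
#define WIRE_OVERHEAD_BYTES 20

/*
 * Maximum frames per second at a given link rate. frame_bytes is the
 * Ethernet frame size including the 4-byte FCS (64 for a minimum frame).
 */
uint64_t
line_rate_fps(uint64_t link_bps, unsigned frame_bytes)
{
    assert(frame_bytes >= 64);
    return link_bps / (((uint64_t)frame_bytes + WIRE_OVERHEAD_BYTES) * 8);
}
```

Saturating 100Mbit with 64-byte frames takes about 148.8K frames per
second, well within the ~400Kpps quoted above. Gigabit with minimum-size
frames would need about 1.49M frames per second.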


RE: Two ISP connections

2003-12-10 Thread Don Bowman
From: Andrea Venturoli [mailto:[EMAIL PROTECTED]
 ** Reply to note from Barney Wolff [EMAIL PROTECTED] Wed, 
 10 Dec 2003 11:39:00 -0500
 
 
  I don't know of anything published that does this, but it's easy to 
  write a perl or shell script that pings the router at the adsl isp 
  and does the necessary things when it disappears and reappears.
 
 Mmh, only problem is one of the ISP is famous for blocking 
 ICMP as a whole, so no pings work. I haven't tried this
 particular line yet, but I may need to use some other protocol.
 
 

See the lft port (layer 4 traceroute): http://www.mainnerve.com/lft/

You can use this to get an ICMP response (albeit not an echo
reply) from your ISP. [An ISP can't really block ICMP
"fragmentation needed" messages; that would break path MTU
discovery.]

--don


high number of pcb's, core dump in sysctl -a

2003-11-12 Thread Don Bowman
net.inet.tcp.pcbcount: 76043

when i do 'sysctl net.inet.tcp', i get a core dump,
while trying to read 'net.inet.tcp.pcblist'. 

Is there some built in limit to the size of a sysctl
result?

--don


RE: Giga-bit switches

2003-10-09 Thread Don Bowman
From: Peter J. Blok [mailto:[EMAIL PROTECTED]
 Hi,
 
 This is just a warning. I am setting up a Giga-bit network 
 trying to use Jumbo 
 frames. For NIC the ability to do larger frames is usually 
 listed, but that 
 doesn't seem to be the case for switches.
 
 I have bought a Netgear GS104 switch, which does list a 
 buffer per port of 
 12K. However, according to Netgear support, it is not 
 supported and working. 
 They just say that there is no mentioning of Jumbo frame 
 support, therefore 
 it is not supported. Even on the more expensive Netgear 
 switches it is not 
 listed, so it is a trial-and-error policy.
 
 My understanding is that the Giga-bit definition includes 
 large frame support 
 and if you claim to have a Giga-bit switch you should support 
 large frames, 
 unless specifically excluded.

Jumbo frames are not part of the standard, and are in
general poorly supported. Some Cisco devices do 'mini
giants', e.g. ~1600 MTU. Other Cisco devices will support
9K frames, but at the expense of lowering the overall
buffering (all frames are assumed to be 9K now, so only
~1/4 of the packets may be buffered).

For Cisco devices, the support will be on a line card
by line card basis.


RE: I would like to tcpdump and get all the packets...

2003-09-18 Thread Don Bowman
From: Petri Helenius [mailto:[EMAIL PROTECTED]
 Bruce M Simpson wrote:
 
 Er, if you check this URL:
 http://www.freebsd.org/cgi/cvsweb.cgi/src/contrib/tcpdump/CHANGES
 
 Shurely you mean tcpdump 3.7.2, which is already imported 
 (by fenner, with
 additional hacks)?
 
   
 
 I mean libpcap, which tcpdump also uses, if I'm not mistaken. Look in
 contrib/libpcap
 
 Pete

I found that increasing the bpf buffer size in libpcap
to 256K from the default of 4K made a tremendous difference.

--don


RE: TCP socket shutdown race condition

2003-08-01 Thread Don Bowman
 From: Mike Silbersack [mailto:[EMAIL PROTECTED]
 On Fri, 1 Aug 2003, Scot Loach wrote:
 
  Earlier this week one of our FreeBSD 4.7 boxes panic'd.  
 I've posted the
  stack trace at the end of this message.  Using google, I've 
 found several
  references to this panic over the past three years, but it 
 seems its never
  been taken to root cause.
 
  The box crashes because the cr_uidinfo pointer in the 
 so_cred structure is
  null.  However, on closer inspection the so_cred structure 
 is corrupted
  (cr_ref=3279453304 for example), so I'm guessing it has 
 already been freed.
  Looking closer at the socket, I see that the SS_NOFDREF 
 flag is set, which
  supports my theory.  The tcpcb is in the CLOSED state, and 
 has the SENTFIN
  flag set.
 
 About how many concurrent connections are you pushing this machine to?
 
 There's an unfortunate problem with uidinfo in 4.x:
 
 struct uidinfo {
         LIST_ENTRY(uidinfo) ui_hash;
         rlim_t  ui_sbsize;              /* socket buffer space consumed */
         long    ui_proccnt;             /* number of processes */
         uid_t   ui_uid;                 /* uid */
         u_short ui_ref;                 /* reference count */
 };
 

We are pushing roughly 50K-70K TCP connections to this process.

I think I see what you are suggesting :)
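
A note on the struct quoted above: ui_ref is a u_short, so it can count at
most 65535 references. One reading of the hint (an assumption; the thread
does not spell it out) is that 50K-70K sockets owned by a single uid can
wrap that counter. The sketch below models this in portable C.
uidinfo_model and hold_refs are illustrative names, not kernel code.

```c
#include <stdint.h>

/* Hypothetical stand-in for the 4.x uidinfo reference count (a u_short). */
struct uidinfo_model {
    uint16_t ui_ref;    /* 16-bit reference count, as in the quoted struct */
};

/* Take n references one at a time and return the resulting count. */
static unsigned hold_refs(struct uidinfo_model *ui, unsigned n)
{
    for (unsigned i = 0; i < n; i++)
        ui->ui_ref++;   /* wraps silently from 65535 back to 0 */
    return ui->ui_ref;
}
```

In this model, 70000 holders leave the counter at 4464, so releasing only
4464 of them would bring it to zero while the rest still point at the
structure.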

--don


RE: Help with FreeBSD Bridged Firewall

2003-07-30 Thread Don Bowman
 From: William Knechtel [mailto:[EMAIL PROTECTED]
 Yeah, the arp cache is the problem, thanks for nailing that 
 one for me.
 However, the ipfw rule you supplied doesn't seem to want to work for
 me...  I think for the time being I'll just run a cron job every 15
 minutes or so that clears the arp cache completely.  Thanks again for
 your help!!  I really appreciate it!

You can change the ARP timeout period with sysctl.
Run 'sysctl net.link.ether' to see all of the knobs.
net.link.ether.inet.prune_intvl and net.link.ether.inet.max_age 
control the ARP cache age time.


RE: Help with FreeBSD Bridged Firewall

2003-07-29 Thread Don Bowman
 From: William Knechtel [mailto:[EMAIL PROTECTED]

I think you need to allow arp through this device, something 
like:
ipfw add 30 allow layer2 mac-type arp
[not sure which rule to insert it at].

I'm guessing your arp cache is timing out.


splx() bug in ip_dummynet?

2003-07-24 Thread Don Bowman
I think 1.24.2.2 of ip_dummynet.c [RELENG_4] has a bug; can someone 
comment?
In the snippet below, the value of 's' from splimp() is
overwritten by the return value of alloc_hash(), which is
an errno. If it is != 0, then there's a missing splx().
If it is == 0, then splx() is called with the wrong value.

[I've filed a PR against this, and will probably change
the alloc_hash() call to use a different variable for its return
value in my tree]


            s = splimp();
            x->bandwidth = p->bandwidth ;
            x->numbytes = 0; /* just in case... */
            bcopy(p->if_name, x->if_name, sizeof(p->if_name) );
            x->ifp = NULL ; /* reset interface ptr */
            x->delay = p->delay ;
            set_fs_parms(&(x->fs), pfs);


            if ( x->fs.rq == NULL ) { /* a new pipe */
                s = alloc_hash(&(x->fs), pfs) ;
                if (s) {
                    free(x, M_DUMMYNET);
                    return s ;
                }
                x->next = b ;
                if (a == NULL)
                    all_pipes = x ;
                else
                    a->next = x ;
            }
            splx(s);
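
The failure mode can be reproduced outside the kernel with a toy model.
This is only a sketch: splimp_model(), splx_model(), alloc_hash_model()
and cur_ipl are made-up stand-ins for the real spl interface and the
allocator, and the level values are arbitrary.

```c
#include <errno.h>

/* Toy model of the spl interface; not the kernel implementation. */
static int cur_ipl = 0;     /* current interrupt priority level */

static int splimp_model(void) { int old = cur_ipl; cur_ipl = 5; return old; }
static void splx_model(int s) { cur_ipl = s; }

/* Hypothetical allocator: returns 0 on success or an errno value. */
static int alloc_hash_model(int fail) { return fail ? ENOMEM : 0; }

/* Buggy shape: 's' is reused for the errno. */
static int config_pipe_buggy(int fail)
{
    int s = splimp_model();

    s = alloc_hash_model(fail);     /* saved level is lost here */
    if (s)
        return s;                   /* returns with the level still raised */
    splx_model(s);                  /* "restores" level 0, not the saved one */
    return 0;
}

/* Fixed shape: separate error variable, splx() on every path. */
static int config_pipe_fixed(int fail)
{
    int s = splimp_model();
    int error = alloc_hash_model(fail);

    if (error) {
        splx_model(s);
        return error;
    }
    splx_model(s);
    return 0;
}
```

Starting from a non-zero level makes both halves of the bug visible: the
buggy error path leaves the level raised, and the buggy success path drops
it to 0 instead of restoring the caller's level.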


RE: splx() bug in ip_dummynet?

2003-07-24 Thread Don Bowman
From: Don Bowman [mailto:[EMAIL PROTECTED]

 ...

I believe this patch will correct the issue.

Index: ip_dummynet.c
===
RCS file: /usr/cvs/src/sys/netinet/ip_dummynet.c,v
retrieving revision 1.24.2.17.1000.1
retrieving revision 1.24.2.17.1000.2
diff -U3 -r1.24.2.17.1000.1 -r1.24.2.17.1000.2
--- ip_dummynet.c   21 Jun 2003 20:47:59 -  1.24.2.17.1000.1
+++ ip_dummynet.c   24 Jul 2003 15:27:59 -  1.24.2.17.1000.2
@@ -1571,10 +1571,12 @@
 
 
            if ( x->fs.rq == NULL ) { /* a new pipe */
-               s = alloc_hash(&(x->fs), pfs) ;
-               if (s) {
+               int s1;
+               s1 = alloc_hash(&(x->fs), pfs) ;
+               if (s1) {
                    free(x, M_DUMMYNET);
-                   return s ;
+                   splx(s);
+                   return s1 ;
                }
                x->next = b ;
                if (a == NULL)


RE: using memory after freed in tcp_syncache (syncache_timer()) with ipfw: patch attached

2003-07-01 Thread Don Bowman
From: Don Bowman [mailto:[EMAIL PROTECTED]
 
 Synopsis: under some ipfw conditions, tcp_syncache has
 syncache_respond() call ip_output call ip_input call syncache_drop(),
 which drops the 'syncache' that is being worked on, or corrupts
 the list, etc. This is typically seen from syncache_timer or
 syncache_add.
 
 I've attached a patch that I believe corrects this problem.
 I'm observing it on 4.7, but I believe it equally affects RELENG_4
 and CURRENT.
 
 This seems to make the problem I was seeing go away. I'm
 currently running with 2K syn/second through the original condition,
 will let it go overnight like that. I think that will flush
 out if i've introduced a leak or other crash.
 
 Can someone who knows this code perhaps critique what I've done?
 
 Essentially I have made syncache_drop() instead defer the delete
 onto a different list. In the timer, I delete the syncache entries
 from the delete list. This costs some performance and memory, but
 was the best way I could come up with.
 

There was an error in the previous patch.

Index: tcp_syncache.c
===
RCS file: /usr/cvs/src/sys/netinet/tcp_syncache.c,v
retrieving revision 1.5.2.8.1000.3
diff -U5 -r1.5.2.8.1000.3 tcp_syncache.c
--- tcp_syncache.c  4 Feb 2003 01:52:03 -   1.5.2.8.1000.3
+++ tcp_syncache.c  1 Jul 2003 14:32:29 -
@@ -83,16 +83,18 @@
 #endif /*IPSEC*/
 
 #include <machine/in_cksum.h>
 #include <vm/vm_zone.h>
 
+static int syncache_delete_flag;
 static int tcp_syncookies = 1;
 SYSCTL_INT(_net_inet_tcp, OID_AUTO, syncookies, CTLFLAG_RW,
     &tcp_syncookies, 0, 
     "Use TCP SYN cookies if the syncache overflows");
 
 static void     syncache_drop(struct syncache *, struct syncache_head *);
+static void     syncache_delete(struct syncache *, struct syncache_head *);
 static void     syncache_free(struct syncache *);
 static void     syncache_insert(struct syncache *, struct syncache_head *);
 struct syncache *syncache_lookup(struct in_conninfo *, struct syncache_head **);
 static int      syncache_respond(struct syncache *, struct mbuf *);
 static struct   socket *syncache_socket(struct syncache *, struct socket *);
@@ -125,10 +127,11 @@
        u_int   next_reseed;
        TAILQ_HEAD(, syncache) timerq[SYNCACHE_MAXREXMTS + 1];
        struct  callout tt_timerq[SYNCACHE_MAXREXMTS + 1];
 };
 static struct tcp_syncache tcp_syncache;
+static TAILQ_HEAD(syncache_delete_list, syncache)  sc_delete_list;
 
 SYSCTL_NODE(_net_inet_tcp, OID_AUTO, syncache, CTLFLAG_RW, 0, "TCP SYN cache");
 
 SYSCTL_INT(_net_inet_tcp_syncache, OID_AUTO, bucketlimit, CTLFLAG_RD,
      &tcp_syncache.bucket_limit, 0, "Per-bucket hash limit for syncache");
@@ -202,10 +205,13 @@
                        rtrequest(RTM_DELETE, rt_key(rt),
                            rt->rt_gateway, rt_mask(rt),
                            rt->rt_flags, NULL);
                RTFREE(rt);
        }
+#if defined(DIAGNOSTIC)
+       memset(sc, 0xee, sizeof(struct syncache));
+#endif
        zfree(tcp_syncache.zone, sc);
 }
 
 void
 syncache_init(void)
@@ -256,10 +262,12 @@
         * older one.
         */
        tcp_syncache.cache_limit -= 1;
        tcp_syncache.zone = zinit("syncache", sizeof(struct syncache),
            tcp_syncache.cache_limit, ZONE_INTERRUPT, 0);
+
+       TAILQ_INIT(&sc_delete_list);
 }
 
 static void
 syncache_insert(sc, sch)
struct syncache *sc;
@@ -312,12 +320,28 @@
 static void
 syncache_drop(sc, sch)
        struct syncache *sc;
        struct syncache_head *sch;
 {
+       if ((sc->sc_flags & SCF_DELETE) == 0) {
+               sc->sc_flags |= SCF_DELETE;
+               syncache_delete_flag = 1;
+               TAILQ_INSERT_TAIL(&sc_delete_list, sc, sc_delete);
+       }
+}
+
+static void
+syncache_delete(sc, sch)
+       struct syncache *sc;
+       struct syncache_head *sch;
+{
        int s;
 
+       if ((sc->sc_flags & SCF_DELETE) == 0) {
+               printf("ERROR ERROR ERROR: SCF_DELETE == 0\n");
+               return;
+       }
        if (sch == NULL) {
 #ifdef INET6
                if (sc->sc_inc.inc_isipv6) {
                        sch = &tcp_syncache.hashbase[
                            SYNCACHE_HASH6(&sc->sc_inc, tcp_syncache.hashmask)];
@@ -329,10 +353,12 @@
                }
        }
 
        s = splnet();
 
+       TAILQ_REMOVE(&sc_delete_list, sc, sc_delete);
+
        TAILQ_REMOVE(&sch->sch_bucket, sc, sc_hash);
        sch->sch_length--;
        tcp_syncache.cache_count--;
 
        TAILQ_REMOVE(&tcp_syncache.timerq[sc->sc_rxtslot], sc, sc_timerq);
@@ -357,10 +383,12 @@
        int s;
 
        s = splnet();
        if (callout_pending(&tcp_syncache.tt_timerq[slot]) ||
            !callout_active(&tcp_syncache.tt_timerq[slot])) {
+               if (syncache_delete_flag)
+                       goto delete_cleanup;
                splx(s);
                return;
        }
        callout_deactivate(&tcp_syncache.tt_timerq[slot]);
 
@@ -390,10 +418,21

RE: using memory after freed in tcp_syncache (syncache_timer()) with ipfw: patch attached

2003-06-30 Thread Don Bowman
Synopsis: under some ipfw conditions, tcp_syncache has
syncache_respond() call ip_output call ip_input call syncache_drop(),
which drops the 'syncache' that is being worked on, or corrupts
the list, etc. This is typically seen from syncache_timer or
syncache_add.

I've attached a patch that I believe corrects this problem.
I'm observing it on 4.7, but I believe it equally affects RELENG_4
and CURRENT.

This seems to make the problem I was seeing go away. I'm
currently running with 2K syn/second through the original condition,
will let it go overnight like that. I think that will flush
out if i've introduced a leak or other crash.

Can someone who knows this code perhaps critique what I've done?

Essentially I have made syncache_drop() instead defer the delete
onto a different list. In the timer, I delete the syncache entries
from the delete list. This costs some performance and memory, but
was the best way I could come up with.
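
The deferred-delete idea described above can be sketched in userland C with
the same <sys/queue.h> TAILQ macros. This is an illustration under assumed
names (struct entry, entry_drop, entry_reap, ENT_DELETE), not the patch
itself: dropping an entry only marks and queues it, and the actual free
happens later from one safe place.

```c
#include <stdlib.h>
#include <sys/queue.h>

#define ENT_DELETE 0x40             /* plays the role of SCF_DELETE */

struct entry {
    int flags;
    TAILQ_ENTRY(entry) e_delete;    /* linkage on the deferred-delete list */
};

static TAILQ_HEAD(, entry) delete_list = TAILQ_HEAD_INITIALIZER(delete_list);
static int freed_count;             /* bookkeeping for the example only */

/* May be called from deep inside a call chain: mark and queue, never free. */
static void entry_drop(struct entry *e)
{
    if ((e->flags & ENT_DELETE) == 0) {
        e->flags |= ENT_DELETE;
        TAILQ_INSERT_TAIL(&delete_list, e, e_delete);
    }
}

/* Called from a safe point (the timer, in the patch): really free. */
static void entry_reap(void)
{
    struct entry *e;

    while ((e = TAILQ_FIRST(&delete_list)) != NULL) {
        TAILQ_REMOVE(&delete_list, e, e_delete);
        free(e);
        freed_count++;
    }
}
```

Callers that still hold a pointer after entry_drop() can keep using it
until the next entry_reap(), which is the property the timer path needs.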

 --don

Index: tcp_syncache.c
===
RCS file: /usr/cvs/src/sys/netinet/tcp_syncache.c,v
retrieving revision 1.5.2.8.1000.3
diff -U3 -r1.5.2.8.1000.3 tcp_syncache.c
--- tcp_syncache.c  4 Feb 2003 01:52:03 -   1.5.2.8.1000.3
+++ tcp_syncache.c  1 Jul 2003 03:05:22 -
@@ -85,6 +85,7 @@
 #include <machine/in_cksum.h>
 #include <vm/vm_zone.h>
 
+static int syncache_delete;
 static int tcp_syncookies = 1;
 SYSCTL_INT(_net_inet_tcp, OID_AUTO, syncookies, CTLFLAG_RW,
     &tcp_syncookies, 0, 
@@ -127,6 +128,7 @@
        struct  callout tt_timerq[SYNCACHE_MAXREXMTS + 1];
 };
 static struct tcp_syncache tcp_syncache;
+static TAILQ_HEAD(syncache_delete_list, syncache)  sc_delete_list;
 
 SYSCTL_NODE(_net_inet_tcp, OID_AUTO, syncache, CTLFLAG_RW, 0, "TCP SYN cache");
 
@@ -204,6 +206,9 @@
                            rt->rt_flags, NULL);
                RTFREE(rt);
        }
+#if defined(DIAGNOSTIC)
+       memset(sc, 0xee, sizeof(struct syncache));
+#endif
        zfree(tcp_syncache.zone, sc);
 }
 
@@ -258,6 +263,8 @@
        tcp_syncache.cache_limit -= 1;
        tcp_syncache.zone = zinit("syncache", sizeof(struct syncache),
            tcp_syncache.cache_limit, ZONE_INTERRUPT, 0);
+
+       TAILQ_INIT(&sc_delete_list);
 }
 
 static void
@@ -331,6 +338,18 @@
 
        s = splnet();
 
+       if ((sc->sc_flags & SCF_DELETE) == 0) {
+               sc->sc_flags |= SCF_DELETE;
+               syncache_delete = 1;
+               TAILQ_INSERT_TAIL(&sc_delete_list, sc, sc_delete);
+
+               splx(s);
+               return;
+       }
+       if (sc->sc_delete.tqe_next || sc->sc_delete.tqe_prev) {
+               TAILQ_REMOVE(&sc_delete_list, sc, sc_delete);
+       }
+
        TAILQ_REMOVE(&sch->sch_bucket, sc, sc_hash);
        sch->sch_length--;
        tcp_syncache.cache_count--;
@@ -359,6 +378,8 @@
        s = splnet();
        if (callout_pending(&tcp_syncache.tt_timerq[slot]) ||
            !callout_active(&tcp_syncache.tt_timerq[slot])) {
+               if (syncache_delete)
+                       goto delete_cleanup;
                splx(s);
                return;
        }
@@ -392,6 +413,17 @@
        if (nsc != NULL)
                callout_reset(&tcp_syncache.tt_timerq[slot],
                    nsc->sc_rxttime - ticks, syncache_timer, (void *)(slot));
+
+delete_cleanup:
+       sc = TAILQ_FIRST(&sc_delete_list);
+       while (sc != NULL) {
+               nsc = TAILQ_NEXT(sc, sc_delete);
+               syncache_drop(sc, NULL); 
+               sc = nsc;
+       }
+       TAILQ_INIT(&sc_delete_list);
+       syncache_delete = 0;
+
        splx(s);
 }
 
@@ -1335,6 +1367,7 @@
sc = zalloc(tcp_syncache.zone);
if (sc == NULL)
return (NULL);
+   bzero(sc, sizeof(*sc));
/*
 * Fill in the syncache values.
 * XXX duplicate code from syncache_add
Index: tcp_var.h
===
RCS file: /usr/cvs/src/sys/netinet/tcp_var.h,v
retrieving revision 1.56.2.12
diff -U3 -r1.56.2.12 tcp_var.h
--- tcp_var.h   24 Aug 2002 18:40:26 -  1.56.2.12
+++ tcp_var.h   1 Jul 2003 02:33:57 -
@@ -224,8 +224,10 @@
 #define SCF_CC         0x08                    /* negotiated CC */
 #define SCF_UNREACH    0x10                    /* icmp unreachable received */
 #define SCF_KEEPROUTE  0x20                    /* keep cloned route */
+#define SCF_DELETE     0x40                    /* I'm being deleted */
        TAILQ_ENTRY(syncache)   sc_hash;
        TAILQ_ENTRY(syncache)   sc_timerq;
+       TAILQ_ENTRY(syncache)   sc_delete;
 };
 
 struct syncache_head {


using memory after freed in tcp_syncache (syncache_timer())

2003-06-28 Thread Don Bowman
syncache_timer()
 ...
        /*
         * syncache_respond() may call back into the syncache to
         * to modify another entry, so do not obtain the next
         * entry on the timer chain until it has completed.
         */
        (void) syncache_respond(sc, NULL);
        nsc = TAILQ_NEXT(sc, sc_timerq);
        tcpstat.tcps_sc_retransmitted++;
        TAILQ_REMOVE(&tcp_syncache.timerq[slot], sc, sc_timerq);

What happens is that syncache_respond() calls ip_output(),
which ends up calling ip_input(), which ends up doing something
that causes 'sc' to be freed. With 'sc' freed, we return
to syncache_timer() and then use it in the nsc = TAILQ_NEXT(...)
line.

This particular part of the problem was introduced in
1.23 of tcp_syncache.c in response to another bug that I had
found.

Does anyone have a suggestion on a proper fix?


RE: using memory after freed in tcp_syncache (syncache_timer())

2003-06-28 Thread Don Bowman
From: Don Bowman
 ...
It appears this may also occur in syncache_add():
in this case, syncache_respond() alters the list.

        sc->sc_tp = tp;
        sc->sc_inp_gencnt = tp->t_inpcb->inp_gencnt;
        if (syncache_respond(sc, m) == 0) {
                s = splnet();
                TAILQ_REMOVE(&tcp_syncache.timerq[sc->sc_rxtslot],
                    sc, sc_timerq);
                SYNCACHE_TIMEOUT(sc, sc->sc_rxtslot);
                splx(s);
                tcpstat.tcps_sndacks++;
                tcpstat.tcps_sndtotal++;
        }
        *sop = NULL;


nested ipfw dummynet pipes

2003-06-20 Thread Don Bowman
Is there any way, in a bridging config, to have nested pipes?

In particular, what I would like to achieve is a rule that
allows e.g. 64kbps per host (src-mask 0x), but
with all these hosts inside an overall 10Mbps pipe. The idea
is that at some times of the day the pipe is less than
full, so everyone gets 64kbps, but at other times of the day
the pipe is full, and I don't want more than 10Mbps flowing.
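
As a rough sanity check on these numbers, assume the 10Mbps pipe is shared
evenly among active hosts (an idealisation; real dummynet behaviour depends
on queueing and traffic mix). The per-host rate is then the smaller of the
64kbps cap and the fair share. per_host_kbps is a made-up helper for
illustration:

```c
/*
 * Rate each host can get, in kbit/s, assuming the aggregate pipe is
 * divided evenly among the active hosts (an assumption, see above).
 */
static unsigned per_host_kbps(unsigned active_hosts,
                              unsigned host_cap_kbps,
                              unsigned total_cap_kbps)
{
    unsigned fair_share;

    if (active_hosts == 0)
        return 0;
    fair_share = total_cap_kbps / active_hosts;
    return fair_share < host_cap_kbps ? fair_share : host_cap_kbps;
}
```

With 64kbps per host and 10000kbps overall, the per-host cap holds for up
to 156 active hosts; beyond that the outer pipe becomes the limit.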

net.inet.ip.fw.one_pass looks to do what i want but:
Note: bridged and layer 2 packets coming out of a pipe are never
reinjected in the firewall irrespective of the value of this
variable.

suggests this is not the case.

Is there some technique using e.g. netgraph? Or can someone suggest
why the note is there and if it might be easily removed?

e.g. what i have is a system with 

   em0 -- em1
net.link.ether.bridge_cfg=em0 em1
net.link.ether.bridge=1
net.link.ether.bridge_ipfw=1
net.inet.ip.fw.one_pass=1

--don


RE: nested ipfw dummynet pipes

2003-06-20 Thread Don Bowman
From: Luigi Rizzo [mailto:[EMAIL PROTECTED]
 
 On Fri, Jun 20, 2003 at 01:41:21PM -0400, Don Bowman wrote:
  is there any way, in a bridging config, to have nested pipes?
 
 net.inet.ip.fw.one_pass=0 should do the job, i think the comment
 in the manpage is now incorrect and the code (in net/bridge.c)
 has been fixed (one-line) to implement this.
 
 Check the commit logs, i don't have them handy at the moment.


Thanks very much, I will check this. I assume this will be true
for IPFW2 rather than IPFW.

It appears that 1.16.2.23, nov 21 2002, RELENG_4 has this
from the log:
MFC: obey to fw_one_pass in bridge and layer 2 firewalling (the latter
only affects ipfw2 users).
Move fw_one_pass from ip_fw[2].c to ip_input.c to avoid depending on
IPFIREWALL.

I will try this out.



RE: nested ipfw dummynet pipes

2003-06-20 Thread Don Bowman
From: 'Luigi Rizzo' [mailto:[EMAIL PROTECTED]
 On Fri, Jun 20, 2003 at 02:18:17PM -0400, Don Bowman wrote:
 ...
  Thanks very much, I will check this. I assume this will be true
  for IPFW2 rather than IPFW.
 
 one_pass actually affects both.
 the comment in parentheses refers to layer 2 firewalling,
 which is an ipfw2-only feature (bridge firewalling
 is also available with ipfw1)

This works correctly, thanks very much. Attached is a trivial
patch to correct the man page.

Is there a benefit to having the single wide pipe first, or
the many narrow pipes first, in the ruleset?

$ cvs diff -U5 ipfw.8
Index: ipfw.8
===
RCS file: /usr/cvs/src/sbin/ipfw/ipfw.8,v
retrieving revision 1.63.2.28
diff -U5 -r1.63.2.28 ipfw.8
--- ipfw.8  30 Sep 2002 20:57:05 -  1.63.2.28
+++ ipfw.8  20 Jun 2003 18:49:02 -
@@ -1587,14 +1587,10 @@
 When set, the packet exiting from the
 .Xr dummynet 4
 pipe is not passed though the firewall again.
 Otherwise, after a pipe action, the packet is
 reinjected into the firewall at the next rule.
-.Pp
-Note: bridged and layer 2 packets coming out of a pipe
-are never reinjected in the firewall irrespective of the
-value of this variable.
 .It Em net.inet.ip.fw.verbose : No 1
 Enables verbose messages.
 .It Em net.inet.ip.fw.verbose_limit : No 0
 Limits the number of messages produced by a verbose firewall.
 .It Em net.link.ether.ipfw : No 0



RE: Spontan reboot of FreeBSD 4,x box

2003-05-29 Thread Don Bowman
 From: Dennis Pedersen [mailto:[EMAIL PROTECTED]
 
 I have a couple of FreeBSD 4,4 and one 4,7 that are beeing 
 used as firewalls
 in different locations.
 Lately i haven noticed that one of the firewall's was 
 starting to reboot at
 a certin time of the day (give or take maybe 10min).

The time it resets wouldn't correlate with the periodic run (e.g.
3am), would it?


RE: Spontan reboot of FreeBSD 4,x box

2003-05-29 Thread Don Bowman
Well, I would speculate that your /etc/periodic is
running at 3am doing things like looking for setuid files,
pruning /tmp, etc. That sparks up some disk activity, forks
a few processes, and walks the filesystem, which is tripping either
some bug you have in the kernel or bad memory. [I have a version
of memtest86 which can be loaded from 'loader' and placed on
a FreeBSD file system if you wish to try the bad memory theory
conveniently.]

I have a similar problem in 4.7 that occurs once in a while
at 3:01am and seems to randomly corrupt memory. I've been
chasing it for a while but it hasn't been reproducible enough
to find.

This is pure speculation.

man 8 periodic
see /etc/periodic.conf

 -Original Message-
 From: Dennis Pedersen [mailto:[EMAIL PROTECTED]
 Sent: May 28, 2003 16:46
 To: Don Bowman; [EMAIL PROTECTED]
 Subject: Re: Spontan reboot of FreeBSD 4,x box
 
 
 
 - Original Message -
 From: Don Bowman [EMAIL PROTECTED]
 To: 'Dennis Pedersen' [EMAIL PROTECTED]; 
 [EMAIL PROTECTED]
 Sent: Wednesday, May 28, 2003 3:56 PM
 Subject: RE: Spontan reboot of FreeBSD 4,x box
 
 
   From: Dennis Pedersen [mailto:[EMAIL PROTECTED]
  
   I have a couple of FreeBSD 4,4 and one 4,7 that are beeing
   used as firewalls
   in different locations.
   Lately i haven noticed that one of the firewall's was
   starting to reboot at
   a certin time of the day (give or take maybe 10min).
 
  The time it resets wouldn't correlate to the periodic (e.g.
  3am) would it?
 
 On one of the box´s that fits yeah..
 What am i missing?
 cron_enable is set to no in rc.conf and the cron deamon isnt running?
 
 
 Regards,
 Dennis
 


RE: A problem with too many network interfaces

2003-05-27 Thread Don Bowman
 From: Garrett Wollman [mailto:[EMAIL PROTECTED]
 On Mon, 26 May 2003 14:04:19 -0700 (PDT), 
 =?ISO-8859-1?Q?Mikko_Ty=F6l=E4j=E4rvi?= [EMAIL PROTECTED] said:
 
  A proper BSD port could use something like the trick in 
 Stevens[1] and
  keep retrying the call with a larger bufer until the length of the
  result is the same as in the previous call.
 
 Actually, a proper BSD port would use the net.route.iflist sysctl
 instead.
 
 -GAWollman


$ uname -sr
FreeBSD 4.6-RC

$ sysctl net.route
sysctl: unknown oid 'net.route'

Since ports have to work against branches other than -CURRENT,
I think this would be difficult to support?

--don


RE: Source ip route lookup on incoming packets?

2003-02-28 Thread Don Bowman
 From: Sten Daniel Sørsdal [mailto:[EMAIL PROTECTED]

 On Thu, Feb 27, 2003 at 02:02:53PM +0100, Sten Daniel S?rsdal wrote:
   What i am looking for is a feature that basically 
 prevents spoofing by looking
   the route for the source and match the incoming interface. 
   A firewall solves the problem but adds alot of 
 administrative overhead and 
   leaves room for error.
 Check the net.inet.ip.check_interface sysctl.
 It may be what you're looking for.
 BMS
 
 Thank you for your reply!
 
 I havent had a clear explanation of that one (tried the RFC too).
 But does this one really stop spoofing for routed packets as well?
 
 I got some border routers running BGP - three of which have 
 full internet feed.
 Would this block spoofed packets from my network and would it block
 incoming source IPs that come from nonexistant networks?

I think the routers would need to have egress filtering enabled,
which isn't all that commonly done.

http://www-users.rwth-aachen.de/jens.hektor/security/cisco-acl.html

for example.

--don

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-net in the body of the message


3COM 3C996-SX (bge) fibre support?

2003-01-14 Thread Don Bowman

I see in the cvs comments that this card is supported (1.11 of if_bge.c).
The relevant change seems to be:
+       /*
+        * Figure out what sort of media we have by checking the
+        * hardware config word in the EEPROM. Note: on some BCM5700
+        * cards, this value appears to be unset. If that's the
+        * case, we have to rely on identifying the NIC by its PCI
+        * subsystem ID, as we do below for the SysKonnect SK-9D41.
+        */
+       bge_read_eeprom(sc, (caddr_t)&hwcfg,
+           BGE_EE_HWCFG_OFFSET, sizeof(hwcfg));
+       if ((ntohl(hwcfg) & BGE_HWCFG_MEDIA) == BGE_MEDIA_FIBER)
+               sc->bge_tbi = 1;

Sadly, I have a phy-id of 0, so I think I have to use the hackish
method the SK... uses, just below it:
        /* The SysKonnect SK-9D41 is a 1000baseSX card. */
        if ((pci_read_config(dev, BGE_PCI_SUBSYS, 4) >> 16) ==
            SK_SUBSYSID_9D41)
                sc->bge_tbi = 1;

I have the subsystem IDs etc. (side note: there's a bug in the above code,
it should check the vendor ID as well):
PCI sub-devid 0x1004 PCI sub-vid 0x10b7

So I added a line of the SK_... type above, to set the 'bge_tbi' to
one for my 1000baseSX card.
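
A check that looks at both halves of the subsystem register might look like
the sketch below. It assumes the standard PCI layout (subsystem vendor ID in
the low 16 bits of the 32-bit word at config offset 0x2c, subsystem ID in
the high 16 bits) and uses the IDs reported above. The macro and function
names are invented for illustration and are not from if_bge.c.

```c
#include <stdint.h>

#define SUBVID_3COM       0x10b7    /* sub-vid reported above */
#define SUBDEVID_3C996SX  0x1004    /* sub-devid reported above */

/*
 * 'subsys' is the 32-bit value read from PCI config offset 0x2c:
 * subsystem vendor ID in the low half, subsystem ID in the high half.
 */
static int is_3c996_sx(uint32_t subsys)
{
    uint16_t subvid = (uint16_t)(subsys & 0xffff);
    uint16_t subdev = (uint16_t)(subsys >> 16);

    return subvid == SUBVID_3COM && subdev == SUBDEVID_3C996SX;
}
```

Comparing only the upper 16 bits, as the SK-9D41 test does, would also
match another vendor's board that happened to reuse the same subsystem ID.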

However, I see this interface 'flapping': I get snowed with messages
on my console about 'link up' (but never link down). I tried 
forcing the media & mediaopts to 1000Mbps and full-duplex.
The other end of the link sees nothing (no link).

Anyone have a suggestion on where to start? I suspect this is related
to the comment (in 1.10) about "One thing that confuses me
still is that the 'link state change' bit in the status block seems
to change state an awful lot."

--don




RE: Redundant NIC/Connections

2003-01-02 Thread Don Bowman
 From: Jonathan Disher [mailto:[EMAIL PROTECTED]]
 On Wed, 1 Jan 2003, David J Duchscher wrote:
 
   I was wondering how people are handling redundant 
 connections?  We
   would like to have dual NICs in the FreeBSD box with each NIC
   connected
   to a different switch.  Both switches are in the same broadcast
   domain.
   In pointers, hints on this may done would be greatly 
 appreciated.
  
   I think one of my colleagues responded directly to the 
 poster. We do
   it by a daemon he wrote that monitors interface link 
 status, and also
   pingability of default gateways, and reconfigures 
 interfaces in event
   of a failure, based on the normal configuration file settings
   (/etc/rc.conf)
  
   Instantly in event of link loss; after a few seconds of 
 retrying in
   event of router loss (we use HSRP addresses for routers.)
 
  Yes, I got a few responses including the one mentioned 
 above.  Lack of
  time and changing priorities has prevented me from following up on
  them.  Anshuman Kanwar did mentioned he might release his 
 solution as
  open source if there was interest.  I am at least interested in
  reviewing it.  I just need to find the time.
 
 I would also be very interested in this.  We could write our 
 own, but I'd
 much rather burn that time working on other projects ;-).
 
 Did Anshuman happen to mention what it's written in?  Perl? C? other?

I wonder if VRRP (http://www.bsdshell.net/hut_fvrrpd.html) can help
here. It's available as a port.





RE: Broadcom BCM5703X Gigabit Ethernet woes, panics, no MIIs, oh my!

2002-12-31 Thread Don Bowman
From: George J.V. Cox [mailto:[EMAIL PROTECTED]]

 I have a Dell 1655MC blade server, and a compiled-this-week 
 4.7-STABLE kernel.
 The hardware is a chassis of 6 PCs in a 3U case.  Each blade 
 has two Broadcom
 BCM5703 interfaces.  Unfortunately, its behaviour is rather 
 non-deterministic.  

 ...

I'm seeing similar behaviour with a 5704 (dual gmac). I will 
let you know if I find a fix for it. I'm suspecting the
timing on the eeprom interface right now since I sometimes
get a MAC of 0.





struct inpcb, INET6

2002-12-14 Thread Don Bowman

Is there a reason that struct inpcb doesn't have
an #ifdef INET6 around
        struct {
                /* IP options */
                struct  mbuf *inp6_options;
                /* IP6 options for outgoing packets */
                struct  ip6_pktopts *inp6_outputopts;
                /* IP multicast options */
                struct  ip6_moptions *inp6_moptions;
                /* ICMPv6 code type filter */
                struct  icmp6_filter *inp6_icmp6filt;
                /* IPV6_CHECKSUM setsockopt */
                int     inp6_cksum;
                u_short inp6_ifindex;
                short   inp6_hops;
                u_int8_t inp6_hlim;
        } inp_depend6;

? It's 25 bytes per connection.
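
The 25-byte figure can be checked by adding up the members, assuming 4-byte
pointers as on i386. In the sketch below uint32_t stands in for each
pointer, and inp_depend6_model is an illustrative copy, not the kernel
definition.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of inp_depend6 with 32-bit pointers; not the kernel struct. */
struct inp_depend6_model {
    uint32_t inp6_options;      /* struct mbuf * */
    uint32_t inp6_outputopts;   /* struct ip6_pktopts * */
    uint32_t inp6_moptions;     /* struct ip6_moptions * */
    uint32_t inp6_icmp6filt;    /* struct icmp6_filter * */
    int32_t  inp6_cksum;
    uint16_t inp6_ifindex;
    int16_t  inp6_hops;
    uint8_t  inp6_hlim;
};

/* Sum of the member sizes, ignoring any padding the compiler adds. */
static size_t inp_depend6_raw_bytes(void)
{
    return 4 * sizeof(uint32_t) + sizeof(int32_t) +
           sizeof(uint16_t) + sizeof(int16_t) + sizeof(uint8_t);
}
```

The members sum to 25 bytes. On common ABIs the compiler pads the struct
to a multiple of 4, so sizeof(struct inp_depend6_model) would typically
report 28.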

--don ([EMAIL PROTECTED] www.sandvine.com)




RE: SO_DONTROUTE, arp's, ipfw fwd, etc

2002-12-04 Thread Don Bowman
 From: Don Bowman [mailto:[EMAIL PROTECTED]]
 I have a setup where I have a transparent proxy using ipfw fwd (to
 localhost).
 Data is sent to this device using a MAC rewrite so that 
 packets arrive with
 my MAC, but the original source and destination IP.
 When I receive the SYN, i accept the connection, which causes an ARP
 to be emitted for the source address, and then the SYN/ACK.

I didn't get much response from this, so I'm going to re-phrase.

Is there any reason that I shouldn't modify the TCP passive accept
so that it remembers both the MAC address of the sender, and the
interface the packet came in on? By doing so, I will avoid
having to issue an ARP for each incoming connection (which adds
latency, and more importantly for me, breaks the ability to use
ipfw 'fwd' rules the way I want). [This is with FreeBSD 4.7 if
it matters].

What's happening is I have >1 router feeding me sessions which
I'm transparently proxying (e.g. squid).
Obviously I can't have a default route back to each of them.

So I have something like:

[Router1]---\
             \
[Router2]-----[BSD]
             /
[Router3]---/

This is done with a layer-2 mac rewrite, ie the router takes the packet,
doesn't modify the IP header, but changes the destination MAC to
be that of the BSD machine.

So, e.g., a packet comes into router1 above (from somewhere on its
left-hand side). It may have IPsrc=1.0.0.1, IPdst=2.0.0.1.
It then arrives at the BSD machine, which will cheerfully say, yup,
I'm 2.0.0.1 (using the beauty of 'ipfw fwd localhost...').
Problem is, it then wants to send a SYN/ACK, but there's no route,
so nowhere to go. I can't make the route be one of those routers,
and the routing tables are too complicated to install (since there
may be BGP on the left of them, etc, etc). It's important for
me that the response packets go back through the same path (to avoid
reordering etc).

The next step for me is to use a separate VLAN from each of those
routers to the BSD box (so that the packets appear to come from different
interfaces). I'd like to memorize the interface the packet came in,
and the mac header to use, and just use that without making an enormous
arp table, and going back to the place the SYN came from.

Is there a reason it doesn't work this way currently (before I dive
in and make changes)?
If I were to change it to work the way I want, would other people 
be interested?
Would this be interesting as a wholesale change in behaviour, or as
a sysctl-changeable or #ifdef-settable option?

Comments greatly appreciated.




RE: SO_DONTROUTE, arp's, ipfw fwd, etc

2002-12-04 Thread Don Bowman
 -Original Message-
 From: Chuck Swiger [mailto:[EMAIL PROTECTED]]
 On Wednesday, December 4, 2002, at 03:20  PM, Don Bowman wrote:
 
  What's happening is I have 1 router feeding me sessions which
  I'm transparently proxying (e.g. squid).
  Obviously I can't have a default route back to each of them.
 
  So I have something like:
 
  [Router1]---\
   \
  [Router2][BSD]
   /
  [Router3]---/
 
  This is done with a layer-2 mac rewrite, ie the router 
 takes the packet,
  doesn't modify the IP header, but changes the destination MAC to
  be that of the BSD machine.
 
 You can't have more than one default route, but you certainly 
 can have 
 several static or dynamic routes to select the appropriate 
 router to send 
 responses back.  You could also look into policy-based routing or 
 multihoming the connections, but I guess that depends on what 
 you're doing.
 
   I can't make the route be one of those routers,
   and the routing tables are too complicated to install (since there
   may be BGP on the left of them, etc, etc). Its important for
   me the response packets go back through the same path (to avoid
   reordering etc).
 
 What happens if incoming traffic comes via more than one router at a 
 time-- how should your system decide which path to send 
 replies back?  
 Based on the source IP?

These are isp-sized routers (complicated networks with different
peering points to other networks). Static routes don't work since
they are much too dynamic. Additionally, the widget which is
picking the traffic to send (like Cisco WCCP) is load-balancing,
so there's another striping of data going on.

I'd like to just send it back to the router it came from.
I won't have a single TCP session come from more than one router,
but will have the same source or destination IP come from the different
routers concurrently.

I'm not sure what you mean by policy-based routing. If it's the same
thing as on a router, then it's not appropriate since it will be
based on IP.

In the example diagram above, I might have a case where host 'A'
sends host 'B' two concurrent TCP sessions. These will both transparently
arrive @ the BSD box, one via router1, one via router2. Triangulation
breaks the application, so A-B(session1) needs to always flow via
the same router it started on.

I'm thinking this is achieved by just caching the interface and destination
MAC etc in the PCB for the TCP session. It does this anyway once it's
finished sending the SYN/ACK, it's just that it follows routing rules and
ARPs for the SYN/ACK.

This is a common application for e.g. Squid when being fed by more
than one router.
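
[Editor's sketch] The idea of remembering the ingress interface and sender MAC
per TCP session can be illustrated in user-space C. This is not the FreeBSD
PCB code; the table size, the names flow_learn/flow_lookup, and the struct
layout are all hypothetical, chosen only to show that two sessions between the
same pair of hosts can keep different return paths.

```c
#include <stdint.h>
#include <string.h>

#define FLOW_SLOTS 256  /* hypothetical table size, must be a power of two */

/* TCP 4-tuple identifying one proxied session. */
struct flow_key {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
};

/* Return path learned from the SYN: ingress interface and sender MAC. */
struct flow_entry {
    struct flow_key key;
    uint8_t peer_mac[6];
    int ifindex;
    int in_use;
};

static struct flow_entry flow_table[FLOW_SLOTS];

static int key_equal(const struct flow_key *a, const struct flow_key *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port;
}

static unsigned flow_hash(const struct flow_key *k)
{
    uint32_t h = k->src_ip ^ k->dst_ip ^
                 (((uint32_t)k->src_port << 16) | k->dst_port);
    h ^= h >> 16;
    h ^= h >> 8;
    return h & (FLOW_SLOTS - 1);
}

/* Record the return path when the SYN arrives. Returns 0, or -1 if full. */
int flow_learn(const struct flow_key *k, const uint8_t mac[6], int ifindex)
{
    unsigned start = flow_hash(k);
    for (unsigned i = 0; i < FLOW_SLOTS; i++) {
        struct flow_entry *e = &flow_table[(start + i) & (FLOW_SLOTS - 1)];
        if (!e->in_use || key_equal(&e->key, k)) {
            e->key = *k;
            memcpy(e->peer_mac, mac, 6);
            e->ifindex = ifindex;
            e->in_use = 1;
            return 0;
        }
    }
    return -1;
}

/* Find the return path for an outgoing segment, or NULL if unknown. */
const struct flow_entry *flow_lookup(const struct flow_key *k)
{
    unsigned start = flow_hash(k);
    for (unsigned i = 0; i < FLOW_SLOTS; i++) {
        const struct flow_entry *e =
            &flow_table[(start + i) & (FLOW_SLOTS - 1)];
        if (!e->in_use)
            return NULL;        /* linear probing: empty slot ends the chain */
        if (key_equal(&e->key, k))
            return e;
    }
    return NULL;
}
```

Because the key is the full 4-tuple rather than the source IP, host A can have
one session returning via router1 and another via router2 at the same time.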





RE: SO_DONTROUTE, arp's, ipfw fwd, etc

2002-12-04 Thread Don Bowman
From: Julian Elischer [mailto:[EMAIL PROTECTED]]
 The ARP is issued because the TCP stack is responding to the 
 SYN packet with its own SYN, but it doesn't have a route to the
 original source, so it creates one, as it's local. This means that it
 allocates an ARP entry for it, which in turn causes an ARP
 request to be sent. The response will result in the SYN being
 transmitted. This is all pretty normal. There will not be another
 ARP sent for 18 minutes for that host. The question is:
 
 Why does it think the source is local? are the routers below 
 doing proxy
 arp? Did you give your interface a netmask of 0.0.0.0?
 
 Who responds to the arp?

It's a layer-2 MAC rewrite, so it arrives on a local segment, but
subnetting rules don't apply.
No-one responds to the ARP, hence my problem :)

I know what it's doing now is normal, it's just that it doesn't work
in my configuration (which isn't typical).

The interface in question has no IP or netmask (or at least, I would
like it to not have one, it's not needed).

You COULD write a netgraph node that adds routes as it receives packets.
In fact it could keep its own cache of IP/MAC mappings 
and switch the MACs appropriately on outgoing packets.
Possibly adding routes would be best.

It would identify the source from the src MAC address, and 
add the appropriate entry to the routing table,
a bit like a learning bridge.

I'm not sure I can write a route-rule for a connection since I could
have a different path back to the same IP for a different TCP
connection. Thus my idea just to let the PCB take care of it.

if there is bgp to the left, you could make this machine take part..
do the routers do bgp?

Not in all cases :(

Is there a reason that return routes are not added every time
a packet is received? Well, yes. For a start it may not be what everyone
wants.  I have made great use of asymmetrical routing many times
(e.g. some satellite internet connections are via modem for outgoing
and via the satellite for incoming.)

OK, I understand. So if I make this change, it would only be useful
if it were not the default / disableable. Perhaps it would be a
socket option on the listen() socket... Similar to the SO_DONTROUTE
I guess. Maybe that is what SO_DONTROUTE should mean for listen()?

This is only an issue for passively accepted connections.

This issue comes about due to the way WCCP works with its hashing
buckets and with multiple routers feeding multiple caching servers:
the routers load balance across caches (so each will distribute the
source addresses on its left to more than one cache).

--don ([EMAIL PROTECTED] www.sandvine.com)




RE: SO_DONTROUTE, arp's, ipfw fwd, etc

2002-12-04 Thread Don Bowman
 From: Julian Elischer [mailto:[EMAIL PROTECTED]]
 On Wed, 4 Dec 2002, Don Bowman wrote:
   Why does it think the source is local? are the routers below 
   doing proxy
   arp? Did you give your interface a netmask of 0.0.0.0?
   
   Who responds to the arp?
  
  Its a layer-2 MAC rewrite, so it arrives on a local segment, but
  subnetting rules don't apply.
  No-one responds to the ARP, hence my problem :)
 
 Someone must be responding, because the SYN is eventually sent.

Ah, it's working currently with a single router. Adding the 2nd router
is breaking it. I currently have a default route back to the first
router. Adding the 2nd router, the back-path always goes through
the first router, which gets confused. (I'm using the term router,
but it's actually a content switching device operating @ layer 4,
like Cisco WCCP or Cisco CSM or Nortel Alteon).

 Here's my suggestion:
 
 write a netgraph node that does all the MAC rewriting.
 Code from the ng_bridge node would be useful.
 attach it to a ng_iface node.
 make the netgraph iface the default route. 
 (route add default -iface ng0)

Let me chew on that for a bit. I'm not sure where it would get the
destination mac from, wouldn't it have to cache the information
the PCB is holding? Wouldn't it be more efficient for me to 
just create the ether-header when the SYN comes in, store it
in the PCB, and use that on each outgoing packet for that tcp
connection, add a sockopt (or use SO_DONTROUTE for this on the
listen socket)?

Thanks for the great suggestions, keep them coming :)

--don ([EMAIL PROTECTED] www.sandvine.com)




RE: SO_DONTROUTE, arp's, ipfw fwd, etc

2002-12-04 Thread Don Bowman
 From: Chuck Swiger [mailto:[EMAIL PROTECTED]]
 On Wednesday, December 4, 2002, at 03:37  PM, Don Bowman wrote:
 [ ... ]
   These are isp-sized routers (complicated networks with different
  peering points to other networks). Static routes don't work since
  they are much too dynamic. Additionally, the widget which is
  picking the traffic to send (like Cisco WCCP) is load-balancing,
  so there's another striping of data going on.
 
 Yes, but the complicated internal routes maintained within 
 those networks 
 isn't your problem if your machine or network isn't BGP 
 peering with them.

It is in the sense that I have to figure out which one to send
data back to. More than one of them may 'own' a source address
at a given time (for a TCP session).

 
  In the example diagram above, I might have a case where host 'A'
  sends host 'B' two concurrent TCP sessions. These will both 
 transparently
  arrive @ the BSD box, one via router1, one via router2. 
 Triangulation
  breaks the application, so A-B(session1) needs to always flow via
  the same router it started on.
 
 Why?  This sounds like a pretty classic example of A being on 
 a multihomed 
 network, and you should let IP-level routing deal with the 
 problem.  But 
 there are alternatives, I guess-- maybe try putting a buncha 
 interfaces on 
 the BSD box, one for each router being connected to it, and 
 put each pair 
 on their own /30.  That way, the BSD box can quite easily return the 
 traffic back to the originating router

Only if it's routing, not for L2 redirection.

 
  I'm thinking this is achieved by just caching the interface and
  destination
  MAC etc in the PCB for the TCP session. It does this anyway once its
  finished sending the SYN/ACK, its just that it follows 
 routing rules and
  ARP's for the SYN/ACK.
 
 Yes.  Pretending machines which are on remote networks are 
 local can be 
 done by re-writing MAC addresses, but that can be achieved by 
 NAT or VPN 
 solutions as well.  Why are you trying to override normal 
 routing behavior 
 when you probably can use it to help solve the problem?

This is a transparent proxy. The proxy needs to know where the
real destination was (in case it needs to open a connection there).
The HTTP protocol solved this by putting the real-ip address in the
header, but most other protocols didn't.
I don't have control of the content switching routers which feeds
this. They work the way they do.

Say for the sake of example you wished to load balance 2 farms 
of telnet servers. You had a device which picked off port 23,
and sent it to you without alterations. You would then look @ the
intended destination address, and pick the right group of telnet
servers, and send the data there. Now say that those devices themselves
were load-balanced. So if a user telneted twice to the same destination,
one path might go through the first redirector, and one through the
2nd. The path back is based on the path it came in.
  [client]
|
  --
  | Load Balancer  |
  --
   |   |
   |   |
  [Redirector1] [Redirector2]
\ /
 \   /
 -
 ||
   [BSD1]   [BSD2]
 ||
 -
  | | | | | | | | | |
Telnet servers(A)   Telnet (B)

So in this case, [client] sends a SYN to port 23 on the virtual address
of telnet(A). The load balancer sends this (and all other traffic)
arbitrarily to Redirector1 or 2. These devices say, Aha!, port 23, let
me use this clever policy based route, and just rewrite the destination
MAC to be either BSD1 or BSD2 (based on some feedback on their load,
availability, etc). BSD1 and 2 have a rule like:
 ipfw fwd localhost,9000 tcp from any to any recv bge0 23
and then on localhost:9000 have listening a clever little app that does:
 accept(), look @ intended destination IP, pick a telnet server in
the farm it so addresses, connect, and then proxy the accepted() connection
to the actively initiated one.
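
[Editor's sketch] The "look @ intended destination IP, pick a telnet server"
step might look roughly like the following. It assumes that, for a connection
diverted by 'ipfw fwd' to a local port, getsockname() on the accepted socket
reports the address the client originally aimed at; verify that on your
release. The farm enum and the two virtual addresses are hypothetical.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical farm identifiers for this sketch. */
enum farm { FARM_NONE = 0, FARM_A, FARM_B };

/* Hypothetical virtual addresses the clients connect to (host byte order). */
#define VIP_FARM_A 0x0a000001u  /* 10.0.0.1 */
#define VIP_FARM_B 0x0a000002u  /* 10.0.0.2 */

/*
 * Fetch the local address of an accepted socket. Under the assumption
 * stated above, this is the destination the client intended.
 * Returns 0 on success, -1 with errno set on failure.
 */
int intended_destination(int fd, struct sockaddr_in *dst)
{
    socklen_t len = sizeof(*dst);
    return getsockname(fd, (struct sockaddr *)dst, &len);
}

/* Map the intended destination to the server farm that backs it. */
enum farm pick_farm(const struct sockaddr_in *dst)
{
    switch (ntohl(dst->sin_addr.s_addr)) {
    case VIP_FARM_A:
        return FARM_A;
    case VIP_FARM_B:
        return FARM_B;
    default:
        return FARM_NONE;
    }
}
```

After accept(), the proxy would call intended_destination(), pass the result
to pick_farm(), connect() to a server in that farm, and relay data between the
two sockets.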

Now, BSD1 / 2 can't use Redirector1/2 as a default route, since they
will be treating them as equals. One of them sent the SYN packet,
I'd love the SYN/ACK to go back to the same one. I know the MAC it
came from, that's where the response should go.

Making it all layer 3 doesn't help me, then I don't have the intended
destination address. Additionally I have the problem that if I have
two routers on my net, and one sends me traffic, I can only respond
to it if its my default route, or if I have a static route for an
IP behind it. Maybe those routers both lead to the same locations?

I can't really use a VPN (GRE etc) tunnel since then I'll have to fragment,
and I'd prefer to avoid that. My first thought

RE: SO_DONTROUTE, arp's, ipfw fwd, etc

2002-12-04 Thread Don Bowman
 From: Don Bowman [mailto:[EMAIL PROTECTED]]
 I have a setup where I have a transparent proxy using ipfw fwd (to
 localhost).
 Data is sent to this device using a MAC rewrite so that 
 packets arrive with
 my MAC, but the original source and destination IP.
 When I receive the SYN, i accept the connection, which causes an ARP
 to be emitted for the source address, and then the SYN/ACK.

I didn't get much response from this, so I'm going to re-phrase.

Is there any reason that I shouldn't modify the TCP passive accept
so that it remembers both the MAC address of the sender, and the
interface the packet came in on? By doing so, I will avoid
having to issue an ARP for each incoming connection (which adds
latency, and more importantly for me, breaks the ability to use
ipfw 'fwd' rules the way I want). [This is with FreeBSD 4.7 if
it matters].

What's happening is I have >1 router feeding me sessions which
I'm transparently proxying (e.g. squid).
Obviously I can't have a default route back to each of them.

So I have something like:

[Router1]---\
             \
[Router2]-----[BSD]
             /
[Router3]---/

This is done with a layer-2 mac rewrite, ie the router takes the packet,
doesn't modify the IP header, but changes the destination MAC to
be that of the BSD machine.

So, e.g, a packet comes into router1 above (from somewhere on its
left hand side). It may have IPsrc=1.0.0.1, IPdst=2.0.0.1.
It then arrives @ the BSD machine, which will cheerfully say, yup,
I'm 2.0.0.1 (using the beauty of 'ipfw fwd localhost...').
Problem is, it then wants to send a SYN/ACK, there's no route,
so no where to go. I can't make the route be one of those routers,
and the routing tables are too complicated to install (since there
may be BGP on the left of them, etc, etc). Its important for
me the response packets go back through the same path (to avoid
reordering etc).

The next step for me is to use a separate VLAN from each of those
routers to the BSD box (so that the packets appear to come from different
interfaces). I'd like to memorize the interface the packet came in,
and the mac header to use, and just use that without making an enormous
arp table, and going back to the place the SYN came from.

Is there a reason it doesn't work this way currently (before I dive
in and make changes).
If I were to change it to work the way I want, would other people 
be interested?
Would this be interesting as a whole-sale change in behaviour, or as
a sysctl-changeable or #ifdef settable?

Comments greatly appreciated.




RE: SO_DONTROUTE, arp's, ipfw fwd, etc

2002-12-04 Thread Don Bowman
 From: Julian Elischer [mailto:[EMAIL PROTECTED]]
 On Wed, 4 Dec 2002, Don Bowman wrote:
 ...

 It gets the destination MAC address from the SRC MAC field of the
 preceding incoming packets with that IP src, dst and port
 combination i.e. the node would look within the IP header.
 
 
  Wouldn't it be more efficient for me to 
  just create the ether-header when the SYN comes in, store it
  in the PCB, and use that on each outgoing packet for that tcp
  connection, add a sockopt (or use SO_DONTROUTE for this on the
  listen socket)?
 
 yes and no... you would be breaking the layering in 
 the standard code and you'd get crucified for it.
 
 start with the ng_bridge node and make it look within
 the IP header and use that information in its hash tables instead of 
 MAC addresses. It'll need some housekeeping code too.
 (to flush old info, though you could reduce this by removing
 entries when you see the FIN packets go past.)

Perhaps I can do this within ipfw? It's only ipfw that is bringing up
this situation, making me respond to things that normally wouldn't
be routed to me. Perhaps 'ipfw' is missing something when it does
a 'fwd' to localhost, another step to make this all work?

FINs are pretty rare :) Too often things just shut off. I'm nervous
about trying to cache the info outside the PCB since it has to
stay in sync (it's not like the ARP cache, there's no way to get
the info back if you drop it early).
RST is even more problematic since I have to decide if it's in-window.

--don ([EMAIL PROTECTED] www.sandvine.com)




SO_DONTROUTE, arp's, ipfw fwd, etc

2002-12-02 Thread Don Bowman

I have a setup where I have a transparent proxy using ipfw fwd (to
localhost).
Data is sent to this device using a MAC rewrite so that packets arrive with
my MAC, but the original source and destination IP.
When I receive the SYN, i accept the connection, which causes an ARP
to be emitted for the source address, and then the SYN/ACK.

Now, I would like to have my default route not be on the 'data' interface
which has the ipfw rule. It seems like this would work if:

a) the MAC address for the source address (the router which sent me
the packet) was entered into the ARP cache automatically when the SYN
was received.
b) I used SO_DONTROUTE in my proxy application.

Does anybody have any comments on that? Is there a reason that learning
ARP entries isn't done passively?

I assume that since the receive interface is cached in the syncache,
and then proxied through to the PCB, that the SO_DONTROUTE will cause
the return packets to go back through that same interface?
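
[Editor's sketch] Setting SO_DONTROUTE from the proxy is a one-line
setsockopt() call. The helper names below are hypothetical. Whether the option
is inherited by sockets returned from accept(), and whether it actually steers
replies out of the receiving interface as hoped above, are open questions this
sketch does not answer.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Turn SO_DONTROUTE on or off for a socket, asking the stack to bypass
 * the routing table for its output.
 * Returns 0 on success, -1 with errno set on failure.
 */
int set_dontroute(int fd, int on)
{
    return setsockopt(fd, SOL_SOCKET, SO_DONTROUTE, &on, sizeof(on));
}

/* Read the option back: 1 if set, 0 if clear, -1 on error. */
int get_dontroute(int fd)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(fd, SOL_SOCKET, SO_DONTROUTE, &value, &len) < 0)
        return -1;
    return value != 0;
}
```

A proxy would typically call set_dontroute(listen_fd, 1) before listen(), then
check get_dontroute() on an accepted socket to see what was inherited.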

Is there a simpler way?

--don ([EMAIL PROTECTED] www.sandvine.com)




IPFW question with options and fwd rule

2002-11-26 Thread Don Bowman

If I create a rule to 'fwd' packets with a particular TCP option 
set (or IP option) to a specific local port, and then I accept
on that port, will subsequent packets without that option work?

ie, I have this:

100 fwd localhost,9000 tcp from any to any 1234 tcpoptions ts recv interface

SYN (TCP option SACK=1), Dest port=, Dest ip = random-host
SYN/ACK
ACK (no TCP options)

Will the first SYN reach me? (Yes, I think: even though the IP is not mine
and the dest port is not mine, the ipfw fwd magic takes care of it.)
Will the ACK from the client reach me? (The dest IP is not mine, so will the
stack discard it, or will the already created PCB take care of this?)

I'd like to carry on a normal TCP conversation, but select the local port
that terminates it based on a TCP option. The destination IP will be
somewhere else (it's a transparent proxy application).

Thanks in advance.

--don ([EMAIL PROTECTED] www.sandvine.com)




RE: IPFW question with options and fwd rule

2002-11-26 Thread Don Bowman
 From: Julian Elischer [mailto:[EMAIL PROTECTED]]
 On Tue, 26 Nov 2002, Don Bowman wrote:
 
  
  If I create a rule to 'fwd' packets with a particular TCP option 
  set (or IP option) to a specific local port, and then I accept
  on that port, will subsequent packets without that option work?
  
 ...
 well, no, because  != 1234 :-)
 but, assuming that your rule said , then it would only 
 reach you if
 it has the ts option set.
 
 to be forwarded a packet must match the rule..
 subsequent packets must ALSO match the rule.

Sigh, I guess TANSTAAFL shows true. I was hoping once the PCB was setup
that it could act like some sort of packet attractor. Or in other words,
to get the packet stream to play follow the leader on the syn.
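
[Editor's sketch] The point that every packet is matched against the rule on
its own, with no memory of the SYN, can be shown with a toy model. This is not
ipfw's implementation; the structs and apply_fwd() are hypothetical and model
only the destination port and the presence of the TCP timestamp option.

```c
#include <stdint.h>

/* Just enough of a packet for this illustration. */
struct pkt {
    uint16_t dst_port;
    int has_ts_option;      /* nonzero if the TCP timestamp option is present */
};

/* A 'fwd' rule that matches on destination port and, optionally, 'ts'. */
struct fwd_rule {
    uint16_t match_dst_port;
    int require_ts;
    uint16_t fwd_port;      /* local port to divert matching packets to */
};

/*
 * Evaluate one packet against the rule, statelessly.
 * Returns the local port to divert to, or 0 if the packet falls through.
 */
uint16_t apply_fwd(const struct fwd_rule *r, const struct pkt *p)
{
    if (p->dst_port != r->match_dst_port)
        return 0;
    if (r->require_ts && !p->has_ts_option)
        return 0;
    return r->fwd_port;
}
```

With a rule requiring 'ts', a SYN carrying the option is diverted, but a later
ACK without it is not, so it never reaches the local socket.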

--don ([EMAIL PROTECTED] www.sandvine.com)




RE: bge bug w/ out of bounds return receiver, staying in rxeof all the time, patch

2002-11-22 Thread Don Bowman
 From: John Polstra [mailto:[EMAIL PROTECTED]]
 In article 184f01c291c9$147e7100$[EMAIL PROTECTED],
 Sam Leffler [EMAIL PROTECTED] wrote:
   I would recommend a committer look this over and 
   commit it. If you wish, I can make the patch *just*
   be the change (changing the 16-bit to 32-bit writes,
   without the VPD stuff), but the other changes seemed
   generally useful.
  
  Please whittle the patch down to just the bug fix; 5.0 is 
 in code freeze.
 
 Don't worry, Sam.  I'm planning to shepherd this stuff into the
 tree, but I don't see it happening for 5.0.
 

Be aware that the bge driver is not too useful (and quite dangerous)
without this change.

Personally I'd like to see it go in 4.8.

--don ([EMAIL PROTECTED] www.sandvine.com)






RE: Sockets and changing IP addresses

2002-11-21 Thread Don Bowman
 From: Wes Peters [mailto:[EMAIL PROTECTED]]
 Archie Cobbs wrote:
  
  I'm curious what -net's opinion is on PR kern/38544:
  
  http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/38554
  
  In summary: if you have a connected socket whose local IP address
  is X, and then change the interface IP address from X to Y, then
  packets written out by the socket will continue to be transmitted
  with source IP address X.
  
  Do people agree that this is a bug and should be fixed?
 
 Yes.  The other end can't possibly reply to address X, so the 
 connection
 is broken at this point.
 

I think the current behaviour is correct. Since the IP-MAC lookup
will remain cached, the communication will continue to work to the old
IP. Changing the IP on the connected socket will make the connection
drop. The best case is the way it works.






RE: Sockets and changing IP addresses

2002-11-21 Thread Don Bowman
 From: Archie Cobbs [mailto:[EMAIL PROTECTED]]
 Sent: November 21, 2002 16:54
 To: Don Bowman
 Cc: 'Wes Peters'; Archie Cobbs; [EMAIL PROTECTED]
 Subject: Re: Sockets and changing IP addresses
 
 
 Don Bowman wrote:
I'm curious what -net's opinion is on PR kern/38544:

http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/38554

In summary: if you have a connected socket whose local 
 IP address
is X, and then change the interface IP address from X to Y, then
packets written out by the socket will continue to be 
 transmitted
with source IP address X.

Do people agree that this is a bug and should be fixed?
   
   Yes.  The other end can't possibly reply to address X, so the 
   connection is broken at this point.
  
  I think the current behaviour is correct. Since the IP-MAC lookup
  will remain cached, the communication will continue to work 
 to the old
  IP. Changing the IP on the connected socket will make the connection
  drop. The best case is the way it works.
 
 What you're saying doesn't make sense to me. First of all, this has
 nothing to do with ARP tables (although you are right that 
 the router's
 ARP entry for the old IP address will remain valid). Secondly, the
 communication will NOT work because the host will drop packets sent
 to it with the (now) wrong IP address.
 
 The current behavior is bad because the application does not ever
 receive any notification that the socket it's using is no longer
 valid.

I guess I was thinking of the transparent proxy case (e.g. Squid)
where I have a ipfw fwd rule, and the socket is terminated locally.
Changing the IP address of the interface shouldn't drop my
proxied connection.

--don ([EMAIL PROTECTED] www.sandvine.com)




RE: bge bug w/ out of bounds return receiver, staying in rxeof all the time, patch

2002-11-21 Thread Don Bowman
 From: Sam Leffler [mailto:[EMAIL PROTECTED]]
  I would recommend a committer look this over and 
  commit it. If you wish, I can make the patch *just*
  be the change (changing the 16-bit to 32-bit writes,
  without the VPD stuff), but the other changes seemed
  generally useful.
 
 Please whittle the patch down to just the bug fix; 5.0 is in 
 code freeze.
 
 Sam
 

Sigh, I was afraid someone would say that. Will do.
The patch is against RELENG_4, but is fairly trivial. It is below,
just the bug fix is there (changing the writing to the 
receiver control block to be 32-bits all the time).

Patch follows:

Index: if_bge.c
===
RCS file: /cvs/src/sys/dev/bge/if_bge.c,v
retrieving revision 1.3.2.18
diff -U3 -r1.3.2.18 if_bge.c
--- if_bge.c2 Nov 2002 18:22:23 -   1.3.2.18
+++ if_bge.c22 Nov 2002 02:01:48 -
@@ -913,7 +913,7 @@
 {
int i;
struct bge_rcb *rcb;
-   struct bge_rcb_opaque *rcbo;
+   bge_max_len_flags len_flags;
 
for (i = 0; i < BGE_JUMBO_RX_RING_CNT; i++) {
if (bge_newbuf_jumbo(sc, i, NULL) == ENOBUFS)
@@ -923,9 +923,9 @@
sc->bge_jumbo = i - 1;
 
rcb = &sc->bge_rdata->bge_info.bge_jumbo_rx_rcb;
-   rcbo = (struct bge_rcb_opaque *)rcb;
-   rcb->bge_flags = 0;
-   CSR_WRITE_4(sc, BGE_RX_JUMBO_RCB_MAXLEN_FLAGS, rcbo->bge_reg2);
+   len_flags.bge_len_flags = rcb->bge_len_flags.bge_len_flags;
+   len_flags.s.bge_flags = 0;
+   CSR_WRITE_4(sc, BGE_RX_JUMBO_RCB_MAXLEN_FLAGS,
len_flags.bge_len_flags);
 
CSR_WRITE_4(sc, BGE_MBX_RX_JUMBO_PROD_LO, sc->bge_jumbo);
 
@@ -1133,6 +1133,7 @@
struct bge_rcb *rcb;
struct bge_rcb_opaque *rcbo;
int i;
+   bge_max_len_flags len_flags;
 
/*
 * Initialize the memory window pointer register so that
@@ -1202,12 +1203,13 @@
rcb = &sc->bge_rdata->bge_info.bge_std_rx_rcb;
BGE_HOSTADDR(rcb->bge_hostaddr) =
vtophys(&sc->bge_rdata->bge_rx_std_ring);
-   rcb->bge_max_len = BGE_MAX_FRAMELEN;
+   len_flags.s.bge_max_len = BGE_MAX_FRAMELEN;
+   len_flags.s.bge_flags = 0;
+   rcb->bge_len_flags.bge_len_flags = len_flags.bge_len_flags;
if (sc->bge_extram)
rcb->bge_nicaddr = BGE_EXT_STD_RX_RINGS;
else
rcb->bge_nicaddr = BGE_STD_RX_RINGS;
-   rcb->bge_flags = 0;
rcbo = (struct bge_rcb_opaque *)rcb;
CSR_WRITE_4(sc, BGE_RX_STD_RCB_HADDR_HI, rcbo->bge_reg0);
CSR_WRITE_4(sc, BGE_RX_STD_RCB_HADDR_LO, rcbo->bge_reg1);
@@ -1224,12 +1226,13 @@
rcb = &sc->bge_rdata->bge_info.bge_jumbo_rx_rcb;
BGE_HOSTADDR(rcb->bge_hostaddr) =
vtophys(&sc->bge_rdata->bge_rx_jumbo_ring);
-   rcb->bge_max_len = BGE_MAX_FRAMELEN;
+   len_flags.s.bge_max_len = BGE_MAX_FRAMELEN;
+   len_flags.s.bge_flags = BGE_RCB_FLAG_RING_DISABLED;
+   rcb->bge_len_flags.bge_len_flags = len_flags.bge_len_flags;
if (sc->bge_extram)
rcb->bge_nicaddr = BGE_EXT_JUMBO_RX_RINGS;
else
rcb->bge_nicaddr = BGE_JUMBO_RX_RINGS;
-   rcb->bge_flags = BGE_RCB_FLAG_RING_DISABLED;
 
rcbo = (struct bge_rcb_opaque *)rcb;
CSR_WRITE_4(sc, BGE_RX_JUMBO_RCB_HADDR_HI, rcbo->bge_reg0);
@@ -1239,7 +1242,9 @@
 
/* Set up dummy disabled mini ring RCB */
rcb = &sc->bge_rdata->bge_info.bge_mini_rx_rcb;
-   rcb->bge_flags = BGE_RCB_FLAG_RING_DISABLED;
+   len_flags.s.bge_max_len = 0;
+   len_flags.s.bge_flags = BGE_RCB_FLAG_RING_DISABLED;
+   rcb->bge_len_flags.bge_len_flags = len_flags.bge_len_flags;
rcbo = (struct bge_rcb_opaque *)rcb;
CSR_WRITE_4(sc, BGE_RX_MINI_RCB_MAXLEN_FLAGS, rcbo->bge_reg2);
 
@@ -1259,8 +1264,9 @@
rcb = (struct bge_rcb *)(sc->bge_vhandle + BGE_MEMWIN_START +
BGE_SEND_RING_RCB);
for (i = 0; i < BGE_TX_RINGS_EXTSSRAM_MAX; i++) {
-   rcb->bge_flags = BGE_RCB_FLAG_RING_DISABLED;
-   rcb->bge_max_len = 0;
+   len_flags.s.bge_max_len = 0;
+   len_flags.s.bge_flags = BGE_RCB_FLAG_RING_DISABLED;
+   rcb->bge_len_flags.bge_len_flags = len_flags.bge_len_flags;
rcb->bge_nicaddr = 0;
rcb++;
}
@@ -1272,17 +1278,20 @@
BGE_HOSTADDR(rcb->bge_hostaddr) =
vtophys(&sc->bge_rdata->bge_tx_ring);
rcb->bge_nicaddr = BGE_NIC_TXRING_ADDR(0, BGE_TX_RING_CNT);
-   rcb->bge_max_len = BGE_TX_RING_CNT;
-   rcb->bge_flags = 0;
+   len_flags.s.bge_max_len = BGE_TX_RING_CNT;
+   len_flags.s.bge_flags = 0;
+   rcb->bge_len_flags.bge_len_flags = len_flags.bge_len_flags;
 
/* Disable all unused RX return rings */
rcb = (struct bge_rcb *)(sc->bge_vhandle + BGE_MEMWIN_START +
BGE_RX_RETURN_RING_RCB);
-   for (i = 0; i < BGE_RX_RINGS_MAX; i++) {
+   rcb++;
+   for (i = 1; i < BGE_RX_RINGS_MAX; i++) {

bge bug w/ out of bounds return receiver, staying in rxeof all the time, patch

2002-11-21 Thread Don Bowman
(apologies if you got this more than once, but after 6
hours it hadn't shown up on the mailing list)

There is a bug in the STABLE (and current) if_bge which
causes the driver to loop forever in interrupt context
(in bge_rxeof()). This is caused by the return ring
length being 1024 in the driver, and erroneously
decided to be 2048 in the chip, which causes it
to return an index off the end off the ring.

You will know you are running into this if your kernel
locks up, ^T still works, and the debugger shows you
in bge_rxeof() or a routine called from it.

This situation can occur regardless of traffic. It
seems to either work or not work from the get-go,
so if you are going to run into it, it will be boolean
from the machine startup.

The patch attached solves this problem by changing the
16-bit writes into the chip's memory window to 32-bit
writes.
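
[Editor's sketch] The fix described above replaces separate 16-bit stores of
the ring length and flags with a single 32-bit store. A stand-alone
illustration follows. It assumes the ring length sits in the upper 16 bits and
the flags in the lower 16 bits of the word; the helper names are hypothetical
and 'reg' merely stands in for a word in the chip's memory window. The actual
patch uses a union (bge_max_len_flags) rather than shifts.

```c
#include <stdint.h>

/*
 * Combine ring length and flags into one 32-bit word.
 * Assumed layout for this sketch: length in bits 31..16, flags in 15..0.
 */
static inline uint32_t rcb_maxlen_flags(uint16_t max_len, uint16_t flags)
{
    return ((uint32_t)max_len << 16) | flags;
}

/*
 * Update both fields with a single 32-bit store, so the device never
 * depends on how two separate 16-bit writes are handled.
 */
static inline void rcb_write(volatile uint32_t *reg, uint16_t max_len,
                             uint16_t flags)
{
    *reg = rcb_maxlen_flags(max_len, flags);
}
```

For a 1024-entry ring with no flags set, the word written is 0x04000000.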

The patch also enables the PCI-VPD (See PCI 2.2) output 
(to help diagnose which version of the chip you have, whose
board, how fast the PCI clock is etc).

I would recommend a committer look this over and 
commit it. If you wish, I can make the patch *just*
be the change (changing the 16-bit to 32-bit writes,
without the VPD stuff), but the other changes seemed
generally useful.

Index: if_bge.c
===
RCS file: /cvs/src/sys/dev/bge/if_bge.c,v
retrieving revision 1.3.2.18
diff -U3 -r1.3.2.18 if_bge.c
--- if_bge.c2 Nov 2002 18:22:23 -   1.3.2.18
+++ if_bge.c21 Nov 2002 20:13:23 -
@@ -114,6 +114,7 @@
 #include <dev/bge/if_bgereg.h>
 
 #define BGE_CSUM_FEATURES  (CSUM_IP | CSUM_TCP | CSUM_UDP)
+#define BGE_VPD
 
 /* controller miibus0 required.  See GENERIC if you get errors here. */
 #include "miibus_if.h"
@@ -178,6 +179,7 @@
 static u_int8_tbge_eeprom_getbyte  __P((struct bge_softc *,
int, u_int8_t *));
 static int bge_read_eeprom __P((struct bge_softc *, caddr_t, int,
int));
+static void dump_manufacturing_information __P((struct bge_softc *));
 
 static u_int32_t bge_crc   __P((caddr_t));
 static void bge_setmulti   __P((struct bge_softc *));
@@ -200,11 +202,12 @@
 static int bge_chipinit__P((struct bge_softc *));
 static int bge_blockinit   __P((struct bge_softc *));
 
-#ifdef notdef
+#ifdef BGE_VPD
+static void bge_vpd_crack   __P((struct bge_softc *sc));
 static u_int8_t bge_vpd_readbyte __P((struct bge_softc *, int));
 static void bge_vpd_read_res   __P((struct bge_softc *,
 struct vpd_res *, int));
-static void bge_vpd_read   __P((struct bge_softc *));
+static void bge_vpd_read   __P((struct bge_softc *, const char *));
 #endif
 
 static u_int32_t bge_readmem_ind
@@ -311,7 +314,7 @@
return;
 }
 
-#ifdef notdef
+#ifdef BGE_VPD
 static u_int8_t
 bge_vpd_readbyte(sc, addr)
struct bge_softc *sc;
@@ -355,9 +358,54 @@
return;
 }
 
+/* 
+ * Take the read-only (VPD-R) info and crack it into the other fields
+*/
+static void
+bge_vpd_crack(sc)
+   struct bge_softc *sc;
+{
+   int pos = 0;
+   int len = strlen(sc->bge_vpd_readonly);
+   sc->bge_vpd_pn = "unknown";
+   sc->bge_vpd_ec = "unknown";
+   sc->bge_vpd_mn = "unknown";
+   sc->bge_vpd_sn = "unknown";
+   sc->bge_vpd_rv = "unknown";
+   while (pos < len) {
+   if (!strncmp(sc->bge_vpd_readonly+pos, VPD_PN, 2)) {
+   sc->bge_vpd_pn = (sc->bge_vpd_readonly+pos+3);
+   } else if (!strncmp(sc->bge_vpd_readonly+pos, VPD_EC, 2)) {
+   sc->bge_vpd_ec = (sc->bge_vpd_readonly+pos+3);
+   } else if (!strncmp(sc->bge_vpd_readonly+pos, VPD_MN, 2)) {
+   sc->bge_vpd_mn = (sc->bge_vpd_readonly+pos+3);
+   } else if (!strncmp(sc->bge_vpd_readonly+pos, VPD_SN, 2)) {
+   sc->bge_vpd_sn = (sc->bge_vpd_readonly+pos+3);
+   } else if (!strncmp(sc->bge_vpd_readonly+pos, VPD_RV, 2)) {
+   sc->bge_vpd_rv = (sc->bge_vpd_readonly+pos+3);
+   }
+   sc->bge_vpd_readonly[pos] = '\0';
+   pos += 2;
+   pos += sc->bge_vpd_readonly[pos];
+   pos++;
+   }
+   pos = 0;
+   len = strlen(sc->bge_vpd_readwrite);
+   while (pos < len) {
+   if (!strncmp(sc->bge_vpd_readwrite+pos, VPD_YA, 2)) {
+   sc->bge_vpd_asset_tag =
(sc->bge_vpd_readwrite+pos+3);
+   }
+   sc->bge_vpd_readwrite[pos] = '\0';
+   pos += 2;
+   pos += sc->bge_vpd_readwrite[pos];
+   pos++;
+   }
+}
+
 static void
-bge_vpd_read(sc)
+bge_vpd_read(sc, defname)
struct bge_softc *sc;
+   const char *defname;
 {
int pos = 0, i;
struct vpd_res res;
@@ -366,14 +414,20 @@
free(sc->bge_vpd_prodname, M_DEVBUF);
if (sc->bge_vpd_readonly != NULL)

RE: bug in bge driver with ENOBUFS on 4.7

2002-11-12 Thread Don Bowman
 From: Don Bowman [mailto:don;sandvine.com]
 In bge_rxeof(), there can end up being a condition which causes
 the driver to endlessly interrupt.
 
 if (bge_newbuf_std(sc, sc->bge_std, NULL) == ENOBUFS) {
     ifp->if_ierrors++;
     bge_newbuf_std(sc, sc->bge_std, m);
     continue;
 }
 
 happens. Now, bge_newbuf_std returns ENOBUFS. 'm' is also NULL.
 This causes the received packet to not be dequeued, and the driver
 will then go straight back into interrupt as the chip will 
 reassert the interrupt as soon as we return.

More information... It would appear that we're looping here
in the rx interrupt, the variable 'stdcnt' which counts
the number of standard-sized packets pulled off per iteration
is huge (indicating we've overrun the ring multiple times).

while(sc->bge_rx_saved_considx !=
sc->bge_rdata->bge_status_block.bge_idx[0].bge_rx_prod_idx) {

is the construct that controls when we exit the loop. Clearly
in my case this is never becoming false.
I see 'sc->bge_rx_saved_considx' as 201, and the RHS of the 
expression as 38442. This doesn't seem correct, I think that
both numbers must be <= BGE_SSLOTS. 

(kgdb) p/x *cur_rx
$10 = {bge_addr = {bge_addr_hi = 0x0, bge_addr_lo = 0xca2d802}, 
  bge_len = 0x4a, bge_idx = 0xc8, bge_flags = 0x7004, bge_type = 0x0, 
  bge_tcp_udp_csum = 0x9992, bge_ip_csum = 0x, bge_vlan_tag = 0x0, 
  bge_error_flag = 0x0, bge_rsvd = 0x0, bge_opaque = 0x0}
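
[Editor's sketch] Why a producer index of 38442 keeps the loop spinning: the
consumer index wraps at the ring size, so it can never become equal to an
index that lies outside the ring. A defensive check could look like the
following. RING_CNT and the function names are hypothetical and do not come
from the driver.

```c
#include <stdint.h>

#define RING_CNT 1024   /* hypothetical return-ring size */

/* An index is usable only if it addresses a slot inside the ring. */
int ring_idx_valid(uint32_t idx)
{
    return idx < RING_CNT;
}

/* Advance a ring index by one slot, wrapping at the end of the ring. */
uint32_t ring_next(uint32_t idx)
{
    return (idx + 1) % RING_CNT;
}

/*
 * Number of descriptors waiting between consumer and producer.
 * Returns -1 if either index is out of range, in which case a
 * "while (cons != prod)" loop would never terminate.
 */
int ring_pending(uint32_t cons, uint32_t prod)
{
    if (!ring_idx_valid(cons) || !ring_idx_valid(prod))
        return -1;
    return (int)((prod + RING_CNT - cons) % RING_CNT);
}
```

With the values seen in the debugger (consumer 201, producer 38442) the check
reports an invalid index instead of looping.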

Any suggestions anyone?
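
A minimal sketch of why the loop quoted above cannot exit, assuming the consumer index wraps modulo the ring size. The function name, the 512-entry ring and the step limit are hypothetical and are not taken from if_bge.c; the two index values are the ones reported in the message.

```python
def drain_ring(cons_idx, prod_idx, ring_size, max_steps=None):
    """Count consumer steps until it meets the producer index.

    Returns None when the consumer wraps around the ring without ever
    reaching prod_idx, which is the "loop never becomes false" case.
    """
    if max_steps is None:
        max_steps = 4 * ring_size  # arbitrary cut-off for the simulation
    steps = 0
    while cons_idx != prod_idx:
        if steps >= max_steps:
            return None
        cons_idx = (cons_idx + 1) % ring_size  # consumer index wraps
        steps += 1
    return steps


# A producer index inside the ring is reached after a bounded number of steps.
print(drain_ring(201, 300, 512))    # 99
# A producer index outside the ring (38442) is never reached.
print(drain_ring(201, 38442, 512))  # None
```

With an in-range producer index the loop ends normally; with an out-of-range one the only exits are a bounds check or an iteration limit.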


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-net in the body of the message



RE: Packet forwarding overhead - with ipfw counting

2002-11-10 Thread Don Bowman
From: Kevin Day [mailto:toasty;dragondata.com]
 When we're pushing 250-300mbits through, we're using about 15% of its 
 2.4Ghz P4 Xeon CPU. All of it is in interrupt time... that 
 seems a bit 
 high, but that'll still let us max things out at 1gbit so we're ok.

Try applying this diff to your bge driver; it should reduce your
interrupt time substantially in this configuration.
I also increased net.inet.ip.intr_queue_maxlen to 500 from 50 since
I was seeing drops.

Out of curiosity, which motherboard is this? I've been doing some
modelling using the e7500 vs serverworks, and the serverworks
is significantly better, but no one seems to make a 1U one with
2 PCI-X slots. The e7500 has a 1GB/s half-duplex hublink versus
the 3.2GB/s full-duplex one on the GC-LE.

Index: if_bge.c
===
RCS file: /cvs/src/sys/dev/bge/if_bge.c,v
retrieving revision 1.3.2.18
diff -C5 -r1.3.2.18 if_bge.c
*** if_bge.c2 Nov 2002 18:22:23 -   1.3.2.18
--- if_bge.c10 Nov 2002 16:12:03 -
***
*** 1654,1668 
error = ENXIO;
goto fail;
}
  
/* Set default tuneable values. */
sc->bge_stat_ticks = BGE_TICKS_PER_SEC;
!   sc->bge_rx_coal_ticks = 150;
!   sc->bge_tx_coal_ticks = 150;
!   sc->bge_rx_max_coal_bds = 64;
!   sc->bge_tx_max_coal_bds = 128;
  
/* Set up ifnet structure */
ifp = &sc->arpcom.ac_if;
ifp->if_softc = sc;
ifp->if_unit = sc->bge_unit;
--- 1654,1692 
error = ENXIO;
goto fail;
}
  
/* Set default tuneable values. */
+   /* How often should we update the statistics in host memory? */
sc->bge_stat_ticks = BGE_TICKS_PER_SEC;
! /* The coalescing works as follows: for each of Rx|Tx, there
!  * are two tunables: ticks, and packets. The first one to trip
!  * will cause an interrupt. For example, if the ticks is set to
!  * 1us, an interrupt will be generated no more than 1us after
!  * a packet has come in. If the bds is set to 10, then the
!  * interrupt would be after 10 packets had been received.
!  * If ticks=1 and bds=10, then the interrupt will come in
!  * min(1us, 10packets time), likely 1us.
!  * Tuning these to larger values reduces interrupts at the
!  * expense of latency to interactive applications. If you
!  * are serving files, make these large. If you are running
!  * telnet sessions, make them small.
!  *
!  * The settings below, 500us means a max interrupt rate
!  * of 2000/s due to the ticks elapsing, and 120 means
!  * a peak interrupt rate of ~2000/s due to avg packets (512) arriving
!  * (for min sized packets this would be 870, for max
!  * sized packets it would be 41: 1Gbps / ((8*size)+96))
!  */
! /* RX Interrupt no more than every 500 us */
!   sc->bge_rx_coal_ticks = 500;
!   /* TX Interrupt no more than every 500 us */
!   sc->bge_tx_coal_ticks = 500;
!   /* RX Interrupt no more than every 120 packets */
!   sc->bge_rx_max_coal_bds = 120;
!   /* TX Interrupt no more than every 120 packets */
!   sc->bge_tx_max_coal_bds = 120;
  
/* Set up ifnet structure */
ifp = &sc->arpcom.ac_if;
ifp->if_softc = sc;
ifp->if_unit = sc->bge_unit;
Index: if_bgereg.h
===
RCS file: /cvs/src/sys/dev/bge/if_bgereg.h,v
retrieving revision 1.1.2.7
diff -C5 -r1.1.2.7 if_bgereg.h
*** if_bgereg.h 2 Nov 2002 18:17:55 -   1.1.2.7
--- if_bgereg.h 10 Nov 2002 16:12:21 -
***
*** 2057,2068 
   * Memory management stuff. Note: the SSLOTS, MSLOTS and JSLOTS
   * values are tuneable. They control the actual amount of buffers
   * allocated for the standard, mini and jumbo receive rings.
   */
  
! #define BGE_SSLOTS 256
! #define BGE_MSLOTS 256
  #define BGE_JSLOTS 384
  
  #define BGE_JRAWLEN (BGE_JUMBO_FRAMELEN + ETHER_ALIGN + sizeof(u_int64_t))
  #define BGE_JLEN (BGE_JRAWLEN + (sizeof(u_int64_t) - \
(BGE_JRAWLEN % sizeof(u_int64_t))))
--- 2057,2068 
   * Memory management stuff. Note: the SSLOTS, MSLOTS and JSLOTS
   * values are tuneable. They control the actual amount of buffers
   * allocated for the standard, mini and jumbo receive rings.
   */
  
! #define BGE_SSLOTS 384
! #define BGE_MSLOTS 384
  #define BGE_JSLOTS 384
  
  #define BGE_JRAWLEN (BGE_JUMBO_FRAMELEN + ETHER_ALIGN + sizeof(u_int64_t))
  #define BGE_JLEN (BGE_JRAWLEN + (sizeof(u_int64_t) - \
(BGE_JRAWLEN % sizeof(u_int64_t))))
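
The arithmetic behind the coalescing comment in the diff above can be sketched as follows. This is only an illustration using the formula quoted in that comment, 1Gbps / ((8*size)+96); the function names are hypothetical and the 96 bits are assumed to stand for the inter-frame gap.

```python
def packets_per_second(link_bps, frame_bytes, overhead_bits=96):
    """Packets per second on a saturated link, per the formula in the comment."""
    return link_bps / (8 * frame_bytes + overhead_bits)


def interrupt_rates(link_bps, frame_bytes, coal_ticks_us, max_coal_bds):
    """Return (rate implied by the tick timer, rate implied by the packet count)."""
    by_ticks = 1_000_000 / coal_ticks_us
    by_packets = packets_per_second(link_bps, frame_bytes) / max_coal_bds
    return by_ticks, by_packets


# Values from the patch: 500 us ticks, 120 buffer descriptors, 512-byte average.
ticks_rate, packet_rate = interrupt_rates(1_000_000_000, 512, 500, 120)
print(round(ticks_rate), round(packet_rate))  # 2000 1988
```

Both limits land near 2000 interrupts per second for 512-byte packets, which is the balance the patch comment describes.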


--don ([EMAIL PROTECTED] www.sandvine.com)





Suggestions for tcbhashsize size?

2002-11-09 Thread Don Bowman
Are there any guidelines for setting the tcbhashsize ?
I have a system which I'm expecting to keep ~50K TCP connections
going.
Does it follow standard hash table rules that it should be
less than half full?

I currently have net.inet.tcp.tcbhashsize: 4096
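
A rough sketch of the load-factor arithmetic behind the question, assuming connections spread evenly over the buckets and that the table size should be a power of two (an assumption here, not something stated in this thread). The function names are hypothetical.

```python
import math


def average_chain_length(connections, buckets):
    """Mean number of entries per hash bucket under an even spread."""
    return connections / buckets


def next_power_of_two(n):
    """Smallest power of two that is >= n."""
    size = 1
    while size < n:
        size *= 2
    return size


def suggest_buckets(connections, max_load):
    """Smallest power-of-two bucket count keeping the load at or below max_load."""
    return next_power_of_two(math.ceil(connections / max_load))


print(round(average_chain_length(50_000, 4096), 1))  # 12.2
print(suggest_buckets(50_000, 0.5))                  # 131072
```

With 4096 buckets and ~50K connections each lookup walks about a dozen entries on average; a "less than half full" table would need 131072 buckets.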

--don ([EMAIL PROTECTED] www.sandvine.com)




bug in bge driver with ENOBUFS on 4.7

2002-11-09 Thread Don Bowman
In bge_rxeof(), there can end up being a condition which causes
the driver to endlessly interrupt.

if (bge_newbuf_std(sc, sc->bge_std, NULL) == ENOBUFS) {
ifp->if_ierrors++;
bge_newbuf_std(sc, sc->bge_std, m);
continue;
}

happens. Now, bge_newbuf_std returns ENOBUFS. 'm' is also NULL.
This causes the received packet to not be dequeued, and the driver
will then go straight back into interrupt as the chip will 
reassert the interrupt as soon as we return.

Suggestions on a fix? 
I'm not sure why I ran out of mbufs, I have
kern.ipc.nmbclusters: 9
kern.ipc.nmbufs: 28

(kgdb) p/x mbstat
$11 = {m_mbufs = 0x3a0, m_clusters = 0x39c, m_spare = 0x0, m_clfree = 0x212,
  m_drops = 0x0, m_wait = 0x0, m_drain = 0x0, m_mcfail = 0x0, m_mpfail = 0x0,
  m_msize = 0x100, m_mclbytes = 0x800, m_minclsize = 0xd5, m_mlen = 0xec,
  m_mhlen = 0xd4}

but bge_newbuf_std() does this:
if (m == NULL) {
MGETHDR(m_new, M_DONTWAIT, MT_DATA);
if (m_new == NULL) {
return(ENOBUFS);
}
and then returns ENOBUFS.

This is with 4.7-RELEASE.
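
For illustration, a minimal simulation of the usual "replace or recycle" receive pattern, in which the consumer index advances whether or not a fresh buffer could be allocated. This is a sketch of the general technique, not the bge driver's code; `rx_drain`, `allocate` and the list-based ring are hypothetical.

```python
def rx_drain(ring, cons, prod, allocate):
    """Process ring slots from cons up to prod.

    If allocate() returns None, the old buffer stays in its slot and the
    packet is counted as dropped. The index advances in both cases, so the
    loop always terminates.
    """
    delivered, dropped = [], 0
    size = len(ring)
    while cons != prod:
        old = ring[cons]
        fresh = allocate()
        if fresh is None:
            dropped += 1             # recycle: slot keeps its old buffer
        else:
            ring[cons] = fresh       # replace: hand the old buffer upstream
            delivered.append(old)
        cons = (cons + 1) % size     # always dequeue the descriptor
    return delivered, dropped, cons


ring = ["b0", "b1", "b2", "b3"]
print(rx_drain(ring, 1, 3, lambda: None))  # ([], 2, 3)
```

The point of the sketch is that an allocation failure costs a packet but still dequeues the descriptor, so the chip has nothing left to reassert the interrupt for.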


--don ([EMAIL PROTECTED] www.sandvine.com)




RE: dhclient turns ethernet card off

2002-11-05 Thread Don Bowman
 From: alexis georges [mailto:floating_in_space_;hotmail.com]
 hey guys
 we had a power cut yesterday..all went down at our home..
 when we got electricity back, my internet wouldnot work..only 
 my computer 
 atually..i found that my eth. card would not turn on..or 
 actually i just 
 foung out now..it does turn on until it get to the 'dhclient 
 dc0' lne in 
 rc.conf..which i need..basically during boot up it turns 
 on..and when it has 
 t exectute the dchlient line, the light on the card 
 disapears..its weird.i 
 have a linksys (LNE TX?)
 anyways the way i have to have my card going is by having 
 start_if.dc0 with 
 a line that turns my card to half-duplex (it needs to be like 
 this) and then 
 in the rc.conf i have the ifconfig_dc0=DHCP
 anyone knoe what could cause my card to literally shut down 
 on dhclient?
 i already tried just in case to change PCI slots, but nothing 
 changed..
 thanks in advance

some routers disable ports if they see too many errors, e.g.
due to a duplex mismatch.
Is your router set to auto? and the NIC is as well?

--don ([EMAIL PROTECTED] www.sandvine.com)
 




RE: MTU problems ...

2002-11-04 Thread Don Bowman
 From: Julian Elischer [mailto:julian;elischer.org]
 There is a program that intercepts tcp session negotiation and
 artificially reduces the negotiated MTU but I can't find it 
 right now..
 I think it was called mssd or something.

/usr/ports/net/tcpmssd
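
As a hedged illustration of the idea (assuming such a tool works by lowering the MSS option carried in SYN segments, and assuming IPv4 and TCP headers without options), the clamping arithmetic looks like this. The function names are hypothetical.

```python
IPV4_HEADER_BYTES = 20  # no IP options assumed
TCP_HEADER_BYTES = 20   # no TCP options assumed


def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES


def clamp_mss(advertised_mss, path_mtu):
    """Lower an advertised MSS so segments fit the path MTU; never raise it."""
    return min(advertised_mss, mss_for_mtu(path_mtu))


print(mss_for_mtu(1500))      # 1460
print(clamp_mss(1460, 1492))  # 1452
```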

--don ([EMAIL PROTECTED] www.sandvine.com)




RE: Problem in High Speed and Long Delay with FreeBSD

2002-11-01 Thread Don Bowman
 From: Fran Lawas-Grodek [mailto:Fran.Lawas-Grodek;grc.nasa.gov]

Perhaps 
sysctl net.inet.tcp.inflight_enable=1
will help?

you may wish to also change tcp.inflight_max.
See tcp(4) as of 4.7.

--don ([EMAIL PROTECTED] www.sandvine.com)




RE: Problem in High Speed and Long Delay with FreeBSD

2002-11-01 Thread Don Bowman
 From: Fran Lawas-Grodek [mailto:Fran.Lawas-Grodek;grc.nasa.gov]
 Well... our development code that we are to ultimately test was
 developed on 4.1, thus we really need to try to stick with 4.1.
 It does not look like either of the above parameters are available
 until 4.7.

No worries.
Have you checked that both sides are negotiating SACK?
And both sides are negotiating a window scale option sufficiently
large? (sounds like you need a window scale option of at least 5
bits?)
And the socket-buffer to ttcp is actually being set as large
as you think? (perhaps run 'ktrace' or 'truss' on ttcp and look
for an error on the setsockopt).
http://www.rfc-editor.org/rfc/rfc1323.txt has some other
suggestions I think, but I'm guessing you've already gone
over it.
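
To make the window-scale question concrete, here is a small sketch that derives the shift from a bandwidth-delay product. The 100 Mbit/s and 150 ms figures are hypothetical placeholders, not the poster's link; the 65535-byte base window and the maximum shift of 14 come from RFC 1323.

```python
MAX_UNSCALED_WINDOW = 65535  # largest window expressible without scaling
MAX_SHIFT = 14               # upper bound on the scale factor in RFC 1323


def bdp_bytes(bandwidth_bps, rtt_ms):
    """Bandwidth-delay product in bytes (integer arithmetic)."""
    return bandwidth_bps * rtt_ms // 8000


def window_scale_shift(window_bytes):
    """Smallest shift whose scaled maximum window covers window_bytes."""
    shift = 0
    while (MAX_UNSCALED_WINDOW << shift) < window_bytes and shift < MAX_SHIFT:
        shift += 1
    return shift


needed = bdp_bytes(100_000_000, 150)       # hypothetical link
print(needed, window_scale_shift(needed))  # 1875000 5
```

The socket buffers on both ends also have to be at least as large as the computed product for the scaled window to be usable.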

--don ([EMAIL PROTECTED] www.sandvine.com)




RE: Problem in High Speed and Long Delay with FreeBSD

2002-11-01 Thread Don Bowman
 From: Mark Allman [mailto:mallman;grc.nasa.gov]
 Thanks!  Other ideas?

What MSS is advertised on each end?

--don ([EMAIL PROTECTED] www.sandvine.com)




RE: Data payload in SYN packet

2002-11-01 Thread Don Bowman
 From: David Myer [mailto:davidmyer800;yahoo.com]
 Just curious on one thing, we know that SYN packet can
 carry data payload, but I never see any implementation
 that actually does this.

See T/TCP, RFC 1644, and sysctl 'net.inet.tcp.rfc1644'

--don ([EMAIL PROTECTED] www.sandvine.com)




RE: ng_fec hash mechanism versus cisco etherchannel

2002-10-31 Thread Don Bowman
 From: Petri Helenius [mailto:pete;he.iki.fi]
 It does not matter if you send using the other link as long 
 as you send 
 all packets
 for the same stream over the same link to avoid reordering. 
 So yes, it does
 interoperate.

can you end up with a link flap?
e.g. the catalyst does SA learning to pick the port, so it
sends it out port 1. We respond via port 2 since we use the
SIP^DIP. The catalyst switches that through to the other end,
which replies, and comes back via port 1.

I guess this isn't tragic.
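
A minimal sketch of a SIP^DIP link selector, to show why every packet of a flow leaves on the same member link and why the choice is the same for both directions of the flow. This is an assumed illustration, not ng_fec's actual hash; `pick_link` is a hypothetical name.

```python
import ipaddress


def pick_link(src_ip, dst_ip, n_links):
    """Choose a member link from the XOR of source and destination address."""
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % n_links


# XOR is symmetric, so swapping source and destination gives the same link.
print(pick_link("10.0.0.1", "10.0.0.2", 2))  # 1
print(pick_link("10.0.0.2", "10.0.0.1", 2))  # 1
```

The switch at the other end makes its own independent choice with its own method, so the two directions of a conversation may still use different physical links.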

--don ([EMAIL PROTECTED] www.sandvine.com)




RE: spoofing source code in kernel

2002-10-28 Thread Don Bowman
From: sepehr sohrabi [mailto:sepehr_soh;hotmail.com]
 
 Hi list
 Anyone has source code for spoofing (in kernel) for all input 
 Tcp/IP packets 
 .For any TCP/IP packet recieve it creates an ACK for it .
 someThing like spoofing GW
 CLIENT - GW --- server
 connections are spoofed
 THANX

ipfw with a 'fwd' rule will let you do something like this.
Run a user-mode application on port X, then do
ipfw fwd localhost,X tcp from any to any recv myinterface

and any inbound TCP connection will be terminated locally.
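
The "user-mode application on port X" can be as small as the sketch below. It is a plain TCP listener; the helper names are hypothetical, and the expectation that getsockname() on the accepted socket reports the address the client originally targeted under an ipfw fwd rule is an assumption to verify on your system.

```python
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create a listening TCP socket; port 0 lets the kernel pick a free port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(16)
    return srv


def accept_one(srv):
    """Accept one connection; return (connection, peer address, local address)."""
    conn, peer = srv.accept()
    local = conn.getsockname()  # address the client connected to
    return conn, peer, local
```

A proxy built on this would read `local` to learn where the client was trying to go, then open its own outbound connection to that address.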

--don ([EMAIL PROTECTED] www.sandvine.com p2p)




RE: Annoying ARP warning messages.

2002-10-28 Thread Don Bowman
From: Julian Elischer [mailto:julian;elischer.org]
 On Mon, 28 Oct 2002, Sean Chittenden wrote:

  In this example, does the xl0 interface share the same MAC address?
 
 umm actually, yes.. sends switches insane.. :-)
 if you don't do the step about source Mac address replacement
 then they have different addresses. (though I can't guarantee that)

Is there support for 802.3ad in FreeBSD? This would be the best
way to gang interfaces together in a standard fashion. It involves
LACP (Link Aggregation Control Protocol), which prevents loops
@ L2 (I think it's an extension of STP). Packet reordering is also
solved (the simple round robin scheme achieves rather poor performance
due to this problem).

Another way to do it is with OSPF ECMP (Equal-Cost Multipath Routing),
depends on whether you think L2 is cool or L3 :)

--don ([EMAIL PROTECTED] www.sandvine.com)




RE: Annoying ARP warning messages.

2002-10-28 Thread Don Bowman
 From: Julian Elischer [mailto:julian;elischer.org]
  Is there support for 802.3ad in FreeBSD? This would be the best
  way to gang interfaces together in a standard fashion. It involves
  LACP (Link Aggregation Control Protocol), which prevents loops
  @ L2 (I think its an extension of STP). Packet reordering is also
  solved (the simple round robin scheme achieves rather poor 
 performance
  due to this problem).
  
 
 This could be (relatively) easy in netgraph.. it was designed for that
 sort of thing. 
  

I assume you mean with a user-mode daemon, sort of a LACPD, like
in the linux model? (http://www.st.rim.or.jp/~yumo/), and then
a version of one2many that did the src^dst hash to prevent re-ordering?
Or would you implement the control protocol inside netgraph as well?

On a side note, is there anything netgraph can't solve :)

--don ([EMAIL PROTECTED] www.sandvine.com)




RE: device fxp cannot detect Intel On-Board LAN

2002-10-28 Thread Don Bowman
 From: Ng Wee Yong [mailto:ngweeyong;yahoo.com.sg]
 I just install the FreeBSD 4.6.2 - STABLE version. My 
 motherboard is a MSI
 845GE Max-L, 1.8Ghz Pentium 4, On-board LAN is Intel 82562.
 
 FreeBSD just work fine accept it cannot detect my On-Board 
 Intel LAN. ...

kern/39974 describes the issue.

http://www.geocrawler.com/archives/3/145/2002/6/50/9058043/

has a solution for you, changing one line in the fxp driver
to give it this pci vendor/device id.

There is a comment that
Committed to -current, will be MFC'd to -stable very soon.

suggesting this might be in 4.7 stable already.

--don ([EMAIL PROTECTED] www.sandvine.com)




RE: Annoying ARP warning messages.

2002-10-26 Thread Don Bowman
Kevin Stevens wrote:
 I have two systems connected through a common network (switch).  They 
 each have two NICs, with one addressed on one IP network and the second 
 on another.  IP works fine.  My problem is that the kernel keeps 
 bitching about seeing the same MAC addresses on both interfaces:
 
 Oct 26 06:15:03 babelfish /kernel: arp: 192.168.168.101 is on em0 but 
 got reply from 00:30:65:00:e6:e6 on xl0

sysctl net.link.ether.inet.log_arp_wrong_iface=0

--don ([EMAIL PROTECTED] www.sandvine.com p2p)





RE: Annoying ARP warning messages.

2002-10-26 Thread Don Bowman
 From: Julian Elischer [mailto:julian;elischer.org]

(removed as to why have two NICs on the same network,
sending for general enlightenment of the list...)

This is reasonably common in L2 switched Ethernet. You have
a device which segments the traffic just fine with 
MAC learning. You have the cables all going to the desktops.
You don't want to muck around with partially supported
VLAN tagging @ the desktop. So you run another network
overtop the same Ethernet. You probably wouldn't architect
it up front for that (although I have in our lab, we use
a cat6k for a virtual patch panel, but individual
tests use whatever IP's they desire).

@ the Ethernet level, addressing is only done via
MAC address. Having two packets on the same wire with
differing IP subnets is legal (in fact, you see it all
the time with the destination or source address which
is off your network).

ARP's and all 1's broadcasts (e.g. DHCP) make a bit
of a mess of such a network, but sometimes that's
the lesser evil.

This can also be seen, believe it or not, on a routed
network, if you have something like spanning tree 
protocol which hasn't converged yet, but has been set
for rapid convergence (which assumes the path isn't
a loop until it discovers otherwise). Routers and
switches are merging.

--don ([EMAIL PROTECTED] www.sandvine.com p2p)





panic in 4.7 in close / sbdrop

2002-10-25 Thread Don Bowman
I have a machine running 4.7. I can panic it by sending a reasonably
high load of tcp open/close from/to it. The trace below is from
a socket from localhost to localhost (sendmail). The max number
of open file descriptors I would have had would be ~4500.
The rx buffer says it has 43008 bytes, but there are no mbufs
chained. The system was not out of mbufs or clusters.

Suggestions on what I might look @?

#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:487
#1  0xc01c41c7 in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:316
#2  0xc01c4639 in panic (fmt=0xc0331205 "sbdrop")
at /usr/src/sys/kern/kern_shutdown.c:595
#3  0xc01e60e7 in sbdrop (sb=0xeaf677e8, len=43008)
at /usr/src/sys/kern/uipc_socket2.c:877
#4  0xc01e607c in sbflush (sb=0xeaf677e8)
at /usr/src/sys/kern/uipc_socket2.c:852
#5  0xc022697f in tcp_disconnect (tp=0xecf24a40)
at /usr/src/sys/netinet/tcp_usrreq.c:1077
#6  0xc02260f2 in tcp_usr_disconnect (so=0xeaf677a0)
at /usr/src/sys/netinet/tcp_usrreq.c:406
#7  0xc01e3450 in sodisconnect (so=0xeaf677a0)
at /usr/src/sys/kern/uipc_socket.c:422
#8  0xc01e326a in soclose (so=0xeaf677a0)
at /usr/src/sys/kern/uipc_socket.c:302
#9  0xc01d73fa in soo_close (fp=0xd049ab80, p=0xe91bd5a0)
at /usr/src/sys/kern/sys_socket.c:195
#10 0xc01b9c37 in fdrop (fp=0xd049ab80, p=0xe91bd5a0)
at /usr/src/sys/sys/file.h:217
#11 0xc01b9b7f in closef (fp=0xd049ab80, p=0xe91bd5a0)
at /usr/src/sys/kern/kern_descrip.c:1277
#12 0xc01b978c in fdfree (p=0xe91bd5a0)
at /usr/src/sys/kern/kern_descrip.c:1059
#13 0xc01bc475 in exit1 (p=0xe91bd5a0, rv=0)
at /usr/src/sys/kern/kern_exit.c:187
#14 0xc01bc2dc in exit1 (p=0xe91bd5a0, rv=16777218)
at /usr/src/sys/kern/kern_exit.c:103
#15 0xc02edc71 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, 
  tf_edi = 0, tf_esi = 15, tf_ebp = -1077950764, tf_isp = -221909036, 
  tf_ebx = 0, tf_edx = 126, tf_ecx = -1077950820, tf_eax = 1, 
  tf_trapno = 0, tf_err = 2, tf_eip = 673302376, tf_cs = 31, 
  tf_eflags = 659, tf_esp = -1077950856, tf_ss = 47})
at /usr/src/sys/i386/i386/trap.c:1175
#16 0xc02da38b in Xint0x80_syscall ()


void
sbdrop(sb, len)
    register struct sockbuf *sb;
    register int len;
{
    register struct mbuf *m;
    struct mbuf *next;

    next = (m = sb->sb_mb) ? m->m_nextpkt : 0;
    while (len > 0) {
        if (m == 0) {
            if (next == 0)
                panic("sbdrop");
            m = next;
            next = m->m_nextpkt;
            continue;
        }
(kgdb) p/x *sb
$39 = {sb_cc = 0xa800, sb_hiwat = 0xe000, sb_mbcnt = 0xbd00, 
  sb_mbmax = 0x4, sb_lowat = 0x1, sb_mb = 0x0, sb_mbtail = 0x0, 
  sb_lastrecord = 0x0, sb_sel = {si_pid = 0x0, si_note = {slh_first = 0x0}, 
si_flags = 0x0}, sb_flags = 0x0, sb_timeo = 0x0}
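
The inconsistency in the dump above (sb_cc = 0xa800 bytes accounted for, but sb_mb = NULL) can be modelled in a few lines. This is a simplified stand-in for the kernel routine, with the mbuf chain reduced to a list of lengths; the names are hypothetical.

```python
def sbdrop(mbuf_lens, length):
    """Drop `length` bytes from the front of a chain of mbuf lengths.

    Raises RuntimeError where the kernel would panic: bytes remain to be
    dropped but the chain is already empty.
    """
    remaining = list(mbuf_lens)
    while length > 0:
        if not remaining:
            raise RuntimeError("sbdrop")
        if remaining[0] > length:
            remaining[0] -= length   # partial trim of the first mbuf
            length = 0
        else:
            length -= remaining.pop(0)
    return remaining


def sockbuf_consistent(sb_cc, mbuf_lens):
    """The byte counter should equal the data actually on the chain."""
    return sb_cc == sum(mbuf_lens)


print(sockbuf_consistent(0xa800, []))  # False
```

Calling `sbdrop([], 0xa800)` raises immediately, which mirrors the panic: the counters claim 43008 bytes while the chain holds none.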
called from:

void
sbflush(sb)
    register struct sockbuf *sb;
{
    KASSERT((sb->sb_flags & SB_LOCK) == 0, ("sbflush: locked"));

    while (sb->sb_mbcnt)
        sbdrop(sb, (int)sb->sb_cc);


called from:
static struct tcpcb *
tcp_disconnect(tp)
    register struct tcpcb *tp;
{
    struct socket *so = tp->t_inpcb->inp_socket;

    if (tp->t_state < TCPS_ESTABLISHED)
        tp = tcp_close(tp);
    else if ((so->so_options & SO_LINGER) && so->so_linger == 0)
        tp = tcp_drop(tp, 0);
    else {
        soisdisconnecting(so);
        sbflush(&so->so_rcv);
        tp = tcp_usrclosed(tp);
        if (tp)
            (void) tcp_output(tp);
    }
    return (tp);
}
(kgdb) p/x *tp
$44 = {t_segq = {lh_first = 0x0}, t_dupacks = 0x0, unused = 0x0, 
  tt_rexmt = 0xecf24b24, tt_persist = 0xecf24b3c, tt_keep = 0xecf24b54, 
  tt_2msl = 0xecf24b6c, tt_delack = 0xecf24b84, t_inpcb = 0xecf24980, 
  t_state = 0x4, t_flags = 0x801e0, t_force = 0x0, snd_una = 0x8bcbf58f, 
  snd_max = 0x8bcbf58f, snd_nxt = 0x8bcbf58f, snd_up = 0x8bcbf58f, 
  snd_wl1 = 0xab47117a, snd_wl2 = 0x8bcbf58f, iss = 0x8bcbf3cb, 
  irs = 0xab4710f2, rcv_nxt = 0xab47fea8, rcv_adv = 0xab47f17a, 
  rcv_wnd = 0xe000, rcv_up = 0xab47117a, snd_wnd = 0xe000, snd_cwnd =
0x, 
  snd_bwnd = 0x3fffc000, snd_ssthresh = 0x3fffc000, snd_bandwidth = 0x0, 
  snd_recover = 0x8bcbf3cb, t_maxopd = 0x3fd8, t_rcvtime = 0x101c3f1, 
  t_starttime = 0x4588, t_rtttime = 0x0, t_rtseq = 0x8bcbf52f, 
  t_bw_rtttime = 0x4588, t_bw_rtseq = 0x0, t_rxtcur = 0x4b0, 
  t_maxseg = 0x3800, t_srtt = 0x14, t_rttvar = 0xb, t_rxtshift = 0x0, 
  t_rttmin = 0x3e8, t_rttbest = 0x1f, t_rttupdated = 0x5, max_sndwnd =
0xe000, 
  t_softerror = 0x0, t_oobflags = 0x0, t_iobc = 0x0, snd_scale = 0x0, 
  rcv_scale = 0x0, request_r_scale = 0x0, requested_s_scale = 0x0, 
  ts_recent = 0x101c3f1, ts_recent_age = 0x101c3f1, 
  last_ack_sent = 0xab47fea8, cc_send = 0x0, cc_recv = 0x0, 
  snd_cwnd_prev = 0x0, snd_ssthresh_prev = 0x0, t_badrxtwin = 0x0}
(kgdb) p/x 

RE: Machine becomes non-responsive, only ^T shows it as alive under load: IPFW, TCP proxying

2002-10-24 Thread Don Bowman
 From: Kevin Stevens [mailto:Kevin_Stevens;pursued-with.net]

  Any suggestions for how one would start debugging this to
  find out where its stuck, and how?
 
 At a guess, you need to tune the state-table retention time down.

If by that you mean the MSL? I've set the MSL to 5000 in this case.
Or do you mean something else?

Should the machine lock up this way? How does one debug
where it's gone?

--don ([EMAIL PROTECTED] www.sandvine.com)





RE: Machine becomes non-responsive, only ^T shows it as alive under load: IPFW, TCP proxying

2002-10-24 Thread Don Bowman
 From: Don Bowman 
 
  
  I have an application listening on an ipfw 'fwd' rule.
  I'm sending ~3K new sessions per second to it. It
  has to turn around and issue some of these out as
  a proxy, in response to which some of them the destination
  host won't exist.

For reference, the solution is to upgrade to the latest -STABLE
bge driver. The machine was getting stuck in interrupt.

--don ([EMAIL PROTECTED] www.sandvine.com)





Machine becomes non-responsive, only ^T shows it as alive under load: IPFW, TCP proxying

2002-10-23 Thread Don Bowman

I have an application listening on an ipfw 'fwd' rule.
I'm sending ~3K new sessions per second to it. It
has to turn around and issue some of these out as
a proxy, and for some of them the destination
host won't exist.

I have RST limiting on. I'm seeing messages like:
Limiting open port RST response from 1312 to 200 packets per second

come out sometimes.

After a while of such operation (~1/2 hour), the machine
becomes unresponsive: the network interfaces no longer respond,
the serial console responds to ^T yielding a status line,
but ^C etc do nothing, and the bash which was there won't
give me a prompt.

^T indicates my bash is running, 0% of CPU in use, etc.

I have no choice but to power-cycle it.

Any suggestions for how one would start debugging this to
find out where it's stuck, and how?

This is running 4.7 STABLE on a single XEON 2.0 GHz, 1GB
of memory. The bandwidth wasn't that high, varying between
3 and 30Mbps.

Perhaps related, sometimes I get: bge0: watchdog timeout -- resetting

The only NIC which is active is bge0. I have an 'em0' which
is idle (no IP), and an fxp0 (which has an IP but is idle).

--don ([EMAIL PROTECTED] www.sandvine.com)




panic with ipfw / dummynet in 4.7 STABLE

2002-10-22 Thread Don Bowman
Take a 4.7 image. Using if_em (if it matters). Turn on
bridging (em0, em2), add these ipfw rules:

ipfw add 305 prob 0.01 drop MAC any 00:04:76:f3:2d:0a setup 
ipfw add 310 prob 0.01 reject MAC any 00:04:76:f3:2d:0a setup 
ipfw add 320 prob 0.01 unreach host MAC any 00:04:76:f3:2d:0a setup 
ipfw add 325 prob 0.01 unreach port MAC any 00:04:76:f3:2d:0a setup 
ipfw add pipe 1 config delay 90 plr 0.0001
ipfw add pipe 2 config delay 150 plr 0.0005
ipfw add 340 prob 0.5 pipe 1 ip from any to any 
ipfw add 345 prob 0.5 pipe 2 ip from any to any 

The system panics almost immediately (~1s). The panic and
trace are below. It's doubtful much traffic was present on
the em0 or em2 interfaces, so this probably happened on the first
packet.

I'll turn on -g in the kernel (I thought for sure it was,
but seems no...) and re-run.

This is with -DIPFW2 on.

So I'm doing:

# kldload if_em
# sysctl net.link.ether.bridge_cfg="em0 em2"
# sysctl net.link.ether.bridge=1

(after the machine has booted).
Then I run the script above to add the ipfw rules, and it tips over.

bash-2.05a# uname -a
FreeBSD TPC-E1-34 4.7-STABLE FreeBSD 4.7-STABLE #7: Tue Oct 22 22:07:55 EDT 2002 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/TPC  i386

Machine is a 2x XEON 2.0 GHz w/ Intel 82544 on the motherboard,
and an Intel 82546EB dual GE card in a PCI slot. It is SMP enabled.

SMP 4 cpus
IdlePTD at phsyical address 0x0043
initial pcb at physical address 0x00369780
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
mp_lock = 0002; cpuid = 0; lapic.id = 
fault virtual address   = 0x4007
fault code  = supervisor read, page not present
instruction pointer = 0x8:0xc0204565
stack pointer   = 0x10:0xff807eb4
frame pointer   = 0x10:0xff807edc
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = Idle
interrupt mask  = net  - SMP: XXX
trap number = 12
panic: page fault
mp_lock = 0002; cpuid = 0; lapic.id = 
boot() called on cpu#0

syncing disks... 

Fatal trap 12: page fault while in kernel mode
mp_lock = 0003; cpuid = 0; lapic.id = 
fault virtual address   = 0x30
fault code  = supervisor read, page not present
instruction pointer = 0x8:0xc0266e11
stack pointer   = 0x10:0xff807cc4
frame pointer   = 0x10:0xff807ccc
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = Idle
interrupt mask  = net bio  - SMP: XXX
trap number = 12
panic: page fault
mp_lock = 0003; cpuid = 0; lapic.id = 
boot() called on cpu#0
Uptime: 3m6s

#0  0xc01b19b2 in dumpsys ()
#1  0xc01b1783 in boot ()
#2  0xc01b1bdc in poweroff_wait ()
#3  0xc02cb508 in trap_fatal ()
#4  0xc02cb199 in trap_pfault ()
#5  0xc02cad37 in trap ()
#6  0xc0266e11 in acquire_lock ()
#7  0xc026af24 in softdep_update_inodeblock ()
#8  0xc0265f45 in ffs_update ()
#9  0xc026e357 in ffs_sync ()
#10 0xc01e29bf in sync ()
#11 0xc01b151e in boot ()
#12 0xc01b1bdc in poweroff_wait ()
#13 0xc02cb508 in trap_fatal ()
#14 0xc02cb199 in trap_pfault ()
#15 0xc02cad37 in trap ()
#16 0xc0204565 in dummynet_io ()
#17 0xc020991c in ip_input ()
#18 0xc0209ec7 in ipintr ()
#19 0xc02bca91 in swi_net_next ()

Copyright (c) 1992-2002 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 4.7-STABLE #7: Tue Oct 22 22:07:55 EDT 2002
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/TPC
Timecounter i8254  frequency 1193182 Hz
CPU: Pentium 4 (1996.60-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0xf24  Stepping = 4
 
Features=0x3febfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA
,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,b28,ACC
real memory  = 1073217536 (1048064K bytes)
avail memory = 1039532032 (1015168K bytes)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 - irq 0
Programming 24 pins in IOAPIC #1
Programming 24 pins in IOAPIC #2
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id:  0, version: 0x00050014, at 0xfee0
 cpu1 (AP):  apic id:  6, version: 0x00050014, at 0xfee0
 cpu2 (AP):  apic id:  1, version: 0x00050014, at 0xfee0
 cpu3 (AP):  apic id:  7, version: 0x00050014, at 0xfee0
 io0 (APIC): apic id:  2, version: 0x00178020, at 0xfec0
 io1 (APIC): apic id:  3, version: 0x00178020, at 0xfec8
 io2 (APIC): apic id:  4, version: 0x00178020, at 0xfec80400
Preloaded elf kernel kernel at 0xc0411000.
Preloaded elf module if_fxp.ko at 0xc041109c.
Preloaded elf module miibus.ko at 0xc041113c.
netsmb_dev: loaded
Pentium Pro MTRR support enabled
md0: Malloc disk
Using $PIR table, 24 

RE: panic with ipfw / dummynet in 4.7 STABLE

2002-10-22 Thread Don Bowman
 From: Don Bowman [mailto:don;sandvine.com]
 Take a 4.7 image. Using if_em (if it matters). Turn on
 bridging (em0, em2), add these ipfw rules:
 ...

Here's the same thing again with -g on.

#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:487
#1  0xc01b1783 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:316
#2  0xc01b1bdc in poweroff_wait (junk=0xc032b319, howto=-1070420497)
at /usr/src/sys/kern/kern_shutdown.c:595
#3  0xc02cb508 in trap_fatal (frame=0xff807c84, eva=48)
at /usr/src/sys/i386/i386/trap.c:974
#4  0xc02cb199 in trap_pfault (frame=0xff807c84, usermode=0, eva=48)
at /usr/src/sys/i386/i386/trap.c:867
#5  0xc02cad37 in trap (frame={tf_fs = 1714618392, tf_es = -8388592,
  tf_ds = -935985136, tf_edi = 0, tf_esi = -935921920, tf_ebp = -8356660,
  tf_isp = -8356688, tf_ebx = -1070251844, tf_edx = 1744882756,
  tf_ecx = -424745920, tf_eax = 0, tf_trapno = 12, tf_err = 0,
  tf_eip = -1071223279, tf_cs = 8, tf_eflags = 66054, tf_esp = -935921920,
  tf_ss = -935921920}) at /usr/src/sys/i386/i386/trap.c:466
#6  0xc0266e11 in acquire_lock (lk=0xc03540bc) at machine/globals.h:114
#7  0xc026af24 in softdep_update_inodeblock (ip=0xc836f700, bp=0xd49e0184, 
waitfor=0) at /usr/src/sys/ufs/ffs/ffs_softdep.c:3813
#8  0xc0265f45 in ffs_update (vp=0xe6aee440, waitfor=0)
at /usr/src/sys/ufs/ffs/ffs_inode.c:106
#9  0xc026e357 in ffs_sync (mp=0xc82af600, waitfor=2, cred=0xc2066700, 
p=0xc0382120) at /usr/src/sys/ufs/ffs/ffs_vfsops.c:1025
#10 0xc01e29bf in sync (p=0xc0382120, uap=0x0)
at /usr/src/sys/kern/vfs_syscalls.c:576
#11 0xc01b151e in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:235
#12 0xc01b1bdc in poweroff_wait (junk=0xc032b319, howto=-1070420497)
at /usr/src/sys/kern/kern_shutdown.c:595
#13 0xc02cb508 in trap_fatal (frame=0xff807e74, eva=1073741831)
at /usr/src/sys/i386/i386/trap.c:974
#14 0xc02cb199 in trap_pfault (frame=0xff807e74, usermode=0, eva=1073741831)
at /usr/src/sys/i386/i386/trap.c:867
#15 0xc02cad37 in trap (frame={tf_fs = -935985128, tf_es = -8388592,
  tf_ds = -1071644656, tf_edi = -1039546112, tf_esi = -1039546112,
  tf_ebp = -8356132, tf_isp = -8356192, tf_ebx = 1073741823,
  tf_edx = 1073741823, tf_ecx = -935640092, tf_eax = 0, tf_trapno = 12,
  tf_err = 0, tf_eip = -1071626907, tf_cs = 8, tf_eflags = 66054,
  tf_esp = 24, tf_ss = -1039697888}) at /usr/src/sys/i386/i386/trap.c:466
#16 0xc0204565 in dummynet_io (m=0xc209c900, pipe_nr=1, dir=2, fwa=0xff807f34)
at /usr/src/sys/netinet/ip_dummynet.c:1103
#17 0xc020991c in ip_input (m=0xc209c900)
at /usr/src/sys/netinet/ip_input.c:459
#18 0xc0209ec7 in ipintr () at /usr/src/sys/netinet/ip_input.c:843
#19 0xc02bca91 in swi_net_next ()
(kgdb) l
1098     * this is a dummynet rule, so we expect a O_PIPE or O_QUEUE rule
1099     */
1100    fs = locate_flowset(pipe_nr, fwa->rule);
1101    if (fs == NULL)
1102        goto dropit ;   /* this queue/pipe does not exist! */
1103    pipe = fs->pipe ;
1104    if (pipe == NULL) { /* must be a queue, try find a matching pipe */
1105        for (pipe = all_pipes; pipe && pipe->pipe_nr != fs->parent_nr;
1106             pipe = pipe->next)
1107            ;
(kgdb) p fs
$1 = (struct dn_flow_set *) 0x3fff   ILLEGAL VALUE
(kgdb) p pipe_nr
$6 = 1714618368
(kgdb) p/x pipe_nr
$7 = 0x6633
(kgdb) p/x fwa
$8 = 0xff807f34
(kgdb) p/x fwa->rule
$9 = 0xc83b43c0
(kgdb) p/x *fwa->rule
$10 = {next = 0xc83c2900, next_rule = 0x0, act_ofs = 0x0, cmd_len = 0x4, 
  rulenum = 0x154, set = 0x0, _pad = 0x0, pcnt = 0x1, bcnt = 0x20, 
  timestamp = 0x3db60b9b, cmd = {{opcode = 0x29, len = 0x2, arg1 = 0x0}}}
(kgdb) 


--don ([EMAIL PROTECTED] www.sandvine.com)


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-net in the body of the message



dynamic load of em/fxp/bge

2002-10-16 Thread Don Bowman


I am trying to load the if_em, if_fxp, if_bge drivers
via /boot/loader.conf.

I've added

if_fxp_load=YES
if_bge_load=YES
if_em_load=YES

The problem is that the bge driver doesn't load. It will
if I manually load it after startup with kldload. The issue
seems to be a dependency on miibus: both fxp and bge want
to load it, and bge gets an error that it's already loaded.

I tried putting 'miibus_load=YES' in loader.conf, but the
same effect is seen.

I've tried from the boot prompt doing an explicit load of these
manually in each order, but to no avail.

As a work-around, I've placed a kldload if_bge in rc.network
before the 'ifconfig -l'.

Any suggestions on why the fxp/bge don't play nice when loaded
automatically, but will work if run manually? Is there a timing
issue, where the fxp hasn't initialised its miibus yet?

I have:

fxp0
fxp1
bge0

in this particular machine. The bge will get miibus2 (eventually),
leaving fxp0 to have miibus0, fxp1 to have miibus1 I think.

Suggestions?

--don




RE: ENOBUFS

2002-10-16 Thread Don Bowman

Sam Leffler wrote:
 Try my port of the netbsd kttcp kernel module.  You can find it at
 
 http://www.freebsd.org/~sam

this seems to use some things from netbsd like
so_rcv.sb_lastrecord and SBLASTRECORDCHK/SBLASTMBUFCHK.
Is there something else I need to apply to build it on
freebsd -STABLE?

--don ([EMAIL PROTECTED] www.sandvine.com)




intel dual gigabit, 82546EB support

2002-10-06 Thread Don Bowman

Is anyone using the intel dual gigabit 82546EB? Does it appear as two
separate em devices, eg em0 and em1?

http://www.intel.com/network/connectivity/products/pro1000mt_dual_server_adapter.htm
is a card that has it, also some of the newer supermicro motherboards
(and probably others) incorporate this device.

The em driver does have support for it, but I can't see how
it would make two interfaces from it?




RE: new zero copy sockets patches available

2002-05-18 Thread Don Bowman


 Andrew Gallatin writes:
 Kenneth D. Merry writes:
   
   I have released a new set of zero copy sockets patches, against -current
   from today (May 17th, 2002).
 
 Hi Ken,
 
 I'm glad to see that you're still maintaining this!
 
 Assuming the mutex issues get sorted out, what do you think the odds
 are of getting this into the tree?  The only possible issue I see is
 with the tigon firmware.   Is the firmware you're using of the same
 vintage as what's in the tree now?  Does it contain all the same
 fixes?

As a related question, will this work with the broadcom gigabit (bge)
driver, which is the Tigon III? If not, what would it take to get
it working?

