RE: Sending Ethernet frames
From: [EMAIL PROTECTED] On Behalf Of Patrik Arlos

> Hi, I'm trying to send 'raw' Ethernet frames. However, I have not found any examples of how to do this in BSD. Is it possible to open an 'ethernet' socket, similar to an AF_INET one? I need to be able to control the destination address and the type/len field in the Ethernet header. In Linux it is possible to open a SOCK_RAW and bind it to a particular interface. I've tried to use sockaddr_dl, but in this case bind dies with error 22. Any way to do this?

You can chmod +w on /dev/bpf*, then open a bpf device and write to it.

___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to [EMAIL PROTECTED]
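A minimal sketch of the frame such a write would carry, assuming the caller wants to choose the destination address and the type/len field itself. The helper name build_eth_frame and the constants are illustrative, not part of any FreeBSD API. On FreeBSD the resulting buffer would be passed to write(2) on a bpf descriptor that has been bound to an interface with the BIOCSETIF ioctl; that part is left out here so the block stays portable.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDR_LEN 6
#define ETH_HDR_LEN  14
#define ETH_MIN_LEN  60		/* minimum frame length, excluding the 4-byte FCS */

/*
 * Build an Ethernet frame in buf: destination MAC, source MAC,
 * 16-bit type/len field in network byte order, then the payload.
 * Frames shorter than ETH_MIN_LEN are zero-padded.
 * Returns the frame length, or 0 if buf is too small.
 * (Hypothetical helper, for illustration only.)
 */
size_t
build_eth_frame(uint8_t *buf, size_t buflen,
    const uint8_t dst[ETH_ADDR_LEN], const uint8_t src[ETH_ADDR_LEN],
    uint16_t type_len, const uint8_t *payload, size_t payload_len)
{
	size_t len = ETH_HDR_LEN + payload_len;
	size_t padded = len < ETH_MIN_LEN ? ETH_MIN_LEN : len;

	if (padded > buflen)
		return 0;
	memset(buf, 0, padded);
	memcpy(buf, dst, ETH_ADDR_LEN);
	memcpy(buf + ETH_ADDR_LEN, src, ETH_ADDR_LEN);
	buf[12] = (uint8_t)(type_len >> 8);
	buf[13] = (uint8_t)(type_len & 0xff);
	if (payload_len > 0)
		memcpy(buf + ETH_HDR_LEN, payload, payload_len);
	return padded;
}
```

By default bpf may overwrite the source address with the interface's own; the BIOCSHDRCMPLT ioctl is the usual way to keep the one supplied in the buffer.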
RE: Underutilisation of CPU --- am I PCI bus bandwidth limited?
From: [EMAIL PROTECTED]

> ... This is rather confusing, as I cannot tell if the system is IO bound or CPU bound. Certainly I would not have expected the 133/64 PCI bus to be saturated given that peak throughput is around 550Mbit/s with 1024-byte packets. (Such a low figure is not unexpected given there are 2 syscalls per packet).

You may find you have not loaned the em driver enough buffers (max_rxd, max_txd). You may also want to use device polling, poll on idle, and play with the polling parameters. In this config I have achieved ~2Gbps of throughput with these large packets, so I know it can be done.
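The numbers quoted above can be checked with a little arithmetic. This is a rough sketch that ignores bus protocol overhead and descriptor traffic; the function names are made up for illustration.

```c
#include <stdint.h>

/* Theoretical peak of a parallel PCI/PCI-X bus, in bits per second. */
uint64_t
bus_peak_bps(uint32_t width_bits, uint64_t clock_hz)
{
	return (uint64_t)width_bits * clock_hz;
}

/* Packets per second needed to carry bps at pkt_bytes per packet. */
uint64_t
packets_per_second(uint64_t bps, uint32_t pkt_bytes)
{
	return bps / ((uint64_t)pkt_bytes * 8);
}
```

bus_peak_bps(64, 133000000) gives about 8.5 Gbit/s, far above the observed 550 Mbit/s, and packets_per_second(550000000, 1024) gives roughly 67,000 packets per second, or about 134,000 system calls per second at two per packet. On those figures the bus itself looks like an unlikely bottleneck, which fits the advice to look at driver buffers and polling first.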
RE: packet generator
From: Andrew Gallatin [mailto:[EMAIL PROTECTED]

> Andrew Gallatin writes:
>> xmit routine was called 683441 times. This means that the queue was only a little over two packets deep on average, and vmstat shows idle time. I've tried piping additional packets to nghook mx0:orphans input, but that does not seem to increase the queue depth.
>
> The problem here seems to be that rather than just slapping the packets onto the driver's queue, ng_source passes the mbuf down to more of netgraph, where there is at least one spinlock, and the driver's ifq lock is taken and released a zillion times by ether_output_frame(), etc.
>
> A quick hack (appended) to just slap the mbufs onto the if_snd queue gets me from ~410Kpps to 1020Kpps. I also see very deep queues with this (because I'm slamming 4K pkts onto the queue at once..). This is nearly identical to the linux pktgen figure on the same hardware, which makes me feel comfortable that there is a lot of headroom in the driver/firmware API and I'm not botching something in the FreeBSD driver.
>
> BTW, did you see your 800Kpps on 4.x or 5.x? If it was 4.x, what do you see on 5.x if you still have the same setup handy?
>
> Thanks,

800Kpps was on 4.7, on a dual 2.8GHz Xeon with 100MHz PCI-X on em. I will try 5.3.

--don
RE: dyn buckets
From: [EMAIL PROTECTED]

> I have a firewall running 4.10 that handles around 20mbits/sec of traffic and has around 500 ipfw rules. Lately I've noticed that net.inet.ip.fw.curr_dyn_buckets seems to be maxing out. I've increased net.inet.ip.fw.dyn_buckets a few times, but they seem to max out each time. Is there any problem with increasing net.inet.ip.fw.dyn_buckets far beyond the default? (I'm at 2048 now)

I use:

net.inet.ip.fw.dyn_buckets=16384
net.inet.ip.fw.dyn_syn_lifetime=5
net.inet.ip.fw.dyn_max=32000
RE: packet generator
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andrew Gallatin
Sent: September 10, 2004 19:08 PM
To: [EMAIL PROTECTED]
Subject: packet generator

> Does anybody have a free, in-kernel tool to generate packets quickly and send them out a particular ethernet interface on FreeBSD? Something similar to pktgen on linux?
>
> I'm trying to exercise just the send side of a programmable firmware based NIC. The receive side of the NIC firmware is not yet written, but I want to get started tuning and shaking the bugs out of the send side while the firmware author does the receive path. The packets just get dropped on the floor by the NIC, so it's a good way to test the interface..

ng_source is a netgraph module we wrote and contributed. It can transmit ~800Kpps on a PCI-X system. The code is in src/sys/netgraph/ng_source.c. I drive it with a tcl library that can create arbitrary packets with an object-oriented model; let me know if you'd like to try that.

--don
RE: device polling takes more CPU hits??
From: James [mailto:[EMAIL PROTECTED]

> Hi all, ... Any idea why device polling is kind of having... negative impact? Is this b/c I have SMP compiled on a box that really doesn't have two cpu's?? Is SMP+APIC_IO support even required for HTT use?

I would post the output of 'sysctl kern.polling'; it's likely some of the tuning there is insufficient. What do you have HZ set to (sysctl kern.clockrate)? I would probably have it set to ~1000. You will want 'machdep.cpu_idle_hlt=1'.

--don
RE: device polling takes more CPU hits??
From: James [mailto:[EMAIL PROTECTED]

> Hi Don,
>
> [EMAIL PROTECTED] sysctl kern.clockrate
> kern.clockrate: { hz = 4000, tick = 250, tickadj = 1, profhz = 1024, stathz = 128 }

That's a pretty high HZ. Here's what I have:

kern.clockrate: { hz = 2500, tick = 400, tickadj = 1, profhz = 1024, stathz = 128 }

I have the same box spec as you, only with em (bge doesn't support polling, but it has its own interrupt coalescer that works; I think you can tune that in if_bge.h, there are some comments). I'm doing ~800Kpps with polling. My polling params are below.

> [EMAIL PROTECTED] sysctl kern.polling
> kern.polling.burst: 150
> kern.polling.each_burst: 5
> kern.polling.burst_max: 150
> kern.polling.idle_poll: 1
> kern.polling.poll_in_trap: 1
> kern.polling.user_frac: 50
> kern.polling.reg_frac: 20
> kern.polling.short_ticks: 4909
> kern.polling.lost_polls: 11464
> kern.polling.pending_polls: 0
> kern.polling.residual_burst: 0
> kern.polling.handlers: 1
> kern.polling.enable: 1
> kern.polling.phase: 0
> kern.polling.suspect: 10249
> kern.polling.stalled: 3
>
> [EMAIL PROTECTED] sysctl machdep.cpu_idle_hlt
> machdep.cpu_idle_hlt: 1

kern.polling.burst: 1000
kern.polling.each_burst: 80
kern.polling.burst_max: 1000
kern.polling.idle_poll: 1
kern.polling.poll_in_trap: 0
kern.polling.user_frac: 5
kern.polling.reg_frac: 120
kern.polling.short_ticks: 29
kern.polling.lost_polls: 55004
kern.polling.pending_polls: 0
kern.polling.residual_burst: 0
kern.polling.handlers: 4
kern.polling.enable: 1
kern.polling.phase: 0
kern.polling.suspect: 50690
kern.polling.stalled: 25
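A back-of-the-envelope way to compare the two sets of parameters above. This sketch assumes that packets are only pulled on clock ticks (no idle or trap polling) and that at most burst_max packets are taken from an interface per tick; real behaviour is more subtle, and the function names are illustrative.

```c
#include <stdint.h>

/* Length of one clock tick in microseconds (the "tick" field of kern.clockrate). */
uint32_t
tick_usec(uint32_t hz)
{
	return 1000000u / hz;
}

/* Rough per-interface receive ceiling: at most burst_max packets per tick. */
uint64_t
polling_ceiling_pps(uint32_t hz, uint32_t burst_max)
{
	return (uint64_t)hz * burst_max;
}
```

Under those assumptions, hz = 4000 with burst_max = 150 caps an interface at about 600Kpps, while hz = 2500 with burst_max = 1000 allows up to 2.5Mpps, so the lower HZ does not by itself mean lower throughput.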
RE: device polling takes more CPU hits??
From: Marko Zec [mailto:[EMAIL PROTECTED]

> On Monday 26 July 2004 17:35, Don Bowman wrote:
>> [EMAIL PROTECTED] sysctl machdep.cpu_idle_hlt
>> machdep.cpu_idle_hlt: 1
>
> At least on -STABLE, machdep.cpu_idle_hlt setting is ignored / irrelevant when both kern.polling.enable and kern.polling.idle_poll are set.

Hmm, this is more interesting. Since you are SMP, and using POLLING, I assume you did like me and commented out the !POLLING in SMP #error statement. You definitely want the halt on idle. The polling in idle doesn't work anyway, so try disabling it.

James, not sure if you saw the rest of my email with my params:

kern.polling.burst: 1000
kern.polling.each_burst: 80
kern.polling.burst_max: 1000
kern.polling.idle_poll: 0
kern.polling.poll_in_trap: 0
kern.polling.user_frac: 5
kern.polling.reg_frac: 120
kern.polling.short_ticks: 29
kern.polling.lost_polls: 55004
kern.polling.pending_polls: 0
kern.polling.residual_burst: 0
kern.polling.handlers: 4
kern.polling.enable: 1
kern.polling.phase: 0
kern.polling.suspect: 50690
kern.polling.stalled: 25
RE: device polling takes more CPU hits??
From: James [mailto:[EMAIL PROTECTED]

> I have two boxes behind em0 that I can use to generate 250kpps to another vlan within em0 card as a test, so that bge0 is not involved in the stress test. Even when doing so, CPU load climbs higher with device polling turned on. Opened up systat, etc to check the interrupts, and em0 is generating 0 interrupts with device polling on (as obvious), but general interrupt load climbs rock high.. so I don't know what's causing it to climb. Cleared the firewall rules as well as a test... no difference :(
>
> Oh also, just FYI, each vlan interface has link0 set, since em(4) supports hardware 802.1q tag/detagging.

The CPU time spent during the 'polling' is charged to interrupt, even though it occurs during softclock. That's why you see 0 interrupts, but high CPU usage in interrupt. Did you try lowering the 'register' access?

--don
RE: device polling takes more CPU hits??
From: Luigi Rizzo [mailto:[EMAIL PROTECTED]

> On Mon, Jul 26, 2004 at 01:18:46PM -0700, Kelly Yancey wrote:
>> ... Out of curiosity, what sort of testing did you do to arrive at these settings? I did some testing a while back with a SmartBits box pumping packets through a FreeBSD 2.8Ghz box configured to route between two em gigabit interfaces; I found that changing the burst_max and each_burst parameters had almost no effect on throughput (maximum 1% difference).
>
> fast boxes are pci-bus limited, not CPU limited(*) so changing the burst size (which basically amortizes some CPU costs) has little if any effect.

The PCI-X bus will probably be 64-bit 133MHz in this case, so the limit moves up to the P64H2 hub for large packets and to the CPU for small packets. Polling becomes quite critical to prevent livelock.

--don
Question on SOCK_RAW, implement a bpf-other host tee
I'm trying to implement a 'tee' which reads from bpf, and sends matching packets to another layer-2 adjacent host.

I'm doing this with SOCK_RAW to try and write the packet back out. The 'sendto' passes, but I don't see a packet anywhere.

Am I correct that I can hand an arbitrarily crafted IP packet into sendto, and the stack will write the ethernet header on, pick an interface, etc, based on the address in the sendto?

I have swapped the ip_len, ip_off fields.

The program I have is below. This is on 4.7. The handler gets called, the packet there looks correct, no error on any system call, yet no output :(

Suggestions?

/*
 * Copyright 2004 Sandvine Incorporated. All rights reserved
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <err.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/in_systm.h>
#include <netinet/ip.h>
#include <arpa/inet.h>
#include <pcap.h>

void
usage(const char *name)
{
    fprintf(stderr,
        "Usage: %s [-I input_interface] [-O output_interface] "
        "[-i output_ip(arp for mac)] [-v]\n", name);
    exit(1);
}

typedef struct {
    int s;                      /* raw socket used for output */
    struct in_addr output_ip;   /* next-hop host to send to */
} context;

static int verbose;

/* Called by pcap_loop() for every captured packet that matches the filter. */
static void
handler(unsigned char *ct, const struct pcap_pkthdr *hdr,
    const unsigned char *pkt)
{
    struct ip *ip = (struct ip *)(pkt + 14);    /* skip the Ethernet header */
    context *ctxt = (context *)ct;
    struct sockaddr_in to;

    memset(&to, 0, sizeof(to));
    to.sin_family = AF_INET;
    to.sin_addr = ctxt->output_ip;
    if (verbose) {
        fprintf(stderr, "Send %d byte packet\n", hdr->len);
    }
    /* byte-swap ip_len and ip_off before handing the header to the stack */
    ip->ip_len = htons(ip->ip_len);
    ip->ip_off = htons(ip->ip_off);
    if (sendto(ctxt->s, ip, hdr->len - 14, 0,
        (struct sockaddr *)&to, sizeof(to)) != (hdr->len - 14)) {
        err(1, "sendto");
    }
}

static int
doit(const char *input_interface, const char *output_interface,
    struct in_addr output_ip)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    pcap_t *in_d, *out_d;
    context ctxt;
    int on = 1;
    struct bpf_program fp;

    in_d = pcap_open_live((char *)input_interface, 1600, 1, 20, errbuf);
    if (in_d == 0) {
        errx(1, "open of %s failed: %s", input_interface, errbuf);
    }
    ctxt.output_ip.s_addr = htonl(output_ip.s_addr);
    ctxt.s = socket(PF_INET, SOCK_RAW, IPPROTO_RAW);
    if (ctxt.s < 0)
        errx(1, "can't open raw socket");
    if (setsockopt(ctxt.s, IPPROTO_IP, IP_HDRINCL,
        (char *)&on, sizeof(on)) < 0) {
        err(1, "setsockopt");
    }
    memset(&fp, 0, sizeof(fp));
    if (pcap_compile(in_d, &fp, "ip", 0, 0xfff0) < 0) {
        errx(1, "failed to compile: %s", pcap_geterr(in_d));
    }
    if (pcap_setfilter(in_d, &fp) < 0) {
        errx(1, "failed to set filter");
    }
    pcap_loop(in_d, -1, handler, (unsigned char *)&ctxt);
    return 0;
}

int
main(int argc, char *argv[])
{
    int ch;
    char *input_interface = "ipfw0";
    char *output_interface = "em2";
    struct in_addr output_ip;

    output_ip.s_addr = 0;
    while ((ch = getopt(argc, argv, "I:O:i:vh?")) != -1) {
        switch (ch) {
        case 'I':
            input_interface = optarg;
            break;
        case 'O':
            output_interface = optarg;
            break;
        case 'i':
            /* inet_aton() returns 0 when the address is invalid */
            if (inet_aton(optarg, &output_ip) == 0) {
                errx(1, "unknown ip %s", optarg);
            }
            break;
        case 'v':
            verbose = 1;
            break;
        case 'h':
        case '?':
        default:
            usage(argv[0]);
        }
    }
    if (verbose)
        fprintf(stderr, "%s->%s(%s)\n",
            input_interface, output_interface, inet_ntoa(output_ip));
    return doit(input_interface, output_interface, output_ip);
}
RE: Question on SOCK_RAW, implement a bpf-other host tee
From: Don Bowman [mailto:[EMAIL PROTECTED]

> I'm trying to implement a 'tee' which reads from bpf, and sends matching packets to another layer-2 adjacent host.

Sorry to follow up my own post, but...

More specifically, it appears the packet does try and transmit, but the destination MAC is (uninitialised?) somewhat random, different on each packet, not legal. I can capture it on the correct output interface with tcpdump. The interface type is xl.

Shouldn't the stack ARP for the destination in my 'sendto', and fill in the ether header? The ether-source is filled in, presumably by the driver.
RE: Looking for a Broadcom BCM5704 datasheet
From: Ruslan Ermilov [mailto:[EMAIL PROTECTED]

> On Fri, May 14, 2004 at 09:40:07AM -0700, Paul Saab wrote:
>> Ruslan Ermilov wrote:
>>> Dear networkers, I'm looking for a Broadcom BCM5704[S] technical datasheet. If anyone has such a beast, or knows how one could obtain it, please let me know.
>>
>> As john pointed out, you can only get this under NDA from broadcom. What exactly are you trying to solve? I have the latest documentation so I may be able to help you, but I can't give you the docs.
>
> We hoped that with a dual-channel NIC we would be able to just move the received frame from one port for TX on another port, to overcome the 32-bit PCI bus speed limitation and get better throughput with GigE. Bill Paul already explained in private that they are actually two distinct SRAMs, and the operation we needed is not supported (without PCI involved).

I believe it is 64-bit 133MHz PCI-X.
RE: Can't compile Intel gigabit em driver
From: Gary Corcoran [mailto:[EMAIL PROTECTED]

> Quick background: I'm running FreeBSD 4.8-Release and have a new Intel Pro/1000 MT NIC I want to install. While there is a man page for the em driver which should be usable, there is no em listed in LINT or GENERIC. Nor is the source code for if_em.c anywhere on my system. So I downloaded the FreeBSD driver source from Intel, which is listed as being for FreeBSD 4.7. It's their latest code.

em is in the standard source tree for 4.8, under src/sys/dev/em.

Add 'device em' to your kernel config to compile it in, or load the module at boot by adding 'if_em_load="YES"' to loader.conf.

If you installed from the 4.8 CD, the module will be present as /modules/if_em.ko. You can type 'kldload if_em' to test that theory: it will load the driver, and the interface should then show up in 'ifconfig'.

--don
RE: Stupid question about managed switches
From: Marc G. Fournier [mailto:[EMAIL PROTECTED]

> On Thu, 8 Apr 2004, Don Bowman wrote:
>> From: Marc G. Fournier [mailto:[EMAIL PROTECTED]
>>> Please excuse this, but my experience with them is zilch ... am going with the HP Procurve 2826(?) Layer2/Layer3 switch, as was suggested, but I'm curious as to how they work ...
>>>
>>> For instance, I know when I setup a router, I have an IN IP and an OUT IP configured ... but, with a managed switch, what do I have? For instance, right now, I have a default gateway on the providers switch of 200.46.204.1 ... and my servers are .2, .3, .4 and .5 ... if I put a managed switch, vs the unmanaged we have now, between the providers switch and the servers, does my default route then change to be the switch itself? Or is the 'login part' of the switch thought of the same way as adding just another server to the network, for connectivity purposes?
>>>
>>> As I said, stupid question, but for someone who's never played with a managed switch before ... :( Thanks ..
>>
>> In layer-2 mode, it's nothing but a hub. It doesn't change your default route or anything. Pretend it's not there. You will need a router connected to this switch, and its IP will remain your default route (likely).
>
> 'k, but I want to use the managed aspect of it to be able to hard code the port rates (ie. to fix this full-duplex issue initially) as well as be able to access SNMP so that I can do bandwidth monitoring of external traffic ... I have SNMP setup on the FreeBSD boxes right now so that I can see network load per server, but I want to be able to isolate the 'external' traffic from 'internal', by monitoring the specific port that is connected to the providers switch ...
>
> So, in both cases, I need to assign an IP somewhere, correct?

Assign the switch an IP address on the same subnet as the router port it's connected to, and on the same subnet as the PCs.

The procurve has a really nice serial interface that auto-detects the baud rate. Slap a cable in, hit space twice, and it's obvious from there. Assign it a management IP and route, and an SNMP community.

In the switch, you can create complete isolation using vlans. This makes complete virtual switches. Although you can assign a management IP on each vlan, I never bother. It doesn't sound like this is what you are looking for.

Also on this management interface (available via telnet after you set the IP) you can set the params for each port (duplex, speed). You can also connect a browser to it to see some basic stats etc.

Now run something like the 'mrtg' cfgmaker against the management IP of the switch, and you'll have a chart per port.

--don
RE: Stupid question about managed switches
From: Marc G. Fournier [mailto:[EMAIL PROTECTED]

> Please excuse this, but my experience with them is zilch ... am going with the HP Procurve 2826(?) Layer2/Layer3 switch, as was suggested, but I'm curious as to how they work ...
>
> For instance, I know when I setup a router, I have an IN IP and an OUT IP configured ... but, with a managed switch, what do I have? For instance, right now, I have a default gateway on the providers switch of 200.46.204.1 ... and my servers are .2, .3, .4 and .5 ... if I put a managed switch, vs the unmanaged we have now, between the providers switch and the servers, does my default route then change to be the switch itself? Or is the 'login part' of the switch thought of the same way as adding just another server to the network, for connectivity purposes?
>
> As I said, stupid question, but for someone who's never played with a managed switch before ... :( Thanks ..

In layer-2 mode, it's nothing but a hub. It doesn't change your default route or anything. Pretend it's not there. You will need a router connected to this switch, and its IP will remain your default route (likely).
RE: FIN_WAIT_[1,2] and LAST_ACK
From: Brandon Erhart [mailto:[EMAIL PROTECTED]

> Hello everyone,
>
> I am writing a network application that mirrors a given website (such as a souped-up wget). I use a lot of FDs, and was getting connect() errors when I would run out of local_ip:local_port tuples. I lowered the MSL so that TIME_WAIT would time out very quickly (yes, I know, this is bad, but I'm going for sheer speed here), and it alleviated the problem a bit.
>
> However, I have run into a new problem. I am getting a good number of control blocks stuck in FIN_WAIT_1, FIN_WAIT_2 or LAST_ACK that stick around for a long while. I have been unable to find much information on a timeout for these states.
>
> I came across a small patch that modified tcp_timer.c in /usr/src/sys/netinet. It changed line #484 (in FreeBSD 4.9-REL) from:
>
>     if (tp->t_state != TCPS_TIME_WAIT
>
> to
>
>     if (tp->t_state < FIN_WAIT_2
>
> I also tried changing that to .. <= FIN_WAIT_2 ..
>
> However, I still end up with quite a few stuck in FIN_WAIT_1, FIN_WAIT_2 or LAST_ACK after the program exits (and whilst the program is running of course). They don't seem to time out in the same interval that TIME_WAIT does.
>
> Any ideas? Did I modify the right piece of code? I was told to post here as you all would more than likely know!

Perhaps you want to lower the net.inet.tcp.msl sysctl?
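To put rough numbers on the tuple exhaustion described above, here is a small sketch. It assumes that the client closes first (so every connection leaves a local port in TIME_WAIT), that all connections go to the same remote address and port, and that net.inet.tcp.msl is given in milliseconds. The port range is a parameter; 1024 to 5000 is used below only as an example value. It says nothing about FIN_WAIT_1, FIN_WAIT_2 or LAST_ACK, which are governed by other timers.

```c
#include <stdint.h>

/* TIME_WAIT lasts 2 * MSL. */
uint32_t
time_wait_ms(uint32_t msl_ms)
{
	return 2 * msl_ms;
}

/*
 * Sustained new connections per second to a single remote ip:port
 * before every local port in [first_port, last_port] is held in TIME_WAIT.
 * Returns 0 when msl_ms is 0, where this simple model does not apply.
 */
uint32_t
max_connect_rate(uint32_t first_port, uint32_t last_port, uint32_t msl_ms)
{
	uint32_t ports = last_port - first_port + 1;
	uint32_t tw = time_wait_ms(msl_ms);

	if (tw == 0)
		return 0;
	return (uint32_t)(((uint64_t)ports * 1000) / tw);
}
```

With an MSL of 30000 ms and that example range, the ceiling is about 66 connections per second; dropping the MSL to 1000 ms raises it to almost 2000, which is why lowering the MSL helps with connect() failures. Widening the local port range would have a similar effect.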
RE: Odd network issue ... *very* slow scp between two servers
From: Marc G. Fournier [mailto:[EMAIL PROTECTED]

> On Sat, 6 Mar 2004, Tim Wilde wrote:
>> On Sat, 6 Mar 2004, Marc G. Fournier wrote:
>>> I have two servers on the same network switch, sitting one on top of the other ... one is running an em (Dual-Xeon 2.4Ghz) device, the other an fxp (Dual-PIII 1.3Ghz) device ...
>>
>> Is it a Cisco Catalyst switch? If so, you need to switch the em's to autoselect, on both the server and switch end. For some reason, the em driver will not properly lock down its rate when talking to a Cisco Catalyst switch. At least, I had an identical problem with em's talking to a Catalyst 2950 and that was the fix I came up with. Give it a try and see how your results go.
>
> Note that forcing it to 100baseT half-duplex (or 10baseT/UTP half-duplex) corrects the problem ... turns out it is only in full-duplex mode that it's hosed ...

Actually, this is normal behaviour according to the 802.3u spec. If a device in 'auto' mode is connected to one that is forced to 100FDX, the auto one will fall back to 100HDX: a forced port does not take part in negotiation, so the auto side can detect the speed but not the duplex.

For example, see the HP FAQ: http://www.hp.com/rnd/support/faqs/2700.htm#question6

http://roger.friendex.net/duplex_mismatch.htm has a nice table of this.

--don
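The rule described above can be written down as a small decision function. This is a simplified sketch: it assumes both ends support full duplex and have already agreed on a speed, and the type and function names are invented for illustration.

```c
enum duplex { DUPLEX_HALF, DUPLEX_FULL };

struct port_cfg {
	int autoneg;		/* non-zero: autoselect */
	enum duplex forced;	/* only used when autoneg is 0 */
};

/* Duplex that 'self' ends up using when linked to 'peer'. */
enum duplex
resolve_duplex(struct port_cfg self, struct port_cfg peer)
{
	if (!self.autoneg)
		return self.forced;	/* forced ports use what they were told */
	if (peer.autoneg)
		return DUPLEX_FULL;	/* both negotiate the best common mode */
	return DUPLEX_HALF;		/* peer is silent: speed is sensed, duplex defaults to half */
}

/* Non-zero when the two ends disagree about duplex. */
int
duplex_mismatch(struct port_cfg a, struct port_cfg b)
{
	return resolve_duplex(a, b) != resolve_duplex(b, a);
}
```

In this model, auto against forced-full is the only mismatching combination, which matches the observation in the thread that forcing half duplex makes the problem go away.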
RE: DEVICE_POLLING with SMP
From: Kevin Day [mailto:[EMAIL PROTECTED]

> On Jan 29, 2004, at 1:04 AM, Vlad Galu wrote:
>> I see no reason for it. Having to switch between multiple kernel threads to handle polling may bring too much overhead.
>
> Would that really be happening though? If polling is happening in the idle loop, extra overhead doesn't really matter all that much, the CPU is idle, and I can't imagine it being any worse during a livelock inducing amount of traffic. If it's polling during any other time, the code is exactly the same between the UP and SMP case, and I can't imagine the overhead being all THAT much worse, would it?
>
> My primary goal with it is to stop thrashing context switches when I've got a system acting as a router with 8 network interfaces on it. Even with network card interrupt coalescing there is a whole lot of interrupt activity going on, which polling seems to reduce noticeably. I'm also very interested in polling's ability to more gracefully handle extremely heavy network traffic without getting into livelock, which may be worth it to some people prone to DoS activity when they have a whole lot of bandwidth to deal with.
>
> I'd be willing to chip in a few bucks for development time if anyone wants to make the changes to try it out. It didn't look that difficult, but my time is pretty booked right now.
>
> -- Kevin

On 4.X, you can simply comment out the check for device polling and MP operation. The system will then work fine. It will not, however, poll on idle. We are running this way and it works very well.

Polling on idle for MP requires a bit more work. If you do that work, you will have some locking issues to solve.

I have not tried this on current yet.

--don
RE: crossover between gigE?
From: Luigi Rizzo [mailto:[EMAIL PROTECTED]

> On Sat, Dec 20, 2003 at 07:11:22AM -0800, Alfred Perlstein wrote:
>> Any suggestion of the kind of cable one should look for at Frys to run between two gigE cards (intel em0) to function as a crossover?
>
> A straight cable with all 4 pairs wired will work. GigE (and many modern 100Mbit switches) have auto polarity detection.
>
> cheers
> luigi

One caveat on that: if you force any of the phy parameters (e.g. speed, duplex), this defeats the auto polarity (MDI/MDI-X) detection on em, bge, and maybe others. I.e., to use the technique above you need to have autoselect enabled for duplex and speed.

--don
RE: how to saturate 100Mbit
From: DrumFire [mailto:[EMAIL PROTECTED]

> dd if=holey-file of=/dev/null bs=10m
>
> I've got about 30% of CPU load for the server (P-133) and less than 35mbit/s on wire. Also you can try to dump traffic with tcpdump and send it with /usr/ports/net/tcpreplay
>
> I'm trying to send 100Mbit/s for 5-6 minutes with Ethernet frame size at 64 bytes, but I need very good hardware to make this.

There is a netgraph module called ng_source which can do this. It can achieve about 400Kpps or 1Gbps on a xeon system with a gigabit card, so it should be able to saturate an fxp.
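For reference, the packet rate needed to fill a link with minimum-size frames follows from the frame size plus the fixed per-frame overhead on the wire. A short sketch, with an illustrative function name:

```c
#include <stdint.h>

/* 8 bytes of preamble/SFD plus 12 bytes of inter-frame gap per frame. */
#define ETH_WIRE_OVERHEAD 20

/*
 * Frames per second at line rate, where frame_bytes is the Ethernet
 * frame size including the 4-byte FCS (64 for a minimum-size frame).
 */
uint64_t
line_rate_pps(uint64_t link_bps, uint32_t frame_bytes)
{
	return link_bps / ((uint64_t)(frame_bytes + ETH_WIRE_OVERHEAD) * 8);
}
```

Saturating 100Mbit/s with 64-byte frames therefore takes about 148,800 frames per second, well under the ~400Kpps quoted above, while gigabit line rate at that size is close to 1.49Mpps.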
RE: Two ISP connections
From: Andrea Venturoli [mailto:[EMAIL PROTECTED]

> ** Reply to note from Barney Wolff [EMAIL PROTECTED] Wed, 10 Dec 2003 11:39:00 -0500
>> I don't know of anything published that does this, but it's easy to write a perl or shell script that pings the router at the adsl isp and does the necessary things when it disappears and reappears.
>
> Mmh, only problem is one of the ISPs is famous for blocking ICMP as a whole, so no pings work. I haven't tried this particular line yet, but I may need to use some other protocol.

See the lft port (layer 4 traceroute): http://www.mainnerve.com/lft/

You can use this to get an ICMP response (albeit not an echo) from your ISP. [You can't really block ICMP 'would fragment'; it would break PMTU discovery.]

--don
high number of pcb's, core dump in sysctl -a
net.inet.tcp.pcbcount: 76043

When I do 'sysctl net.inet.tcp', I get a core dump while it is trying to read 'net.inet.tcp.pcblist'. Is there some built-in limit to the size of a sysctl result?

--don
RE: Giga-bit switches
From: Peter J. Blok [mailto:[EMAIL PROTECTED]

> Hi,
>
> This is just a warning. I am setting up a Giga-bit network trying to use Jumbo frames. For NICs the ability to do larger frames is usually listed, but that doesn't seem to be the case for switches.
>
> I have bought a Netgear GS104 switch, which does list a buffer per port of 12K. However, according to Netgear support, it is not supported and not working. They just say that there is no mention of Jumbo frame support, therefore it is not supported. Even on the more expensive Netgear switches it is not listed, so it is a trial-and-error policy.
>
> My understanding is that the Giga-bit definition includes large frame support and if you claim to have a Giga-bit switch you should support large frames, unless specifically excluded.

Jumbo frames are not part of the standard, and are in general poorly supported.

Some cisco devices do 'mini giants', e.g. ~1600 mtu. Other cisco devices will support 9K frames, but at the expense of lowering the overall buffering (all frames are assumed to be 9K now, so only ~1/4 of the packets may be buffered). For cisco devices, the support will be on a linecard-by-linecard basis.
RE: I would like to tcpdump and get all the packets...
From: Petri Helenius [mailto:[EMAIL PROTECTED]

> Bruce M Simpson wrote:
>> Er, if you check this URL: http://www.freebsd.org/cgi/cvsweb.cgi/src/contrib/tcpdump/CHANGES
>> Shurely you mean tcpdump 3.7.2, which is already imported (by fenner, with additional hacks)?
>
> I mean libpcap, which tcpdump also uses, if I'm not mistaken. Look in contrib/libpcap
>
> Pete

I found that increasing the bpf buffer size in libpcap to 256K from the default of 4K made a tremendous difference.

--don
RE: TCP socket shutdown race condition
From: Mike Silbersack [mailto:[EMAIL PROTECTED]

> On Fri, 1 Aug 2003, Scot Loach wrote:
>> Earlier this week one of our FreeBSD 4.7 boxes panic'd. I've posted the stack trace at the end of this message. Using google, I've found several references to this panic over the past three years, but it seems it's never been taken to root cause.
>>
>> The box crashes because the cr_uidinfo pointer in the so_cred structure is null. However, on closer inspection the so_cred structure is corrupted (cr_ref=3279453304 for example), so I'm guessing it has already been freed. Looking closer at the socket, I see that the SS_NOFDREF flag is set, which supports my theory. The tcpcb is in the CLOSED state, and has the SENTFIN flag set.
>
> About how many concurrent connections are you pushing this machine to? There's an unfortunate problem with uidinfo in 4.x:
>
>     struct uidinfo {
>         LIST_ENTRY(uidinfo) ui_hash;
>         rlim_t  ui_sbsize;   /* socket buffer space consumed */
>         long    ui_proccnt;  /* number of processes */
>         uid_t   ui_uid;      /* uid */
>         u_short ui_ref;      /* reference count */
>     };

We are pushing in the ~50K to ~70K TCP connections to this process. I think I see what you are suggesting :)

--don
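The thread does not spell the problem out, but the hint appears to be the width of ui_ref: a u_short can count at most 65535 references. A small stand-alone sketch of what happens beyond that, under the assumption (not confirmed in the thread) that each connection holds one reference; struct refcnt16 and take_refs are stand-ins, not kernel code.

```c
#include <stdint.h>

/* Stand-in for a 16-bit reference counter such as ui_ref. */
struct refcnt16 {
	uint16_t ref;
};

/* Take n references one at a time; the count wraps modulo 65536. */
uint16_t
take_refs(struct refcnt16 *r, uint32_t n)
{
	while (n-- > 0)
		r->ref++;
	return r->ref;
}
```

After 65536 references the counter reads 0 again, and after 70000 it reads 4464. Releasing references from that point would bring the count to zero while tens of thousands of holders still exist, which would free the structure early and be consistent with the use-after-free symptoms described above.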
RE: Help with FreeBSD Bridged Firewall
From: William Knechtel [mailto:[EMAIL PROTECTED]

> Yeah, the arp cache is the problem, thanks for nailing that one for me. However, the ipfw rule you supplied doesn't seem to want to work for me... I think for the time being I'll just run a cron job every 15 minutes or so that clears the arp cache completely. Thanks again for your help!! I really appreciate it!

You can change the ARP timeout period with sysctl. Run 'sysctl net.link.ether' to see all of the settings. net.link.ether.inet.prune_intvl and net.link.ether.inet.max_age change the ARP cache aging time.
RE: Help with FreeBSD Bridged Firewall
From: William Knechtel [mailto:[EMAIL PROTECTED]

I think you need to allow arp through this device, something like:

ipfw add 30 allow layer2 mac-type arp

[not sure which rule to insert it at]. I'm guessing your arp cache is timing out.
splx() bug in ip_dummynet?
1.24.2.2 of ip_dummynet.c [RELENG_4] has a bug I'm thinking, can someone comment?

In the below snippet, the value of 's' from splimp() is overwritten by the return value of alloc_hash(), which is an errno. If its != 0, then there's a missing splx(). If it is == 0, then splx() is called with the wrong value.

[i've filed a PR against this, and will probably change the alloc_hash to use a different return value in my tree]

	s = splimp();
	x->bandwidth = p->bandwidth ;
	x->numbytes = 0; /* just in case... */
	bcopy(p->if_name, x->if_name, sizeof(p->if_name) );
	x->ifp = NULL ; /* reset interface ptr */
	x->delay = p->delay ;
	set_fs_parms(&(x->fs), pfs);

	if ( x->fs.rq == NULL ) { /* a new pipe */
	    s = alloc_hash(&(x->fs), pfs) ;
	    if (s) {
		free(x, M_DUMMYNET);
		return s ;
	    }
	    x->next = b ;
	    if (a == NULL)
		all_pipes = x ;
	    else
		a->next = x ;
	}
	splx(s);
RE: splx() bug in ip_dummynet?
From: Don Bowman [mailto:[EMAIL PROTECTED]
...

I believe this patch will correct the issue.

Index: ip_dummynet.c
===================================================================
RCS file: /usr/cvs/src/sys/netinet/ip_dummynet.c,v
retrieving revision 1.24.2.17.1000.1
retrieving revision 1.24.2.17.1000.2
diff -U3 -r1.24.2.17.1000.1 -r1.24.2.17.1000.2
--- ip_dummynet.c	21 Jun 2003 20:47:59 -0000	1.24.2.17.1000.1
+++ ip_dummynet.c	24 Jul 2003 15:27:59 -0000	1.24.2.17.1000.2
@@ -1571,10 +1571,12 @@
 	if ( x->fs.rq == NULL ) { /* a new pipe */
-	    s = alloc_hash(&(x->fs), pfs) ;
-	    if (s) {
+	    int s1;
+	    s1 = alloc_hash(&(x->fs), pfs) ;
+	    if (s1) {
 		free(x, M_DUMMYNET);
-		return s ;
+		splx(s);
+		return s1 ;
 	    }
 	    x->next = b ;
 	    if (a == NULL)
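The pattern behind the patch, keeping the saved priority level and the error code in separate variables and restoring the level on every exit path, can be tried in userland. This is only a sketch: splimp_stub(), splx_stub(), alloc_hash_stub() and config_pipe() are hypothetical stand-ins, not the kernel routines.

```c
#include <assert.h>
#include <errno.h>

/* Userland stand-ins for the kernel's splimp()/splx(); hypothetical. */
static int cur_level = 0;

static int
splimp_stub(void)
{
	int old = cur_level;

	cur_level = 7;		/* pretend interrupts are now blocked */
	return old;		/* caller must hand this back to splx */
}

static void
splx_stub(int s)
{
	cur_level = s;
}

/* Stand-in for alloc_hash(): returns 0 on success or an errno value. */
static int
alloc_hash_stub(int fail)
{
	return fail ? ENOSPC : 0;
}

static int
config_pipe(int fail)
{
	int s, error;

	s = splimp_stub();
	error = alloc_hash_stub(fail);	/* never stored in 's' */
	if (error) {
		splx_stub(s);		/* restore on the error path too */
		return error;
	}
	splx_stub(s);
	return 0;
}
```

After either return path, cur_level is back at the value it had before the call, which is the property the original snippet lost.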
RE: using memory after freed in tcp_syncache (syncache_timer()) with ipfw: patch attached
From: Don Bowman [mailto:[EMAIL PROTECTED]

Synopsis: under some ipfw conditions, tcp_syncache has syncache_respond() call ip_output call ip_input call syncache_drop(), which drops the 'syncache' that is being worked on, or corrupts the list, etc. This is typically seen from syncache_timer or syncache_add.

I've attached a patch that I believe corrects this problem. I'm observing it on 4.7, but I believe it equally affects RELENG_4 and CURRENT.

This seems to make the problem I was seeing go away. I'm currently running with 2K syn/second through the original condition, will let it go overnight like that. I think that will flush out if i've introduced a leak or other crash.

Can someone who knows this code perhaps critique what I've done? Essentially I have made syncache_drop() instead defer the delete onto a different list. In the timer, I delete the syncache entries from the delete list. This costs some performance and memory, but was the best way I could come up with.

There was an error in the previous patch.
Index: tcp_syncache.c
===================================================================
RCS file: /usr/cvs/src/sys/netinet/tcp_syncache.c,v
retrieving revision 1.5.2.8.1000.3
diff -U5 -r1.5.2.8.1000.3 tcp_syncache.c
--- tcp_syncache.c	4 Feb 2003 01:52:03 -0000	1.5.2.8.1000.3
+++ tcp_syncache.c	1 Jul 2003 14:32:29 -0000
@@ -83,16 +83,18 @@
 #endif /*IPSEC*/
 
 #include <machine/in_cksum.h>
 #include <vm/vm_zone.h>
 
+static int syncache_delete_flag;
 static int tcp_syncookies = 1;
 SYSCTL_INT(_net_inet_tcp, OID_AUTO, syncookies, CTLFLAG_RW,
     &tcp_syncookies, 0,
     "Use TCP SYN cookies if the syncache overflows");
 
 static void	 syncache_drop(struct syncache *, struct syncache_head *);
+static void	 syncache_delete(struct syncache *, struct syncache_head *);
 static void	 syncache_free(struct syncache *);
 static void	 syncache_insert(struct syncache *, struct syncache_head *);
 struct syncache	*syncache_lookup(struct in_conninfo *, struct syncache_head **);
 static int	 syncache_respond(struct syncache *, struct mbuf *);
 static struct	 socket *syncache_socket(struct syncache *, struct socket *);
@@ -125,10 +127,11 @@
 	u_int	next_reseed;
 	TAILQ_HEAD(, syncache) timerq[SYNCACHE_MAXREXMTS + 1];
 	struct	callout tt_timerq[SYNCACHE_MAXREXMTS + 1];
 };
 static struct tcp_syncache tcp_syncache;
+static TAILQ_HEAD(syncache_delete_list, syncache) sc_delete_list;
 
 SYSCTL_NODE(_net_inet_tcp, OID_AUTO, syncache, CTLFLAG_RW, 0, "TCP SYN cache");
 
 SYSCTL_INT(_net_inet_tcp_syncache, OID_AUTO, bucketlimit, CTLFLAG_RD,
     &tcp_syncache.bucket_limit, 0, "Per-bucket hash limit for syncache");
@@ -202,10 +205,13 @@
 		rtrequest(RTM_DELETE, rt_key(rt),
 		    rt->rt_gateway, rt_mask(rt),
 		    rt->rt_flags, NULL);
 		RTFREE(rt);
 	}
+#if defined(DIAGNOSTIC)
+	memset(sc, 0xee, sizeof(struct syncache));
+#endif
 	zfree(tcp_syncache.zone, sc);
 }
 
 void
 syncache_init(void)
@@ -256,10 +262,12 @@
 	 * older one.
 	 */
 	tcp_syncache.cache_limit -= 1;
 	tcp_syncache.zone = zinit("syncache", sizeof(struct syncache),
 	    tcp_syncache.cache_limit, ZONE_INTERRUPT, 0);
+
+	TAILQ_INIT(&sc_delete_list);
 }
 
 static void
 syncache_insert(sc, sch)
 	struct syncache *sc;
@@ -312,12 +320,28 @@
 static void
 syncache_drop(sc, sch)
 	struct syncache *sc;
 	struct syncache_head *sch;
 {
+	if ((sc->sc_flags & SCF_DELETE) == 0) {
+		sc->sc_flags |= SCF_DELETE;
+		syncache_delete_flag = 1;
+		TAILQ_INSERT_TAIL(&sc_delete_list, sc, sc_delete);
+	}
+}
+
+static void
+syncache_delete(sc, sch)
+	struct syncache *sc;
+	struct syncache_head *sch;
+{
 	int s;
 
+	if ((sc->sc_flags & SCF_DELETE) == 0) {
+		printf("ERROR ERROR ERROR: SCF_DELETE == 0\n");
+		return;
+	}
 	if (sch == NULL) {
 #ifdef INET6
 		if (sc->sc_inc.inc_isipv6) {
 			sch = &tcp_syncache.hashbase[
 			    SYNCACHE_HASH6(&sc->sc_inc, tcp_syncache.hashmask)];
@@ -329,10 +353,12 @@
 		}
 	}
 
 	s = splnet();
 
+	TAILQ_REMOVE(&sc_delete_list, sc, sc_delete);
+
 	TAILQ_REMOVE(&sch->sch_bucket, sc, sc_hash);
 	sch->sch_length--;
 	tcp_syncache.cache_count--;
 
 	TAILQ_REMOVE(&tcp_syncache.timerq[sc->sc_rxtslot], sc, sc_timerq);
@@ -357,10 +383,12 @@
 	int s;
 
 	s = splnet();
 	if (callout_pending(&tcp_syncache.tt_timerq[slot]) ||
 	    !callout_active(&tcp_syncache.tt_timerq[slot])) {
+		if (syncache_delete_flag)
+			goto delete_cleanup;
 		splx(s);
 		return;
 	}
 	callout_deactivate(&tcp_syncache.tt_timerq[slot]);
 
@@ -390,10 +418,21
RE: using memory after freed in tcp_syncache (syncache_timer()) with ipfw: patch attached
Synopsis: under some ipfw conditions, tcp_syncache has syncache_respond() call ip_output call ip_input call syncache_drop(), which drops the 'syncache' that is being worked on, or corrupts the list, etc. This is typically seen from syncache_timer or syncache_add.

I've attached a patch that I believe corrects this problem. I'm observing it on 4.7, but I believe it equally affects RELENG_4 and CURRENT.

This seems to make the problem I was seeing go away. I'm currently running with 2K syn/second through the original condition, will let it go overnight like that. I think that will flush out if i've introduced a leak or other crash.

Can someone who knows this code perhaps critique what I've done? Essentially I have made syncache_drop() instead defer the delete onto a different list. In the timer, I delete the syncache entries from the delete list. This costs some performance and memory, but was the best way I could come up with.

--don

Index: tcp_syncache.c
===================================================================
RCS file: /usr/cvs/src/sys/netinet/tcp_syncache.c,v
retrieving revision 1.5.2.8.1000.3
diff -U3 -r1.5.2.8.1000.3 tcp_syncache.c
--- tcp_syncache.c	4 Feb 2003 01:52:03 -0000	1.5.2.8.1000.3
+++ tcp_syncache.c	1 Jul 2003 03:05:22 -0000
@@ -85,6 +85,7 @@
 #include <machine/in_cksum.h>
 #include <vm/vm_zone.h>
 
+static int syncache_delete;
 static int tcp_syncookies = 1;
 SYSCTL_INT(_net_inet_tcp, OID_AUTO, syncookies, CTLFLAG_RW,
     &tcp_syncookies, 0,
@@ -127,6 +128,7 @@
 	struct	callout tt_timerq[SYNCACHE_MAXREXMTS + 1];
 };
 static struct tcp_syncache tcp_syncache;
+static TAILQ_HEAD(syncache_delete_list, syncache) sc_delete_list;
 
 SYSCTL_NODE(_net_inet_tcp, OID_AUTO, syncache, CTLFLAG_RW, 0, "TCP SYN cache");
 
@@ -204,6 +206,9 @@
 		    rt->rt_flags, NULL);
 		RTFREE(rt);
 	}
+#if defined(DIAGNOSTIC)
+	memset(sc, 0xee, sizeof(struct syncache));
+#endif
 	zfree(tcp_syncache.zone, sc);
 }
 
@@ -258,6 +263,8 @@
 	tcp_syncache.cache_limit -= 1;
 	tcp_syncache.zone = zinit("syncache", sizeof(struct syncache),
 	    tcp_syncache.cache_limit, ZONE_INTERRUPT, 0);
+
+	TAILQ_INIT(&sc_delete_list);
 }
 
 static void
@@ -331,6 +338,18 @@
 
 	s = splnet();
 
+	if ((sc->sc_flags & SCF_DELETE) == 0) {
+		sc->sc_flags |= SCF_DELETE;
+		syncache_delete = 1;
+		TAILQ_INSERT_TAIL(&sc_delete_list, sc, sc_delete);
+
+		splx(s);
+		return;
+	}
+	if (sc->sc_delete.tqe_next || sc->sc_delete.tqe_prev) {
+		TAILQ_REMOVE(&sc_delete_list, sc, sc_delete);
+	}
+
 	TAILQ_REMOVE(&sch->sch_bucket, sc, sc_hash);
 	sch->sch_length--;
 	tcp_syncache.cache_count--;
@@ -359,6 +378,8 @@
 	s = splnet();
 	if (callout_pending(&tcp_syncache.tt_timerq[slot]) ||
 	    !callout_active(&tcp_syncache.tt_timerq[slot])) {
+		if (syncache_delete)
+			goto delete_cleanup;
 		splx(s);
 		return;
 	}
@@ -392,6 +413,17 @@
 	if (nsc != NULL)
 		callout_reset(&tcp_syncache.tt_timerq[slot],
 		    nsc->sc_rxttime - ticks, syncache_timer, (void *)(slot));
+
+delete_cleanup:
+	sc = TAILQ_FIRST(&sc_delete_list);
+	while (sc != NULL) {
+		nsc = TAILQ_NEXT(sc, sc_delete);
+		syncache_drop(sc, NULL);
+		sc = nsc;
+	}
+	TAILQ_INIT(&sc_delete_list);
+	syncache_delete = 0;
+
 	splx(s);
 }
 
@@ -1335,6 +1367,7 @@
 	sc = zalloc(tcp_syncache.zone);
 	if (sc == NULL)
 		return (NULL);
+	bzero(sc, sizeof(*sc));
 	/*
 	 * Fill in the syncache values.
 	 * XXX duplicate code from syncache_add
Index: tcp_var.h
===================================================================
RCS file: /usr/cvs/src/sys/netinet/tcp_var.h,v
retrieving revision 1.56.2.12
diff -U3 -r1.56.2.12 tcp_var.h
--- tcp_var.h	24 Aug 2002 18:40:26 -0000	1.56.2.12
+++ tcp_var.h	1 Jul 2003 02:33:57 -0000
@@ -224,8 +224,10 @@
 #define SCF_CC		0x08			/* negotiated CC */
 #define SCF_UNREACH	0x10			/* icmp unreachable received */
 #define SCF_KEEPROUTE	0x20			/* keep cloned route */
+#define SCF_DELETE	0x40			/* I'm being deleted */
 	TAILQ_ENTRY(syncache)	sc_hash;
 	TAILQ_ENTRY(syncache)	sc_timerq;
+	TAILQ_ENTRY(syncache)	sc_delete;
 };
 
 struct syncache_head {
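The deferred-delete idea described in the message (mark the entry, park it on a second list, free it later from a safe point) can be sketched in userland with the TAILQ macros from sys/queue.h. All names here (struct entry, F_DELETE, entry_drop(), entry_reap()) are hypothetical and only mirror the shape of the patch; this is not the kernel code.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

#define F_DELETE 0x40	/* hypothetical flag, plays the role of SCF_DELETE */

struct entry {
	int flags;
	TAILQ_ENTRY(entry) live_link;	/* linkage on the main list */
	TAILQ_ENTRY(entry) dead_link;	/* linkage on the deferred-delete list */
};

TAILQ_HEAD(entry_list, entry);
static struct entry_list live = TAILQ_HEAD_INITIALIZER(live);
static struct entry_list dead = TAILQ_HEAD_INITIALIZER(dead);
static int freed_count;

/* Allocate an entry and put it on the main list. */
static struct entry *
entry_add(void)
{
	struct entry *e = calloc(1, sizeof(*e));

	if (e != NULL)
		TAILQ_INSERT_TAIL(&live, e, live_link);
	return e;
}

/*
 * Request deletion. Nothing is unlinked or freed here, so a caller that
 * is still walking 'live' can keep using the entry and its links.
 */
static void
entry_drop(struct entry *e)
{
	if ((e->flags & F_DELETE) == 0) {
		e->flags |= F_DELETE;
		TAILQ_INSERT_TAIL(&dead, e, dead_link);
	}
}

/* Called from a point where no traversal of 'live' is in progress. */
static void
entry_reap(void)
{
	struct entry *e;

	while ((e = TAILQ_FIRST(&dead)) != NULL) {
		TAILQ_REMOVE(&dead, e, dead_link);
		TAILQ_REMOVE(&live, e, live_link);
		free(e);
		freed_count++;
	}
}
```

The flag makes a second entry_drop() on the same entry a no-op, which matters when the drop can be triggered re-entrantly. The cost is the one the message mentions: an extra list linkage per entry and a delay before memory is returned.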
using memory after freed in tcp_syncache (syncache_timer())
syncache_timer()
...
		/*
		 * syncache_respond() may call back into the syncache to
		 * to modify another entry, so do not obtain the next
		 * entry on the timer chain until it has completed.
		 */
		(void) syncache_respond(sc, NULL);
		nsc = TAILQ_NEXT(sc, sc_timerq);
		tcpstat.tcps_sc_retransmitted++;
		TAILQ_REMOVE(&tcp_syncache.timerq[slot], sc, sc_timerq);

so what happens is that syncache_respond() calls ip_output, which ends up calling ip_input, which ends up doing something that causes 'sc' to be freed. Now 'sc' is freed, we return to syncache_timer(), and then we use it in the nsc = TAILQ_NEXT(...) line.

This particular part of the problem was introduced in 1.23 of tcp_syncache.c in response to another bug that i had found.

Does anyone have a suggestion on a proper fix?
RE: using memory after freed in tcp_syncache (syncache_timer())
From: Don Bowman
...

It appears this may also occur in syncache_add(): in this case, syncache_respond() alters the list.

	sc->sc_tp = tp;
	sc->sc_inp_gencnt = tp->t_inpcb->inp_gencnt;
	if (syncache_respond(sc, m) == 0) {
		s = splnet();
		TAILQ_REMOVE(&tcp_syncache.timerq[sc->sc_rxtslot],
		    sc, sc_timerq);
		SYNCACHE_TIMEOUT(sc, sc->sc_rxtslot);
		splx(s);
		tcpstat.tcps_sndacks++;
		tcpstat.tcps_sndtotal++;
	}
	*sop = NULL;
nested ipfw dummynet pipes
is there any way, in a bridging config, to have nested pipes?

In particular, what i would like to achieve is a rule that allows e.g. 64kbps per host (src-mask 0x), but that all these hosts are in an overall 10Mbps pipe. The idea will be that @ some times of the day the pipe is less than full, so everyone gets 64kbps, but @ other times of the day the pipe is full, and I don't want more than 10Mbps flowing.

net.inet.ip.fw.one_pass looks to do what i want but:

  Note: bridged and layer 2 packets coming out of a pipe
  are never reinjected in the firewall irrespective of the
  value of this variable.

suggests this is not the case. Is there some technique using e.g. netgraph? Or can someone suggest why the note is there and if it might be easily removed?

e.g. what i have is a system with em0 -- em1

net.link.ether.bridge_cfg=em0 em1
net.link.ether.bridge=1
net.link.ether.bridge_ipfw=1
net.inet.ip.fw.one_pass=1

--don
RE: nested ipfw dummynet pipes
From: Luigi Rizzo [mailto:[EMAIL PROTECTED]
On Fri, Jun 20, 2003 at 01:41:21PM -0400, Don Bowman wrote:

is there any way, in a bridging config, to have nested pipes?

net.inet.ip.fw.one_pass=0 should do the job, i think the comment in the manpage is now incorrect and the code (in net/bridge.c) has been fixed (one-line) to implement this. Check the commit logs, i don't have them handy at the moment.

Thanks very much, I will check this. I assume this will be true for IPFW2 rather than IPFW.

It appears that 1.16.2.23, nov 21 2002, RELENG_4 has this from the log:

  MFC: obey to fw_one_pass in bridge and layer 2 firewalling (the latter only affects ipfw2 users). Move fw_one_pass from ip_fw[2].c to ip_input.c to avoid depending on IPFIREWALL.

I will try this out.
RE: nested ipfw dummynet pipes
From: 'Luigi Rizzo' [mailto:[EMAIL PROTECTED]
On Fri, Jun 20, 2003 at 02:18:17PM -0400, Don Bowman wrote:
...

Thanks very much, I will check this. I assume this will be true for IPFW2 rather than IPFW.

one_pass actually affects both. The comment in parentheses refers to layer 2 firewalling, which is an ipfw2-only feature (bridge firewalling is also available with ipfw1).

This works correctly, thanks very much. Attached is a trivial patch to correct the man page.

Is there a benefit to having the single wide pipe first, or the many narrow pipes first, in the ruleset?

$ cvs diff -U5 ipfw.8
Index: ipfw.8
===================================================================
RCS file: /usr/cvs/src/sbin/ipfw/ipfw.8,v
retrieving revision 1.63.2.28
diff -U5 -r1.63.2.28 ipfw.8
--- ipfw.8	30 Sep 2002 20:57:05 -0000	1.63.2.28
+++ ipfw.8	20 Jun 2003 18:49:02 -0000
@@ -1587,14 +1587,10 @@
 When set, the packet exiting from the
 .Xr dummynet 4
 pipe is not passed though the firewall again.
 Otherwise, after a pipe action, the packet is
 reinjected into the firewall at the next rule.
-.Pp
-Note: bridged and layer 2 packets coming out of a pipe
-are never reinjected in the firewall irrespective of the
-value of this variable.
 .It Em net.inet.ip.fw.verbose : No 1
 Enables verbose messages.
 .It Em net.inet.ip.fw.verbose_limit : No 0
 Limits the number of messages produced by a verbose firewall.
 .It Em net.link.ether.ipfw : No 0
RE: Spontan reboot of FreeBSD 4,x box
From: Dennis Pedersen [mailto:[EMAIL PROTECTED]

I have a couple of FreeBSD 4.4 boxes and one 4.7 that are being used as firewalls in different locations. Lately I have noticed that one of the firewalls was starting to reboot at a certain time of the day (give or take maybe 10 min).

The time it resets wouldn't correlate to the periodic run (e.g. 3am), would it?
RE: Spontan reboot of FreeBSD 4,x box
Well, I would speculate that your /etc/periodic is running @ 3am doing things like looking for setuid files, pruning /tmp, etc, which sparks up some disk activity, forks a few processes, walks the filesystem, etc, which is tripping some bug you have in the kernel, or bad memory. [I have a version of memtest86 which can be loaded from 'loader' and placed on a FreeBSD file system if you wish to try the bad memory theory conveniently.]

I have a similar problem in 4.7 that occurs once in a while @ 3:01am which seems to randomly corrupt memory. I've been chasing it for a while but it hasn't been reproducible enough to find.

This is pure speculation.

man 8 periodic
see /etc/periodic.conf

-----Original Message-----
From: Dennis Pedersen [mailto:[EMAIL PROTECTED]
Sent: May 28, 2003 16:46
To: Don Bowman; [EMAIL PROTECTED]
Subject: Re: Spontan reboot of FreeBSD 4,x box

----- Original Message -----
From: Don Bowman [EMAIL PROTECTED]
To: 'Dennis Pedersen' [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Wednesday, May 28, 2003 3:56 PM
Subject: RE: Spontan reboot of FreeBSD 4,x box

From: Dennis Pedersen [mailto:[EMAIL PROTECTED]

I have a couple of FreeBSD 4.4 boxes and one 4.7 that are being used as firewalls in different locations. Lately I have noticed that one of the firewalls was starting to reboot at a certain time of the day (give or take maybe 10 min).

The time it resets wouldn't correlate to the periodic run (e.g. 3am), would it?

On one of the boxes that fits, yeah. What am I missing? cron_enable is set to no in rc.conf and the cron daemon isn't running?

Regards,
Dennis
RE: A problem with too many network interfaces
From: Garrett Wollman [mailto:[EMAIL PROTECTED]
On Mon, 26 May 2003 14:04:19 -0700 (PDT), Mikko Työläjärvi [EMAIL PROTECTED] said:

A proper BSD port could use something like the trick in Stevens[1] and keep retrying the call with a larger buffer until the length of the result is the same as in the previous call.

Actually, a proper BSD port would use the net.route.iflist sysctl instead.

-GAWollman

$ uname -sr
FreeBSD 4.6-RC
$ sysctl net.route
sysctl: unknown oid 'net.route'

I think since the ports work against other than current branch it would be difficult to support?

--don
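For reference, the retry trick mentioned above can be sketched roughly as follows. This assumes the call in question is the SIOCGIFCONF ioctl (the thread does not name it here), and get_ifconf() is a hypothetical helper. It is simplified compared with the textbook version: it treats any ioctl error as fatal rather than retrying on systems that report an error for a too-small buffer.

```c
#define _DEFAULT_SOURCE	/* exposes struct ifconf on glibc with strict -std= */
#include <assert.h>
#include <net/if.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Fetch the interface list, doubling the buffer until two consecutive
 * calls report the same length (so the result was not truncated).
 * Returns a malloc'd buffer the caller must free and stores its used
 * length in *lenp; returns NULL on error.
 */
static char *
get_ifconf(int fd, int *lenp)
{
	int size = 10 * (int)sizeof(struct ifreq);	/* initial guess */
	int lastlen = -1;

	for (;;) {
		struct ifconf ifc;
		char *buf = malloc((size_t)size);

		if (buf == NULL)
			return NULL;
		memset(&ifc, 0, sizeof(ifc));
		ifc.ifc_len = size;
		ifc.ifc_buf = buf;
		if (ioctl(fd, SIOCGIFCONF, &ifc) < 0) {
			free(buf);
			return NULL;
		}
		if (ifc.ifc_len == lastlen) {	/* unchanged: list is complete */
			*lenp = ifc.ifc_len;
			return buf;
		}
		lastlen = ifc.ifc_len;
		free(buf);
		size *= 2;
	}
}
```

A caller would open any datagram socket, e.g. socket(AF_INET, SOCK_DGRAM, 0), pass it in, and then walk the returned struct ifreq records. The sysctl route suggested in the reply avoids the guessing loop entirely where it is available.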
RE: Source ip route lookup on incoming packets?
From: Sten Daniel Sørsdal [mailto:[EMAIL PROTECTED]
On Thu, Feb 27, 2003 at 02:02:53PM +0100, Sten Daniel Sørsdal wrote:

What I am looking for is a feature that basically prevents spoofing by looking up the route for the source and matching the incoming interface. A firewall solves the problem but adds a lot of administrative overhead and leaves room for error.

Check the net.inet.ip.check_interface sysctl. It may be what you're looking for.

BMS

Thank you for your reply! I haven't had a clear explanation of that one (tried the RFC too). But does this one really stop spoofing for routed packets as well? I have some border routers running BGP, three of which have full internet feed. Would this block spoofed packets from my network, and would it block incoming source IPs that come from nonexistent networks?

I think the routers would need to have egress filtering enabled, which isn't all that commonly done. See http://www-users.rwth-aachen.de/jens.hektor/security/cisco-acl.html for example.

--don

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-net in the body of the message
3COM 3C996-SX (bge) fibre support?
I see in the cvs comments that this card is supported (1.11 of if_bge.c). The relevant change seems to be:

+	/*
+	 * Figure out what sort of media we have by checking the
+	 * hardware config word in the EEPROM. Note: on some BCM5700
+	 * cards, this value appears to be unset. If that's the
+	 * case, we have to rely on identifying the NIC by its PCI
+	 * subsystem ID, as we do below for the SysKonnect SK-9D41.
+	 */
+	bge_read_eeprom(sc, (caddr_t)&hwcfg,
+	    BGE_EE_HWCFG_OFFSET, sizeof(hwcfg));
+	if ((ntohl(hwcfg) & BGE_HWCFG_MEDIA) == BGE_MEDIA_FIBER)
+		sc->bge_tbi = 1;

sadly, I have a phy-id of 0, so I think I have to use the hackish method the SK... uses, just below it:

	/* The SysKonnect SK-9D41 is a 1000baseSX card. */
	if ((pci_read_config(dev, BGE_PCI_SUBSYS, 4) >> 16) == SK_SUBSYSID_9D41)
		sc->bge_tbi = 1;

I have the subsystem etc (side-note: there's a bug in the above code, it should check the vendor id as well):

PCI sub-devid 0x1004
PCI sub-vid 0x10b7

So I added a line of the SK_... type above, to set the 'bge_tbi' to one for my 1000baseSX card. However, I see this interface 'flapping', I get snowed with messages to my console about 'link up' (but never link down). I tried forcing the media mediaopts to 1000Mbps and full-duplex. The other end of the link sees nothing (no link).

Anyone have a suggestion on where to start? I suspect this is related to the comment about "One thing that confuses me still is that the 'link state change' bit in the status block seems to change state an awful lot." (1.10).

--don
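The side note about also checking the vendor id could look something like the sketch below. It assumes the 32-bit value comes from the standard PCI subsystem register (subsystem vendor ID in the low 16 bits, subsystem device ID in the high 16 bits); the macro and function names are made up for illustration, and the two ID values are the ones quoted in the message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; values are the sub-vid/sub-devid quoted above. */
#define SUBVID_3COM		0x10b7
#define SUBDEVID_3C996SX	0x1004

/*
 * 'subsys' is a 32-bit read of the PCI subsystem register:
 * low 16 bits = subsystem vendor ID, high 16 bits = subsystem device ID.
 * Match on both, so another vendor's card that happens to reuse the
 * same device ID is not misidentified.
 */
static int
is_3c996sx(uint32_t subsys)
{
	uint16_t vid = (uint16_t)(subsys & 0xffff);
	uint16_t did = (uint16_t)(subsys >> 16);

	return vid == SUBVID_3COM && did == SUBDEVID_3C996SX;
}
```

In a driver the argument would presumably be the same pci_read_config(dev, BGE_PCI_SUBSYS, 4) value the existing check shifts by 16, but that wiring is not shown in the thread.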
RE: Redundant NIC/Connections
From: Jonathan Disher [mailto:[EMAIL PROTECTED]]
On Wed, 1 Jan 2003, David J Duchscher wrote:

I was wondering how people are handling redundant connections? We would like to have dual NICs in the FreeBSD box with each NIC connected to a different switch. Both switches are in the same broadcast domain. Any pointers or hints on how this may be done would be greatly appreciated.

I think one of my colleagues responded directly to the poster. We do it with a daemon he wrote that monitors interface link status, and also pingability of default gateways, and reconfigures interfaces in the event of a failure, based on the normal configuration file settings (/etc/rc.conf). It acts instantly in the event of link loss, and after a few seconds of retrying in the event of router loss (we use HSRP addresses for routers).

Yes, I got a few responses including the one mentioned above. Lack of time and changing priorities have prevented me from following up on them. Anshuman Kanwar did mention he might release his solution as open source if there was interest. I am at least interested in reviewing it. I just need to find the time.

I would also be very interested in this. We could write our own, but I'd much rather burn that time working on other projects ;-). Did Anshuman happen to mention what it's written in? Perl? C? Other?

I wonder if VRRP (http://www.bsdshell.net/hut_fvrrpd.html) can help here. It's available as a port.
RE: Broadcom BCM5703X Gigabit Ethernet woes, panics, no MIIs, oh my!
From: George J.V. Cox [mailto:[EMAIL PROTECTED]]

I have a Dell 1655MC blade server, and a compiled-this-week 4.7-STABLE kernel. The hardware is a chassis of 6 PCs in a 3U case. Each blade has two Broadcom BCM5703 interfaces. Unfortunately, its behaviour is rather non-deterministic.
...

I'm seeing similar behaviour with a 5704 (dual gmac). I will let you know if I find a fix for it. I'm suspecting the timing on the eeprom interface right now since I sometimes get a MAC of 0.
struct inpcb, INET6
Is there a reason that struct inpcb doesn't have an #ifdef INET6 around

	struct {
		/* IP options */
		struct	mbuf *inp6_options;
		/* IP6 options for outgoing packets */
		struct	ip6_pktopts *inp6_outputopts;
		/* IP multicast options */
		struct	ip6_moptions *inp6_moptions;
		/* ICMPv6 code type filter */
		struct	icmp6_filter *inp6_icmp6filt;
		/* IPV6_CHECKSUM setsockopt */
		int	inp6_cksum;
		u_short	inp6_ifindex;
		short	inp6_hops;
		u_int8_t	inp6_hlim;
	} inp_depend6;

? Its 25 bytes per connection.

--don ([EMAIL PROTECTED] www.sandvine.com)
RE: SO_DONTROUTE, arp's, ipfw fwd, etc
From: Don Bowman [mailto:[EMAIL PROTECTED]]

I have a setup where I have a transparent proxy using ipfw fwd (to localhost). Data is sent to this device using a MAC rewrite so that packets arrive with my MAC, but the original source and destination IP. When I receive the SYN, I accept the connection, which causes an ARP to be emitted for the source address, and then the SYN/ACK.

I didn't get much response from this, so I'm going to re-phrase.

Is there any reason that I shouldn't modify the TCP passive accept so that it remembers both the MAC address of the sender, and the interface the packet came in on? By doing so, I will avoid having to issue an ARP for each incoming connection (which adds latency, and more importantly for me, breaks the ability to use ipfw 'fwd' rules the way I want). [This is with FreeBSD 4.7 if it matters.]

What's happening is I have >1 router feeding me sessions which I'm transparently proxying (e.g. squid). Obviously I can't have a default route back to each of them. So I have something like:

[Router1]---\
             \
[Router2]-----[BSD]
             /
[Router3]---/

This is done with a layer-2 MAC rewrite, i.e. the router takes the packet, doesn't modify the IP header, but changes the destination MAC to be that of the BSD machine.

So, e.g., a packet comes into router1 above (from somewhere on its left-hand side). It may have IPsrc=1.0.0.1, IPdst=2.0.0.1. It then arrives @ the BSD machine, which will cheerfully say, yup, I'm 2.0.0.1 (using the beauty of 'ipfw fwd localhost...'). Problem is, it then wants to send a SYN/ACK, there's no route, so nowhere to go.

I can't make the route be one of those routers, and the routing tables are too complicated to install (since there may be BGP on the left of them, etc, etc). It's important for me that the response packets go back through the same path (to avoid reordering etc).

The next step for me is to use a separate VLAN from each of those routers to the BSD box (so that the packets appear to come from different interfaces). I'd like to memorize the interface the packet came in on, and the MAC header to use, and just use that without making an enormous arp table, going back to the place the SYN came from.

Is there a reason it doesn't work this way currently (before I dive in and make changes)? If I were to change it to work the way I want, would other people be interested? Would this be interesting as a wholesale change in behaviour, or as a sysctl-changeable or #ifdef-settable option?

Comments greatly appreciated.
RE: SO_DONTROUTE, arp's, ipfw fwd, etc
-----Original Message-----
From: Chuck Swiger [mailto:[EMAIL PROTECTED]]
On Wednesday, December 4, 2002, at 03:20 PM, Don Bowman wrote:

What's happening is I have >1 router feeding me sessions which I'm transparently proxying (e.g. squid). Obviously I can't have a default route back to each of them. So I have something like:

[Router1]---\
             \
[Router2]-----[BSD]
             /
[Router3]---/

This is done with a layer-2 MAC rewrite, i.e. the router takes the packet, doesn't modify the IP header, but changes the destination MAC to be that of the BSD machine.

You can't have more than one default route, but you certainly can have several static or dynamic routes to select the appropriate router to send responses back. You could also look into policy-based routing or multihoming the connections, but I guess that depends on what you're doing.

I can't make the route be one of those routers, and the routing tables are too complicated to install (since there may be BGP on the left of them, etc, etc). It's important for me that the response packets go back through the same path (to avoid reordering etc).

What happens if incoming traffic comes via more than one router at a time-- how should your system decide which path to send replies back? Based on the source IP?

These are ISP-sized routers (complicated networks with different peering points to other networks). Static routes don't work since the networks are much too dynamic. Additionally, the widget which is picking the traffic to send (like Cisco WCCP) is load-balancing, so there's another striping of data going on. I'd like to just send it back to the router it came from. I won't have a single TCP session come from more than one router, but will have the same source or destination IP come from the different routers concurrently.

I'm not sure what you mean by policy-based routing. If it's the same thing as on a router, then it's not appropriate since it will be based on IP.

In the example diagram above, I might have a case where host 'A' sends host 'B' two concurrent TCP sessions. These will both transparently arrive @ the BSD box, one via router1, one via router2. Triangulation breaks the application, so A->B(session1) needs to always flow via the same router it started on.

I'm thinking this is achieved by just caching the interface, destination MAC etc in the PCB for the TCP session. It does this anyway once it's finished sending the SYN/ACK, it's just that it follows routing rules and ARPs for the SYN/ACK.

This is a common application for e.g. Squid when being fed by more than one router.
RE: SO_DONTROUTE, arp's, ipfw fwd, etc
From: Julian Elischer [mailto:[EMAIL PROTECTED]]

The arp is issued because the TCP stack is responding to the SYN packet with its own SYN, but it doesn't have a route to the original source, so it creates one, as it's local. This means that it allocates an ARP entry for it, which in turn causes an arp request to be sent. The response will result in the SYN being transmitted. This is all pretty normal. There will not be another ARP sent for 18 minutes for that host. The question is: why does it think the source is local? Are the routers below doing proxy arp? Did you give your interface a netmask of 0.0.0.0? Who responds to the arp?

It's a layer-2 MAC rewrite, so it arrives on a local segment, but subnetting rules don't apply. No-one responds to the ARP, hence my problem :) I know what it's doing now is normal, it's just that it doesn't work in my configuration (which isn't typical). The interface in question has no IP or netmask (or at least, I would like it to not have one, it's not needed).

You COULD write a netgraph node that adds routes as it receives packets. In fact it could keep its own cache of IP/MAC mappings and switch the MACs appropriately on outgoing packets. Possibly adding routes would be best. It would identify the source from the src mac address, and add the appropriate entry to the routing table, a bit like a learning bridge.

I'm not sure I can write a route-rule for a connection since I could have a different path back to the same IP for a different TCP connection. Thus my idea just to let the PCB take care of it.

If there is bgp to the left, you could make this machine take part. Do the routers do bgp?

Not in all cases :(

Is there a reason that return routes are not added every time a packet is received?

Well, yes. For a start it may not be what everyone wants. I have made great use of asymmetrical routing many times (e.g. some satellite internet connections are via modem for outgoing and via the satellite for incoming).

OK, I understand. So if I make this change, it would only be useful if it were not the default / disableable. Perhaps it would be a socket option on the listen() socket... Similar to SO_DONTROUTE I guess. Maybe that is what SO_DONTROUTE should mean for listen()? This is only an issue for passively accepted connections.

This issue comes about due to the way WCCP works with its hashing buckets and with multiple routers feeding multiple caching servers: the routers load balance across caches (so each will distribute the source addresses on its left to more than one cache).

--don ([EMAIL PROTECTED] www.sandvine.com)
RE: SO_DONTROUTE, arp's, ipfw fwd, etc
From: Julian Elischer [mailto:[EMAIL PROTECTED]] On Wed, 4 Dec 2002, Don Bowman wrote: Why does it think the source is local? are the routers below doing proxy arp? Did you give your interface a netmask of 0,0.0.0? Who responds to the arp? Its a layer-2 MAC rewrite, so it arrives on a local segment, but subnetting rules don't apply. No-one responds to the ARP, hence my problem :) Someone must be responding, because the SYN is eventually sent. Ah, its working currently with a single router. Adding the 2nd router is breaking it. I currently have a default route back to the first router. Adding the 2nd router, the back-path always goes through the first router, which gets confused. (I'm using the term router, but its actually a content switching device operating @ layer 4, like cisco WCCP or Cisco CSM or nortel Alteon). Here's my suggestion: write a netgraph node that does all the MAC rewriting. Code from the ng_bridge node would be useful. attach it to a ng_iface node. make the netgraph iface the default route. (route add default -iface ng0) Let me chew on that for a bit. I'm not sure where it would get the destination mac from, wouldn't it have to cache the information the PCB is holding? Wouldn't it be more efficient for me to just create the ether-header when the SYN comes in, store it in the PCB, and use that on each outgoing packet for that tcp connection, add a sockopt (or use SO_DONTROUTE for this on the listen socket)? Thanks for the great suggestions, keep them coming :) --don ([EMAIL PROTECTED] www.sandvine.com) To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-net in the body of the message
RE: SO_DONTROUTE, arp's, ipfw fwd, etc
From: Chuck Swiger [mailto:[EMAIL PROTECTED]] On Wednesday, December 4, 2002, at 03:37 PM, Don Bowman wrote: [ ... ] These are isp-sized routers (complicated networks with different peering points to other networks). Static routes don't work since they are much too dynamic. Additionally, the widget which is picking the traffic to send (like Cisco WCCP) is load-balancing, so there's another striping of data going on. Yes, but the complicated internal routes maintained within those networks isn't your problem if your machine or network isn't BGP peering with them. It is in the sense that I have to figure out which one to send data back to. More than one of them may 'own' a source address at a given time (for a TCP session). In the example diagram above, I might have a case where host 'A' sends host 'B' two concurrent TCP sessions. These will both transparently arrive @ the BSD box, one via router1, one via router2. Triangulation breaks the application, so A-B(session1) needs to always flow via the same router it started on. Why? This sounds like a pretty classic example of A being on a multihomed network, and you should let IP-level routing deal with the problem. But there are alternatives, I guess-- maybe try putting a buncha interfaces on the BSD box, one for each router being connected to it, and put each pair on their own /30. That way, the BSD box can quite easily return the traffic back to the originating router Only if its routing, not for L2 redirection. I'm thinking this is achieved by just caching the interface destination MAC etc in the PCB for the TCP session. It does this anyway once its finished sending the SYN/ACK, its just that it follows routing rules and ARP's for the SYN/ACK. Yes. Pretending machines which are on remote networks are local can be done by re-writing MAC addresses, but that can be achieved by NAT or VPN solutions as well. Why are you trying to override normal routing behavior when you probably can use it to help solve the problem? 
This is a transparent proxy. The proxy needs to know where the real destination was (in case it needs to open a connection there). The HTTP protocol solved this by putting the real-ip address in the header, but most other protocols didn't. I don't have control of the content switching routers which feeds this. They work the way they do. Say for the sake of example you wished to load balance 2 farms of telnet servers. You had a device which picked off port 23, and sent it to you without alterations. You would then look @ the intended destination address, and pick the right group of telnet servers, and send the data there. Now say that those devices themselves where load-balanced. So if a user telneted twice to the same destination, one path might go through the first redirector, and one through the 2nd. The path back is based on the path it came in. [client] | -- | Load Balancer | -- | | | | [Redirector1] [Redirector2] \ / \ / - || [BSD1] [BSD2] || - | | | | | | | | | | Telnet servers(A) Telnet (B) So in this case, [client] sends a SYN to port 23 on the virtual address of telnet(A). The load balancer sends this (and all other traffic) aribtrarily to Redirector1 or 2. These devices say, Aha!, port 23, let me use this clever policy based route, and just rewrite the destination MAC to be either BSD1 or BSD2 (based on some feedback on their load, availability, etc). BSD1 and 2 have a rule like: ipfw fwd localhost,9000 tcp from any to any recv bge0 23 and then on localhost:9000 have listening a clever little app that does: accept(), look @ intended destination IP, pick a telnet server in the farm it so addresses, connect, and then proxy the accepted() connection to the actively initiated one. Now, BSD1 / 2 can't use Redirector1/2 as a default route, since they will be treating them as equals. One of them sent the SYN packet, I'd love the SYN/ACK to go back to the same one. I know the MAC it came from, that's where the response should go. 
Making it all layer 3 doesn't help me, then I don't have the intended destination address. Additionally I have the problem that if I have two routers on my net, and one sends me traffic, I can only respond to it if its my default route, or if I have a static route for an IP behind it. Maybe those routers both lead to the same locations? I can't really use a VPN (GRE etc) tunnel since then I'll have to fragment, and I'd prefer to avoid that. My first thought
RE: SO_DONTROUTE, arp's, ipfw fwd, etc
From: Don Bowman [mailto:[EMAIL PROTECTED]] I have a setup where I have a transparent proxy using ipfw fwd (to localhost). Data is sent to this device using a MAC rewrite so that packets arrive with my MAC, but the original source and destination IP. When I receive the SYN, i accept the connection, which causes an ARP to be emitted for the source address, and then the SYN/ACK. I didn't get much response from this, so I'm going to re-phrase. Is there any reason that I shouldn't modify the TCP passive accept so that it remembers both the MAC address of the sender, and the interface the packet came in on? By doing so, I will avoid having to issue an ARP for each incoming connection (which adds latency, and more importantly for me, breaks the ability to use ipfw 'fwd' rules the way I want). [This is with FreeBSD 4.7 if it matters]. What's happening is I have 1 router feeding me sessions which I'm transparently proxying (e.g. squid). Obviously I can't have a default route back to each of them. So I have something like: [Router1]---\ \ [Router2][BSD] / [Router3]---/ This is done with a layer-2 mac rewrite, ie the router takes the packet, doesn't modify the IP header, but changes the destination MAC to be that of the BSD machine. So, e.g, a packet comes into router1 above (from somewhere on its left hand side). It may have IPsrc=1.0.0.1, IPdst=2.0.0.1. It then arrives @ the BSD machine, which will cheerfully say, yup, I'm 2.0.0.1 (using the beauty of 'ipfw fwd localhost...'). Problem is, it then wants to send a SYN/ACK, there's no route, so no where to go. I can't make the route be one of those routers, and the routing tables are too complicated to install (since there may be BGP on the left of them, etc, etc). Its important for me the response packets go back through the same path (to avoid reordering etc). The next step for me is to use a separate VLAN from each of those routers to the BSD box (so that the packets appear to come from different interfaces). 
I'd like to memorize the interface the packet came in, and the mac header to use, and just use that without making an enormous arp table, and going back to the place the SYN came from. Is there a reason it doesn't work this way currently (before I dive in and make changes). If I were to change it to work the way I want, would other people be interested? Would this be interesting as a whole-sale change in behaviour, or as a sysctl-changeable or #ifdef settable? Comments greatly appreciated. To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-net in the body of the message
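The proxy side of this setup (accept the forwarded connection, then find out which address the client was really trying to reach) can be sketched roughly as below. This is a minimal sketch, assuming that a connection delivered by an 'ipfw fwd' rule keeps the packet's original destination as the accepted socket's local address, so that getsockname() returns it. The struct conn_addrs and the helper accept_with_dest() are hypothetical names for illustration, not part of any FreeBSD API, and nothing here addresses the return-path (MAC/interface) question raised above.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper type: both ends of an accepted connection. */
struct conn_addrs {
	struct sockaddr_in client;	/* peer that sent the SYN */
	struct sockaddr_in intended;	/* address the client was trying to reach */
};

/*
 * Accept one connection on listen_fd and record its addresses.
 * The "intended" address is simply the accepted socket's local
 * address; the assumption is that the forwarding rule preserves it.
 * Returns the connected descriptor, or -1 on error.
 */
int
accept_with_dest(int listen_fd, struct conn_addrs *out)
{
	socklen_t len = sizeof(out->client);
	int fd = accept(listen_fd, (struct sockaddr *)&out->client, &len);

	if (fd < 0)
		return (-1);

	len = sizeof(out->intended);
	if (getsockname(fd, (struct sockaddr *)&out->intended, &len) < 0) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```

A proxy would call accept_with_dest() on its listening socket (port 9000 in the examples in this thread), open its own connection to out->intended, and then relay data between the two descriptors.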
RE: SO_DONTROUTE, arp's, ipfw fwd, etc
From: Julian Elischer [mailto:[EMAIL PROTECTED]] On Wed, 4 Dec 2002, Don Bowman wrote: ... It gets the destination MAC address from the SRC MAC field of the preceding incoming packets with that IP src, dst and port combination, i.e. the node would look within the IP header. Wouldn't it be more efficient for me to just create the ether-header when the SYN comes in, store it in the PCB, and use that on each outgoing packet for that TCP connection, add a sockopt (or use SO_DONTROUTE for this on the listen socket)? Yes and no... you would be breaking the layering in the standard code and you'd get crucified for it. Start with the ng_bridge node and make it look within the IP header and use that information in its hash tables instead of MAC addresses. It'll need some housekeeping code too (to flush old info, though you could reduce this by removing entries when you see the FIN packets go past). Perhaps I can do this within ipfw? It's only ipfw that is bringing up this situation, making me respond to things that normally wouldn't be routed to me. Perhaps 'ipfw' is missing something when it does a 'fwd' to localhost, another step to make this all work? FINs are pretty rare :) Too often things just shut off. I'm nervous about trying to cache the info outside the PCB since it has to stay in sync (it's not like the ARP cache, there's no way to get the info back if you drop it early). RST is even more problematic since I have to decide if it's in-window. --don ([EMAIL PROTECTED] www.sandvine.com)
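The per-flow cache suggested here (learn the source MAC of each incoming flow, then use it as the destination MAC for replies) can be sketched in user-space C as below. This is only an illustration of the data structure, not netgraph or ng_bridge code: the names flow_key, flow_learn() and flow_lookup_reply(), the table size, and the hash function are all assumptions made for the example. It is a direct-mapped table that overwrites on collision and never expires entries, so it sidesteps exactly the housekeeping and sync problems discussed above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FLOW_SLOTS 1024		/* illustrative size; must be a power of two */

struct flow_key {
	uint32_t src_ip, dst_ip;	/* as seen on the incoming packet */
	uint16_t src_port, dst_port;
};

struct flow_entry {
	struct flow_key key;
	uint8_t mac[6];		/* source MAC of the device that delivered the flow */
	int used;
};

static struct flow_entry flow_table[FLOW_SLOTS];

static int
key_equal(const struct flow_key *a, const struct flow_key *b)
{
	return (a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
	    a->src_port == b->src_port && a->dst_port == b->dst_port);
}

/* Simple mixing hash over the 4-tuple, reduced to a table slot. */
static uint32_t
flow_hash(const struct flow_key *k)
{
	uint32_t h = k->src_ip ^ (k->dst_ip * 2654435761u);

	h ^= ((uint32_t)k->src_port << 16) | k->dst_port;
	h ^= h >> 15;
	return (h & (FLOW_SLOTS - 1));
}

/* Inbound path: remember which MAC delivered this flow (overwrites the slot). */
void
flow_learn(const struct flow_key *in, const uint8_t mac[6])
{
	struct flow_entry *e = &flow_table[flow_hash(in)];

	e->key = *in;
	memcpy(e->mac, mac, 6);
	e->used = 1;
}

/*
 * Outbound path: a reply has source and destination swapped, so the
 * lookup is done under the inbound orientation. Returns 1 and fills
 * mac if the flow is known, 0 otherwise.
 */
int
flow_lookup_reply(const struct flow_key *out, uint8_t mac[6])
{
	struct flow_key in = {
		.src_ip = out->dst_ip, .dst_ip = out->src_ip,
		.src_port = out->dst_port, .dst_port = out->src_port
	};
	const struct flow_entry *e = &flow_table[flow_hash(&in)];

	if (!e->used || !key_equal(&e->key, &in))
		return (0);
	memcpy(mac, e->mac, 6);
	return (1);
}
```

Keying on the full 4-tuple rather than on the source IP alone is what allows two concurrent sessions from the same client to return through different routers.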
SO_DONTROUTE, arp's, ipfw fwd, etc
I have a setup where I have a transparent proxy using ipfw fwd (to localhost). Data is sent to this device using a MAC rewrite so that packets arrive with my MAC, but the original source and destination IP. When I receive the SYN, i accept the connection, which causes an ARP to be emitted for the source address, and then the SYN/ACK. Now, I would like to have my default route not be on the 'data' interface which has the ipfw rule. It seems like this would work if: a) the MAC address for the source address (the router which sent me the packet) was entered into the ARP cache automatically when the SYN was received. b) I used SO_DONTROUTE in my proxy application. Does anybody have any comments on that? Is there a reason that learning ARP entries isn't done passively? I assume that since the receive interface is cached in the syncache, and then proxied through to the PCB, that the SO_DONTROUTE will cause the return packets to go back through that same interface? Is there a simpler way? --don ([EMAIL PROTECTED] www.sandvine.com) To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-net in the body of the message
IPFW question with options and fwd rule
If I create a rule to 'fwd' packets with a particular TCP option set (or IP option) to a specific local port, and then I accept on that port, will subsequent packets without that option work? ie, I have this: 100 fwd localhost,9000 tcp from any to any 1234 tcpoptions ts recv interface SYN (TCP option SACK=1), Dest port=, Dest ip = random-host SYN/ACK ACK (no TCP options) will the first SYN reach me? (yes I think, even though the IP is not mine and the dest port is not me, the ipfw fwd magic takes care). Will the ACK from the client reach me? (the dest ip is not me, so will the stack discard, or will the already created PCB take care of this?) I'd like to carry on a normal TCP conversation, but select the local port that terminates it based on a TCP option. The destination IP will be somewhere else (its a transparent proxy application). Thanks in advance. --don ([EMAIL PROTECTED] www.sandvine.com) To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-net in the body of the message
RE: IPFW question with options and fwd rule
From: Julian Elischer [mailto:[EMAIL PROTECTED]] On Tue, 26 Nov 2002, Don Bowman wrote: If I create a rule to 'fwd' packets with a particular TCP option set (or IP option) to a specific local port, and then I accept on that port, will subsequent packets without that option work? ... well, no, because != 1234 :-) but, assuming that your rule said , then it would only reach you if it has the ts option set. To be forwarded a packet must match the rule; subsequent packets must ALSO match the rule. Sigh, I guess TANSTAAFL shows true. I was hoping once the PCB was set up that it could act like some sort of packet attractor. Or in other words, to get the packet stream to play follow the leader on the SYN. --don ([EMAIL PROTECTED] www.sandvine.com)
RE: bge bug w/ out of bounds return receiver, staying in rxeof all the time, patch
From: John Polstra [mailto:[EMAIL PROTECTED]] In article 184f01c291c9$147e7100$[EMAIL PROTECTED], Sam Leffler [EMAIL PROTECTED] wrote: I would recommend a committer look this over and commit it. If you wish, I can make the patch *just* be the change (changing the 16-bit to 32-bit writes, without the VPD stuff), but the other changes seemed generally useful. Please whittle the patch down to just the bug fix; 5.0 is in code freeze. Don't worry, Sam. I'm planning to shepherd this stuff into the tree, but I don't see it happening for 5.0. Be aware that the bge driver is not too useful (and quite dangerous) without this change. Personally I'd like to see it go in 4.8. --don ([EMAIL PROTECTED] www.sandvine.com) To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-net in the body of the message
RE: Sockets and changing IP addresses
From: Wes Peters [mailto:[EMAIL PROTECTED]] Archie Cobbs wrote: I'm curious what -net's opinion is on PR kern/38544: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/38554 In summary: if you have a connected socket whose local IP address is X, and then change the interface IP address from X to Y, then packets written out by the socket will continue to be transmitted with source IP address X. Do people agree that this is a bug and should be fixed? Yes. The other end can't possibly reply to address X, so the connection is broken at this point. I think the current behaviour is correct. Since the IP-MAC lookup will remain cached, the communication will continue to work to the old IP. Changing the IP on the connected socket will make the connection drop. The best case is the way it works.
RE: Sockets and changing IP addresses
From: Archie Cobbs [mailto:[EMAIL PROTECTED]] Sent: November 21, 2002 16:54 To: Don Bowman Cc: 'Wes Peters'; Archie Cobbs; [EMAIL PROTECTED] Subject: Re: Sockets and changing IP addresses Don Bowman wrote: I'm curious what -net's opinion is on PR kern/38544: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/38554 In summary: if you have a connected socket whose local IP address is X, and then change the interface IP address from X to Y, then packets written out by the socket will continue to be transmitted with source IP address X. Do people agree that this is a bug and should be fixed? Yes. The other end can't possibly reply to address X, so the connection is broken at this point. I think the current behaviour is correct. Since the IP-MAC lookup will remain cached, the communication will continue to work to the old IP. Changing the IP on the connected socket will make the connection drop. The best case is the way it works. What you're saying doesn't make sense to me. First of all, this has nothing to do with ARP tables (although you are right that the router's ARP entry for the old IP address will remain valid). Secondly, the communication will NOT work because the host will drop packets sent to it with the (now) wrong IP address. The current behavior is bad because the application does not ever receive any notification that the socket it's using is no longer valid. I guess I was thinking of the transparent proxy case (e.g. Squid) where I have an ipfw fwd rule, and the socket is terminated locally. Changing the IP address of the interface shouldn't drop my proxied connection. --don ([EMAIL PROTECTED] www.sandvine.com)
RE: bge bug w/ out of bounds return receiver, staying in rxeof all the time, patch
From: Sam Leffler [mailto:[EMAIL PROTECTED]] I would recommend a committer look this over and commit it. If you wish, I can make the patch *just* be the change (changing the 16-bit to 32-bit writes, without the VPD stuff), but the other changes seemed generally useful. Please whittle the patch down to just the bug fix; 5.0 is in code freeze. Sam Sigh, I was afraid someone would say that. Will do. The patch is against RELENG_4, but is fairly trivial. It is below, just the bug fix is there (changing the writing to the receiver control block to be 32-bits all the time). Patch follows: Index: if_bge.c === RCS file: /cvs/src/sys/dev/bge/if_bge.c,v retrieving revision 1.3.2.18 diff -U3 -r1.3.2.18 if_bge.c --- if_bge.c2 Nov 2002 18:22:23 - 1.3.2.18 +++ if_bge.c22 Nov 2002 02:01:48 - @@ -913,7 +913,7 @@ { int i; struct bge_rcb *rcb; - struct bge_rcb_opaque *rcbo; + bge_max_len_flags len_flags; for (i = 0; i BGE_JUMBO_RX_RING_CNT; i++) { if (bge_newbuf_jumbo(sc, i, NULL) == ENOBUFS) @@ -923,9 +923,9 @@ sc-bge_jumbo = i - 1; rcb = sc-bge_rdata-bge_info.bge_jumbo_rx_rcb; - rcbo = (struct bge_rcb_opaque *)rcb; - rcb-bge_flags = 0; - CSR_WRITE_4(sc, BGE_RX_JUMBO_RCB_MAXLEN_FLAGS, rcbo-bge_reg2); + len_flags.bge_len_flags = rcb-bge_len_flags.bge_len_flags; + len_flags.s.bge_flags = 0; + CSR_WRITE_4(sc, BGE_RX_JUMBO_RCB_MAXLEN_FLAGS, len_flags.bge_len_flags); CSR_WRITE_4(sc, BGE_MBX_RX_JUMBO_PROD_LO, sc-bge_jumbo); @@ -1133,6 +1133,7 @@ struct bge_rcb *rcb; struct bge_rcb_opaque *rcbo; int i; + bge_max_len_flags len_flags; /* * Initialize the memory window pointer register so that @@ -1202,12 +1203,13 @@ rcb = sc-bge_rdata-bge_info.bge_std_rx_rcb; BGE_HOSTADDR(rcb-bge_hostaddr) = vtophys(sc-bge_rdata-bge_rx_std_ring); - rcb-bge_max_len = BGE_MAX_FRAMELEN; + len_flags.s.bge_max_len = BGE_MAX_FRAMELEN; + len_flags.s.bge_flags = 0; + rcb-bge_len_flags.bge_len_flags = len_flags.bge_len_flags; if (sc-bge_extram) rcb-bge_nicaddr = BGE_EXT_STD_RX_RINGS; else rcb-bge_nicaddr = 
BGE_STD_RX_RINGS; - rcb-bge_flags = 0; rcbo = (struct bge_rcb_opaque *)rcb; CSR_WRITE_4(sc, BGE_RX_STD_RCB_HADDR_HI, rcbo-bge_reg0); CSR_WRITE_4(sc, BGE_RX_STD_RCB_HADDR_LO, rcbo-bge_reg1); @@ -1224,12 +1226,13 @@ rcb = sc-bge_rdata-bge_info.bge_jumbo_rx_rcb; BGE_HOSTADDR(rcb-bge_hostaddr) = vtophys(sc-bge_rdata-bge_rx_jumbo_ring); - rcb-bge_max_len = BGE_MAX_FRAMELEN; + len_flags.s.bge_max_len = BGE_MAX_FRAMELEN; + len_flags.s.bge_flags = BGE_RCB_FLAG_RING_DISABLED; + rcb-bge_len_flags.bge_len_flags = len_flags.bge_len_flags; if (sc-bge_extram) rcb-bge_nicaddr = BGE_EXT_JUMBO_RX_RINGS; else rcb-bge_nicaddr = BGE_JUMBO_RX_RINGS; - rcb-bge_flags = BGE_RCB_FLAG_RING_DISABLED; rcbo = (struct bge_rcb_opaque *)rcb; CSR_WRITE_4(sc, BGE_RX_JUMBO_RCB_HADDR_HI, rcbo-bge_reg0); @@ -1239,7 +1242,9 @@ /* Set up dummy disabled mini ring RCB */ rcb = sc-bge_rdata-bge_info.bge_mini_rx_rcb; - rcb-bge_flags = BGE_RCB_FLAG_RING_DISABLED; + len_flags.s.bge_max_len = 0; + len_flags.s.bge_flags = BGE_RCB_FLAG_RING_DISABLED; + rcb-bge_len_flags.bge_len_flags = len_flags.bge_len_flags; rcbo = (struct bge_rcb_opaque *)rcb; CSR_WRITE_4(sc, BGE_RX_MINI_RCB_MAXLEN_FLAGS, rcbo-bge_reg2); @@ -1259,8 +1264,9 @@ rcb = (struct bge_rcb *)(sc-bge_vhandle + BGE_MEMWIN_START + BGE_SEND_RING_RCB); for (i = 0; i BGE_TX_RINGS_EXTSSRAM_MAX; i++) { - rcb-bge_flags = BGE_RCB_FLAG_RING_DISABLED; - rcb-bge_max_len = 0; + len_flags.s.bge_max_len = 0; + len_flags.s.bge_flags = BGE_RCB_FLAG_RING_DISABLED; + rcb-bge_len_flags.bge_len_flags = len_flags.bge_len_flags; rcb-bge_nicaddr = 0; rcb++; } @@ -1272,17 +1278,20 @@ BGE_HOSTADDR(rcb-bge_hostaddr) = vtophys(sc-bge_rdata-bge_tx_ring); rcb-bge_nicaddr = BGE_NIC_TXRING_ADDR(0, BGE_TX_RING_CNT); - rcb-bge_max_len = BGE_TX_RING_CNT; - rcb-bge_flags = 0; + len_flags.s.bge_max_len = BGE_TX_RING_CNT; + len_flags.s.bge_flags = 0; + rcb-bge_len_flags.bge_len_flags = len_flags.bge_len_flags; /* Disable all unused RX return rings */ rcb = (struct bge_rcb *)(sc-bge_vhandle 
+ BGE_MEMWIN_START + BGE_RX_RETURN_RING_RCB); - for (i = 0; i BGE_RX_RINGS_MAX; i++) { + rcb++; + for (i = 1; i BGE_RX_RINGS_MAX; i++) {
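The idea behind the fix, building the ring length and the flags as one 32-bit value so that the chip's memory window sees a single 32-bit write instead of two 16-bit ones, can be sketched outside the driver as below. This is a minimal sketch under a stated assumption: the length is taken to occupy the upper 16 bits of the word and the flags the lower 16. The patch itself uses a union (bge_max_len_flags) whose definition is not shown in this message, so the helper names here are hypothetical and the layout should be checked against if_bgereg.h.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Combine a 16-bit length and 16-bit flags into one 32-bit word.
 * Assumed layout: length in bits 31..16, flags in bits 15..0.
 */
static inline uint32_t
rcb_maxlen_flags(uint16_t max_len, uint16_t flags)
{
	return (((uint32_t)max_len << 16) | flags);
}

/* Extract the length field from the combined word. */
static inline uint16_t
rcb_max_len(uint32_t word)
{
	return ((uint16_t)(word >> 16));
}

/* Extract the flags field from the combined word. */
static inline uint16_t
rcb_flags(uint32_t word)
{
	return ((uint16_t)(word & 0xffffu));
}
```

With this shape, a caller computes the whole word first and hands it to a single 32-bit register or memory-window write, so the device never observes a half-updated length/flags pair.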
bge bug w/ out of bounds return receiver, staying in rxeof all the time, patch
(apologies if you got this more than once, but after 6 hours it hadn't shown up on the mailing list) There is a bug in the STABLE (and current) if_bge which causes the driver to loop forever in interrupt context (in bge_rxeof()). This is caused by the return ring length being 1024 in the driver, and erroneously decided to be 2048 in the chip, which causes it to return an index off the end off the ring. You will know you are running into this if your kernel locks up, ^T still works, and the debugger shows you in bge_rxeof() or a routine called from it. This situation can occur regardless of traffic. It seems to either work or not work from the get-go, so if you are going to run into it, it will be boolean from the machine startup. The patch attached solves this problem by changing the 16-bit writes into the chip's memory window to 32-bit writes. The patch also enables the PCI-VPD (See PCI 2.2) output (to help diagnose which version of the chip you have, whose board, how fast the PCI clock is etc). I would recommend a committer look this over and commit it. If you wish, I can make the patch *just* be the change (changing the 16-bit to 32-bit writes, without the VPD stuff), but the other changes seemed generally useful. Index: if_bge.c === RCS file: /cvs/src/sys/dev/bge/if_bge.c,v retrieving revision 1.3.2.18 diff -U3 -r1.3.2.18 if_bge.c --- if_bge.c2 Nov 2002 18:22:23 - 1.3.2.18 +++ if_bge.c21 Nov 2002 20:13:23 - @@ -114,6 +114,7 @@ #include dev/bge/if_bgereg.h #define BGE_CSUM_FEATURES (CSUM_IP | CSUM_TCP | CSUM_UDP) +#define BGE_VPD /* controller miibus0 required. See GENERIC if you get errors here. 
*/ #include miibus_if.h @@ -178,6 +179,7 @@ static u_int8_tbge_eeprom_getbyte __P((struct bge_softc *, int, u_int8_t *)); static int bge_read_eeprom __P((struct bge_softc *, caddr_t, int, int)); +static void dump_manufacturing_information __P((struct bge_softc *)); static u_int32_t bge_crc __P((caddr_t)); static void bge_setmulti __P((struct bge_softc *)); @@ -200,11 +202,12 @@ static int bge_chipinit__P((struct bge_softc *)); static int bge_blockinit __P((struct bge_softc *)); -#ifdef notdef +#ifdef BGE_VPD +static void bge_vpd_crack __P((struct bge_softc *sc)); static u_int8_t bge_vpd_readbyte __P((struct bge_softc *, int)); static void bge_vpd_read_res __P((struct bge_softc *, struct vpd_res *, int)); -static void bge_vpd_read __P((struct bge_softc *)); +static void bge_vpd_read __P((struct bge_softc *, const char *)); #endif static u_int32_t bge_readmem_ind @@ -311,7 +314,7 @@ return; } -#ifdef notdef +#ifdef BGE_VPD static u_int8_t bge_vpd_readbyte(sc, addr) struct bge_softc *sc; @@ -355,9 +358,54 @@ return; } +/* + * Take the read-only (VPD-R) info and crack it into the other fields +*/ +static void +bge_vpd_crack(sc) + struct bge_softc *sc; +{ + int pos = 0; + int len = strlen(sc-bge_vpd_readonly); + sc-bge_vpd_pn = unknown; + sc-bge_vpd_ec = unknown; + sc-bge_vpd_mn = unknown; + sc-bge_vpd_sn = unknown; + sc-bge_vpd_rv = unknown; + while (pos len) { + if (!strncmp(sc-bge_vpd_readonly+pos, VPD_PN, 2)) { + sc-bge_vpd_pn = (sc-bge_vpd_readonly+pos+3); + } else if (!strncmp(sc-bge_vpd_readonly+pos, VPD_EC, 2)) { + sc-bge_vpd_ec = (sc-bge_vpd_readonly+pos+3); + } else if (!strncmp(sc-bge_vpd_readonly+pos, VPD_MN, 2)) { + sc-bge_vpd_mn = (sc-bge_vpd_readonly+pos+3); + } else if (!strncmp(sc-bge_vpd_readonly+pos, VPD_SN, 2)) { + sc-bge_vpd_sn = (sc-bge_vpd_readonly+pos+3); + } else if (!strncmp(sc-bge_vpd_readonly+pos, VPD_RV, 2)) { + sc-bge_vpd_rv = (sc-bge_vpd_readonly+pos+3); + } + sc-bge_vpd_readonly[pos] = '\0'; + pos += 2; + pos += sc-bge_vpd_readonly[pos]; 
+ pos++; + } + pos = 0; + len = strlen(sc-bge_vpd_readwrite); + while (pos len) { + if (!strncmp(sc-bge_vpd_readwrite+pos, VPD_YA, 2)) { + sc-bge_vpd_asset_tag = (sc-bge_vpd_readwrite+pos+3); + } + sc-bge_vpd_readwrite[pos] = '\0'; + pos += 2; + pos += sc-bge_vpd_readwrite[pos]; + pos++; + } +} + static void -bge_vpd_read(sc) +bge_vpd_read(sc, defname) struct bge_softc *sc; + const char *defname; { int pos = 0, i; struct vpd_res res; @@ -366,14 +414,20 @@ free(sc-bge_vpd_prodname, M_DEVBUF); if (sc-bge_vpd_readonly != NULL)
RE: bug in bge driver with ENOBUFS on 4.7
From: Don Bowman [mailto:don;sandvine.com] In bge_rxeof(), there can end up being a condition which causes the driver to endlessly interrupt.

	if (bge_newbuf_std(sc, sc->bge_std, NULL) == ENOBUFS) {
		ifp->if_ierrors++;
		bge_newbuf_std(sc, sc->bge_std, m);
		continue;
	}

happens. Now, bge_newbuf_std returns ENOBUFS. 'm' is also NULL. This causes the received packet to not be dequeued, and the driver will then go straight back into interrupt as the chip will reassert the interrupt as soon as we return.

More information... It would appear that we're looping here in the rx interrupt; the variable 'stdcnt', which counts the number of standard-sized packets pulled off per iteration, is huge (indicating we've overrun the ring multiple times).

	while(sc->bge_rx_saved_considx != sc->bge_rdata->bge_status_block.bge_idx[0].bge_rx_prod_idx) {

is the construct that controls when we exit the loop. Clearly in my case this is never becoming false. I see 'sc->bge_rx_saved_considx' as 201, and the RHS of the expression as 38442. This doesn't seem correct, I think that both numbers must be <= BGE_SSLOTS.

(kgdb) p/x *cur_rx $10 = {bge_addr = {bge_addr_hi = 0x0, bge_addr_lo = 0xca2d802}, bge_len = 0x4a, bge_idx = 0xc8, bge_flags = 0x7004, bge_type = 0x0, bge_tcp_udp_csum = 0x9992, bge_ip_csum = 0x, bge_vlan_tag = 0x0, bge_error_flag = 0x0, bge_rsvd = 0x0, bge_opaque = 0x0}

Any suggestions anyone?
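Why an out-of-range producer index keeps a loop of this form spinning can be shown with a small user-space sketch. It assumes the consumer index is advanced modulo the ring length, so it only ever takes values below that length; a producer index such as 38442 is then a value the consumer can never reach. The ring length of 1024 is the figure mentioned elsewhere in this thread for the return ring, and ring_pending() is a hypothetical helper, not a driver function.

```c
#include <assert.h>
#include <stdint.h>

#define RING_CNT 1024u	/* ring length the driver expects (assumed for the example) */

/*
 * Count the entries between the consumer and producer index.
 * Returns -1 if either index is outside the ring. For a loop of the form
 *     while (cons != prod) { ...; cons = (cons + 1) % RING_CNT; }
 * an out-of-range prod means the exit condition can never become true.
 */
int
ring_pending(uint32_t cons, uint32_t prod)
{
	if (cons >= RING_CNT || prod >= RING_CNT)
		return (-1);
	return ((int)((prod + RING_CNT - cons) % RING_CNT));
}
```

A check like this before entering the loop would at least turn a silent hang into a reportable error, although the real fix is whatever stops the chip from reporting such an index in the first place.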
RE: Packet forwarding overhead - with ipfw counting
From: Kevin Day [mailto:toasty;dragondata.com] When we're pushing 250-300mbits through, we're using about 15% of its 2.4Ghz P4 Xeon CPU. All of it is in interrupt time... that seems a bit high, but that'll still let us max things out at 1gbit so we're ok. Try applying these diff to your bge driver, it should reduce your interrupt time substantially in this configuration. I also increased net.inet.ip.intr_queue_maxlen to 500 from 50 since I was seeing drops. Out of curiousity, which motherboard is this? I've been doing some modelling using the e7500 vs serverworks, and the serverworks is significantly better, but no one seems to make a 1U one with 2 PCI-X slots. The e7500 has a 1GB/s half-duplex hublink versus the 3.2GB/s full-duplex one on the GC-LE. Index: if_bge.c === RCS file: /cvs/src/sys/dev/bge/if_bge.c,v retrieving revision 1.3.2.18 diff -C5 -r1.3.2.18 if_bge.c *** if_bge.c2 Nov 2002 18:22:23 - 1.3.2.18 --- if_bge.c10 Nov 2002 16:12:03 - *** *** 1654,1668 error = ENXIO; goto fail; } /* Set default tuneable values. */ sc-bge_stat_ticks = BGE_TICKS_PER_SEC; ! sc-bge_rx_coal_ticks = 150; ! sc-bge_tx_coal_ticks = 150; ! sc-bge_rx_max_coal_bds = 64; ! sc-bge_tx_max_coal_bds = 128; /* Set up ifnet structure */ ifp = sc-arpcom.ac_if; ifp-if_softc = sc; ifp-if_unit = sc-bge_unit; --- 1654,1692 error = ENXIO; goto fail; } /* Set default tuneable values. */ + /* How often should we update the statistics in host memory? */ sc-bge_stat_ticks = BGE_TICKS_PER_SEC; ! /* The coalescing works as follows: for each of Rx|Tx, there ! * are two tunables: ticks, and packets. The first one to trip ! * will cause an interrupt. For exampple, if the ticks is set to !* 1us, an interrupt will be generated no more than 1us after !* a packet has come in. If the bds is set to 10, then the !* interrupt would be after 10 packets had been received. !* If ticks=1 and bds=10, then the interrupt will come in !* min(1us, 10packets time), likely 1us. 
!* Tuning these to larger values reduces interrupts at the !* expense of latency to interactive applications. If you !* are serving files, make these large. If you are running !* telnet sessions, make them small. !* !* The settings below, 500us means a max interrupt rate !* of 2000/s due to the ticks elapsing, and 120 means !* a peak interrupt rate of ~2000/s due to avg packets (512) arriving !* (for min sized packets this would be 870, for max !* sized packets it would be 41: 1Gps / ((8*size)+96)) !*/ ! /* RX Interrupt no more than every 500 us */ ! sc-bge_rx_coal_ticks = 500; ! /* TX Interrupt no more than every 500 us */ ! sc-bge_tx_coal_ticks = 500; ! /* RX Interrupt no more than every 120 packets */ ! sc-bge_rx_max_coal_bds = 120; ! /* TX Interrupt no more than every 120 packets */ ! sc-bge_tx_max_coal_bds = 120; /* Set up ifnet structure */ ifp = sc-arpcom.ac_if; ifp-if_softc = sc; ifp-if_unit = sc-bge_unit; Index: if_bgereg.h === RCS file: /cvs/src/sys/dev/bge/if_bgereg.h,v retrieving revision 1.1.2.7 diff -C5 -r1.1.2.7 if_bgereg.h *** if_bgereg.h 2 Nov 2002 18:17:55 - 1.1.2.7 --- if_bgereg.h 10 Nov 2002 16:12:21 - *** *** 2057,2068 * Memory management stuff. Note: the SSLOTS, MSLOTS and JSLOTS * values are tuneable. They control the actual amount of buffers * allocated for the standard, mini and jumbo receive rings. */ ! #define BGE_SSLOTS256 ! #define BGE_MSLOTS256 #define BGE_JSLOTS384 #define BGE_JRAWLEN (BGE_JUMBO_FRAMELEN + ETHER_ALIGN + sizeof(u_int64_t)) #define BGE_JLEN (BGE_JRAWLEN + (sizeof(u_int64_t) - \ (BGE_JRAWLEN % sizeof(u_int64_t --- 2057,2068 * Memory management stuff. Note: the SSLOTS, MSLOTS and JSLOTS * values are tuneable. They control the actual amount of buffers * allocated for the standard, mini and jumbo receive rings. */ ! #define BGE_SSLOTS384 ! 
#define BGE_MSLOTS384 #define BGE_JSLOTS384 #define BGE_JRAWLEN (BGE_JUMBO_FRAMELEN + ETHER_ALIGN + sizeof(u_int64_t)) #define BGE_JLEN (BGE_JRAWLEN + (sizeof(u_int64_t) - \ (BGE_JRAWLEN % sizeof(u_int64_t --don ([EMAIL PROTECTED] www.sandvine.com) To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-net in the body of the message
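The arithmetic in the coalescing comment of the patch above can be worked through with a few helper functions. This is a sketch that simply applies the formula quoted in that comment, link rate / ((8 * size) + 96), and divides by the packet threshold; the function names are made up for the example, and the tick values are assumed to be in microseconds as the comment states.

```c
#include <assert.h>

/* Packets per second on a link, using the patch comment's formula. */
double
packets_per_sec(double link_bps, unsigned pkt_bytes)
{
	return (link_bps / (8.0 * pkt_bytes + 96.0));
}

/* Interrupt rate if only the tick timer (in microseconds) fires. */
double
irq_rate_from_ticks(unsigned coal_ticks_us)
{
	return (1e6 / coal_ticks_us);
}

/* Interrupt rate if only the packet-count threshold fires. */
double
irq_rate_from_bds(double link_bps, unsigned pkt_bytes, unsigned max_coal_bds)
{
	return (packets_per_sec(link_bps, pkt_bytes) / max_coal_bds);
}
```

For the values in the patch, irq_rate_from_ticks(500) gives 2000 interrupts per second, and irq_rate_from_bds(1e9, 512, 120) gives just under 2000 per second for 512-byte packets at 1 Gbit/s, which is how the two settings were matched to each other.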
Suggestions for tcbhashsize size?
Are there any guidelines for setting the tcbhashsize ? I have a system which I'm expecting to keep ~50K TCP connections going. Does it follow standard hash table rules that it should be less than half full? I currently have net.inet.tcp.tcbhashsize: 4096 --don ([EMAIL PROTECTED] www.sandvine.com) To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-net in the body of the message
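To put numbers on the question, the sketch below computes the average chain length for a given bucket count and the bucket count that would keep the table at most half full. It is plain arithmetic, not a statement about how the stack behaves. Rounding up to a power of two is an assumption here, based on the understanding that the TCB hash size is expected to be a power of two and is set as a boot-time tunable; the function names are illustrative.

```c
#include <assert.h>

/* Average number of connections per hash bucket. */
double
avg_chain_len(unsigned long conns, unsigned long buckets)
{
	return ((double)conns / (double)buckets);
}

/* Smallest power of two that is >= n (for n >= 1). */
unsigned long
next_pow2(unsigned long n)
{
	unsigned long p = 1;

	while (p < n)
		p <<= 1;
	return (p);
}

/* Bucket count that keeps the table at most half full. */
unsigned long
half_full_buckets(unsigned long conns)
{
	return (next_pow2(2 * conns));
}
```

For ~50K connections, avg_chain_len(50000, 4096) is a little over 12 entries per bucket, while half_full_buckets(50000) comes out at 131072.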
bug in bge driver with ENOBUFS on 4.7
In bge_rxeof(), there can end up being a condition which causes the driver to endlessly interrupt.

	if (bge_newbuf_std(sc, sc->bge_std, NULL) == ENOBUFS) {
		ifp->if_ierrors++;
		bge_newbuf_std(sc, sc->bge_std, m);
		continue;
	}

happens. Now, bge_newbuf_std returns ENOBUFS. 'm' is also NULL. This causes the received packet to not be dequeued, and the driver will then go straight back into interrupt as the chip will reassert the interrupt as soon as we return. Suggestions on a fix? I'm not sure why I ran out of mbufs, I have kern.ipc.nmbclusters: 9 kern.ipc.nmbufs: 28

(kgdb) p/x mbstat $11 = {m_mbufs = 0x3a0, m_clusters = 0x39c, m_spare = 0x0, m_clfree = 0x212, m_drops = 0x0, m_wait = 0x0, m_drain = 0x0, m_mcfail = 0x0, m_mpfail = 0x0, m_msize = 0x100, m_mclbytes = 0x800, m_minclsize = 0xd5, m_mlen = 0xec, m_mhlen = 0xd4}

but bge_newbuf_std() does this:

	if (m == NULL) {
		MGETHDR(m_new, M_DONTWAIT, MT_DATA);
		if (m_new == NULL) {
			return(ENOBUFS);
		}

and then returns ENOBUFS. This is with 4.7-RELEASE. --don ([EMAIL PROTECTED] www.sandvine.com)
RE: dhclient turns ethernet card off
From: alexis georges [mailto:floating_in_space_;hotmail.com] Hey guys, we had a power cut yesterday and everything went down at our home. When we got electricity back, my internet would not work (only on my computer, actually). I found that my Ethernet card would not turn on. Or rather, as I just found out, it does turn on until it gets to the 'dhclient dc0' line in rc.conf, which I need. Basically, during boot-up it turns on, and when it has to execute the dhclient line, the light on the card disappears. It's weird. I have a Linksys (LNE TX?). Anyway, the way I have to have my card going is by having start_if.dc0 with a line that sets my card to half-duplex (it needs to be like this), and then in rc.conf I have ifconfig_dc0=DHCP. Does anyone know what could cause my card to literally shut down on dhclient? I already tried changing PCI slots, just in case, but nothing changed. Thanks in advance.
Some routers disable ports if they see too many errors, e.g. due to a duplex mismatch. Is your router set to auto? And is the NIC as well? --don ([EMAIL PROTECTED] www.sandvine.com)
RE: MTU problems ...
From: Julian Elischer [mailto:julian;elischer.org] There is a program that intercepts tcp session negotiation and artificially reduces the negotiated MTU but I can't find it right now.. I think it was called mssd or something. /usr/ports/net/tcpmssd --don ([EMAIL PROTECTED] www.sandvine.com)
RE: Problem in High Speed and Long Delay with FreeBSD
From: Fran Lawas-Grodek [mailto:Fran.Lawas-Grodek;grc.nasa.gov] Perhaps sysctl net.inet.tcp.inflight_enable=1 will help? You may also wish to change net.inet.tcp.inflight_max. See tcp(4) as of 4.7. --don ([EMAIL PROTECTED] www.sandvine.com)
RE: Problem in High Speed and Long Delay with FreeBSD
From: Fran Lawas-Grodek [mailto:Fran.Lawas-Grodek;grc.nasa.gov] Well... our development code that we are to ultimately test was developed on 4.1, thus we really need to try to stick with 4.1. It does not look like either of the above parameters is available until 4.7.
No worries. Have you checked that both sides are negotiating SACK? And that both sides are negotiating a sufficiently large window scale option? (It sounds like you need a window scale option of at least 5 bits?) And that the socket buffer for ttcp is actually being set as large as you think? (Perhaps run 'ktrace' or 'truss' on ttcp and look for an error on the setsockopt.) http://www.rfc-editor.org/rfc/rfc1323.txt has some other suggestions I think, but I'm guessing you've already gone over it. --don ([EMAIL PROTECTED] www.sandvine.com)
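As a rough illustration of where a figure like "5 bits" can come from, the sketch below computes a bandwidth-delay product and the smallest RFC 1323 shift count that covers it. The link numbers in the note that follows (30 Mbit/s, 500 ms round trip) are invented for the example and are not taken from this thread; the function names are hypothetical too.

```c
#include <stdint.h>

/* Bandwidth-delay product in bytes for a path with the given
 * bit rate and round-trip time in milliseconds. */
uint64_t bdp_bytes(uint64_t bits_per_sec, uint64_t rtt_ms)
{
    return bits_per_sec / 8 * rtt_ms / 1000;
}

/* Smallest RFC 1323 shift count whose scaled 16-bit window
 * (65535 << shift) covers `window` bytes. Returns -1 if even the
 * maximum shift of 14 is not enough. */
int min_window_shift(uint64_t window)
{
    int shift;

    for (shift = 0; shift <= 14; shift++)
        if (((uint64_t)65535 << shift) >= window)
            return shift;
    return -1;
}
```

For the assumed 30 Mbit/s path with a 500 ms round trip, the product is 1875000 bytes and min_window_shift() gives 5. The socket buffers on both ends also need to be at least that large for the scaled window to be usable.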
RE: Problem in High Speed and Long Delay with FreeBSD
From: Mark Allman [mailto:mallman;grc.nasa.gov] Thanks! Other ideas? What MSS is advertised on each end? --don ([EMAIL PROTECTED] www.sandvine.com)
RE: Data payload in SYN packet
From: David Myer [mailto:davidmyer800;yahoo.com] Just curious about one thing: we know that a SYN packet can carry a data payload, but I have never seen any implementation that actually does this.
See T/TCP, RFC 1644, and sysctl 'net.inet.tcp.rfc1644' --don ([EMAIL PROTECTED] www.sandvine.com)
RE: ng_fec hash mechanism versus cisco etherchannel
From: Petri Helenius [mailto:pete;he.iki.fi] It does not matter if you send using the other link as long as you send all packets for the same stream over the same link to avoid reordering. So yes, it does interoperate.
Can you end up with a link flap? E.g. the Catalyst does SA (source address) learning to pick the port, so it sends the packet out port 1. We respond via port 2, since we use SIP^DIP. The Catalyst switches that through to the other end, which replies, and the reply comes back via port 1. I guess this isn't tragic. --don ([EMAIL PROTECTED] www.sandvine.com)
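The SIP^DIP selection can be pictured with a small sketch. This is not ng_fec's actual code; pick_link() and the folding steps are hypothetical, and only meant to show why every packet of a stream lands on one link while the switch on the other side remains free to pick a different one.

```c
#include <stdint.h>

/* Pick an outgoing link from the XOR of source and destination IPv4
 * addresses. XOR is symmetric, so both directions of a stream hash to
 * the same link on this host; the peer switch still chooses its own
 * port independently. */
unsigned pick_link(uint32_t sip, uint32_t dip, unsigned nlinks)
{
    uint32_t h = sip ^ dip;

    h ^= h >> 16;   /* fold the high bits down so they affect the result */
    h ^= h >> 8;
    return h % nlinks;
}
```

Because the result depends only on the address pair, packets of one stream are never spread over both links, which is what avoids reordering.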
RE: spoofing source code in kernel
From: sepehr sohrabi [mailto:sepehr_soh;hotmail.com] Hi list. Does anyone have source code for spoofing (in kernel) all incoming TCP/IP packets? For any TCP/IP packet received it creates an ACK for it, something like a spoofing GW: CLIENT - GW --- server, where the connections are spoofed. Thanks.
ipfw with a 'fwd' rule will let you do something like this. Run a user-mode application on port X, then do

ipfw add fwd localhost,X tcp from any to any recv myinterface

and any inbound TCP connection will be terminated locally. --don ([EMAIL PROTECTED] www.sandvine.com p2p)
RE: Annoying ARP warning messages.
From: Julian Elischer [mailto:julian;elischer.org] On Mon, 28 Oct 2002, Sean Chittenden wrote: In this example, does the xl0 interface share the same MAC address? umm actually, yes.. sends switches insane.. :-) if you don't do the step about source MAC address replacement then they have different addresses. (though I can't guarantee that)
Is there support for 802.3ad in FreeBSD? This would be the best way to gang interfaces together in a standard fashion. It involves LACP (Link Aggregation Control Protocol), which prevents loops @ L2 (I think it's an extension of STP). Packet reordering is also solved (the simple round-robin scheme achieves rather poor performance due to this problem). Another way to do it is with OSPF ECMP (Equal-Cost Multipath Routing); it depends on whether you think L2 is cool or L3 :) --don ([EMAIL PROTECTED] www.sandvine.com)
RE: Annoying ARP warning messages.
From: Julian Elischer [mailto:julian;elischer.org] Is there support for 802.3ad in FreeBSD? This would be the best way to gang interfaces together in a standard fashion. It involves LACP (Link Aggregation Control Protocol), which prevents loops @ L2 (I think it's an extension of STP). Packet reordering is also solved (the simple round-robin scheme achieves rather poor performance due to this problem). This could be (relatively) easy in netgraph.. it was designed for that sort of thing.
I assume you mean with a user-mode daemon, sort of a LACPD, like in the Linux model (http://www.st.rim.or.jp/~yumo/), and then a version of one2many that does the src^dst hash to prevent re-ordering? Or would you implement the control protocol inside netgraph as well? On a side note, is there anything netgraph can't solve :) --don ([EMAIL PROTECTED] www.sandvine.com)
RE: device fxp cannot detect Intel On-Board LAN
From: Ng Wee Yong [mailto:ngweeyong;yahoo.com.sg] I just installed the FreeBSD 4.6.2-STABLE version. My motherboard is an MSI 845GE Max-L with a 1.8GHz Pentium 4; the on-board LAN is an Intel 82562. FreeBSD works fine except that it cannot detect my on-board Intel LAN. ...
kern/39974 describes the issue. http://www.geocrawler.com/archives/3/145/2002/6/50/9058043/ has a solution for you: changing one line in the fxp driver to give it this PCI vendor/device id. There is a comment there, "Committed to -current, will be MFC'd to -stable very soon.", suggesting this might be in 4.7-STABLE already. --don ([EMAIL PROTECTED] www.sandvine.com)
RE: Annoying ARP warning messages.
Kevin Stevens wrote: I have two systems connected through a common network (switch). They each have two NICs, with one addressed on one IP network and the second on another. IP works fine. My problem is that the kernel keeps bitching about seeing the same MAC addresses on both interfaces: Oct 26 06:15:03 babelfish /kernel: arp: 192.168.168.101 is on em0 but got reply from 00:30:65:00:e6:e6 on xl0
sysctl net.link.ether.inet.log_arp_wrong_iface=0 --don ([EMAIL PROTECTED] www.sandvine.com p2p)
RE: Annoying ARP warning messages.
From: Julian Elischer [mailto:julian;elischer.org] (Removed the question of why one would have two NICs on the same network; sending this for the general enlightenment of the list...) This is reasonably common in L2 switched Ethernet. You have a device which segments the traffic just fine with MAC learning. You have the cables all going to the desktops. You don't want to muck around with partially supported VLAN tagging @ the desktop. So you run another network over the top of the same Ethernet. You probably wouldn't architect it that way up front (although I have in our lab: we use a cat6k as a virtual patch panel, but individual tests use whatever IPs they desire). @ the Ethernet level, addressing is done only via MAC address. Having two packets on the same wire with differing IP subnets is legal (in fact, you see it all the time when the destination or source address is off your network). ARPs and all-1's broadcasts (e.g. DHCP) make a bit of a mess of such a network, but sometimes that's the lesser evil. This can also be seen, believe it or not, on a routed network, if you have something like spanning tree protocol which hasn't converged yet but has been set for rapid convergence (which assumes the path isn't a loop until it discovers otherwise). Routers and switches are merging. --don ([EMAIL PROTECTED] www.sandvine.com p2p)
panic in 4.7 in close / sbdrop
I have a machine running 4.7. I can panic it by sending a reasonably high load of tcp open/close from/to it. The trace below is from a socket from localhost to localhost (sendmail). The max number of open file descriptors I would have had would be ~4500. The rx buffer says it has 43008 bytes, but there are no mbufs chained. The system was not out of mbufs or clusters. Suggestions on what I might look @? #0 dumpsys () at /usr/src/sys/kern/kern_shutdown.c:487 #1 0xc01c41c7 in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:316 #2 0xc01c4639 in panic (fmt=0xc0331205 sbdrop) at /usr/src/sys/kern/kern_shutdown.c:595 #3 0xc01e60e7 in sbdrop (sb=0xeaf677e8, len=43008) at /usr/src/sys/kern/uipc_socket2.c:877 #4 0xc01e607c in sbflush (sb=0xeaf677e8) at /usr/src/sys/kern/uipc_socket2.c:852 #5 0xc022697f in tcp_disconnect (tp=0xecf24a40) at /usr/src/sys/netinet/tcp_usrreq.c:1077 #6 0xc02260f2 in tcp_usr_disconnect (so=0xeaf677a0) at /usr/src/sys/netinet/tcp_usrreq.c:406 #7 0xc01e3450 in sodisconnect (so=0xeaf677a0) at /usr/src/sys/kern/uipc_socket.c:422 #8 0xc01e326a in soclose (so=0xeaf677a0) at /usr/src/sys/kern/uipc_socket.c:302 #9 0xc01d73fa in soo_close (fp=0xd049ab80, p=0xe91bd5a0) at /usr/src/sys/kern/sys_socket.c:195 #10 0xc01b9c37 in fdrop (fp=0xd049ab80, p=0xe91bd5a0) at /usr/src/sys/sys/file.h:217 #11 0xc01b9b7f in closef (fp=0xd049ab80, p=0xe91bd5a0) at /usr/src/sys/kern/kern_descrip.c:1277 #12 0xc01b978c in fdfree (p=0xe91bd5a0) at /usr/src/sys/kern/kern_descrip.c:1059 #13 0xc01bc475 in exit1 (p=0xe91bd5a0, rv=0) at /usr/src/sys/kern/kern_exit.c:187 #14 0xc01bc2dc in exit1 (p=0xe91bd5a0, rv=16777218) at /usr/src/sys/kern/kern_exit.c:103 #15 0xc02edc71 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 0, tf_esi = 15, tf_ebp = -1077950764, tf_isp = -221909036, tf_ebx = 0, tf_edx = 126, tf_ecx = -1077950820, tf_eax = 1, tf_trapno = 0, tf_err = 2, tf_eip = 673302376, tf_cs = 31, tf_eflags = 659, tf_esp = -1077950856, tf_ss = 47}) at 
/usr/src/sys/i386/i386/trap.c:1175
#16 0xc02da38b in Xint0x80_syscall ()

void
sbdrop(sb, len)
        register struct sockbuf *sb;
        register int len;
{
        register struct mbuf *m;
        struct mbuf *next;

        next = (m = sb->sb_mb) ? m->m_nextpkt : 0;
        while (len > 0) {
                if (m == 0) {
                        if (next == 0)
                                panic("sbdrop");
                        m = next;
                        next = m->m_nextpkt;
                        continue;
                }

(kgdb) p/x *sb
$39 = {sb_cc = 0xa800, sb_hiwat = 0xe000, sb_mbcnt = 0xbd00, sb_mbmax = 0x4, sb_lowat = 0x1, sb_mb = 0x0, sb_mbtail = 0x0, sb_lastrecord = 0x0, sb_sel = {si_pid = 0x0, si_note = {slh_first = 0x0}, si_flags = 0x0}, sb_flags = 0x0, sb_timeo = 0x0}

called from:

void
sbflush(sb)
        register struct sockbuf *sb;
{
        KASSERT((sb->sb_flags & SB_LOCK) == 0, ("sbflush: locked"));
        while (sb->sb_mbcnt)
                sbdrop(sb, (int)sb->sb_cc);

called from:

static struct tcpcb *
tcp_disconnect(tp)
        register struct tcpcb *tp;
{
        struct socket *so = tp->t_inpcb->inp_socket;

        if (tp->t_state < TCPS_ESTABLISHED)
                tp = tcp_close(tp);
        else if ((so->so_options & SO_LINGER) && so->so_linger == 0)
                tp = tcp_drop(tp, 0);
        else {
                soisdisconnecting(so);
                sbflush(&so->so_rcv);
                tp = tcp_usrclosed(tp);
                if (tp)
                        (void) tcp_output(tp);
        }
        return (tp);
}

(kgdb) p/x *tp
$44 = {t_segq = {lh_first = 0x0}, t_dupacks = 0x0, unused = 0x0, tt_rexmt = 0xecf24b24, tt_persist = 0xecf24b3c, tt_keep = 0xecf24b54, tt_2msl = 0xecf24b6c, tt_delack = 0xecf24b84, t_inpcb = 0xecf24980, t_state = 0x4, t_flags = 0x801e0, t_force = 0x0, snd_una = 0x8bcbf58f, snd_max = 0x8bcbf58f, snd_nxt = 0x8bcbf58f, snd_up = 0x8bcbf58f, snd_wl1 = 0xab47117a, snd_wl2 = 0x8bcbf58f, iss = 0x8bcbf3cb, irs = 0xab4710f2, rcv_nxt = 0xab47fea8, rcv_adv = 0xab47f17a, rcv_wnd = 0xe000, rcv_up = 0xab47117a, snd_wnd = 0xe000, snd_cwnd = 0x, snd_bwnd = 0x3fffc000, snd_ssthresh = 0x3fffc000, snd_bandwidth = 0x0, snd_recover = 0x8bcbf3cb, t_maxopd = 0x3fd8, t_rcvtime = 0x101c3f1, t_starttime = 0x4588, t_rtttime = 0x0, t_rtseq = 0x8bcbf52f, t_bw_rtttime = 0x4588, t_bw_rtseq = 0x0, t_rxtcur = 0x4b0, t_maxseg = 0x3800, t_srtt = 0x14, t_rttvar = 0xb,
t_rxtshift = 0x0, t_rttmin = 0x3e8, t_rttbest = 0x1f, t_rttupdated = 0x5, max_sndwnd = 0xe000, t_softerror = 0x0, t_oobflags = 0x0, t_iobc = 0x0, snd_scale = 0x0, rcv_scale = 0x0, request_r_scale = 0x0, requested_s_scale = 0x0, ts_recent = 0x101c3f1, ts_recent_age = 0x101c3f1, last_ack_sent = 0xab47fea8, cc_send = 0x0, cc_recv = 0x0, snd_cwnd_prev = 0x0, snd_ssthresh_prev = 0x0, t_badrxtwin = 0x0} (kgdb) p/x
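The inconsistency in the sockbuf dump (sb_cc = 0xa800 while sb_mb = 0x0) is the kind of thing a small accounting check would catch. The sketch below is a userland model only: the struct and function names are simplified, hypothetical stand-ins and not the kernel's definitions. It walks the records and mbufs and compares the byte total against sb_cc.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (hypothetical). */
struct mbuf_model {
    struct mbuf_model *m_next;     /* next mbuf in the same record */
    struct mbuf_model *m_nextpkt;  /* next record (first mbuf only) */
    unsigned m_len;                /* bytes held in this mbuf */
};

struct sockbuf_model {
    unsigned sb_cc;                /* bytes the buffer claims to hold */
    struct mbuf_model *sb_mb;      /* first record, or NULL */
};

/* Returns 1 if sb_cc matches the bytes reachable from sb_mb, else 0. */
int sb_accounting_ok(const struct sockbuf_model *sb)
{
    const struct mbuf_model *rec, *m;
    unsigned total = 0;

    for (rec = sb->sb_mb; rec != NULL; rec = rec->m_nextpkt)
        for (m = rec; m != NULL; m = m->m_next)
            total += m->m_len;
    return total == sb->sb_cc;
}
```

Fed the values from the dump, the check fails right away, since no mbufs are chained but 43008 bytes are claimed. A check along these lines at the points where sb_cc is updated might help narrow down where the counters and the chain diverge.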
RE: Machine becomes non-responsive, only ^T shows it as alive under load: IPFW, TCP proxying
From: Kevin Stevens [mailto:Kevin_Stevens;pursued-with.net] Any suggestions for how one would start debugging this to find out where it's stuck, and how? At a guess, you need to tune the state-table retention time down.
By that, do you mean the MSL? I've set the MSL to 5000 in this case. Or do you mean something else? Should the machine lock up this way? How does one debug where it's gone? --don ([EMAIL PROTECTED] www.sandvine.com)
RE: Machine becomes non-responsive, only ^T shows it as alive under load: IPFW, TCP proxying
From: Don Bowman I have an application listening on an ipfw 'fwd' rule. I'm sending ~3K new sessions per second to it. It has to turn around and issue some of these out as a proxy, and for some of them the destination host won't exist.
For reference, the solution is to upgrade to the latest -STABLE bge driver. The machine was getting stuck in interrupt. --don ([EMAIL PROTECTED] www.sandvine.com)
Machine becomes non-responsive, only ^T shows it as alive under load: IPFW, TCP proxying
I have an application listening on an ipfw 'fwd' rule. I'm sending ~3K new sessions per second to it. It has to turn around and issue some of these out as a proxy, and for some of them the destination host won't exist. I have RST limiting on. I sometimes see messages like: Limiting open port RST response from 1312 to 200 packets per second. After a while of such operation (~1/2 hour), the machine becomes unresponsive: the network interfaces no longer respond, and the serial console responds to ^T with a status line, but ^C etc. do nothing and the bash which was there won't give me a prompt. ^T indicates my bash is running, 0% of CPU in use, etc. I have no choice but to power-cycle it. Any suggestions for how one would start debugging this to find out where it's stuck, and how? This is running 4.7-STABLE on a single Xeon 2.0 GHz with 1GB of memory. The bandwidth wasn't that high, varying between 3 and 30Mbps. Perhaps related, sometimes I get: bge0: watchdog timeout -- resetting. The only NIC which is active is bge0. I have an 'em0' which is idle (no IP), and an fxp0 (which has an IP but is idle). --don ([EMAIL PROTECTED] www.sandvine.com)
panic with ipfw / dummynet in 4.7 STABLE
Take a 4.7 image, using if_em (if it matters). Turn on bridging (em0, em2) and add these ipfw rules:

ipfw add 305 prob 0.01 drop MAC any 00:04:76:f3:2d:0a setup
ipfw add 310 prob 0.01 reject MAC any 00:04:76:f3:2d:0a setup
ipfw add 320 prob 0.01 unreach host MAC any 00:04:76:f3:2d:0a setup
ipfw add 325 prob 0.01 unreach port MAC any 00:04:76:f3:2d:0a setup
ipfw add pipe 1 config delay 90 plr 0.0001
ipfw add pipe 2 config delay 150 plr 0.0005
ipfw add 340 prob 0.5 pipe 1 ip from any to any
ipfw add 345 prob 0.5 pipe 2 ip from any to any

The system panics almost immediately (~1s). The panic and trace are below. It's doubtful that much traffic was present on the em0 or em2 interfaces, so this probably happened on the first packet. I'll turn on -g in the kernel (I thought for sure it was on, but it seems not...) and re-run. This is with -DIPFW2 on. So, after the machine has booted, I'm doing:

# kldload if_em
# sysctl net.link.ether.bridge_cfg=em0 em2
# sysctl net.link.ether.bridge=1

Then I run the script above to add the ipfw rules, and it tips over.

bash-2.05a# uname -a
FreeBSD TPC-E1-34 4.7-STABLE FreeBSD 4.7-STABLE #7: Tue Oct 22 22:07:55 EDT 2002 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/TPC i386

Machine is a 2x XEON 2.0 GHz w/ Intel 82544 on the motherboard, and an Intel 82546EB dual GE card in a PCI slot. It is SMP enabled.
SMP 4 cpus IdlePTD at phsyical address 0x0043 initial pcb at physical address 0x00369780 panicstr: page fault panic messages: --- Fatal trap 12: page fault while in kernel mode mp_lock = 0002; cpuid = 0; lapic.id = fault virtual address = 0x4007 fault code = supervisor read, page not present instruction pointer = 0x8:0xc0204565 stack pointer = 0x10:0xff807eb4 frame pointer = 0x10:0xff807edc code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process = Idle interrupt mask = net - SMP: XXX trap number = 12 panic: page fault mp_lock = 0002; cpuid = 0; lapic.id = boot() called on cpu#0 syncing disks... Fatal trap 12: page fault while in kernel mode mp_lock = 0003; cpuid = 0; lapic.id = fault virtual address = 0x30 fault code = supervisor read, page not present instruction pointer = 0x8:0xc0266e11 stack pointer = 0x10:0xff807cc4 frame pointer = 0x10:0xff807ccc code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process = Idle interrupt mask = net bio - SMP: XXX trap number = 12 panic: page fault mp_lock = 0003; cpuid = 0; lapic.id = boot() called on cpu#0 Uptime: 3m6s #0 0xc01b19b2 in dumpsys () #1 0xc01b1783 in boot () #2 0xc01b1bdc in poweroff_wait () #3 0xc02cb508 in trap_fatal () #4 0xc02cb199 in trap_pfault () #5 0xc02cad37 in trap () #6 0xc0266e11 in acquire_lock () #7 0xc026af24 in softdep_update_inodeblock () #8 0xc0265f45 in ffs_update () #9 0xc026e357 in ffs_sync () #10 0xc01e29bf in sync () #11 0xc01b151e in boot () #12 0xc01b1bdc in poweroff_wait () #13 0xc02cb508 in trap_fatal () #14 0xc02cb199 in trap_pfault () #15 0xc02cad37 in trap () #16 0xc0204565 in dummynet_io () #17 0xc020991c in ip_input () #18 0xc0209ec7 in ipintr () #19 0xc02bca91 in swi_net_next () Copyright (c) 1992-2002 The FreeBSD Project. 
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD 4.7-STABLE #7: Tue Oct 22 22:07:55 EDT 2002 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/TPC Timecounter i8254 frequency 1193182 Hz CPU: Pentium 4 (1996.60-MHz 686-class CPU) Origin = GenuineIntel Id = 0xf24 Stepping = 4 Features=0x3febfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA ,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,b28,ACC real memory = 1073217536 (1048064K bytes) avail memory = 1039532032 (1015168K bytes) Programming 24 pins in IOAPIC #0 IOAPIC #0 intpin 2 - irq 0 Programming 24 pins in IOAPIC #1 Programming 24 pins in IOAPIC #2 FreeBSD/SMP: Multiprocessor motherboard cpu0 (BSP): apic id: 0, version: 0x00050014, at 0xfee0 cpu1 (AP): apic id: 6, version: 0x00050014, at 0xfee0 cpu2 (AP): apic id: 1, version: 0x00050014, at 0xfee0 cpu3 (AP): apic id: 7, version: 0x00050014, at 0xfee0 io0 (APIC): apic id: 2, version: 0x00178020, at 0xfec0 io1 (APIC): apic id: 3, version: 0x00178020, at 0xfec8 io2 (APIC): apic id: 4, version: 0x00178020, at 0xfec80400 Preloaded elf kernel kernel at 0xc0411000. Preloaded elf module if_fxp.ko at 0xc041109c. Preloaded elf module miibus.ko at 0xc041113c. netsmb_dev: loaded Pentium Pro MTRR support enabled md0: Malloc disk Using $PIR table, 24
RE: panic with ipfw / dummynet in 4.7 STABLE
From: Don Bowman [mailto:don;sandvine.com] Take a 4.7 image. Using if_em (if it matters). Turn on bridging (em0, em2), add these ipfw rules: ... Here's the same thing again with -g on. #0 dumpsys () at /usr/src/sys/kern/kern_shutdown.c:487 #1 0xc01b1783 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:316 #2 0xc01b1bdc in poweroff_wait (junk=0xc032b319, howto=-1070420497) at /usr/src/sys/kern/kern_shutdown.c:595 #3 0xc02cb508 in trap_fatal (frame=0xff807c84, eva=48) at /usr/src/sys/i386/i386/trap.c:974 #4 0xc02cb199 in trap_pfault (frame=0xff807c84, usermode=0, eva=48) at /usr/src/sys/i386/i386/trap.c:867 #5 0xc02cad37 in trap (frame={tf_fs = 1714618392, tf_es = -8388592, tf_ds = -935985136, tf_edi = 0, tf_esi = -935921920, tf_ebp = -8356660, tf_isp = -8356688, tf_ebx = -1070251844, tf_edx = 1744882756, tf_ecx = -424745920, tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -1071223279, tf_cs = 8, tf_eflags = 66054, tf_esp = -935921920, tf_ss = -935921920}) at /usr/src/sys/i386/i386/trap.c:466 #6 0xc0266e11 in acquire_lock (lk=0xc03540bc) at machine/globals.h:114 #7 0xc026af24 in softdep_update_inodeblock (ip=0xc836f700, bp=0xd49e0184, waitfor=0) at /usr/src/sys/ufs/ffs/ffs_softdep.c:3813 #8 0xc0265f45 in ffs_update (vp=0xe6aee440, waitfor=0) at /usr/src/sys/ufs/ffs/ffs_inode.c:106 #9 0xc026e357 in ffs_sync (mp=0xc82af600, waitfor=2, cred=0xc2066700, p=0xc0382120) at /usr/src/sys/ufs/ffs/ffs_vfsops.c:1025 #10 0xc01e29bf in sync (p=0xc0382120, uap=0x0) at /usr/src/sys/kern/vfs_syscalls.c:576 #11 0xc01b151e in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:235 #12 0xc01b1bdc in poweroff_wait (junk=0xc032b319, howto=-1070420497) at /usr/src/sys/kern/kern_shutdown.c:595 #13 0xc02cb508 in trap_fatal (frame=0xff807e74, eva=1073741831) at /usr/src/sys/i386/i386/trap.c:974 #14 0xc02cb199 in trap_pfault (frame=0xff807e74, usermode=0, eva=1073741831) at /usr/src/sys/i386/i386/trap.c:867 #15 0xc02cad37 in trap (frame={tf_fs = -935985128, tf_es = -8388592, tf_ds = 
-1071644656, tf_edi = -1039546112, tf_esi = -1039546112, tf_ebp = -8356132, tf_isp = -8356192, tf_ebx = 1073741823, tf_edx = 1073741823, tf_ecx = -935640092, tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -1071626907, tf_cs = 8, tf_eflags = 66054, tf_esp = 24, tf_ss = -1039697888}) at /usr/src/sys/i386/i386/trap.c:466
#16 0xc0204565 in dummynet_io (m=0xc209c900, pipe_nr=1, dir=2, fwa=0xff807f34) at /usr/src/sys/netinet/ip_dummynet.c:1103
#17 0xc020991c in ip_input (m=0xc209c900) at /usr/src/sys/netinet/ip_input.c:459
#18 0xc0209ec7 in ipintr () at /usr/src/sys/netinet/ip_input.c:843
#19 0xc02bca91 in swi_net_next ()
(kgdb) l
1098     * this is a dummynet rule, so we expect a O_PIPE or O_QUEUE rule
1099     */
1100    fs = locate_flowset(pipe_nr, fwa->rule);
1101    if (fs == NULL)
1102        goto dropit ; /* this queue/pipe does not exist! */
1103    pipe = fs->pipe ;
1104    if (pipe == NULL) { /* must be a queue, try find a matching pipe */
1105        for (pipe = all_pipes; pipe && pipe->pipe_nr != fs->parent_nr;
1106             pipe = pipe->next)
1107            ;
(kgdb) p fs
$1 = (struct dn_flow_set *) 0x3fffffff ILLEGAL VALUE
(kgdb) p pipe_nr
$6 = 1714618368
(kgdb) p/x pipe_nr
$7 = 0x66330000
(kgdb) p/x fwa
$8 = 0xff807f34
(kgdb) p/x fwa->rule
$9 = 0xc83b43c0
(kgdb) p/x *fwa->rule
$10 = {next = 0xc83c2900, next_rule = 0x0, act_ofs = 0x0, cmd_len = 0x4, rulenum = 0x154, set = 0x0, _pad = 0x0, pcnt = 0x1, bcnt = 0x20, timestamp = 0x3db60b9b, cmd = {{opcode = 0x29, len = 0x2, arg1 = 0x0}}}
(kgdb)
--don ([EMAIL PROTECTED] www.sandvine.com)
dynamic load of em/fxp/bge
I am trying to load the if_em, if_fxp and if_bge drivers via /boot/loader.conf. I've added

if_fxp_load=YES
if_bge_load=YES
if_em_load=YES

The problem is that the bge driver doesn't load. It will if I load it manually after startup with kldload. The issue seems to be a dependency on miibus: both fxp and bge want to load it, and bge gets an error that it's already loaded. I tried putting 'miibus_load=YES' in loader.conf, but the same effect is seen. I've also tried an explicit manual load of these from the boot prompt, in each order, but to no avail. As a work-around, I've placed a kldload if_bge in rc.network before the 'ifconfig -l'. Any suggestions on why fxp and bge don't play nice when loaded automatically, but work if loaded manually? Is there a timing issue where fxp hasn't initialised its miibus yet? I have fxp0, fxp1 and bge0 in this particular machine. The bge will get miibus2 (eventually), leaving fxp0 with miibus0 and fxp1 with miibus1, I think. Suggestions? --don
RE: ENOBUFS
Sam Leffler wrote: Try my port of the netbsd kttcp kernel module. You can find it at http://www.freebsd.org/~sam
This seems to use some things from NetBSD, like so_rcv.sb_lastrecord and SBLASTRECORDCHK/SBLASTMBUFCHK. Is there something else I need to apply to build it on FreeBSD -STABLE? --don ([EMAIL PROTECTED] www.sandvine.com)
intel dual gigabit, 82546EB support
Is anyone using the Intel dual gigabit 82546EB? Does it appear as two separate em devices, e.g. em0 and em1? http://www.intel.com/network/connectivity/products/pro1000mt_dual_server_adapter.htm is a card that has it; some of the newer Supermicro motherboards (and probably others) also incorporate this device. The em driver does have support for it, but I can't see how it would make two interfaces from it?
RE: new zero copy sockets patches available
Andrew Gallatin writes: Kenneth D. Merry writes: I have released a new set of zero copy sockets patches, against -current from today (May 17th, 2002). Hi Ken, I'm glad to see that you're still maintaining this! Assuming the mutex issues get sorted out, what do you think the odds are of getting this into the tree? The only possible issue I see is with the tigon firmware. Is the firmware you're using of the same vintage as what's in the tree now? Does it contain all the same fixes?
As a related question, will this work with the Broadcom gigabit (bge) driver, which is the Tigon III? If not, what would it take to get it working?