Re: carp and random disconnects
On 2006/03/10 17:23, Bryan Irvine wrote:
> So what we have are some servers on LAN1 with a default gateway of
> the carp IP on the firewalls. Somebody located on either LAN2 or LAN3
> telnets to one of those servers, gets connected and goes on about
> their daily business. Sometime later their connection drops. It
> happened after we installed the carp firewalls, and seems to be
> related to ICMP-Redirect coming from the real IP, as opposed to the
> carp one the request went to.

Good description, thanks. Turning off redirects (sysctl -w net.inet.ip.redirect=0) would let you verify this hypothesis, and if it's valid and the traffic to the LANs isn't too heavy, it could give you a workaround too. If not, a packet trace from one of the LAN2 or LAN3 hosts might shed light.
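A rough way to try both suggestions, written as a dry-run shell sketch. The sysctl name comes from the message above; the `run` helper, the `APPLY` switch and the `IFACE` default (fxp1, the LAN1 macro in the pf.conf posted in this thread) are assumptions for illustration only. The capture filter simply shows all ICMP plus telnet traffic, so redirects and dropped sessions would appear in the same trace.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): prints each command instead of running it.
# Set APPLY=1 to really execute; that needs root on the machine in question.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# On each firewall: stop the kernel from sending ICMP redirects.
# (Command taken from the reply above.)
run sysctl -w net.inet.ip.redirect=0

# On the host you trace from: capture ICMP and telnet (TCP port 23) traffic.
# IFACE is a placeholder; fxp1 is only an assumed default.
IFACE="${IFACE:-fxp1}"
run tcpdump -n -i "$IFACE" 'icmp or tcp port 23'
```

If the disconnects stop while redirects are off, that would support the redirect hypothesis; to keep the setting across reboots one would normally also add `net.inet.ip.redirect=0` to /etc/sysctl.conf.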
Re: carp and random disconnects
On 3/6/06, Bryan Irvine [EMAIL PROTECTED] wrote:
> We seem to be having a problem with random disconnects after
> instituting carp on our gateway. The problem is only happening with
> our telnet[1] users connected to our legacy systems.
>
> The problem only happens with remote users that come in via T1 and
> don't go through the gateway. The machines they are connecting to use
> 10.0.0.1 as their gateway and seem to occasionally choke when
> receiving an icmp-redirect from 10.0.0.2 (or 10.0.0.3, depending on
> which one is master) when they have queried 10.0.0.1.
>
> It's really hard to duplicate and as such I don't have much debug
> info. A user might be connected for hours or a few minutes. Ideas on
> what I should be looking for?
>
> Adding static routes to the legacy servers corrects this, but I don't
> really want to do that every time a site complains about disconnects
> (if there is an easier way, that is).

snip

Would route-to be something I'd want to look at to fix this?

Here's a dmesg from this machine:

OpenBSD 3.8 (GENERIC) #138: Sat Sep 10 15:41:37 MDT 2005
    [EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: AMD Athlon(TM) XP 1600+ (AuthenticAMD 686-class, 256KB L2 cache) 1.41 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE
cpu0: AMD Powernow: FID
real mem = 1073307648 (1048152K)
avail mem = 972767232 (949968K)
using 4278 buffers containing 53768192 bytes (52508K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(82) BIOS, date 05/07/03, BIOS32 rev. 0 @ 0xf17b0
apm0 at bios0: Power Management spec V1.2 (BIOS mgmt disabled)
apm0: APM power management enable: unrecognized device ID (9)
apm0: APM engage (device 1): power management disabled (1)
apm0: AC on, battery charge unknown
apm0: flags b0102 dobusy 0 doidle 1
pcibios0 at bios0: rev 2.1 @ 0xf/0x1e62
pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xf1d90/208 (11 entries)
pcibios0: PCI Interrupt Router at 000:17:0 (VIA VT82C586 ISA rev 0x00)
pcibios0: PCI bus #1 is the last bus
bios0: ROM list: 0xc/0xcc00 0xd/0x1800 0xd4000/0x1000 0xd8000/0x1800
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 VIA VT8366 PCI rev 0x00
ppb0 at pci0 dev 1 function 0 VIA VT8366 AGP rev 0x00
pci1 at ppb0 bus 1
vga1 at pci1 dev 0 function 0 Nvidia GeForce2 MX rev 0xb2
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
cmpci0 at pci0 dev 5 function 0 C-Media Electronics CMI8738/C3DX Audio rev 0x10: irq 10
audio0 at cmpci0
uhci0 at pci0 dev 9 function 0 VIA VT83C572 USB rev 0x50: irq 5
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: VIA UHCI root hub, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1 at pci0 dev 9 function 1 VIA VT83C572 USB rev 0x50: irq 11
usb1 at uhci1: USB revision 1.0
uhub1 at usb1
uhub1: VIA UHCI root hub, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
ehci0 at pci0 dev 9 function 2 VIA VT6202 USB rev 0x51: irq 10
usb2 at ehci0: USB revision 2.0
uhub2 at usb2
uhub2: VIA EHCI root hub, rev 2.00/1.00, addr 1
uhub2: 4 ports with 4 removable, self powered
fxp0 at pci0 dev 12 function 0 Intel 82557 rev 0x0c, i82550: irq 5, address 00:0e:0c:71:1d:91
inphy0 at fxp0 phy 1: i82555 10/100 PHY, rev. 4
fxp1 at pci0 dev 13 function 0 Intel 82557 rev 0x08, i82559: irq 11, address 00:90:37:34:55:26
inphy1 at fxp1 phy 1: i82555 10/100 PHY, rev. 4
fxp2 at pci0 dev 14 function 0 Intel 82557 rev 0x08, i82559: irq 10, address 00:90:37:34:54:4d
fxp2: Disabling dynamic standby mode in EEPROM, New ID 0x4080, cksum @ 0x3f: 0x - 0xc701
inphy2 at fxp2 phy 1: i82555 10/100 PHY, rev. 4
fxp3 at pci0 dev 15 function 0 Intel 82557 rev 0x08, i82559: irq 12, address 00:90:27:43:4f:b6
inphy3 at fxp3 phy 1: i82555 10/100 PHY, rev. 4
fxp4 at pci0 dev 16 function 0 Intel 82557 rev 0x0c, i82550: irq 5, address 00:0e:0c:74:ef:11
inphy4 at fxp4 phy 1: i82555 10/100 PHY, rev. 4
pcib0 at pci0 dev 17 function 0 VIA VT8233 ISA rev 0x00
pciide0 at pci0 dev 17 function 1 VIA VT82C571 IDE rev 0x06: ATA133, channel 0 configured to compatibility, channel 1 configured to compatibility
wd0 at pciide0 channel 0 drive 0: WDC WD800JB-00CRA1
wd0: 16-sector PIO, LBA, 76319MB, 156301488 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 5
atapiscsi0 at pciide0 channel 1 drive 0
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0: CyberDrv, CW078D CD-R/RW, 120D SCSI0 5/cdrom removable
cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2
uhci2 at pci0 dev 17 function 2 VIA VT83C572 USB rev 0x23: irq 9
usb3 at uhci2: USB revision 1.0
uhub3 at usb3
uhub3: VIA UHCI root hub, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
uhci3 at pci0 dev 17 function 3 VIA VT83C572 USB rev 0x23: irq 9
usb4 at uhci3: USB revision 1.0
uhub4 at usb4
uhub4: VIA UHCI root hub, rev 1.00/1.00, addr 1
uhub4: 2 ports with 2 removable, self powered
isa0 at pcib0
Re: carp and random disconnects
On 2006/03/10 12:19, Bryan Irvine wrote:
> On 3/6/06, Bryan Irvine [EMAIL PROTECTED] wrote:
> > The problem only happens with remote users that come in via T1 and
> > don't go through the gateway. The machines they are connecting to
> > use 10.0.0.1 as their gateway and seem to occasionally choke when
> > receiving an icmp-redirect from 10.0.0.2 (or 10.0.0.3, depending on
> > which one is master) when they have queried 10.0.0.1.

Your post is missing a bit of information about the network, but if I'm not mistaken, you sometimes have the start of the connection not passing through either firewall?

If that's the case, either make sure you allow packets from established connections that you don't have state for (this means you lose some of the protection of PF's stateful checking), i.e. don't use flags S/SA in the relevant rules...

or rearrange the network routing so you don't need redirects. If you want advice on this you'll definitely need to post more details about the carp/PF setup, how the affected users reach the relevant hosts, etc. Output from netstat -rn and ifconfig at strategic places will help illustrate; the PF ruleset may help too.
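One way to gather the output requested above into a single file that can be pasted into a reply. This is only a sketch: the `collect` function and the output path are made up for illustration, and `pfctl -s rules` applies only on the firewalls and needs root.

```shell
#!/bin/sh
# collect OUTFILE CMD... (hypothetical helper)
# Runs each command string in turn and writes its output, preceded by a
# "### command" label, to OUTFILE. A failing command is noted, not fatal.
collect() {
    out="$1"
    shift
    : > "$out"
    for cmd in "$@"; do
        printf '### %s\n' "$cmd" >> "$out"
        sh -c "$cmd" >> "$out" 2>&1 || printf '(command failed)\n' >> "$out"
        printf '\n' >> "$out"
    done
}

# Example invocation on a firewall (uncomment to run; path is an assumption):
# collect /tmp/carp-debug.txt "netstat -rn" "ifconfig" "pfctl -s rules"
```

On the affected servers the same call without the pfctl entry would capture their routing table and interface configuration.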
Re: carp and random disconnects
On 3/10/06, Stuart Henderson [EMAIL PROTECTED] wrote:
> On 2006/03/10 12:19, Bryan Irvine wrote:
> > On 3/6/06, Bryan Irvine [EMAIL PROTECTED] wrote:
> > > The problem only happens with remote users that come in via T1
> > > and don't go through the gateway. The machines they are
> > > connecting to use 10.0.0.1 as their gateway and seem to
> > > occasionally choke when receiving an icmp-redirect from 10.0.0.2
> > > (or 10.0.0.3, depending on which one is master) when they have
> > > queried 10.0.0.1.
>
> Your post is missing a bit of information about the network, but if
> I'm not mistaken, you sometimes have the start of the connection not
> passing through either firewall?
>
> If that's the case, either make sure you allow packets from
> established connections that you don't have state for (this means you
> lose some of the protection of PF's stateful checking), i.e. don't
> use flags S/SA in the relevant rules...
>
> or rearrange the network routing so you don't need redirects. If you
> want advice on this you'll definitely need to post more details about
> the carp/PF setup, how the affected users reach the relevant hosts,
> etc. Output from netstat -rn and ifconfig at strategic places will
> help illustrate; the PF ruleset may help too.

The packets never pass *through* the firewall, but since the firewall is the default gateway it gets queried for certain routes, which pass through one of the Ciscos. (Apologies for the ASCII)

        Internet
        /      \
   [fw1]-carp-[fw2]
        \      /
          LAN1
            |
          Cisco
          /   \
        T1a   T1b
         |     |
       LAN2   LAN3

(There are more than 3 LANs, but for simplicity we'll just show 2.)

So what we have are some servers on LAN1 with a default gateway of the carp IP on the firewalls. Somebody located on either LAN2 or LAN3 telnets to one of those servers, gets connected and goes on about their daily business. Sometime later their connection drops. It happened after we installed the carp firewalls, and seems to be related to ICMP-Redirect coming from the real IP, as opposed to the carp one the request went to.
pf.conf:

###
## Interface Macros ##
WAN = fxp0
DMZ = fxp3
LOOPBACK = lo0
LAN1 = fxp1
LAN2 = fxp2
LANS = { $LAN1 $LAN2 }
ALL = { $LAN1 $LAN2 $WAN $DMZ }

KENTLEGACY = '192.233.103.0/24'
KENT = '10.0.0.0/16'
BELLEVUE = '10.1.0.0/16'
#Virtual access interface on cisco's
VIRTUAL = '192.168.210.0/24'
PENINSULA = '192.233.99.0/24'
MERCER = '192.168.98.0/24'
LEGACYWEB = '207.109.73.0/24'
REDMOND = '10.2.0.1/24'
WEB = '10.5.1.0/24'
#NATS = { $KENTLEGACY $KENT '192.233.100.0/24' '192.168.99.0/24' }
NATS = { $KENTLEGACY $KENT $BELLEVUE }

#
## Server Macros ##
#
localhost = 127.0.0.1
firebox2 = 64.1.201.130
Addesk = 64.1.201.146
FTPServer = 64.1.201.147
mailservers = { mx.kcjn.com 10.0.1.1 }
ghost = 64.1.201.149
smtp = 64.1.201.150
www3 = www3.kcjn.com
www5 = 64.1.201.153

###
## Port Macros ##
###
ftpproxy = 8021
vnc = 5900

## Start the fun!!! ##
set limit { states 2, frags 2}

#
## Clean packets ##
#
scrub in all

## Start up NAT ##
nat on $WAN inet from $KENTLEGACY to any -> ($WAN)
nat on $WAN inet from $KENT to any -> ($WAN)
nat on $WAN inet from $BELLEVUE to any -> ($WAN)
nat on $WAN inet from $VIRTUAL to any -> ($WAN)
#nat on $WAN inet from $NAT4 to any -> ($WAN)
nat on $WAN inet from $PENINSULA to any -> ($WAN)
nat on $WAN inet from $MERCER to any -> ($WAN)
nat on $WAN inet from $LEGACYWEB to any -> ($WAN)
nat on $WAN inet from $REDMOND to any -> ($WAN)
nat on $WAN inet from $WEB to any -> ($WAN)

###
## spam tarpitting ##
###
table <spamd> persist
table <spamd-white> persist
table <spamd-mywhite> persist file /etc/pf/whitelist.txt
rdr pass on $WAN proto tcp from <spamd-mywhite> to port smtp -> mx.kcjn.com port smtp
rdr pass on $WAN inet proto tcp from <spamd> to any port smtp -> 127.0.0.1 port 8025
rdr pass on $WAN inet proto tcp from !<spamd-white> to any port smtp -> 127.0.0.1 port 8025

#
## Redirection for squid ##
#
#don't redirect local connections
no rdr on $LANS inet proto tcp from $NATS to { 192.233.100.110 10.0.5.1 10.0.5.2 10.0.5.3 10.0.5.4 64.1.201.149 64.122.4.29 207.109.73.105 207.109.73.66 intranet.horvitznewspapers.net } port www
#Don't proxy proxied connections
no rdr on $LANS inet proto tcp from { 10.0.5.1 10.0.5.2 10.0.5.3 10.0.5.4 64.1.201.149 64.122.4.29 207.109.73.105 207.109.73.66 } to any port www
#redirect rule for Squid
#rdr pass on $LANS inet proto tcp from $NATS to any port www -> $localhost port 3128

#
## FTP Proxy ##
#
no rdr on $LANS proto tcp from any to { 10.0.5.8 10.0.0.191
Re: carp and random disconnects
On 3/10/06, Steven S [EMAIL PROTECTED] wrote:
> Bryan Irvine wrote:
> > ...
> > It happened after we installed the carp firewalls, and seems to be
> > related to ICMP-Redirect coming from the real IP, as opposed to the
> > carp one the request went to.
> > ...
>
> Interesting, in my experiments carp interfaces didn't send ICMP
> redirects at all...

The CARP interface is not. I'm not sure if it's supposed to or not. I'm guessing because that is the only thing that has changed. With the exception of the carp and pfsync rules, this is the exact same ruleset from the old firewall.

Here's what I see on the firewall when I try a traceroute to a remote network that goes through a different gateway:

17:51:50.581658 10.0.0.2 > 10.0.253.236.kent-dhcp.kcjn.com: icmp: time exceeded in-transit
17:51:50.585106 10.0.0.2 > 10.0.253.236.kent-dhcp.kcjn.com: icmp: time exceeded in-transit
17:51:50.585402 10.0.0.2 > 10.0.253.236.kent-dhcp.kcjn.com: icmp: time exceeded in-transit

The results of the traceroute:

 1  10.0.0.2 (10.0.0.2)  0.971 ms  0.268 ms  4.880 ms
 2  10.0.0.201 (10.0.0.201)  0.508 ms  0.503 ms  0.359 ms
 3  172.19.1.10 (172.19.1.10)  111.714 ms  111.264 ms  111.691 ms
 4  172.19.4.10 (172.19.4.10)  111.331 ms  113.438 ms  111.278 ms

Am I missing something or barking up the wrong tree?

--Bryan
Re: carp and random disconnects
Bryan Irvine wrote:
> On 3/10/06, Steven S [EMAIL PROTECTED] wrote:
> > Bryan Irvine wrote:
> > > ...
> > > It happened after we installed the carp firewalls, and seems to
> > > be related to ICMP-Redirect coming from the real IP, as opposed
> > > to the carp one the request went to.
> > > ...
> >
> > Interesting, in my experiments carp interfaces didn't send ICMP
> > redirects at all...
>
> The CARP interface is not. I'm not sure if it's supposed to or not.
> I'm guessing because that is the only thing that has changed. With
> the exception of the carp and pfsync rules, this is the exact same
> ruleset from the old firewall.
>
> Here's what I see on the firewall when I try a traceroute to a remote
> network that goes through a different gateway:
>
> 17:51:50.581658 10.0.0.2 > 10.0.253.236.kent-dhcp.kcjn.com: icmp: time exceeded in-transit
> 17:51:50.585106 10.0.0.2 > 10.0.253.236.kent-dhcp.kcjn.com: icmp: time exceeded in-transit
> 17:51:50.585402 10.0.0.2 > 10.0.253.236.kent-dhcp.kcjn.com: icmp: time exceeded in-transit
>
> The results of the traceroute:
>
>  1  10.0.0.2 (10.0.0.2)  0.971 ms  0.268 ms  4.880 ms
>  2  10.0.0.201 (10.0.0.201)  0.508 ms  0.503 ms  0.359 ms
>  3  172.19.1.10 (172.19.1.10)  111.714 ms  111.264 ms  111.691 ms
>  4  172.19.4.10 (172.19.4.10)  111.331 ms  113.438 ms  111.278 ms
>
> Am I missing something or barking up the wrong tree?
>
> --Bryan

I experienced similar issues. The carp interface does not send an ICMP redirect (I have not had the time to find out why) but instead routes the packet, creating state if you're running PF. My users experienced slowness, so I ended up adding static routes on the servers (only about 5 of them) for the short term.

There appear to be two things broken: ICMP redirects and routing back through a carp interface.

-Steve S.
Re: carp and random disconnects
Bryan Irvine wrote:
> ...
> It happened after we installed the carp firewalls, and seems to be
> related to ICMP-Redirect coming from the real IP, as opposed to the
> carp one the request went to.
> ...

Interesting, in my experiments carp interfaces didn't send ICMP redirects at all...

http://marc.theaimsgroup.com/?l=openbsd-misc&m=113772490126174&w=2

-Steve S.
carp and random disconnects
We seem to be having a problem with random disconnects after instituting carp on our gateway. The problem is only happening with our telnet[1] users connected to our legacy systems.

The problem only happens with remote users that come in via T1 and don't go through the gateway. The machines they are connecting to use 10.0.0.1 as their gateway and seem to occasionally choke when receiving an icmp-redirect from 10.0.0.2 (or 10.0.0.3, depending on which one is master) when they have queried 10.0.0.1.

It's really hard to duplicate and as such I don't have much debug info. A user might be connected for hours or a few minutes. Ideas on what I should be looking for?

Adding static routes to the legacy servers corrects this, but I don't really want to do that every time a site complains about disconnects (if there is an easier way, that is).

This happens on every server that uses telnet[1] (about 4 or 5).

[1] Don't get me started on the whole telnet thing.

--Bryan
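For reference, the static-route workaround mentioned above might look roughly like this on a BSD-style server. It is a dry-run sketch under several assumptions: the thread does not say what OS the legacy servers run, so the `route` syntax may differ there; the example network (10.1.0.0/16, the BELLEVUE macro in the posted pf.conf) and next hop (10.0.0.201, hop 2 of the traceroute elsewhere in the thread) are guesses at plausible values; `run`, `APPLY`, `REMOTE_NET` and `NEXT_HOP` are names invented here.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): prints each command instead of running it.
# Set APPLY=1 to really execute; that needs root on the server.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Assumed values, override from the environment as needed.
REMOTE_NET="${REMOTE_NET:-10.1.0.0/16}"   # a remote LAN reached via the T1 router
NEXT_HOP="${NEXT_HOP:-10.0.0.201}"        # assumed LAN1 address of that router

# Send traffic for the remote LAN straight to the router, so the server
# never has to ask the carp gateway and never needs a redirect for it.
run route add -net "$REMOTE_NET" "$NEXT_HOP"
```

One route per remote site would be needed on each server, which is the maintenance burden the post is trying to avoid.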