Re: carp and random disconnects

2006-03-12 Thread Stuart Henderson
On 2006/03/10 17:23, Bryan Irvine wrote:
 So what we have are some servers on LAN1 with a default gateway of the
 carp IP on the firewalls.  Somebody located on either LAN2 or LAN3
 telnets to one of those servers, gets connected and goes on about their
 daily business.
 
 Sometime later their connection drops.
 
 It happened after we installed the carp firewalls, and seems to be
 related to ICMP-Redirect coming from the real IP, as opposed to the
 carp one the request went to.

good description, thanks.

turning off redirects (sysctl -w net.inet.ip.redirect=0) would let you
verify this hypothesis, and if it's valid and the traffic to the LANs
isn't too heavy, could give you a work-around too.

if not, maybe a packet trace from one of the LAN2 or LAN3 hosts might
shed light.



Re: carp and random disconnects

2006-03-10 Thread Bryan Irvine
On 3/6/06, Bryan Irvine [EMAIL PROTECTED] wrote:
 We seem to be having a problem with random disconnects after
 instituting carp on our gateway.  The problem is only happening with
 our telnet[1] users connected to our legacy systems.

 The problem only happens with remote users that come in via T1 and
 don't go through the gateway.  The machines they are connecting to are
 using 10.0.0.1 as their gateway and seem to occasionally choke when
 receiving an icmp-redirect from 10.0.0.2 (or 10.0.0.3 depending on
 which one is master) when they have queried 10.0.0.1.

 It's really hard to duplicate and as such I don't have much debug
 info. A user might be connected for hours or a few minutes.

 Ideas on what I should be looking for?  Adding static routes to the
 legacy servers corrects this, but I don't really want to do that every
 time a site complains about disconnects (if there is an easier way
 that is).

snip

Would route-to be something I'd want to look at to fix this?

Here's a dmesg from this machine:

OpenBSD 3.8 (GENERIC) #138: Sat Sep 10 15:41:37 MDT 2005
[EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: AMD Athlon(TM) XP 1600+ (AuthenticAMD 686-class, 256KB L2 cache) 1.41 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE
cpu0: AMD Powernow: FID
real mem  = 1073307648 (1048152K)
avail mem = 972767232 (949968K)
using 4278 buffers containing 53768192 bytes (52508K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(82) BIOS, date 05/07/03, BIOS32 rev. 0 @ 0xf17b0
apm0 at bios0: Power Management spec V1.2 (BIOS mgmt disabled)
apm0: APM power management enable: unrecognized device ID (9)
apm0: APM engage (device 1): power management disabled (1)
apm0: AC on, battery charge unknown
apm0: flags b0102 dobusy 0 doidle 1
pcibios0 at bios0: rev 2.1 @ 0xf/0x1e62
pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xf1d90/208 (11 entries)
pcibios0: PCI Interrupt Router at 000:17:0 (VIA VT82C586 ISA rev 0x00)
pcibios0: PCI bus #1 is the last bus
bios0: ROM list: 0xc/0xcc00 0xd/0x1800 0xd4000/0x1000 0xd8000/0x1800
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 VIA VT8366 PCI rev 0x00
ppb0 at pci0 dev 1 function 0 VIA VT8366 AGP rev 0x00
pci1 at ppb0 bus 1
vga1 at pci1 dev 0 function 0 Nvidia GeForce2 MX rev 0xb2
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
cmpci0 at pci0 dev 5 function 0 C-Media Electronics CMI8738/C3DX Audio rev 0x10: irq 10
audio0 at cmpci0
uhci0 at pci0 dev 9 function 0 VIA VT83C572 USB rev 0x50: irq 5
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: VIA UHCI root hub, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1 at pci0 dev 9 function 1 VIA VT83C572 USB rev 0x50: irq 11
usb1 at uhci1: USB revision 1.0
uhub1 at usb1
uhub1: VIA UHCI root hub, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
ehci0 at pci0 dev 9 function 2 VIA VT6202 USB rev 0x51: irq 10
usb2 at ehci0: USB revision 2.0
uhub2 at usb2
uhub2: VIA EHCI root hub, rev 2.00/1.00, addr 1
uhub2: 4 ports with 4 removable, self powered
fxp0 at pci0 dev 12 function 0 Intel 82557 rev 0x0c, i82550: irq 5, address 00:0e:0c:71:1d:91
inphy0 at fxp0 phy 1: i82555 10/100 PHY, rev. 4
fxp1 at pci0 dev 13 function 0 Intel 82557 rev 0x08, i82559: irq 11, address 00:90:37:34:55:26
inphy1 at fxp1 phy 1: i82555 10/100 PHY, rev. 4
fxp2 at pci0 dev 14 function 0 Intel 82557 rev 0x08, i82559: irq 10, address 00:90:37:34:54:4d
fxp2: Disabling dynamic standby mode in EEPROM, New ID 0x4080, cksum @ 0x3f: 0x - 0xc701
inphy2 at fxp2 phy 1: i82555 10/100 PHY, rev. 4
fxp3 at pci0 dev 15 function 0 Intel 82557 rev 0x08, i82559: irq 12, address 00:90:27:43:4f:b6
inphy3 at fxp3 phy 1: i82555 10/100 PHY, rev. 4
fxp4 at pci0 dev 16 function 0 Intel 82557 rev 0x0c, i82550: irq 5, address 00:0e:0c:74:ef:11
inphy4 at fxp4 phy 1: i82555 10/100 PHY, rev. 4
pcib0 at pci0 dev 17 function 0 VIA VT8233 ISA rev 0x00
pciide0 at pci0 dev 17 function 1 VIA VT82C571 IDE rev 0x06: ATA133, channel 0 configured to compatibility, channel 1 configured to compatibility
wd0 at pciide0 channel 0 drive 0: WDC WD800JB-00CRA1
wd0: 16-sector PIO, LBA, 76319MB, 156301488 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 5
atapiscsi0 at pciide0 channel 1 drive 0
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0: CyberDrv, CW078D CD-R/RW, 120D SCSI0 5/cdrom removable
cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2
uhci2 at pci0 dev 17 function 2 VIA VT83C572 USB rev 0x23: irq 9
usb3 at uhci2: USB revision 1.0
uhub3 at usb3
uhub3: VIA UHCI root hub, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
uhci3 at pci0 dev 17 function 3 VIA VT83C572 USB rev 0x23: irq 9
usb4 at uhci3: USB revision 1.0
uhub4 at usb4
uhub4: VIA UHCI root hub, rev 1.00/1.00, addr 1
uhub4: 2 ports with 2 removable, self powered
isa0 at pcib0

Re: carp and random disconnects

2006-03-10 Thread Stuart Henderson
On 2006/03/10 12:19, Bryan Irvine wrote:
 On 3/6/06, Bryan Irvine [EMAIL PROTECTED] wrote:
  The problem only happens with remote users that come in via T1 and
  don't go through the gateway.  The machines they are connecting to are
  using 10.0.0.1 as their gateway and seem to occasionally choke when
  receiving an icmp-redirect from 10.0.0.2 (or 10.0.0.3 depending on
  which one is master) when they have queried 10.0.0.1.

Your post is missing a bit of information about the network, but if I'm
not mistaken you sometimes have the start of the connection not passing
through either firewall? If that's the case either make sure you allow
packets from established connections that you don't have state for (this
means you lose some of the protection of PF's stateful checking): i.e.
don't use flags S/SA in the relevant rules... or rearrange the network
routing so you don't need redirects (if you want advice on this you'll
definitely need to post more details about the carp/PF setup, how the
affected users reach the relevant hosts, etc: output from netstat -rn
and ifconfig at strategic places will help illustrate, the PF ruleset
may help too).
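
To make the trade-off above concrete, here is a minimal Python sketch of a
stateful filter. It is a toy model, not PF code: the class name, method name
and addresses are hypothetical, and the only PF behaviour it assumes is that
a rule with "flags S/SA" creates state only for packets that have SYN set and
ACK clear. A firewall that never saw the opening SYN then has no state for
the later packets of the session.

```python
SYN, ACK = "S", "A"


class StatefulFilter:
    """Toy model of a stateful packet filter; illustrative, not PF itself."""

    def __init__(self, require_syn=True):
        # require_syn=True mimics a pass rule written with "flags S/SA".
        self.require_syn = require_syn
        self.states = set()

    def handle(self, conn, flags):
        """Return "pass" or "block" for one TCP packet.

        conn  -- hashable connection id, e.g. (src, sport, dst, dport)
        flags -- set of TCP flag letters present on the packet
        """
        if conn in self.states:
            return "pass"                      # matches existing state
        # "S/SA": of the SYN and ACK bits, only SYN may be set.
        is_initial_syn = (set(flags) & {SYN, ACK}) == {SYN}
        if self.require_syn and not is_initial_syn:
            return "block"                     # mid-stream packet, no state
        self.states.add(conn)                  # create state, then pass
        return "pass"


# Hypothetical telnet session from a remote LAN to a LAN1 server.
conn = ("10.1.0.20", 40000, "10.0.5.8", 23)

strict = StatefulFilter(require_syn=True)
loose = StatefulFilter(require_syn=False)

# The firewall missed the handshake and first sees an ACK-only packet.
strict_result = strict.handle(conn, {ACK})
loose_result = loose.handle(conn, {ACK})
```

With the strict filter the mid-stream packet is blocked; the loose filter
passes it, at the cost of accepting packets that never belonged to a
handshake the firewall observed.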



Re: carp and random disconnects

2006-03-10 Thread Bryan Irvine
On 3/10/06, Stuart Henderson [EMAIL PROTECTED] wrote:
 On 2006/03/10 12:19, Bryan Irvine wrote:
  On 3/6/06, Bryan Irvine [EMAIL PROTECTED] wrote:
   The problem only happens with remote users that come in via T1 and
   don't go through the gateway.  The machines they are connecting to are
   using 10.0.0.1 as their gateway and seem to occasionally choke when
   receiving an icmp-redirect from 10.0.0.2 (or 10.0.0.3 depending on
   which one is master) when they have queried 10.0.0.1.

 Your post is missing a bit of information about the network, but if I'm
 not mistaken you sometimes have the start of the connection not passing
 through either firewall? If that's the case either make sure you allow
 packets from established connections that you don't have state for (this
 means you lose some of the protection of PF's stateful checking): i.e.
 don't use flags S/SA in the relevant rules... or rearrange the network
 routing so you don't need redirects (if you want advice on this you'll
 definitely need to post more details about the carp/PF setup, how the
 affected users reach the relevant hosts, etc: output from netstat -rn
 and ifconfig at strategic places will help illustrate, the PF ruleset
 may help too).

The packets never pass *through* the firewall, but since the firewall
is the default gateway it gets queried for certain routes, which pass
through one of the Ciscos.

(Apologies for the ASCII)

      Internet
       /    \
[fw1]-carp-[fw2]
       \    /
        LAN1
          |
        Cisco
        /   \
     T1a     T1b
      |       |
    LAN2     LAN3

(There's more than 3 LANs but for simplicity we'll just show 2)

So what we have are some servers on LAN1 with a default gateway of the
carp IP on the firewalls.  Somebody located on either LAN2 or LAN3
telnets to one of those servers, gets connected and goes on about their
daily business.

Sometime later their connection drops.

It happened after we installed the carp firewalls, and seems to be
related to ICMP-Redirect coming from the real IP, as opposed to the
carp one the request went to.
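
The hypothesis above can be sketched in a few lines of Python. This is only
a model under stated assumptions: the Cisco's LAN1 address (10.0.0.201) and
the server address are guesses for illustration, and the two functions are
hypothetical helpers, not anything from the OpenBSD stack. The first encodes
the usual condition under which a router emits a redirect (the packet leaves
on the interface it arrived on, and the sender shares a subnet with the next
hop). The second encodes the RFC 1122 recommendation that a host discard a
redirect that does not come from its current first-hop gateway. How the
legacy systems actually react to such a redirect is not known.

```python
import ipaddress

# Assumed addressing, pieced together for illustration only.
LAN1 = ipaddress.ip_network("10.0.0.0/16")   # KENT macro in pf.conf
CARP_IP = "10.0.0.1"    # gateway configured on the servers
REAL_IP = "10.0.0.2"    # current master's own address
CISCO = "10.0.0.201"    # assumed address of the Cisco on LAN1
SERVER = "10.0.5.8"     # hypothetical legacy server on LAN1


def would_redirect(src, next_hop, in_if, out_if, lan=LAN1):
    """Router side: redirect if the packet goes back out the interface it
    came in on and the sender could reach the next hop directly."""
    src = ipaddress.ip_address(src)
    next_hop = ipaddress.ip_address(next_hop)
    return in_if == out_if and src in lan and next_hop in lan


def host_accepts_redirect(first_hop, redirect_source):
    """Host side: accept only redirects sent by the gateway currently in
    use for that destination (the RFC 1122 recommendation)."""
    return ipaddress.ip_address(first_hop) == ipaddress.ip_address(redirect_source)


# Server replies to a remote-LAN client via the carp address; the master
# forwards it to the Cisco on the same interface (fxp1 is LAN1 in pf.conf).
sends = would_redirect(SERVER, CISCO, "fxp1", "fxp1")

# The redirect is sourced from the master's real address, not the carp one.
accepted = host_accepts_redirect(CARP_IP, REAL_IP)
```

In this model the firewall does send a redirect, but a host that checks the
source against its configured gateway has no reason to trust it.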

pf.conf:

   ###
  ##  Interface Macros  ##
 
WAN = fxp0
DMZ = fxp3
LOOPBACK = lo0

LAN1 = fxp1
LAN2 = fxp2
LANS = { $LAN1 $LAN2 }

ALL = { $LAN1 $LAN2 $WAN $DMZ }

KENTLEGACY = '192.233.103.0/24'
KENT = '10.0.0.0/16'
BELLEVUE = '10.1.0.0/16'
#Virtual access interface on cisco's
VIRTUAL = '192.168.210.0/24'
PENINSULA = '192.233.99.0/24'
MERCER = '192.168.98.0/24'
LEGACYWEB = '207.109.73.0/24'
REDMOND = '10.2.0.1/24'
WEB = '10.5.1.0/24'

#NATS = { $KENTLEGACY $KENT '192.233.100.0/24' '192.168.99.0/24' }
NATS = { $KENTLEGACY $KENT $BELLEVUE }

   #
  ##  Server Macros  ##
 #
localhost = 127.0.0.1
firebox2 = 64.1.201.130
Addesk = 64.1.201.146
FTPServer = 64.1.201.147
mailservers = { mx.kcjn.com 10.0.1.1 }
ghost = 64.1.201.149
smtp = 64.1.201.150
www3 = www3.kcjn.com
www5 = 64.1.201.153

   ###
  ##  Port Macros  ##
 ###
ftpproxy = 8021
vnc = 5900

   
  ##  Start the fun!!!  ##
 

set limit { states 2, frags 2}

   #
  ##  Clean packets  ##
 #
scrub in all

   
  ##  Start up NAT  ##
 
nat on $WAN inet from $KENTLEGACY to any -> ($WAN)
nat on $WAN inet from $KENT to any -> ($WAN)
nat on $WAN inet from $BELLEVUE to any -> ($WAN)
nat on $WAN inet from $VIRTUAL to any -> ($WAN)
#nat on $WAN inet from $NAT4 to any -> ($WAN)
nat on $WAN inet from $PENINSULA to any -> ($WAN)
nat on $WAN inet from $MERCER to any -> ($WAN)
nat on $WAN inet from $LEGACYWEB to any -> ($WAN)
nat on $WAN inet from $REDMOND to any -> ($WAN)
nat on $WAN inet from $WEB to any -> ($WAN)


   ###
  ##  spam tarpitting  ##
 ###
table <spamd> persist
table <spamd-white> persist
table <spamd-mywhite> persist file /etc/pf/whitelist.txt

rdr pass on $WAN proto tcp from <spamd-mywhite> to port smtp ->
mx.kcjn.com port smtp
rdr pass on $WAN inet proto tcp from <spamd> to any port smtp ->
127.0.0.1 port 8025
rdr pass on $WAN inet proto tcp from !<spamd-white> to any port smtp
-> 127.0.0.1 port 8025

   #
  ##  Redirection for squid  ##
 #
#don't redirect local connections
no rdr on $LANS inet proto tcp from $NATS to { 192.233.100.110
10.0.5.1 10.0.5.2 10.0.5.3 10.0.5.4 64.1.201.149 64.122.4.29
207.109.73.105 207.109.73.66 intranet.horvitznewspapers.net } port www

#Don't proxy proxied connections
no rdr on $LANS inet proto tcp from { 10.0.5.1 10.0.5.2 10.0.5.3
10.0.5.4 64.1.201.149 64.122.4.29 207.109.73.105 207.109.73.66 } to
any port www

#redirect rule for Squid
#rdr pass on $LANS inet proto tcp from $NATS to any port www ->
$localhost port 3128


   #
  ##  FTP Proxy  ##
 #
no rdr on $LANS proto tcp from any to { 10.0.5.8 10.0.0.191

Re: carp and random disconnects

2006-03-10 Thread Bryan Irvine
On 3/10/06, Steven S [EMAIL PROTECTED] wrote:
 Bryan Irvine wrote:
 ...
 ...
  It happened after we installed the carp firewalls, and seems to be
  related to ICMP-Redirect coming from the real IP, as opposed to the
  carp one the request went to.
 
 ...

 Interesting, in my experiments carp interfaces didn't send ICMP redirects at
 all...

The CARP interface is not sending them.  I'm not sure if it's supposed to.
I'm guessing carp is involved because it is the only thing that has changed.  With
the exception of the carp and pfsync rules, this is the exact same
ruleset from the old firewall.

here's what I see on the firewall when I try a traceroute to a remote
network that goes through a different gateway.

17:51:50.581658 10.0.0.2 > 10.0.253.236.kent-dhcp.kcjn.com: icmp: time
exceeded in-transit
17:51:50.585106 10.0.0.2 > 10.0.253.236.kent-dhcp.kcjn.com: icmp: time
exceeded in-transit
17:51:50.585402 10.0.0.2 > 10.0.253.236.kent-dhcp.kcjn.com: icmp: time
exceeded in-transit

The results of the traceroute:
 1  10.0.0.2 (10.0.0.2)  0.971 ms  0.268 ms  4.880 ms
 2  10.0.0.201 (10.0.0.201)  0.508 ms  0.503 ms  0.359 ms
 3  172.19.1.10 (172.19.1.10)  111.714 ms  111.264 ms  111.691 ms
 4  172.19.4.10 (172.19.4.10)  111.331 ms  113.438 ms  111.278 ms


Am I missing something or barking up the wrong tree?

--Bryan



Re: carp and random disconnects

2006-03-10 Thread Steven S
Bryan Irvine wrote:
 On 3/10/06, Steven S [EMAIL PROTECTED] wrote:
 Bryan Irvine wrote:
 ...
 ...
 It happened after we installed the carp firewalls, and seems to be
 related to ICMP-Redirect coming from the real IP, as opposed to the
 carp one the request went to. 
 
 ...
 
 Interesting, in my experiments carp interfaces didn't send ICMP
 redirects at all...
 
 The CARP interface is not sending them.  I'm not sure if it's supposed to.
 I'm guessing carp is involved because it is the only thing that has changed.  With
 the exception of the carp and pfsync rules, this is the exact same
 ruleset from the old firewall.
 
 here's what I see on the firewall when I try a traceroute to a remote
 network that goes through a different gateway.
 
 17:51:50.581658 10.0.0.2 > 10.0.253.236.kent-dhcp.kcjn.com: icmp:
 time exceeded in-transit
 17:51:50.585106 10.0.0.2 > 10.0.253.236.kent-dhcp.kcjn.com: icmp:
 time exceeded in-transit
 17:51:50.585402 10.0.0.2 > 10.0.253.236.kent-dhcp.kcjn.com: icmp:
 time exceeded in-transit
 
 The results of the traceroute:
  1  10.0.0.2 (10.0.0.2)  0.971 ms  0.268 ms  4.880 ms
  2  10.0.0.201 (10.0.0.201)  0.508 ms  0.503 ms  0.359 ms
  3  172.19.1.10 (172.19.1.10)  111.714 ms  111.264 ms  111.691 ms
  4  172.19.4.10 (172.19.4.10)  111.331 ms  113.438 ms  111.278 ms
 
 
 Am I missing something or barking up the wrong tree?
 
 --Bryan

I experienced similar issues.  The carp interface does not send an ICMP
redirect (I have not had the time to find out why) but instead routes the
packet, creating state if you're running PF.  My users experienced
slowness so I ended up adding static routes on the servers (only about 5
of them) for the short term.  There appear to be two things broken: ICMP
redirects and routing back through a carp interface.

-Steve S.
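
Why a static route on each server sidesteps the problem can be shown with a
small longest-prefix-match lookup in Python. This is a sketch under
assumptions: the remote network (10.1.0.0/16), the Cisco's address
(10.0.0.201) and the client address are illustrative guesses, and `lookup`
is a hypothetical helper, not how any kernel implements its routing table.

```python
import ipaddress


def lookup(routes, dst):
    """Return the next hop of the most specific route containing dst,
    or None if no route matches."""
    dst = ipaddress.ip_address(dst)
    matches = [(net, hop) for net, hop in routes if dst in net]
    if not matches:
        return None
    net, hop = max(matches, key=lambda m: m[0].prefixlen)
    return hop


# Server routing table with only the default route via the carp address.
default_only = [(ipaddress.ip_network("0.0.0.0/0"), "10.0.0.1")]

# Same table plus a static route for one remote LAN via the Cisco
# (both the prefix and the next hop are assumptions).
with_static = default_only + [
    (ipaddress.ip_network("10.1.0.0/16"), "10.0.0.201"),
]

remote_user = "10.1.0.20"   # hypothetical client on the remote LAN
before = lookup(default_only, remote_user)
after = lookup(with_static, remote_user)
```

Without the static route the reply goes to the firewall first, which is where
the redirect (or the extra hop and PF state) comes from. With it, the server
hands the packet straight to the Cisco and the firewall is never involved;
all other traffic still follows the default route.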



Re: carp and random disconnects

2006-03-10 Thread Steven S
Bryan Irvine wrote:
...
...
 It happened after we installed the carp firewalls, and seems to be
 related to ICMP-Redirect coming from the real IP, as opposed to the
 carp one the request went to. 
 
...

Interesting, in my experiments carp interfaces didn't send ICMP redirects at
all...

http://marc.theaimsgroup.com/?l=openbsd-misc&m=113772490126174&w=2

-Steve S.



carp and random disconnects

2006-03-06 Thread Bryan Irvine
We seem to be having a problem with random disconnects after
instituting carp on our gateway.  The problem is only happening with
our telnet[1] users connected to our legacy systems.

The problem only happens with remote users that come in via T1 and
don't go through the gateway.  The machines they are connecting to are
using 10.0.0.1 as their gateway and seem to occasionally choke when
receiving an icmp-redirect from 10.0.0.2 (or 10.0.0.3 depending on
which one is master) when they have queried 10.0.0.1.

It's really hard to duplicate and as such I don't have much debug
info. A user might be connected for hours or a few minutes.

Ideas on what I should be looking for?  Adding static routes to the
legacy servers corrects this, but I don't really want to do that every
time a site complains about disconnects (if there is an easier way
that is).

This happens on every server that uses telnet[1] (about 4 or 5).

[1] Don't get me started on the whole telnet thing.

--Bryan