Re: IPSec Packet Loss Help
Hi Zach. Ah, great news! I noticed your email before the weekend but didn't
have a chance to reply. Pleased you worked it out.

The remote-network routes I use don't point at the local inside CARP IP but
at the local inside physical IP (each firewall's own IP, just to set the
source). Yeah, setting the NAT fixed some of the issues for us with
communicating with the firewalls themselves.. Restrict the NAT rule if you
like, so you only NAT to the internal CARP IP when trying to talk to either
firewall's physical IP. No need to NAT for traffic to the rest of the LAN,
as that only ever replies back via the CARP IP as the GW etc..

Cheers, Andy.

On Mon 10 Mar 2014 16:25:59 GMT, Zach Leslie wrote:
> Once I removed the routes for the remote network pointing to the internal
> carp interface, everything works like I expect. Super stable. Thanks for
> your time. I'll mess with the NAT for monitoring soonish and see if I can
> get that working.
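[Editorial aside: the route/NAT arrangement Andy describes might look roughly like this, borrowing the 172.16.32.1 internal CARP and 172.16.0.0/24 remote net mentioned later in the thread; the .11/.12 physical addresses are made up for illustration, not from the thread.]

```
# Route the remote net via this box's *own* inside physical address, so
# firewall-generated packets get an inside source IP and match the IPsec
# flow (172.16.32.11 is a hypothetical inside physical IP):
route add 172.16.0.0/24 172.16.32.11

# pf.conf: restricted NAT -- only rewrite HQ-sourced traffic to the inside
# CARP IP when it is aimed at the firewalls' physical addresses, not the
# whole LAN (both physical IPs here are hypothetical):
fw_phys = "{ 172.16.32.11, 172.16.32.12 }"
match out on $if_lan from $hq_lan to $fw_phys nat-to (carp1)
```

Replies from the rest of the LAN already go back via the CARP gateway, which is why the broader `to ($if_lan:network)` match is only needed if you want to reach the firewalls themselves.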
Re: IPSec Packet Loss Help
> Once I removed the routes for the remote network pointing to the internal
> carp interface, everything works like I expect. Super stable. Thanks for
> your time. I'll mess with the NAT for monitoring soonish and see if I can
> get that working.
>
> --
> Zach
Re: IPSec Packet Loss Help
> I had to disable monitoring of the internal interfaces of both remote
> firewalls, as it killed the VPN when you pinged the backup firewall. The
> packets get there, but the reply is sent back directly from the backup and
> not via the master.
>
> To fix that I added a NAT rule, and could then monitor and connect to the
> internal interfaces of both remote firewalls again..
> (These pf.conf examples and files below are from our remote office
> firewalls. carp0 = external, carp1 = internal):
>
> match out on $if_lan from { $hq_lan } to ($if_lan:network) nat-to (carp1)
>
> pass in quick on enc0 proto ipencap from { $ext_ip_hqfw } to { (carp0) } keep state (if-bound)
> pass in quick on enc0 from { $hq_lan } to { $if_lan:network } keep state (if-bound)
>
> pass quick on $if_lan from { $hq_lan, (carp1) } to { $if_lan:network } queue (_wan_vpn,_wan_pri) set prio (2,5)

I currently don't have any queuing on this link, but my rules look pretty
close to the same here. tcpdump -nei pflog0 doesn't show any blocks for my
traffic with 'block in log' set, so I don't think PF is getting in the way.
Though, as you mention above, if the tunnel drops *because* I am hitting
this internal address, this would be a problem.

> PS; Also don't forget to restrict the MTU of VPN traffic so it doesn't
> fragment (needed on both sides naturally):
>
> match in on $if_lan proto { tcp, udp, icmp } from { $if_lan:network } to { $hq_lan } scrub (no-df max-mss 1400)
> set skip on $if_pfsync

I have set this, though I don't really know how to verify it's working as
expected. If I tcpdump the internal carp interface and ping through the
tunnel to a device on the other side, I see the packets traverse the link.
If I increase the ping size (-s 1500) I see fragmentation, but that doesn't
really tell me the rule is working, does it? Maybe I don't understand what
it is supposed to be doing.
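[Editorial aside: the 1400 in the max-mss rule can be sanity-checked with back-of-the-envelope arithmetic. The 77-byte ESP overhead figure below is a rough worst-case assumption for AES-CBC with an HMAC-SHA2 ICV, not a number from this thread.]

```shell
#!/bin/sh
# Rough worst-case ESP tunnel-mode overhead on a 1500-byte link:
#   outer IP (20) + ESP header (8) + IV (16) + pad/trailer (up to 17) + ICV (16)
ESP_OVERHEAD=77
LINK_MTU=1500
TUNNEL_MTU=$((LINK_MTU - ESP_OVERHEAD))   # inner packet size that avoids fragmenting
MAX_MSS=$((TUNNEL_MTU - 40))              # minus inner IPv4 (20) + TCP (20) headers
echo "tunnel MTU ~${TUNNEL_MTU}, so max-mss should be <= ${MAX_MSS}"
```

So a clamp around 1383 is conservatively safe; 1400 is close and may still fragment with some cipher/auth combinations. Note that max-mss only affects TCP; large UDP or ICMP packets still fragment, which is why the `ping -s 1500` test shows fragments either way. A more direct check is to tcpdump a TCP handshake crossing the tunnel and confirm the SYN's MSS option has been rewritten down to 1400.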
> >> I also submitted some suggested modifications to /etc/rc.d/sasyncd and
> >> /etc/rc.d/isakmpd here in the past which make the setup and failover of
> >> VPNs much faster and more stable.
> >
> > I did see those scripts, though they seem to be more solving the startup
> > time of the daemons. My issue is more keeping the service up than start
> > time.
>
> Yeah, they sort the startup and shutdown and also ensure a prompt failover.
> I wrote them during 5.2 so they may not be so important now, but they add
> a level of failsafe otherwise.
> Keeping the tunnel up should simply be a case of making sure the backup
> *never* sends encapped packets itself..

Oh, I missed this on my last read-through. What does this do? Why? I assume
this is the reason for the route on the firewall machine to use the local
internal carp for the remote network. I understand the reason to route the
packet to the internal carp interface on the backup, but on the primary I
am unsure. So where 172.16.32.1 is the internal carp, I have a route on
both the master and the backup:

    route add 172.16.0.0/24 172.16.32.1

Assuming that 172.16.0.0/24 is the remote network.

> > It sounds like your setup is similar to my own. You don't see these
> > kinds of instability using sasyncd? If you have a look at my OP, the
> > sasyncd.conf is in there. It's possible I have a configuration error,
> > but just reading over the manpage again, I don't know what it would be.
> >
> > This is really troubling me.
>
> No, none at all. Our tunnels are *really* stable. I can reboot a firewall
> and the tunnel only stops for a few seconds before switching over
> gracefully.

I really want to be able to say the same for these.

> /etc/sasyncd.conf
> peer 192.168.30.253 <- The other IP on the PFSYNC interface (cable
> directly connected between firewalls)
> interface carp0
> group carp
> listen on 192.168.30.252 inet port 500 <- This PFSYNC IP etc..
> sharedkey 0x
> flushmode startup
> control isakmpd

I now have exactly this, except where the options here are specified as
default in the manpage. I left off control, for example, as isakmpd is the
default, as is group carp.

> /etc/isakmpd.conf
> [general]
> listen-on=,

I've added this for my install and verified that the ports are listening
only on those addresses.

> /etc/ipsec.conf
> # Macros
> local_gw=""
> local_net=""
> remote_gw=""
> remote_net=""
>
> ike dynamic esp from $local_net to $remote_net \
>     local $local_gw peer $remote_gw \
>     main auth hmac-sha2-256 enc aes group modp1024 \
>     quick auth hmac-sha2-256 enc aes group modp1024 \
>     srcid $local_gw dstid $remote_gw \
>     psk

The only thing I was missing here was the srcid and dstid, but that didn't
seem to make a difference.

So now I have sasyncd and pfsync both going over a directly connected link,
and sasyncd is only listening on that interface address, as well as 'set
skip' on the interface in pf. All seems like what you have.

Just for troubleshooting, I've only added the sasyncd to one side; since
without HA it's stable, I'd like to introduce one set of changes at a time
for testing.
Re: IPSec Packet Loss Help
On Fri, Mar 07, 2014 at 04:35:45PM +, Andy wrote:
> Hi
>
> On Thu 06 Mar 2014 23:03:58 GMT, Zach Leslie wrote:
> > On Thu, Mar 06, 2014 at 08:16:34PM +, Andy Lemin wrote:
> > > Hi, haven't read your original email but if my assumptions about your
> > > setup are correct, is the VPN tunnel dropping every now and then?
> >
> > That's correct. Daemons start up quick, negotiations happen, and then
> > periodically the tunnel is just not available, despite the SAs being
> > available on the masters and the slaves. Disabling -S on isakmpd and
> > turning off sasyncd makes the tunnel stay up for much longer, 7 hours
> > and counting.
> >
> > > You need the static route to point to the internal interface to make
> > > sure that packets generated by the firewall itself have a source IP
> > > set to the internal net, thus allowing the IPSec policy route to be
> > > used (as it defines both the source and dest net, not just the dest
> > > net like a normal route).
> >
> > This I have, and packets flow. Still unclear about which route takes
> > precedence, encap or inet.
>
> encap (I tried to add a route for the remote net on the firewall pointing
> to the internal switch, which would bounce the packet back to the CARP IP,
> thus getting packets from the backup to the master and over the VPN. But
> it doesn't work; the encap route is used first and so the tunnel drops).

I suppose I should be more explicit. For packets generated by the host, it
seems to use the inet routing table. Perhaps that's because the source IP
is not in the flow.

> > > We had to modify all our monitoring scripts to not 'phone home' if the
> > > box is a backup. Only the master firewall can use the VPN.
> >
> > I've ended up monitoring the host using the internal interface, which in
> > turn tells me the tunnel is available.
>
> I had to disable monitoring of the internal interfaces of both remote
> firewalls, as it killed the VPN when you pinged the backup firewall.
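[Editorial aside: the encap-vs-inet precedence question above can be inspected directly on OpenBSD. A sketch, using a hypothetical remote host 172.16.0.5 — the point is that a packet whose source *and* destination match a flow is grabbed for IPsec before the plain inet route is consulted.]

```sh
# List the IPsec flows (the "encap" entries) currently installed:
ipsecctl -sf

# Show the plain inet route the kernel would otherwise use for the same
# destination (hypothetical remote host):
route -n get 172.16.0.5
```

If a host-generated packet picks its source address from an interface that is not covered by any flow, only the inet route applies, which matches Zach's observation that firewall-originated traffic behaves differently until the source is forced onto the internal net.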
> The packets get there, but the reply is sent back directly from the backup
> and not via the master.

What do you mean killed the VPN? As in you saw packet loss, or the tunnel
went down completely, or just that those packets did not return?

> To fix that I added a NAT rule, and could then monitor and connect to the
> internal interfaces of both remote firewalls again..
> (These pf.conf examples and files below are from our remote office
> firewalls. carp0 = external, carp1 = internal):
>
> match out on $if_lan from { $hq_lan } to ($if_lan:network) nat-to (carp1)
>
> pass in quick on enc0 proto ipencap from { $ext_ip_hqfw } to { (carp0) } keep state (if-bound)
> pass in quick on enc0 from { $hq_lan } to { $if_lan:network } keep state (if-bound)
>
> pass quick on $if_lan from { $hq_lan, (carp1) } to { $if_lan:network } queue (_wan_vpn,_wan_pri) set prio (2,5)

I've not implemented this NAT yet, simply because the packets in my case do
get returned. I have a route on both firewalls to the remote network that
points at the carp address shared between them. Though your mention of the
VPN going down sounds familiar.

> PS; Also don't forget to restrict the MTU of VPN traffic so it doesn't
> fragment (needed on both sides naturally):
>
> match in on $if_lan proto { tcp, udp, icmp } from { $if_lan:network } to { $hq_lan } scrub (no-df max-mss 1400)
> set skip on $if_pfsync

I'll add this shortly and test.

> Yeah, they sort the startup and shutdown and also ensure a prompt
> failover. I wrote them during 5.2 so they may not be so important now,
> but they add a level of failsafe otherwise.
> Keeping the tunnel up should simply be a case of making sure the backup
> *never* sends encapped packets itself..

Committing them to source seems reasonable if they are an improvement on
what is there.
> /etc/sasyncd.conf
> peer 192.168.30.253 <- The other IP on the PFSYNC interface (cable
> directly connected between firewalls)
> interface carp0
> group carp
> listen on 192.168.30.252 inet port 500 <- This PFSYNC IP etc..
> sharedkey 0x
> flushmode startup
> control isakmpd

My sasyncd.conf looks different, so I'm going to try some of this now.
Enabling sasyncd and adding -S to isakmpd causes serious instability right
now. Disabling it makes everything stable again, so I'm hoping it's just a
configuration issue.

So this means that you have pfsync and sasyncd going over the same directly
attached interfaces, correct?

I'll report back when I've implemented the changes. Thanks for the advice.

--
Zach
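[Editorial aside: for completeness, the mirror of Andy's sasyncd.conf on the second firewall would presumably just swap the two pfsync-link addresses. A sketch — only the 192.168.30.252/.253 addresses come from the thread; the rest follows the defaults Zach mentions from the manpage.]

```
# /etc/sasyncd.conf on the peer firewall (hypothetical mirror)
peer 192.168.30.252              # the first firewall's pfsync-link IP
interface carp0                  # CARP interface whose state decides master/backup
listen on 192.168.30.253 inet port 500
sharedkey 0x...                  # same key on both peers (value elided)
flushmode startup
# 'group carp' and 'control isakmpd' are the defaults per sasyncd.conf(5),
# so they can be omitted, as Zach does.
```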
Re: IPSec Packet Loss Help
Hi

On Thu 06 Mar 2014 23:03:58 GMT, Zach Leslie wrote:
> On Thu, Mar 06, 2014 at 08:16:34PM +, Andy Lemin wrote:
> > Hi, haven't read your original email but if my assumptions about your
> > setup are correct, is the VPN tunnel dropping every now and then?
>
> Thats correct. Daemons start up quick, negotiations happen, and then
> periodically the tunnel is just not available, despite the SAs being
> available on the masters and the slaves. Disabling -S on isakmpd and
> turning off sasyncd makes the tunnel stay up for much longer, 7 hours
> and counting.
>
> > You need the static route to point to the internal interface to make
> > sure that packets generated by the firewall itself have a source IP set
> > to the internal net, thus allowing the IPSec policy route to be used
> > (as it defines both the source and dest net, not just the dest net like
> > a normal route).
>
> This I have, and packets flow. Still unclear about which route takes
> precedence, encap or inet.

encap (I tried to add a route for the remote net on the firewall pointing
to the internal switch, which would bounce the packet back to the CARP IP,
thus getting packets from the backup to the master and over the VPN. But it
doesn't work; the encap route is used first and so the tunnel drops).

> > We had to modify all our monitoring scripts to not 'phone home' if the
> > box is a backup. Only the master firewall can use the VPN.
>
> I've ended up monitoring the host using the internal interface, which in
> turn tells me the tunnel is available.

I had to disable monitoring of the internal interfaces of both remote
firewalls, as it killed the VPN when you pinged the backup firewall. The
packets get there, but the reply is sent back directly from the backup and
not via the master.

To fix that I added a NAT rule, and could then monitor and connect to the
internal interfaces of both remote firewalls again..
(These pf.conf examples and files below are from our remote office
firewalls.
carp0 = external, carp1 = internal):

match out on $if_lan from { $hq_lan } to ($if_lan:network) nat-to (carp1)

pass in quick on enc0 proto ipencap from { $ext_ip_hqfw } to { (carp0) } keep state (if-bound)
pass in quick on enc0 from { $hq_lan } to { $if_lan:network } keep state (if-bound)

pass quick on $if_lan from { $hq_lan, (carp1) } to { $if_lan:network } queue (_wan_vpn,_wan_pri) set prio (2,5)

PS; Also don't forget to restrict the MTU of VPN traffic so it doesn't
fragment (needed on both sides naturally):

match in on $if_lan proto { tcp, udp, icmp } from { $if_lan:network } to { $hq_lan } scrub (no-df max-mss 1400)
set skip on $if_pfsync

> > I also submitted some suggested modifications to /etc/rc.d/sasyncd and
> > /etc/rc.d/isakmpd here in the past which make the setup and failover of
> > VPNs much faster and more stable.
>
> I did see those scripts, though they seem to be more solving the startup
> time of the daemons. My issue is more keeping the service up than start
> time.

Yeah, they sort the startup and shutdown and also ensure a prompt failover.
I wrote them during 5.2 so they may not be so important now, but they add a
level of failsafe otherwise.
Keeping the tunnel up should simply be a case of making sure the backup
*never* sends encapped packets itself..

> It sounds like your setup is similar to my own. You don't see these kinds
> of instability using sasyncd? If you have a look at my OP, the
> sasyncd.conf is in there. It's possible I have a configuration error, but
> just reading over the manpage again, I don't know what it would be.
>
> This is really troubling me.

No, none at all. Our tunnels are *really* stable. I can reboot a firewall
and the tunnel only stops for a few seconds before switching over
gracefully.

/etc/sasyncd.conf
peer 192.168.30.253 <- The other IP on the PFSYNC interface (cable directly
connected between firewalls)
interface carp0
group carp
listen on 192.168.30.252 inet port 500 <- This PFSYNC IP etc..
sharedkey 0x
flushmode startup
control isakmpd

/etc/isakmpd.conf
[general]
listen-on=,

/etc/ipsec.conf
# Macros
local_gw=""
local_net=""
remote_gw=""
remote_net=""

ike dynamic esp from $local_net to $remote_net \
    local $local_gw peer $remote_gw \
    main auth hmac-sha2-256 enc aes group modp1024 \
    quick auth hmac-sha2-256 enc aes group modp1024 \
    srcid $local_gw dstid $remote_gw \
    psk

/etc/rc.d/isakmpd:
#!/bin/sh
#
# $OpenBSD: isakmpd,v 1.1 2011/07/06 18:55:36 robert Exp $

daemon="/sbin/isakmpd"

. /etc/rc.d/rc.subr

pexp="isakmpd: monitor \[priv\]"

rc_pre() {
        [ X"${sasyncd_flags}" != X"NO" ] && \
                daemon_flags="-S ${daemon_flags}"
        return 0
}

rc_stop() {
        if [ `ifconfig | grep "status: master" | wc -l` -gt 0 ]; then
                ipsecctl -d -f /etc/ipsec.conf
        fi
        sleep 1
        if [ `ifconfig | grep "status: master" | wc -l` -gt 0 ]; then
                ipsecctl -d -f /etc/ipsec.conf
        fi
        if [ `ifconfig | grep "status: master" | wc -l` -gt 0 ]; then
                ipsecctl -F -f /etc/ipsec.conf
        fi
        pkill -f "^${pexp}"
}

rc_cmd $1

/etc/rc.d/sasyncd
#!/bin/sh
#
# $Open
Re: IPSec Packet Loss Help
On Thu, Mar 06, 2014 at 08:16:34PM +, Andy Lemin wrote:
> Hi, haven't read your original email but if my assumptions about your
> setup are correct, is the VPN tunnel dropping every now and then?

That's correct. Daemons start up quick, negotiations happen, and then
periodically the tunnel is just not available, despite the SAs being
available on the masters and the slaves. Disabling -S on isakmpd and
turning off sasyncd makes the tunnel stay up for much longer, 7 hours and
counting.

> You need the static route to point to the internal interface to make sure
> that packets generated by the firewall itself have a source IP set to the
> internal net, thus allowing the IPSec policy route to be used (as it
> defines both the source and dest net, not just the dest net like a normal
> route).

This I have, and packets flow. Still unclear about which route takes
precedence, encap or inet.

> We had to modify all our monitoring scripts to not 'phone home' if the
> box is a backup. Only the master firewall can use the VPN.

I've ended up monitoring the host using the internal interface, which in
turn tells me the tunnel is available.

> I also submitted some suggested modifications to /etc/rc.d/sasyncd and
> /etc/rc.d/isakmpd here in the past which make the setup and failover of
> VPNs much faster and more stable.

I did see those scripts, though they seem to be more solving the startup
time of the daemons. My issue is more keeping the service up than start
time.

It sounds like your setup is similar to my own. You don't see these kinds
of instability using sasyncd? If you have a look at my OP, the sasyncd.conf
is in there. It's possible I have a configuration error, but just reading
over the manpage again, I don't know what it would be.

This is really troubling me.

--
Zach
Re: IPSec Packet Loss Help
Hi, haven't read your original email but if my assumptions about your setup
are correct, is the VPN tunnel dropping every now and then?

I had a similar issue with 4 OBSD firewalls (2 at each end), all running
isakmpd and sasyncd to keep the SAs in sync between a pair, with the
tunnels explicitly set to use the CARP IPs as endpoints.

You need the static route to point to the internal interface to make sure
that packets generated by the firewall itself have a source IP set to the
internal net, thus allowing the IPSec policy route to be used (as it
defines both the source and dest net, not just the dest net like a normal
route).

However, whilst pinging the remote net from the live firewall works, I
found that if you try to ping the remote net from the backup firewall, the
packet does *not* get routed to the CARP master (why would it!). The IPSec
route is there.. The backup firewall encaps the packet and sends it out its
own external interface using the details from sasyncd, but the master
firewall at the other end will see the packet as coming from 'another' box
with the same details but not the full correct policy, and so the tunnel is
blocked for a period until the master renegotiates it.

We had to modify all our monitoring scripts to not 'phone home' if the box
is a backup. Only the master firewall can use the VPN.

I also submitted some suggested modifications to /etc/rc.d/sasyncd and
/etc/rc.d/isakmpd here in the past which make the setup and failover of
VPNs much faster and more stable. Can dig out if you can't find them.

Cheers, Andy

Sent from my iPhone

> On 6 Mar 2014, at 19:00, Zach Leslie wrote:
>
> On Wed, Mar 05, 2014 at 11:05:11PM -0600, Amit Kulkarni wrote:
> >> If PF information is needed, I can provide and obscure, but I didn't
> >> expect it to be the issue.
> >
> > i am no expert on this. but if it is a packet loss issue, you need to
> > post the obscured pf.conf
>
> Fair point.
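[Editorial aside: the "don't phone home if the box is a backup" guard Andy describes could look something like the sketch below in a monitoring script. This is a hypothetical illustration, not his actual code; it just parses ifconfig output for the CARP status line.]

```shell
#!/bin/sh
# Print the CARP state ("master" or "backup") found on stdin, or "unknown".
# Feed it the output of `ifconfig carp1`; a monitoring script would bail
# out early unless the answer is "master", so the backup firewall never
# tries to send encapped packets itself.
carp_state() {
    awk '/status:/ { print $2; found = 1 } END { if (!found) print "unknown" }'
}

# Example gate at the top of a check script (carp1 = inside CARP interface):
#   [ "$(ifconfig carp1 | carp_state)" = "master" ] || exit 0
printf '\tstatus: master\n' | carp_state
```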
> I've not seen any related dropped packets due to PF with tcpdump -nei
> pflog0, so I didn't think it would be related to PF.
>
> Not line-wrapped for readability:
>
> match out on em0 from ! (em0:network) to any nat-to (em0:0)
> block drop in log all
> pass out all flags S/SA
> pass out on em0 proto tcp all flags S/SA modulate state
> pass in proto icmp from to
> pass in proto ipv6-icmp from to
> pass in from to ! flags S/SA
> pass in from to any flags S/SA
> pass in proto udp from to port = 53
> pass out on em0 proto udp all
> pass out on em0 proto icmp all
> pass on em0 inet proto carp all
> pass on em0 proto icmp from any to (em0:network)
> pass on em1 inet proto pfsync all
> pass in on em0 inet proto udp from 66.77.88.10 to 1.2.3.5 port = 500
> pass in on em0 inet proto udp from 66.77.88.10 to 1.2.3.5 port = 4500
> pass in on em0 inet proto esp from 66.77.88.10 to 1.2.3.5
> pass in on enc0 inet proto ipencap from 66.77.88.10 to 1.2.3.5 keep state (if-bound)
> pass out on em0 inet proto udp from 1.2.3.5 to 66.77.88.10 port = 500
> pass out on em0 inet proto udp from 1.2.3.5 to 66.77.88.10 port = 4500
> pass out on em0 inet proto esp from 1.2.3.5 to 66.77.88.10
> pass out on enc0 inet proto ipencap from 1.2.3.5 to 66.77.88.10 keep state (if-bound)
> pass out on enc0 inet from 1.2.3.5 to flags S/SA keep state (if-bound)
> pass out on enc0 from to flags S/SA keep state (if-bound)
> pass in on enc0 from to flags S/SA keep state (if-bound)
> pass in on enc0 inet from 66.77.88.10 to flags S/SA keep state (if-bound)
> pass in on em0 inet proto tcp from 66.77.88.17 to any port = 22 flags S/SA
> pass in on em0 inet proto tcp from 1.2.3.7 to 1.2.3.6 port = 500 flags S/SA
>
> This morning as a test, I've disabled the isakmpd sync feature, and shut
> down sasyncd on all firewalls, as well as isakmpd on the secondaries at
> each location, and the connection seems to be much improved. I've not
> lost any connections in the last 4 hours, which is much improved.
> Not sure if sasyncd is actually causing the issue, but disabling it to
> gain an improved connection certainly doesn't seem great from an HA
> standpoint.
>
> I've also got a couple of static routes in the inet table that point the
> remote network to the internal carp address for routing purposes. This
> allows the traffic generated by the secondary firewall to reach the
> remote network, due to the fact that the secondary does not hold the
> master status of the carp and therefore can't use the IPSec directly.
>
> I do wonder though, since I also have a flow for the same network, the
> encap and inet routing tables have a route for the same network. Which
> takes priority? Just something to point out, since it could be causing
> troubles.
>
> Regards,
>
> --
> Zach
Re: IPSec Packet Loss Help
On Wed, Mar 05, 2014 at 11:05:11PM -0600, Amit Kulkarni wrote:
> > If PF information is needed, I can provide and obscure, but I didn't
> > expect it to be the issue.
>
> i am no expert on this. but if it is a packet loss issue, you need to
> post the obscured pf.conf

Fair point. I've not seen any related dropped packets due to PF with
tcpdump -nei pflog0, so I didn't think it would be related to PF.

Not line-wrapped for readability:

match out on em0 from ! (em0:network) to any nat-to (em0:0)
block drop in log all
pass out all flags S/SA
pass out on em0 proto tcp all flags S/SA modulate state
pass in proto icmp from to
pass in proto ipv6-icmp from to
pass in from to ! flags S/SA
pass in from to any flags S/SA
pass in proto udp from to port = 53
pass out on em0 proto udp all
pass out on em0 proto icmp all
pass on em0 inet proto carp all
pass on em0 proto icmp from any to (em0:network)
pass on em1 inet proto pfsync all
pass in on em0 inet proto udp from 66.77.88.10 to 1.2.3.5 port = 500
pass in on em0 inet proto udp from 66.77.88.10 to 1.2.3.5 port = 4500
pass in on em0 inet proto esp from 66.77.88.10 to 1.2.3.5
pass in on enc0 inet proto ipencap from 66.77.88.10 to 1.2.3.5 keep state (if-bound)
pass out on em0 inet proto udp from 1.2.3.5 to 66.77.88.10 port = 500
pass out on em0 inet proto udp from 1.2.3.5 to 66.77.88.10 port = 4500
pass out on em0 inet proto esp from 1.2.3.5 to 66.77.88.10
pass out on enc0 inet proto ipencap from 1.2.3.5 to 66.77.88.10 keep state (if-bound)
pass out on enc0 inet from 1.2.3.5 to flags S/SA keep state (if-bound)
pass out on enc0 from to flags S/SA keep state (if-bound)
pass in on enc0 from to flags S/SA keep state (if-bound)
pass in on enc0 inet from 66.77.88.10 to flags S/SA keep state (if-bound)
pass in on em0 inet proto tcp from 66.77.88.17 to any port = 22 flags S/SA
pass in on em0 inet proto tcp from 1.2.3.7 to 1.2.3.6 port = 500 flags S/SA

This morning as a test, I've disabled the isakmpd sync feature, and shut
down sasyncd
on all firewalls, as well as isakmpd on the secondaries at each location,
and the connection seems to be much improved. I've not lost any connections
in the last 4 hours, which is much improved.

Not sure if sasyncd is actually causing the issue, but disabling it to gain
an improved connection certainly doesn't seem great from an HA standpoint.

I've also got a couple of static routes in the inet table that point the
remote network to the internal carp address for routing purposes. This
allows the traffic generated by the secondary firewall to reach the remote
network, due to the fact that the secondary does not hold the master status
of the carp and therefore can't use the IPSec directly.

I do wonder though, since I also have a flow for the same network, the
encap and inet routing tables have a route for the same network. Which
takes priority? Just something to point out, since it could be causing
troubles.

Regards,

--
Zach
Re: IPSec Packet Loss Help
> OpenBSD 5.4 GENERIC#37 amd64

I've just booted the MP kernel on all four systems just to test, and I am
still seeing the behaviour. I can provoke the packet loss by generating
load on the CPU. Running Puppet on the machines drives up the CPU usage
considerably, at which point my remote session hangs. When CPU returns to
normal levels, the traffic flows again.

I also failed to mention that I am running OSPF at only one of the
locations right now, so a single pair of the firewalls.

--
Zach
IPSec Packet Loss Help
I've recently deployed a set of OpenBSD firewalls and am nearing the time
when they need to go to production, but I've got an issue that I can't nail
down.

I've got a pair of OpenBSD 5.4 systems running on Soekris 6501 at each
location, for a total of four firewalls. Each pair is running the sasyncd,
pfsync, carp combo, all of which seems to be working correctly.

OpenBSD 5.4 GENERIC#37 amd64

* pfctl -ss shows the states are making it between peers
* carp fails over nicely, with maybe 1 packet lost in my icmp testing
* ipsecctl -sa shows associations sync to its peer
* IPSec connections are established to the remote datacenter

All of the above looks to be working. I initially had both sides of the
IPSec configured with active mode, but I thought it was causing my packet
loss, perhaps due to simultaneous initiation of the security association,
so I set passive on one of the locations.

The issue I am seeing is that traffic will periodically stop flowing
between the sites. During this time, ipsecctl -sa shows that there are
associations, and netstat -rn says routes are up.

My next thought was that perhaps phase 1 key exchange was happening too
frequently, so I increased it to 2 days, leaving phase 2 at the default,
but it still happens. Even still, I expect increasing the timeout is only
masking the actual problem.

I'm using nagios to monitor the connection, using the internal IPs of the
firewalls as the address to ping, and from the history of nagios, it looks
like I am having connection issues about every 15 minutes, and these are
only the ones that are detected. Sometimes it's just a couple of packets.
Sometimes it's down for a good 90 seconds.

Every five minutes or so I see this in the daemon logs:

Mar 5 19:32:32 opdx-fw1 isakmpd[28109]: isakmpd: quick mode done (as responder): src: 1.2.3.5 dst: 66.77.88.10

which I expect, due to the lifetime of the phase 2 being set to the default
value.
We deploy the configs using Puppet, so the consistency across machines
should be solid, though that doesn't say much about errors in the configs
we deploy.

I also wondered if the single-threaded kernel might play a part here: 80%
CPU when running Puppet. Would booting the MP kernel help? The CPUs do
support it.

In any case, I'm stuck. My coworkers are all looking at me wondering when
they can purchase some shiny new commercial firewalls, and I'd really like
to have a success story here. I can always switch to doing some SSH tunnel,
or OpenVPN or some such, but since OpenBSD has IPSec built into the core,
a) I'd like to use it, and b) I expect it to work.

I'm hoping someone on the list can point out something I am doing wrong.
This is the first time I've run OpenBSD in production, so my methods may
not be conventional.

IPSec is configured to talk to the remote site CARP address, so below
that's 1.2.3.5 and 66.77.88.10. Here are (I think) the relevant configs.
Please help.

# SiteA ipsec.conf
ike esp from { 66.77.88.10 10.224.0.0/12 } to { 1.2.3.5 10.240.0.0/12 } \
    local 66.77.88.10 peer 1.2.3.5 \
    main auth hmac-sha2-256 enc blowfish lifetime 172800 \
    quick auth hmac-sha2-384 enc blowfish \
    psk "secret"

# SiteB ipsec.conf
ike passive esp from { 1.2.3.5 10.240.0.0/12 } to { 66.77.88.10 10.224.0.0/12 } \
    local 1.2.3.5 peer 66.77.88.10 \
    main auth hmac-sha2-256 enc blowfish lifetime 172800 \
    quick auth hmac-sha2-384 enc blowfish \
    psk "secret"

# rc.conf
ntpd_flags=""
isakmpd_flags="-K -S -v"
sasyncd_flags=""
ipsec=YES
syslogd_flags="-h"
snmpd_flags=YES

# sasyncd.conf on one of the systems
peer 1.2.3.7
interface carp1
sharedkey 0xsuperlonghex

If PF information is needed, I can provide and obscure, but I didn't expect
it to be the issue.

What else can I use that would help me troubleshoot this? What more
information can I provide that would help narrow this down?

Regards,

--
Zach