Re: IPSec Packet Loss Help

2014-03-10 Thread Andy

Hi Zach.

Ah great news!

I noticed your email before the weekend but didn't have a chance to
reply.  Pleased you worked it out.


The remote network routes I use don't point at the local inside CARP IP
but instead at the local inside physical IP (each firewall's own IP, just
to set the source).


Yea, setting the NAT fixed some of the issues for us with communicating
with the firewalls themselves.  Restrict the NAT rule if you like, so
you only NAT to the internal CARP IP when trying to talk to either of
the firewalls' physical IPs.  There's no need to NAT traffic to the rest
of the LAN, as that only ever replies back to the CARP IP as the gateway.
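A restricted version of that NAT rule might look like this (a sketch only; the <fw_phys> table and its addresses are hypothetical stand-ins for the two firewalls' physical internal IPs):

```
# pf.conf sketch: NAT to the internal CARP IP only when HQ traffic
# targets the firewalls' own physical internal addresses; traffic to
# the rest of the LAN needs no NAT since it replies via the CARP GW.
table <fw_phys> { 172.16.32.2, 172.16.32.3 }   # hypothetical physical IPs
match out on $if_lan from { $hq_lan } to <fw_phys> nat-to (carp1)
```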


Cheers, Andy.

On Mon 10 Mar 2014 16:25:59 GMT, Zach Leslie wrote:

Hope this helps,


Thanks, Andy.  Once I removed the routes for the remote network pointing to
the internal carp interface, everything works like I expect.  Super
stable.  Thanks for your time.  I'll mess with the NAT for monitoring
soonish and see if I can get that working.




Re: IPSec Packet Loss Help

2014-03-10 Thread Zach Leslie
> Hope this helps,

Thanks, Andy.  Once I removed the routes for the remote network pointing to
the internal carp interface, everything works like I expect.  Super
stable.  Thanks for your time.  I'll mess with the NAT for monitoring
soonish and see if I can get that working.

-- 
Zach



Re: IPSec Packet Loss Help

2014-03-07 Thread Zach Leslie
> I had to disable monitoring of the internal interfaces of both remote
> firewalls, as it killed the VPN when you ping'ed the backup firewall. The
> packets get there, but the reply is sent back directly from the backup and
> not via the master.
> 
> To fix that I added a NAT rule, and could then monitor and connect to the
> internal interfaces of both remote firewalls again..
> (These pf.conf examples and files below are from our remote office
> firewalls. carp0 = external, carp1 = internal);
> match out on $if_lan from { $hq_lan } to ($if_lan:network) nat-to (carp1)
> 
> pass in quick on enc0 proto ipencap from { $ext_ip_hqfw } to { (carp0) }
> keep state (if-bound)
> pass in quick on enc0 from { $hq_lan } to { $if_lan:network } keep state
> (if-bound)
> 
> pass quick on $if_lan from { $hq_lan, (carp1) } to { $if_lan:network } queue
> (_wan_vpn,_wan_pri) set prio (2,5)

I currently don't have any queuing on this link, but my rules look
pretty close to the same here.  tcpdump -nei pflog0 doesn't show any
blocks for my traffic with 'block in log' set, so I don't think PF is
getting in the way.

Though, as you mention above, if the tunnel drops *because* I am hitting
this internal address, this would be a problem.

> PS; Also don't forget to restrict the MTU of VPN traffic so it doesn't
> fragment (needed on both sides naturally);
> match in on $if_lan proto { tcp, udp, icmp } from { $if_lan:network } to {
> $hq_lan } scrub (no-df max-mss 1400)
> set skip on $if_pfsync

I have set this, though I don't really know how to verify it's working
as expected.  If I tcpdump the internal carp interface and ping
through the tunnel to a device on the other side, I see the packets
traverse the link.  If I increase the ping size (-s 1500) I see
fragmentation, but that doesn't really tell me the rule is working,
does it?  Maybe I don't understand what it is supposed to be doing.
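For what it's worth, max-mss only rewrites the MSS option in TCP SYNs, so a large ICMP ping fragmenting is expected and doesn't exercise the rule. One illustrative way to check it (the interface name and network below are assumptions, not from the thread) is to watch TCP handshakes on the LAN side:

```
# Show SYN packets with their TCP options; after the scrub rule the
# advertised mss for VPN-bound connections should be <= 1400.
tcpdump -nvi em2 'tcp[tcpflags] & tcp-syn != 0 and net 172.16.0.0/24'
```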

> >>I also submitted some suggested modifications to /etc/rc.d/sasyncd and 
> >>/etc/rc.d/isakmpd here in the past which makes the setup and failover of 
> >>VPNs much faster and more stable.
> >
> >I did see those scripts, though they seem to be more solving the startup
> >time of the daemons.  My issue is more keeping the service up than start
> >time.
> 
> Yea they sort the startup, shutdown and also ensure a prompt failover. I
> wrote them during 5.2 so they may not be so important now, but they add a
> level of failsafe otherwise.
> Keeping the tunnel up should simply be a case of making sure the backup
> *never* sends encaped packets itself..

Oh, I missed this on my last read-through.  What does this do?  Why?  I assume
this is the reason for the route on the firewall machine to use the
local internal carp for the remote network.

I understand the reason to route the packet to the internal carp
interface on the backup, but on the primary I am unsure.  So where
172.16.32.1 is the internal carp, I have a route on both the master and
the backup.

route add 172.16.0.0/24 172.16.32.1

Assuming that 172.16.0.0/24 is the remote network.

> >It sounds like your setup is similar to my own.  You don't see these
> >kinds of instability using sasyncd?  If you have a look at my OP, the
> >sasyncd.conf is in there.  It's possible I have a configuration error,
> >but just reading over the manpage again, I don't know what it would be.
> >
> >This is really troubling me.
> >
> 
> No, none at all. Our tunnels are *really* stable. I can reboot a firewall
> and the tunnel only stops for a few seconds before switching over
> gracefully.

I really want to be able to say the same for these.

> /etc/sasyncd.conf
> peer 192.168.30.253 <- The other IP on the PFSYNC interface (cable directly
> connected between firewalls)
> interface carp0
> group carp
> listen on 192.168.30.252 inet port 500 <- This PFSYNC IP etc..
> sharedkey 0x
> flushmode startup
> control isakmpd

I now have exactly this, except I've left out the options that are
documented as defaults in the manpage.  I left off control, for example,
as isakmpd is the default, as is group carp.

> /etc/isakmpd.conf
> [general]
> listen-on=,

I've added this for my install and verified that the ports are listening
only on those addresses.

> /etc/ipsec.conf
> # Macros
> local_gw=""
> local_net=""
> remote_gw=""
> remote_net=""
> 
> ike dynamic esp from $local_net to $remote_net \
> local $local_gw peer $remote_gw \
> main auth hmac-sha2-256 enc aes group modp1024 \
> quick auth hmac-sha2-256 enc aes group modp1024 \
> srcid $local_gw dstid $remote_gw \
> psk 

The only thing I was missing here was the srcid and dstid, but that
didn't seem to make a difference.

So now I have the sasyncd and pfsync both going over a directly
connected link, and sasyncd is only listening on that interface address,
as well as 'set skip' on the interface in pf.  All seems like what you
have.

Just for troubleshooting, I've only added sasyncd to one side; since
without HA it's stable, I'd like to introduce one set of changes at a
time for testing.

Re: IPSec Packet Loss Help

2014-03-07 Thread Zach Leslie
On Fri, Mar 07, 2014 at 04:35:45PM +, Andy wrote:
> Hi
> 
> On Thu 06 Mar 2014 23:03:58 GMT, Zach Leslie wrote:
> >On Thu, Mar 06, 2014 at 08:16:34PM +, Andy Lemin wrote:
> >>Hi, haven't read your original email but if my assumptions about your setup 
> >>are correct is the VPN tunnel dropping every now and then?
> >
> >That's correct.  Daemons start up quickly, negotiations happen, and then
> >periodically the tunnel is just not available, despite the SAs being
> >available on the masters and the slaves.  Disabling -S on isakmpd and
> >turning off sasyncd makes the tunnel stay up for much longer, 7 hours
> >and counting.
> >
> >>You need the static route to point to the internal interface to make sure 
> >>that packets generated by the firewall itself have a source IP set to the 
> >>internal net thus allowing the IPSec policy route to be used (as it defines 
> >>both the source and dest net, not just the dest net like a normal route).
> >
> >This I have, and packets flow.  Still unclear about which route takes
> >precedence, encap or inet.
> >
> 
> encap (I tried to add a route for the remote net on the firewall pointing to
> the internal switch, which would bounce the packet back to the CARP IP, thus
> getting packets from the backup to the master and over the VPN. But it
> doesn't work, the encap route is used first and so the tunnel drops).

I suppose I should be more explicit.  For packets generated by the host,
it seems to use the inet routing table.  Perhaps that's because the
source IP is not in the flow.
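One illustrative way to see both tables side by side (base-system commands; the addresses are the thread's examples) is to compare the inet route with the IPSec flows:

```
# inet routing table entry for the remote net
route -n get 172.16.0.0/24
# flows in the kernel's encap table; an outbound packet whose src/dst
# pair matches a flow is grabbed by IPSec regardless of the inet route
ipsecctl -s flow
```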

> >>We had to modify all our monitoring scripts to not 'phone home' if the box 
> >>is a backup. Only the master firewall can use the VPN.
> >
> >I've ended up monitoring the host using internal interface, which in
> >turn tells me the tunnel is available.
> 
> I had to disable monitoring of the internal interfaces of both remote
> firewalls, as it killed the VPN when you ping'ed the backup firewall. The
> packets get there, but the reply is sent back directly from the backup and
> not via the master.

What do you mean by "killed the VPN"?  As in, you saw packet loss, the
tunnel went down completely, or just that those packets did not return?

> To fix that I added a NAT rule, and could then monitor and connect to the
> internal interfaces of both remote firewalls again..
> (These pf.conf examples and files below are from our remote office
> firewalls. carp0 = external, carp1 = internal);
> match out on $if_lan from { $hq_lan } to ($if_lan:network) nat-to (carp1)
> 
> pass in quick on enc0 proto ipencap from { $ext_ip_hqfw } to { (carp0) }
> keep state (if-bound)
> pass in quick on enc0 from { $hq_lan } to { $if_lan:network } keep state
> (if-bound)
> 
> pass quick on $if_lan from { $hq_lan, (carp1) } to { $if_lan:network } queue
> (_wan_vpn,_wan_pri) set prio (2,5)

I've not implemented this NAT yet, simply because the packets in my case
do get returned.  I have a route on both firewalls to the remote network
to point at the carp address shared between them.  Though your mention
of the VPN going down sounds familiar.

> PS; Also don't forget to restrict the MTU of VPN traffic so it doesn't
> fragment (needed on both sides naturally);
> match in on $if_lan proto { tcp, udp, icmp } from { $if_lan:network } to {
> $hq_lan } scrub (no-df max-mss 1400)
> set skip on $if_pfsync

I'll add this shortly and test.

> Yea they sort the startup, shutdown and also ensure a prompt failover. I
> wrote them during 5.2 so they may not be so important now, but they add a
> level of failsafe otherwise.
> Keeping the tunnel up should simply be a case of making sure the backup
> *never* sends encaped packets itself..

Committing them to source seems reasonable if they are an improvement on
what is there.

> /etc/sasyncd.conf
> peer 192.168.30.253 <- The other IP on the PFSYNC interface (cable directly
> connected between firewalls)
> interface carp0
> group carp
> listen on 192.168.30.252 inet port 500 <- This PFSYNC IP etc..
> sharedkey 0x
> flushmode startup
> control isakmpd

My sasyncd.conf looks different, so I'm going to try some of this now.
Enabling sasyncd and adding -S to isakmpd causes serious instability
right now.  Disabling it makes everything stable again, so I'm hoping
it's just a configuration issue.  So this means that you have pfsync and
sasyncd going over the same directly attached interfaces, correct?

I'll report back when I've implemented the changes.

Thanks for the advice.

-- 
Zach



Re: IPSec Packet Loss Help

2014-03-07 Thread Andy

Hi

On Thu 06 Mar 2014 23:03:58 GMT, Zach Leslie wrote:

On Thu, Mar 06, 2014 at 08:16:34PM +, Andy Lemin wrote:

Hi, haven't read your original email but if my assumptions about your setup are 
correct is the VPN tunnel dropping every now and then?


That's correct.  Daemons start up quickly, negotiations happen, and then
periodically the tunnel is just not available, despite the SAs being
available on the masters and the slaves.  Disabling -S on isakmpd and
turning off sasyncd makes the tunnel stay up for much longer, 7 hours
and counting.


You need the static route to point to the internal interface to make sure that 
packets generated by the firewall itself have a source IP set to the internal 
net thus allowing the IPSec policy route to be used (as it defines both the 
source and dest net, not just the dest net like a normal route).


This I have, and packets flow.  Still unclear about which route takes
precedence, encap or inet.



encap (I tried to add a route for the remote net on the firewall 
pointing to the internal switch, which would bounce the packet back to 
the CARP IP, thus getting packets from the backup to the master and 
over the VPN. But it doesn't work, the encap route is used first and so 
the tunnel drops).



We had to modify all our monitoring scripts to not 'phone home' if the box is a 
backup. Only the master firewall can use the VPN.


I've ended up monitoring the host using the internal interface, which in
turn tells me the tunnel is available.


I had to disable monitoring of the internal interfaces of both remote 
firewalls, as it killed the VPN when you ping'ed the backup firewall. 
The packets get there, but the reply is sent back directly from the 
backup and not via the master.


To fix that I added a NAT rule, and could then monitor and connect to 
the internal interfaces of both remote firewalls again..
(These pf.conf examples and files below are from our remote office 
firewalls. carp0 = external, carp1 = internal);
match out on $if_lan from { $hq_lan } to ($if_lan:network) nat-to 
(carp1)


pass in quick on enc0 proto ipencap from { $ext_ip_hqfw } to { (carp0) 
} keep state (if-bound)
pass in quick on enc0 from { $hq_lan } to { $if_lan:network } keep 
state (if-bound)


pass quick on $if_lan from { $hq_lan, (carp1) } to { $if_lan:network } 
queue (_wan_vpn,_wan_pri) set prio (2,5)



PS; Also don't forget to restrict the MTU of VPN traffic so it doesn't 
fragment (needed on both sides naturally);
match in on $if_lan proto { tcp, udp, icmp } from { $if_lan:network } 
to { $hq_lan } scrub (no-df max-mss 1400)

set skip on $if_pfsync




I also submitted some suggested modifications to /etc/rc.d/sasyncd and 
/etc/rc.d/isakmpd here in the past which makes the setup and failover of VPNs 
much faster and more stable.


I did see those scripts, though they seem more aimed at solving the
startup time of the daemons.  My issue is more keeping the service up
than start time.


Yea they sort the startup, shutdown and also ensure a prompt failover.
I wrote them during 5.2 so they may not be so important now, but they
add a level of failsafe otherwise.
Keeping the tunnel up should simply be a case of making sure the backup 
*never* sends encaped packets itself..




It sounds like your setup is similar to my own.  You don't see these
kinds of instability using sasyncd?  If you have a look at my OP, the
sasyncd.conf is in there.  It's possible I have a configuration error,
but just reading over the manpage again, I don't know what it would be.

This is really troubling me.



No, none at all. Our tunnels are *really* stable. I can reboot a 
firewall and the tunnel only stops for a few seconds before switching 
over gracefully.


/etc/sasyncd.conf
peer 192.168.30.253 <- The other IP on the PFSYNC interface (cable 
directly connected between firewalls)

interface carp0
group carp
listen on 192.168.30.252 inet port 500 <- This PFSYNC IP etc..
sharedkey 0x
flushmode startup
control isakmpd

/etc/isakmpd.conf
[general]
listen-on=,

/etc/ipsec.conf
# Macros
local_gw=""
local_net=""
remote_gw=""
remote_net=""

ike dynamic esp from $local_net to $remote_net \
local $local_gw peer $remote_gw \
main auth hmac-sha2-256 enc aes group modp1024 \
quick auth hmac-sha2-256 enc aes group modp1024 \
srcid $local_gw dstid $remote_gw \
psk 

/etc/rc.d/isakmpd;
#!/bin/sh
#
# $OpenBSD: isakmpd,v 1.1 2011/07/06 18:55:36 robert Exp $

daemon="/sbin/isakmpd"

. /etc/rc.d/rc.subr

pexp="isakmpd: monitor \[priv\]"

rc_pre() {
    [ X"${sasyncd_flags}" != X"NO" ] && \
        daemon_flags="-S ${daemon_flags}"
    return 0
}

rc_stop() {
    # tear down flows/SAs only while this box is CARP master
    # (note -gt, not >, inside the test brackets)
    if [ `ifconfig | grep "status: master" | wc -l` -gt 0 ]; then
        ipsecctl -d -f /etc/ipsec.conf
    fi

    sleep 1
    if [ `ifconfig | grep "status: master" | wc -l` -gt 0 ]; then
        ipsecctl -d -f /etc/ipsec.conf
    fi
    if [ `ifconfig | grep "status: master" | wc -l` -gt 0 ]; then
        ipsecctl -F -f /etc/ipsec.conf
    fi

    pkill -f "^${pexp}"
}

rc_cmd $1

/etc/rc.d/sasyncd
#!/bin/sh
#
# $Open

Re: IPSec Packet Loss Help

2014-03-06 Thread Zach Leslie
On Thu, Mar 06, 2014 at 08:16:34PM +, Andy Lemin wrote:
> Hi, haven't read your original email but if my assumptions about your setup 
> are correct is the VPN tunnel dropping every now and then?

That's correct.  Daemons start up quickly, negotiations happen, and then
periodically the tunnel is just not available, despite the SAs being
available on the masters and the slaves.  Disabling -S on isakmpd and
turning off sasyncd makes the tunnel stay up for much longer, 7 hours
and counting.

> You need the static route to point to the internal interface to make sure 
> that packets generated by the firewall itself have a source IP set to the 
> internal net thus allowing the IPSec policy route to be used (as it defines 
> both the source and dest net, not just the dest net like a normal route).

This I have, and packets flow.  Still unclear about which route takes
precedence, encap or inet.

> We had to modify all our monitoring scripts to not 'phone home' if the box is 
> a backup. Only the master firewall can use the VPN.

I've ended up monitoring the host using the internal interface, which in
turn tells me the tunnel is available.

> I also submitted some suggested modifications to /etc/rc.d/sasyncd and 
> /etc/rc.d/isakmpd here in the past which makes the setup and failover of VPNs 
> much faster and more stable.

I did see those scripts, though they seem more aimed at solving the
startup time of the daemons.  My issue is more keeping the service up
than start time.

It sounds like your setup is similar to my own.  You don't see these
kinds of instability using sasyncd?  If you have a look at my OP, the
sasyncd.conf is in there.  It's possible I have a configuration error,
but just reading over the manpage again, I don't know what it would be.

This is really troubling me.

-- 
Zach



Re: IPSec Packet Loss Help

2014-03-06 Thread Andy Lemin
Hi, I haven't read your original email, but if my assumptions about your
setup are correct, is the VPN tunnel dropping every now and then?

I had a similar issue with 4 OBSD firewalls (2 at each end), all running 
isakmpd and sasyncd to keep the SAs in sync between a pair. With the tunnels 
explicitly set to use the CARP IPs as endpoints.

You need the static route to point to the internal interface to make sure that 
packets generated by the firewall itself have a source IP set to the internal 
net thus allowing the IPSec policy route to be used (as it defines both the 
source and dest net, not just the dest net like a normal route).

However whilst pinging the remote net from the live firewall works, I found 
that if you try and ping the remote net from the backup firewall, the packet 
does *not* get routed to the CARP master (why would it!). The IPSec route is 
there..

The backup firewall encaps the packet and sends it out its own external
interface using the details from sasyncd, but the master firewall at the other
end will see the packet as coming from 'another' box with the same details but
without the full correct policy, and so the tunnel is blocked for a period
until the master renegotiates it.

We had to modify all our monitoring scripts to not 'phone home' if the box is a 
backup. Only the master firewall can use the VPN.
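A guard like that can be a few lines of sh at the top of each monitoring script (an illustrative sketch; the carp interface name is an assumption):

```
#!/bin/sh
# Bail out early unless this firewall is currently CARP master on the
# internal interface (carp1 here is assumed); only the master may
# originate traffic over the VPN.
if ! ifconfig carp1 | grep -q 'status: master'; then
        exit 0          # backup box: stay silent, don't 'phone home'
fi
# ...normal monitoring / phone-home checks continue here...
```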

I also submitted some suggested modifications to /etc/rc.d/sasyncd and 
/etc/rc.d/isakmpd here in the past which makes the setup and failover of VPNs 
much faster and more stable.

Can dig out if you can't find them.

Cheers, Andy

Sent from my iPhone

> On 6 Mar 2014, at 19:00, Zach Leslie  wrote:
> 
> On Wed, Mar 05, 2014 at 11:05:11PM -0600, Amit Kulkarni wrote:
>>> If PF information is needed, I can provide and obscure, but I didn't
>>> expect it to be
>>> the issue.
>> 
>> i am no expert on this. but if it is a packet loss issue, you need to post
>> the obscured pf.conf
> 
> Fair point.  I've not seen any related dropped packets due to PF with
> tcpdump -nei pflog0, so I didn't think it would be related to PF.
> 
> Not line-wrapped for readability.
> 
> match out on em0 from ! (em0:network) to any nat-to (em0:0)
> block drop in log all
> pass out all flags S/SA
> pass out on em0 proto tcp all flags S/SA modulate state
> pass in proto icmp from  to 
> pass in proto ipv6-icmp from  to 
> pass in from  to !  flags S/SA
> pass in from  to any flags S/SA
> pass in proto udp from  to  port = 53
> pass out on em0 proto udp all
> pass out on em0 proto icmp all
> pass on em0 inet proto carp all
> pass on em0 proto icmp from any to (em0:network)
> pass on em1 inet proto pfsync all
> pass in on em0 inet proto udp from 66.77.88.10 to 1.2.3.5 port = 500
> pass in on em0 inet proto udp from 66.77.88.10 to 1.2.3.5 port = 4500
> pass in on em0 inet proto esp from 66.77.88.10 to 1.2.3.5
> pass in on enc0 inet proto ipencap from 66.77.88.10 to 1.2.3.5 keep state 
> (if-bound)
> pass out on em0 inet proto udp from 1.2.3.5 to 66.77.88.10 port = 500
> pass out on em0 inet proto udp from 1.2.3.5 to 66.77.88.10 port = 4500
> pass out on em0 inet proto esp from 1.2.3.5 to 66.77.88.10
> pass out on enc0 inet proto ipencap from 1.2.3.5 to 66.77.88.10 keep state 
> (if-bound)
> pass out on enc0 inet from 1.2.3.5 to  flags S/SA keep state 
> (if-bound)
> pass out on enc0 from  to  flags S/SA keep state 
> (if-bound)
> pass in on enc0 from  to  flags S/SA keep state 
> (if-bound)
> pass in on enc0 inet from 66.77.88.10 to  flags S/SA keep state 
> (if-bound)
> pass in on em0 inet proto tcp from 66.77.88.17 to any port = 22 flags S/SA
> pass in on em0 inet proto tcp from 1.2.3.7 to 1.2.3.6 port = 500 flags S/SA
> 
> This morning as a test, I've disabled the isakmpd sync feature and shut down
> sasyncd on all firewalls, as well as isakmpd on the secondaries at each
> location, and the connection seems much improved.  I've not lost
> any connections in the last 4 hours.
> 
> Not sure if sasyncd is actually causing the issue, but disabling it
> to gain an improved connection certainly doesn't seem great from an HA
> standpoint.
> 
> I've also got a couple static routes in the inet table that point the
> remote network to the internal carp address for routing purposes.  This
> allows the traffic generated by the secondary firewall to reach the
> remote network due to the fact that the secondary does not hold the
> master status of the carp, and therefore can't use the IPSec directly.
> 
> I do wonder though, since I also have a flow for the same network, the
> encap and inet routing table have a route for the same network.  Which
> takes priority?  Just something to point out since it could be causing
> troubles.
> 
> Regards,
> 
> -- 
> Zach



Re: IPSec Packet Loss Help

2014-03-06 Thread Zach Leslie
On Wed, Mar 05, 2014 at 11:05:11PM -0600, Amit Kulkarni wrote:
> > If PF information is needed, I can provide and obscure, but I didn't
> > expect it to be
> > the issue.
> >
> 
> i am no expert on this. but if it is a packet loss issue, you need to post
> the obscured pf.conf

Fair point.  I've not seen any related dropped packets due to PF with
tcpdump -nei pflog0, so I didn't think it would be related to PF.

Not line-wrapped for readability.

match out on em0 from ! (em0:network) to any nat-to (em0:0)
block drop in log all
pass out all flags S/SA
pass out on em0 proto tcp all flags S/SA modulate state
pass in proto icmp from  to 
pass in proto ipv6-icmp from  to 
pass in from  to !  flags S/SA
pass in from  to any flags S/SA
pass in proto udp from  to  port = 53
pass out on em0 proto udp all
pass out on em0 proto icmp all
pass on em0 inet proto carp all
pass on em0 proto icmp from any to (em0:network)
pass on em1 inet proto pfsync all
pass in on em0 inet proto udp from 66.77.88.10 to 1.2.3.5 port = 500
pass in on em0 inet proto udp from 66.77.88.10 to 1.2.3.5 port = 4500
pass in on em0 inet proto esp from 66.77.88.10 to 1.2.3.5
pass in on enc0 inet proto ipencap from 66.77.88.10 to 1.2.3.5 keep state 
(if-bound)
pass out on em0 inet proto udp from 1.2.3.5 to 66.77.88.10 port = 500
pass out on em0 inet proto udp from 1.2.3.5 to 66.77.88.10 port = 4500
pass out on em0 inet proto esp from 1.2.3.5 to 66.77.88.10
pass out on enc0 inet proto ipencap from 1.2.3.5 to 66.77.88.10 keep state 
(if-bound)
pass out on enc0 inet from 1.2.3.5 to  flags S/SA keep state 
(if-bound)
pass out on enc0 from  to  flags S/SA keep state (if-bound)
pass in on enc0 from  to  flags S/SA keep state (if-bound)
pass in on enc0 inet from 66.77.88.10 to  flags S/SA keep state 
(if-bound)
pass in on em0 inet proto tcp from 66.77.88.17 to any port = 22 flags S/SA
pass in on em0 inet proto tcp from 1.2.3.7 to 1.2.3.6 port = 500 flags S/SA

This morning as a test, I've disabled the isakmpd sync feature and shut down
sasyncd on all firewalls, as well as isakmpd on the secondaries at each
location, and the connection seems much improved.  I've not lost
any connections in the last 4 hours.

Not sure if sasyncd is actually causing the issue, but disabling it
to gain an improved connection certainly doesn't seem great from an HA
standpoint.

I've also got a couple static routes in the inet table that point the
remote network to the internal carp address for routing purposes.  This
allows the traffic generated by the secondary firewall to reach the
remote network due to the fact that the secondary does not hold the
master status of the carp, and therefore can't use the IPSec directly.

I do wonder though, since I also have a flow for the same network, the
encap and inet routing table have a route for the same network.  Which
takes priority?  Just something to point out since it could be causing
troubles.

Regards,

-- 
Zach



Re: IPSec Packet Loss Help

2014-03-05 Thread Zach Leslie
> OpenBSD 5.4 GENERIC#37 amd64

I've just booted the MP kernel on all four systems just to test and I am
still seeing the behaviour.  I can prompt the packet loss by generating
load on the CPU.  Running Puppet on the machines drives up the CPU usage
considerably, at which point my remote session hangs.  When CPU returns
to normal levels, the traffic flows again.

I also failed to mention that I am running OSPF at only one of the
locations right now, so a single pair of the firewalls.

-- 
Zach



IPSec Packet Loss Help

2014-03-05 Thread Zach Leslie
I've recently deployed a set of OpenBSD firewalls and nearing a time
when they need to go production, but I've got an issue that I can't nail
down.

I've got a pair of OpenBSD 5.4 systems running on Soekris 6501 at each
location, for a total of four firewalls.  Each pair is running the
sasyncd, pfsync, carp combo, all of which seems to be working correctly.

OpenBSD 5.4 GENERIC#37 amd64

* pfctl -ss shows the states are making it between peers
* carp fails over nicely, with maybe 1 packet lost in my icmp testing
* ipsecctl -sa shows associations sync to its peer
* IPSec connections are established to the remote datacenter

All of the above looks to be working.  I initially had both sides of the
IPSec configured in active mode, but I thought it might be causing my
packet loss, perhaps due to simultaneous initiation of the security
association, so I set passive on one of the locations.

The issue I am seeing is that traffic will periodically stop flowing
between the sites.  During this time, ipsecctl -sa shows that there are
associations, netstat -rn says routes are up.

My next thought was that perhaps phase 1 key exchange was happening too
frequently, so I increased its lifetime to 2 days, leaving phase 2 at the
default, but it still happens.  Even so, I expect increasing the timeout
only masks the actual problem.

I'm using nagios to monitor the connection, using the internal IPs of
the firewalls as the address to ping, and from the history of nagios, it
looks like I am having connection issues about every 15 minutes, and
these are only the ones that are detected.  Sometimes it's just a couple
of packets; sometimes it's down for a good 90 seconds.

Every five minutes or so I see this in daemon logs:
Mar  5 19:32:32 opdx-fw1 isakmpd[28109]: isakmpd: quick mode done (as
responder): src: 1.2.3.5 dst: 66.77.88.10

Which I expect, due to the phase 2 lifetime being set to the default
value.

We deploy the configs using Puppet, so the consistency across machines
should be solid, though that doesn't say much about errors in configs we
deploy.

I also wondered if the single-threaded kernel might play a part here;
CPU usage hits 80% when running Puppet.  Would booting the MP kernel
help?  The CPUs do support it.

In any case, I'm stuck.  My coworkers are all looking at me wondering
when they can purchase some shiny new commercial firewalls, and I'd
really like to have a success story here.  I can always switch to doing
some SSH tunnel, or OpenVPN or some such, but since OpenBSD has IPSec
built into core, a) I'd like to use it, and b) I expect it to work.  I'm
hoping someone on the list can point out something I am doing wrong.

This is the first time I've run OpenBSD in production, so my methods may
not be conventional.

IPSec is configured to talk to the remote site's CARP address, so below
that's 1.2.3.5 and 66.77.88.10.

Here are (I think) the relevant configs.  Please help.

# SiteA ipsec.conf
ike  esp from { 66.77.88.10 10.224.0.0/12 } to { 1.2.3.5
10.240.0.0/12 } local 66.77.88.10 peer 1.2.3.5 main auth
hmac-sha2-256 enc blowfish lifetime 172800 quick auth hmac-sha2-384 enc
blowfish psk "secret"

# SiteB ipsec.conf
ike passive esp from { 1.2.3.5 10.240.0.0/12 } to { 66.77.88.10
10.224.0.0/12 } local 1.2.3.5 peer 66.77.88.10 main auth
hmac-sha2-256 enc blowfish lifetime 172800 quick auth hmac-sha2-384 enc
blowfish psk "secret"

# rc.conf
ntpd_flags=""
isakmpd_flags="-K -S -v"
sasyncd_flags=""
ipsec=YES
syslogd_flags="-h"
snmpd_flags=YES

#sasyncd on one of the systems
peer 1.2.3.7
interface carp1
sharedkey 0xsuperlonghex

If PF information is needed, I can provide and obscure it, but I didn't
expect it to be the issue.

What else can I use that would help me troubleshoot this?  What more
information can I provide that would help narrow this down?

Regards,

-- 
Zach