Hello,

I don't know if this may help you, but I have a working BGP setup with two 
routers active/active.
I don't use pfsync, but keep state (sloppy).

This is less secure according to pf.conf(5), but that's not really a concern 
for me as those routers are not my border firewalls...
But maybe I am mistaken in doing this?
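
For context, pf.conf(5) selects the sloppy tracker per rule as a state option. A minimal sketch of what such a rule might look like; the macro name and interface are placeholders, not the actual ruleset from these routers:

```
# pf.conf fragment (illustrative only; int_if is a placeholder macro)
int_if = "em0"

# Track TCP state without checking sequence numbers. pf.conf(5) intends
# this for setups that do not see every packet of a connection, such as
# asymmetric routing, and warns that it makes some attacks easier.
pass in on $int_if proto tcp keep state (sloppy)
```

Per pf.conf(5), sloppy cannot be combined with modulate state or synproxy state.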

--
Regards,
Pierre BARDOU


-----Original Message-----
From: David Gwynne [mailto:da...@gwynne.id.au] 
Sent: Thursday, 4 July 2013 09:47
To: loic.b...@unix-experience.fr
Cc: misc@openbsd.org
Subject: Re: PF sync doesn't not work very well

On 03/07/2013, at 6:23 PM, Loïc Blot <loic.b...@unix-experience.fr> wrote:

> Okay, defer is now enabled on the pfsync interface (sorry for my last 
> idea, I didn't have the man page at hand :) ).
> It seems the problem isn't resolved.
> The transfer starts but stalls at a random time.
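
For reference, defer is a parameter of the pfsync interface itself, set with ifconfig(8), rather than of the WAN interface or the sync device. A sketch using the interface names from the ifconfig output quoted further down; the hostname.pfsync0 line is an assumption about how the setup might be made persistent:

```sh
# Enable deferral of the first packet of each new state (run as root).
ifconfig pfsync0 defer

# Check the result; the pfsync line should now read "defer: on".
ifconfig pfsync0

# To keep the setting across reboots, the option could go in
# /etc/hostname.pfsync0, for example:
#   up syncdev vlan995 defer
```

The setting can be reverted with `ifconfig pfsync0 -defer`.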

i have hit this too, despite being the person most responsible for trying to 
make pfsync work in active-active (hi bob!) configurations.

the problem is the tcp window tracking pf does, and how pfsync tries to cope 
with different routers being responsible for different halves of the packet 
flow. pfsync tries to merge each side of the tcp windows and tries to detect 
split paths to exchange updates more rapidly for those states. however, i find 
at some point the actual tcp windows move too fast for pfsync to keep up and 
all the real packets fall out of the window, causing the stalls you're talking 
about.

my solution is to try and prefer one half of the firewalls for all traffic, and 
use the second for handling failure. the split path handling works well enough 
that we can support traffic while we change roles (moving master to slave and 
slave to master) and the upstream hasn't figured it out yet via ospf.

sorry for the bad news. i might try and have a look at the state merge code 
again and see if there's something obvious i am missing.

cheers,
dlg

> --
> Best regards,
> 
> Loïc BLOT, Engineering
> UNIX Systems, Security and Networks
> http://www.unix-experience.fr
> 
> 
> On Wednesday, 03 July 2013 at 08:12 +0200, Loïc BLOT wrote:
>> Hi,
>> Thanks for your reply. I hadn't read this section carefully.
>> If I understand correctly, I must add the defer option to my WAN iface
>> (or am I wrong, and I must add it to my vlan995 iface?)
>> 
>> I will test it this morning and report back to misc :)
>> --
>> Best regards,
>> Loïc BLOT,
>> UNIX systems, security and network expert 
>> http://www.unix-experience.fr
>> 
>> 
>> On Wednesday, 03 July 2013 at 02:02 +0200, mxb wrote:
>>> pfsync(4) explains this:
>>> 
>>> " The pfsync interface will attempt to collapse multiple state updates into
>>>     a single packet where possible.  The maximum number of times a single
>>>     state can be updated before a pfsync packet will be sent out is
>>>     controlled by the maxupd parameter "
>>> 
>>> 
>>> and
>>> 
>>> " Where more than one firewall might actively handle packets, e.g. with
>>>     certain ospfd(8), bgpd(8) or carp(4) configurations, it is beneficial to
>>>     defer transmission of the initial packet of a connection.  The pfsync
>>>     state insert message is sent immediately; the packet is queued until
>>>     either this message is acknowledged by another system, or a timeout has
>>>     expired.  This behaviour is enabled with the defer parameter to
>>>     ifconfig(8).
>>> "
>>> 
>>> 
>>> Eg. "defer: on", yours is "off".
>>> 
>>> //mxb
>>> 
>>> 
>>> On 2 Jul 2013, at 21:54, Loïc BLOT <loic.b...@unix-experience.fr> wrote:
>>> 
>>>> Hi all,
>>>> I have a strange issue (or I haven't read the pfsync documentation
>>>> correctly, but I don't think that is the problem :D)
>>>> 
>>>> I'm using 2 OpenBSD machines as BGP+OSPF routers at the border of one site.
>>>> 
>>>> Those BGP routers are secured with a strict stateful PF ruleset, and 
>>>> stateful filtering is working very well on each router. Because of my 
>>>> full mesh BGP configuration, outgoing layer 7 sessions can leave my 
>>>> network through one router and the responses can come in through the other.
>>>> 
>>>> To resolve this issue, I have created a dedicated VLAN for the 
>>>> pfsync traffic and attached pfsync to this VLAN.
>>>> 
>>>> Here is a sample output of ifconfig on my first router:
>>>> 
>>>> vlan995: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>>>>       lladdr a0:36:9f:10:4a:a6
>>>>       priority: 0
>>>>       vlan: 995 parent interface: trunk1
>>>>       groups: vlan
>>>>       status: active
>>>>       inet6 fe80::a236:9fff:fe10:4aa6%vlan995 prefixlen 64 scopeid 0x10
>>>>       inet 10.117.1.129 netmask 0xfffffff8 broadcast 10.117.1.135
>>>> pfsync0: flags=41<UP,RUNNING> mtu 1500
>>>>       priority: 0
>>>>       pfsync: syncdev: vlan995 maxupd: 255 defer: off
>>>>       groups: carp pfsync
>>>> 
>>>> And here on my second router:
>>>> 
>>>> vlan995: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>>>>       lladdr a0:36:9f:17:e2:1e
>>>>       priority: 0
>>>>       vlan: 995 parent interface: trunk1
>>>>       groups: vlan
>>>>       status: active
>>>>       inet6 fe80::a236:9fff:fe17:e21e%vlan995 prefixlen 64 scopeid 0x10
>>>>       inet 10.117.1.130 netmask 0xfffffff8 broadcast 10.117.1.135
>>>> pfsync0: flags=41<UP,RUNNING> mtu 1500
>>>>       priority: 0
>>>>       pfsync: syncdev: vlan995 maxupd: 255 defer: off
>>>>       groups: carp pfsync
>>>> 
>>>> As you can see in the following tcpdump capture, the two routers do 
>>>> exchange pfsync traffic:
>>>> 
>>>> # tcpdump -nni vlan995
>>>> tcpdump: listening on vlan995, link-type EN10MB
>>>> tcpdump: WARNING: compensating for unaligned libpcap packets
>>>> 23:41:13.699617 10.117.1.130: PFSYNCv6 len 108
>>>>   act UPD ST COMP count 1
>>>>   ...
>>>> (DF) [tos 0x10]
>>>> 23:41:14.158500 10.117.1.129: PFSYNCv6 len 108
>>>>   act UPD ST COMP count 1
>>>>   ...
>>>> (DF) [tos 0x10]
>>>> 23:41:14.941396 SSTP STP config root=83e3.0:a:b8:7b:27:80 
>>>> rootcost=3
>>>> bridge=c3e3.0:17:e:2e:f:80 port=142 ifcost=130 age=1/0 max=20/0
>>>> hello=2/0 fwdelay=15/0 pvid=995
>>>> 23:41:14.949617 10.117.1.130: PFSYNCv6 len 108
>>>>   act UPD ST COMP count 1
>>>>   ...
>>>> (DF) [tos 0x10]
>>>> 23:41:15.237655 10.117.1.129: PFSYNCv6 len 640
>>>>   act UPD ST COMP count 1
>>>>   ...
>>>> (DF) [tos 0x10]
>>>> 23:41:15.949617 10.117.1.130: PFSYNCv6 len 124
>>>>   act UPD ST COMP count 1
>>>>   ...
>>>> (DF) [tos 0x10]
>>>> 23:41:16.255230 10.117.1.129: PFSYNCv6 len 36
>>>>   act DEL ST COMP count 1
>>>>       id: 51d16a3500006c33 creatorid: a10bbd21
>>>> (DF) [tos 0x10]
>>>> 23:41:16.946454 SSTP STP config root=83e3.0:a:b8:7b:27:80 
>>>> rootcost=3
>>>> bridge=c3e3.0:17:e:2e:f:80 port=142 ifcost=130 age=1/0 max=20/0
>>>> hello=2/0 fwdelay=15/0 pvid=995
>>>> 23:41:16.949619 10.117.1.130: PFSYNCv6 len 1116
>>>>   act UPD ST COMP count 13
>>>>   ...
>>>> (DF) [tos 0x10]
>>>> 
>>>> 
>>>> The problem is simple: when I initiate a stateful connection from 
>>>> one server, the return traffic (through the second router) is blocked 
>>>> by PF (I see the blocked return packets on pflog0).
>>>> 
>>>> To be precise, here is an example (and tested path):
>>>> 
>>>> OBSD NTP -> OBSD router 1 -> WAN...ftp.fr.openbsd.org...WAN -> OBSD 
>>>> router 2 || blocked
>>>> 
>>>> PF allows routed traffic in/out from this server, but incoming traffic 
>>>> from the WAN is blocked by default.
>>>> 
>>>> Can you confirm that pfsync should add a state for an outgoing TCP 
>>>> connection on the second router when the first router adds it?
>>>> Do you have any idea about this issue?
>>>> 
>>>> --
>>>> Best regards,
>>>> Loïc BLOT,
>>>> UNIX systems, security and network expert 
>>>> http://www.unix-experience.fr
>>>> 
