Broadcast behavior in 4.7 [Was: Re: Trying to set diskless(8) -- hanging in "RPC timeout for server"]

2010-05-11 Thread Pascal Lalonde
I just happened to run into the same issue right after upgrading to 4.7
(however, you mention 4.6, so I'm uncertain we're dealing with the same
cause).

Basically, the issue I'm seeing is that portmap/rpc.bootparamd don't see
the incoming packets for 172.16.255.255 (my own network being
172.16.5.0/25, so broadcast is 172.16.5.127).

There were some changes made to sys/netinet/in.c, especially rev 1.56.

As far as I know, the diskless machine cannot learn its netmask through
RARP, so it assumes a netmask based on the class of the network it is
in, hence the 172.16.255.255 broadcast. Before rev 1.56 of
netinet/in.c, it seems the kernel would accept broadcasts for the
broadcast address associated with your network "class". Or at least that's
the behavior I observe when running "portmap -d". After updating to rev
1.56 or later, portmap/rpc.bootparamd no longer see the requests for
172.16.255.255.
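To illustrate that fallback (a quick sketch written for this mail, not
code from the tree; the 172.16.5.10 client address is just an example),
this is roughly the classful broadcast a netmask-less client ends up
with:

#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>

/* Broadcast address implied by the address class alone (no netmask). */
static in_addr_t
classful_broadcast(in_addr_t addr)
{
    in_addr_t host = ntohl(addr);
    in_addr_t net;

    if (IN_CLASSA(host))
        net = IN_CLASSA_NET;
    else if (IN_CLASSB(host))
        net = IN_CLASSB_NET;
    else
        net = IN_CLASSC_NET;

    return htonl(host | ~net);
}

int
main(void)
{
    struct in_addr in, bcast;

    inet_aton("172.16.5.10", &in);          /* example client address */
    bcast.s_addr = classful_broadcast(in.s_addr);
    printf("%s\n", inet_ntoa(bcast));       /* 172.16.255.255, not 172.16.5.127 */
    return 0;
}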

As a workaround, I got by either by keeping a 4.6 kernel around to
answer the bootparam requests, or by forcing a broadcast address of
172.16.255.255 on the bootparamd server. Not particularly clean, but it
did the trick.
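For the record, "forcing" the broadcast address on the server was
nothing fancier than something along these lines (the interface name and
host address are placeholders, not our real config):

# ifconfig em0 inet 172.16.5.1 netmask 255.255.255.128 broadcast 172.16.255.255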

As for a permanent fix, I am unsure. I don't know of any way other than
RARP to do diskless in OpenBSD, at least on i386/amd64.

Any thoughts?

--
Pascal



On Wed, May 12, 2010 at 12:30:39AM +0200, Stefan Unterweger wrote:
> * Fred Crowson on Tue, May 11, 2010 at 10:43:09PM +0100:
> > What does your dhcpd.conf look like on your server?
>
> I have several subnets served via DHCP, so I have reported only
> the relevant one together with the global options:
>
> | server-name "Neu-Sorpigal";
> | option domain-name "intranet.aleturo.com";
> | default-lease-time 86400;
> |
> | shared-network wired {
> |     option domain-name "wired.intranet.aleturo.com";
> |     option domain-name-servers 172.23.12.2;
> |     option netbios-name-servers 172.23.12.2;
> |     option routers 172.23.12.2;
> |
> |     filename "pxeboot";
> |     next-server 172.23.12.2;
> |     option root-path "/export/client/";
> |
> |     subnet 172.23.0.0 netmask 255.255.0.0 {
> |         allow unknown-clients;
> |         range 172.23.13.128 172.23.13.254;
> |     }
> | }
>
> I've added the "next-server" and "root-path" options just now,
> since I've seen them mentioned in pxeboot(8). Prior to that, only
> the "filename" directive was there. Everything else, however,
> including the tcpdumps, is unimpressed by that.
>
> > It might be worth having -vv and -X on your tcpdump; it might provide
> > more info as to the problem.
>
> I didn't include the dump from phase 2, where pxeboot and the
> kernel are served by tftp and what else, since that's an insane
> amount of data. This tcpdump was started just before the kernel
> tried to connect to NFS, that is, before the second burst.
>
> | $ tcpdump -X -vv -n -s 160 -i em0 host 172.23.13.138
> | tcpdump: listening on em0, link-type EN10MB
> | 00:19:48.612571 rarp reply 00:00:e2:87:e8:76 at 172.23.13.138
> |   : 0001 0800 0604 0004 000e 0c06 be26 ac17  >&,.
> |   0010: 0c02  e287 e876 ac17 0d8ab.hv,...
> |
> | 00:19:48.613207 arp who-has 172.23.13.138 tell 172.23.13.138
> |   : 0001 0800 0604 0001  e287 e876 ac17  ..b.hv,.
> |   0010: 0d8a    ac17 0d8a    ,...
> |   0020:          ..
> |
> | 00:19:48.630322 172.23.13.138.718 > 172.23.255.255.111: [udp sum ok] udp 96 (ttl 64, id 65499, len 124)
> |   : 4500 007c ffdb  4011 14dd ac17 0d8a  E..|...@..],...
> |   0010: ac17  02ce 006f 0068 eac4 90ad 0bca  ,..N.o.hjD.-.J
> |   0020:    0002 0001 86a0  0002  ... 
> |   0030:  0005  0001  0014    
> |   0040:          
> |   0050:     0001 86ba  0001  ...:
> |   0060:  0001  0014  0001  00ac  ...,
> |   0070:  0017  000d  008a
> |
> | 00:19:49.620480 172.23.13.138.718 > 172.23.255.255.111: [udp sum ok] udp 96 (ttl 64, id 60019, len 124)
> |   : 4500 007c ea73  4011 2a45 ac17 0d8a  E..|j...@.*e,...
> |   0010: ac17  02ce 006f 0068 eac4 90ad 0bca  ,..N.o.hjD.-.J
> |   0020:    0002 0001 86a0  0002  ... 
> |   0030:  0005  0001  0014    
> |   0040:          
> |   0050:     0001 86ba  0001  ...:
> |   0060:  0001  0014  0001  00ac  ...,
> |   0070:  0017  000d  008a
> |
> | 00:19:51.620513 172.23.13.138.718 > 172.23.255.255.111: [udp sum ok] udp 96 (ttl 64, id 63711, len 124)
> |   : 4500 007c f8df  4011 1bd9 ac17 0d8a  E..|x...@..y,...
> |   0010: ac17  02ce 006f 0068 eac4 90ad 0bca  ,..N.o.hjD.-.J
> |   0020:    0002

Re: Removing pf_pool

2010-01-13 Thread Pascal Lalonde
On Wed, Jan 13, 2010 at 01:58:30PM +0900, Ryan McBride wrote:
> 
> My first thought is to wonder why you're not running with a symmetrical
> cluster. But I realise that we are not always in control of such things,
> and one of PF's functions is to help people work around bad network
> design.

Right on. We depend heavily on "weights". We have a site that receives
many hits/sec, with a bunch of dual quad-core machines behind it, processing
heavy pages (which we have no control over ;-). Even though most have the
same amount of RAM and cores, a difference in processor model requires
such a weight adjustment to keep a machine from going overboard.
We're tight on resources, both computing and monetary ... a common story
I suppose.


> There are a few things you can do here to get a similar effect.
> 
> 2) Use the 'probability' keyword 
> 
>   pass quick on em0 inet proto tcp from any to 192.168.100.100 \
>   probability 50% rdr-to 10.0.0.1
>   pass quick on em0 inet proto tcp from any to 192.168.100.100 \
>   probability 70% rdr-to 10.0.0.2
>   pass quick on em0 inet proto tcp from any to 192.168.100.100 \
>   rdr-to 10.0.0.3

I hadn't thought of this one. It might be a good solution for us.
Thanks for the tip.
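For the archives, here is how I read the composition of those three
rules (assuming, per pf.conf(5), that "quick" stops evaluation at the
first matching rule and that "probability" matches that fraction of the
packets reaching the rule): the first rule takes 50% of new connections;
the second only ever sees the remaining 50% and matches 70% of that,
i.e. 0.5 * 0.7 = 35%; the final catch-all gets the leftover 15%. That is
roughly the 3:2:1 (50/33/17) split the repeated-IP list was emulating.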


> The changes just committed are actually cleanup that needs to happen if
> you want to see some more intelligent weighted load balancing in PF than
> these hacks. But that is still a long way off, definitely after 4.7.

Still, I'm very glad to hear that the idea has been floating around.

Thanks a lot,

-- 
Pascal



Removing pf_pool

2010-01-12 Thread Pascal Lalonde
I just caught the following from openbsd-cvs:

http://marc.info/?l=openbsd-cvs&m=126326657232193&w=2

If my understanding is correct, this means that it will become
impossible to emulate weighted round robin with constructs like the one
below, since duplicate IPs will be "flattened" once converted to a
standard PF table?

rdr on em0 inet proto tcp \
    from any to 192.168.100.100 port = www -> {
        10.0.0.1, 10.0.0.1, 10.0.0.1, \
        10.0.0.2, 10.0.0.2, \
        10.0.0.3 \
    } round-robin

Is this right?

-- 
Pascal



Re: ifstated with carp0

2009-09-30 Thread Pascal Lalonde
On Mon, Sep 28, 2009 at 08:06:36AM +0200, Laurent CARON wrote:
> On 28/09/2009 04:28, Steven Surdock wrote:
>> ...
>> HERE IS IFSTATED DETECTING THE FAILOVER, WHICH SHOULD HAVE HAPPENED ON
>> SEP 25, BUT DIDN'T
>> Sep 26 14:19:03 fw2 ifstated[16189]: changing state to normal
>> Sep 26 14:19:03 fw2 ifstated[16189]: running date|mail -s 'FW2 is now
>> the backup firewall' root
<...>
>
> I feel happy not to be the only one experiencing this behavior, although  
> this might be a config error on both sides ;)
>

This looks quite familiar to me as well. Have a look here:

http://marc.info/?l=openbsd-misc&m=124942995116023&w=2

Could you try testing CARP failover and monitoring with "route -n monitor"?

If it's really a bug in ifstated, route monitor should catch all the
state changes, I suppose. But in my case it didn't. It would be nice if
someone else could confirm the behavior I'm seeing.

Thanks,
-- 
Pascal



Re: ifstated with multiple CARP interfaces

2009-08-04 Thread Pascal Lalonde
On Tue, Aug 04, 2009 at 01:20:17AM +, Stuart Henderson wrote:
> I don't understand what you mean by "VLAN on carp1", can you explain it
> a bit more please?

My bad. I confused things a little.

It's as you say, carpdevs set to vlan interfaces. In this case, carp1010
and carp1011 have vlan1010 and vlan1011 respectively as carpdevs, and
those vlans both have em3 as their parent interface. There is also
carp3 that has em3 as its carpdev.

So:
carp0 (index 11) carpdev em0
carp1 (index 12) carpdev em1
carp1010 (index 13) carpdev vlan1010 (which is a vlan on em3)
carp1011 (index 14) carpdev vlan1011 (which is another vlan on em3)
carp2 (index 15) carpdev em2
carp3 (index 16) carpdev em3
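For reference, that stacking is created along these lines (the vhid and
passphrase below are placeholders; only the interface names and the
10.0.10.250 address match the route output further down):

# ifconfig vlan1010 vlan 1010 vlandev em3
# ifconfig carp1010 create
# ifconfig carp1010 vhid 110 pass mekmitasdigoat carpdev vlan1010 10.0.10.250 netmask 255.255.255.0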


> Do you see the same result from other software e.g. "route -n monitor"?
> (use recent -current [or, if your dns is ok, remove the -n option] to
> display interface names rather than index numbers) 

Neat. I didn't know about route monitor. So I tried again with route -n
monitor, with 4.5 GENERIC.MP.

Here's the output after setting it to master:


got message of size 208 on Tue Aug  4 19:45:06 2009
RTM_IFINFO: iface status change: len 208, if# 12, link: master, 
flags:
got message of size 104 on Tue Aug  4 19:45:06 2009
RTM_DELETE: Delete Route: len 104, priority 0, table 0, pid: 0, seq 0, errno 3
flags:
locks:  inits: 
sockaddrs: 
 10.0.1.250
got message of size 120 on Tue Aug  4 19:45:06 2009
RTM_ADD: Add Route: len 120, priority 0, table 0, pid: 0, seq 0, errno 0
flags:
locks:  inits: 
sockaddrs: 
 10.0.1.250 10.0.1.250
got message of size 100 on Tue Aug  4 19:45:06 2009
RTM_NEWADDR: address being added to iface: len 100, metric 0, flags:
sockaddrs: 
 ::::: 00:00:5e:00:01:78 fe80::200:5eff:fe00:178%carp1
got message of size 136 on Tue Aug  4 19:45:06 2009
RTM_ADD: Add Route: len 136, priority 0, table 0, pid: 0, seq 0, errno 0
flags:
locks:  inits: 
sockaddrs: 
 fe80::200:5eff:fe00:178%carp1 00:00:5e:00:01:78
got message of size 104 on Tue Aug  4 19:45:06 2009
RTM_DELETE: Delete Route: len 104, priority 0, table 0, pid: 0, seq 0, errno 3
flags:
locks:  inits: 
sockaddrs: 
 10.0.1.15
got message of size 132 on Tue Aug  4 19:45:06 2009
RTM_ADD: Add Route: len 132, priority 0, table 0, pid: 0, seq 0, errno 17
flags:
locks:  inits: 
sockaddrs: 
 10.0.1.0 10.0.1.15 255.255.255.0 default
got message of size 208 on Tue Aug  4 19:45:06 2009
RTM_IFINFO: iface status change: len 208, if# 13, link: master, 
flags:
got message of size 104 on Tue Aug  4 19:45:06 2009
RTM_DELETE: Delete Route: len 104, priority 0, table 0, pid: 0, seq 0, errno 3
flags:
locks:  inits: 
sockaddrs: 
 10.0.10.250
got message of size 120 on Tue Aug  4 19:45:06 2009
RTM_ADD: Add Route: len 120, priority 0, table 0, pid: 0, seq 0, errno 0
flags:
locks:  inits: 
sockaddrs: 
 10.0.10.250 10.0.10.250
got message of size 104 on Tue Aug  4 19:45:06 2009
RTM_NEWADDR: address being added to iface: len 104, metric 0, flags:
sockaddrs: 
 ::::: 00:00:5e:00:01:6e fe80::200:5eff:fe00:16e%carp1010
got message of size 136 on Tue Aug  4 19:45:06 2009
RTM_ADD: Add Route: len 136, priority 0, table 0, pid: 0, seq 0, errno 0
flags:
locks:  inits: 
sockaddrs: 
 fe80::200:5eff:fe00:16e%carp1010 00:00:5e:00:01:6e
got message of size 208 on Tue Aug  4 19:45:06 2009
RTM_IFINFO: iface status change: len 208, if# 11, link: master, 
flags:
got message of size 104 on Tue Aug  4 19:45:06 2009
RTM_DELETE: Delete Route: len 104, priority 0, table 0, pid: 0, seq 0, errno 3
flags:
locks:  inits: 
sockaddrs: 
 10.137.16.192
got message of size 120 on Tue Aug  4 19:45:06 2009
RTM_ADD: Add Route: len 120, priority 0, table 0, pid: 0, seq 0, errno 0
flags:
locks:  inits: 
sockaddrs: 
 10.137.16.192 10.137.16.192
got message of size 100 on Tue Aug  4 19:45:06 2009
RTM_NEWADDR: address being added to iface: len 100, metric 0, flags:
sockaddrs: 
 ::::: 00:00:5e:00:01:6e fe80::200:5eff:fe00:16e%carp0
got message of size 136 on Tue Aug  4 19:45:06 2009
RTM_ADD: Add Route: len 136, priority 0, table 0, pid: 0, seq 0, errno 0
flags:
locks:  inits: 
sockaddrs: 
 fe80::200:5eff:fe00:16e%carp0 00:00:5e:00:01:6e
got message of size 208 on Tue Aug  4 19:45:06 2009
RTM_IFINFO: iface status change: len 208, if# 14, link: master, 
flags:


And back to slave:


got message of size 208 on Tue Aug  4 19:45:32 2009
RTM_IFINFO: iface status change: len 208, if# 11, link: backup, 
flags:
got message of size 104 on Tue Aug  4 19:45:32 2009
RTM_DELETE: Delete Route: len 104, priority 0, table 0, pid: 0, seq 0, errno 0
flags:
locks:  inits: 
sockaddrs: 
 10.137.16.192
got message of size 136 on Tue Aug  4 19:45:32 2009
RTM_DELETE: Delete Route: len 136, priority 0, table 0, pid: 0, seq 0, errno 0
flags:
locks:  inits: 
sockaddrs: 
 fe80::200:5eff:fe00:16e%carp0 00:00:5e:00:01:6e
got message of size 100 on Tue Aug  4 19:45:32 2009
RTM_DELADDR: address being removed from iface: len 100, metric 0, flags:
sockaddrs: 
 ::::: 00:00:5e:00:01:6e f

ifstated with multiple CARP interfaces

2009-07-27 Thread Pascal Lalonde
Hello,

we have a problem with ifstated detecting state changes on multiple CARP
interfaces.

After digging deeper, it seems that reading from the routing socket does
not give us all the state changes we'd expect. We tried with the latest
snapshot kernel and got the same behavior.

Our CARP interfaces are as follows:
carp0
carp1
carp1010 (VLAN on carp1)
carp1011 (Another VLAN on carp1)
carp2
carp3

The condition we'd like to test is:
carp_up = 'carp0.link.up \
&& carp1.link.up \
&& carp1010.link.up \
&& carp1011.link.up \
&& carp2.link.up \
&& carp3.link.up'

Doing a quick check with a small C program (see below), and playing
with the demote counter, we can clearly see why the condition is not
always met as it should be:

# ./getifinfo &
[1] 20942
# ifconfig -g carp carpdemote 50
carp0 -> LINK_STATE_DOWN
carp2 -> LINK_STATE_DOWN
carp3 -> LINK_STATE_DOWN
carp1010 -> LINK_STATE_DOWN
carp1011 -> LINK_STATE_DOWN
# ifconfig -g carp -carpdemote 50
carp1 -> LINK_STATE_UP
carp1010 -> LINK_STATE_UP
carp0 -> LINK_STATE_UP
carp2 -> LINK_STATE_UP
carp1011 -> LINK_STATE_UP
carp3 -> LINK_STATE_UP
# ifconfig -g carp carpdemote 50
carp0 -> LINK_STATE_DOWN
carp1 -> LINK_STATE_DOWN
carp2 -> LINK_STATE_DOWN
carp3 -> LINK_STATE_DOWN
carp1010 -> LINK_STATE_DOWN
# ifconfig -g carp -carpdemote 50
carp0 -> LINK_STATE_UP
carp1 -> LINK_STATE_UP
carp1010 -> LINK_STATE_UP
carp1011 -> LINK_STATE_UP
# 

The question is: is it normal that the routing socket doesn't report all
the changes? Is anyone else having similar issues?

Thanks in advance,
--
Pascal


getifinfo.c:

#include <sys/types.h>
#include <sys/socket.h>
#include <net/if.h>
#include <net/route.h>
#include <err.h>
#include <ifaddrs.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

char *if_states[] = {
    "LINK_STATE_UNKNOWN",
    "LINK_STATE_DOWN",
    "LINK_STATE_UP",
    "LINK_STATE_HALF_DUPLEX",
    "LINK_STATE_FULL_DUPLEX"
};

int
main(int argc, char **argv)
{
    int                  rt_fd;
    char                 msg[2048];
    struct ifaddrs      *ifap, *ifa;
    struct rt_msghdr    *rtm = (struct rt_msghdr *)&msg;
    struct if_msghdr    *ifm = (struct if_msghdr *)&msg;
    int                  len;
    char                 ifs[64][16];

    if (getifaddrs(&ifap))
        err(1, "getifaddrs");

    /* map interface indexes to interface names */
    for (ifa = ifap; ifa != NULL; ifa = ifa->ifa_next) {
        strlcpy(ifs[if_nametoindex(ifa->ifa_name)], ifa->ifa_name, 16);
    }

    freeifaddrs(ifap);

    if ((rt_fd = socket(PF_ROUTE, SOCK_RAW, 0)) < 0)
        err(1, "no routing socket");

    /* print every link-state change (RTM_IFINFO) seen on the routing socket */
    while ((len = read(rt_fd, msg, sizeof(msg))) > 0) {
        if (len < sizeof(struct rt_msghdr)) {
            warnx("len < sizeof(struct rt_msghdr)");
            continue;
        }

        if (rtm->rtm_version != RTM_VERSION)
            continue;

        if (rtm->rtm_type != RTM_IFINFO)
            continue;

        printf("%s -> %s\n", ifs[ifm->ifm_index],
            if_states[ifm->ifm_data.ifi_link_state]);
    }
    return 0;
}



Re: state key linking mismatch w/GRE, since 4.5

2009-06-30 Thread Pascal Lalonde
On Fri, Jun 12, 2009 at 05:56:43AM +0200, Henning Brauer wrote:
> * Pascal Lalonde  [2009-06-12 00:28]:
> > Jun 11 18:08:19 celeborn /bsd: pf: state key linking mismatch! dir=OUT,
> > if=bge0, stored af=2, a0: 10.136.192.199:30285, a1: 10.216.8.1:22,
> > proto=6, found af=2, a0: AAA.AAA.AAA.AAA, a1: BBB.BBB.BBB.BBB, proto=47.
> > Jun 11 18:08:21 celeborn /bsd: pf: state key linking mismatch! dir=OUT,
> > if=bge0, stored af=2, a0: 10.136.248.119:42137, a1: 10.137.0.130:993,
> > proto=6, found af=2, a0: AAA.AAA.AAA.AAA, a1: BBB.BBB.BBB.BBB, proto=47.
> 
> fixed in -current and no need to worry really

Small followup on this, for people who happen to run into the same
problem.

We were just bitten by this issue. With our smaller VPN gateways (<10
flows with ESP/GRE), the extra logging didn't cause any issues. But once
we upgraded our main VPN endpoint (roughly 176 flows) to 4.5, it seems it
didn't like the amount of printf()s generated; the load made it
unusable, causing CARP flapping, with very high (>80%) interrupt time.
Fortunately we still had our other node on 4.4 to fall back to.

I can confirm that on our test setup with a -current kernel, those messages
don't show up anymore.

In the meantime, we applied the following to let us control whether we
wish to see those warnings or not:


--- sys/net/pf.c.orig   Tue Jun 30 18:13:34 2009
+++ sys/net/pf.c    Tue Jun 30 18:44:00 2009
@@ -860,19 +860,22 @@
return (0);
else {
/* mismatch. must not happen. */
-   printf("pf: state key linking mismatch! dir=%s, "
-   "if=%s, stored af=%u, a0: ",
-   dir == PF_OUT ? "OUT" : "IN", kif->pfik_name, a->af);
-   pf_print_host(&a->addr[0], a->port[0], a->af);
-   printf(", a1: ");
-   pf_print_host(&a->addr[1], a->port[1], a->af);
-   printf(", proto=%u", a->proto);
-   printf(", found af=%u, a0: ", b->af);
-   pf_print_host(&b->addr[0], b->port[0], b->af);
-   printf(", a1: ");
-   pf_print_host(&b->addr[1], b->port[1], b->af);
-   printf(", proto=%u", b->proto);
-   printf(".\n");
+   if (pf_status.debug >= PF_DEBUG_MISC) {
+   printf("pf: state key linking mismatch! dir=%s, "
+   "if=%s, stored af=%u, a0: ",
+   dir == PF_OUT ? "OUT" : "IN", kif->pfik_name,
+   a->af);
+   pf_print_host(&a->addr[0], a->port[0], a->af);
+   printf(", a1: ");
+   pf_print_host(&a->addr[1], a->port[1], a->af);
+   printf(", proto=%u", a->proto);
+   printf(", found af=%u, a0: ", b->af);
+   pf_print_host(&b->addr[0], b->port[0], b->af);
+   printf(", a1: ");
+   pf_print_host(&b->addr[1], b->port[1], b->af);
+   printf(", proto=%u", b->proto);
+   printf(".\n");
+   }
return (-1);
}
 }
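With the patch applied, the messages stay hidden at the default debug
level and can be brought back when needed with pfctl's stock debug knob
(nothing the patch adds), e.g. "pfctl -x misc" to see them again and
"pfctl -x urgent" to silence them.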


-- 
Pascal



Weighted round robin

2009-06-14 Thread Pascal Lalonde
Hello,

I was wondering about how to achieve some kind of weighted round-robin
with OpenBSD...

So far, we can achieve some limited weighted round-robin by using rdr's
with lists, and repeating the stronger nodes in the list.
(Is there a limit to the number of nodes in a list??)

This is what we use on our webserver pool. Having this kind of control
is necessary, since pretty much every unused machine we have ends
up in the webserver pool to meet growing demand.

The only sad thing is that we can't use relayd to automate removal of
failed nodes, because we don't use tables. We're planning to resort to a
scripted solution that generates a ruleset based on our own host checks
(using rdr lists, anchors, etc.). But before taking this path, I was
wondering if people on misc@ had found clever ways of achieving the same
result without too much scripting.
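To sketch the idea we had in mind (the anchor name and the addresses
below are made up, and I haven't tested this exact incantation; relayd
itself does something similar with its "relayd/*" rdr-anchor): the main
ruleset would carry a translation anchor,

rdr-anchor "webpool"

and the check script would rewrite just that anchor whenever a node's
status changes:

# echo 'rdr on em0 inet proto tcp from any to 192.168.100.100 port www -> { 10.0.0.1, 10.0.0.1, 10.0.0.2 } round-robin' | pfctl -a webpool -f -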

Thanks in advance,
-- 
Pascal



Re: state key linking mismatch w/GRE, since 4.5

2009-06-12 Thread Pascal Lalonde
On Fri, Jun 12, 2009 at 05:56:43AM +0200, Henning Brauer wrote:
> * Pascal Lalonde  [2009-06-12 00:28]:
> > Jun 11 18:08:19 celeborn /bsd: pf: state key linking mismatch! dir=OUT,
> > if=bge0, stored af=2, a0: 10.136.192.199:30285, a1: 10.216.8.1:22,
> > proto=6, found af=2, a0: AAA.AAA.AAA.AAA, a1: BBB.BBB.BBB.BBB, proto=47.
> > Jun 11 18:08:21 celeborn /bsd: pf: state key linking mismatch! dir=OUT,
> > if=bge0, stored af=2, a0: 10.136.248.119:42137, a1: 10.137.0.130:993,
> > proto=6, found af=2, a0: AAA.AAA.AAA.AAA, a1: BBB.BBB.BBB.BBB, proto=47.
> 
> fixed in -current and no need to worry really

Good to hear!

Many thanks,
-- 
Pascal



state key linking mismatch w/GRE, since 4.5

2009-06-11 Thread Pascal Lalonde
Hello,

recently we upgraded some of our firewalls from OpenBSD 4.4 to 4.5.
Since then, we've been getting loads of the following message
(external addresses substituted with AAA's and BBB's):

Jun 11 18:08:19 celeborn /bsd: pf: state key linking mismatch! dir=OUT,
if=bge0, stored af=2, a0: 10.136.192.199:30285, a1: 10.216.8.1:22,
proto=6, found af=2, a0: AAA.AAA.AAA.AAA, a1: BBB.BBB.BBB.BBB, proto=47.
Jun 11 18:08:21 celeborn /bsd: pf: state key linking mismatch! dir=OUT,
if=bge0, stored af=2, a0: 10.136.248.119:42137, a1: 10.137.0.130:993,
proto=6, found af=2, a0: AAA.AAA.AAA.AAA, a1: BBB.BBB.BBB.BBB, proto=47.


Relevant states, taken right after the errors showed up in syslog:

all gre BBB.BBB.BBB.BBB <- AAA.AAA.AAA.AAA   MULTIPLE:MULTIPLE
all tcp 10.216.8.1:22 <- 10.136.192.199:30285 ESTABLISHED:ESTABLISHED
all tcp 10.136.192.199:30285 -> 10.216.8.1:22 ESTABLISHED:ESTABLISHED
all tcp 10.137.0.130:993 <- 10.136.248.119:42137 FIN_WAIT_2:FIN_WAIT_2
all tcp 10.136.248.119:42137 -> 10.137.0.130:993 FIN_WAIT_2:FIN_WAIT_2

gre25: flags=9011 mtu 1476
description: TUNNELING-10/8
priority: 0
groups: gre
physical address inet BBB.BBB.BBB.BBB --> AAA.AAA.AAA.AAA
inet6 fe80::204:23ff:feb1:73c4%gre25 ->  prefixlen 64 scopeid 0x12
inet 192.168.253.136 --> 192.168.136.253 netmask 0x

Internet:
DestinationGatewayFlags   Refs  Use   Mtu  Prio Iface
defaultBBB.BBB.BBB.CCCUGS4  1317018 - 8 bge0
10/8   192.168.136.253UGS0   769241 - 8 gre25
10.136.248/21  link#4 UC140 - 4 em3
BBB.BBB.BBB.0/27   link#9 UC110 - 4 bge0
...


Status: Enabled for 0 days 02:24:21   Debug: Urgent

State Table  Total Rate
  current entries 6281   
  searches14179937 1637.2/s
  inserts   586841   67.8/s
  removals  580560   67.0/s
Counters
  match 498717   57.6/s
  bad-offset 00.0/s
  fragment   00.0/s
  short  00.0/s
  normalize  00.0/s
  memory 00.0/s
  bad-timestamp  00.0/s
  congestion 00.0/s
  ip-option  00.0/s
  proto-cksum00.0/s
  state-mismatch280.0/s
  state-insert   50.0/s
  state-limit00.0/s
  src-limit  00.0/s
  synproxy   00.0/s


This is happening only on firewalls where we use GRE tunnels.

I guess that rev. 1.618 of pf.c, which was added in 4.5, is causing
those messages to appear. But we're not experiencing any network
problems despite the errors.

The ruleset being a bit lengthy, I left it out, but can send it
on demand.

Is there need to worry about those errors? 

Thanks,
-- 
Pascal



Re: relayd - Hosts flapping unexpectedly

2009-05-28 Thread Pascal Lalonde
On Thu, May 21, 2009 at 11:05:40AM +0100, Dan Carley wrote:
> 
> We've been playing with relayd recently - both from 4.5 and the latest
> snapshot.
> 
> Approximately every hour we are seeing one or two state changes logged. But
> I can't see a reason for the change of state and there doesn't appear to be a
> pattern in the way that the hosts are failed.

We just happened to notice the same thing here.

Here's the info I could gather on this, but I suspect the
problem might not be relayd itself.
problem might not be relayd itself.

My relayd configuration is as such:

relayd.conf:

interval 5
log updates
timeout 3000

table <floods> {
10.0.1.10
10.0.2.10
10.0.10.10
}

redirect test2 {
listen on 10.0.1.15 port 30099
forward to <floods> check tcp
}

redirect test {
listen on 10.137.16.192 port 30100
forward to <floods> check tcp
}


# relayctl show summary
Id  Type      Name            Avlblty  Status
1   redirect  test2                    active
1   table     floods:30099             active (3 hosts)
1   host      10.0.1.10       100.00%  up
2   host      10.0.2.10       100.00%  up
3   host      10.0.10.10      100.00%  up
2   redirect  test                     active
2   table     floods:30100             active (3 hosts)
4   host      10.0.1.10       100.00%  up
5   host      10.0.2.10       100.00%  up
6   host      10.0.10.10      100.00%  up


Now, at random times (one or two per hour on average), we get the following
error in the logs:

May 26 18:00:31 testfw1 relayd[25554]: host 10.0.1.10,
check tcp (0ms), state up -> down, availability 99.92%
May 26 18:00:36 testfw1 relayd[25554]: host 10.0.1.10,
check tcp (0ms), state down -> up, availability 99.92%

But we can confirm that the service does not actually go down. The
firewalls are redundant with the same relayd config, and they don't see
the service going down at the same time (they do, however, both show the
same up/down behavior).

Adding some debugging code to relayd, I found that connect() returns
EADDRINUSE at check_tcp.c:87. This seemed strange at first, since a few
lines above, SO_REUSEPORT is set on the socket. Also, the firewalls
used to test this are almost idle, with fewer than 100 sockets open at a
time, mostly used by relayd performing TCP checks. So we're clearly not
running out of ephemeral ports.

Just for the sake of trying, I took the CVS source for relayd,
commented out the SO_REUSEPORT option, recompiled and restarted it.
Strangely, now the up/down's are gone. I would expect SO_REUSEPORT to
prevent EADDRINUSE errors, so I'm a bit puzzled...
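In case someone wants to poke at this, below is the kind of minimal
standalone test I used to reason about it. It only mimics the pattern
(set SO_REUSEPORT, then connect, in a loop like repeated TCP checks); it
is not relayd's code, and the address/port are placeholders:

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <err.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
    struct sockaddr_in   sin;
    int                  s, on = 1, i;

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_len = sizeof(sin);
    sin.sin_port = htons(30099);            /* placeholder check port */
    inet_aton("10.0.1.10", &sin.sin_addr);  /* placeholder host */

    /* open/close a few connections in a row, like repeated TCP checks */
    for (i = 0; i < 5; i++) {
        if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0)
            err(1, "socket");
        if (setsockopt(s, SOL_SOCKET, SO_REUSEPORT, &on, sizeof(on)) < 0)
            err(1, "setsockopt");
        if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) < 0)
            warn("connect %d", i);          /* EADDRINUSE would show here */
        else
            printf("connect %d ok\n", i);
        close(s);
    }
    return 0;
}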

Could anyone help shed light on this?

Thanks,
-- 
Pascal



relayctl host disable doesn't loop through all hosts

2009-03-31 Thread Pascal Lalonde
Hello,

I've been playing with relayd lately. There is a behavior that seems
unintuitive, and I was wondering whether it is a bug or the intended
behavior.

When I try to disable a host (e.g.: relayctl host disable 10.0.1.101),
and that host is part of more than one table, only the first occurrence
gets disabled. I'm testing with relayd from the Feb 28th snapshot.

I would expect it to disable all occurrences, since disabling by ID
already lets you choose specific instances of that host.

# relayctl show summary
Id  Type      Name          Avlblty  Status
1   redirect  test                   active
1   table     test:8080              active (3 hosts)
1   host      10.0.1.101    100.00%  up
2   host      10.0.1.102    100.00%  up
3   host      10.0.1.103    100.00%  up
2   redirect  test2                  active
2   table     test2:3                active (6 hosts)
4   host      10.0.1.101    100.00%  up
5   host      10.0.1.102    100.00%  up
6   host      10.0.1.103    100.00%  up
7   host      10.0.1.104    100.00%  up
8   host      10.0.1.105    100.00%  up
9   host      10.0.1.106    100.00%  up
# relayctl host disable 10.0.1.101
command succeeded
# relayctl show summary
Id  Type      Name          Avlblty  Status
1   redirect  test                   active
1   table     test:8080              active (2 hosts)
1   host      10.0.1.101             disabled
2   host      10.0.1.102    100.00%  up
3   host      10.0.1.103    100.00%  up
2   redirect  test2                  active
2   table     test2:3                active (6 hosts)
4   host      10.0.1.101    100.00%  up
5   host      10.0.1.102    100.00%  up
6   host      10.0.1.103    100.00%  up
7   host      10.0.1.104    100.00%  up
8   host      10.0.1.105    100.00%  up
9   host      10.0.1.106    100.00%  up
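For what it's worth, until this changes, the second instance can still be
disabled explicitly by its id (4 in the summary above); if I read
relayctl(8) right, the id is accepted in place of the name:

# relayctl host disable 4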

Thanks in advance!
-- 
Pascal