bgpctl(8): Print MPLS label info in show rib detail output

2021-11-29 Thread Mitchell Krome
Hi,

I noticed bgpctl didn't seem to have a way to show the MPLS label
attached to MPLS L3VPN routes. The label was already there in the prefix
info it just wasn't printed, so this diff adds printing the label info
if the prefix has one in the show rib detail command. 

Example output below. I was originally going to add it between Nexthop
and Neighbor but when I did that I thought it could easily be mistaken
to be the MPLS label of the next hop instead of the VPN label. So
instead I put it on a new line.

BGP routing table entry for rd 4372800702:702 172.19.195.2/31
4372800702 4372800702 4372800702 65521
Nexthop 172.17.0.7 (via 172.17.6.30) Neighbor 172.17.0.3 (172.17.0.3)
Origin IGP, metric 0, localpref 100, weight 0, ovs not-found, internal, 
valid
Label 524280
Last update: 01:45:32 ago
Ext. Communities: rt 4372800702:702
Originator Id: 172.17.0.7
Cluster Id List:  0.0.0.255




diff --git a/usr.sbin/bgpctl/output.c b/usr.sbin/bgpctl/output.c
index 22c7dcce2..a9b1e09cd 100644
--- a/usr.sbin/bgpctl/output.c
+++ b/usr.sbin/bgpctl/output.c
@@ -19,6 +19,8 @@
  * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
  */

+#include 
+
 #include 
 #include 
 #include 
@@ -936,6 +938,8 @@ show_rib_detail(struct ctl_show_rib *r, u_char *asdata, 
size_t aslen,
 {
struct in_addr   id;
char*aspath, *s;
+   u_int32_tlabel;
+   int  i;

printf("\nBGP routing table entry for %s/%u%c",
log_addr(>prefix), r->prefixlen,
@@ -961,9 +965,22 @@ show_rib_detail(struct ctl_show_rib *r, u_char *asdata, 
size_t aslen,
fmt_origin(r->origin, 0), r->med, r->local_pref, r->weight,
fmt_ovs(r->validation_state, 0));
printf("%s", fmt_flags(r->flags, 0));
+   printf("%c", EOL0(flag0));

-   printf("%cLast update: %s ago%c", EOL0(flag0),
-   fmt_timeframe(r->age), EOL0(flag0));
+   if (r->prefix.labellen) {
+   printf("Label");
+   for (i = 0; i < r->prefix.labellen / 3; ++i) {
+   printf(" ");
+   label = (r->prefix.labelstack[i * 3] << 12) |
+   (r->prefix.labelstack[i * 3 + 1] << 4) |
+   (r->prefix.labelstack[i * 3 + 2] >> 4);
+   printf("%u", label);
+   }
+   printf("%c", EOL0(flag0));
+   }
+
+   printf("Last update: %s ago%c", fmt_timeframe(r->age),
+   EOL0(flag0));
 }

 static void



rtsock: Check prefixlength before deciding an existing L2 cached route is the same as a new route being added

2021-11-27 Thread Mitchell Krome
Hi,

I ran into a problem where when using /31 netmasks on point to point
links, I am unable to add a larger summary route that happens to have
the same network address as the /31 link route the box has. It's a
bit hard to explain so hopefully the art below helps.


bsd1rtr2
  +-+ +-+
  | vio0  172.19.6.33/31|<--->|172.19.6.32/31  if1  |
  +-+ +-+
172.19.6.32/27 -> 172.19.6.32

$route show
...
172.17.6.32/31 172.17.6.33UCn10 - 4 vio0
172.17.6.3250:40:61:d6:05:e3  UHLch 79 5959 - 3 vio0
172.17.6.3352:54:00:e7:0c:c5  UHLl   0   70 - 1 vio0

When I try to add the /27 route, it comes back with invalid argument

$ doas route add 172.17.6.32/27 172.17.6.32
add net 172.17.6.32/27: gateway 172.17.6.32: Invalid argument

I managed to track this down to a check in rtsock.c where it thinks that
the route I am adding is a modification of the existing L2 route for
172.19.7.32. The Invalid Argument is because later on in the RTM_CHANGE
code it wants a AF of the new nexthop to be the same as the route it
thinks its changing. When using /30 routes this would never happen as
you wouldn't get a host on the network address I guess. 

Diff below fixed the problem by also checking the prefix length matches
before assuming the route isn't a new route and jumping into RTM_CHANGE
code.

The code in change: also checks the AF are the same, before it will
do a modification so I'm not sure if this should also check that before
assuming the route isn't actually a new one?

As far as I could tell reading the other code, it seemed like the only
way to get RTF_CACHED flag was to be a L2 nexthop, so I don't think that
there would be a case where the prefix length of the new and old entries
don't match but the intent was to go through the change path. 




diff --git a/sys/net/rtsock.c b/sys/net/rtsock.c
index a717d112e..9dbc8ca90 100644
--- a/sys/net/rtsock.c
+++ b/sys/net/rtsock.c
@@ -937,7 +937,10 @@ rtm_output(struct rt_msghdr *rtm, struct rtentry **prt,
 * cached route because this can lead to races in the
 * receive path.  Instead we update the L2 cache.
 */
-   if ((rt != NULL) && ISSET(rt->rt_flags, RTF_CACHED)) {
+   plen = rtable_satoplen(info->rti_info[RTAX_DST]->sa_family,
+   info->rti_info[RTAX_NETMASK]);
+   if ((rt != NULL) && (plen == rt->rt_plen) &&
+   ISSET(rt->rt_flags, RTF_CACHED)) {
ifp = if_get(rt->rt_ifidx);
if (ifp == NULL) {
rtfree(rt);



bfd: respond to poll sequence from peer

2019-06-03 Thread Mitchell Krome
Hi,

Testing bfd against frr on linux, their bfd implementation sends polls
as soon as the session is up even if the session timers haven't
changed from what it was advertising while it was in the down state.
Currently openbsd bfd doesn't respond to polls, so this diff adds that
support. tcpdump output during session setup (.9 is openbsd):

14:56:31.225339 10.10.20.9.58974 > 10.10.20.10.bfd-control: [udp sum ok] BFD v1 
length 24 state down flags [] diag none my-discrim 3396727743 your-discrim 0 
mintx 100 minrx 100 minecho 0 multiplier 3 [to s 0xc0] (ttl 255, id 
48014, len 52)
14:56:31.533645 10.10.20.10.49143 > 10.10.20.9.bfd-control: [udp sum ok] BFD v1 
length 24 state init flags [] diag neighbor-down my-discrim 2 your-discrim 
3396727743 mintx 150 minrx 150 minecho 5 mul tiplier 3 (DF) [tos 
0xc0] (ttl 255, id 36838, len 52)
14:56:32.022601 10.10.20.9.58974 > 10.10.20.10.bfd-control: [udp sum ok] BFD v1 
length 24 state up flags [] diag none my-discrim 3396727743 your-discrim 2 
mintx 100 minrx 150 minecho 0 multiplier 3 [tos 0xc0] (ttl 255, id 
21474, len 52)
14:56:32.023134 10.10.20.10.49143 > 10.10.20.9.bfd-control: [udp sum ok] BFD v1 
length 24 state up flags [P] diag none my-discrim 2 your-discrim 3396727743 
mintx 150 minrx 150 minecho 5 multiplier 3 (DF) [tos 0xc0] (ttl 
255, id 36952, len 52)
14:56:32.023207 10.10.20.9.58974 > 10.10.20.10.bfd-control: [udp sum ok] BFD v1 
length 24 state up flags [F] diag none my-discrim 3396727743 your-discrim 2 
mintx 150 minrx 150 minecho 0 multiplier 3 [tos 0xc0] (ttl 255, id 
23805, len 52)
14:56:32.997091 10.10.20.10.49143 > 10.10.20.9.bfd-control: [udp sum ok] BFD v1 
length 24 state up flags [] diag none my-discrim 2 your-discrim 3396727743 
mintx 150 minrx 150 minecho 5 multiplier 3 ( DF) [tos 0xc0] (ttl 
255, id 36991, len 52)

I also added some handling for generating polls and receiving finals while
in there, but there isn't any code to actually start our own poll
sequence just yet.

I only have frr and openbsd peers to test with - if anybody has
something else hooked up would be good to check what they do.

Mitchell


diff --git sys/net/bfd.c sys/net/bfd.c
index 42995531a8a..2ae287a15bb 100644
--- sys/net/bfd.c
+++ sys/net/bfd.c
@@ -741,6 +741,8 @@ bfd_reset(struct bfd_config *bfd)
 
bfd->bc_mode = BFD_MODE_ASYNC;
bfd->bc_state = BFD_STATE_DOWN;
+   bfd->bc_poll_seq = 0;
+   bfd->bc_poll_rcvd = 0;
 
/* rfc5880 6.8.18 */
bfd->bc_neighbor->bn_lstate = BFD_STATE_DOWN;
@@ -825,7 +827,10 @@ bfd_input(struct bfd_config *bfd, struct mbuf *m)
bfd->bc_neighbor->bn_rdiscr = ntohl(peer->bfd_my_discriminator);
bfd->bc_neighbor->bn_rstate = state;
bfd->bc_neighbor->bn_rdemand = (flags & BFD_FLAG_D);
-   bfd->bc_poll = (flags & BFD_FLAG_F);
+
+   if (flags & BFD_FLAG_F && bfd->bc_poll_seq) {
+   bfd->bc_poll_seq = 0;
+   }
 
/* Local change to the algorithm, we don't accept below 50ms */
if (ntohl(peer->bfd_required_min_rx_interval) < BFD_MINIMUM)
@@ -891,6 +896,12 @@ bfd_input(struct bfd_config *bfd, struct mbuf *m)
 
bfd->bc_error = 0;
 
+   /* Reply to poll if we aren't down */
+   if (flags & BFD_FLAG_P && bfd->bc_state > BFD_STATE_DOWN) {
+   bfd->bc_poll_rcvd = 1;
+   bfd_send_control(bfd);
+   }
+
  discard:
bfd->bc_neighbor->bn_rdiag = diag;
m_free(m);
@@ -979,6 +990,13 @@ bfd_send_control(void *x)
 
h->bfd_ver_diag = ((BFD_VERSION << 5) | (bfd->bc_neighbor->bn_ldiag));
h->bfd_sta_flags = (bfd->bc_state << 6);
+   /* Can't send a poll and a final in the same packet. */
+   if (bfd->bc_poll_rcvd) {
+   h->bfd_sta_flags |= BFD_FLAG_F;
+   bfd->bc_poll_rcvd = 0;
+   } else if (bfd->bc_poll_seq) {
+   h->bfd_sta_flags |= BFD_FLAG_P;
+   }
h->bfd_detect_multi = bfd->bc_neighbor->bn_mult;
h->bfd_length = BFD_HDRLEN;
h->bfd_my_discriminator = htonl(bfd->bc_neighbor->bn_ldiscr);
diff --git sys/net/bfd.h sys/net/bfd.h
index 3e8da45086f..8ee372faa5d 100644
--- sys/net/bfd.h
+++ sys/net/bfd.h
@@ -143,7 +143,8 @@ struct bfd_config {
time_t   bc_lastuptime;
unsigned int bc_laststate;
unsigned int bc_state;
-   unsigned int bc_poll;
+   unsigned int bc_poll_seq;
+   unsigned int bc_poll_rcvd;
unsigned int bc_error;
uint32_t bc_minrx;
uint32_t bc_mintx;



bfd: Fixup state and flags bitmask

2019-06-02 Thread Mitchell Krome
I copied the defines from the sys/net bfd code to use in tcpdump and
noticed that I wasn't seeing any poll bits from the remote side, even
though tcpdump on the remote side showed it was sending them. Turns out
the bitmask for state and flags is a bit off. Diff below fixes them to
match what's in the RFC (and the comment a bit higher in the file)

Mitchell


diff --git sys/net/bfd.c sys/net/bfd.c
index 0d2ec8af512..db8d571f35b 100644
--- sys/net/bfd.c
+++ sys/net/bfd.c
@@ -90,8 +90,8 @@ struct bfd_auth_header {
 #define BFD_VERSION1   /* RFC 5880 Page 6 */
 #define BFD_VER(x) (((x) & 0xe0) >> 5)
 #define BFD_DIAG(x)((x) & 0x1f)
-#define BFD_STATE(x)   (((x) & 0xf0) >> 6)
-#define BFD_FLAGS(x)   ((x) & 0x0f)
+#define BFD_STATE(x)   (((x) & 0xc0) >> 6)
+#define BFD_FLAGS(x)   ((x) & 0x3f)
 #define BFD_HDRLEN 24  /* RFC 5880 Page 37 */
 #define BFD_AUTH_SIMPLE_LEN16 + 3  /* RFC 5880 Page 10 */
 #define BFD_AUTH_MD5_LEN   24  /* RFC 5880 Page 11 */



BFD: fix reporting of up->down transition in routing message

2019-05-25 Thread Mitchell Krome
Hi,

I was playing with BFD and noticed that the transition from up to down
didnt seem to be reported correctly in the routing message. The uptime
didn't seem to reset and both state and laststate show as down.

RTM_BFD: bidirectional forwarding detection: len 136
BFD: async state down remote admindown laststate down error 0
 diag none remote none
 discr 1766582563 remote 1427874800
 uptime 05m42s last state time 01s
 mintx 100 minrx 100 minecho 0 multiplier 3

I tracked this down to what I think is a typo. The code for updating the
laststate and timer is looking at laststate before it's been updated.
Patch below makes it work as expected

RTM_BFD: bidirectional forwarding detection: len 136
BFD: async state down remote admindown laststate up error 0
 diag none remote none
 discr 967007404 remote 1766582563
 uptime 00s last state time 10s
 mintx 100 minrx 100 minecho 0 multiplier 3

Mitchell



diff --git sys/net/bfd.c sys/net/bfd.c
index 0d2ec8af512..e1f778ecc8e 100644
--- sys/net/bfd.c
+++ sys/net/bfd.c
@@ -927,7 +928,7 @@ bfd_set_state(struct bfd_config *bfd, unsigned int state)
bfd->bc_laststate = bfd->bc_state;
/* FALLTHROUGH */
case BFD_STATE_DOWN:
-   if (bfd->bc_laststate == BFD_STATE_UP) {
+   if (bfd->bc_state == BFD_STATE_UP) {
bfd->bc_laststate = bfd->bc_state;
bfd_set_uptime(bfd);
}



tcp_trace print flag names instead of f's

2019-04-05 Thread Mitchell Krome
The macro for printing flags just prints a bunch of f's in an order you
can't tell a whole lot from. This code looks like it hasn't been touched
in 20 years so maybe an old compiler used to do something different?
This makes it print the flag names.

Before:
0x808b4cc0 ESTABLISHED:output [93471a53..93471a98)@f8f54269, urp=0 
-> ESTABLISHED

After:
0x808b4ee0 ESTABLISHED:output [91d4a412..91d4a457)@bdd41aae, 
urp=0 -> ESTABLISHED

Mitchell


diff --git sys/netinet/tcp_debug.c sys/netinet/tcp_debug.c
index 4ebf93214..21e888cf2 100644
--- sys/netinet/tcp_debug.c
+++ sys/netinet/tcp_debug.c
@@ -200,7 +200,7 @@ tcp_trace(short act, short ostate, struct tcpcb *tp, struct 
tcpcb *otp,
flags = th->th_flags;
if (flags) {
char *cp = "<";
-#define pf(f) { if (th->th_flags_##f) { printf("%s%s", cp, "f"); cp = ","; 
} }
+#define pf(f) { if (th->th_flags_##f) { printf("%s%s", cp, #f); cp = ","; } 
}
pf(SYN); pf(ACK); pf(FIN); pf(RST); pf(PUSH); pf(URG);
printf(">");
}



Re: ospfd: Apply netmask to stub prefixes before adding the route to the route table

2019-04-02 Thread Mitchell Krome
On 2/04/2019 3:30 pm, Remi Locherer wrote:
> Hi Mitchell
> 
> On Sat, Mar 30, 2019 at 04:10:09PM +1000, Mitchell Krome wrote:
>> I kept finding I had a lingering /30 route when I turned off one of my
>> test boxes. I tracked it down to ospfd sending RTM_ADD for a stub
>> network with the non-masked prefix. The RTM_ADD path applies the mask
>> inside the kernel, so the route got added as expected, but the
>> RTM_DELETE enforces an exact match, so it could never remove the route.
>>
>> The advertised stub network was as follows:
>>
>> Link connected to: Stub Network
>>  Link ID (Network ID): 10.10.20.2
>> Link Data (Network Mask): 255.255.255.252
>>  Metric: 10
> 
> Please send the details of your setup so it is easy to reproduce the issue.
> - OpenBSD version
> - ospfd.conf
> - interface configs
> - routing table

I am running a kernel I compiled myself with source from ~2 weeks ago.
See the bottom for other info.

> 
>> ospfd sends the interface address rather than network address as the
>> link ID. The RFC says "set the Link ID of the Type 3 link to the
>> subnet's IP address" which to me means we probably should also apply the
>> mask before we add the stub to the LSA to avoid getting into this place
>> to start with? 
> 
> This only applies to Type 3 LSAs. Below table is from RFC 2328
> chapter 12.1.4:
> 
> LS Type   Link State ID
> ___
> 1 The originating router's Router ID.
> 2 The IP interface address of the
>   network's Designated Router.
> 3 The destination network's IP address.
> 4 The Router ID of the described AS
>   boundary router.
> 5 The destination network's IP address.
> 
>>
>> The patch below just masks the stub network before it gets added to the
>> route table, so that we can properly delete it. I can send a patch to
>> mask it before sending the LSA too if the consensus is that is how it
>> should be.
> 
> With your patch you change the case "LSA_TYPE_ROUTER" (LS Type 1) and not
> LS type 3.

Inside the LSA type 1 there is a type 3 link which is a "stub network".
That is what I was changing. Under 12.4.1.1 second dotpoint it says for
a point to point network add a type 3 link. Maybe I got the terminology
wrong, but this was definitely the thing I intended to change

   Link type   Description   Link ID
   __
   1   Point-to-pointNeighbor Router ID
   link
   2   Link to transit   Interface address of
   network   Designated Router
   3   Link to stub  IP network number
   network
   4   Virtual link  Neighbor Router ID


   Table 18: Link descriptions in the
  router-LSA.


> 
> Remi
> 

Box 1:

openbsd1# cat /etc/hostname.lo1
inet 10.0.0.1/32

openbsd1# cat /etc/hostname.gre0
up
mpls
tunnel 10.10.10.1 10.10.10.2
tunneldomain 10
inet 10.10.20.1/30
dest 10.10.20.2

openbsd1# cat /etc/ospfd.conf
area 0.0.0.0 {
interface lo1 {
passive
}
interface gre0
}

Box 2:

openbsd2# cat /etc/hostname.lo1
inet 10.0.0.2/32

openbsd2# cat /etc/hostname.gre0
up
mpls
tunnel 10.10.10.2 10.10.10.1
tunneldomain 10
inet 10.10.20.2/30
dest 10.10.20.1

openbsd2# cat /etc/hostname.gre1
up
mpls
tunnel 10.10.10.5 10.10.10.6
tunneldomain 10
inet 10.10.20.5/30
dest 10.10.20.6

openbsd2# cat /etc/ospfd.conf
area 0.0.0.0 {
interface lo1 {
passive
}
interface gre0
interface gre1
}

Box 3:

openbsd3# cat /etc/hostname.lo1
inet 10.0.0.3/32

openbsd3# cat /etc/hostname.gre0
up
mpls
tunnel 10.10.10.6 10.10.10.5
tunneldomain 10
inet 10.10.20.6/30
dest 10.10.20.5

openbsd3# cat /etc/ospfd.conf
area 0.0.0.0 {
interface lo1 {
passive
}
interface gre0
}


1: Box 1 has ospfd disabled. Route table on box 3:

openbsd3# route show
Routing tables

Internet:
DestinationGatewayFlags   Refs  Use   Mtu  Prio Iface
10.0.0.2/3210.10.20.5 UG 11 -32 gre0
10.0.0.3   10.0.0.3   UHl0  156 32768 1 lo1
10.10.20.0/30  10.10.20.5 UG 00 -32 gre0
10.10.20.5 10.10.20.6 UHh5  431 - 8 gre0
10.10.20.6 10.10.20.5 UHl0  1

ospfd: Apply netmask to stub prefixes before adding the route to the route table

2019-03-30 Thread Mitchell Krome
I kept finding I had a lingering /30 route when I turned off one of my
test boxes. I tracked it down to ospfd sending RTM_ADD for a stub
network with the non-masked prefix. The RTM_ADD path applies the mask
inside the kernel, so the route got added as expected, but the
RTM_DELETE enforces an exact match, so it could never remove the route.

The advertised stub network was as follows:

Link connected to: Stub Network
Link ID (Network ID): 10.10.20.2
Link Data (Network Mask): 255.255.255.252
Metric: 10

ospfd sends the interface address rather than network address as the
link ID. The RFC says "set the Link ID of the Type 3 link to the
subnet's IP address" which to me means we probably should also apply the
mask before we add the stub to the LSA to avoid getting into this place
to start with? 

The patch below just masks the stub network before it gets added to the
route table, so that we can properly delete it. I can send a patch to 
mask it before sending the LSA too if the consensus is that is how it
should be.

Mitchell

diff --git usr.sbin/ospfd/rde_spf.c usr.sbin/ospfd/rde_spf.c
index 736f2e575..d842a2c20 100644
--- usr.sbin/ospfd/rde_spf.c
+++ usr.sbin/ospfd/rde_spf.c
@@ -195,7 +195,7 @@ rt_calc(struct vertex *v, struct area *area, struct 
ospfd_conf *conf)
if (rtr_link->type != LINK_TYPE_STUB_NET)
continue;
 
-   addr.s_addr = rtr_link->id;
+   addr.s_addr = rtr_link->id & rtr_link->data;
adv_rtr.s_addr = htonl(v->adv_rtr);
 
rt_update(addr, mask2prefixlen(rtr_link->data),



Re: ospfd: Warn when the router ID changes during config reload

2019-03-23 Thread Mitchell Krome


On 24/03/2019 7:23 am, Theo de Raadt wrote:
> Sebastian Benoit  wrote:
> 
>> Mitchell Krome(mitchellkr...@gmail.com) on 2019.03.23 20:27:17 +1000:
>>> Was messing around with ospf and got myself into a situation where the
>>> router ID's were the same on two boxes because I only did a reload on
>>> one of them when I changed the loopback IP's.
>>
>> Thats sub optimal i believe...
>>
>>> This adds a warning when reloading if the router ID changes (there was
>>> already a comment saying as much). Same patch can probably be applied to
>>> ospf6d if people think it's useful.
>>
>> I think it would be better to abort the reload if the router-id is changed,
>> i.e. not load the new config at all.
> 
> That's the right approach in all our other daemons:
> 
> if the configuration change cannot be installed correctly, consider
> it invalid and abort.  Someone would need to write code to make it
> valid..
> 

That makes sense. I checked the manuals for the routers I use at work
and they also required the ospf process to be restarted for the config
to take effect after changing the router id.

I moved the check up into ospf_reload because it doesn't make sense
sending the config to all the children if we know we're going to abort.

Mitchell


diff --git a/usr.sbin/ospfd/ospfd.c b/usr.sbin/ospfd/ospfd.c
index d01a2fa66..bc2d007a7 100644
--- a/usr.sbin/ospfd/ospfd.c
+++ b/usr.sbin/ospfd/ospfd.c
@@ -642,6 +642,13 @@ ospf_reload(void)
if ((xconf = parse_config(conffile, ospfd_conf->opts)) == NULL)
return (-1);

+   /* Abort the reload if rtr_id changed */
+   if (ospfd_conf->rtr_id.s_addr != xconf->rtr_id.s_addr) {
+   log_warnx("Router-ID changed in new configuration. This "
+   "requires ospfd to be restarted");
+   return (-1);
+   }
+
/* send config to childs */
if (ospf_sendboth(IMSG_RECONF_CONF, xconf, sizeof(*xconf)) == -1)
return (-1);
@@ -693,7 +700,6 @@ merge_config(struct ospfd_conf *conf, struct
ospfd_conf *xconf)
struct redistribute *r;
int  rchange = 0;

-   /* change of rtr_id needs a restart */
conf->flags = xconf->flags;
conf->spf_delay = xconf->spf_delay;
conf->spf_hold_time = xconf->spf_hold_time;