OpenOSPFd and kernel routing table (new variant)

2007-05-30 Thread Christian Plattner

Hi,

I am testing OpenBGPD and OpenOSPFD on a couple of Soekris boxes.
Even though I am using the latest code (-stable with ospfd kroute.c
revision 1.48), I am having problems with the kernel routing table
when OSPFD has to react to changes in the topology. I verified the
problem on a virtual setup (a couple of OpenBSD machines on an ESX
server), same result.

The problem can be summarized as follows: when I take down an interface
on one machine manually (e.g., ifconfig em1 down), OpenOSPFD on another
machine detects this without problems, and routes to subnets in the same
AS are adapted. However, the kernel still routes packets to destinations
outside of the AS over the dead link.

Workaround: when I restart ospfd, the kernel routing table is OK again.

Here is an example with 3 routers that I have put together using
ESX/VMWare:

                  /em1-(.1) --- 10.74.96.0/27  --- (.2)--em0\
   +--  (.22)-em0-[R1]                                       [R2]
   |              \em2-(.33) -- 10.74.96.32/27 -- (.34)--em1/
10.0.0.0/24
   |
   +--- (.1)-em1-[R0]-em0 -- (62.2.0.0/16)

Router R0: AS65002 announces 62.2.0.0/16 to R1
Router R1: AS65001 announces 10.74.96.0/21 to R0
Router R2: AS65001 has an IBGP session with R1
Loopback (lo1) addresses: R1=10.74.97.1, R2=10.74.97.2

This setup works fine: I can ping from R2 to machines in 62.2.0.0/16.
Traffic between R1 and R2 flows over the upper link.

However, let's assume that one of the links between R1 and R2 fails.

[R1] # ifconfig em1 down (so eventually R2 will find out that it does
not receive any OSPF packets on em0 anymore).

It takes a while, but then ospfd on R2 has calculated the new topology:

[R2] # ospfctl show rib
Destination      Nexthop       Path Type    Type      Cost
0.0.0.1          10.74.96.33   Intra-Area   Router    11
10.74.96.0/27    10.74.96.33   Intra-Area   Network   21
10.74.96.32/27   10.74.96.34   Intra-Area   Network   11
10.74.97.1/32    10.74.96.33   Intra-Area   Network   21
10.0.0.0/24      10.74.96.33   Type 1 ext   Network   111
(uptime column deleted, to comply with the 72 char restriction
of the mailing list).

[R2] # ospfctl show fib
flags: * = valid, O = OSPF, C = Connected, S = Static
Flags  Destination      Nexthop
*O     10.0.0.0/24      10.74.96.33
*      10.74.96.0/21    10.74.96.1
*C     10.74.96.0/27    link#1
*C     10.74.96.32/27   link#2
*O     10.74.97.1/32    10.74.96.33
*      10.74.97.2/32    10.74.97.2
*      62.2.0.0/16      10.74.96.1
*S     127.0.0.0/8      127.0.0.1
*C     127.0.0.1/8      link#0
*      127.0.0.1/32     127.0.0.1
*S     224.0.0.0/4      127.0.0.1

This is not good, as the route to 62.2.0.0/16 (learned via IBGP) still
points to 10.74.96.1, which is no longer directly reachable.
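
To make the stale entry easier to see, here is a minimal sketch that maps
each nexthop from the two FIB listings to the R1/R2 link whose subnet
contains it. The subnets and interface names come from the diagram above;
the LINKS table and the link_for() helper are made up for illustration
and are not part of ospfd or bgpd.

```python
import ipaddress

# Link subnets between R1 and R2, taken from the diagram in the post.
LINKS = {
    "upper (R1 em1 <-> R2 em0)": ipaddress.ip_network("10.74.96.0/27"),
    "lower (R1 em2 <-> R2 em1)": ipaddress.ip_network("10.74.96.32/27"),
}


def link_for(nexthop):
    """Return the name of the link whose subnet contains nexthop, or None."""
    addr = ipaddress.ip_address(nexthop)
    for name, net in LINKS.items():
        if addr in net:
            return name
    return None


# Nexthop of 62.2.0.0/16 before and after restarting ospfd.
for nh in ("10.74.96.1", "10.74.96.33"):
    print(nh, "->", link_for(nh))
```

10.74.96.1 sits on the upper link, the one that was taken down on R1,
while 10.74.96.33 sits on the lower link that is still working.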

Now let's kill and restart ospfd on R2, then check again:

# ospfctl show fib
flags: * = valid, O = OSPF, C = Connected, S = Static
Flags  Destination      Nexthop
*O     10.0.0.0/24      10.74.96.33
*      10.74.96.0/21    10.74.96.33
*C     10.74.96.0/27    link#1
*C     10.74.96.32/27   link#2
*O     10.74.97.1/32    10.74.96.33
*      10.74.97.2/32    10.74.97.2
*      62.2.0.0/16      10.74.96.33
*S     127.0.0.0/8      127.0.0.1
*C     127.0.0.1/8      link#0
*      127.0.0.1/32     127.0.0.1
*S     224.0.0.0/4      127.0.0.1

Voilà, now it looks OK =)

This is the ospfd.conf of R2:

password="gurke"
router-id 0.0.0.2
redistribute connected
redistribute static

area 0.0.0.0 {
        interface lo1

        interface em0 {
                metric 10
                auth-type simple
                auth-key $password
        }
        interface em1 {
                metric 11
                auth-type simple
                auth-key $password
        }
}

Any suggestions? Am I making a substantial error?

I did not want to make this posting too long, so if somebody is
interested in the detailed config files then I can make them
available.

Thanks,
- Christian



Re: OpenOSPFd and kernel routing table (new variant)

2007-05-31 Thread Claudio Jeker
On Wed, May 30, 2007 at 08:04:45PM +0200, Christian Plattner wrote:
> This is not good, as the route to 62.2.0.0/16 (learned via IBGP) still
> points to 10.74.96.1, which is no longer directly reachable.

This is a bgpd bug, because the 62.2/16 network is handled by bgpd.
I'm currently having a look at this. Not sure why the network does not
swing over to the working link, but hopefully I will find out.

-- 
:wq Claudio
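
A rough way to see which FIB entries ospfd did not install itself is to
look at the flags column: per the legend in the listing, O, C and S mark
OSPF, connected and static routes, so an entry carrying only "*" came from
somewhere else (in this setup, bgpd, as stated above). The sketch below
filters such entries from a few lines of the first FIB listing, with the
column spacing restored. The function name not_from_ospfd() is made up
for illustration; it only parses text and does not talk to ospfctl.

```python
# A few lines from the first `ospfctl show fib` listing in the thread.
FIB = """\
flags: * = valid, O = OSPF, C = Connected, S = Static
Flags  Destination      Nexthop
*O     10.0.0.0/24      10.74.96.33
*      10.74.96.0/21    10.74.96.1
*C     10.74.96.0/27    link#1
*      62.2.0.0/16      10.74.96.1
"""


def not_from_ospfd(text):
    """Return (destination, nexthop) pairs whose flags lack O, C and S."""
    result = []
    for line in text.splitlines():
        fields = line.split()
        # Route lines have exactly three columns and start with "*".
        if len(fields) != 3 or not fields[0].startswith("*"):
            continue
        flags, dest, nexthop = fields
        if not set(flags) & set("OCS"):
            result.append((dest, nexthop))
    return result


for dest, nexthop in not_from_ospfd(FIB):
    print(dest, "via", nexthop)
```

Both entries that still point at 10.74.96.1 fall into this group, which is
consistent with the problem being on the bgpd side.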



Re: OpenOSPFd and kernel routing table (new variant)

2007-05-31 Thread Claudio Jeker
On Thu, May 31, 2007 at 11:45:08PM +0200, Claudio Jeker wrote:
> This is a bgpd bug, because the 62.2/16 network is handled by bgpd.
> I'm currently having a look at this. Not sure why the network does not
> swing over to the working link, but hopefully I will find out.
> 

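And here is a preliminary diff for all the curious ones. bgpd needs to
track changes of routes with F_NEXTHOP checked and report them to the RDE.
The RDE will then update all active routes that use this nexthop. Seems to
work for me.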

Re: OpenOSPFd and kernel routing table (new variant)

2007-06-01 Thread Christian Plattner

I applied the diff manually to -stable (watch out for
path_updateall/prefix_updateall), and now it works perfectly.

Thanks, Claudio!


> And here is a preliminary diff for all the curious ones. bgpd needs to
> track changes of routes with F_NEXTHOP checked and report them to the RDE.
> The RDE will then update all active routes that use this nexthop. Seems to
> work for me.
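
The idea described in the quoted text (watch the route toward each
nexthop, and when it changes, update every active route that uses that
nexthop) can be sketched as a toy model. This is not the diff and not
bgpd's actual data structures: the NexthopTracker class, its method names,
and the choice of R1's loopback 10.74.97.1 as the BGP nexthop are all
assumptions made for illustration. The thread does not say which address
the IBGP session uses.

```python
class NexthopTracker:
    """Toy model: remember which prefixes depend on which nexthop."""

    def __init__(self):
        self.via = {}         # nexthop -> gateway currently used to reach it
        self.dependents = {}  # nexthop -> set of prefixes using that nexthop
        self.fib = {}         # prefix -> gateway installed for that prefix

    def add_route(self, prefix, nexthop):
        """Register a route and install it via the nexthop's current gateway."""
        self.dependents.setdefault(nexthop, set()).add(prefix)
        self.fib[prefix] = self.via.get(nexthop)

    def nexthop_changed(self, nexthop, gateway):
        """The route toward a tracked nexthop changed: update all dependents."""
        self.via[nexthop] = gateway
        for prefix in self.dependents.get(nexthop, ()):
            self.fib[prefix] = gateway


tracker = NexthopTracker()
tracker.nexthop_changed("10.74.97.1", "10.74.96.1")   # upper link in use
tracker.add_route("62.2.0.0/16", "10.74.97.1")
tracker.add_route("10.74.96.0/21", "10.74.97.1")

# OSPF moves the path to R1 onto the lower link; dependents follow.
tracker.nexthop_changed("10.74.97.1", "10.74.96.33")
print(tracker.fib)
```

Without the second nexthop_changed() call, both prefixes would keep the
old gateway, which corresponds to the stale FIB entries in the original
report.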