Hello,

I have a fascination with networking, as some might be aware. I think that a proper network design is the solid foundation underneath a cloud, allowing it to scale and providing the flexibility an organization requires.

I've worked a lot on the VXLAN+EVPN+BGP integration in CloudStack, and I think it's a great solution that should be the default for anybody deploying CloudStack today.

VXLAN does have its drawbacks: it requires VXLAN offloading in the NIC, switches and routers that can process it, and additional networking skills.

In the end a VM needs connectivity, IPv4 and/or IPv6, to connect to other servers and the rest of the internet.

In the current design, whether it is traditional VLAN or VXLAN, we still assume that there is an L2 network: the VLAN, or the VNI in the case of VXLAN.

Technically neither is required: we can use pure L3 routing from the host towards the VMs. In my opinion this can simplify networking while also adding scalability.


** cloudbr0 **
On a test machine with plain Libvirt+KVM I created cloudbr0:

113: cloudbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether f6:73:63:49:1f:33 brd ff:ff:ff:ff:ff:ff
    inet 169.254.0.1/32 scope global cloudbr0
       valid_lft forever preferred_lft forever
    inet6 fe80::b009:e3ff:fe41:1394/64 scope link
       valid_lft forever preferred_lft forever
    inet6 fe80::1/64 scope link
       valid_lft forever preferred_lft forever


You can see I've added two addresses to the bridge:

- 169.254.0.1/32
- fe80::1/64
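
For reference, this is roughly how such a bridge can be created with iproute2. This is a sketch matching the addresses above; a real deployment would make this persistent via the distribution's network configuration:


ip link add cloudbr0 type bridge
ip link set cloudbr0 up
ip addr add 169.254.0.1/32 dev cloudbr0
ip addr add fe80::1/64 dev cloudbr0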

** Test VM **
I have deployed a test VM, attached it to cloudbr0 and manually configured the addresses using netplan:

network:
    ethernets:
        ens18:
            addresses:
            - 2a14:9b80:103::100/128
            - 2.57.57.29/32
            nameservers:
                addresses:
                - 2620:fe::fe
                search: []
            routes:
            - to: 0.0.0.0/0
              via: 169.254.0.1
              on-link: true
            - to: ::/0
              via: fe80::1
    version: 2

Note the 'on-link: true' flag: because the VM only has a /32 address, the gateway 169.254.0.1 is not within any on-link prefix, so we have to explicitly tell the kernel it is reachable directly on ens18. For IPv6 this is not needed, as fe80::1 is link-local. Applying this configuration results in:


root@routing-test:~# ip addr show dev ens18
2: ens18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether bc:24:11:93:d7:94 brd ff:ff:ff:ff:ff:ff
    altname enp0s18
    inet 2.57.57.29/32 scope global ens18
       valid_lft forever preferred_lft forever
    inet6 2a14:9b80:103::100/128 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::be24:11ff:fe93:d794/64 scope link
       valid_lft forever preferred_lft forever
root@routing-test:~#


In the VM you now see the IPv4 and IPv6 routes:


root@routing-test:~# ip -4 r
default via 169.254.0.1 dev ens18 proto static onlink
root@routing-test:~# ip -6 r
2a14:9b80:103::100 dev ens18 proto kernel metric 256 pref medium
fe80::/64 dev ens18 proto kernel metric 256 pref medium
default via fe80::1 dev ens18 proto static metric 1024 pref medium
root@routing-test:~#


** Static routes and ARP/NDP entries **
On the hypervisor I needed to add two routes and two ARP/NDP entries pointing to the VM:


ip -6 route add 2a14:9b80:103::100/128 dev cloudbr0
ip -6 neigh add 2a14:9b80:103::100 lladdr BC:24:11:93:D7:94 dev cloudbr0 nud permanent
ip -4 route add 2.57.57.29/32 dev cloudbr0
ip -4 neigh add 2.57.57.29 lladdr BC:24:11:93:D7:94 dev cloudbr0 nud permanent


BC:24:11:93:D7:94 is the MAC address of the VM in this case.
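
With these routes and neighbour entries in place, the VM should be reachable over pure L3 from the hypervisor, which is easy to verify (using the addresses from this example):


ping -c 3 2.57.57.29
ping -c 3 2a14:9b80:103::100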


** L3 Routing with BGP **
On the hypervisor I have the FRR BGP daemon running, which advertises the /32 and /128 routes:

- 2.57.57.29/32
- 2a14:9b80:103::100/128


ubuntu# sh ip route 2.57.57.29
Routing entry for 2.57.57.29/32
  Known via "kernel", distance 0, metric 0, best
  Last update 00:00:51 ago
  * directly connected, cloudbr0, weight 1

hv-138-a12-26# show ipv6 route 2a14:9b80:103::100
Routing entry for 2a14:9b80:103::100/128
  Known via "static", distance 1, metric 0
  Last update 6d04h23m ago
    directly connected, cloudbr0, weight 1

Routing entry for 2a14:9b80:103::100/128
  Known via "kernel", distance 0, metric 1024, best
  Last update 6d08h27m ago
  * directly connected, cloudbr0, weight 1

ubuntu#


Both addresses are now advertised upstream towards the other BGP peers, while the hypervisor only receives the default routes (0.0.0.0/0 and ::/0) from upstream.
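
For those curious, a minimal frr.conf to get this behaviour could look like the sketch below. The ASN 65001 and the uplink interface eno1 are assumptions for this example; it uses BGP unnumbered peering over the uplink and redistributes the kernel routes (the /32 and /128 seen above) into BGP:


router bgp 65001
 neighbor eno1 interface remote-as external
 !
 address-family ipv4 unicast
  redistribute kernel
  neighbor eno1 activate
 exit-address-family
 !
 address-family ipv6 unicast
  redistribute kernel
  neighbor eno1 activate
 exit-address-family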


** CloudStack **
As we only route a /32 or /128 towards a VM, we gain a lot more flexibility: these IPs can be routed anywhere in your network. No stretching of VLANs, nor routing VXLAN between sites.

CloudStack orchestration will need to make sure we program the right routes on the hypervisor, but this is something Libvirt hooks can take care of.
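
As a sketch of that idea: Libvirt calls /etc/libvirt/hooks/qemu with the guest name and operation, so a hook along these lines could add and remove the routes and neighbour entries. The guest name, addresses and MAC are hard-coded here for illustration; in reality they would come from CloudStack:


#!/bin/sh
# /etc/libvirt/hooks/qemu
# $1 = guest name, $2 = operation (prepare/start/started/stopped/release)
GUEST="$1"
OPERATION="$2"

[ "$GUEST" = "routing-test" ] || exit 0

case "$OPERATION" in
    started)
        ip -4 route add 2.57.57.29/32 dev cloudbr0
        ip -4 neigh add 2.57.57.29 lladdr bc:24:11:93:d7:94 dev cloudbr0 nud permanent
        ip -6 route add 2a14:9b80:103::100/128 dev cloudbr0
        ip -6 neigh add 2a14:9b80:103::100 lladdr bc:24:11:93:d7:94 dev cloudbr0 nud permanent
        ;;
    stopped)
        ip -4 route del 2.57.57.29/32 dev cloudbr0
        ip -6 route del 2a14:9b80:103::100/128 dev cloudbr0
        ;;
esac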

BGP itself is to be configured by the admin, and that will need to be documented.

This would be an additional type of network which will not support:

- DHCP
- User-Data from the VR
- A VR at all

User-Data will then need to come from ConfigDrive, and the VM will need to use the ConfigDrive data to configure its IPs locally.

Security Grouping can and will still work as it does right now.

** IPv4 and IPv6 **
This idea is protocol-independent, and since DHCP is no longer needed, it can work in multiple modes:

- IPv4 only
- IPv6 only (Really single stack!)
- IPv4+IPv6 (Dual Stack)

ConfigDrive will take care of the network configuration.
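
For illustration, the network part of such a ConfigDrive could be an OpenStack-style network_data.json along these lines (a sketch of the format for the example VM, not necessarily what CloudStack would generate verbatim):


{
  "links": [
    {"id": "eth0", "type": "phy",
     "ethernet_mac_address": "bc:24:11:93:d7:94"}
  ],
  "networks": [
    {"id": "net0", "type": "ipv4", "link": "eth0",
     "ip_address": "2.57.57.29", "netmask": "255.255.255.255",
     "routes": [{"network": "0.0.0.0", "netmask": "0.0.0.0",
                 "gateway": "169.254.0.1"}]},
    {"id": "net1", "type": "ipv6", "link": "eth0",
     "ip_address": "2a14:9b80:103::100/128",
     "routes": [{"network": "::", "netmask": "::",
                 "gateway": "fe80::1"}]}
  ]
}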

** What's next? **

I am not proposing anything to be developed right now, but I hope to spark some ideas with people and get a discussion going.

Will this lead to an implementation being written? Let's see!

Wido
