GitHub user HeinzM created a discussion: data-server. is not reachable in some VPC isolated guest networks
Hello,

let me describe the setup first.
We are trying to deploy a Kubernetes cluster in a VPC setup with isolated networks for the control plane and the worker nodes.
CloudStack version: 4.21
**CloudStack VPC and isolated networks**
We create a VPC first:
```hcl
resource "cloudstack_vpc" "k8s_vpc01" {
name = var.k8s_vpc01_name
cidr = var.k8s_vpc01_cidr
vpc_offering = var.k8s_vpc01_offering
zone = var.zone
project = var.project_id
}
```
k8s_vpc01_cidr = 10.0.0.0/19
Then we create three networks: one for the control plane and two worker networks.
```hcl
resource "cloudstack_network" "k8s_cp_01" {
name = var.k8s_nw_cp01
cidr = var.k8s_nw_cp01_cidr
network_offering = var.vpc_network_offering
zone = var.zone
vpc_id = cloudstack_vpc.k8s_vpc01.id
acl_id = cloudstack_network_acl.k8s_acl_cp.id
project = var.project_id
}
```
```hcl
resource "cloudstack_network" "k8s_wn" {
count = var.k8s_nw_wn_count
project = var.project_id
name = local.wn_nws[count.index].network_name
cidr = local.wn_nws[count.index].cidr
network_offering = var.vpc_network_offering
zone = var.zone
vpc_id = cloudstack_vpc.k8s_vpc01.id
acl_id = cloudstack_network_acl.k8s_acl_wn[count.index].id
}
```
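The ACL lists referenced via `acl_id` above are not included in this post. For completeness, a minimal, hypothetical sketch of what they might look like (the names and the omitted ACL rules are assumptions, only the references matter):
```hcl
# Hypothetical sketch of the ACL lists referenced by acl_id above; the actual
# names and ACL rules are not part of the original excerpt.
resource "cloudstack_network_acl" "k8s_acl_cp" {
  name    = "k8s-acl-cp"
  vpc_id  = cloudstack_vpc.k8s_vpc01.id
  project = var.project_id
}

resource "cloudstack_network_acl" "k8s_acl_wn" {
  count   = var.k8s_nw_wn_count
  name    = "k8s-acl-wn${format("%02d", count.index + 1)}"
  vpc_id  = cloudstack_vpc.k8s_vpc01.id
  project = var.project_id
}
```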
k8s_nw_cp01_cidr = 10.0.1.0/28
k8s_nw_wn01_cidr = 10.0.2.0/28
k8s_nw_wn02_cidr = 10.0.3.0/28
vpc_network_offering = DefaultIsolatedNetworkOfferingForVpcNetworks
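The `local.*` values referenced in the network and instance resources (`wn_nws`, `controller_nodes`, `worker_nodes`, `cidr_mask`, `cluster_endpoint`) are also not shown in the post. A rough, hypothetical sketch of their shape (all names and addresses below are made up for illustration):
```hcl
# Hypothetical sketch of the locals used throughout this post; the real values
# are not part of the original excerpt.
locals {
  # One entry per worker network created by cloudstack_network.k8s_wn.
  wn_nws = [
    { network_name = "k8s-nw-wn01", cidr = "10.0.2.0/28" },
    { network_name = "k8s-nw-wn02", cidr = "10.0.3.0/28" },
  ]

  # Control plane nodes with their static IPs and the tier gateway.
  controller_nodes = [
    { name = "k8s-cp01", ip = "10.0.1.10", gateway = "10.0.1.1" },
  ]

  # Worker nodes, each pinned to one of the worker networks.
  worker_nodes = [
    { name = "k8s-wn01", ip = "10.0.2.10", network = cloudstack_network.k8s_wn[0].id },
    { name = "k8s-wn02", ip = "10.0.3.10", network = cloudstack_network.k8s_wn[1].id },
  ]

  # Prefix lengths used when rendering node addresses, e.g. cidr_mask[1] = "28".
  cidr_mask = ["19", "28"]

  # Cluster endpoint behind the static NAT address (sketched further below).
  cluster_endpoint = "https://${cloudstack_ipaddress.k8s_cp_staticnat_ip01.ip_address}:6443"
}
```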
**CloudStack instances**
We deploy the control plane nodes as `cloudstack_instance` resources with the following
configuration:
```hcl
resource "cloudstack_instance" "controller" {
depends_on = [cloudstack_network.k8s_cp_01]
count = var.controller_count
project = var.project_id
service_offering = var.compute_offering_cp
template = var.talos_image
name = local.controller_nodes[count.index].name
ip_address = local.controller_nodes[count.index].ip
zone = var.zone
cluster_id = var.cluster_ids[0]
network_id = cloudstack_network.k8s_cp_01.id
user_data =
base64encode(data.talos_machine_configuration.controller[count.index].machine_configuration)
expunge = true
}
```
and the workers:
```hcl
resource "cloudstack_instance" "worker" {
depends_on = [
cloudstack_instance.controller,
cloudstack_network.k8s_wn
]
for_each = { for worker in local.worker_nodes: "${worker.name}" =>
worker }
project = var.project_id
service_offering = var.compute_offering_worker
template = var.talos_image
name = each.value.name
ip_address = each.value.ip
zone = var.zone
cluster_id = var.cluster_ids[0]
network_id = each.value.network
user_data =
base64encode(data.talos_machine_configuration.worker.machine_configuration)
expunge = true
root_disk_size = 16
}
```
**The userdata**
controller:
```hcl
data "talos_machine_configuration" "controller" {
count = var.controller_count
cluster_name = var.k8s_cluster_name
cluster_endpoint = local.cluster_endpoint
machine_secrets = talos_machine_secrets.talos.machine_secrets
machine_type = "controlplane"
talos_version = "1.11.3"
config_patches = [
yamlencode({
machine = {
install = {
disk = "/dev/vda"
extraKernelArgs = ["talos.platform=cloudstack"]
}
env = {
http_proxy = var.proxy_server
https_proxy = var.proxy_server
no_proxy = var.no_proxy
}
time = {
servers = var.ntp_servers
}
kubelet = {
extraArgs = {
rotate-server-certificates = true
}
}
network = {
hostname = local.controller_nodes[count.index].name
interfaces = [
{
deviceSelector = {
physical = true
}
addresses: [
"${local.controller_nodes[count.index].ip}/${local.cidr_mask[1]}" ]
routes: [ {
network = "0.0.0.0/0"
gateway = "${local.controller_nodes[count.index].gateway}"
} ]
}
]
nameservers = var.dns_servers
}
}
cluster = {
network = {
cni = {
name = "none"
}
}
proxy = {
disabled = true
}
apiServer = {
certSANs = [ cloudstack_ipaddress.k8s_cp_staticnat_ip01.ip_address ]
}
extraManifests = [
"https://raw.githubusercontent.com/alex1989hu/kubelet-serving-cert-approver/main/deploy/standalone-install.yaml",
"https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml"
]
}
})
]
}
```
worker:
```hcl
data "talos_machine_configuration" "worker" {
cluster_name = var.k8s_cluster_name
cluster_endpoint = local.cluster_endpoint
machine_secrets = talos_machine_secrets.talos.machine_secrets
talos_version = "1.11.3"
machine_type = "worker"
config_patches = [
yamlencode({
machine = {
install = {
disk = "/dev/vda"
extraKernelArgs = ["talos.platform=cloudstack"]
}
env = {
http_proxy = var.proxy_server
https_proxy = var.proxy_server
no_proxy = var.no_proxy
}
time = {
servers = var.ntp_servers
}
}
cluster = {
network = {
cni = {
name = "none"
}
}
proxy = {
disabled = true
}
}
})
]
}
```
proxy_server = http://server.ip:port
no_proxy = "10.0.0.0/8, data-server."
ntp_server = "/dev/ptp0"
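The `cloudstack_ipaddress.k8s_cp_staticnat_ip01` address used in `certSANs` (and in `local.cluster_endpoint`) is not shown in the post either; a hypothetical sketch of how such a static NAT address might be acquired and assigned:
```hcl
# Hypothetical sketch of the public IP / static NAT referenced above; not part
# of the original excerpt.
resource "cloudstack_ipaddress" "k8s_cp_staticnat_ip01" {
  zone    = var.zone
  vpc_id  = cloudstack_vpc.k8s_vpc01.id
  project = var.project_id
}

resource "cloudstack_static_nat" "k8s_cp01" {
  ip_address_id      = cloudstack_ipaddress.k8s_cp_staticnat_ip01.id
  virtual_machine_id = cloudstack_instance.controller[0].id
  project            = var.project_id
}
```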
What happens next:
The machines come up.
So far, the control plane machines have never been able to resolve data-server.
The worker networks alternate: sometimes data-server. can be resolved in one network, sometimes in the other.
On the virtual router I can see that data-server points to the respective IP addresses in the control plane and worker networks.
I can also see that in the worker network where data-server is reachable, the nodes use the IP of the virtual router as their DNS server; in the other networks, the DNS servers of the local network are used.
Without the network separation, i.e. with just a simple guest network, the configuration works perfectly.
I can't tell right now whether this is a bug or a user error.
Does anyone have any advice for me?
GitHub link: https://github.com/apache/cloudstack/discussions/11879