StathisKap opened a new issue, #33077:
URL: https://github.com/apache/airflow/issues/33077
### Official Helm Chart version
1.10.0 (latest released)
### Apache Airflow version
2.6.2
### Kubernetes Version
k3s latest
### Helm Chart configuration
under worker
```yaml
extraVolumes: []
extraVolumeMounts: []
# Select certain nodes for airflow worker pods.
nodeSelector:
  node-role.kubernetes.io/airflow-worker: "true"
priorityClassName: ~
affinity: {}
# default worker affinity is:
#  podAntiAffinity:
#    preferredDuringSchedulingIgnoredDuringExecution:
#    - podAffinityTerm:
#        labelSelector:
#          matchLabels:
#            component: worker
#        topologyKey: kubernetes.io/hostname
#      weight: 100
tolerations: []
```
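For reference, in the official chart these worker settings live under the top-level `workers:` key of `values.yaml`, so (assuming chart 1.10.0's layout) the snippet above would be nested like:

```yaml
workers:
  # Select certain nodes for airflow worker pods.
  nodeSelector:
    node-role.kubernetes.io/airflow-worker: "true"
```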
### Docker Image customizations
```dockerfile
FROM apache/airflow
#COPY ./dags/ ${AIRFLOW_HOME}/dags/
COPY ./requirements.txt ${AIRFLOW_HOME}/requirements.txt
RUN pip3 install --no-cache-dir apache-airflow==${AIRFLOW_VERSION} -r ${AIRFLOW_HOME}/requirements.txt
```
### What happened
pod/airflow-worker-0 gets stuck on pending, and I get this error
```
0/3 nodes are available: 1 node(s) didn't match Pod's node affinity/selector. preemption: 0/3 nodes are available: 1 Preemption is not helpful for scheduling, 2 No preemption victims found for incoming pod..
```
I have 3 nodes and I'm using k3s.
I've set the labels, but for some reason, when I point the node selector at my master/control-plane node it works, whereas when I point it at my agents it doesn't.
### What you think should happen instead
It should schedule the worker pods on the agent nodes.
### How to reproduce
Install k3s: `curl -sfL https://get.k3s.io | sh -`
```tf
terraform {
  required_providers {
    hcloud = {
      source = "hetznercloud/hcloud"
    }
  }
}

variable "hcloud_token" {
  description = "The API token for Hetzner Cloud"
}

provider "hcloud" {
  token = var.hcloud_token
}

resource "hcloud_server" "k3s_agent" {
  count       = 2
  name        = "k3s-agent-${count.index}"
  server_type = "cx11"
  image       = "ubuntu-22.04"
  ssh_keys    = [hcloud_ssh_key.my_key.id]

  provisioner "remote-exec" {
    inline = [
      "curl -sfL https://get.k3s.io | K3S_URL=https://<IP>:6443 K3S_TOKEN='<token>' sh -"
    ]

    connection {
      type        = "ssh"
      user        = "root"
      private_key = file("~/.ssh/id_rsa") # Replace with the correct absolute path
      host        = self.ipv4_address
    }
  }
}

resource "hcloud_ssh_key" "my_key" {
  name       = "my_key"
  public_key = file("~/.ssh/id_rsa.pub")
}
```
to create the agents
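The Terraform configuration above can be applied with the usual workflow (a sketch; it assumes the Hetzner token is available in `$HCLOUD_TOKEN`):

```sh
terraform init
terraform apply -var "hcloud_token=$HCLOUD_TOKEN"
```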
```sh
helm repo add apache-airflow https://airflow.apache.org
helm upgrade --debug --install airflow apache-airflow/airflow --namespace airflow -f values.yaml
```
set labels:
```sh
kubectl label nodes k3s-agent-0 node-role.kubernetes.io/airflow-worker=true
kubectl label nodes k3s-agent-1 node-role.kubernetes.io/airflow-worker=true
```
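A quick way to confirm the labels actually landed on the agents, and to spot any taints that could block scheduling (a debugging sketch; `kubectl` must be pointed at the k3s cluster):

```sh
# List only the nodes that carry the label the worker nodeSelector matches on
kubectl get nodes -l node-role.kubernetes.io/airflow-worker=true

# Show all labels and taints per node
kubectl get nodes --show-labels
kubectl describe nodes | grep -iA1 taints
```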
then upgrade the Helm chart and run `kubectl describe pod airflow-worker-0`:
```
Name: airflow-worker-0
Namespace: airflow
Priority: 0
Service Account: airflow-worker
Node: <none>
Labels: component=worker
controller-revision-hash=airflow-worker-64d7df4f8c
release=airflow
statefulset.kubernetes.io/pod-name=airflow-worker-0
tier=airflow
Annotations: checksum/airflow-config:
d6a9135fc4481a5bbcf6bace4a4bb82c2fd958c7af2b9c0c1f3e7ddb7715a944
checksum/extra-configmaps:
e862ea47e13e634cf17d476323784fa27dac20015550c230953b526182f5cac8
checksum/extra-secrets:
e9582fdd622296c976cbc10a5ba7d6702c28a24fe80795ea5b84ba443a56c827
checksum/kerberos-keytab:
80979996aa3c1f48c95dfbe9bb27191e71f12442a08c0ed834413da9d430fd0e
checksum/metadata-secret:
cd6de1cad5366c38201917e3ed1ac78bec2655c819758d1fa68bbe0b6539968b
checksum/pgbouncer-config-secret:
1dae2adc757473469686d37449d076b0c82404f61413b58ae68b3c5e99527688
checksum/result-backend-secret:
98a68f230007cfa8f5d3792e1aff843a76b0686409e4a46ab2f092f6865a1b71
checksum/webserver-secret-key:
668251e56927d3d78c4037169030b342f4270aff8e247721420789ede6176254
cluster-autoscaler.kubernetes.io/safe-to-evict: true
Status: Pending
IP:
IPs: <none>
Controlled By: StatefulSet/airflow-worker
Init Containers:
wait-for-airflow-migrations:
Image: stathiskap/custom-airflow:0.0.1
Port: <none>
Host Port: <none>
Args:
airflow
db
check-migrations
--migration-wait-timeout=60
Environment:
AIRFLOW__WEBSERVER__EXPOSE_CONFIG: true
AIRFLOW__CORE__FERNET_KEY: <set to the key 'fernet-key' in
secret 'airflow-fernet-key'> Optional: false
AIRFLOW__CORE__SQL_ALCHEMY_CONN: <set to the key 'connection' in
secret 'airflow-airflow-metadata'> Optional: false
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: <set to the key 'connection' in
secret 'airflow-airflow-metadata'> Optional: false
AIRFLOW_CONN_AIRFLOW_DB: <set to the key 'connection' in
secret 'airflow-airflow-metadata'> Optional: false
AIRFLOW__WEBSERVER__SECRET_KEY: <set to the key
'webserver-secret-key' in secret 'airflow-webserver-secret-key'> Optional:
false
AIRFLOW__CELERY__BROKER_URL: <set to the key 'connection' in
secret 'airflow-broker-url'> Optional: false
Mounts:
/opt/airflow/airflow.cfg from config (ro,path="airflow.cfg")
/opt/airflow/config/airflow_local_settings.py from config
(ro,path="airflow_local_settings.py")
/var/run/secrets/kubernetes.io/serviceaccount from
kube-api-access-wthl2 (ro)
Containers:
worker:
Image: stathiskap/custom-airflow:0.0.1
Port: 8793/TCP
Host Port: 0/TCP
Args:
bash
-c
exec \
airflow celery worker
Liveness: exec [sh -c CONNECTION_CHECK_MAX_COUNT=0 exec /entrypoint
python -m celery --app airflow.executors.celery_executor.app inspect ping -d
celery@$(hostname)] delay=10s timeout=20s period=60s #success=1 #failure=5
Environment:
DUMB_INIT_SETSID: 0
AIRFLOW__WEBSERVER__EXPOSE_CONFIG: true
AIRFLOW__CORE__FERNET_KEY: <set to the key 'fernet-key' in
secret 'airflow-fernet-key'> Optional: false
AIRFLOW__CORE__SQL_ALCHEMY_CONN: <set to the key 'connection' in
secret 'airflow-airflow-metadata'> Optional: false
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: <set to the key 'connection' in
secret 'airflow-airflow-metadata'> Optional: false
AIRFLOW_CONN_AIRFLOW_DB: <set to the key 'connection' in
secret 'airflow-airflow-metadata'> Optional: false
AIRFLOW__WEBSERVER__SECRET_KEY: <set to the key
'webserver-secret-key' in secret 'airflow-webserver-secret-key'> Optional:
false
AIRFLOW__CELERY__BROKER_URL: <set to the key 'connection' in
secret 'airflow-broker-url'> Optional: false
Mounts:
/opt/airflow/airflow.cfg from config (ro,path="airflow.cfg")
/opt/airflow/config/airflow_local_settings.py from config
(ro,path="airflow_local_settings.py")
/opt/airflow/dags from dags (ro)
/opt/airflow/logs from logs (rw)
/var/run/secrets/kubernetes.io/serviceaccount from
kube-api-access-wthl2 (ro)
worker-log-groomer:
Image: stathiskap/custom-airflow:0.0.1
Port: <none>
Host Port: <none>
Args:
bash
/clean-logs
Environment:
AIRFLOW__LOG_RETENTION_DAYS: 15
Mounts:
/opt/airflow/logs from logs (rw)
/var/run/secrets/kubernetes.io/serviceaccount from
kube-api-access-wthl2 (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
logs:
Type: PersistentVolumeClaim (a reference to a
PersistentVolumeClaim in the same namespace)
ClaimName: logs-airflow-worker-0
ReadOnly: false
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: airflow-airflow-config
Optional: false
dags:
Type: PersistentVolumeClaim (a reference to a
PersistentVolumeClaim in the same namespace)
ClaimName: airflow-dags
ReadOnly: false
kube-api-access-wthl2:
Type: Projected (a volume that contains injected data
from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: node-role.kubernetes.io/airflow-worker=true
Tolerations: node.kubernetes.io/not-ready:NoExecute
op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute
op=Exists for 300s
Events:
  Type     Reason            Age                From               Message
  ----     ------            ---                ----               -------
  Warning  FailedScheduling  27m                default-scheduler  0/3 nodes are available: 1 node(s) didn't match Pod's node affinity/selector. preemption: 0/3 nodes are available: 1 Preemption is not helpful for scheduling, 2 No preemption victims found for incoming pod..
  Warning  FailedScheduling  12m (x3 over 22m)  default-scheduler  0/3 nodes are available: 1 node(s) didn't match Pod's node affinity/selector. preemption: 0/3 nodes are available: 1 Preemption is not helpful for scheduling, 2 No preemption victims found for incoming pod..
```
### Anything else
It happens every time.
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)