StathisKap opened a new issue, #33077:
URL: https://github.com/apache/airflow/issues/33077

   ### Official Helm Chart version
   
   1.10.0 (latest released)
   
   ### Apache Airflow version
   
   2.6.2
   
   ### Kubernetes Version
   
   k3s latest
   
   ### Helm Chart configuration
   
   under worker
   
   ```yaml
     extraVolumes: []
     extraVolumeMounts: []
   
     # Select certain nodes for airflow worker pods.
     nodeSelector:
       node-role.kubernetes.io/airflow-worker: "true"
     priorityClassName: ~
     affinity: {}
     # default worker affinity is:
     #  podAntiAffinity:
     #    preferredDuringSchedulingIgnoredDuringExecution:
     #    - podAffinityTerm:
     #        labelSelector:
     #          matchLabels:
     #            component: worker
     #        topologyKey: kubernetes.io/hostname
     #      weight: 100
     tolerations: []
    ```
    
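    As a note for anyone triaging: `nodeSelector` is an exact string match against the node's labels, so a quick way to see which nodes the scheduler considers eligible for the workers is (assuming `kubectl` is pointed at this cluster):
    
    ```sh
    # List only the nodes carrying the exact label/value the worker nodeSelector expects
    kubectl get nodes -l node-role.kubernetes.io/airflow-worker=true
    ```
    
    If the two agents don't show up here, the scheduler will never place the worker pods on them.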
   ### Docker Image customizations
   
   ```dockerfile
   FROM apache/airflow
   #COPY ./dags/ ${AIRFLOW_HOME}/dags/
   COPY ./requirements.txt ${AIRFLOW_HOME}/requirements.txt
    RUN pip3 install --no-cache-dir apache-airflow==${AIRFLOW_VERSION} -r ${AIRFLOW_HOME}/requirements.txt
   ```
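    For completeness, the image above is built and pushed roughly like this (the tag matches the one visible in the pod spec further down; adjust registry/tag to your setup):
    
    ```sh
    # Build the custom Airflow image from the Dockerfile above and push it
    # so the chart can pull it on the worker nodes
    docker build -t stathiskap/custom-airflow:0.0.1 .
    docker push stathiskap/custom-airflow:0.0.1
    ```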
   
   ### What happened
   
    `pod/airflow-worker-0` gets stuck in `Pending`, and I get this error:
    ```
    0/3 nodes are available: 1 node(s) didn't match Pod's node affinity/selector. preemption: 0/3 nodes are available: 1 Preemption is not helpful for scheduling, 2 No preemption victims found for incoming pod..
    ```
   
    I have 3 nodes, and I'm using k3s.
    
    I've set the labels, and for some reason the pod schedules when I set the node selector to my master/control-plane node, but not when I set it to my agent nodes.
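    These are the checks I'd run to see why the agents are excluded (node names are the ones from my setup); besides a missing label, a taint on the agents would also keep the pod `Pending`:
    
    ```sh
    # Show every node with its labels, to compare against the worker nodeSelector
    kubectl get nodes --show-labels
    
    # Check whether the agents carry taints the worker pod doesn't tolerate
    kubectl describe node k3s-agent-0 | grep -i taints
    kubectl describe node k3s-agent-1 | grep -i taints
    ```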
   
   ### What you think should happen instead
   
    The scheduler should place the worker pods on the agent nodes.
   
   ### How to reproduce
   
    `curl -sfL https://get.k3s.io | sh -` to install k3s, then:
   ```tf
   terraform {
     required_providers {
       hcloud = {
         source = "hetznercloud/hcloud"
       }
     }
   }
   
   variable "hcloud_token" {
     description = "The API token for Hetzner Cloud"
   }
   
   provider "hcloud" {
     token = var.hcloud_token
   }
   
   resource "hcloud_server" "k3s_agent" {
     count       = 2
     name        = "k3s-agent-${count.index}"
     server_type = "cx11"
     image       = "ubuntu-22.04"
     ssh_keys    = [hcloud_ssh_key.my_key.id]
   
     provisioner "remote-exec" {
       inline = [
         "curl -sfL https://get.k3s.io | K3S_URL=https://<IP>:6443 
K3S_TOKEN='<toke>' sh -"
       ]
       connection {
         type        = "ssh"
         user        = "root"
          private_key = file("~/.ssh/id_rsa") # Replace with the correct absolute path
         host        = self.ipv4_address
       }
     }
   }
   
   resource "hcloud_ssh_key" "my_key" {
     name       = "my_key"
     public_key = file("~/.ssh/id_rsa.pub")
   }
   ```
   to create the agents
   
    `helm repo add apache-airflow https://airflow.apache.org`
    `helm upgrade --debug --install airflow apache-airflow/airflow --namespace airflow -f values.yaml`
   
   
   
    Set the labels:
    ```sh
    kubectl label nodes k3s-agent-0 node-role.kubernetes.io/airflow-worker=true
    kubectl label nodes k3s-agent-1 node-role.kubernetes.io/airflow-worker=true
    ```
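    To confirm the labels actually stuck with the exact string value `true`, this prints just that label from each agent (dots inside the label key have to be escaped in kubectl's jsonpath):
    
    ```sh
    # Print the airflow-worker label value on each agent; both should output "true"
    kubectl get node k3s-agent-0 -o jsonpath='{.metadata.labels.node-role\.kubernetes\.io/airflow-worker}'
    kubectl get node k3s-agent-1 -o jsonpath='{.metadata.labels.node-role\.kubernetes\.io/airflow-worker}'
    ```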
   
    Then upgrade the Helm chart.
    
    `kubectl describe pod airflow-worker-0`:
   ```
   Name:             airflow-worker-0
   Namespace:        airflow
   Priority:         0
   Service Account:  airflow-worker
   Node:             <none>
   Labels:           component=worker
                     controller-revision-hash=airflow-worker-64d7df4f8c
                     release=airflow
                     statefulset.kubernetes.io/pod-name=airflow-worker-0
                     tier=airflow
    Annotations:      checksum/airflow-config: d6a9135fc4481a5bbcf6bace4a4bb82c2fd958c7af2b9c0c1f3e7ddb7715a944
                      checksum/extra-configmaps: e862ea47e13e634cf17d476323784fa27dac20015550c230953b526182f5cac8
                      checksum/extra-secrets: e9582fdd622296c976cbc10a5ba7d6702c28a24fe80795ea5b84ba443a56c827
                      checksum/kerberos-keytab: 80979996aa3c1f48c95dfbe9bb27191e71f12442a08c0ed834413da9d430fd0e
                      checksum/metadata-secret: cd6de1cad5366c38201917e3ed1ac78bec2655c819758d1fa68bbe0b6539968b
                      checksum/pgbouncer-config-secret: 1dae2adc757473469686d37449d076b0c82404f61413b58ae68b3c5e99527688
                      checksum/result-backend-secret: 98a68f230007cfa8f5d3792e1aff843a76b0686409e4a46ab2f092f6865a1b71
                      checksum/webserver-secret-key: 668251e56927d3d78c4037169030b342f4270aff8e247721420789ede6176254
                     cluster-autoscaler.kubernetes.io/safe-to-evict: true
   Status:           Pending
   IP:
   IPs:              <none>
   Controlled By:    StatefulSet/airflow-worker
   Init Containers:
     wait-for-airflow-migrations:
       Image:      stathiskap/custom-airflow:0.0.1
       Port:       <none>
       Host Port:  <none>
       Args:
         airflow
         db
         check-migrations
         --migration-wait-timeout=60
       Environment:
         AIRFLOW__WEBSERVER__EXPOSE_CONFIG:    true
          AIRFLOW__CORE__FERNET_KEY:            <set to the key 'fernet-key' in secret 'airflow-fernet-key'>                      Optional: false
          AIRFLOW__CORE__SQL_ALCHEMY_CONN:      <set to the key 'connection' in secret 'airflow-airflow-metadata'>                Optional: false
          AIRFLOW__DATABASE__SQL_ALCHEMY_CONN:  <set to the key 'connection' in secret 'airflow-airflow-metadata'>                Optional: false
          AIRFLOW_CONN_AIRFLOW_DB:              <set to the key 'connection' in secret 'airflow-airflow-metadata'>                Optional: false
          AIRFLOW__WEBSERVER__SECRET_KEY:       <set to the key 'webserver-secret-key' in secret 'airflow-webserver-secret-key'>  Optional: false
          AIRFLOW__CELERY__BROKER_URL:          <set to the key 'connection' in secret 'airflow-broker-url'>                      Optional: false
        Mounts:
          /opt/airflow/airflow.cfg from config (ro,path="airflow.cfg")
          /opt/airflow/config/airflow_local_settings.py from config (ro,path="airflow_local_settings.py")
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-wthl2 (ro)
   Containers:
     worker:
       Image:      stathiskap/custom-airflow:0.0.1
       Port:       8793/TCP
       Host Port:  0/TCP
       Args:
         bash
         -c
         exec \
         airflow celery worker
        Liveness:  exec [sh -c CONNECTION_CHECK_MAX_COUNT=0 exec /entrypoint python -m celery --app airflow.executors.celery_executor.app inspect ping -d celery@$(hostname)] delay=10s timeout=20s period=60s #success=1 #failure=5
        Environment:
          DUMB_INIT_SETSID:                     0
          AIRFLOW__WEBSERVER__EXPOSE_CONFIG:    true
          AIRFLOW__CORE__FERNET_KEY:            <set to the key 'fernet-key' in secret 'airflow-fernet-key'>                      Optional: false
          AIRFLOW__CORE__SQL_ALCHEMY_CONN:      <set to the key 'connection' in secret 'airflow-airflow-metadata'>                Optional: false
          AIRFLOW__DATABASE__SQL_ALCHEMY_CONN:  <set to the key 'connection' in secret 'airflow-airflow-metadata'>                Optional: false
          AIRFLOW_CONN_AIRFLOW_DB:              <set to the key 'connection' in secret 'airflow-airflow-metadata'>                Optional: false
          AIRFLOW__WEBSERVER__SECRET_KEY:       <set to the key 'webserver-secret-key' in secret 'airflow-webserver-secret-key'>  Optional: false
          AIRFLOW__CELERY__BROKER_URL:          <set to the key 'connection' in secret 'airflow-broker-url'>                      Optional: false
        Mounts:
          /opt/airflow/airflow.cfg from config (ro,path="airflow.cfg")
          /opt/airflow/config/airflow_local_settings.py from config (ro,path="airflow_local_settings.py")
          /opt/airflow/dags from dags (ro)
          /opt/airflow/logs from logs (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-wthl2 (ro)
     worker-log-groomer:
       Image:      stathiskap/custom-airflow:0.0.1
       Port:       <none>
       Host Port:  <none>
       Args:
         bash
         /clean-logs
       Environment:
         AIRFLOW__LOG_RETENTION_DAYS:  15
       Mounts:
         /opt/airflow/logs from logs (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-wthl2 (ro)
   Conditions:
     Type           Status
     PodScheduled   False
   Volumes:
     logs:
        Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
       ClaimName:  logs-airflow-worker-0
       ReadOnly:   false
     config:
       Type:      ConfigMap (a volume populated by a ConfigMap)
       Name:      airflow-airflow-config
       Optional:  false
     dags:
        Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
       ClaimName:  airflow-dags
       ReadOnly:   false
     kube-api-access-wthl2:
        Type:                    Projected (a volume that contains injected data from multiple sources)
       TokenExpirationSeconds:  3607
       ConfigMapName:           kube-root-ca.crt
       ConfigMapOptional:       <nil>
       DownwardAPI:             true
   QoS Class:                   BestEffort
   Node-Selectors:              node-role.kubernetes.io/airflow-worker=true
    Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
   Events:
     Type     Reason            Age                From               Message
     ----     ------            ----               ----               -------
      Warning  FailedScheduling  27m                default-scheduler  0/3 nodes are available: 1 node(s) didn't match Pod's node affinity/selector. preemption: 0/3 nodes are available: 1 Preemption is not helpful for scheduling, 2 No preemption victims found for incoming pod..
      Warning  FailedScheduling  12m (x3 over 22m)  default-scheduler  0/3 nodes are available: 1 node(s) didn't match Pod's node affinity/selector. preemption: 0/3 nodes are available: 1 Preemption is not helpful for scheduling, 2 No preemption victims found for incoming pod..
   ```
   
   ### Anything else
   
    It happens every time.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
    - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

