lshw42 opened a new issue #19564:
URL: https://github.com/apache/airflow/issues/19564


   ### Apache Airflow version
   
   2.1.4
   
   ### Operating System
   
   Debian
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow==2.1.4
   apache-airflow-providers-amazon==2.2.0
   apache-airflow-providers-celery==2.0.0
   apache-airflow-providers-cncf-kubernetes==2.0.2
   apache-airflow-providers-docker==2.1.1
   apache-airflow-providers-elasticsearch==2.0.3
   apache-airflow-providers-ftp==2.0.1
   apache-airflow-providers-google==5.1.0
   apache-airflow-providers-grpc==2.0.1
   apache-airflow-providers-hashicorp==2.1.0
   apache-airflow-providers-http==2.0.1
   apache-airflow-providers-imap==2.0.1
   apache-airflow-providers-microsoft-azure==3.1.1
   apache-airflow-providers-mysql==2.1.1
   apache-airflow-providers-postgres==2.2.0
   apache-airflow-providers-redis==2.0.1
   apache-airflow-providers-sendgrid==2.0.1
   apache-airflow-providers-sftp==2.1.1
   apache-airflow-providers-slack==4.0.1
   apache-airflow-providers-sqlite==2.0.1
   apache-airflow-providers-ssh==2.1.1
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   - Kubernetes 1.17 (managed)
   - kubectl 1.19
   - Airflow 2.1.4
   - Airflow Helm chart 8.5.2 (Community)
   - Custom registry to pull `alpine` from, configured through `airflow_local_settings.py`, provided as a k8s Secret and mounted via `extraVolumeMounts` in the Helm chart's `values.yaml`
   - KubernetesExecutor with `AIRFLOW__KUBERNETES__WORKER_CONTAINER_REPOSITORY` set
   
   ### What happened
   
   In the DAG, tasks use `KubernetesPodOperator` like the following:
   ```python
   from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
   from kubernetes.client import models as k8s

   test1 = KubernetesPodOperator(
       task_id='test1',
       name='tester_test1',
       namespace='airflow',
       image='custom path to image',
       image_pull_policy='IfNotPresent',
       image_pull_secrets=[k8s.V1LocalObjectReference('...')],
       do_xcom_push=True,
       get_logs=True,
       in_cluster=True,
       is_delete_operator_pod=True,
       init_containers=[init_container],  # init_container is defined elsewhere in the DAG file
       dag=dag,
   )
   ```
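
   For context, a minimal sketch of the `init_container` referenced above (purely illustrative, the real definition lives elsewhere in the DAG file; the image mirrors the custom-registry `alpine` visible in the pod events further down):

   ```python
   from kubernetes.client import models as k8s

   # Illustrative only: the real init container is defined elsewhere in the DAG file.
   init_container = k8s.V1Container(
       name='init-container',
       image='custom/repo/alpine:3.14.0',
       command=['sh', '-c', 'echo init done'],
   )
   ```
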
   When the DAG is triggered and a task starts, the output of `kubectl describe pod <worker pod> -n airflow` contains:

   > Failed to pull image "alpine": rpc error: code = Unknown desc = Error response from daemon: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
   
   ### What you expected to happen
   
   When the DAG is triggered and a task starts, the expected output of `kubectl describe pod <worker pod> -n airflow` would contain:

   > Successfully pulled image "alpine"
   
   This problem is well known, and the common solution is configuring `airflow_local_settings.py` as described in the [Airflow Helm chart (Community) FAQ](https://artifacthub.io/packages/helm/airflow-helm/airflow#how-to-set-airflow-configs).
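
   That fix boils down to a pod mutation hook in `airflow_local_settings.py`. A minimal sketch of what was configured, assuming the custom registry hosts a mirror of `alpine` (the registry prefix below is a placeholder):

   ```python
   # airflow_local_settings.py -- minimal sketch of the FAQ approach.
   from kubernetes.client import models as k8s

   def pod_mutation_hook(pod: k8s.V1Pod) -> None:
       """Rewrite bare Docker Hub image references to the private registry mirror."""
       # 'custom.registry.example/mirror' is a placeholder for the real registry.
       for container in (pod.spec.containers or []) + (pod.spec.init_containers or []):
           if container.image == 'alpine':
               container.image = 'custom.registry.example/mirror/alpine'
   ```

   The catch, as described below, is that this hook only takes effect where the settings file is actually present.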
   
   As far as I understand, this is related to the `KubernetesExecutor`: there is no way to configure the executor so that this configuration is injected into the image used for the worker pod.
   
   It is possible to configure the `KubernetesExecutor` to use a custom image from a custom registry by setting `AIRFLOW__KUBERNETES__WORKER_CONTAINER_REPOSITORY` and `AIRFLOW__KUBERNETES__WORKER_CONTAINER_TAG`. But there is no way to build a custom Airflow 2.1.4 image that is already configured to pull `alpine` from the custom registry.
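
   The extra pull appears to come from the xcom sidecar that `KubernetesPodOperator` attaches when `do_xcom_push=True`. Roughly (an illustrative sketch, not the provider's exact code), the sidecar ends up on the pod like this, with the image hard-coded to `alpine` and, in 2.1.4, no setting to point it at another registry:

   ```python
   from kubernetes.client import models as k8s

   def add_xcom_sidecar(pod: k8s.V1Pod) -> k8s.V1Pod:
       # Sidecar that keeps the pod alive so the xcom result can be read back.
       sidecar = k8s.V1Container(
           name='airflow-xcom-sidecar',
           image='alpine',  # hard-coded, so it is always resolved against Docker Hub
           command=['sh', '-c', 'trap "exit 0" INT; while true; do sleep 1; done;'],
           volume_mounts=[k8s.V1VolumeMount(name='xcom', mount_path='/airflow/xcom')],
       )
       pod.spec.volumes = (pod.spec.volumes or []) + [
           k8s.V1Volume(name='xcom', empty_dir=k8s.V1EmptyDirVolumeSource())
       ]
       pod.spec.containers.append(sidecar)
       return pod
   ```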
   
   ### How to reproduce
   
   1. Set up Airflow 2.1.4 on Kubernetes using Helm chart 8.5.2 (community)
   2. Configure `airflow_local_settings.py` as described in the Helm chart 8.5.2 (community) FAQ
   3. Use `KubernetesPodOperator` with `do_xcom_push=True`
   4. Trigger a DAG that launches the task
   
   ### Anything else
   
   This problem occurs every time the Docker Hub pull rate limit is hit.

   `kubectl describe pod tester-test1 -n airflow` returns at the end of its output:
   
   **Events:**

   | Type | Reason | Age | From | Message |
   |------|--------|-----|------|---------|
   | Normal | Scheduled | 21s | default-scheduler | Successfully assigned tester-test1 |
   | Normal | Pulled | 20s | kubelet | Container image "custom/repo/alpine:3.14.0" already present on machine |
   | Normal | Created | 20s | kubelet | Created container init-container |
   | Normal | Started | 19s | kubelet | Started container init-container |
   | Normal | Pulled | 19s | kubelet | Container image "custom/repo/xcom_test:latest" already present on machine |
   | Normal | Created | 19s | kubelet | Created container base |
   | Normal | Started | 19s | kubelet | Started container base |
   | Normal | BackOff | 16s (x2 over 17s) | kubelet | Back-off pulling image "alpine" |
   | Warning | Failed | 16s (x2 over 17s) | kubelet | Error: ImagePullBackOff |
   | Normal | Pulling | 4s (x2 over 19s) | kubelet | Pulling image "alpine" |
   | Warning | Failed | 3s (x2 over 17s) | kubelet | Failed to pull image "alpine": rpc error: code = Unknown desc = Error response from daemon: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit |
   | Warning | Failed | 3s (x2 over 17s) | kubelet | Error: ErrImagePull |
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

