[ 
https://issues.apache.org/jira/browse/AIRFLOW-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16930224#comment-16930224
 ] 

Daniel Imberman edited comment on AIRFLOW-5447 at 9/16/19 4:21 AM:
-------------------------------------------------------------------

[~Yuval.Itzchakov] [~cwegrzyn] Thank you guys for getting this info to us. I 
THINK this might have to do with a bug in the k8s Python client, which requires 
"create" and "get" privileges for "pods/exec".

[https://stackoverflow.com/questions/53827345/airflow-k8s-operator-xcom-handshake-status-403-forbidden]
 [https://github.com/kubernetes-client/python/issues/690]
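
For anyone who wants to test this theory, here is a rough sketch of an RBAC Role that grants those two verbs on "pods/exec". It only builds the manifest as a plain dict and prints it as JSON. The role name "airflow-pod-exec" and the namespace "airflow" are made up for illustration, and you would still need a RoleBinding to attach the Role to whatever service account your scheduler runs under. Treat it as a starting point, not a confirmed fix.

```python
import json


def build_pod_exec_role(namespace: str, name: str = "airflow-pod-exec") -> dict:
    """Return an RBAC Role manifest granting 'create' and 'get' on pods/exec.

    The default role name is a hypothetical placeholder, not something
    Airflow itself defines.
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            {
                "apiGroups": [""],  # "" is the core API group, where pods live
                "resources": ["pods/exec"],
                "verbs": ["create", "get"],
            }
        ],
    }


if __name__ == "__main__":
    # "airflow" is an assumed namespace; substitute your own.
    print(json.dumps(build_pod_exec_role("airflow"), indent=2))
```

Since kubectl accepts JSON manifests, the output can be saved to a file and applied with `kubectl apply -f`.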

The reason I believe this is that the lack of running/updating of pods points 
to a failure of the KubernetesJobWatcher. When we finally started seeing 
similar problems, we were seeing these failures from the JobWatcher: 
[https://user-images.githubusercontent.com/1036482/64914385-2f0eca80-d71e-11e9-8f8b-44a1c8620b92.png].
I'm going to look into this further tomorrow and get back ASAP.


> KubernetesExecutor hangs on task queueing
> -----------------------------------------
>
>                 Key: AIRFLOW-5447
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5447
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: executor-kubernetes
>    Affects Versions: 1.10.4, 1.10.5
>         Environment: Kubernetes version v1.14.3, Airflow version 1.10.4-1.10.5
>            Reporter: Henry Cohen
>            Assignee: Daniel Imberman
>            Priority: Blocker
>
> Starting in 1.10.4, and continuing in 1.10.5, when using the 
> KubernetesExecutor, with the webserver and scheduler running in the 
> kubernetes cluster, tasks are scheduled, but when added to the task queue, 
> the executor process hangs indefinitely. Based on log messages, it appears to 
> be stuck at this line 
> https://github.com/apache/airflow/blob/v1-10-stable/airflow/contrib/executors/kubernetes_executor.py#L761



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
