James Meickle created AIRFLOW-3305:
--------------------------------------

             Summary: KubernetesPodOperator has a race condition for log output
                 Key: AIRFLOW-3305
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3305
             Project: Apache Airflow
          Issue Type: Bug
          Components: kubernetes
    Affects Versions: 1.10.0
            Reporter: James Meickle


The KubernetesPodOperator follows logs from the container in the pod that it 
launches: 
[https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/kubernetes/pod_launcher.py#L96]

The call uses "follow" mode, which streams logs as they are produced. However, 
it is possible (but not guaranteed) for the pod's container to have started, 
and already written output, before the log stream call reaches the cluster. 
When that happens, re-running the same task can produce very different-looking 
logs, with no indication that anything was truncated. This is a confusing 
experience for operators who are not familiar with Kubernetes.

My recommendation is to remove the "tail_lines" argument from that call, which 
should have the effect of fetching all previous logs when streaming starts: 
https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/CoreV1Api.md#read_namespaced_pod_log
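
For illustration, a minimal sketch of what the call could look like without 
"tail_lines". This is not the actual pod_launcher.py code: the function name 
follow_pod_logs, its parameters, and the "base" container default are 
assumptions made for this example, and core_v1 is assumed to be a 
kubernetes.client.CoreV1Api instance.

```python
def follow_pod_logs(core_v1, name, namespace, container="base"):
    """Yield the container's log lines as decoded strings.

    Hypothetical helper: `core_v1` is assumed to be a
    kubernetes.client.CoreV1Api instance (or anything that exposes the
    same read_namespaced_pod_log method).
    """
    # tail_lines is deliberately not passed, so the server should send
    # the existing log output first and then keep streaming new lines.
    response = core_v1.read_namespaced_pod_log(
        name=name,
        namespace=namespace,
        container=container,
        follow=True,
        _preload_content=False,  # stream the body instead of buffering it
    )
    for raw in response:
        yield raw.decode("utf-8", errors="replace").rstrip("\n")
```

One trade-off to keep in mind: a pod that has already produced a lot of 
output will replay all of it into the task log.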



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)