Olivier Girardot created AIRFLOW-3857:
-----------------------------------------
Summary: spark_submit_hook cannot kill pod in kubernetes
Key: AIRFLOW-3857
URL: https://issues.apache.org/jira/browse/AIRFLOW-3857
Project: Apache Airflow
Issue Type: Bug
Components: contrib, kubernetes
Affects Versions: 1.10.2, 1.10.1, 1.10.0
Environment: any
Reporter: Olivier Girardot
Assignee: Olivier Girardot
When trying to kill a driver pod (for example, after a timeout), the
`spark_submit_hook` does not use the Kubernetes API correctly and ends up
accessing a non-existent attribute on the CoreV1Api object:
{code:java}
[2019-02-09 07:06:19,883] {{logging_mixin.py:95}} INFO - [2019-02-09
07:06:19,883] {{spark_submit_hook.py:552}} INFO - Killing pod
...-20190209-a48f77e00aa0367ca39f6495b8f6cc08-driver on Kubernetes
[2019-02-09 07:06:19,892] {{models.py:1788}} ERROR - 'CoreV1Api' object has no
attribute 'V1DeleteOptions'
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 1652, in
_run_raw_task
result = task_copy.execute(context=context)
...
File
"/usr/local/lib/python3.6/site-packages/airflow/contrib/hooks/spark_submit_hook.py",
line 332, in submit
self._process_spark_submit_log(iter(self._submit_sp.stdout.readline, ''))
File
"/usr/local/lib/python3.6/site-packages/airflow/contrib/hooks/spark_submit_hook.py",
line 379, in _process_spark_submit_log
for line in itr:
File "/usr/local/lib/python3.6/site-packages/airflow/utils/timeout.py", line
42, in handle_timeout
raise AirflowTaskTimeout(self.error_message)
airflow.exceptions.AirflowTaskTimeout: Timeout
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 1654, in
_run_raw_task
task_copy.on_kill()
...
File
"/usr/local/lib/python3.6/site-packages/airflow/contrib/hooks/spark_submit_hook.py",
line 560, in on_kill
body=client.V1DeleteOptions(),
AttributeError: 'CoreV1Api' object has no attribute 'V1DeleteOptions'
[2019-02-09 07:06:19,896] {{models.py:1817}} INFO - All retries failed; marking
task as FAILED
{code}
The cause is that on_kill looks up V1DeleteOptions on the kube_client object
(which is really a kubernetes.client.CoreV1Api instance) instead of on the
kubernetes.client module.
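
A minimal sketch of what the corrected call could look like, assuming the hook already holds a CoreV1Api instance. The function name kill_driver_pod and the k8s_client_module parameter are hypothetical and exist only for illustration; this is not the hook's actual code. The point is that V1DeleteOptions comes from the kubernetes.client module, while delete_namespaced_pod is called on the API object.

```python
def kill_driver_pod(core_api, pod_name, namespace, k8s_client_module=None):
    """Delete a Spark driver pod through a CoreV1Api-like object.

    core_api          -- object exposing delete_namespaced_pod (e.g. CoreV1Api)
    k8s_client_module -- hypothetical injection point for the module that
                         provides V1DeleteOptions; defaults to kubernetes.client
    """
    if k8s_client_module is None:
        # Imported lazily so the sketch can be exercised without the package.
        import kubernetes.client as k8s_client_module

    # V1DeleteOptions is a model class on the module, not on the API object.
    body = k8s_client_module.V1DeleteOptions()

    # The delete call itself belongs to the API object.
    return core_api.delete_namespaced_pod(
        pod_name, namespace, body=body, pretty=True
    )
```

Passing the module in as a parameter is only a convenience for testing with stand-ins; inside the hook, a plain module-level import of kubernetes.client would presumably serve the same purpose.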