[
https://issues.apache.org/jira/browse/AIRFLOW-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16764192#comment-16764192
]
ASF GitHub Bot commented on AIRFLOW-3857:
-----------------------------------------
ogirardot commented on pull request #4678: [AIRFLOW-3857] spark_submit_hook
cannot kill driver pod in kubernetes
URL: https://github.com/apache/airflow/pull/4678
### Jira
- [x] My PR addresses the following
[AIRFLOW-3857](https://issues.apache.org/jira/browse/AIRFLOW-3857) issue.
### Description
- [x] The `spark_submit_hook.on_kill` method cannot kill a Kubernetes driver
pod because it looks up `V1DeleteOptions` on the wrong object. In
1.10.{0,1,2}, the client it uses is in fact a `CoreV1Api` instance rather than
the `kubernetes.client` module, and only the module provides the
`V1DeleteOptions` class that the delete call needs.
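
The failure mode can be reproduced without a cluster. The sketch below is illustrative only: `FakeCoreV1Api`, `FakeDeleteOptions`, `fake_client_module`, and both `kill_driver_pod_*` helpers are hypothetical stand-ins, not code from this PR. It assumes only what the traceback shows, namely that `V1DeleteOptions` is looked up on a `CoreV1Api` instance, which carries API methods but no model classes.

```python
from types import SimpleNamespace


class FakeCoreV1Api:
    """Hypothetical stand-in for kubernetes.client.CoreV1Api.

    Like the real API object, it exposes API calls but no model classes.
    """

    def __init__(self):
        self.deleted = []

    def delete_namespaced_pod(self, name, namespace, body=None, pretty=True):
        # Record the call instead of talking to a cluster.
        self.deleted.append((name, namespace, body))


class FakeDeleteOptions:
    """Hypothetical stand-in for kubernetes.client.V1DeleteOptions."""


# Hypothetical stand-in for the `kubernetes.client` module, which owns
# both the API class and the model classes.
fake_client_module = SimpleNamespace(
    CoreV1Api=FakeCoreV1Api,
    V1DeleteOptions=FakeDeleteOptions,
)


def kill_driver_pod_buggy(api, pod_name, namespace):
    # Mirrors the failing call: the model class is looked up on the API
    # object, so this raises AttributeError before any pod is deleted.
    api.delete_namespaced_pod(pod_name, namespace, body=api.V1DeleteOptions())


def kill_driver_pod_fixed(api, models, pod_name, namespace):
    # The model class comes from the module; the API object only
    # performs the delete call.
    api.delete_namespaced_pod(pod_name, namespace, body=models.V1DeleteOptions())


if __name__ == "__main__":
    api = FakeCoreV1Api()
    try:
        kill_driver_pod_buggy(api, "example-driver", "default")
    except AttributeError as exc:
        print("buggy call failed:", exc)
    kill_driver_pod_fixed(api, fake_client_module, "example-driver", "default")
    print("pods deleted by fixed call:", len(api.deleted))
```

With the real library, the equivalent change would presumably be to build the body from `kubernetes.client.V1DeleteOptions()` while still calling `delete_namespaced_pod` on the `CoreV1Api` instance.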
### Tests
- [x] This change has not been tested, and no tests have been added yet.
### Commits
- [x] My commits all reference Jira issues in their subject lines, and I
have squashed multiple commits if they address the same issue. In addition, my
commits follow the guidelines from "[How to write a good git commit
message](http://chris.beams.io/posts/git-commit/)":
1. Subject is separated from body by a blank line
1. Subject is limited to 50 characters (not including Jira issue reference)
1. Subject does not end with a period
1. Subject uses the imperative mood ("add", not "adding")
1. Body wraps at 72 characters
1. Body explains "what" and "why", not "how"
### Documentation
- [x] No new feature, just a bug fix
### Code Quality
- [x] Passes `flake8`
----------------------------------------------------------------
> spark_submit_hook cannot kill pod in kubernetes
> -----------------------------------------------
>
> Key: AIRFLOW-3857
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3857
> Project: Apache Airflow
> Issue Type: Bug
> Components: contrib, kubernetes
> Affects Versions: 1.10.0, 1.10.1, 1.10.2
> Environment: any
> Reporter: Olivier Girardot
> Assignee: Olivier Girardot
> Priority: Major
>
> When trying to kill a driver pod (for example, after a timeout), the
> `spark_submit_hook` does not use the proper Kubernetes API and ends up
> accessing a non-existent attribute on the `CoreV1Api` object:
>
> {code:java}
> [2019-02-09 07:06:19,883] {{logging_mixin.py:95}} INFO - [2019-02-09 07:06:19,883] {{spark_submit_hook.py:552}} INFO - Killing pod ...-20190209-a48f77e00aa0367ca39f6495b8f6cc08-driver on Kubernetes
> [2019-02-09 07:06:19,892] {{models.py:1788}} ERROR - 'CoreV1Api' object has no attribute 'V1DeleteOptions'
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 1652, in _run_raw_task
>     result = task_copy.execute(context=context)
>   ...
>   File "/usr/local/lib/python3.6/site-packages/airflow/contrib/hooks/spark_submit_hook.py", line 332, in submit
>     self._process_spark_submit_log(iter(self._submit_sp.stdout.readline, ''))
>   File "/usr/local/lib/python3.6/site-packages/airflow/contrib/hooks/spark_submit_hook.py", line 379, in _process_spark_submit_log
>     for line in itr:
>   File "/usr/local/lib/python3.6/site-packages/airflow/utils/timeout.py", line 42, in handle_timeout
>     raise AirflowTaskTimeout(self.error_message)
> airflow.exceptions.AirflowTaskTimeout: Timeout
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 1654, in _run_raw_task
>     task_copy.on_kill()
>   ...
>   File "/usr/local/lib/python3.6/site-packages/airflow/contrib/hooks/spark_submit_hook.py", line 560, in on_kill
>     body=client.V1DeleteOptions(),
> AttributeError: 'CoreV1Api' object has no attribute 'V1DeleteOptions'
> [2019-02-09 07:06:19,896] {{models.py:1817}} INFO - All retries failed; marking task as FAILED
> {code}
>
> The only cause is that `on_kill` looks up `V1DeleteOptions` on the
> kube_client (which is really a `kubernetes.client.CoreV1Api` instance)
> instead of on the `kubernetes.client` module.
>
>