mfjackson opened a new issue #13579:
URL: https://github.com/apache/airflow/issues/13579
**Apache Airflow version**: 2.0.0
**Kubernetes version (if you are using kubernetes)** (use `kubectl
version`): 1.19.4
**Environment**:
- **Cloud provider or hardware configuration**: Google Cloud Platform/GKE
**What happened**:
I successfully cleared the state of a failed task using the graph view UI,
but when I attempted to re-run the cleared task instance in graph view manually
by selecting the task instance and clicking "Run", I received the following
error:
```
Something bad has happened.
Please consider letting us know by creating a bug report using GitHub.
Python version: 3.8.7
Airflow version: 2.0.0
Node: mr_node
-------------------------------------------------------------------------------
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line
2447, in wsgi_app
response = self.full_dispatch_request()
File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line
1952, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line
1821, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/airflow/.local/lib/python3.8/site-packages/flask/_compat.py",
line 39, in reraise
raise value
File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line
1950, in full_dispatch_request
rv = self.dispatch_request()
File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line
1936, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File
"/home/airflow/.local/lib/python3.8/site-packages/airflow/www/auth.py", line
34, in decorated
return func(*args, **kwargs)
File
"/home/airflow/.local/lib/python3.8/site-packages/airflow/www/decorators.py",
line 60, in wrapper
return f(*args, **kwargs)
File
"/home/airflow/.local/lib/python3.8/site-packages/airflow/www/views.py", line
1366, in run
executor.start()
File
"/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/kubernetes_executor.py",
line 493, in start
raise AirflowException("Could not get scheduler_job_id")
airflow.exceptions.AirflowException: Could not get scheduler_job_id
```
**What you expected to happen**:
I expected the task instance to be scheduled and begin running again.
**How to reproduce it**:
Configure Airflow 2.0.0 to run on GCP, clear the state of a finished task
instance using the UI (I was able to reproduce the error on a task instance
maked "Success" as well), and again use the Web UI to "Run" the task.
**Anything else we need to know**:
One important item to note is that when I _only_ clear task instance and do
not attempt to run it manually using the UI, the task does queue and is placing
in a `running` state, but quickly fails with the following error:
```
[2021-01-08 21:08:16,140] {taskinstance.py:1396} ERROR - (0)
Reason: Handshake status 500 Internal Server Error
Traceback (most recent call last):
File
"/home/airflow/.local/lib/python3.8/site-packages/kubernetes/stream/ws_client.py",
line 296, in websocket_call
client = WSClient(configuration, get_websocket_url(url), headers,
capture_all)
File
"/home/airflow/.local/lib/python3.8/site-packages/kubernetes/stream/ws_client.py",
line 94, in __init__
self.sock.connect(url, header=header)
File
"/home/airflow/.local/lib/python3.8/site-packages/websocket/_core.py", line
226, in connect
self.handshake_response = handshake(self.sock, *addrs, **options)
File
"/home/airflow/.local/lib/python3.8/site-packages/websocket/_handshake.py",
line 80, in handshake
status, resp = _get_resp_headers(sock)
File
"/home/airflow/.local/lib/python3.8/site-packages/websocket/_handshake.py",
line 165, in _get_resp_headers
raise WebSocketBadStatusException("Handshake status %d %s", status,
status_message, resp_headers)
websocket._exceptions.WebSocketBadStatusException: Handshake status 500
Internal Server Error
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File
"/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py",
line 1086, in _run_raw_task
self._prepare_and_execute_task_with_callbacks(context, task)
File
"/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py",
line 1260, in _prepare_and_execute_task_with_callbacks
result = self._execute_task(context, task_copy)
File
"/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py",
line 1300, in _execute_task
result = task_copy.execute(context=context)
File
"/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py",
line 335, in execute
final_state, result = self.handle_pod_overlap(
File
"/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py",
line 375, in handle_pod_overlap
final_state, result = self.monitor_launched_pod(launcher, pod)
File
"/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py",
line 513, in monitor_launched_pod
(final_state, result) = launcher.monitor_pod(pod, get_logs=self.get_logs)
File
"/home/airflow/.local/lib/python3.8/site-packages/airflow/kubernetes/pod_launcher.py",
line 151, in monitor_pod
result = self._extract_xcom(pod)
File
"/home/airflow/.local/lib/python3.8/site-packages/airflow/kubernetes/pod_launcher.py",
line 246, in _extract_xcom
resp = kubernetes_stream(
File
"/home/airflow/.local/lib/python3.8/site-packages/kubernetes/stream/stream.py",
line 35, in stream
return func(*args, **kwargs)
File
"/home/airflow/.local/lib/python3.8/site-packages/kubernetes/client/api/core_v1_api.py",
line 841, in connect_get_namespaced_pod_exec
(data) = self.connect_get_namespaced_pod_exec_with_http_info(name,
namespace, **kwargs) # noqa: E501
File
"/home/airflow/.local/lib/python3.8/site-packages/kubernetes/client/api/core_v1_api.py",
line 927, in connect_get_namespaced_pod_exec_with_http_info
return self.api_client.call_api(
File
"/home/airflow/.local/lib/python3.8/site-packages/kubernetes/client/api_client.py",
line 340, in call_api
return self.__call_api(resource_path, method,
File
"/home/airflow/.local/lib/python3.8/site-packages/kubernetes/client/api_client.py",
line 172, in __call_api
response_data = self.request(
File
"/home/airflow/.local/lib/python3.8/site-packages/kubernetes/stream/stream.py",
line 30, in _intercept_request_call
return ws_client.websocket_call(config, *args, **kwargs)
File
"/home/airflow/.local/lib/python3.8/site-packages/kubernetes/stream/ws_client.py",
line 302, in websocket_call
raise ApiException(status=0, reason=str(e))
kubernetes.client.rest.ApiException: (0)
Reason: Handshake status 500 Internal Server Error
```
Thanks!
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]